Under the usual matrices-as-linear-maps interpretation, an n x m matrix is a linear map from the free vector space on m elements to the free vector space on n elements, M : Vect(Y) -> Vect(X), where X and Y are as in the article.
For an ordinary map of finite sets Y -> X, we can draw a picture showing how each element of Y goes to an element of X: draw a bunch of dots for Y and a bunch of dots for X, with an arrow from each dot in Y to its image in X. This is a directed bipartite graph in which each dot in Y is the source of exactly one edge.
How is a linear map Vect(Y) -> Vect(X) different? Each element of Y goes not to a single element of X but to a linear combination of elements of X. If y goes to a x1 + b x2, we can think of this as two weighted arrows out of y: one to x1 with weight a and one to x2 with weight b (treating edges with weight 0 as non-existent, if you like). This is the graph shown in the article.
This interpretation shows why matrix multiplication is path composition. It's exactly analogous to composing maps of finite sets Z -> Y -> X: you find the image of z by first following its arrow to some y in Y, then following y's arrow to some x in X. Here you first follow z to all of its images in Y, then follow those to all of their images in X, summing the products of the weights along each path. Relation composition is only a special case.
And of course the connection between XxY->R and Vect(Y)->Vect(X) is the usual correspondence between a rank (1,1) tensor and a linear map.
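A minimal numpy sketch of the path-composition picture, with made-up weights (nothing from the article): each entry of the product sums the weights of all two-step paths z -> y -> x.

    import numpy as np

    # M : Vect(Y) -> Vect(X) with |Y| = 3, |X| = 2 (made-up weights)
    M = np.array([[1.0, 0.0, 2.0],
                  [0.0, 3.0, 1.0]])
    # N : Vect(Z) -> Vect(Y) with |Z| = 2
    N = np.array([[1.0, 4.0],
                  [2.0, 0.0],
                  [0.0, 5.0]])

    # Entry (x, z) of M @ N adds up weight(y -> x) * weight(z -> y)
    # over every intermediate dot y, i.e. over all two-step paths z -> y -> x.
    by_paths = np.array([[sum(M[x, y] * N[y, z] for y in range(3))
                          for z in range(2)]
                         for x in range(2)])
    assert np.allclose(M @ N, by_paths)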
Re probabilities: this corresponds to imagining a random process that transforms elements of Y into elements of X, where y turns into x with probability p(x,y). The probability that you end up with x (the marginal probability) is the probability that some dot in Y transformed into x, i.e. the sum over all the arrows that wind up at x.
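As a toy illustration with made-up numbers: stack the probabilities p(x,y) into a matrix whose columns sum to 1; multiplying it by a distribution over Y performs exactly that sum over arrows.

    import numpy as np

    # P[x, y] = probability that y turns into x; each column sums to 1.
    P = np.array([[0.7, 0.2],
                  [0.3, 0.8]])
    q = np.array([0.5, 0.5])   # a distribution over the dots of Y

    # Marginal probability of landing on x: sum over all arrows y -> x.
    marginal = P @ q           # array([0.45, 0.55])
    assert np.isclose(marginal.sum(), 1.0)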
There is an API formalizing this style of matrix/graph computation called GraphBLAS, coming soon to a language and database near you.
Here is an easy-to-read but math-focused introduction to the topic by Dr. Jeremy Kepner, head and founder of the MIT Lincoln Laboratory Supercomputing Center:
It's true. I look forward to seeing more language bindings, like Python, JavaScript, etc. I expect that will happen soon. For my own GraphBLAS education I have been working on a PostgreSQL extension:
The semiring described at the end on Z_2 = {0,1}, with the 'interesting' addition function where 1+1=1, might be more familiar here as Boolean logic, where:
0 -> False
1 -> True
x -> AND
+ -> OR
(a AND b)=True iff a=True and b=True; (axb)=1 iff a=1 and b=1
(a OR b)=False iff a=False and b=False; (a+b)=0 iff a=0 and b=0
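A quick sketch of that semiring in numpy (made-up 0/1 matrices): replacing (+, x) with (OR, AND) turns the matrix product into the question "is there some intermediate dot connecting z and x?".

    import numpy as np

    A = np.array([[1, 0],
                  [1, 1]], dtype=bool)   # arrows Y -> X
    B = np.array([[0, 1],
                  [1, 1]], dtype=bool)   # arrows Z -> Y

    # Over the Boolean semiring, entry (x, z) of the "product" is
    # OR over y of (A[x, y] AND B[y, z]) -- is z connected to x through some y?
    C = np.array([[any(A[x, y] and B[y, z] for y in range(2))
                   for z in range(2)]
                  for x in range(2)])
    print(C.astype(int))   # [[0 1]
                           #  [1 1]]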
And vice versa. Take a graph, write it down as a matrix, and form its discrete Laplacian (degree matrix minus adjacency matrix). Then, just as an example, look at the multiplicity of the eigenvalue 0: it is exactly the number of connected components in the graph. Pure magic.
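A small worked check with a made-up 4-vertex graph that has two components:

    import numpy as np

    # Adjacency matrix of a graph with components {0, 1} and {2, 3}.
    A = np.array([[0, 1, 0, 0],
                  [1, 0, 0, 0],
                  [0, 0, 0, 1],
                  [0, 0, 1, 0]], dtype=float)

    # Discrete (graph) Laplacian: degree matrix minus adjacency matrix.
    L = np.diag(A.sum(axis=1)) - A

    # The multiplicity of the eigenvalue 0 counts the connected components.
    eigenvalues = np.linalg.eigvalsh(L)
    print(int(np.sum(np.isclose(eigenvalues, 0.0))))   # 2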
A fun before-and-after for me was trying to explore correlations over samples with a lot of features. In particular, on survey results about why teams adopt different programming languages:
There's also graphical linear algebra, which gives matrices a more category-theoretic interpretation that is also visual: https://graphicallinearalgebra.net
> [...] what do matrices corresponding to equivalence relations look like?
Actually, the graphic under "Block matrices correspond to disconnected graphs" illustrates this for us already! (Just mentally replace all the non-zero entries with 1; the graphs structurally remain the same.)
I initially thought it would be "a disconnected set of cliques", but we've already established that we're modeling rectangular matrices as bipartite graphs between the input and output spaces. Square matrices in particular can be considered to operate over the same vector space, so we could have more arbitrary graphs -- such as cliques.
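To make that concrete, here's a made-up 5-element example: once the elements of each class are grouped together, the 0/1 matrix of an equivalence relation is block diagonal with all-ones blocks, i.e. a disjoint union of cliques (with self-loops).

    import numpy as np

    # Equivalence relation on {0, ..., 4} with classes {0, 1, 2} and {3, 4}.
    R = np.block([
        [np.ones((3, 3)), np.zeros((3, 2))],
        [np.zeros((2, 3)), np.ones((2, 2))],
    ]).astype(bool)

    R_int = R.astype(int)
    assert R.diagonal().all()            # reflexive
    assert (R == R.T).all()              # symmetric
    two_steps = (R_int @ R_int) > 0      # related through some middle element
    assert (~two_steps | R).all()        # transitive: two steps never leave the relation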
It's just that in that case you can start with any graph, not necessarily a bipartite one, so you don't divide the vertices into two sets X and Y; you just use the same set of all vertices V twice, so the matrix is always square:
V * V -> R
I wonder if the properties from this article still hold on adjacency matrices. Would be very useful.
Yes, it's really the same graph. There's really no difference between drawing two copies of V with arrows going from one to the other and drawing one copy of V with arrows staying within that copy. Either way, if you trace the arrows from a given vertex you end up at the same spots.
So for example the interpretation of matrix multiplication as path composition given here explains why the nth power of the adjacency matrix gives the number of length-n walks.
(I think the precise relation is that if A is the adjacency matrix of G, and you draw the graph from this article for the matrix A, then the adjacency matrix of that new graph will be the block matrix

0    A
A^T  0

with the X copy of the vertices listed before the Y copy. That's assuming the edges in the article are regarded as undirected; if they are directed, just replace one of the off-diagonal blocks with 0.)
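A brute-force sanity check of the walk-counting claim on a small made-up directed graph:

    import numpy as np
    from itertools import product

    # Adjacency matrix of a made-up directed graph: A[i, j] = 1 iff there is an edge i -> j.
    A = np.array([[0, 1, 1, 0],
                  [0, 0, 1, 0],
                  [1, 0, 0, 1],
                  [0, 0, 0, 1]])
    n = 3

    def count_walks(i, j):
        # Enumerate all vertex sequences of length n + 1 from i to j and keep
        # those in which every consecutive pair is an edge of the graph.
        return sum(
            all(A[a, b] for a, b in zip((i, *mid, j), (*mid, j)))
            for mid in product(range(4), repeat=n - 1)
        )

    brute = np.array([[count_walks(i, j) for j in range(4)] for i in range(4)])
    assert (np.linalg.matrix_power(A, n) == brute).all()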
In the matrix-as-arrows-from-dots-in-Y-to-dots-in-X picture, the transpose means "turn all the arrows around". This also explains the connection between the transpose and the categorical dual.
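A tiny illustration with a made-up 2x3 matrix: the arrows of M arriving at a dot x are exactly the arrows of M's transpose leaving x.

    import numpy as np

    M = np.array([[0, 1, 1],
                  [1, 0, 1]])   # arrows from 3 dots in Y to 2 dots in X

    def targets(mat, source):
        # Dots reachable by following one arrow of `mat` out of `source`.
        return {t for t in range(mat.shape[0]) if mat[t, source]}

    for x in range(2):
        arrows_into_x = {y for y in range(3) if M[x, y]}   # arrows of M ending at x
        assert arrows_into_x == targets(M.T, x)            # arrows of M.T starting at x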
It's crazy to me that I took graph theory at university, covering many graph algorithms, and I don't remember ever hearing that multiplying adjacency matrices has these properties.
Given this seamless transition between linear algebra and graph theory, could someone summarize how exactly one or the other formulation helps to solve problems better?
I wouldn't want to read through a huge list of things; I'm just trying to get a sense of what's being achieved by this kind of (re)formulation.
I've been thinking about this lately. Specifically, I see a connection between systems theory and graph theory. My intuition tells me that all relationships between systems can be expressed as graphs.
Can anyone confirm this? I'm not very familiar with either field, so I may be asking a basic question.
You would have to be more precise about what you mean by "all relationships between systems". A simple graph (meaning vertices and edges with no additional information) can encode the notion of a system and the existence of relationships between two systems, but without more structure you can't say much more. Still, this is already useful: if you have a big system you're trying to understand, a good first step is to draw a graph of the major components, with edges indicating dependencies.
Systems interact through inputs and outputs, right? I mean, one system's output can be another system's input and even the second system's output can be the first one's input.
I guess my question was if there's something that can't be described or modelled with graphs.
As I said, my intuition is fuzzy, because I don't know much graph theory.
I do recall looking up the definition of 'bijection' and seeing that it slightly resembles a graph. As a generalization: can an n-dimensional tensor be viewed as a weighted n-partite graph?
An n-partite graph has the problem that edges can still only connect two vertices at a time. I think you would also need to include hyper-edges (faces, solids, etc.). The joint probabilities would be represented by these hyper-edges, while the bounding ("lower") edges represent progressively more marginal probabilities. Higher category theory studies these kinds of objects, so it wouldn't particularly surprise me.
No, because the space of tensors of order n with k-dimensional arguments is k^n-dimensional, while a complete n-partite graph with partitions of size k has only n(n-1)k^2/2 edges, which doesn't give you enough coefficients to uniquely identify a tensor if n > 2.
You'd need an n-uniform n-partite hypergraph, where coefficients are assigned to n-tuples instead of pairs. So for a tensor of order 3, that would mean weighted triangles, which are somewhat hard to draw, and everything gets less intuitive.
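For what it's worth, a tiny sketch of that picture in code, with a made-up order-3 tensor: each nonzero entry becomes a weighted triangle spanning one vertex from each of the three partitions, and contracting one index against a vector collapses the triangles back to an ordinary weighted bipartite graph (a matrix).

    import numpy as np

    k = 2
    T = np.arange(k ** 3, dtype=float).reshape(k, k, k)   # made-up order-3 tensor

    # Hypergraph view: one weighted 3-uniform hyperedge ("triangle") per nonzero entry.
    triangles = {(i, j, l): T[i, j, l]
                 for i in range(k) for j in range(k) for l in range(k)
                 if T[i, j, l] != 0}

    # Contracting the last index against a vector leaves a plain weighted
    # bipartite graph between the first two partitions, i.e. a k x k matrix.
    v = np.array([1.0, 2.0])
    M = np.einsum('ijl,l->ij', T, v)
    print(len(triangles), M.shape)   # 7 weighted triangles and a (2, 2) matrix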