# Hall’s Marriage Theorem

Hall’s Marriage Theorem gives conditions on when the vertices of a bipartite graph can be split into pairs of vertices corresponding to disjoint edges such that every vertex in the smaller class is accounted for. Such a set of edges is called a matching. If the sizes of the vertex classes are equal, then the matching naturally induces a bijection between the classes, and such a matching is called a perfect matching.

The number of possible perfect matchings of $K_{n,n}$ is n!, which is a lot to check. Since it’s useful to have bijections, it’s useful to have matchings, so we would like a simple way to check whether a bipartite graph has a matching. Hall’s Marriage Theorem gives a way to reduce the number of things to check to $2^n$, which is still large. However, much more importantly, the condition for the existence of a matching has a form which is much easier to check in many applications. The statement is as follows:

Given bipartite graph G with vertex classes X and Y, there is a matching of X into Y precisely when for every subset $A\subset X$, $|\Gamma(A)|\ge |A|$, where $\Gamma(A)$ is the set of vertices joined to some vertex in A, called the neighbourhood of A.

Taking A=X, it is clear that $|Y|\ge |X|$ is a necessary condition for the result to hold, unsurprisingly. Perhaps the most elementary standard proof proceeds by induction on the size of X, taking the smallest A to give a contradiction, then using the induction hypothesis to lift smaller matchings up to the original graph. This lifting is based on the idea that a subset relation between sets induces a subset relation between their neighbourhoods.

In this post, I want to consider this theorem as a special case of the Max-Flow-Min-Cut Theorem, as this will support useful generalisations much more easily. The latter theorem is a bit complicated notationally to set up, and I don’t want it to turn into the main point of this post, so I will summarise. The Wikipedia article, and lots of sets of lecture notes are excellent sources of more detailed definitions.

The setting is a weakly-connected directed graph, with two identified vertices, the source, with zero indegree, and the sink, with zero outdegree. Every other vertex lies on a path (not necessarily unique) between the source and the sink. Each edge has a positive capacity, which should be thought of as the maximum volume allowed to flow down the edge. A flow is a way of assigning values to each edge so that they do not exceed the capacity, and there is volume conservation at each interior vertex. That is, the flow into the vertex is equal to flow out of the vertex. The value of the flow is the sum of the flows out of the source, which is necessarily equal to the sum of flows into the sink.

A cut is a partition of the vertices into two classes, with the source in one and the sink in the other. The value of a cut is then the sum of the capacities of any edges going from the class containing the source to the class containing the sink. In most examples, the classes will be increasing, in the sense that any path from source to sink changes class exactly once.

The Max-Flow-Min-Cut Theorem asserts that the maximum value of a flow through the system is equal to the minimum value of a cut. The proof is elementary, though it relies on defining a sensible algorithm to construct a minimal cut from a maximal flow that is not going to be interesting to explain without more precise notation available.

First we explain why Hall’s Marriage Theorem is a special case of this result. Suppose we are given the setup of HMT, with edges directed from X to Y with infinite capacity. We add edges of capacity 1 from some new vertex x_0 to each vertex of X, and from each vertex of Y to a new vertex y_0. The aim is to give necessary and sufficient conditions for the existence of a flow of value |X|. Note that one direction of HMT is genuinely trivial: if there is a matching, then the neighbourhood size condition must hold. We focus on the other direction. If the maximum flow is less than |X|, then there should be a cut of this size as well. We can parameterise a cut by the class of vertices containing the source, say S. Let A=SnX. If $\Gamma(A)\not\subset S$, then there will be an infinite capacity edge in the cut. So if we are looking for a minimal cut, this should not happen, hence $\Gamma(A)\subset S$ if S is minimal. Similarly, there cannot be any edges from X\A to $\Gamma(A)$. The value of the cut can then be given by

$|\Gamma(A)|+|X|-|A|$, which is at least |X| if the neighbourhood size assumption is given. Note we can use the same method with the original edges having capacity one, but we have to track slightly more quantities.

This topic came up because I’ve been thinking about fragmentation chains over this holiday. I have a specific example concerning forests of unrooted trees in mind, but won’t go into details right now. The idea is that we often have distributions governing random partitions of some kind, let’s say of [n]. Conditioning on having a given number of classes might give a family of distributions $P_{n,k}$ for the partitions of [n] into k parts. We would be interested to know how easy it is to couple these distributions in a nice way. One way would be via a coalescence or fragmentation process. In the latter, we start with [n] itself, then at each step, split one of the parts into two according to some (random, Markovian) rule. We are interested in finding out whether such a fragmentation process exists for a given distribution.

It suffices to split the problem into single steps. Can we get from $P_{n,k}$ to $P_{n,k+1}$?

The point I want to make is that this is just a version of Hall’s Marriage Theorem again, at least in terms of proof method. We can take X to be the set of partitions of [n] into k parts, and Y the set of partitions into (k+1) parts. Then we add a directed edge with infinite capacity between $x\in X$ and $y\in Y$ if y can be constructed from x by breaking a part into two. Finally, we connect a fresh vertex x_0 to each edge in X, only now we insist that the capacity is equal to $P_{n,k}(x)$, and similarly an edge from y to y_0 with capacity equal to $P_{n,k+1}(y)$. The existence of a fragmentation chain over this step is then equivalent to the existence of a flow of value 1 in the directed graph network.

Although in many cases this remains challenging to work with, which I will explore in a future post perhaps, this is nonetheless a useful idea to have in mind when it comes to deciding on whether such a construction is possible for specific examples.