I am aiming to write a short post about each lecture in my ongoing course on Random Graphs. Details and logistics for the course can be found here.
As we enter the final stages of the semester, I want to discuss some extensions to the standard Erdos-Renyi random graph which has been the focus of most of the course so far. Although we will not get far into the details during this course, the overall goal is to develop models which are close to Erdos-Renyi in terms of ease of analysis, while also allowing more of the features characteristic of networks observed in the real world.
One of the more obvious deficiencies of the sparse regime of Erdos-Renyi random graphs for modelling ‘real-world phenomena’ concerns the degree sequence. Indeed, the empirical degree distribution of G(n,c/n) converges to Poisson(c). By contrast, in real-world networks, a much wider range of degrees is typically observed, and in many cases it is felt that these should follow a power law, with a small number of very highly connected agents.
One way around this problem is to construct random graphs where we insist that the graph has a given sequence of degrees. The configuration model, which is the subject of this lecture and this post (and about which I’ve written before), offers one way to achieve this.
Definition and notes
Let $n \ge 1$ and let $d = (d_1, \ldots, d_n)$ be a sequence of non-negative integers such that $\sum_{i=1}^n d_i$ is even. Then the configuration model with degree sequence d is a random multigraph with vertex set [n], constructed as follows:
- To each vertex $i \in [n]$, assign $d_i$ half-edges;
- Then, take a uniform matching of these half-edges;
- Finally, for each pair of half-edges in the matching, replace the two half-edges with a genuine edge, to obtain the multigraph $CM_n(d)$, in which, by construction, vertex i has degree $d_i$.
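The three steps above can be sketched in a few lines of Python. This is only an illustration, not code from the course: the function name `configuration_model` is my own, vertices are 0-indexed rather than labelled by [n], and the uniform matching is obtained by shuffling the list of half-edges and pairing consecutive entries (every perfect matching arises from the same number of orderings, so the matching is uniform).

```python
import random

def configuration_model(degrees, rng=None):
    """Sample the configuration model multigraph as a list of edges.

    degrees[i] is the prescribed degree of vertex i. Vertices are
    0-indexed here, which is an assumption of this sketch.
    """
    rng = rng or random.Random()
    if sum(degrees) % 2 != 0:
        raise ValueError("the sum of the degrees must be even")
    # One list entry per half-edge, labelled by the vertex it hangs from.
    half_edges = [v for v, d in enumerate(degrees) for _ in range(d)]
    # A uniform shuffle followed by pairing neighbours is a uniform matching.
    rng.shuffle(half_edges)
    return [(half_edges[k], half_edges[k + 1])
            for k in range(0, len(half_edges), 2)]

edges = configuration_model([3, 3, 2, 2], random.Random(0))
print(edges)  # five edges; loops and repeated edges are allowed
```

A pair of the form (v, v) is a self-loop, and contributes 2 to the degree of v.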
One should note immediately that although the matching is uniform, the multigraph is not uniform amongst multigraphs with that degree sequence. Note also that the condition on the sums of the degrees is necessary for any graph, and in this context means that the number of half-edges is even, without which it would not be possible to construct a matching.
The non-uniformity is manifest in the simplest possible example, when n=2 and d=(3,3). There are two possible multigraphs, up to isomorphism: one in which each vertex carries a self-loop and a single edge joins the two vertices, and one in which three parallel edges join the two vertices.
For obvious reasons, we might refer to these as the handcuffs and the theta, respectively. It’s helpful to assume, temporarily, that the half-edges are distinguishable at the moment we join them up in the configuration model construction. Then there are 3×3=9 ways to join them up to form the handcuffs (choose which half-edge at each vertex ends up forming the edge between the two vertices), while there are 3!=6 ways to pair up the half-edges in the theta. Since there are 15 matchings of six half-edges in total, the handcuffs appears with probability 9/15=3/5 and the theta with probability 6/15=2/5.
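The count of 9 against 6 is small enough to check by brute force. The sketch below enumerates every perfect matching of six distinguishable half-edges and classifies the resulting multigraph. The helper `perfect_matchings` and the labelling of half-edges as (vertex, index) pairs are my own choices for illustration.

```python
from collections import Counter

def perfect_matchings(items):
    """Yield every perfect matching of a list with an even number of items."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for matching in perfect_matchings(remaining):
            yield [(first, partner)] + matching

# Half-edges are (vertex, index) pairs: three at vertex 0, three at vertex 1.
half_edges = [(v, k) for v in (0, 1) for k in range(3)]

counts = Counter()
for matching in perfect_matchings(half_edges):
    # Number of edges joining the two different vertices: either 1 or 3.
    crossing = sum(1 for a, b in matching if a[0] != b[0])
    counts["theta" if crossing == 3 else "handcuffs"] += 1

print(dict(counts))  # {'handcuffs': 9, 'theta': 6}
```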
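In general, for multigraphs H with the correct degree sequence, the configuration model multigraph $CM_n(d)$ satisfies
$$\mathbb{P}\left( CM_n(d) = H \right) \propto \left( 2^{\#\mathrm{loops}(H)} \prod_{e \in E(H)} \mathrm{mult}(e)! \right)^{-1},$$
where the product is over the distinct edges of H (loops included), and $\mathrm{mult}(e)$ is the multiplicity with which a given edge e appears in H.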
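As a sanity check, here is a small sketch which assumes the weight of a multigraph H is proportional to the reciprocal of 2^(number of loops) times the product of mult(e)! over distinct edges e, and evaluates it on the handcuffs and the theta. The function name `weight` and the edge-list representation are illustrative assumptions.

```python
from collections import Counter
from fractions import Fraction
from math import factorial

def weight(edges):
    """Unnormalised weight 1 / (2^{#loops} * prod_e mult(e)!) of a multigraph."""
    # Treat edges as unordered pairs so that (u, v) and (v, u) coincide.
    mult = Counter(tuple(sorted(e)) for e in edges)
    loops = sum(m for (u, v), m in mult.items() if u == v)
    denominator = 2 ** loops
    for m in mult.values():
        denominator *= factorial(m)
    return Fraction(1, denominator)

handcuffs = [(0, 0), (0, 1), (1, 1)]
theta = [(0, 1), (0, 1), (0, 1)]
total = weight(handcuffs) + weight(theta)
print(weight(handcuffs) / total, weight(theta) / total)  # 3/5 2/5
```

The weights are 1/4 and 1/6, which normalise to 3/5 and 2/5, in agreement with the counts 9 and 6 out of 15 matchings.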
Note: it might seem counterintuitive that this procedure is biased against multiple edges and self-loops, but it is really just saying that there are more ways to form two distinct edges than to form two equal edges (i.e. a multiedge pair) when we view the half-edges as distinguishable. (See this post for further discussion of this aspect in the 3-regular setting.)
However, a consequence of this result is that if we condition on the event that the multigraph $CM_n(d)$ is simple, then the resulting random graph is uniform on the set of simple graphs satisfying the degree property. Note that the same example as above shows that there’s no guarantee that there exists a simple graph whose degrees are some given sequence.
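One way to see the conditioning in action is rejection sampling: keep sampling the multigraph until a simple one appears. The sketch below is a hedged illustration only. The name `sample_simple` and the `max_tries` cut-off are my own, and the approach is only practical when simple outcomes are not too rare.

```python
import random

def sample_simple(degrees, rng=None, max_tries=10_000):
    """Rejection-sample the configuration model conditioned on being simple.

    Returns a sorted edge list, or None if no simple multigraph was seen
    within max_tries attempts (for instance when none exists).
    """
    rng = rng or random.Random()
    half_edges = [v for v, d in enumerate(degrees) for _ in range(d)]
    if len(half_edges) % 2 != 0:
        raise ValueError("the sum of the degrees must be even")
    for _ in range(max_tries):
        rng.shuffle(half_edges)
        edges = [tuple(sorted(half_edges[k:k + 2]))
                 for k in range(0, len(half_edges), 2)]
        no_loops = all(u != v for u, v in edges)
        no_multi_edges = len(set(edges)) == len(edges)
        if no_loops and no_multi_edges:
            return sorted(edges)
    return None

print(sample_simple([3, 3], random.Random(1), max_tries=200))  # None
print(sample_simple([3, 3, 3, 3], random.Random(1)))
```

For d=(3,3) every attempt is rejected, since both the handcuffs and the theta fail to be simple. For d=(3,3,3,3) the only simple graph with these degrees is the complete graph on four vertices.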
d-regular configuration model
In general, from a modelling point of view, we are particularly interested in simple, connected graphs, and so it is valuable to study whether large examples of the configuration model are likely to have these properties. In this lecture, I will mainly focus on the case where the multigraphs are d-regular, meaning that all the vertices have degree equal to d. For the purposes of this lecture, we denote by $G^d(n)$ the d-regular configuration model $CM_n(d, \ldots, d)$.
- d=1: to satisfy the parity condition on the sums of degrees, we must have n even. But then $G^1(n)$ will consist of n/2 disjoint edges.
- d=2: $G^2(n)$ will consist of some number of disjoint cycles, and it is a straightforward calculation to check that when n is large, with high probability the graph will be disconnected.
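The d=2 case is easy to explore by simulation. The sketch below samples the 2-regular configuration model and counts connected components with a union-find structure. The function names, the seed, and the choices n=1000 and 200 trials are arbitrary assumptions of mine.

```python
import random

def count_components(n, edges):
    """Count connected components of a multigraph on vertices 0..n-1."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    components = n
    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            components -= 1
    return components

def sample_2_regular(n, rng):
    """Sample the 2-regular configuration model on n vertices as an edge list."""
    half_edges = [v for v in range(n) for _ in range(2)]
    rng.shuffle(half_edges)
    return [(half_edges[k], half_edges[k + 1]) for k in range(0, 2 * n, 2)]

rng = random.Random(2024)
n, trials = 1000, 200
disconnected = sum(count_components(n, sample_2_regular(n, rng)) > 1
                   for _ in range(trials))
print(f"{disconnected}/{trials} samples were disconnected")
```

Each component of a 2-regular multigraph is a cycle (a self-loop counts as a cycle of length one), so the component count is also the number of cycles.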
In particular, I will focus on the case when d=3, which is the first interesting case. Most of the results we prove here can be generalised (under various conditions) to more general examples of the configuration model. The main goal of the lecture is revision of some techniques of the course, plus one new one, in a fresh setting, and the strongest possible versions of many of these results can be found amongst the references listed at the end.
Connectedness
In the lecture, we showed that $G^3(2n)$ is connected with high probability. This is a very weak result, since in fact $G^d(n)$ is d-connected with high probability for $d \ge 3$ [Bol81, Wor81]. Here, d-connected means that one must remove at least d vertices in order to disconnect the graph, or, equivalently, that there are d internally disjoint paths between any pair of vertices. Furthermore, Bollobas shows that for $d \ge 3$, $G^d(n)$ is a (random) expander family [Bol88].
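Anyway, for the purposes of this course, the main tool is direct enumeration. The matching number $M_{2k}$, the number of perfect matchings of 2k labelled points, satisfies
$$M_{2k} = (2k-1)!! = \frac{(2k)!}{2^k \, k!},$$
and so Stirling’s approximation gives the asymptotics
$$M_{2k} \sim \sqrt{2} \left( \frac{2k}{e} \right)^k,$$
although it will be useful to use the true bounds
$$c \left( \frac{2k}{e} \right)^k \le M_{2k} \le C \left( \frac{2k}{e} \right)^k,$$
valid for all k with absolute constants $0 < c < C$, instead in some places. Anyway, in $G^3(2n)$, there are 6n half-edges in total, and so the probability that the graph may be split into two parts consisting of $2\ell, 2m$ vertices, with $2\ell + 2m = 2n$, and with no edges between the classes is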