Conditional Expectations

To define an expectation conditional on an event in a probability space is essentially no more than defining a conditional probability, then constructing the expectation as an integral with respect to this measure. At an abstract level, it is often more useful to define an expectation conditional on a sigma-algebra. Informally, we want to define the expectation conditional on every event in a sigma-algebra simultaneously. Most importantly, the result is measurable with respect to this sigma-algebra, so is a random variable itself under suitable conditions.

We want to prove that such a construction exists. We take X to be an integrable, \mathcal{F}-measurable random variable, and consider a sub-sigma-algebra \mathcal{G}\subset\mathcal{F}. We want to define Y=\mathbb{E}[X|\mathcal{G}], which is integrable, \mathcal{G}-measurable, and such that

\mathbb{E}X1_A=\mathbb{E}Y1_A, for all A\in\mathcal{G}.

We also want to show that this conditional expectation is, up to null events, unique. When \mathcal{G}=\sigma(B_i,i\in I) is generated by a countable collection of disjoint events of positive probability covering \Omega, we can define

Y:=\sum_{i\in I}\mathbb{E}[X|B_i]1_{B_i},

and verify that this satisfies the conditions.
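The verification can be carried out by hand on a small example. The following is a minimal sketch, assuming a hypothetical six-point probability space; the weights, the values of X, the partition, and the helper names `expect`, `cond_exp_partition` and `check_defining_property` are all illustrative choices rather than part of the construction above. Exact fractions are used so that the equalities hold without rounding.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical finite probability space: outcomes 0..5 with these weights (they sum to 1).
prob = [Fraction(1, 12), Fraction(1, 6), Fraction(1, 4),
        Fraction(1, 4), Fraction(1, 6), Fraction(1, 12)]

# An arbitrary random variable X, given by its value at each outcome.
X = [Fraction(v) for v in (3, -1, 4, 1, -5, 9)]

# Disjoint events B_i covering the sample space; they generate the sigma-algebra G.
partition = [{0, 1}, {2, 3, 4}, {5}]


def expect(f, event):
    """Return E[f 1_A] for the event A given as a collection of outcomes."""
    return sum(f[w] * prob[w] for w in event)


def cond_exp_partition(X, partition):
    """Return Y = sum_i E[X | B_i] 1_{B_i} as a list of values per outcome."""
    Y = [None] * len(X)
    for B in partition:
        p_B = sum(prob[w] for w in B)
        mean_on_B = expect(X, B) / p_B  # E[X | B] = E[X 1_B] / P(B)
        for w in B:
            Y[w] = mean_on_B
    return Y


def check_defining_property(X, Y, partition):
    """Check E[X 1_A] = E[Y 1_A] for every A in G, i.e. every union of blocks."""
    for r in range(len(partition) + 1):
        for blocks in combinations(partition, r):
            A = set().union(*blocks)
            if expect(X, A) != expect(Y, A):
                return False
    return True


Y = cond_exp_partition(X, partition)
print(Y)
print(check_defining_property(X, Y, partition))
```

Here Y is constant on each block, which is exactly \mathcal{G}-measurability in this setting, and since every event in \mathcal{G} is a union of blocks it suffices to check the defining relation on those unions.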

We proceed in the general case. Uniqueness is easy. Suppose we have Y,Y' satisfying the conditions. Then the event A=\{Y>Y'\}\in\mathcal{G}. Substituting into the definition gives:

\mathbb{E}[(Y-Y')1_A]=\mathbb{E}[X1_A]-\mathbb{E}[X1_A]=0.

But (Y-Y')1_A\geq 0, with strict inequality on A, and so we conclude that \mathbb{P}(A)=0, and Y'\geq Y almost surely. Of course the reverse argument applies also, and so Y=Y' a.s.

For existence, we exploit a property of Hilbert spaces. We initially assume X\in L^2. Since L^2(\mathcal{G}) is a closed subspace of L^2(\mathcal{F}), we can decompose the host space as

L^2(\mathcal{F})=L^2(\mathcal{G})\oplus L^2(\mathcal{G})^{\perp}, and X=Y+Z

in this orthogonal decomposition, with Y\in L^2(\mathcal{G}) and Z\in L^2(\mathcal{G})^{\perp}. The inner product on this space is \langle U,V\rangle:=\mathbb{E}[UV], and so

A\in\mathcal{G}\Rightarrow 1_A\in L^2(\mathcal{G})\Rightarrow \mathbb{E}[Z1_A]=0.

From this, \mathbb{E}[X1_A]=\mathbb{E}[Y1_A] for all A\in\mathcal{G}, and we conclude that Y is suitable. For what follows, observe that \{Y<0\}\in\mathcal{G}, and so by a similar argument to before, X\geq 0 a.s. implies Y\geq 0 a.s.
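To see the projection argument concretely, here is a minimal sketch on a hypothetical four-point uniform space, where L^2(\mathcal{G}) is spanned by the indicators of two blocks. The space, the values of X, and the helper names `inner`, `indicator` and `project` are assumptions made for illustration. Because the block indicators are mutually orthogonal, the projection coefficient on each is \langle X,1_B\rangle/\langle 1_B,1_B\rangle=\mathbb{E}[X1_B]/\mathbb{P}(B).

```python
from fractions import Fraction

# Hypothetical uniform probability space on four outcomes.
prob = [Fraction(1, 4)] * 4
X = [Fraction(2), Fraction(6), Fraction(-1), Fraction(3)]

# Blocks generating G; their indicators form an orthogonal basis of L^2(G).
blocks = [[0, 1], [2, 3]]


def inner(U, V):
    """Inner product <U, V> = E[UV] on this finite space."""
    return sum(u * v * p for u, v, p in zip(U, V, prob))


def indicator(event):
    """Indicator function of an event, as a list of values per outcome."""
    return [Fraction(1) if w in event else Fraction(0) for w in range(len(prob))]


def project(X, blocks):
    """Orthogonal projection of X onto the span of the block indicators."""
    Y = [Fraction(0)] * len(prob)
    for B in blocks:
        e = indicator(B)
        coeff = inner(X, e) / inner(e, e)  # equals E[X 1_B] / P(B)
        Y = [y + coeff * e_w for y, e_w in zip(Y, e)]
    return Y


Y = project(X, blocks)
Z = [x - y for x, y in zip(X, Y)]  # the component in the orthogonal complement

# Z is orthogonal to each block indicator, hence to every 1_A with A in G.
print([inner(Z, indicator(B)) for B in blocks])
# Pythagoras: ||X||^2 = ||Y||^2 + ||Z||^2.
print(inner(X, X) == inner(Y, Y) + inner(Z, Z))
```

In this finite setting the projection coincides with the block averages, so the Hilbert space construction and the partition formula give the same Y.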

For general X\geq 0, set X_n:=X\wedge n\uparrow X, and Y_n:=\mathbb{E}[X_n|\mathcal{G}]. By our previous observation, applied to X_n and to X_{n+1}-X_n, the Y_n are almost surely non-negative and non-decreasing in n, and we can apply monotone convergence to both sides of the relation

\mathbb{E}[X_n1_A]=\mathbb{E}[Y_n1_A],\quad A\in\mathcal{G},

to obtain the defining relation for Y:=\lim_n Y_n, and take A=\Omega to check integrability. Separating general X into positive and negative parts gives the result for general integrable random variables.
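The final step can be written out as follows. This is a sketch of the standard argument, assuming X is integrable so that both parts have finite expectation; the notation X^{\pm} for the positive and negative parts is introduced here for illustration.

```latex
% Split X into its positive and negative parts, both non-negative and integrable.
X = X^{+} - X^{-}, \qquad X^{\pm} \geq 0, \qquad \mathbb{E}[X^{\pm}] \leq \mathbb{E}|X| < \infty.

% Define Y by applying the non-negative case to each part.
Y := \mathbb{E}[X^{+}|\mathcal{G}] - \mathbb{E}[X^{-}|\mathcal{G}].

% Linearity of expectation then gives the defining relation for every A in G.
\mathbb{E}[Y1_A] = \mathbb{E}[X^{+}1_A] - \mathbb{E}[X^{-}1_A] = \mathbb{E}[X1_A],
\qquad A \in \mathcal{G}.
```

Y is \mathcal{G}-measurable and integrable as a difference of two such random variables.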