# 2.3.2 - Moments


Many of the elementary properties of the multinomial can be derived by decomposing $$X$$ as the sum of iid random vectors,

$$X=Y_1+\cdots+Y_n$$

where each $$Y_i \sim Mult\left(1, \pi\right)$$. In this decomposition, $$Y_i$$ represents the outcome of the $$i$$th trial; it's a vector with a 1 in position $$j$$ if $$E_j$$ occurred on that trial and 0s in all other positions. The elements of $$Y_i$$ are correlated Bernoulli random variables. For example, with $$k=2$$ possible outcomes on each trial, $$Y_i=(\# E_1,\# E_2)$$ on the $$i$$th trial, and the possible values of $$Y_i$$ are

(1, 0) with probability $$\pi_1$$,

(0, 1) with probability $$\pi_2 = 1− \pi_1$$.
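This decomposition is easy to check numerically: summing $$n$$ single-trial multinomial vectors gives a draw from $$Mult(n, \pi)$$. A minimal sketch with NumPy (the variable names are ours, not from the text):

```python
# Sketch of the decomposition X = Y_1 + ... + Y_n for k = 2.
import numpy as np

rng = np.random.default_rng(0)
n, pi = 1000, np.array([0.3, 0.7])

# Each Y_i ~ Mult(1, pi): a one-hot vector marking which outcome occurred.
Y = rng.multinomial(1, pi, size=n)   # shape (n, 2); each row is (1, 0) or (0, 1)
X = Y.sum(axis=0)                    # X ~ Mult(n, pi)

print(X, X.sum())                    # the two counts always sum to n
```

Each row of `Y` is exactly one of the two vectors listed above, occurring with probabilities $$\pi_1$$ and $$\pi_2$$.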

Because the individual elements of $$Y_i$$ are Bernoulli, the mean of $$Y_i$$ is $$\pi = \left(\pi_1, \pi_2\right)$$, and its covariance matrix is

$$\begin{bmatrix} \pi_1(1-\pi_1) & -\pi_1\pi_2 \\ -\pi_1\pi_2 & \pi_2(1-\pi_2) \end{bmatrix}$$

Establishing the covariance term (off-diagonal element) requires a bit more work, but note that intuitively it should be negative because exactly one of either $$E_1$$ or $$E_2$$ must occur.
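For $$j \neq l$$, the product $$Y_{ij} Y_{il}$$ is always zero, because only one event can occur on a single trial. The covariance then follows in one line:

$$\mathrm{Cov}(Y_{ij}, Y_{il}) = E(Y_{ij} Y_{il}) - E(Y_{ij})E(Y_{il}) = 0 - \pi_j \pi_l = -\pi_j \pi_l,$$

which is negative whenever both probabilities are positive, matching the intuition above.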

More generally, with $$k$$ possible outcomes, the mean of $$Y_i$$ is $$\pi = \left(\pi_1, \dots , \pi_k\right)$$, and the covariance matrix is

$$\begin{bmatrix} \pi_1(1-\pi_1) & -\pi_1\pi_2 & \cdots & -\pi_1\pi_k \\ -\pi_1\pi_2 & \pi_2(1-\pi_2) & \cdots & -\pi_2\pi_k \\ \vdots & \vdots & \ddots & \vdots \\ -\pi_1\pi_k & -\pi_2\pi_k & \cdots & \pi_k(1-\pi_k) \end{bmatrix}$$

Finally, returning to $$X=Y_1+\cdots+Y_n$$ in full generality, we have that

$$E(X)=n\pi=(n\pi_1,\ldots,n\pi_k)$$

with covariance matrix

$$\begin{bmatrix} n\pi_1(1-\pi_1) & -n\pi_1\pi_2 & \cdots & -n\pi_1\pi_k \\ -n\pi_1\pi_2 & n\pi_2(1-\pi_2) & \cdots & -n\pi_2\pi_k \\ \vdots & \vdots & \ddots & \vdots \\ -n\pi_1\pi_k & -n\pi_2\pi_k & \cdots & n\pi_k(1-\pi_k) \end{bmatrix}$$
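Both moments can be verified by simulation: the sample mean of many multinomial draws should approach $$n\pi$$, and the sample covariance should approach $$n(\mathrm{diag}(\pi) - \pi\pi^T)$$, which is the matrix above written compactly. A sketch (variable names are ours):

```python
# Empirical check of E(X) = n*pi and Cov(X) = n*(diag(pi) - pi pi^T).
import numpy as np

rng = np.random.default_rng(1)
n, pi = 50, np.array([0.2, 0.3, 0.5])

# Many independent draws of X ~ Mult(n, pi).
samples = rng.multinomial(n, pi, size=200_000)

mean_hat = samples.mean(axis=0)                   # should be close to n*pi
cov_hat = np.cov(samples, rowvar=False)           # should be close to cov_theory
cov_theory = n * (np.diag(pi) - np.outer(pi, pi))

print(np.round(mean_hat, 2))
print(np.round(cov_hat - cov_theory, 2))          # entries near zero
```

Note the off-diagonal entries of `cov_hat` come out negative, as the formula predicts.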

Because the elements of $$X$$ are constrained to sum to $$n$$, this covariance matrix is singular. If all the $$\pi_j$$s are positive, then the covariance matrix has rank $$k-1$$. Intuitively, this makes sense since the last element $$X_k$$ can be replaced by $$n − X_1− \dots − X_{k−1}$$; there are really only $$k-1$$ "free" elements in $$X$$. If some elements of $$\pi$$ are zero, the rank drops by one for every zero element.
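The rank statements are easy to confirm numerically. A small illustration (the specific $$\pi$$ vectors are ours):

```python
# Rank of the multinomial covariance matrix n*(diag(pi) - pi pi^T).
import numpy as np

n = 10

pi = np.array([0.2, 0.3, 0.5])           # all positive: rank should be k - 1 = 2
cov = n * (np.diag(pi) - np.outer(pi, pi))
print(np.linalg.matrix_rank(cov))        # 2

pi0 = np.array([0.4, 0.6, 0.0])          # one zero element drops the rank by one
cov0 = n * (np.diag(pi0) - np.outer(pi0, pi0))
print(np.linalg.matrix_rank(cov0))       # 1
```

Singularity also shows directly: each row of `cov` sums to $$n\pi_j(1 - \sum_l \pi_l) = 0$$, so the rows are linearly dependent.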
