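Probability Theory (1): What's on the Final?

I don't want to keep a pile of photos and dig through my camera roll to review, and anything I write on paper gets lost, so why not write it up on 摸鱼学导论 and squeeze a post out of it.

The midterm that suddenly appeared this semester scared me. Not all of lzx's problems are easy unless you memorize them.

I'm not sure why, but here is a disclaimer: for reference only. If I wrote something down and it wasn't tested, or something was tested and I didn't write it down, that's not on me. Since the midterm I haven't missed a single 8 a.m. class, so I really did try my best (

Also, some of my usual notation differs from lzx's, though mine should be the more standard / common one. For example, I take distribution functions to be right-continuous, \(F_X(x) = \mathbb P( X \leq x)\), and I write the \(n\)-dimensional normal distribution as \(N_n(\mu, \Sigma)\) rather than \(N(\vec{a},B)\), and so on.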

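Multivariate Normal Distribution

I think the following three problems are all pretty trivial.

The Classic Statistical Inference Problem

Rumor has it this shows up on the final with probability \(\frac 2 3\), which is very probability-theory (

Let \(X_1, X_2, ..., X_n\) be i.i.d. \(\sim N(a,\sigma^2)\), and write \(\bar X = \frac{1}{n} \sum_{i=1} ^n X_i\), \(S_n ^2 =\frac{1}{n-1} \sum_{i=1} ^n (X_i - \bar X)^2\). Prove:

  • \(\bar X\) and \(S_n ^2\) are independent;

  • \(\bar X \sim N(a, \frac{\sigma^2}{n})\);

  • \(\frac{(n-1)S_n ^2}{\sigma ^2} \sim \chi^2 (n-1)\).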

Proof:

  • Take the multivariate normal vector \(X = (X_1 , X_2 ,..., X_n) ^T \sim N_n(\mu, \Sigma)\), where \(\mu = (a,a,...,a)^T\) and \(\Sigma = \sigma ^2 I_{n \times n}\).

    Let \(C = \begin{bmatrix} \frac{1}{\sqrt n} & \frac{1}{\sqrt n} & ... & \frac{1}{\sqrt n} \\ * & * &... & * \\ ... & ... & ... & ... \\ * & * & ... & * \end{bmatrix}\), choosing the remaining entries so that \(C\) is an orthogonal matrix, i.e. \(C^TC = CC^T = I_{n \times n}\).

    Then \(Y = (Y_1, Y_2,...,Y_n)^T = C X \sim N_n (C \mu , C \Sigma C^T)\). Rows \(2,...,n\) of \(C\) are orthogonal to the first row, so \(C \mu = (\sqrt n a ,0,0,...,0)^T\), and \(C\Sigma C^T = \sigma^2 I_{n \times n}\). Hence \(Y_1 , Y_2, ..., Y_n\) are mutually independent, with \(Y_1 \sim N(\sqrt n a,\sigma^2)\) and \(Y_2,Y_3,...,Y_n\) i.i.d. \(\sim N(0,\sigma^2)\).

    Express \(\bar X\) and \(S_n^2\) in terms of \(\{Y_i \}\):

    \[\bar X = \frac 1 n \sum_{i=1}^n X_i =\frac{1}{ \sqrt n} Y_1\]

    \[(n-1)S_n^2 = \sum_{i=1} ^n X_i^2 - n \bar X ^2 = X^TX - n \bar X^2 = X^TC^TCX - n \bar X^2 =Y^TY - Y_1 ^2 = \sum_{i=1}^n Y_i^2 - Y_1 ^2 = \sum_{i=2}^n Y_i ^2 \]

    Since \(Y_1 , Y_2, ..., Y_n\) are mutually independent, \(\frac{1}{\sqrt n} Y_1\) and \(\sum_{i=2} ^n Y_i ^2\) are independent, and therefore \(\bar X\) and \(S_n^2\) are independent.

  • By the linear-transformation property of the multivariate normal, \(\bar X = \frac 1 n \sum_{i=1} ^n X_i = \begin{bmatrix} \frac 1 n & \frac 1 n & ... & \frac 1 n \end{bmatrix} \begin{bmatrix} X_1 \\ X_2 \\ ... \\ X_n \end{bmatrix} = \frac 1 n \mathbb 1_n^T X \sim N(\frac 1 n \mathbb 1_n^T \mu, \frac 1 n \mathbb 1^T_n \Sigma \frac 1 n \mathbb 1_n ) = N(a,\frac{\sigma^2}{n})\).

  • The proof of the first part already gives \((n-1)S_n^2 = \sum_{i=2}^n Y_i ^2\) with \(Y_2,Y_3,...,Y_n\) i.i.d. \(\sim N(0,\sigma^2)\), so:

    \[\frac{Y_2}{\sigma},\frac{Y_3}{\sigma},...,\frac{Y_n}{\sigma} \ \text{i.i.d.} \sim N(0,1)\]

    \[\frac{(n-1)S_n^2}{\sigma^2} = \sum_{i=2}^n (\frac{Y_i}{\sigma})^2 \sim \chi^2(n-1)\]

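Remark: This is Theorem 3.5.11 in the Chinese textbook. The book's approach is the same as what I learned in my statistical inference course, but it is not as clean to write as this one. lzx also used this method in class, so it is probably the best thing to write on the exam.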
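As a sanity check on the three claims above, here is a minimal Monte Carlo sketch using only the Python standard library. The parameter values (`n=5`, `a=1`, `sigma=2`), the seed, and the helper names are my own assumptions for illustration. Note that a near-zero sample correlation is only a necessary symptom of independence, not a proof of it.

```python
import math
import random

def sample_mean_and_var(n, a, sigma, rng):
    """Draw X_1..X_n i.i.d. N(a, sigma^2) and return (X-bar, S_n^2)."""
    xs = [rng.gauss(a, sigma) for _ in range(n)]
    xbar = sum(xs) / n
    s2 = sum((x - xbar) ** 2 for x in xs) / (n - 1)
    return xbar, s2

def simulate(n=5, a=1.0, sigma=2.0, trials=40000, seed=0):
    rng = random.Random(seed)
    pairs = [sample_mean_and_var(n, a, sigma, rng) for _ in range(trials)]
    xbars = [p[0] for p in pairs]
    qs = [(n - 1) * p[1] / sigma ** 2 for p in pairs]  # should look chi^2(n-1)

    def mean(values):
        return sum(values) / len(values)

    mx, mq = mean(xbars), mean(qs)
    vx = mean([(x - mx) ** 2 for x in xbars])
    vq = mean([(q - mq) ** 2 for q in qs])
    cov = mean([(x - mx) * (q - mq) for x, q in zip(xbars, qs)])
    return {
        "mean_xbar": mx,   # theory: a
        "var_xbar": vx,    # theory: sigma^2 / n
        "mean_q": mq,      # theory: n - 1
        "var_q": vq,       # theory: 2 (n - 1)
        "corr": cov / math.sqrt(vx * vq),  # theory: 0
    }

result = simulate()
```

With these settings the theoretical targets are \(a = 1\), \(\sigma^2/n = 0.8\), \(n-1 = 4\) and \(2(n-1) = 8\); the simulated values should land close to them.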

Computation with the Multivariate Normal Distribution

Let \(X = (X_1,X_2,X_3)^T \sim N_3(\mu, \Sigma)\) be a \(3\)-dimensional normal vector with \(\mu = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}\) and \(\Sigma = \begin{bmatrix} 4 & 2 & -2 \\ 2 & 10 &2 \\ -2 &2 &5 \end{bmatrix}\). Find:

  • \(\mathbb E(X_1 - 2X_2 + X_3 | X_1 + X_2 + X_3)\)
  • \(Var(X_1 - 2X_2 +X_3 | X_1 + X_2 + X_3)\)

Solution:

\[\begin{bmatrix} U \\ V \end{bmatrix}= \begin{bmatrix}X_1 - 2X_2 +X_3 \\ X_1 +X_2+X_3 \end{bmatrix} = \begin{bmatrix} 1 & -2 & 1 \\ 1 & 1 &1 \end{bmatrix} \begin{bmatrix} X_1 \\ X_2 \\ X_3 \end{bmatrix} = CX \sim N_2(C\mu, C\Sigma C^T)\]

where

\[C\mu = \begin{bmatrix} 1 & -2 & 1 \\ 1 & 1 &1 \end{bmatrix} \begin{bmatrix} 1 \\ 2\\ 3 \end{bmatrix} = \begin{bmatrix} 0 \\6 \end{bmatrix}\]

\[C\Sigma C^T = \begin{bmatrix} 1 & -2 & 1 \\ 1 & 1 &1 \end{bmatrix} \begin{bmatrix} 4 & 2 & -2 \\ 2 & 10 &2 \\ -2 &2 &5 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ -2 & 1 \\ 1 & 1 \end{bmatrix} =\begin{bmatrix} 29 & -19 \\ -19 & 23\end{bmatrix}\]

For a bivariate normal vector \(\begin{bmatrix} U \\ V \end{bmatrix} \sim N_2(\begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}, \begin{bmatrix} \sigma_1 ^2 & r\sigma_1 \sigma_2 \\ r\sigma_1 \sigma_2 & \sigma_2 ^2 \end{bmatrix})\) we have

\[\mathbb E(U|V=v) = \mu_1 + r \frac{\sigma_1}{\sigma_2} (v-\mu_2)\]

\[Var(U|V) = \sigma_1 ^2 (1-r^2)\]

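Here \(r \frac{\sigma_1}{\sigma_2} = \frac{Cov(U,V)}{Var(V)} = -\frac{19}{23}\) and \(\sigma_1^2(1-r^2) = 29 - \frac{19^2}{23}\), so substituting gives \(\mathbb E(U|V) = -\frac{19}{23}(X_1 + X_2 + X_3 -6)\) and \(Var(U|V) = \frac{306}{23}\).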
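The arithmetic above is easy to get wrong by hand, so here is a small exact check with `fractions.Fraction`. The helper names (`matmul`, `transpose`, `cond_mean`) are made up for this sketch; the matrices are the ones from the problem statement.

```python
from fractions import Fraction

def matmul(A, B):
    """Exact matrix product of two lists-of-rows."""
    return [[sum(Fraction(a) * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

mu = [[1], [2], [3]]
Sigma = [[4, 2, -2], [2, 10, 2], [-2, 2, 5]]
C = [[1, -2, 1], [1, 1, 1]]  # rows give U = X1 - 2 X2 + X3 and V = X1 + X2 + X3

mean_uv = matmul(C, mu)                          # C mu
cov_uv = matmul(matmul(C, Sigma), transpose(C))  # C Sigma C^T

# Bivariate normal conditioning: slope = Cov(U,V)/Var(V),
# conditional variance = Var(U) - Cov(U,V)^2 / Var(V).
slope = cov_uv[0][1] / cov_uv[1][1]
cond_var = cov_uv[0][0] - cov_uv[0][1] ** 2 / cov_uv[1][1]

def cond_mean(v):
    """E(U | V = v) for the example above."""
    return mean_uv[0][0] + slope * (Fraction(v) - mean_uv[1][0])
```

Evaluating `cov_uv`, `slope` and `cond_var` reproduces the matrix \(C\Sigma C^T\), the coefficient \(-\frac{19}{23}\) and the conditional variance \(\frac{306}{23}\) obtained above.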

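Remark 1: I don't know whether I computed this correctly. The exam will change the numbers anyway, so whatever (

Remark 2: Actually I don't like this approach; leaning on the properties of the bivariate normal is not fundamental enough. I thought lzx would cover the following theorem in class, but he didn't, which was disappointing. So let me write down a theorem that holds for any \(p_1 + p_2\)-dimensional normal distribution. Time to show what I've got.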

Lemma: Let \(\begin{bmatrix}X_1 \\ X_2 \end{bmatrix} \sim N_{p_1 + p_2}(\begin{bmatrix}\mu_1 \\ \mu_2 \end{bmatrix}, \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}) = N_p(\mu, \Sigma)\) be a \(p_1 + p_2\)-dimensional normal vector, where \(X_1,X_2\) are themselves \(p_1\)- and \(p_2\)-dimensional normal vectors and \(\Sigma_{22}\) is invertible. Then the conditional distribution of \(X_1\) is:

\[X_1|_{X_2 = x_2} \sim N(\mu_1 +\Sigma_{12} \Sigma_{22}^{-1} (x_2 - \mu_2),\Sigma_{11} - \Sigma_{12 } \Sigma_{22}^{-1} \Sigma_{21})\]

Proof:

Let \(C = \begin{bmatrix} I_{p_1 \times p_1} & -\Sigma_{12} \Sigma_{22}^{-1} \\ 0_{p_2 \times p_1} & I_{p_2 \times p_2}\end{bmatrix}\) and set \(\begin{bmatrix}Y_1 \\ Y_2 \end{bmatrix} = C \begin{bmatrix} X_1 - \mu_1 \\ X_2 - \mu_2 \end{bmatrix} \sim N_{p_1 + p_2}(0,C\Sigma C^T)\). Note that:

\[C\Sigma C^T = \begin{bmatrix} I_{p_1 \times p_1} & -\Sigma_{12} \Sigma_{22}^{-1} \\ 0_{p_2 \times p_1} & I_{p_2 \times p_2}\end{bmatrix} \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix} \begin{bmatrix} I_{p_1 \times p_1} & 0_{p_1 \times p_2} \\ -\Sigma_{22}^{-1} \Sigma_{21} & I_{p_2 \times p_2}\end{bmatrix} = \begin{bmatrix} \Sigma_{11} - \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21} & 0 \\ 0 & \Sigma_{22} \end{bmatrix}\]

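Hence:

\[Y_1 = (X_1 - \mu_1) - \Sigma_{12} \Sigma_{22}^{-1} (X_2 - \mu_2) \sim N_{p_1} (0,\Sigma_{11}-\Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21})\]

Since the off-diagonal blocks of \(C\Sigma C^T\) vanish, \(Y_1\) is independent of \(Y_2 = X_2 - \mu_2\). Conditional on \(X_2 = x_2\) we have \(X_1 = \mu_1 + \Sigma_{12} \Sigma_{22}^{-1}(x_2 - \mu_2) + Y_1\), so the conditional distribution of \(X_1\) is \(X_1|_{X_2 = x_2} \sim N(\mu_1 +\Sigma_{12} \Sigma_{22}^{-1} (x_2 - \mu_2),\Sigma_{11} - \Sigma_{12 } \Sigma_{22}^{-1} \Sigma_{21})\), as claimed.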

Remark 3: Supposedly this problem is worth 10 points, so maybe it will show up as half a question?

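A Counterexample for the Multivariate Normal Distribution

We know that \(X = (X_1 , X_2,...,X_n) ^T \sim N_n(\mu, \Sigma) \iff \forall a \in \mathbb R ^n, Y= a^TX \sim N(a^T \mu, a^T \Sigma a)\).

So is it true that, for arbitrary univariate normal variables \(X_k \sim N(\mu_k , \sigma_k ^2)\), one always has \(X = (X_1, X_2,...,X_n)^T \sim N_n(\mu, \Sigma)\)? In other words, does a random vector assembled from any \(n\) univariate normal random variables necessarily follow an \(n\)-dimensional normal distribution?

Solution: The answer is no. We construct a counterexample:

Let \(X,Y\) be i.i.d. \(\sim N(0,1)\) and define \(Z = \begin{cases} |Y| , \quad X \geq 0 \\ -|Y| \quad X <0\end{cases}\). Then \(Z \sim N(0,1)\). This is because, using the independence of \(X\) and \(Y\):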

\[\begin{aligned} \forall x >0 , F_Z(x) = \mathbb P(Z \leq x )& =\mathbb P(Z \leq x, X <0) + \mathbb P(Z \leq x , X \geq 0) \\ &=\mathbb P(X < 0)\mathbb P(Z\leq x | X<0) +\mathbb P(X \geq 0 )\mathbb P(Z\leq x | X \geq 0) \\&= \frac 1 2 +\frac 1 2 \mathbb P(Z\leq x|X\geq 0) \\&= \frac 1 2 + \frac 1 2 \mathbb P(-x \leq Y \leq x) \\ &=\frac 1 2 + \mathbb P(0 < Y \leq x) \\ &= \mathbb P(Y \leq x) = \Phi(x) \end{aligned}\]

Here \(\Phi(x)\) denotes the standard normal distribution function. Similarly \(F_Z(x) = \Phi(x)\) for all \(x \leq 0\), so \(Z \sim N(0,1)\).

We now show that although \(Z,Y \sim N(0,1)\), the vector \(\begin{bmatrix}Z \\ Y \end{bmatrix}\) is not bivariate normal. If it were, then \(Z+Y\) would be normally distributed, so \(\mathbb P(Z+Y=0)\) would be \(0\) (or \(1\) in the degenerate case of zero variance). In fact,

\[\begin{aligned} \mathbb P(Z+Y=0)& =\mathbb P(Z+Y=0,X\geq 0)+\mathbb P(Z+Y=0 ,X<0)\\ &=\mathbb P(Z+Y=0|X\geq 0)\mathbb P(X \geq 0)+ \mathbb P(Z+Y=0 |X<0) \mathbb P(X<0) \\&=\frac 1 2 \mathbb P(|Y|+Y=0|X\geq 0)+\frac 1 2 \mathbb P(-|Y|+Y=0 |X<0) \\&=\frac 1 2 \mathbb P(Y \leq 0)+ \frac 1 2 \mathbb P(Y>0) \\&= \frac 1 2 \end{aligned}\]

This is a contradiction. Hence, although \(Z,Y \sim N(0,1)\), the vector \(\begin{bmatrix}Z \\ Y \end{bmatrix}\) is not bivariate normal, and the counterexample is complete.
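The construction is simple enough to simulate. The sketch below (standard library only; the seed, sample size and function name are arbitrary choices of mine) draws \((X, Y)\), builds \(Z\) as defined above, and estimates \(\mathbb P(Z+Y=0)\) together with the first two moments of \(Z\). The event \(Z+Y=0\) is detected exactly in floating point because it occurs precisely when \(Z = -Y\).

```python
import random

def simulate_counterexample(trials=20000, seed=1):
    rng = random.Random(seed)
    zero_hits = 0
    z_values = []
    for _ in range(trials):
        x = rng.gauss(0.0, 1.0)
        y = rng.gauss(0.0, 1.0)
        z = abs(y) if x >= 0 else -abs(y)  # Z as defined in the text
        z_values.append(z)
        if z + y == 0.0:                   # happens exactly when z == -y
            zero_hits += 1
    mean_z = sum(z_values) / trials
    var_z = sum((z - mean_z) ** 2 for z in z_values) / trials
    return zero_hits / trials, mean_z, var_z

p_zero, mean_z, var_z = simulate_counterexample()
```

The estimate `p_zero` should sit near \(\frac 1 2\), while `mean_z` and `var_z` should sit near \(0\) and \(1\), consistent with \(Z \sim N(0,1)\) marginally but \(Z+Y\) having an atom at \(0\).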

Remark: Supposedly this one is also worth 10 points. I wonder if it will be combined with the previous problem into a single question.

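Convergence Theorems

Relations among the Four Modes of Convergence

These are really six small problems. (Note: lzx usually writes convergence in distribution as \(X_n \stackrel{d}{\rightarrow} X\), but I usually call it weak convergence and write \(X_n \stackrel{w}{\rightarrow} X\), or \(X_n \Rightarrow X\) in Durrett's notation.)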

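  • \(X_n \stackrel{a.s.}{\rightarrow} X\) implies \(X_n \stackrel{P}{\rightarrow} X\)

    Proof:

    We need a lemma: \(X_n \stackrel{P}{\rightarrow} X \iff\) for every subsequence \(\{X_{nk} \}\) of \(\{X_n \}\) there is a further subsequence \(\{X_{nkj} \}\) of \(\{X_{nk}\}\) with \(X_{nkj} \stackrel{a.s.}{\rightarrow} X\).

    If \(X_n \stackrel{a.s.}{\rightarrow} X\), then clearly for every subsequence \(\{X_{nk} \}\) of \(\{X_n \}\), every further subsequence \(\{X_{nkj} \}\) also satisfies \(X_{nkj} \stackrel{a.s.}{\rightarrow} X\), so \(X_n \stackrel{P}{\rightarrow} X\) by the lemma. We now prove the lemma.

    • (\(\Rightarrow\)) Fix a sequence \(\{\varepsilon_{k}\} \downarrow 0\). Any subsequence \(\{X_{nk} \}\) of \(\{X_n \}\) also converges to \(X\) in probability, so we can choose indices \(n(m_k) > n(m_{k-1})\) with \(\mathbb P(|X_{n(m_k) }-X| > \varepsilon_k) < \frac{1}{2^k}\) for every \(k \in \mathbb Z_+\). Then \(\sum_{k=1}^{+\infty} \mathbb P(|X_{n(m_k)}-X| > \varepsilon_k) < \sum_{k=1}^{+\infty} \frac{1}{2^k} = 1 < +\infty\), and the Borel-Cantelli lemma gives \(\mathbb P(|X_{n(m_k)}-X| > \varepsilon_k \quad i.o.)=0\), i.e. \(\{X_{n(m_k)}\}\stackrel{a.s.}{\rightarrow} X\) is the required further subsequence. Since \(\{X_{nk} \}\) was arbitrary, this direction holds.

    • (\(\Leftarrow\)) Suppose every subsequence \(\{X_{nk} \}\) of \(\{X_n\}\) has a further subsequence \(\{X_{nkj}\} \stackrel{a.s.}{\rightarrow} X\). Fix \(\delta >0\) and consider the numbers \(y_n = \mathbb P(|X_n-X| > \delta)\). Along such a further subsequence the indicators \(\mathbb 1_{\{|X_{nkj} - X| > \delta\}}\) tend to \(0\) almost surely and are bounded by \(1\), so the bounded convergence theorem gives \(y_{nkj} \to 0\). In other words, every subsequence \(\{y_{nk}\}\) of \(\{y_n \}\) has a further subsequence with \(\{y_{nkj}\} \to 0\), hence \(\{y_n\} \to 0\), i.e. \(\lim_{n \to \infty} \mathbb P(|X_n - X| > \delta) =0\) and \(X_n \stackrel{P}{\rightarrow} X\).

      (Note: otherwise \(\{y_n \}\) does not converge to \(0\); since \(y_n \in [0,1]\), it has a subsequence \(\{y_{ns}\}\) converging to some \(c > 0\), so every further subsequence of \(\{y_{ns}\}\) also converges to \(c \neq 0\), contradicting the assumption.)

    This proves the lemma, and with it \(X_n \stackrel{a.s.}{\rightarrow} X\) implies \(X_n \stackrel{P}{\rightarrow} X\).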

  • \(X_n \stackrel{P}{\rightarrow} X\) does not imply \(X_n \stackrel{a.s.}{\rightarrow}X\)

    Counterexample: take \(\Omega=[0,1), \mathcal F = \mathcal B (\Omega)\), and let the probability measure \(\mathbb P\) be Lebesgue measure on \(\Omega\).

    Let \(X_{11}(\omega) = 1\), \(X_{21}(\omega) = \begin{cases}1 \quad \omega \in [0,\frac 1 2) \\ 0 \quad \omega \in [\frac 1 2 , 1) \end{cases}, X_{22}(\omega) = \begin{cases}0 \quad \omega \in [0,\frac 1 2) \\ 1 \quad \omega \in [\frac 1 2 , 1) \end{cases}\), and so on.

    That is, for every \(k \in \mathbb Z^+\), let \(X_{ki} (\omega) = \begin{cases}1 \quad \omega \in [\frac{i-1}{k} ,\frac i k) \\ 0 \quad \text{otherwise} \end{cases} , i = 1,2,...,k\).

    Then \(\{ X_{ki}\}\) is a countable family of random variables. Rearrange it as \(\{ Y_i \}_{i=1} ^{+\infty}\) with \(Y_1 = X_{11}, Y_2 = X_{21}, Y_{3} = X_{22},...\) . We show that \(Y_n \stackrel{P}{\rightarrow} 0\) while \(Y_n \stackrel{a.s.}{\rightarrow} 0\) fails.

    Indeed, for every \(\varepsilon \in (0,1)\) we have \(\mathbb P(|X_{ki} - 0| > \varepsilon) = \mathbb P(X_{ki} =1) = \frac 1 k\) (and the probability is \(0\) for \(\varepsilon \geq 1\)). The block index \(k\) of \(Y_n\) tends to infinity with \(n\), so \(\lim _{n \to \infty} \mathbb P(|Y_n - 0 | > \varepsilon) =0\), and therefore \(Y_n \stackrel{P}{\rightarrow} 0\).

    However, for every \(\omega \in [0,1)\) and every \(k \in \mathbb Z^+\) there exist \(n_1, n_2 > k\) with \(Y_{n_1} (\omega) =1, Y_{n_2} (\omega) = 0\), because each block \(k \geq 2\) contains exactly one index \(i\) with \(\omega \in [\frac{i-1}{k}, \frac i k)\). Hence \(\{Y_n(\omega) \}_{n=1} ^{+\infty}\) does not converge to \(0\) at any point \(\omega\), so \(\mathbb P(\lim_{n \to \infty} |Y_n -0 | =0 )=0\) and \(Y_n \stackrel{a.s.}{\rightarrow} 0\) fails.

  • \(X_n \stackrel{L^p}{\rightarrow} X\) implies \(X_n \stackrel{P}{\rightarrow} X\)

    Proof: By Markov's inequality, \(\mathbb P(|X_n - X| > \varepsilon) \leq \frac{\mathbb E |X_n - X|^p}{\varepsilon ^p}\) holds for every \(\varepsilon >0\).

    When \(X_n \stackrel{L^p}{\rightarrow} X\) we have \(\lim _{n \to \infty} \mathbb E|X_n - X|^p =0\), i.e. for any \(\delta >0 , \varepsilon >0\) there is \(N \in \mathbb Z^+\) such that \(\mathbb E|X_n - X|^p \leq \delta \varepsilon ^p\) for \(n>N\). Then for \(n > N\) we get \(\mathbb P(|X_n - X| > \varepsilon)\leq \frac{\mathbb E |X_n - X|^p}{\varepsilon ^p} \leq \frac{\delta \varepsilon ^p}{\varepsilon ^p} = \delta\).

    Since \(\delta >0\) is arbitrary, \(\lim_{n \to \infty}\mathbb P(|X_n - X| > \varepsilon) =0\), i.e. \(X_n \stackrel{P}{\rightarrow} X\), as claimed.

  • \(X_n \stackrel{P}{\rightarrow} X\) does not imply \(X_n \stackrel{L^p }{\rightarrow} X\)

    Counterexample: take \(\Omega=[0,1), \mathcal F = \mathcal B (\Omega)\), and let the probability measure \(\mathbb P\) be Lebesgue measure on \(\Omega\).

    Let \(X_n( \omega) = \begin{cases}2^n \quad \omega \in (0,\frac 1 n) \\ 0 \qquad \text{otherwise} \end{cases}\). Then for any \(\varepsilon > 0\) and all \(n\) with \(2^n > \varepsilon\), \(\mathbb P(|X_n - 0| > \varepsilon) = \mathbb P(X_n = 2^n ) = \frac 1 n\), so \(X_n \stackrel{P}{\rightarrow} 0\).

    However \(\mathbb E|X_n -0|^p = \mathbb E |X_n|^p = \frac{2^{np}}{n} \to \infty\), so \(X_n \stackrel{L ^p }{\rightarrow}0\) fails.

  • \(X_n \stackrel{P}{\rightarrow} X\) implies \(X_n \stackrel{w}{\rightarrow} X\)

    Proof: Let \(F_n\) be the distribution function of \(X_n\) and \(F\) that of \(X\). For any continuity point \(x \in C(F)\) of \(F\) and any \(y <x<z\) we have:

    \[\begin{aligned} \{\omega : X(\omega) \leq y \} & = \{\omega : X(\omega) \leq y , X_n (\omega) \leq x\} \cup \{\omega: X(\omega) \leq y , X_n(\omega) > x \} \\ & \subseteq \{\omega : X_n (\omega) \leq x\} \cup \{\omega: |X(\omega)- X_n(\omega)| > x-y \} \end{aligned}\]

    \[\begin{aligned} \{\omega : X_n(\omega) \leq x \} & = \{\omega : X_n(\omega) \leq x , X (\omega) \leq z\} \cup \{\omega: X_n(\omega) \leq x , X(\omega) > z \} \\ & \subseteq \{\omega : X (\omega) \leq z\} \cup \{\omega: |X(\omega)- X_n(\omega)| > z-x \} \end{aligned}\]

    Taking probabilities of these events gives:

    \[F(y) = \mathbb P(X \leq y) \leq \mathbb P(X_n \leq x) + \mathbb P(|X_n - X | > x-y) = F_n(x)+\mathbb P(|X_n - X | > x-y)\]

    \[F_n(x) = \mathbb P(X_n \leq x) \leq \mathbb P(X \leq z)+ \mathbb P(|X_n - X | > z-x)=F(z)+\mathbb P(|X_n - X| > z-x)\]

    Letting \(n \to \infty\) in both inequalities and using \(X_n \stackrel{P}{\rightarrow} X\) gives:

    \[F(y) \leq \lim \inf F_n(x) \leq \lim \sup F_n(x) \leq F(z)\]

    Letting \(y \uparrow x\) and \(z \downarrow x\), and using the continuity of \(F\) at \(x\), we get \(\lim _{n\to \infty} F_n(x) = F(x)\) at every \(x \in C(F)\), so \(X_n \stackrel{w}{\rightarrow} X\).

  • \(X_n \stackrel{w}{\rightarrow} X\) does not imply \(X_n \stackrel{P}{\rightarrow} X\)

    Counterexample: let \(\{X_n \}\) and \(X\) be i.i.d. with the common distribution \(\begin{pmatrix}-1 & 1 \\ \frac 1 2 & \frac 1 2 \end{pmatrix}\). Clearly \(X_n \stackrel{w}{\rightarrow} X\), since all the distribution functions coincide.

    Take \(\varepsilon_0 = 1\). Then \(\mathbb P(|X_n - X| > \varepsilon_0) = \mathbb P(X=1,X_n = -1) + \mathbb P(X=- 1 ,X_n = 1) = \frac 1 4 + \frac 1 4 = \frac 1 2\) for every \(n \in \mathbb Z^+\), so \(\lim _{n \to \infty} \mathbb P(|X - X_n| > \varepsilon_0) = \frac 1 2 \neq 0\) and \(X_n \stackrel{P}{\rightarrow} X\) fails.
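The sliding-indicator counterexample above (convergence in probability without almost sure convergence) can be traced exactly for one sample point. This is a small sketch with exact rational arithmetic; the function names and the choice \(\omega = \frac 1 3\) are mine, not part of the original notes.

```python
from fractions import Fraction

def block_and_position(n):
    """Map the flat index n = 1, 2, 3, ... to (k, i) with Y_n = X_{k,i}."""
    k = 1
    while n > k:
        n -= k
        k += 1
    return k, n

def Y(n, omega):
    """Y_n(omega): indicator of [(i-1)/k, i/k) for the block (k, i) of index n."""
    k, i = block_and_position(n)
    return 1 if Fraction(i - 1, k) <= omega < Fraction(i, k) else 0

def prob_Y_is_one(n):
    """P(Y_n = 1) under Lebesgue measure is the interval length 1/k."""
    k, _ = block_and_position(n)
    return Fraction(1, k)

omega = Fraction(1, 3)
path = [Y(n, omega) for n in range(1, 56)]  # indices 1..55 cover blocks k = 1..10
```

Along this path the value \(1\) appears exactly once in every block, so the path keeps returning to \(1\) no matter how far out we go, while `prob_Y_is_one(n)` shrinks like \(\frac 1 k\).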

Remark: I'm exhausted. Surely they won't test all of these at once, right (

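The Characteristic-Function Criterion for Weak Convergence

Prove that \(X_n \stackrel{w}{\rightarrow} X \iff f_n(t) \triangleq \mathbb E(e^{itX_n}) \to f(t) \triangleq \mathbb E(e^{itX}), \forall t \in \mathbb R\); that is, weak convergence is equivalent to pointwise convergence of the characteristic functions.

Proof: The direction (\(\Rightarrow\)) is immediate, since \(X_n \stackrel{w}{\rightarrow} X \iff \mathbb Eg(X_n) \to \mathbb Eg(X)\) for every bounded continuous function \(g(x)\), and \(|e^{itx}| \leq 1\) (apply this to the real and imaginary parts \(\cos tx\), \(\sin tx\)). The direction (\(\Leftarrow\)) is proved in two steps:

  • Let \(F_n\) be the distribution function of \(X_n\) and \(F\) that of \(X\).

    By Helly's first theorem, \(\{F_n \}\) has a subsequence \(\{F_{nk} \}\) that converges, at every continuity point of the limit, to a right-continuous nondecreasing function \(\hat F\) as \(k \to +\infty\). We show that \(\hat F\) is a distribution function. Right-continuity and monotonicity are already in hand, so it only remains to prove \(\hat F(-\infty)=0, \hat F(+\infty)=1\).

    Otherwise \(a = \hat F(+\infty) - \hat F(-\infty) \in [0,1)\). Since \(f(0)=1\) and \(f\) is continuous, for every sufficiently small \(\varepsilon >0\) (say \(\varepsilon < 1-a\)) there is \(r=r(\varepsilon) > 0\) with \(\frac{1}{2r} \int_{-r}^r f(t) dt > 1-\frac{1}{2} \varepsilon> a+ \frac{1}{2} \varepsilon\).

    Choose \(b\), with \(\pm b\) continuity points of \(\hat F\), such that \(\frac{1}{br} < \frac 1 4 \varepsilon\) and \(F_{nk} (b) - F_{nk} (-b) < a+ \frac 1 4 \varepsilon\) for all large \(k\). Then: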

    \[\begin{aligned} \frac{1}{2r} \int_{-r}^r f_{nk}(t) dt &= \frac{1}{2r} \int_{-r}^r dt \int_{\mathbb R} e^{itx} F_{nk}(dx) = \frac{1}{2r} (\int_{|x| \leq b} F_{nk}(dx) \int_{-r}^r e^{itx}dt +\int_{|x| > b} F_{nk}(dx) \int_{-r}^r e^{itx}dt )\\& \leq \int_{|x| \leq b} F_{nk}(dx)+ \frac{1}{2r} \int_{|x| > b} |\frac 2 x \sin rx| F_{nk}(dx) < \int_{|x| \leq b} F_{nk}(dx)+ \frac{1}{br} \leq a + \frac 1 2 \varepsilon \end{aligned}\]

    By the dominated convergence theorem, letting \(k \to \infty\) gives \(\frac{1}{2r} \int_{-r}^r f(t) dt \leq a + \frac 1 2 \varepsilon\), which contradicts the choice of \(r\). Hence \(\hat F\) is a distribution function. By Helly's second theorem, the characteristic functions of \(F_{nk}\) converge to \(\int e^{itx} \hat F(dx)\) as \(k \to +\infty\). Since also \(f_{nk}(t)\) converges pointwise to \(f(t)\) and \(f(t) = \int e^{itx} F(dx)\), the uniqueness of characteristic functions gives \(F = \hat F\).

  • Next we show \(F_n \stackrel{w}{\rightarrow} F\). Otherwise there is a continuity point \(x_0\) of \(F\) at which \(F_n(x_0)\) does not converge to \(F(x_0)\). Since the values lie in \([0,1]\), pick a subsequence \(F_{mk}\) with \(\lim_{k \to +\infty} F_{mk}(x_0) = F^*(x_0) \neq F(x_0)\). By Helly's first theorem and the previous step, \(\{F_{mk} \}\) has a further subsequence \(\{F_{mkj} \}\) with \(F_{mkj} \stackrel{w}{\rightarrow} F, j \to +\infty\). Then \(F(x_0) = \lim_{j\to +\infty} F_{mkj}(x_0) = \lim_{k\to +\infty} F_{mk}(x_0) = F^*(x_0)\), a contradiction.

In summary, \(X_n \stackrel{w}{\rightarrow} X \iff f_n(t) \to f(t)\) pointwise.
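To see the criterion in one concrete case where both sides are known in closed form, take \(\xi_j = \pm 1\) with probability \(\frac 1 2\) each (my choice of example, not from the notes). Then \(\mathbb E e^{it\xi_j} = \cos t\), so the normalized sum \(n^{-1/2}\sum_{j \leq n} \xi_j\) has characteristic function \(\cos(t/\sqrt n)^n\), which should approach \(e^{-t^2/2}\). This illustrates the statement only; it says nothing about the proof.

```python
import math

def rademacher_sum_cf(t, n):
    """Characteristic function of (xi_1 + ... + xi_n) / sqrt(n), xi_j = +-1 w.p. 1/2."""
    return math.cos(t / math.sqrt(n)) ** n

def normal_cf(t):
    """Characteristic function of N(0, 1)."""
    return math.exp(-0.5 * t * t)

# Gap |f_n(t) - f(t)| at t = 1.5 for growing n.
gaps = [abs(rademacher_sum_cf(1.5, n) - normal_cf(1.5)) for n in (10, 100, 1000, 10000)]
```

The gaps shrink roughly like \(1/n\) in this example, matching pointwise convergence of \(f_n\) to the standard normal characteristic function.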

Laws of Large Numbers

Kronecker Lemma

The following four real-analysis facts hold. Consider a real sequence \(\{c_n \}\) and the sequence \(\{S_n\}\) of its partial sums \(S_n = \sum_{i=1}^n c_i\).

  • If \(c_n \to c\), then \(\frac{S_n}{n} \to c\).
  • Let \(b_m=\sup \{|S_{m+k} - S_m| : k \geq 1 \}, b = \inf \{b_m:m \geq 1\}\). Then \(\sum_{n=1} ^\infty c_n\) converges \(\iff b=0\).
  • If \(\sum_{n=1} ^\infty \frac{c_n}{n}\) converges, then \(\frac{S_n}{n} \to 0\).
  • If \(\{a_n \} \uparrow +\infty\) with \(a_n >0\) for all \(n\), then convergence of \(\sum_{n=1} ^\infty \frac{c_n}{a_n}\) implies \(\frac{S_n}{a_n} \to 0\).

Proof:

  • For any \(\varepsilon >0\) there is \(N \in \mathbb Z_+\) such that \(|c_n - c| < \varepsilon\) for all \(n > N\), i.e. \(c-\varepsilon < c_n < c + \varepsilon\).

    For \(n >N\) write \(\frac 1 n S_n = \frac 1 n (\sum_{i=1}^N c_i + \sum_{i=N+1}^n c_i)\), where \((n-N)(c-\varepsilon)<\sum_{i=N+1}^n c_i <(n-N)(c+\varepsilon)\), and note that \(M = \sum_{i=1}^N c_i\) is a constant.

    Choose \(N' \in \mathbb Z_+\) such that \(\frac{|M-Nc|}{n} \leq \varepsilon\) for all \(n >N'\). Then for all \(n > \max \{N,N' \}\) we have \(|\frac 1 n S_n - c| \leq \frac{|M-Nc|}{n} + \frac{n-N}{n} \varepsilon < 2\varepsilon\). Since \(\varepsilon\) is arbitrary, \(\frac 1 n S_n \to c\), as claimed.

  • If \(b=0\), then for any \(\varepsilon >0\) there is \(n_1\) with \(b_{n_1} < \varepsilon\). Hence \(|S_{n_1 + k} - S_{n_1}| < \varepsilon\) for every \(k \geq 1\), so \(|S_m - S_n| < 2\varepsilon\) for all \(m,n > n_1\). This shows that \(S_n = \sum_{i=1}^n c_i\) is a Cauchy sequence, so \(\sum_{n=1} ^\infty c_n\) converges;

    Conversely, if \(\{S_n \}\) converges then it is a Cauchy sequence: for any \(\varepsilon >0\) there is \(N\in \mathbb Z_+\) such that \(|S_{n+k} - S_n | < \varepsilon\) for all \(n >N\) and all \(k \in \mathbb Z_+\), i.e. \(b_n \leq \varepsilon\) for all \(n > N\). Hence \(0\leq b = \inf \{b_m : m\geq 1\} \leq \varepsilon\), and since \(\varepsilon>0\) is arbitrary, \(b=0\).

  • The third statement is just the special case \(a_n = n\) of the fourth, so we prove the general case directly.

    Let \(t_n = \sum_{i=1}^n \frac{c_i}{a_i}\) with \(t_0 = 0\). Then \(\{t_n\} \to t\) is a convergent sequence and \(c_n = a_n (t_n - t_{n-1})\).

    Summation by parts gives \(S_n = \sum_{i=1}^n a_i(t_i - t_{i-1}) = a_n t_n - \sum_{i=1}^{n-1} (a_{i+1} - a_{i})t_{i}\), so \(\frac{S_n}{a_n} = t_n - \sum_{i=1}^{n-1} \frac{(a_{i+1} - a_{i})}{a_n}t_{i}\), where each weight \(\frac{a_{i+1} -a_i}{a_n} \geq 0\). The weights sum to \(\frac{a_n - a_1}{a_n} \to 1\), and each fixed weight tends to \(0\) as \(n \to \infty\), so the weighted average \(\sum_{i=1}^{n-1} \frac{a_{i+1} - a_i}{a_n} t_i \to t\). Hence \(\frac{S_n}{a_n} \to t - t = 0\) as \(n \to \infty\).

    (I'm hand-waving a bit here, but given \(t_n \to t,a_n \to \infty\), writing it out with the usual \(\varepsilon\)-\(N\) limit argument is easy.)
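A quick numerical illustration of the last statement, under an example I picked myself: \(a_n = n\) and \(c_n = (-1)^{n+1}\sqrt n\). Here \(\sum c_n / a_n = \sum (-1)^{n+1}/\sqrt n\) converges as an alternating series, so the lemma predicts \(S_n / n \to 0\), even though \(c_n\) itself is unbounded.

```python
import math

def kronecker_demo(N):
    """Return (t_N, S_N / a_N) for c_n = (-1)^(n+1) sqrt(n) and a_n = n."""
    t = 0.0  # partial sum of c_n / a_n
    s = 0.0  # partial sum of c_n
    for n in range(1, N + 1):
        c = (-1) ** (n + 1) * math.sqrt(n)
        a = float(n)
        t += c / a
        s += c
    return t, s / N

t_small, ratio_small = kronecker_demo(1000)
t_large, ratio_large = kronecker_demo(100000)
```

The values `t_small` and `t_large` should nearly agree (the series \(\sum c_n/a_n\) has settled), while `ratio_large` should be much closer to \(0\) than `ratio_small`.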

The Strong Law of Large Numbers in the i.i.d. Case

Let \(X_1,X_2,...,X_n,...\) be i.i.d. with \(\mathbb E|X_i| < \infty\), write \(\mathbb E X_i = \mu\) and \(S_n = \sum_{i=1}^n X_i\). Then \(\frac{S_n}{n} \stackrel{a.s.}{\rightarrow} \mu\) as \(n \to \infty\).

Proof: Let \(Y_n \triangleq X_n \mathbb 1_{\{|X_n | \leq n\}}\) and \(T_n = \sum_{i=1}^n Y_i\). Note that \(\frac 1 n S_n - \mu = \frac 1 n (S_n - T_n)+\frac 1 n (T_n - \mathbb ET_n) + (\frac 1 n \mathbb ET_n - \mu)\), so to prove \(\frac{S_n}{n} \stackrel{a.s.}{\rightarrow} \mu\) it suffices to show that each of the three terms converges to \(0\) almost surely.

  • Consider:

    \[\begin{aligned}\sum_{n=1}^\infty \mathbb P(\omega: X_n \neq Y_n) &= \sum_{n=1}^\infty \mathbb P(\omega: |X_n| >n) \leq \int_{0} ^\infty \mathbb P(|X_1| \geq x) dx = \mathbb E|X_1| < +\infty \end{aligned}\]

    So \(\{X_n \},\{Y_n\}\) are equivalent sequences of random variables: by the Borel-Cantelli lemma, almost surely \(X_n = Y_n\) for all large \(n\), hence \(X_n - Y_n \stackrel{a.s.}{\rightarrow} 0\). By conclusion 1 of Kronecker's lemma, \(\frac{1}{n} \sum_{i=1}^n (X_i - Y_i) \stackrel{a.s.}{\rightarrow} 0\), that is, \(\frac 1 n (S_n - T_n) = \frac 1 n \sum_{i=1} ^n (X_i - Y_i) \stackrel{a.s.}{\rightarrow} 0\).

  • Consider:

    \[\begin{aligned} \sum_{n=1}^\infty \frac{Var Y_n}{n^2} & \leq \sum_{n=1} ^\infty \frac{\mathbb EY_n^2}{n^2} = \sum_{n=1}^\infty \frac{1}{n^2} \int_{|x| \leq n} x^2 dF_{X_1}(x) = \sum_{n=1}^\infty \frac{1}{n^2} \Big(\sum_{j=1} ^n \int_{j-1 < |x| \leq j} x^2 dF_{X_1}(x)\Big)\\& = \sum_{j=1}^\infty \Big(\int_{j-1 < |x| \leq j} x^2 dF_{X_1} (x) \cdot \sum_{n=j} ^\infty \frac{1}{n^2}\Big) \leq \sum_{j=1} ^\infty \Big(\int_{j-1 < |x| \leq j} x^2 dF_{X_1}(x) \cdot \frac{C}{j} \Big) \\ & = C \sum_{j=1}^\infty \frac 1 j \int_{j-1 < |x| \leq j} |x|^2 dF_{X_1}(x) \leq C \sum_{j=1}^\infty \int_{j-1 < |x| \leq j} |x| dF_{X_1}(x) = C\mathbb E|X_1| < + \infty\end{aligned}\]

    By Kolmogorov's strong law for independent random variables with \(\sum_n \frac{Var Y_n}{n^2} < \infty\), \(\frac 1 n (T_n - \mathbb ET_n)\stackrel{a.s.}{\rightarrow} 0\) as \(n \to \infty\).

  • Since \(\mathbb E Y_n = \int_{\mathbb R} x \mathbb 1_{[-n,n]} (x) dF_{X_1} (x)\), with \(|x \mathbb 1_{[-n,n]}(x) | \leq |x|\) and \(\lim_{n \to \infty} x \mathbb 1_{[-n,n]}(x) = x\), we can treat \(\mathbb EY_n\) as the integral of a sequence of integrable functions.

    Because \(\int_{\mathbb R} |x| dF_{X_1}(x) = \mathbb E|X_1| < +\infty\), the dominated convergence theorem gives

    \[\lim_{n \to \infty} \mathbb EY_n = \int_{\mathbb R} \lim_{n \to \infty} x \mathbb 1_{[-n,n]} dF_{X_1}(x) = \int_{\mathbb R} x dF_{X_1}(x) = \mathbb E X_1 = \mu\]

    Then conclusion 1 of Kronecker's lemma gives \(\frac 1 n \sum_{i=1} ^n \mathbb E Y_i = \frac 1 n \mathbb E T_n \to \mu\).

Combining the three points, \(\frac 1 n S_n - \mu = \frac 1 n (S_n - T_n)+\frac 1 n (T_n - \mathbb ET_n) + (\frac 1 n \mathbb ET_n - \mu) \stackrel{a.s.}{\rightarrow}0\), i.e. \(\frac{S_n}{n} \stackrel{a.s.}{\rightarrow} \mu\), as claimed.
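For intuition, the running mean can be watched along a single simulated path. The sketch below assumes \(X_i \sim \mathrm{Exp}(1)\) (so \(\mu = 1\)); the distribution, seed, and checkpoint indices are my own choices, and one finite path can only illustrate the theorem, not verify it.

```python
import random

def running_means(num_samples, checkpoints, seed=2):
    """Return {n: S_n / n} at the requested indices along one simulated path."""
    rng = random.Random(seed)
    total = 0.0
    out = {}
    for n in range(1, num_samples + 1):
        total += rng.expovariate(1.0)  # Exp(1) sample, E X_i = 1
        if n in checkpoints:
            out[n] = total / n
    return out

means = running_means(200000, {100, 10000, 200000})
```

The entry `means[200000]` should be within a few thousandths of \(\mu = 1\), while the early checkpoints are typically noisier.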

Remark 1: In fact, if the expectation does not exist, i.e. the condition \(\mathbb E |X_1| < +\infty\) is replaced by \(\mathbb E |X_1| = +\infty\), then \(\lim \sup_{n\to \infty} \frac{|S_n|}{n} = +\infty\) almost surely.

Indeed, for any \(A > 0\) we have \(\mathbb E(|X_1| /A) = \infty\), so:

\[+\infty = \mathbb E(|X_1|/A) = \int_{0}^{\infty} \mathbb P(|X_1|/A > x)dx \leq \sum_{n=0}^\infty \mathbb P(|X_1|/A > n)\]

Since the \(\{X_i \}\) are identically distributed, \(\sum_{n=1}^\infty \mathbb P(|X_n| > An) = +\infty\), and the Borel-Cantelli lemma for independent events gives \(\mathbb P(|X_n| > An\quad \text{i.o.}) = 1\). Moreover \(\{\omega : |S_n - S_{n-1}| = |X_n|>An \} \subseteq \{\omega : |S_n| >A n/2 \}\cup \{\omega :|S_{n-1}|>An/2 \}\) and \(An/2 > A(n-1)/2\), so \(\mathbb P(|S_n| > \frac {An}{2} \quad \text{i.o.})=1\) for every \(A>0\).

That is, for any \(A>0\) there is a null set \(Z(A)\) such that \(\lim \sup_{n \to \infty} \frac{|S_n(\omega)|}{n} \geq \frac{A}{2}\) for every \(\omega \in \Omega \setminus Z(A)\). The set \(Z =\cup_{m=1}^\infty Z(m)\) is still a null set, and for every \(\omega \in \Omega \setminus Z\) we have \(\lim \sup_{n \to \infty} \frac{|S_n(\omega)|}{n} = + \infty\).

Hence \(\lim \sup_{n\to \infty} \frac{|S_n|}{n} = +\infty\) almost surely, as claimed.

Remark 2: An example from Kai Lai Chung: let \(\{X_n \}\) be i.i.d. with \(\mathbb P(X_1 = n) = \mathbb P(X_1 = -n) = \frac{c}{n^2 \log n}, n =3,4,...\), where \(c\) is the normalizing constant that makes the probabilities sum to \(1\).

The distribution is symmetric about the origin, but \(\mathbb E|X_1| = 2 \sum_{n=3}^\infty \frac{c}{n\log n} = +\infty\), so \(\mathbb E X_1\) is not defined and Remark 1 gives \(\lim \sup_{n \to \infty} \frac{|S_n|}{n} = +\infty\) almost surely. In fact, because the distribution of \(X_1\) is symmetric about the origin, one can also deduce \(\lim \sup_{n \to \infty} \frac{S_n}{n} = +\infty\) and \(\lim \inf_{n \to \infty} \frac{S_n}{n} = -\infty\). At the same time, Kolmogorov & Feller's WLLN method gives \(\frac{S_n}{n} \stackrel{p}{\rightarrow} 0\); the two facts do not contradict each other.

Central Limit Theorem

Feller + CLT \(\iff\) Lindeberg

lzx: the necessity direction is too complicated, so the exam might ask for sufficiency (judging from the lecture notes that should be Lindeberg \(\to\) Feller + CLT; Lindeberg, you are so powerful

Consider a triangular array \(\{X_{nj}\}, 1\leq j \leq n,n \in \mathbb Z_+\), independent within each row, with \(\mathbb EX_{nj} = 0\) and \(\sum_{j=1}^n Var(X_{nj}) = \sum_{j=1}^n \sigma_{nj}^2 = 1\) for every \(n \in \mathbb Z_+\). Write \(S_n = \sum_{j=1}^n X_{nj}\) for the row sums, and consider the following three conditions:

  • CLT: \(S_n \stackrel{w}{\rightarrow} S \sim N(0,1)\)

  • Feller: \(\lim_{n \to \infty} \max_{1 \leq j \leq n} \sigma_{nj}^2 =0\)

    Another commonly used form is \(\lim_{n \to \infty} \max_{1 \leq j \leq n} \mathbb P(|X_{nj} | \geq \varepsilon) = 0\) for every \(\varepsilon >0\); it follows from the variance form by Chebyshev's inequality.

  • Lindeberg: \(\lim_{n\to \infty} \sum_{j=1}^n \int_{|x| \geq \varepsilon} x^2 dF_{nj}(x) = 0\) for every \(\varepsilon >0\)

The Lindeberg condition implies both the Feller condition and the CLT.

Proof:

  • First we derive the Feller condition from the Lindeberg condition. By Chebyshev's inequality, for any \(\tau > 0\),

    \[\begin{aligned} \mathbb P(|X_{nj}| \geq \varepsilon) & \leq \frac{\sigma^2_{nj}}{\varepsilon ^2} = \frac{1}{\varepsilon^2}( \int_{|x| \leq \tau} x^2 dF_{nj}(x) + \int_{|x| > \tau} x^2 dF_{nj}(x) ) \leq \frac{\tau ^2}{\varepsilon^2} + \frac{1}{\varepsilon^2} \int_{|x| > \tau} x^2 dF_{nj}(x) \end{aligned}\]

    Hence \(\max_{1 \leq j \leq n} \mathbb P(|X_{nj}| \geq \varepsilon) \leq \frac{\tau ^2}{\varepsilon^2} + \frac{1}{\varepsilon^2} \sum_{j=1}^n \int_{|x| > \tau} x^2 dF_{nj}(x)\), and the right-hand side tends to \(0\) on letting \(n \to +\infty\) and then \(\tau \to 0\).

    Similarly, \(\max_{1 \leq j \leq n} \sigma_{nj}^2 \leq \tau ^2 + \sum_{j=1}^n \int_{|x| > \tau} x^2 dF_{nj}(x) \to 0\) on letting \(n \to +\infty\) and then \(\tau \to 0\).

  • Next we derive the CLT from the Lindeberg condition. Let \(f_{S_n}(t)\) be the characteristic function of \(S_n\) and \(f_{X_{nj}}(t)\) that of \(X_{nj}\), and let \(S\) be a standard normal random variable, with characteristic function \(f_S (t) = e^{-\frac 1 2 t^2}\).

    From \(\mathbb EX_{nj}=0, VarX_{nj} = \sigma^2_{nj}\) we get \(f_{X_{nj}}(t) = 1- \frac 1 2 t^2 \sigma_{nj}^2 + o(\sigma_{nj}^2 t^2)\), and by independence within each row \(f_{S_n}(t) = \prod_{j=1}^n f_{X_{nj}}(t)\). Therefore:

    \[\begin{aligned} |f_{S_n}(t) - f_S(t)| &= |f_{S_n}(t) - \prod_{j=1} ^n (1- \frac 1 2 t^2 \sigma_{nj}^2) + \prod_{j=1} ^n (1- \frac 1 2 t^2 \sigma_{nj}^2) - e^{-\frac 1 2 t^2 \sum_{j=1}^n \sigma_{nj}^2} | \\ &\leq |\prod_{j=1} ^n f_{X_{nj}}(t) - \prod_{j=1} ^n (1- \frac 1 2 t^2 \sigma_{nj}^2)| + |\prod_{j=1} ^n (1- \frac 1 2 t^2 \sigma_{nj}^2) - \prod_{j=1}^n e^{-\frac 1 2 t^2 \sigma_{nj}^2} | \\ & \triangleq \Delta_1 + \Delta_2 \end{aligned}\]

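    We now estimate \(\Delta_1 , \Delta_2\) separately, using \(|\prod_j a_j - \prod_j b_j| \leq \sum_j |a_j - b_j|\) for complex numbers with \(|a_j|, |b_j| \leq 1\) (applicable here once \(n\) is large, by the Feller condition):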

    \[\begin{aligned} \Delta_1 & \leq \sum_{j=1}^n |f_{X_{nj}} (t) - (1- \frac 1 2 t^2 \sigma_{nj}^2)| = \sum_{j=1}^n |\mathbb E(e^{itX_{nj}} - 1-itX_{nj} + \frac 1 2 t ^2 X_{nj} ^2)| \leq \sum_{j=1}^n \mathbb E(\min\{|tX_{nj}|^2 , |tX_{nj}|^3 \}) \\ & \leq \sum_{j=1}^n (\mathbb E(|tX_{nj}|^3 , |X_{nj}| \leq \tau) + \mathbb E(|tX_{nj}|^2 , |X_{nj}| > \tau)) \leq \sum_{j=1}^n (\tau |t|^3 \int_{|x| \leq \tau} x^2 dF_{nj}(x) + t^2 \int_{|x| > \tau} x^2 dF_{nj}(x) )\\ & \leq \tau |t|^3 \sum_{j=1}^n \sigma_{nj}^2 + t^2 \sum_{j=1}^n \int_{|x| > \tau} x^2 dF_{nj}(x) = \tau |t|^3 + t^2 \sum_{j=1}^n \int_{|x| > \tau} x^2 dF_{nj}(x) \to 0\quad (n \to +\infty, \tau \to 0) \end{aligned}\]

    \[\begin{aligned} \Delta_2 & \leq \sum_{j=1}^n |e^{-\frac 1 2 t^2 \sigma_{nj}^2} - 1 + \frac 1 2 t^2 \sigma_{nj}^2| \leq \sum_{j=1}^n |\frac 1 2 t^2 \sigma_{nj}^2|^2 \leq \frac 1 4 t^4 \max_{1\leq j \leq n} |\sigma_{nj}^2| \sum_{j=1}^n \sigma_{nj}^2 = \frac 1 4 t^4 \max_{1\leq j \leq n} |\sigma_{nj}^2| \to 0 ,n \to +\infty \end{aligned}\]

    Hence \(f_{S_n}(t) \to f_S(t)\) pointwise, i.e. \(S_n \stackrel{w}{\rightarrow} S \sim N(0,1)\), and the CLT holds.
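As a concrete instance (an example of my own, not from the lecture), take \(X_{nj} = \xi_j/\sqrt n\) with \(\xi_j = \pm 1\) equally likely. Every \(|X_{nj}|\) equals \(1/\sqrt n\), so the Lindeberg sum is exactly \(0\) once \(1/\sqrt n < \varepsilon\), and \(S_n = (2B - n)/\sqrt n\) with \(B \sim B(n, \frac 1 2)\), whose distribution function can be computed exactly and compared with \(\Phi\). The function names below are hypothetical.

```python
import math

def std_normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def rademacher_row_sum_cdf(x, n):
    """Exact P(S_n <= x) for S_n = (2B - n)/sqrt(n), B ~ Binomial(n, 1/2)."""
    max_b = math.floor((n + x * math.sqrt(n)) / 2)  # S_n <= x  <=>  B <= max_b
    max_b = min(max(max_b, -1), n)
    return sum(math.comb(n, b) for b in range(0, max_b + 1)) / 2 ** n

def lindeberg_sum(eps, n):
    """sum_j E[X_nj^2 ; |X_nj| >= eps] when every |X_nj| equals 1/sqrt(n)."""
    return 1.0 if 1.0 / math.sqrt(n) >= eps else 0.0

# Distance between the exact CDF of S_n and Phi at the point x = 0.5.
errors = {n: abs(rademacher_row_sum_cdf(0.5, n) - std_normal_cdf(0.5))
          for n in (10, 100, 1000, 10000)}
```

The error need not decrease monotonically in \(n\) because \(S_n\) lives on a lattice, but it should be small for large \(n\), in line with \(S_n \stackrel{w}{\rightarrow} N(0,1)\).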

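Remark 1: In fact the conditions \(\mathbb EX_{nj}=0\) and \(\sum_{j=1}^n Var(X_{nj}) = \sum_{j=1}^n \sigma_{nj}^2 = 1\) need not be assumed. For a general triangular array, just take \(Y_{nj} = \frac{X_{nj} - \mathbb EX_{nj}}{\sqrt{B_n^2}}, B_n ^2 = \sum_{j=1}^n \sigma_{nj}^2\), which satisfies these two normalizations. Adding them simply makes the computation and the write-up easier.

Remark 2: So is sufficiency really that much easier than necessity? (scratches head) (Lindeberg, you are so powerful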

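Martingale Theory

Doob's Stopping Time Theorem

The English below is rough, but when I tried translating this passage into Chinese it felt all wrong, so I'm leaving it like this (

Honestly this problem is pretty easy, but it involves some concepts (discrete time martingale, stopping time, etc.), and I can't decide whether to memorize it or just give up on it (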

Let \(S \leq T\) be bounded stopping times w.r.t. \(\{\mathcal F_n \}_{n=0}^{+\infty}\), and let \(X = \{X_n, \mathcal F_n \}\) be a martingale (or supermartingale) on \((\Omega, \mathcal F, \mathbb P)\). Then:

  • \(\mathcal F_T\) and \(\mathcal F_S\) are sub-\(\sigma\)-fields of \(\mathcal F\);
  • \(X_T\) is \(\mathcal F_T -\)measurable, \(X_S\) is \(\mathcal F_S-\)measurable;
  • \(\mathbb E|X_S| < +\infty\), \(\mathbb E|X_T| < +\infty\);
  • For a martingale, \(\mathbb E(X_T |\mathcal F_S) = X_S \quad a.s.\), and for a supermartingale, \(\mathbb E(X_T|\mathcal F_S) \leq X_S \quad a.s.\)

Proof:

  • Recall \(\mathcal F_T \triangleq \{A \in \mathcal F|A \cap \{T=n\} \in \mathcal F_n , \forall n \geq 0\}\). We have the following:

    • \(\Omega \cap\{T=n\} = \{T= n\} \in \mathcal F_n\), \(\varnothing \cap \{T=n\} = \varnothing \in \mathcal F_n\), thus \(\Omega, \varnothing \in \mathcal F_T\).
    • For \(A \in \mathcal F_T\), \(A^C \cap \{T=n\} = \{T=n\} \setminus (A \cap \{T=n\}) \in \mathcal F_n\), thus \(A^C \in \mathcal F_T\).
    • For any sequence \(\{A_k \}_{k=1}^{+\infty} \subseteq \mathcal F_T\), \((\cup_{k=1}^{+\infty} A_k)\cap \{T=n\} = \cup_{k=1} ^{+\infty}(A_k \cap \{T=n \}) \in \mathcal F_n\). Thus \(\cup _{k=1}^{+\infty} A_k \in \mathcal F_T\).

    We conclude that \(\mathcal F_T\) is a sub-\(\sigma\)-field of \(\mathcal F\), and similarly so is \(\mathcal F_S\).

  • To prove that \(X_T\) is \(\mathcal F_T-\)measurable, it suffices to show \(\forall B \in \mathcal B^1\), \(\{X_T \in B \} \in \mathcal F_T\).

    According to the definition, \(\{X_T \in B\} \cap \{T=n \} = \{X_n \in B \} \cap \{T=n \} \in \mathcal F_n\), thus \(\{X_T \in B \}\in \mathcal F_T\).

    Similarly, \(X_S\) is \(\mathcal F_S -\)measurable.

  • Since \(S \leq T\) are bounded stopping times, take \(n_0\) as a common upper bound, so that \(T\) takes values in \(\{0,1,2,...,n_0\}\). By definition, \(X_T = \sum_{i=0}^{n_0} X_i \mathbb 1_{\{T(\omega) =i\}}\).

    Thus \(\mathbb E|X_T| \leq \sum_{i=0}^{n_0} \mathbb E (|X_i| \mathbb 1_{\{T(\omega)=i \}}) \leq \sum_{i=0}^{n_0} \mathbb E|X_i| < +\infty\), because each \(X_i\) is integrable when \(X = \{X_n, \mathcal F_n \}\) is a martingale (or supermartingale).

    Similarly \(\mathbb E|X_S| < +\infty\).

  • We treat the supermartingale case, starting with a special case.

    • Suppose \(0 \leq T-S \leq 1\). For \(A \in \mathcal F_S\), let \(A_j = A \cap \{S= j \} \cap \{T > j \}\) for \(j = 0,1,2,...,n_0\). Then \(A_j \in \mathcal F_j\), because \(A \cap \{S=j\} \in \mathcal F_j\) and \(\{T > j\} = \{T \leq j\}^C \in \mathcal F_j\), and \(T = j+1\) on \(A_j\).

      On \(A \cap \{S = T\}\) we have \(X_S - X_T = 0\), so \(\int_A (X_S-X_T) d\mathbb P = \sum_{j=0}^{n_0} \int_{A_j} (X_j - X_{j+1}) d \mathbb P \geq 0\) by the supermartingale property. Since \(A \in \mathcal F_S\) is arbitrary, we conclude that \(\mathbb E(X_T|\mathcal F_S) \leq X_S\) a.s.

    • For the general case, let \(R_j = T \wedge (S+j)\), \(j = 0,1,...,n_0\). These are stopping times with \(S = R_0 \leq R_1 \leq ... \leq R_{n_0} = T\) and \(0 \leq R_{j+1} - R_j \leq 1\), so the special case applies to each consecutive pair.

      For \(A \in \mathcal F_S\) we have \(A \in \mathcal F_{R_j}\) for every \(j\) (since \(S \leq R_j\)), so \(\int_A X_S d\mathbb P = \int_A X_{R_0} d \mathbb P \geq \int_A X_{R_1} d \mathbb P \geq ... \geq \int_A X_{R_{n_0}} d \mathbb P = \int_A X_T d \mathbb P\), and we conclude that \(\mathbb E(X_T|\mathcal F_S) \leq X_S\) a.s.

    The martingale case follows by applying this result to both \(X\) and \(-X\).
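To make the statement concrete, here is an exact check in a toy setting that I chose for illustration: a simple symmetric random walk started at \(0\) (a martingale), with \(S = 0\) and the bounded stopping time \(T = \min(\text{first hit of } \{-5, 2\}, 20)\). The theorem then predicts \(\mathbb E X_T = \mathbb E X_S = 0\), even though the two barriers are not symmetric. The function name and parameters are hypothetical.

```python
from fractions import Fraction

def stopped_walk_law(lower, upper, horizon):
    """Exact law of X_T for a simple symmetric walk from 0,
    where T = min(first hit of lower or upper, horizon)."""
    alive = {0: Fraction(1)}  # probability mass of paths not yet stopped, by position
    stopped = {}              # probability mass of X_T, by position
    for _ in range(horizon):
        nxt = {}
        for pos, mass in alive.items():
            for step in (-1, 1):
                new = pos + step
                target = stopped if new in (lower, upper) else nxt
                target[new] = target.get(new, Fraction(0)) + mass / 2
        alive = nxt
    for pos, mass in alive.items():  # paths stopped by the time horizon
        stopped[pos] = stopped.get(pos, Fraction(0)) + mass
    return stopped

law = stopped_walk_law(lower=-5, upper=2, horizon=20)
total_mass = sum(law.values())
expected_X_T = sum(pos * mass for pos, mass in law.items())
```

Because the arithmetic is done with `Fraction`, `expected_X_T` comes out as exactly \(0\) rather than approximately \(0\).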

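Appendix: Table of Distributions

A table copied from V1ncent19. I've been studying probability and statistics for more than half a year and still can't recite it all, which is honestly hard to take.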

From Statistic Note P10, by V1ncent19

\[\begin{array}{l|l|l|l|l|l}
X & p_X(k)\big/f_X(x) & \mathbb{E} & Var & \text{PGF} & \text{MGF} \\ \hline
\mathrm{Bern} (p) & & p & pq & & q+pe^s \\
B (n,p) & C_n^k p^k(1-p)^{n-k} & np & npq & (q+ps)^n & (q+pe^s)^n \\
\mathrm{Geo} (p) & (1-p)^{k-1}p & \dfrac{1}{p} & \dfrac{q}{p^2} & \dfrac{ps}{1-qs} & \dfrac{pe^s}{1-qe^s} \\
H(n,M,N) & \dfrac{C_M^kC_{N-M}^{n-k}}{C_N^n} & n\dfrac{M}{N} & \dfrac{nM(N-n)(N-M)}{N^2(N-1)} & & \\
P(\lambda) & \dfrac{\lambda^k}{k!}e^{-\lambda} & \lambda & \lambda & e^{\lambda(s-1)} & e^{\lambda(e^s-1)} \\
U(a,b) & \dfrac{1}{b-a} & \dfrac{a+b}{2} & \dfrac{(b-a)^2}{12} & & \dfrac{e^{sb}-e^{sa}}{(b-a)s} \\
N(\mu,\sigma^2) & \dfrac{1}{\sigma \sqrt{2\pi}}e^{-\frac{(x-\mu)^2}{2\sigma^2}} & \mu & \sigma^2 & & e^{\frac{\sigma^2s^2}{2}+\mu s} \\
\varepsilon(\lambda) & \lambda e^{-\lambda x} & \dfrac{1}{\lambda} & \dfrac{1}{\lambda^2} & & \frac{\lambda}{\lambda-s} \\
\Gamma(\alpha,\lambda) & \dfrac{\lambda^\alpha}{\Gamma(\alpha)}x^{\alpha-1}e^{-\lambda x} & \dfrac{\alpha}{\lambda} & \dfrac{\alpha}{\lambda^2} & & (\frac{\lambda}{\lambda-s})^\alpha \\
B(\alpha,\beta) & \dfrac{1}{B(\alpha,\beta)}x^{\alpha-1}(1-x)^{\beta-1} & \dfrac{\alpha}{\alpha+\beta} & \dfrac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)} & & \\
\chi^2_n & \dfrac{1}{2^{\frac{n}{2}}\Gamma(\frac{n}{2})}x^{\frac{n}{2}-1}e^{-\frac{x}{2}} & n & 2n & & (1-2s)^{-n/2} \\
t_\nu & \dfrac{\Gamma(\frac{\nu+1}{2})}{\sqrt{\nu\pi}\Gamma(\frac{\nu}{2})}(1+\frac{x^2}{\nu})^{-\frac{\nu+1}{2}} & 0 & \dfrac{\nu}{\nu-2} & & \\
F_{m,n} & \dfrac{\Gamma(\frac{m+n}{2})}{\Gamma(\frac{m}{2})\Gamma(\frac{n}{2})}\dfrac{m^\frac{m}{2}n^\frac{n}{2}x^{\frac{m}{2}-1}}{(mx+n)^{\frac{m+n}{2}}} & \dfrac{n}{n-2} & \dfrac{2n^2(m+n-2)}{m(n-2)^2(n-4)} & &
\end{array}\]

Here \(q = 1-p\).

Consider \(X_1,X_2,\ldots,X_n\) i.i.d. \(\sim N(0,1)\); \(Y,Y_1,Y_2,\ldots,Y_m\) i.i.d. \(\sim N(0,1)\)

  • \(\chi^2\) Distribution:

    \(\chi^2\) distribution with \(n\) degrees of freedom: \(\xi =\sum_{i=1}^n X_i^2\sim \chi^2_n\). For independent \(\xi_i\sim\chi^2_{n_i},\, i=1,2,\ldots,k\): \(\xi_0=\sum_{i=1}^k\xi_i\sim\chi^2_{n_1+\ldots+n_k}\)

  • \(t\) Distribution:

    \(t\) distribution with degree of freedom \(n\): \(T=\frac{Y}{\sqrt{\frac{\sum_{i=1}^nX_i^2}{n}}}=\frac{Y}{\sqrt{\xi / n}}\sim t_n\)

  • \(F\) Distribution:

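    \(F\) distribution with \(m\) and \(n\) degrees of freedom (with the \(Y_j\) independent of the \(X_i\)): \(F=\dfrac{\sum_{j=1}^m Y_j^2 / m}{\sum_{i=1}^n X_i^2 / n}\sim F_{m,n}\)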

    • If \(Z\sim F_{m,n}\), then \(\dfrac{1}{Z}\sim F_{n,m}\)
    • If \(T\sim t_n\), then \(T^2\sim F_{1,n}\)

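Afterword

The classic statistical inference problem was on the exam; the computation problem and the SLLN Remark were merged into a single question; the counterexample was not tested. Among the four modes of convergence, the one I suffered most memorizing was almost sure convergence, and of course it wasn't tested, RIP. The characteristic-function criterion appeared as the variant from Durrett; I figured I might not write it well, so I skipped it. The Kronecker lemma wasn't tested, SLLN was the classic problem, Lindeberg came with the variance and the condition slightly modified, and the martingale problem was verbatim. Reciting theorems is still less nerve-racking than actually solving problems in the exam room.

I rather like probability. I've settled on probability and statistics as my direction; the details can wait. Over the summer I'll try to read more Durrett.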
