Week 8 Probability Course Summary

1. Markov Inequality (M.I.)

If $X \geq 0$, $c > 0$, and $E(X)$ exists, then $$P(X \geq c) \leq \frac{E_X[X]}{c}$$

Proof: $$E_X(X) = \int_0^{\infty}xf(x)dx = \int_0^cx f(x) dx + \int_c^\infty x f(x)dx \geq \int_c^{\infty}xf(x)dx = \int_{\{X \geq c \} } xf(x)dx \geq c\int_c^{\infty}f(x)dx = cP(X \geq c)$$

Hence $P(X \geq c) \leq \frac{E_X[X]}{c}$, which completes the proof.
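The bound can be checked numerically. The sketch below is only an illustration: the helper name `markov_check`, the Exp(1) test distribution, and the seed are assumptions made for this example, not part of the course material. It compares the empirical tail frequency with the empirical mean divided by $c$.

```python
import random

def markov_check(samples, c):
    """Return (empirical P(X >= c), Markov bound mean/c) for nonnegative samples."""
    if c <= 0:
        raise ValueError("c must be positive")
    if any(x < 0 for x in samples):
        raise ValueError("Markov's inequality needs X >= 0")
    n = len(samples)
    tail = sum(1 for x in samples if x >= c) / n   # fraction of samples at or above c
    bound = (sum(samples) / n) / c                 # sample mean divided by c
    return tail, bound

rng = random.Random(0)  # fixed seed, chosen arbitrarily
# Exp(1) is an assumed test distribution; any nonnegative one would do.
samples = [rng.expovariate(1.0) for _ in range(10_000)]
for c in (0.5, 1.0, 2.0, 4.0):
    tail, bound = markov_check(samples, c)
    print(f"c={c}: empirical P(X>=c)={tail:.4f}  bound={bound:.4f}")
```

For small $c$ the bound exceeds 1 and says nothing; it only becomes informative once $c$ is larger than the mean.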

2. Chebyshev Inequality (C.I.)

If the random variable $X$ has mean $\mu_X$ and variance $\sigma_X^2 < \infty$, then $$\forall \epsilon > 0, \quad P(|X - \mu_X| \geq \epsilon) \leq \frac{\sigma_X^2}{\epsilon ^2}$$

Proof: $$P(|X - \mu_X| \geq \epsilon) = P((X - \mu_X)^2 \geq \epsilon^2) \stackrel{M.I.}{\leq} \frac{E[(X - \mu_X)^2]}{\epsilon^2} = \frac{\sigma_X^2}{\epsilon^2}$$ which completes the proof.

Note that Chebyshev's inequality places no sign restriction on $X$ (only $\epsilon > 0$ is needed), whereas Markov's inequality requires $X \geq 0$ and $c > 0$.
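A common way to use the inequality is to measure the deviation in units of the standard deviation. As a small worked case, assume $\sigma_X > 0$ and set $\epsilon = k\sigma_X$ for some $k > 0$ (the symbol $k$ is introduced here only for illustration):

$$P(|X - \mu_X| \geq k\sigma_X) \leq \frac{\sigma_X^2}{k^2\sigma_X^2} = \frac{1}{k^2}$$

For $k = 2$ this gives $P(|X - \mu_X| \geq 2\sigma_X) \leq \frac{1}{4}$, i.e. $P(|X - \mu_X| < 2\sigma_X) \geq \frac{3}{4}$, whatever the shape of the distribution.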

3. Stochastic convergence

The lecture introduced seven modes of convergence, abbreviated UCMOPED, with the implications M -> P -> D and U -> E -> O -> P -> D.

  • U: Uniform Convergence

$$X_n \xrightarrow{u} X \quad iff \quad \forall \epsilon >0, \exists n_0 \in Z^{+}, \forall n \geq n_0: |X_n(w) - X(w)| < \epsilon \quad \forall w$$

  • E: Convergence Everywhere

$$X_n \xrightarrow{e} X \quad iff \quad \forall w \in \Omega, \forall \epsilon >0, \exists n_0 \in Z^{+}, \forall n \geq n_0: |X_n(w) - X(w)| < \epsilon $$

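$\quad$ The definitions of U and E differ only in where the quantifier over $w$ sits: in U a single $n_0$ must work for every $w$, whereas in E $n_0$ may depend on $w$. The difference is subtle; see Example 7.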

  • C: Cauchy Criterion

$$X_n \xrightarrow{c} X \quad iff \quad \forall w: \forall \epsilon > 0, \exists n_0 \in Z^{+}, \forall n \geq n_0, \forall m \geq n_0: |X_n(w) - X_m(w)| < \epsilon$$

  • O: Probability One (Almost Sure) Convergence $$X_n \xrightarrow{o} X \quad iff \quad P(\{w: \lim_{n \rightarrow \infty} X_n(w) = X(w)\}) = 1 \quad iff \quad P(\{w: \lim_{n \rightarrow \infty} X_n(w) \neq X(w)\}) = 0$$

  • M: Mean-Square Convergence $$X_n \xrightarrow{m} X \quad iff \quad \lim_{n \rightarrow \infty} E[(X_n - X)^2] = 0 \quad iff \quad \lim_{n \rightarrow \infty} \lim_{m \rightarrow \infty} E[(X_n - X_m)^2] = 0 $$

  • P: Convergence in Probability $$X_n \xrightarrow{p} X \quad iff \quad \forall \epsilon > 0 \lim_{n \rightarrow \infty} P(|X_n - X| > \epsilon ) = 0 \quad iff \quad \forall \epsilon > 0 \lim_{n \rightarrow \infty} P(|X_n - X| \leq \epsilon ) = 1$$

  • D: Convergence in Distribution $$X_n \xrightarrow{d} X \quad iff \quad \lim_{n \rightarrow \infty} F_{X_n}(x) = F_X(x)$$ at every point $x$ where $F_X$ is continuous
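To make the definition of convergence in probability concrete, the sketch below uses the sequence $X_n(z) = z^n$ with $z$ uniform on $[0, 1]$ (borrowed from Example 7) and checks only the claim $X_n \xrightarrow{p} 0$. The function names, $\epsilon = 0.1$, and the seed are assumptions for this illustration. Since $z^n > \epsilon$ iff $z > \epsilon^{1/n}$, the tail probability is $1 - \epsilon^{1/n}$, which tends to 0.

```python
import random

def exact_tail(n, eps):
    """P(|X_n - 0| > eps) for X_n(z) = z**n, z ~ Uniform[0, 1], 0 < eps < 1."""
    # z**n > eps  <=>  z > eps**(1/n), and P(z > t) = 1 - t on [0, 1].
    return 1.0 - eps ** (1.0 / n)

def simulated_tail(n, eps, trials, rng):
    """Monte Carlo estimate of the same probability."""
    hits = sum(1 for _ in range(trials) if rng.random() ** n > eps)
    return hits / trials

rng = random.Random(0)  # fixed seed, chosen arbitrarily
eps = 0.1
for n in (1, 10, 100, 1000):
    print(n, round(exact_tail(n, eps), 4), simulated_tail(n, eps, 20_000, rng))
```

The same check can be repeated for any fixed $\epsilon \in (0, 1)$; the definition requires the limit to be 0 for every such $\epsilon$.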

4. MSE Decomposition (Variance-Bias Decomposition)

  • $$MSE[\bar{X_n}] = E[(\bar{X_n} - X)^2] = Var[\bar{X_n}] + \underbrace{(E[\bar X_n] - X)^2}_{\text{bias}^2}$$

Proof:
$MSE[\bar{X_n}] = E[(\bar{X_n} - X)^2] = E[((\bar X_n - E[\bar X_n]) + (E[\bar X_n] - X))^2] = Var[\bar X_n] + \text{bias}^2 + 2(E[\bar X_n] - X)E[\bar X_n - E[\bar{X _n}]]$
By linearity of expectation, $E[\bar X_n - E[\bar X_n]] = 0$, so the third term vanishes, which completes the proof.

  • $\bar{X_n}$ is an unbiased estimator of $X$ iff $E[\bar{X_n}] = X, \forall n$

  • $\bar{X_n}$ is an asymptotically unbiased estimator of $X$ iff $\lim_{n \rightarrow \infty}E[\bar{X_n}] = X$
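The decomposition also holds exactly for empirical moments, which makes it easy to check numerically. In the sketch below the estimator (0.9 times the mean of 5 uniform draws), the helper name `mse_parts`, and the seed are all hypothetical choices made for this illustration; the target is treated as a fixed constant, as in the proof above.

```python
import random
from statistics import fmean, pvariance

def mse_parts(estimates, target):
    """Return (MSE, variance, bias) of the estimates around a fixed target."""
    mse = fmean([(e - target) ** 2 for e in estimates])
    var = pvariance(estimates)          # population variance (divides by N)
    bias = fmean(estimates) - target
    return mse, var, bias

rng = random.Random(0)                  # fixed seed, chosen arbitrarily
target = 0.5                            # true mean of Uniform(0, 1)
# Hypothetical, deliberately biased estimator: 0.9 * (mean of 5 draws).
estimates = [0.9 * fmean([rng.random() for _ in range(5)]) for _ in range(2_000)]
mse, var, bias = mse_parts(estimates, target)
print(f"MSE={mse:.6f}  Var+bias^2={var + bias ** 2:.6f}  bias={bias:.4f}")
```

The two printed quantities agree up to floating-point rounding, and the bias is close to $0.9 \cdot 0.5 - 0.5 = -0.05$.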

5. Sampling Statistics

Let $X_1, X_2, …, X_n$ be i.i.d. random variables (a random sample) with mean $\mu$ and variance $\sigma^2 < \infty$ (so $|\mu| < \infty$). Then

  • $\displaystyle E[\bar{X_n} ] = \mu \text{ where } \bar{X_n} = \frac{1}{n}\sum_{k = 1}^{n}X_k$

  • $\displaystyle Var[\bar{X_n}] = \frac{\sigma^2}{n}$

Proof:
i.i.d. $\Rightarrow E[X_k] = \mu, \forall k; Var[X_k] = \sigma^2, \forall k$

$\displaystyle E[\bar{X_n}] = E[\frac{1}{n}\sum_{k = 1}^{n} X_k] = \frac{1}{n}E[\sum_{k = 1}^{n} X_k] = \frac{n\mu}{n} = \mu$

$\displaystyle Var[\bar{X_n}] = \frac{Var[\sum_{k = 1}^{n} X_k]}{n^2} = \frac{\sum_{k = 1}^{n} Var[X_k]}{n^2} = \frac{ n \sigma^2}{n^2} = \frac{\sigma^2}{n}$, where the second equality uses independence.
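A quick simulation can confirm both results. The sketch below assumes Uniform(0, 1) samples (so $\mu = 1/2$ and $\sigma^2 = 1/12$); the helper name `sample_means`, the trial count, and the seed are illustrative choices only.

```python
import random
from statistics import fmean, pvariance

def sample_means(n, trials, rng):
    """Draw `trials` independent sample means, each over n Uniform(0, 1) draws."""
    return [fmean([rng.random() for _ in range(n)]) for _ in range(trials)]

rng = random.Random(0)       # fixed seed, chosen arbitrarily
sigma2 = 1.0 / 12.0          # variance of Uniform(0, 1)
for n in (1, 4, 16, 64):
    means = sample_means(n, 5_000, rng)
    # Columns: n, empirical mean of the sample means, their variance, sigma^2 / n.
    print(n, round(fmean(means), 4), round(pvariance(means), 6), round(sigma2 / n, 6))
```

The empirical mean stays near 0.5 for every $n$, while the empirical variance shrinks roughly in proportion to $1/n$.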

6. Weak Law of Large Numbers (More in week 11 note)

Let $X_1, X_2, …$ be a sequence of i.i.d. random variables with $E[X_k] = \mu (k = 1, 2, …)$. Then

$$\forall \epsilon > 0 \quad \lim_{n \rightarrow \infty} P\left\{\left|\frac{1}{n}\sum_{k = 1}^n X_k - \mu \right| < \epsilon\right\} = 1$$

In short, the sample mean converges in probability to the true mean (the sample mean is also an unbiased estimator of the true mean).

In fact, the sample variance ($\displaystyle \frac{1}{n-1}\sum_{k=1}^n (x_k - \bar{x}_n)^2$) also converges in probability to the true variance, and it is likewise an unbiased estimator of the true variance.
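When the variance is also finite (an extra assumption beyond the statement above), combining $Var[\bar{X_n}] = \sigma^2/n$ with Chebyshev's inequality gives $P(|\bar{X_n} - \mu| \geq \epsilon) \leq \sigma^2/(n\epsilon^2)$, which tends to 0. The sketch below compares this bound with a simulated miss rate for Uniform(0, 1) samples; the function names, $\epsilon = 0.05$, and the seed are assumptions for this illustration.

```python
import random
from statistics import fmean

def chebyshev_bound(sigma2, n, eps):
    """Upper bound on P(|mean - mu| >= eps), using Var[mean] = sigma2 / n."""
    return min(1.0, sigma2 / (n * eps ** 2))

def miss_rate(n, eps, trials, rng, mu=0.5):
    """Fraction of simulated Uniform(0, 1) sample means at least eps away from mu."""
    misses = sum(
        1 for _ in range(trials)
        if abs(fmean([rng.random() for _ in range(n)]) - mu) >= eps
    )
    return misses / trials

rng = random.Random(0)       # fixed seed, chosen arbitrarily
sigma2, eps = 1.0 / 12.0, 0.05
for n in (1, 25, 100, 400):
    print(n, miss_rate(n, eps, 2_000, rng), round(chebyshev_bound(sigma2, n, eps), 4))
```

The simulated miss rate falls much faster than the bound, which is expected: Chebyshev is distribution-free and therefore loose.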

Examples

  1. Similar r.v.s $X_1, X_2, …$ are uniform: $X_n$ ~ $U(0, \frac{1}{n})$. Define the sequence of estimators $\hat{\theta_n}$ as $\hat{\theta_n} = \sqrt{n}X_n$. Is $\hat{\theta_n}$ a consistent estimator of the parameter $\theta = 0$?

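  2. The random sequence $X_1, X_2, …$ consists of similarly distributed Poisson random variables $X_n$ ~ $P(\frac{1}{n^3})$. Define a new similarly distributed sequence $Y_n = n X_n$. Does the new sequence $Y_1, Y_2, …$ converge in distribution to the zero r.v.?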

  3. A random sequence $X_1, X_2, …$ converges to the r.v. $X$ in $L^p$ iff $\lim_{n \rightarrow \infty} E[|X_n - X|^p] = 0$ for some $p > 0$. Suppose $p = 10$: does the sequence converge to $X$ in probability if it converges to $X$ in $L^{10}$?

  4. (From the week 12 discussion) The random sequence of estimators $\hat{\theta}_1, \hat{\theta}_2, \hat{\theta}_3, …$ obeys $\displaystyle \lim_{n \rightarrow \infty}E[(\hat{\theta}_n - \theta)^6] = 0$ for random parameter $\theta$. Is $\hat{\theta}_n$ consistent for $\theta$?

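  5. Xiao Ming analyzed 22 days of daily sales data for a brand of milk. He plotted a histogram but could not identify a clear pdf: the histogram is neither symmetric nor unimodal. He computed a sample mean of 50 boxes of milk per day and a sample standard deviation of 5 boxes per day. His boss wants to know whether Xiao Ming can reliably estimate the probability that daily sales fall between 40 and 60 boxes. What answer should Xiao Ming give?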

  6. A robot arm throws basketballs at a distant hoop. The r.v. $X_k$ counts the number of robotic throws until $k$ balls make it through the hoop. Define $Y_k$ as $Y_k = \frac{1}{k^2}X_k$. To what does the random sequence $Y_1, Y_2, …$ converge in probability?

  7. (Leon-Garcia 7.41) Let $z$ be selected at random from the interval $S = [0, 1]$, and let the probability that $z$ is in a subinterval of $S$ be given by the length of the subinterval. Define the following sequences of random variables for $n \geq 1$: $X_n(z) = z^n$, $Y_n(z) = \cos^2 2\pi z$, $Z_n(z) = \cos^n 2\pi z$. Do the sequences converge, and if so, in what sense and to what limiting random variable?

    Solution: TODO

Summary of Techniques

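  1. Common questions: is the estimator consistent (which really asks whether it converges in probability), or does the sequence converge in distribution, when no information about the probability distribution is given.
  2. In that case, either establish mean-square convergence (by direct computation or via the variance-bias decomposition) and use $m \rightarrow p \rightarrow d$, or show convergence in probability directly with the Markov Inequality.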
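As a sketch of how the second step plays out, consider Example 1, reading $U(0, \frac{1}{n})$ as the uniform distribution on the interval $(0, \frac{1}{n})$ (this reading is an assumption). The density of $X_n$ is $n$ on that interval, so by direct computation

$$E[X_n^2] = \int_0^{1/n} x^2 \cdot n \, dx = \frac{n}{3}\cdot\frac{1}{n^3} = \frac{1}{3n^2}, \qquad E[(\hat{\theta_n} - 0)^2] = n E[X_n^2] = \frac{1}{3n} \rightarrow 0$$

Hence $\hat{\theta_n} \xrightarrow{m} 0$, and $m \rightarrow p$ gives consistency for $\theta = 0$. The Markov Inequality route reaches the same conclusion:

$$\forall \epsilon > 0, \quad P(|\hat{\theta_n}| \geq \epsilon) = P(\hat{\theta_n}^2 \geq \epsilon^2) \leq \frac{E[\hat{\theta_n}^2]}{\epsilon^2} = \frac{1}{3n\epsilon^2} \rightarrow 0$$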

References

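  1. Course 503, week 8 lecture notes
  2. Course 503, week 8 discussion notes
  3. Sheng Zhou, Xie Shiqian, Pan Chengyi. 《概率论与数理统计》 (Probability Theory and Mathematical Statistics), 4th ed., Higher Education Press.
  4. Leon-Garcia, Alberto. Probability, Statistics, and Random Processes for Electrical Engineering (2008).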