
Week3 (2026.03.17) - Probability Theory and Statistics

Shared on March 18, 2026

CSE30301: Basic Math for AI – Quiz‑Ready Summary


1. Recap: Probability & Statistics

  • Sample Mean

    • ( \bar{X}=\frac{1}{n}\sum_{i=1}^{n}X_i )
    • Unbiased: (E[\bar{X}]=\mu)
    • As (n\to\infty), (\bar{X}\to\mu) (Law of Large Numbers)
  • Sample Variance

    • (S^2=\frac{1}{n-1}\sum_{i=1}^{n}(X_i-\bar{X})^2)
    • Unbiased: (E[S^2]=\sigma^2)
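
As a quick check of the two estimators above, the following minimal sketch draws a sample from a hypothetical Normal(5, 2²) population (the population, seed, and sample size are assumptions chosen only for illustration) and compares the n-1 divisor with the n divisor.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: Normal(mu=5, sigma=2), chosen only for illustration.
mu, sigma = 5.0, 2.0
n = 10_000
sample = [random.gauss(mu, sigma) for _ in range(n)]

x_bar = statistics.fmean(sample)           # (1/n) * sum of X_i
s2 = statistics.variance(sample)           # divides by n - 1 (unbiased S^2)
s2_biased = statistics.pvariance(sample)   # divides by n (biased when used on a sample)

print(f"sample mean      = {x_bar:.4f} (true mu = {mu})")
print(f"S^2 (n-1)        = {s2:.4f} (true sigma^2 = {sigma**2})")
print(f"variance with n  = {s2_biased:.4f}")
```

The two variance estimates differ by the factor (n-1)/n, which matters for small n and is negligible for large n.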

2. Central Limit Theorem (CLT)

  • For i.i.d. (X_i) with (E[X_i]=\mu,\; \mathrm{Var}[X_i]=\sigma^2<\infty):
    [ \sqrt{n}\left(\bar{X}-\mu\right)\xrightarrow{d}N(0,\sigma^2) ]
  • Holds regardless of the original distribution, provided the variance is finite (illustrated with Bernoulli, Uniform, and other distributions).
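
The CLT statement can be checked empirically. The sketch below is a small simulation under assumed settings (Bernoulli with p = 0.3, n = 200, 2000 repetitions, all hypothetical): it builds many values of sqrt(n)(X̄ - μ) and compares their mean and variance with 0 and σ² = p(1-p).

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

p = 0.3                     # hypothetical Bernoulli success probability
mu, var = p, p * (1 - p)    # mean and variance of Bernoulli(p)
n, trials = 200, 2000       # sample size per mean, number of repeated means

scaled = []
for _ in range(trials):
    # Sample mean of n Bernoulli(p) draws
    x_bar = sum(random.random() < p for _ in range(n)) / n
    scaled.append(math.sqrt(n) * (x_bar - mu))

emp_mean = statistics.fmean(scaled)    # should be close to 0
emp_var = statistics.variance(scaled)  # should be close to sigma^2 = p(1-p)
# Fraction within one standard deviation; roughly 0.68 for a normal distribution.
within_1sd = sum(abs(s) <= math.sqrt(var) for s in scaled) / trials

print(emp_mean, emp_var, within_1sd)
```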

3. Algorithm Comparison (Illustrative Example)

  • Two algorithms A and B produce random variables (X) and (Y).
  • Goal: determine if (E[X] < E[Y]) (or vice‑versa).
  • Approach:
    1. Sampling → draw (X_1,\dots,X_n) and (Y_1,\dots,Y_n).
    2. Compute sample means (\bar{X}) and (\bar{Y}).
    3. Use statistical inference (hypothesis test or confidence interval).

4. Statistical Testing

4.1 Hypotheses

  • Null (H_0): No difference (e.g., (E[X]=E[Y])).
  • Alternative (H_1): Difference exists (e.g., (E[X]<E[Y])).

4.2 Test Statistic (Z‑test)

  • Assume normal population (N(\mu,\sigma^2)) and known (\sigma); test (H_0:\mu=\mu_0).
  • (Z=\frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}})
  • Under (H_0): (Z\sim N(0,1)).
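
A minimal sketch of the Z statistic above. The helper name `z_statistic` and the runtime data are hypothetical, and σ = 2.0 is simply assumed to be known.

```python
import math
import statistics

def z_statistic(sample, mu0, sigma):
    """Z = (x_bar - mu0) / (sigma / sqrt(n)) for a known population sigma."""
    n = len(sample)
    x_bar = statistics.fmean(sample)
    return (x_bar - mu0) / (sigma / math.sqrt(n))

# Hypothetical data: 16 runtime measurements, with sigma = 2.0 assumed known.
runtimes = [10.0, 12.0, 11.0, 13.0] * 4
z = z_statistic(runtimes, mu0=10.5, sigma=2.0)
print(z)  # 2.0: x_bar = 11.5, standard error = 2.0 / 4 = 0.5
```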

4.3 p‑Value

  • Two‑sided: (p=2[1-\Phi(|Z|)]).
  • One‑sided (e.g., (H_0:\mu\le\mu_0) vs (H_1:\mu>\mu_0)): (p=1-\Phi(Z)).
  • (\Phi) = standard normal CDF.
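
The two p-value formulas above translate directly into code. This sketch uses `statistics.NormalDist` from the Python standard library for (\Phi); the function names are hypothetical.

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF

def p_two_sided(z):
    """p = 2 * [1 - Phi(|z|)]"""
    return 2.0 * (1.0 - PHI(abs(z)))

def p_upper_tail(z):
    """p = 1 - Phi(z), for H0: mu <= mu0 vs H1: mu > mu0"""
    return 1.0 - PHI(z)

print(p_two_sided(1.96))    # close to 0.05
print(p_upper_tail(1.645))  # close to 0.05
```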

4.4 Decision Rule

  • Significance level (\alpha) (default (0.05)).
  • Reject (H_0) if (p<\alpha); otherwise fail to reject.

4.5 Power & Sample Size

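  • Power (=1-\beta): probability of rejecting (H_0) when (H_1) is true, where (\beta) is the Type II error probability.
  • For a desired power, solve for (n) using the non‑centrality parameter (\delta=\sqrt{n}(\mu_1-\mu_0)/\sigma), the mean of (Z) when the true mean is (\mu_1).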
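
For the one-sided Z-test ((H_1:\mu>\mu_0)) with known (\sigma), solving for (n) gives (n\ge\left((z_{1-\alpha}+z_{1-\beta})\,\sigma/(\mu_1-\mu_0)\right)^2). The sketch below assumes this one-sided setting; the function names and the numeric inputs are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def power_upper_tail(mu0, mu1, sigma, n, alpha=0.05):
    """Power of the one-sided Z-test (H1: mu > mu0) when the true mean is mu1."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)
    shift = (mu1 - mu0) * math.sqrt(n) / sigma  # mean of Z when mu = mu1
    return 1 - STD_NORMAL.cdf(z_crit - shift)

def required_n(mu0, mu1, sigma, alpha=0.05, power=0.80):
    """Smallest n with power >= the target, assuming mu1 > mu0."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    z_beta = STD_NORMAL.inv_cdf(power)
    return math.ceil(((z_alpha + z_beta) * sigma / (mu1 - mu0)) ** 2)

# Hypothetical inputs: detect a shift of 0.5 when sigma = 1.
n_needed = required_n(mu0=0.0, mu1=0.5, sigma=1.0)
print(n_needed)  # 25 under these assumed inputs
print(power_upper_tail(0.0, 0.5, 1.0, n_needed))
```

A two-sided test would use (z_{1-\alpha/2}) in place of (z_{1-\alpha}) and therefore needs a somewhat larger (n).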

5. Confidence Intervals

  • For (\mu) with known (\sigma):
    [ \bar{X}\pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} ]
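  • Interpretation (for (\alpha=0.05)): if the sampling is repeated many times, about 95% of the intervals built this way contain the true (\mu); any single interval either contains (\mu) or does not.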
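
The repeated-sampling interpretation can be illustrated with a short simulation. This is a sketch under assumed settings (standard normal population, n = 30, 2000 repetitions); `z_confidence_interval` is a hypothetical helper name.

```python
import math
import random
import statistics
from statistics import NormalDist

def z_confidence_interval(sample, sigma, alpha=0.05):
    """x_bar +/- z_{alpha/2} * sigma / sqrt(n), for a known sigma."""
    n = len(sample)
    x_bar = statistics.fmean(sample)
    half_width = NormalDist().inv_cdf(1 - alpha / 2) * sigma / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width

random.seed(2)  # fixed seed so the sketch is reproducible
mu, sigma, n, repeats = 0.0, 1.0, 30, 2000  # hypothetical settings

hits = 0
for _ in range(repeats):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    lo, hi = z_confidence_interval(sample, sigma)
    hits += lo <= mu <= hi  # count intervals that cover the true mean

coverage = hits / repeats
print(coverage)  # expected to be near 0.95
```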

6. Practical Workflow for Algorithm Comparison

  1. Collect Samples from both algorithms.
  2. Compute (\bar{X}), (\bar{Y}), and estimated variances.
  3. Select Test:
    • If (\sigma) known → Z‑test.
    • If (\sigma) unknown → t‑test (not detailed in the slides, but implied).
  4. Calculate test statistic, p‑value.
  5. Decide using (\alpha).
  6. Report a confidence interval for the difference in means (E[X]-E[Y]), centered at (\bar{X}-\bar{Y}).
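
The workflow above can be sketched end to end for the known-(\sigma) branch. This is an illustration only: it assumes independent samples, a two-sided two-sample Z-test with standard error (\sqrt{\sigma_X^2/n_X+\sigma_Y^2/n_Y}), and hypothetical data and names (`compare_means_z`, the simulated runtimes). The t-test branch is not shown.

```python
import math
import random
import statistics
from statistics import NormalDist

def compare_means_z(xs, ys, sigma_x, sigma_y, alpha=0.05):
    """Two-sided two-sample Z-test of H0: E[X] = E[Y], both sigmas assumed known."""
    diff = statistics.fmean(xs) - statistics.fmean(ys)
    se = math.sqrt(sigma_x**2 / len(xs) + sigma_y**2 / len(ys))
    z = diff / se
    std = NormalDist()
    p = 2 * (1 - std.cdf(abs(z)))
    half_width = std.inv_cdf(1 - alpha / 2) * se
    return {
        "diff": diff,
        "z": z,
        "p": p,
        "ci": (diff - half_width, diff + half_width),  # CI for E[X] - E[Y]
        "reject": p < alpha,
    }

random.seed(3)  # fixed seed so the sketch is reproducible
# Step 1: hypothetical samples from algorithms A and B (simulated, not real data).
xs = [random.gauss(10.0, 1.0) for _ in range(200)]
ys = [random.gauss(10.5, 1.0) for _ in range(200)]

# Steps 2-6: means, test statistic, p-value, decision, confidence interval.
result = compare_means_z(xs, ys, sigma_x=1.0, sigma_y=1.0)
print(result)
```

If the reported interval excludes 0, the two-sided test at the same (\alpha) rejects (H_0), so the interval and the test give consistent conclusions.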

7. Quiz Details

  • Date: Thursday, March 19th (beginning of class).
  • Coverage: All material from the first lecture through today’s lecture, including Probability Theory and Statistics.
  • Timing: Arrive on time; quiz starts immediately.