
[QF] 03-Statistics

Qingqi@2020-09-15 #QF

A list of topics in quantitative finance with concise key points and useful resources, intended for quant interview preparation and as an introduction to quantitative finance; the notes can also serve as cheat sheets. Motivated by Ran Ding.

resources

key points

  • Bessel's correction
    • $s^{2}=\cfrac{\sum_{i=1}^{n}\left(X_{i}-\bar{X}\right)^{2}}{n-1}$ is an unbiased estimator of $\operatorname{Var}(X)$
    • dividing by $n-1$ rather than $n$ corrects for using the sample mean $\bar{X}$ instead of the true mean
    • proof: wikipedia
  • Standard error (standard deviation of sample mean):
    • population variance known: $\sigma_{\bar{X}}=\cfrac{\sigma}{\sqrt{n}}$
    • unknown: $s_{\bar{X}}=\cfrac{s}{\sqrt{n}}, \text { where } s^{2}=\cfrac{\sum_{i=1}^{n}\left(X_{i}-\bar{X}\right)^{2}}{n-1}$
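
A minimal numpy sketch of the two estimators above, on hypothetical simulated data (the seed and sample are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=1.0, scale=2.0, size=100)  # hypothetical sample

# Sample variance with Bessel's correction: divide by n-1 (ddof=1)
s2 = np.var(x, ddof=1)
s2_manual = np.sum((x - x.mean()) ** 2) / (len(x) - 1)  # same thing by hand

# Standard error of the sample mean when the population variance is unknown
se = np.sqrt(s2) / np.sqrt(len(x))
print(s2, s2_manual, se)
```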
  • kernel density estimation
    • a nonparametric method to estimate a smooth p.d.f. from a sample
    • $\hat{f}_{h}(x)=\frac{1}{n} \sum_{i=1}^{n} K_{h}\left(x-x_{i}\right)$
    • where K is the kernel and h > 0 is a smoothing parameter called the bandwidth
    • $K_{h}(x)=\frac{1}{h}K\left(\frac{x}{h}\right)$ is called the scaled kernel
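
A sketch of this estimator with a Gaussian kernel on hypothetical data; the scipy comparison relies on gaussian_kde's convention that a scalar bw_method is a factor multiplying the sample standard deviation:

```python
import numpy as np
from scipy.stats import gaussian_kde, norm

rng = np.random.default_rng(1)
xi = rng.normal(size=200)        # observed sample (hypothetical)
grid = np.linspace(-4, 4, 9)     # evaluation points
h = 0.5                          # bandwidth (smoothing parameter)

# f_hat(x) = (1/n) sum_i K_h(x - x_i), with K_h(u) = (1/h) K(u/h),
# here using the Gaussian kernel K = standard normal pdf
f_hat = norm.pdf((grid[:, None] - xi[None, :]) / h).mean(axis=1) / h
print(f_hat)

# scipy's gaussian_kde is the same estimator; setting the factor to
# h / sample_std gives an effective bandwidth of h
kde = gaussian_kde(xi, bw_method=h / xi.std(ddof=1))
print(kde(grid))   # should closely match f_hat
```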
  • types of errors
    • reject a true null hypothesis: type I error (false positive)
    • fail to reject a false null hypothesis: type II error (false negative)
    • Confusion matrix:

      |                 | predict positive | predict negative |
      | --------------- | ---------------- | ---------------- |
      | actual positive | TP               | FN (type II)     |
      | actual negative | FP (type I)      | TN               |
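
A quick sklearn check of this layout, with hypothetical labels and the convention that "positive" means rejecting the null:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = np.array([1, 1, 1, 0, 0, 0, 1, 0])  # hypothetical actual labels
y_pred = np.array([1, 0, 1, 1, 0, 0, 1, 0])  # hypothetical predictions

# sklearn orders rows/columns by label value: rows = actual, cols = predicted
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TP={tp}  FN={fn}  (FN: type II, failing to reject a false null)")
print(f"FP={fp}  TN={tn}  (FP: type I, rejecting a true null)")
```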
  • Test statistic
    • p-value
      • the smallest level of significance to reject null hypothesis
      • the null hypothesis is typically $\beta=0$; a smaller p-value means the estimate is further from 0 relative to its noise, i.e. stronger evidence against the null
    • z-test
      • $TS=\cfrac{\bar{X}-\mu_{0}}{s_{\bar{X}}}$
      • compare with normal distribution values
      • 1-side 95%, 2-side 90%: 1.645
      • 1-side 97.5%, 2-side 95%: 1.96
      • 1-side 99.5%, 2-side 99%: 2.58
    • t-test
      • compare with $t(n-1)$ (in most cases)
      • use the t-test when the sample is small and the population variance is unknown (see the sketch after this list)
    • Chi2-test
      • $TS=\cfrac{(n-1) s^{2}}{\sigma_{0}^{2}}$
      • assume normal distribution
      • compare with $\chi^2(n-1)$
    • F-test
      • $TS=\cfrac{s_{1}^{2}}{s_{2}^{2}}$
      • $df_{i}=n_{i}-1$
      • compare with $F\left(d f_{1}, d f_{2}\right)$
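
A sketch of the z/t machinery on hypothetical data: the hand-computed statistic matches scipy's one-sample t-test, and the normal quantiles reproduce the critical values quoted above:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = rng.normal(loc=0.3, scale=1.0, size=25)   # small hypothetical sample

# One-sample test of H0: mu = 0 with unknown population variance
ts = (x.mean() - 0.0) / (x.std(ddof=1) / np.sqrt(len(x)))
t_stat, p_value = stats.ttest_1samp(x, popmean=0.0)
print(ts, t_stat, p_value)          # ts == t_stat; compare against t(n-1)

# Two-sided critical values from the normal table quoted above
print(stats.norm.ppf([0.95, 0.975, 0.995]))   # ~1.645, 1.960, 2.576
```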
  • Linear regression
    • OLS
      • $\hat{\beta}_{1}=\cfrac{\sum_{i=1}^{N}\left(y_{i}-\bar{y}\right)\left(x_{i}-\bar{x}\right)}{\sum_{i=1}^{N}\left(x_{i}-\bar{x}\right)^{2}}=\cfrac{s_{xy}}{s_{x}^{2}}=\cfrac{\operatorname{Cov}(x,y)}{\operatorname{Var}(x)}$
      • $\hat{\beta}_{0}=\bar{y}-\hat{\beta}_{1}\bar{x}$
      • $s^{2}=\frac{1}{N-2} \sum_{i=1}^{N} \hat{u}_{i}^{2}$ (variance of the error term)
      • $SE\left(\hat{\beta}_{0}\right)=s\sqrt{\cfrac{\frac{1}{N}\sum x_{i}^{2}}{\sum\left(x_{i}-\bar{x}\right)^{2}}}$
      • $SE\left(\hat{\beta}_{1}\right)=s\sqrt{\cfrac{1}{\sum\left(x_{i}-\bar{x}\right)^{2}}}$
      • matrix form: $\hat{\beta}=\left(X^{\prime} X\right)^{-1}X^{\prime}y$
      • $s^{2}=\cfrac{\hat{u}^{\prime} \hat{u}}{T-k}$ (variance of the error term; $k$ includes the intercept)
      • $SE\left(\hat{\beta}_{j}\right)=\sqrt{\left[s^{2}\left(X^{\prime} X\right)^{-1}\right]_{jj}}$
      • Proof: wikipedia
      • t-test: individual significance of each variable
      • F-test: joint significance of all slope coefficients (guards against junk regressions in the multivariable case)
      • $R^2$ (coefficient of determination): $R^{2}=1-\cfrac{\sum\left(y_{i}-\hat{y}_{i}\right)^{2}}{\sum\left(y_{i}-\bar{y}\right)^{2}}$ (a numpy sketch of these OLS quantities follows the Multicollinearity notes)
        • for one variable regression, $R^2=\rho(x,y)^2$
      • Multicollinearity
        • Problems:
          • $R^2$ will be high but the individual coefficients will have large standard errors
          • the regression becomes very sensitive to small changes in the specification
        • Solutions:
          • inspect the correlation matrix of the explanatory variables to detect it
          • principal components
          • drop one of the collinear variables
          • include a ratio of the collinear variables
          • increase the sample size, if the collinearity is an artifact of the particular sample
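
As referenced above, a minimal numpy sketch of the matrix-form OLS quantities, on hypothetical simulated data:

```python
import numpy as np

rng = np.random.default_rng(3)
N = 200
x = rng.normal(size=N)
y = 1.0 + 2.0 * x + rng.normal(size=N)   # hypothetical true model

X = np.column_stack([np.ones(N), x])     # design matrix, intercept included
k = X.shape[1]                           # k counts the intercept

beta = np.linalg.solve(X.T @ X, X.T @ y)            # (X'X)^{-1} X'y
u = y - X @ beta                                    # residuals
s2 = (u @ u) / (N - k)                              # s^2 = u'u / (T - k)
se = np.sqrt(np.diag(s2 * np.linalg.inv(X.T @ X)))  # SE of each coefficient

t = beta / se                                       # t-statistic per variable
r2 = 1 - (u @ u) / np.sum((y - y.mean()) ** 2)      # coefficient of determination
print(beta, se, t, r2)
```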
    • Time series analysis
      • VAR: vector autoregression
    • Panel data
      • dummy variables
    • Nonparametric Bootstrap and Bagging
      • draw n observations independently with replacement, where n is the sample size
      • compute the statistic of interest; repeat J times
      • the J results form the bootstrap distribution of the statistic
      • averaging the J results instead is called bagging
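
A numpy sketch of the nonparametric bootstrap for the sample mean (the sample, seed, and J are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.standard_t(df=5, size=100)    # hypothetical sample

J = 2000                              # number of bootstrap replications
n = len(x)

# Each replication: n independent draws WITH replacement, then the statistic
idx = rng.integers(0, n, size=(J, n))
boot_means = x[idx].mean(axis=1)      # bootstrap distribution of the mean

print(boot_means.std())               # bootstrap SE of the mean
print(boot_means.mean())              # averaging the replications = bagging
```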
    • Subset selection
      • compare BIC, AIC or adjusted $R^2$
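
A hand-rolled sketch of these criteria (Gaussian AIC/BIC up to additive constants; ic_scores and the data below are hypothetical):

```python
import numpy as np

def ic_scores(y, X):
    """Gaussian AIC, BIC (up to constants) and adjusted R^2 for an OLS fit."""
    n, k = X.shape
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    rss = np.sum((y - X @ beta) ** 2)
    tss = np.sum((y - y.mean()) ** 2)
    aic = n * np.log(rss / n) + 2 * k
    bic = n * np.log(rss / n) + k * np.log(n)
    adj_r2 = 1 - (rss / (n - k)) / (tss / (n - 1))
    return aic, bic, adj_r2

rng = np.random.default_rng(8)
n = 150
x1, x2 = rng.normal(size=(2, n))
y = 1.0 + 2.0 * x1 + rng.normal(size=n)   # x2 is pure noise
print(ic_scores(y, np.column_stack([np.ones(n), x1])))      # smaller AIC/BIC wins
print(ic_scores(y, np.column_stack([np.ones(n), x1, x2])))  # penalized for x2
```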
    • Ridge & Lasso regression
      • Ridge regression
        • $\min[(\mathbf{y}-X\beta)^{\prime}(\mathbf{y}-X\beta)+\lambda\beta^{\prime}\beta]$
        • $\widehat{\beta}^{\text {ridge}}(\lambda)=\left(\mathbf{X}^{\prime}\mathbf{X}+\lambda \mathbf{I}\right)^{-1}\mathbf{X}^{\prime}\mathbf{y}$
        • Covariates should be standardized
        • reducing over-fit
        • performs better than lasso when many covariates are highly correlated
      • Lasso regression
        • the penalty term is the $L_1$ norm $\lambda\sum_{j}|\beta_{j}|$ instead of $\lambda\beta^{\prime}\beta$
        • set some coefficients to zero
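
A closed-form ridge sketch with two deliberately collinear covariates (all data hypothetical), illustrating the stabilization relative to OLS:

```python
import numpy as np

rng = np.random.default_rng(5)
N = 100
# Two highly correlated covariates (the case where ridge tends to beat lasso)
x1 = rng.normal(size=N)
x2 = x1 + 0.05 * rng.normal(size=N)
X = np.column_stack([x1, x2])
y = x1 + x2 + rng.normal(size=N)

# Standardize covariates before penalizing; demean y to drop the intercept
X = (X - X.mean(axis=0)) / X.std(axis=0)
yc = y - y.mean()

# Closed form: beta_ridge = (X'X + lambda*I)^{-1} X'y
lam = 10.0
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ yc)
beta_ols = np.linalg.solve(X.T @ X, X.T @ yc)
print(beta_ols)    # unstable under collinearity
print(beta_ridge)  # shrunk toward zero, much more stable
```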
  • Maximum likelihood
    • assumption: errors are iid and normal (or Student-t)
    • likelihood $L$: $p(\mathbf{y} \mid \theta)=\prod_{t=1}^{n} p\left(y_{t} \mid \theta\right)$
    • log likelihood: $\ell=\log(L)$
    • Example: the Gaussian linear model
      • $y_{t}=\beta^{T}x_{t}+\epsilon_{t},\quad\epsilon_{t}\sim N\left(0,\sigma^{2}\right)$
      • $\theta=\left\{\beta,\sigma^{2}\right\},\quad p\left(y_{t}\mid\theta\right)=\frac{1}{\sqrt{2\pi\sigma^{2}}}e^{-\frac{1}{2\sigma^{2}}\left(y_{t}-\beta^{T} x_{t}\right)^{2}}$
        • if using a t-distribution: $\theta=\left\{\beta, \sigma^{2}, v\right\}$ and $y_{t} \sim t\left(\beta^{T} x_{t}, \sigma^{2}, v\right)$
      • $\begin{aligned} \ell(\theta ; \mathbf{y}) &=-\frac{1}{2} \sum_{t=1}^{n}\left(\log \sigma^{2}+\frac{1}{\sigma^{2}}\left(y_{t}-\beta^{\prime} x_{t}\right)^{2}\right)\\ &=-\frac{n}{2} \log \sigma^{2}-\frac{1}{2 \sigma^{2}} \sum_{t=1}^{n}\left(y_{t}-\beta^{\prime} x_{t}\right)^{2} \end{aligned}$ (dropping constants)
      • $\theta_{ML}=\underset{\theta}{\arg\max}\ \ell(\theta;\mathbf{y})$
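
A sketch of maximizing this Gaussian log likelihood numerically with scipy (the data and the log-variance parametrization are illustrative choices):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(6)
N = 300
X = np.column_stack([np.ones(N), rng.normal(size=N)])
y = X @ np.array([0.5, 2.0]) + rng.normal(scale=1.5, size=N)

def neg_loglik(theta):
    # theta = (beta_0, beta_1, log sigma^2); log keeps sigma^2 > 0
    beta, log_s2 = theta[:-1], theta[-1]
    resid = y - X @ beta
    # negative of: -(N/2) log sigma^2 - (1/(2 sigma^2)) * RSS
    return 0.5 * (N * log_s2 + np.exp(-log_s2) * resid @ resid)

res = minimize(neg_loglik, x0=np.zeros(3))
beta_ml, s2_ml = res.x[:-1], np.exp(res.x[-1])
print(beta_ml, s2_ml)   # beta matches OLS; s2_ml divides by N, not N-k
```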
  • Logistic(logit) regression
    • actually a classification method rather than a regression, despite the name
    • logistic function: $y=\cfrac{1}{1+e^{-z}}=\cfrac{e^z}{1+e^z}$
    • kind of 'Sigmoid' function
    • log likelihood: $\sum_{i=1}^{n}\left[y_{i} \beta^{\prime} x_{i}-\log \left(1+\exp \left(\beta^{\prime} x_{i}\right)\right)\right]$
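
A sketch that maximizes exactly this log likelihood with scipy on hypothetical data; np.logaddexp(0, z) computes log(1 + e^z) stably:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(7)
N = 500
X = np.column_stack([np.ones(N), rng.normal(size=N)])
p = 1.0 / (1.0 + np.exp(-(X @ np.array([-0.5, 1.5]))))  # logistic function
y = rng.binomial(1, p)                                   # 0/1 labels

def neg_loglik(beta):
    z = X @ beta
    # -sum_i [ y_i * beta'x_i - log(1 + exp(beta'x_i)) ]
    return -(y @ z - np.sum(np.logaddexp(0.0, z)))

beta_hat = minimize(neg_loglik, x0=np.zeros(2)).x
print(beta_hat)   # close to the true (-0.5, 1.5)
```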