Review

Lucy D’Agostino McGowan

Make sure to bring a calculator to the exam

Exam

  • Two parts
    • Part 1: In class Monday (you can have one cheat sheet)
    • Part 2: The same exam, taken at home – this is open notes (there will be no class Wednesday so you can have dedicated time to work on this then)

y x1 x2
5.7 2 1
8.3 3 1
7.3 4 0

You want to predict y using x1 and x2. Write out how you would calculate \(\hat\beta\) in matrix form using the data provided (you do not need to solve the matrix equation).
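
As a sketch of the setup, assuming an ordinary least squares fit with an intercept (so the design matrix \(\mathbf{X}\) gets a leading column of ones, followed by x1 and x2 from the table above):

\[ \hat\beta = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}, \quad \mathbf{X} = \begin{bmatrix} 1 & 2 & 1 \\ 1 & 3 & 1 \\ 1 & 4 & 0 \end{bmatrix}, \quad \mathbf{y} = \begin{bmatrix} 5.7 \\ 8.3 \\ 7.3 \end{bmatrix} \]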

y x1 x2
5.1 2 1
6.9 3 1
7.8 4 0

Solving the above equation results in the following:

\[ \begin{bmatrix} \hat\beta_0 \\\hat\beta_1\\\hat\beta_2 \end{bmatrix} = \begin{bmatrix} 0.7\\1.8\\0.9 \end{bmatrix} \]

Using the information provided, calculate the MSE for this model.
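
One way to check your hand calculation is the short sketch below. It assumes MSE means the average squared residual, \(\frac{1}{n}\sum (y_i - \hat y_i)^2\), and the names `rows` and `predict` are illustrative only.

```python
# Training data from the table above: (y, x1, x2) per row.
rows = [
    (5.1, 2, 1),
    (6.9, 3, 1),
    (7.8, 4, 0),
]

# Coefficient estimates given above: intercept, x1, x2.
b0, b1, b2 = 0.7, 1.8, 0.9


def predict(x1, x2):
    """Fitted value from the linear model."""
    return b0 + b1 * x1 + b2 * x2


# Squared residual for each observation, then the average (assumes MSE = SSE / n).
squared_errors = [(y - predict(x1, x2)) ** 2 for y, x1, x2 in rows]
mse = sum(squared_errors) / len(rows)

print(round(mse, 4))
```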

y x1 x2
9.1 4 1
6.2 3 0
5.8 2 1

You get a new test data set (above). Using the model you fit to the training data, calculate the MSE in this test set.
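
A worked sketch, again assuming MSE is the average squared prediction error and using the fitted model \(\hat y = 0.7 + 1.8x_1 + 0.9x_2\) from the training data. The predictions for the three test rows are 8.8, 6.1, and 5.2, so:

\[ \text{MSE}_{\text{test}} = \frac{(9.1 - 8.8)^2 + (6.2 - 6.1)^2 + (5.8 - 5.2)^2}{3} = \frac{0.09 + 0.01 + 0.36}{3} \approx 0.153 \]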

Logistic Regression

term estimate std.error statistic p.value conf.low conf.high
(Intercept) 12.04 4.51 2.67 0.01 5.21 23.63
wt -4.02 1.44 -2.80 0.01 -7.70 -1.83

We are predicting whether a car has automatic transmission based on its weight.

How do you interpret \(\hat\beta_1\)? Is this a marginal or conditional effect?
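
A hint in code form: the estimate is on the log-odds scale, so exponentiating it gives an odds ratio for a one-unit increase in weight. Because wt is the only predictor in this model, nothing else is being held constant. The sketch below only converts the numbers from the table; the variable names are illustrative.

```python
import math

beta1_hat = -4.02                    # wt estimate from the table above (log-odds scale)
conf_low, conf_high = -7.70, -1.83   # confidence limits from the table (log-odds scale)

# Exponentiate to move from the log-odds scale to the odds-ratio scale.
odds_ratio = math.exp(beta1_hat)
or_low, or_high = math.exp(conf_low), math.exp(conf_high)

print(f"odds ratio per one-unit increase in wt: {odds_ratio:.3f}")
print(f"interval on the odds-ratio scale: ({or_low:.4f}, {or_high:.3f})")
```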

Logistic Regression

term estimate std.error statistic p.value conf.low conf.high
(Intercept) 25.89 12.19 2.12 0.03 5.60 55.83
wt -6.42 2.55 -2.52 0.01 -12.82 -2.37
mpg -0.32 0.24 -1.35 0.18 -0.87 0.12

We add an additional variable, mpg.

How do you interpret \(\hat\beta_1\)? Is this a conditional effect?

Penalized Regression

  • What is the penalty for Ridge Regression?
  • What is the penalty for Lasso?
  • What is the equation for Elastic Net?
  • How do we choose \(\lambda\)? \(\alpha\)?
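
For reference, a sketch of the penalties in one common parametrization (scaling constants differ between textbooks and software, so check against your notes):

\[ \text{Ridge: } \lambda\sum_{j=1}^p \beta_j^2 \qquad \text{Lasso: } \lambda\sum_{j=1}^p |\beta_j| \]

\[ \text{Elastic Net: } \text{RSS} + \lambda\left[\frac{1-\alpha}{2}\sum_{j=1}^p \beta_j^2 + \alpha\sum_{j=1}^p |\beta_j|\right] \]

In this form \(\alpha = 0\) gives Ridge and \(\alpha = 1\) gives Lasso. Both \(\lambda\) and \(\alpha\) are tuning parameters, typically chosen by cross-validation.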

Bias-variance trade-off

  • What is the bias-variance trade-off?
  • As the flexibility of the model increases, how does that impact bias? variance? Training MSE? Testing MSE?
  • As \(\lambda\) increases in penalized regression, how does this impact the flexibility of the model?
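
To build intuition for the last question, here is a minimal sketch of ridge shrinkage in the simplest possible case: one predictor and no intercept (a simplifying assumption, not the model from the earlier slides). It reuses the x1 and y values from the training table earlier in this review, and `ridge_slope` is a hypothetical helper name.

```python
# x1 and y from the training table earlier in this review.
x = [2.0, 3.0, 4.0]
y = [5.1, 6.9, 7.8]

sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_xx = sum(xi * xi for xi in x)


def ridge_slope(lam):
    """Minimizer of sum((y - b*x)^2) + lam * b^2 for one predictor, no intercept."""
    return sum_xy / (sum_xx + lam)


# lambda = 0 is ordinary least squares; larger values penalize the slope more heavily.
lambdas = [0, 1, 10, 100, 1000]
slopes = [ridge_slope(lam) for lam in lambdas]

for lam, b in zip(lambdas, slopes):
    print(f"lambda = {lam:>4}: slope = {b:.3f}")
```

The slope moves toward zero as \(\lambda\) grows, which is the sense in which a larger penalty makes the model less flexible.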

k-fold cross validation

  • How does it work?
  • What are the advantages/disadvantages of a small \(k\) vs large \(k\)?
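
The mechanics can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: the data are made up (not from the slides), the model is a single-predictor least squares line, and folds are assigned by interleaving indices rather than by random shuffling, which is what you would usually do in practice. The names `fit_line` and `k_fold_mse` are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for a single predictor."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def k_fold_mse(xs, ys, k):
    """Average held-out MSE over k folds (fold i holds out indices i, i+k, i+2k, ...)."""
    n = len(xs)
    fold_mses = []
    for fold in range(k):
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        # Fit on the k-1 training folds only.
        b0, b1 = fit_line([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        # Evaluate on the held-out fold.
        errors = [(ys[i] - (b0 + b1 * xs[i])) ** 2 for i in test_idx]
        fold_mses.append(sum(errors) / len(errors))
    return sum(fold_mses) / k


# Made-up illustrative data (not from the slides): a roughly linear trend.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
ys = [3.1, 4.9, 7.2, 9.0, 10.8, 13.1, 15.2, 16.9, 19.1, 20.8, 23.0, 25.1]

print(k_fold_mse(xs, ys, k=3))   # 3-fold CV: 3 model fits
print(k_fold_mse(xs, ys, k=12))  # leave-one-out CV (k = n): 12 model fits
```

Note that the number of model fits equals \(k\), which is one part of the small-\(k\) vs large-\(k\) comparison.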