Model Selection

Ahsan Ijaz

Model Evaluation

Making predictions using the model
Evaluating loss
Remember RSS? \[\textit{RSS} = \sum_{i=1}^{N}(y_i - [\color{blue}{w_0} + \color{blue}{w_{1}}x_i])^2 \]

Loss function formalization

RSS (squared loss) computed on the testing data
Root mean square error on the testing data, where \(N\) is the number of test observations \[\textit{RMSE} = \sqrt {\frac{1}{N}\sum_{i=1}^{N}(y_i - [w_0 + w_{1}x_i])^2} \]
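The two loss formulas can be checked by hand on a few points. The sketch below is only an illustration: the weights and the test observations are made-up values, and the helper names `predict`, `rss` and `rmse` are not from the slides.

```python
import math

def predict(w0, w1, x):
    """Prediction of the simple linear model y = w0 + w1 * x."""
    return w0 + w1 * x

def rss(w0, w1, xs, ys):
    """Residual sum of squares over the given observations."""
    return sum((y - predict(w0, w1, x)) ** 2 for x, y in zip(xs, ys))

def rmse(w0, w1, xs, ys):
    """Root mean square error: square root of the mean squared residual."""
    return math.sqrt(rss(w0, w1, xs, ys) / len(xs))

# Hypothetical fitted weights and held-out test points (illustrative values only).
w0, w1 = 1.0, 2.0
x_test = [0.0, 1.0, 2.0, 3.0]
y_test = [1.0, 3.5, 4.5, 7.0]

test_rss = rss(w0, w1, x_test, y_test)    # residuals are 0, 0.5, -0.5, 0
test_rmse = rmse(w0, w1, x_test, y_test)
print(test_rss, test_rmse)
```

RMSE is in the same units as \(y\), which makes it easier to read than the raw RSS when test sets differ in size.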

Training error vs model complexity

Constant model: only the \(w_0\) parameter is used, \(y = w_0\)

Training error vs model complexity

Linear model: \(y = w_0 + w_1x\)

Training error vs model complexity

Quadratic model: \(y = w_0 + w_1x + w_2x^2\)

Training error vs model complexity

Training error decreases as model complexity increases.
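A small experiment can make this concrete for the constant, linear and quadratic models. The sketch assumes NumPy is available and uses synthetic data (a noisy sine curve chosen purely for illustration); `training_rss` is a hypothetical helper name.

```python
import numpy as np

# Synthetic training data (an assumption for illustration, not from the slides).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 30)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.size)

def training_rss(degree):
    """Fit a polynomial of the given degree by least squares and
    return its RSS on the same data it was fitted to."""
    coeffs = np.polyfit(x, y, deg=degree)
    residuals = y - np.polyval(coeffs, x)
    return float(np.sum(residuals ** 2))

# Degree 0 = constant model, 1 = linear model, 2 = quadratic model.
train_errors = {degree: training_rss(degree) for degree in (0, 1, 2)}
for degree, err in train_errors.items():
    print(f"degree={degree}  training RSS={err:.3f}")
```

Each model contains the previous one as a special case (set the highest weight to zero), so a least-squares fit of the larger model cannot have a higher training RSS.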

Training error vs model complexity

Is a complex model better?

Training and testing data

Loss function calculation using testing data

Training and testing data

Training and testing error as a function of model complexity.
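One way to reproduce this comparison is to hold out part of the data and evaluate every fitted model on both parts. The following is a minimal sketch under stated assumptions: NumPy is available, the data are synthetic, the 40/20 split is arbitrary, and polynomial degree stands in for model complexity.

```python
import numpy as np

# Synthetic data (illustrative assumption): noisy sine curve on [0, 1].
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, size=60)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.size)

# Points were drawn at random, so a simple slice gives a random split.
x_train, y_train = x[:40], y[:40]
x_test, y_test = x[40:], y[40:]

def rmse(coeffs, xs, ys):
    """Root mean square error of a fitted polynomial on (xs, ys)."""
    return float(np.sqrt(np.mean((ys - np.polyval(coeffs, xs)) ** 2)))

results = []
for degree in range(0, 7):
    # Weights are estimated from the training data only.
    coeffs = np.polyfit(x_train, y_train, deg=degree)
    results.append((degree,
                    rmse(coeffs, x_train, y_train),
                    rmse(coeffs, x_test, y_test)))

for degree, train_err, test_err in results:
    print(f"degree={degree}  train RMSE={train_err:.3f}  test RMSE={test_err:.3f}")
```

The training column can only go down as the degree grows. The test column carries no such guarantee, which is why it is the one used to compare models.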

K-Nearest neighbors-model complexity

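Identify the \(K\) training points nearest to the test observation \(x_0\); this set is denoted \(\mathbb{N}_0\)
Estimate the conditional probability that \(x_0\) belongs to class \(j\) as the fraction of points in \(\mathbb{N}_0\) with label \(j\)
\[\textit{Pr}(Y=j|X=x_0) = \frac{1}{K}\sum_{i\in\mathbb{N}_0}I(y_i = j)\] Here \(x_0\) is the test observation, \(K\) is a positive integer, and \(I(\cdot)\) is the indicator function.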
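The two steps above can be sketched in a few lines of plain Python. This is an illustration, not a reference implementation: Euclidean distance is an assumed choice of metric, and the function name, points and labels are hypothetical.

```python
import math
from collections import Counter

def knn_class_probabilities(train_points, train_labels, x0, k):
    """Estimate Pr(Y = j | X = x0) as the fraction of the k nearest
    training points whose label is j."""
    # Order training indices by Euclidean distance to x0 (assumed metric).
    order = sorted(range(len(train_points)),
                   key=lambda i: math.dist(train_points[i], x0))
    neighbours = order[:k]  # indices of the points forming N_0
    counts = Counter(train_labels[i] for i in neighbours)
    return {label: count / k for label, count in counts.items()}

# Hypothetical two-class training set.
points = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1), (0.2, 0.1)]
labels = ["blue", "blue", "orange", "orange", "blue"]

probs = knn_class_probabilities(points, labels, x0=(0.0, 0.1), k=3)
predicted = max(probs, key=probs.get)  # class with the highest estimate
print(probs, predicted)
```

Classes that do not appear among the neighbors are simply absent from the returned dictionary, which corresponds to an estimated probability of zero.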

K-Nearest neighbors-model with k = 10

The black line indicates the KNN decision boundary with K = 10. The optimum decision boundary is shown as a purple dashed line.

K-Nearest neighbors-model with k = 1 and k = 100

The black curves indicate the KNN decision boundaries with K = 1 and K = 100. The optimum decision boundary is shown as a purple dashed line. With K = 1, the decision boundary is overly flexible. With K = 100, it is not sufficiently flexible.
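The effect of K can also be explored numerically by comparing training and testing error rates for K = 1, 10 and 100. The sketch below rests on several assumptions: the data are a synthetic one-dimensional two-class problem invented for illustration, distance is the absolute difference, and all function names are hypothetical.

```python
import random
from collections import Counter

random.seed(0)

def make_data(n):
    """Synthetic 1-D data: class 1 is more likely when x > 0.5 (assumed setup)."""
    data = []
    for _ in range(n):
        x = random.random()
        p_one = 0.8 if x > 0.5 else 0.2
        data.append((x, 1 if random.random() < p_one else 0))
    return data

def knn_predict(train, x0, k):
    """Majority vote among the k training points nearest to x0.
    Ties go to the class seen first, i.e. the one with the nearer point."""
    neighbours = sorted(train, key=lambda pair: abs(pair[0] - x0))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

def error_rate(train, data, k):
    """Fraction of observations in data that are misclassified."""
    wrong = sum(knn_predict(train, x, k) != y for x, y in data)
    return wrong / len(data)

train, test = make_data(200), make_data(200)
errors = {k: (error_rate(train, train, k), error_rate(train, test, k))
          for k in (1, 10, 100)}
for k, (train_err, test_err) in errors.items():
    print(f"K={k:3d}  train error={train_err:.3f}  test error={test_err:.3f}")
```

With K = 1 every training point is its own nearest neighbor, so the training error is zero as long as no two training points coincide. That says nothing about how the model behaves on the test set.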

Training vs testing with increasing complexity on KNN