How to Improve Learning Algorithms
- get more training examples
- feature selection
- get additional features
- add polynomial features
- decrease/increase $\lambda$
Evaluate a Learning Algorithm
- training/validation/test set, e.g. 60/20/20 split
- training/validation/test error
- model selection:
- optimize parameters by minimizing training error for each model
- select the model with the least validation error (e.g. select polynomial degree $d$)
- estimate generalization error using test error
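A minimal sketch of this procedure, assuming scikit-learn and a synthetic one-dimensional dataset (both illustrative choices, not from the notes):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + 0.3 * rng.standard_normal(200)

# 60/20/20 split: carve off 40%, then halve it into validation and test sets.
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

best_d, best_val_err, best_model = None, np.inf, None
for d in range(1, 11):                       # candidate polynomial degrees
    model = make_pipeline(PolynomialFeatures(d), LinearRegression())
    model.fit(X_train, y_train)              # minimize training error
    val_err = mean_squared_error(y_val, model.predict(X_val))
    if val_err < best_val_err:               # keep the least validation error
        best_d, best_val_err, best_model = d, val_err, model

# The held-out test error estimates generalization error.
test_err = mean_squared_error(y_test, best_model.predict(X_test))
print(f"chosen d={best_d}, val MSE={best_val_err:.3f}, test MSE={test_err:.3f}")
```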
Machine Learning Diagnostic: Bias vs Variance
- a diagnostic can rule out certain courses of action as unlikely to significantly improve your learning algorithm's performance
- high bias = underfit = high training error, validation error $\approx$ training error
- high variance = overfit = low training error, validation error $\gg$ training error
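As a rough illustration, this diagnostic can be expressed as a comparison of the two errors; the target error and gap ratio below are illustrative assumptions, not from the notes:

```python
def diagnose(train_err, val_err, target_err, gap_ratio=1.5):
    """Heuristic bias/variance read-off (illustrative thresholds)."""
    if train_err > target_err and val_err <= gap_ratio * train_err:
        return "high bias (underfit): both errors high, validation ≈ training"
    if train_err <= target_err and val_err > gap_ratio * train_err:
        return "high variance (overfit): low training error, validation >> training"
    return "no clear bias/variance problem"

print(diagnose(train_err=0.30, val_err=0.33, target_err=0.10))  # high bias
print(diagnose(train_err=0.02, val_err=0.25, target_err=0.10))  # high variance
```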
Regularization and Bias/Variance
- large $\lambda$ $\Rightarrow$ high bias = underfit
- small $\lambda$ $\Rightarrow$ high variance = overfit
- choose regularization parameter:
- create a list of $\lambda$s
- create a list of models
- iterate through the $\lambda$s and for each $\lambda$ go through all the models to optimize parameter $\Theta$
- compute validation error using the learned $\Theta$
- select the best combo with the least validation error
- estimate generalization error using test error
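A minimal sketch of this loop, assuming scikit-learn's Ridge (whose `alpha` plays the role of $\lambda$) and the train/validation/test split from the earlier sketch:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.metrics import mean_squared_error

lambdas = [0.01 * 2**k for k in range(11)]   # 0.01 ... 10.24
degrees = [1, 2, 4, 8]                       # the "list of models"

best_val_err, best_lam, best_d = np.inf, None, None
for lam in lambdas:
    for d in degrees:                        # for each λ, go through all models
        model = make_pipeline(PolynomialFeatures(d), Ridge(alpha=lam))
        model.fit(X_train, y_train)          # learn Θ on the training set
        val_err = mean_squared_error(y_val, model.predict(X_val))
        if val_err < best_val_err:           # least validation error wins
            best_val_err, best_lam, best_d = val_err, lam, d

final = make_pipeline(PolynomialFeatures(best_d), Ridge(alpha=best_lam))
final.fit(X_train, y_train)
test_err = mean_squared_error(y_test, final.predict(X_test))  # generalization estimate
print(f"best λ={best_lam}, d={best_d}, val MSE={best_val_err:.3f}, test MSE={test_err:.3f}")
```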
Learning Curves
- as the training set gets larger, the training error increases
- both errors tend to plateau after a certain training set size
- Experiencing high bias:
- Low training set size: low training error, high validation error
- Large training set size: high training error, validation error $\approx$ training error
- getting more training data will not (by itself) help much
- Experiencing high variance:
- Low training set size: low training error, high validation error
- Large training set size: training error increases with training set size, validation error continues to decrease without leveling off; difference between 2 errors remains significant
- getting more training data is likely to help
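One way to compute these curves is scikit-learn's `learning_curve` (an illustrative tool choice; the notes only describe the curves' shapes), reusing `X` and `y` from the first sketch:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import learning_curve
from sklearn.linear_model import LinearRegression

sizes, train_scores, val_scores = learning_curve(
    LinearRegression(), X, y,
    train_sizes=np.linspace(0.1, 1.0, 8),
    cv=5, scoring="neg_mean_squared_error",
)
train_err = -train_scores.mean(axis=1)   # scores are negated MSE, so flip sign
val_err = -val_scores.mean(axis=1)

plt.plot(sizes, train_err, label="training error")
plt.plot(sizes, val_err, label="validation error")
plt.xlabel("training set size")
plt.ylabel("MSE")
plt.legend()
plt.show()
# A persistent gap between the curves suggests high variance; two high,
# converged curves suggest high bias.
```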
Debugging a Learning Algorithm
- get more training examples $\rightarrow$ fixes high variance
- feature selection $\rightarrow$ fixes high variance
- get additional features $\rightarrow$ fixes high bias
- add polynomial features $\rightarrow$ fixes high bias
- increase $\lambda$ $\rightarrow$ fixes high variance
- decrease $\lambda$ $\rightarrow$ fixes high bias
Neural Networks and Overfitting
- small nn
- fewer parameters
- more prone to underfitting
- computationally cheaper
- larger nn
- more parameters
- more prone to overfitting
- computationally expensive
- use regularization (e.g. increase $\lambda$) to address overfitting in larger networks; see the sketch below
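A minimal sketch of the small-vs-large trade-off, assuming scikit-learn's `MLPRegressor` (its `alpha` is an L2 penalty playing the role of $\lambda$) and the split from the first sketch:

```python
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error

small_nn = MLPRegressor(hidden_layer_sizes=(2,), alpha=0.0,
                        max_iter=5000, random_state=0)
large_nn = MLPRegressor(hidden_layer_sizes=(128, 128), alpha=1e-2,
                        max_iter=5000, random_state=0)

for name, net in [("small", small_nn), ("large + regularization", large_nn)]:
    net.fit(X_train, y_train)
    print(name,
          "train MSE:", round(mean_squared_error(y_train, net.predict(X_train)), 3),
          "val MSE:", round(mean_squared_error(y_val, net.predict(X_val)), 3))
```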