Table of Contents

Hyperparameter tuning

Plan for the lecture

Plan for the lecture : models

What framework to use?

They implement the same functionality! (except sklearn)

Tips

Don't spend too much time tuning hyperparameters

Be patient

Average everything

How do we tune hyperparameters

  1. Select the most influential parameters
    - There are tons of parameters and we can't tune all of them
  2. Understand how exactly they influence the training
  3. Tune them!
    - Manually (change and examine)
    - Automatically (hyperopt, etc.)
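The automatic branch of step 3 can be sketched with plain random search using only the standard library; libraries like hyperopt do the same loop more cleverly (e.g. with TPE). Here `train_and_score` is a hypothetical placeholder for training a model and returning its validation error:

```python
import random

def train_and_score(params):
    # Hypothetical objective: stands in for "train the model with
    # `params` and return its validation error (lower is better)".
    # For illustration, pretend the optimum is max_depth=7, eta=0.1.
    return abs(params['max_depth'] - 7) + abs(params['eta'] - 0.1)

def random_search(n_trials=100, seed=0):
    rng = random.Random(seed)
    best_params, best_score = None, float('inf')
    for _ in range(n_trials):
        # sample a candidate configuration from the search space
        params = {
            'max_depth': rng.randint(3, 15),
            'eta': rng.uniform(0.01, 0.3),
        }
        score = train_and_score(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

The structure is the same as in the hyperopt example below: a scoring function plus a search space, with the library replacing the hand-written sampling loop.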

Hyperparameter optimization software

A lot of libraries to try; for example, hyperopt:

from hyperopt import fmin, hp, tpe

def xgb_score(param):
  # run XGBoost with hyperparameters 'param' and return the
  # validation loss for hyperopt to minimize
  ...

def xgb_hyperopt():
  space = {
    'eta': 0.01,
    'max_depth': hp.quniform('max_depth', 10, 30, 1),
    'min_child_weight': hp.quniform('min_child_weight', 0, 100, 1),
    'subsample': hp.quniform('subsample', 0.1, 1.0, 0.1),
    'gamma': hp.quniform('gamma', 0.0, 30, 0.5),
    'colsample_bytree': hp.quniform('colsample_bytree', 0.1, 1.0, 0.1),
    'objective': 'reg:linear',
    'nthread': 28,
    'silent': 1,
    'num_round': 2500,
    'seed': 2441,
    'early_stopping_rounds': 100
  }

  best = fmin(xgb_score, space, algo=tpe.suggest, max_evals=1000)
  return best

Color-coding legend

  1. Underfitting (bad) (red)
  2. Good fit and generalization (good)
  3. Overfitting (bad) (green)

A parameter in red

  1. Increasing it impedes fitting
  2. Increase it to reduce overfitting
  3. Decrease it to let the model fit more easily

A parameter in green

  1. Increasing it leads to a better fit (overfit) on the train set
  2. Increase it if the model underfits
  3. Decrease it if the model overfits
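The red/green rule above can be written down as a small lookup. The color assignments here are illustrative examples of common XGBoost parameters, not an official or exhaustive list: "red" parameters constrain the model, "green" ones let it fit the training data more closely.

```python
# Illustrative color assignments (assumptions, not from the source):
# red  = increasing it constrains the model (fights overfitting)
# green = increasing it lets the model fit the train set better
PARAM_COLOR = {
    'min_child_weight': 'red',
    'lambda': 'red',
    'alpha': 'red',
    'max_depth': 'green',
    'num_round': 'green',
}

def suggested_change(param, symptom):
    """Return 'increase' or 'decrease' for a symptom,
    which is either 'overfitting' or 'underfitting'."""
    color = PARAM_COLOR[param]
    if symptom == 'overfitting':
        # tighten the constraint, or cut model capacity
        return 'increase' if color == 'red' else 'decrease'
    # underfitting: relax the constraint, or add capacity
    return 'decrease' if color == 'red' else 'increase'
```

For example, an overfitting model suggests increasing `min_child_weight` (red) or decreasing `max_depth` (green).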


Conclusion

Hyperparameter tuning in general

Models, libraries and hyperparameter optimization
