# Problems occurring during validation

## Validation stage

Causes of different scores and optimal parameters across validation splits:

1. Too little data
2. Too diverse and inconsistent data

In these cases we should do extensive validation:

1. Average scores from different KFold splits
2. Tune the model on one split, evaluate the score on the other

## Submission stage

We can observe that:

- The leaderboard (LB) score is consistently higher/lower than the validation score
- The LB score does not correlate with the validation score at all

## Expect LB shuffle because of

- Randomness
- Small amount of data
- Different public/private distributions
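
The advice from the validation stage to average scores from different KFold splits can be sketched as below. This is a minimal illustration, not a reference implementation: the helpers `kfold_indices`, `fit_mean`, `mse` and `cv_score`, the synthetic target `y`, and the mean-predictor "model" are all hypothetical stand-ins chosen so the example runs with the standard library only. In practice a library splitter such as scikit-learn's `KFold` with different `random_state` values would likely play the role of `kfold_indices`.

```python
import random
import statistics


def kfold_indices(n_samples, n_splits, seed):
    """Shuffle indices with the given seed and cut them into n_splits folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every n_splits-th index starting at i,
    # so fold sizes differ by at most one.
    return [indices[i::n_splits] for i in range(n_splits)]


def fit_mean(y_train):
    """Hypothetical stand-in model: always predicts the training mean."""
    return statistics.fmean(y_train)


def mse(y_true, prediction):
    """Mean squared error of a constant prediction."""
    return statistics.fmean([(y - prediction) ** 2 for y in y_true])


def cv_score(y, n_splits, seed):
    """Mean out-of-fold MSE for the K-fold split defined by `seed`."""
    fold_scores = []
    for held_out in kfold_indices(len(y), n_splits, seed):
        held_out_set = set(held_out)
        y_train = [y[i] for i in range(len(y)) if i not in held_out_set]
        y_valid = [y[i] for i in held_out]
        fold_scores.append(mse(y_valid, fit_mean(y_train)))
    return statistics.fmean(fold_scores)


# Small synthetic target, purely illustrative (assumption, not real data).
rng = random.Random(0)
y = [rng.gauss(0.0, 1.0) for _ in range(60)]

# Repeat K-fold with several seeds and average the per-split scores.
seeds = [0, 1, 2, 3, 4]
scores = [cv_score(y, n_splits=5, seed=s) for s in seeds]
mean_score = statistics.fmean(scores)
spread = statistics.stdev(scores)

print("per-seed CV MSE:", [round(s, 4) for s in scores])
print(f"averaged CV MSE: {mean_score:.4f} (std across seeds: {spread:.4f})")
```

The spread across seeds gives a rough sense of how much a single split can mislead: with little or inconsistent data it tends to be large, which is exactly when averaging over several splits is most useful.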