Cross Validation is often used as a tool for model selection across classifiers. As discussed in detail in the following paper https://ssrn.com/abstract=2967184, Cross Validation is typically performed in the following steps:
- Step 1: Divide the original sample into K subsamples; each subsample typically has the same size and is referred to as one fold, hence the name K-fold.
- Step 2: In turn, keep one fold as a holdout sample for Validation and perform Training on the remaining K-1 folds; repeat this for K iterations so that each fold serves as the holdout sample exactly once.
- Step 3: The performance statistics (e.g., Misclassification Error) calculated over the K iterations reflect the overall K-fold Cross Validation performance of a given classifier.
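The three steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used in the paper: the function names (`k_fold_indices`, `majority_class`, `cross_validate`), the synthetic labels, and the toy "classifier" that always predicts the most frequent training label are all assumptions made for the example. Averaging the per-fold errors is one common way to summarise Step 3.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Step 1: shuffle the sample indices and split them into k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most one.
    return [indices[i::k] for i in range(k)]

def majority_class(train_labels):
    """Toy 'training' (an assumption): the model is the most frequent label."""
    return max(set(train_labels), key=train_labels.count)

def cross_validate(labels, k, seed=0):
    """Steps 2 and 3: hold out each fold in turn, then average the fold errors."""
    folds = k_fold_indices(len(labels), k, seed)
    fold_errors = []
    for holdout in folds:
        held = set(holdout)
        # Train on the remaining K-1 folds.
        train_labels = [y for i, y in enumerate(labels) if i not in held]
        prediction = majority_class(train_labels)
        # Misclassification error on the holdout fold.
        wrong = sum(1 for i in holdout if labels[i] != prediction)
        fold_errors.append(wrong / len(holdout))
    return sum(fold_errors) / k

# Synthetic labels, made up for illustration: 70 of class 0, 30 of class 1.
labels = [0] * 70 + [1] * 30
cv_error = cross_validate(labels, k=10)
print(f"10-fold CV misclassification error: {cv_error:.2f}")
```

In practice the toy majority-class rule would be replaced by a real classifier that is fitted on the training folds and scored on the holdout fold.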
However, one question often comes up: how should K be chosen in K-fold Cross Validation? The rule-of-thumb choice often suggested in the literature, which is based on non-financial markets, is K=10. The question is: does this rule of thumb also hold for Financial Markets?
In the paper linked below, we compare a range of choices for K in K-fold Cross Validation, in the context of Financial Markets, for the following 8 popular classifiers:
- Neural Network
- Support Vector Machine
- Ensemble
- Discriminant Analysis
- Naïve Bayes
- K-nearest Neighbours
- Decision Tree
- Logistic Regression
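To give a feel for what comparing choices of K looks like in code, here is a minimal sketch that sweeps a few candidate values of K for a single classifier. Everything in it is an assumption made for illustration: the synthetic one-dimensional data, the candidate values of K, the helper names, and the simple nearest-centroid rule, which is a stand-in and not one of the classifiers studied in the paper. The same loop could be repeated for each classifier in the list above.

```python
import random

def make_toy_data(n, seed=0):
    """Synthetic 1-D data (made up): class 0 centred at 0.0, class 1 at 1.0."""
    rng = random.Random(seed)
    return [(rng.gauss(float(i % 2), 1.0), i % 2) for i in range(n)]

def fit_centroids(train):
    """Stand-in classifier: store the mean feature value of each class."""
    centroids = {}
    for label in (0, 1):
        values = [x for x, y in train if y == label]
        if values:  # guard against a class missing from the training folds
            centroids[label] = sum(values) / len(values)
    return centroids

def predict(centroids, x):
    """Predict the class whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))

def k_fold_error(data, k, seed=0):
    """Average misclassification error over the k holdout folds."""
    indices = list(range(len(data)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    fold_errors = []
    for holdout in folds:
        held = set(holdout)
        train = [data[i] for i in indices if i not in held]
        centroids = fit_centroids(train)
        wrong = sum(1 for i in holdout
                    if predict(centroids, data[i][0]) != data[i][1])
        fold_errors.append(wrong / len(holdout))
    return sum(fold_errors) / k

data = make_toy_data(200)
candidate_ks = [2, 5, 10, 20]  # hypothetical range of K values to compare
errors_by_k = {k: k_fold_error(data, k) for k in candidate_ks}
for k, err in errors_by_k.items():
    print(f"K={k:2d}  CV misclassification error={err:.3f}")
```

The numbers printed for this toy data say nothing about financial data; the sketch only shows the mechanics of evaluating the same classifier under several values of K.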
For those who want to know a bit more, the paper is available at https://ssrn.com/abstract=2967184