Improving Model Accuracy

by Jack Simpson December 11, 2016

I jotted down a few quick bullet points from the article “8 Proven Ways for improving the ‘Accuracy’ of a Machine Learning Model” for future reference.

Improving Accuracy

Add more data
Fix missing values
- Continuous: impute with median/mean/mode
- Categorical: treat as separate class
- Predict missing classes with k-nearest neighbours
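
As a rough illustration of the first two options, here is a minimal sketch using only the Python standard library. The toy data and the function names `impute_continuous` and `impute_categorical` are my own assumptions, not something from the article, and the k-nearest neighbours option is not shown.

```python
from statistics import median

# Hypothetical toy data; None marks a missing value.
ages = [23, 31, None, 45, 27, None]
colours = ["red", None, "blue", "red", None, "green"]


def impute_continuous(values):
    """Replace missing entries with the median of the observed ones."""
    fill = median(v for v in values if v is not None)
    return [fill if v is None else v for v in values]


def impute_categorical(values, label="missing"):
    """Treat missing entries as their own category."""
    return [label if v is None else v for v in values]


ages_filled = impute_continuous(ages)
colours_filled = impute_categorical(colours)
print(ages_filled)
print(colours_filled)
```

Swapping `median` for `mean` or `mode` from the same module gives the other two continuous options.
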
Outliers
- Delete
- Bin
- Impute
- Treat separately from the other observations
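
A small sketch of the delete and impute options, again standard library only. The article does not say how to detect outliers, so the 1.5 × IQR rule (Tukey's fences) used here is my assumption, as are the toy values.

```python
from statistics import median, quantiles

# Hypothetical measurements with one obvious outlier.
values = [10, 12, 11, 13, 12, 11, 95]

# Assumed detection rule: flag points more than 1.5 * IQR outside the quartiles.
q1, _, q3 = quantiles(values, n=4)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr


def is_outlier(v):
    """True when v falls outside the fences computed above."""
    return v < low or v > high


outliers = [v for v in values if is_outlier(v)]

# Option 1: delete the outlying observations.
deleted = [v for v in values if not is_outlier(v)]

# Option 2: impute them with the median of the remaining values.
fill = median(deleted)
imputed = [fill if is_outlier(v) else v for v in values]

print(outliers, deleted, imputed)
```
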
Feature engineering
- Transform and normalise: scale values to the range 0 to 1
- Reduce skewness (e.g. with a log transform) for algorithms that assume normally distributed data
- Create features: the date of a transaction might not be useful, but the day of the week may be
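
The three ideas above can be sketched in a few lines of plain Python. The helper names and sample values are hypothetical, and `log1p` (log of 1 + x) is just one common choice of transform that I picked because it copes with zeros.

```python
import math
from datetime import date


def min_max_scale(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant feature: no spread to scale
    return [(v - lo) / (hi - lo) for v in values]


def log_transform(values):
    """Compress a right-skewed, non-negative feature with log(1 + x)."""
    return [math.log1p(v) for v in values]


def day_of_week(d):
    """Return 0 for Monday through 6 for Sunday."""
    return d.weekday()


amounts = [2, 4, 6]                        # hypothetical transaction amounts
scaled = min_max_scale(amounts)            # [0.0, 0.5, 1.0]
logged = log_transform([0, 9, 99])
weekday = day_of_week(date(2016, 12, 11))  # a Sunday, so 6
print(scaled, logged, weekday)
```
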
Feature selection
- Best features to use: identify via visualisation or through domain knowledge
- Significance: use p-values and other metrics to identify the right features. Can also use dimensionality reduction while preserving relationships in the data
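
As a stand-in for the "other metrics", here is a minimal sketch that ranks features by the absolute Pearson correlation with the target. This is not a p-value test and not dimensionality reduction, just a simple filter I chose for illustration; the feature names and data are made up.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / (sx * sy)


# Hypothetical features and target.
features = {
    "signal": [1, 2, 3, 4, 5],
    "noise": [2, 1, 2, 1, 2],
}
target = [2, 4, 6, 8, 10]

# Rank features by the strength of their linear relationship with the target.
scores = {name: abs(pearson(col, target)) for name, col in features.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked, scores)
```

Correlation only picks up linear relationships, so a feature with a low score here may still be useful to a non-linear model.
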
Test multiple machine learning algorithms and tune their parameters
Ensemble methods: combine multiple weak predictors (bagging and boosting)
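
To make the bagging idea concrete, here is a minimal sketch in plain Python: each weak predictor is a one-split "decision stump" trained on a bootstrap resample, and the ensemble predicts by majority vote. The data, function names and the choice of 25 stumps are assumptions for illustration, and boosting is not shown.

```python
import random


def fit_stump(xs, ys):
    """Fit a one-split classifier: pick the threshold and polarity with fewest errors."""
    best = None
    for t in sorted(set(xs)):
        for polarity in (1, 0):  # class predicted when x >= t
            errors = sum((polarity if x >= t else 1 - polarity) != y
                         for x, y in zip(xs, ys))
            if best is None or errors < best[0]:
                best = (errors, t, polarity)
    _, t, polarity = best
    return t, polarity


def predict_stump(stump, x):
    """Predict the class of x with a single stump."""
    t, polarity = stump
    return polarity if x >= t else 1 - polarity


def fit_bagged(xs, ys, n_models=25, seed=0):
    """Bagging: train each stump on a bootstrap resample of the data."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(len(xs)) for _ in xs]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def predict_bagged(models, x):
    """Majority vote across the stumps."""
    votes = sum(predict_stump(m, x) for m in models)
    return 1 if votes * 2 > len(models) else 0


# Hypothetical 1-D data: class 1 when x >= 5.
xs = list(range(10))
ys = [0 if x < 5 else 1 for x in xs]
models = fit_bagged(xs, ys)
print(predict_bagged(models, 0), predict_bagged(models, 9))
```

Each stump sees a slightly different resample and so places its split slightly differently; averaging their votes smooths out those individual quirks.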