Improving Model Accuracy

I jotted down a few quick bullet points from the article “8 Proven Ways for Improving the ‘Accuracy’ of a Machine Learning Model” for future reference.

Improving Accuracy

  • Add more data
  • Fix missing values
    • Continuous: impute with median/mean/mode
    • Categorical: treat as separate class
    • Predict missing values with k-nearest neighbours
  • Outliers
    • Delete
    • Bin
    • Impute
    • Treat separately from the other observations
  • Feature engineering
    • Transform and normalise: scale values to between 0 and 1
    • Reduce skewness (e.g. with a log transform) for algorithms that assume normally distributed data
    • Create features: the date of a transaction might not be useful, but the day of the week may be
  • Feature selection
    • Best features to use: identify via visualisation or through domain knowledge
    • Significance: use p-values and other metrics to identify the right features. Can also use dimensionality reduction while preserving relationships in the data
  • Test multiple machine learning algorithms and tune their parameters
  • Ensemble methods: combine multiple weak predictors (bagging and boosting)
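
To make the missing-value bullets a bit more concrete for myself, here is a minimal sketch using only the Python standard library. The function names (impute_continuous, impute_categorical, knn_impute), the "missing" label, the choice of k=3 and the toy data are all my own assumptions for illustration, not something taken from the article.

```python
from collections import Counter
from math import dist
from statistics import median


def impute_continuous(values):
    """Replace None in a numeric column with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def impute_categorical(values, missing_label="missing"):
    """Treat a missing category as its own class (label name is hypothetical)."""
    return [missing_label if v is None else v for v in values]


def knn_impute(rows, labels, query, k=3):
    """Guess a missing label by majority vote among the k nearest complete rows."""
    nearest = sorted(zip(rows, labels), key=lambda pair: dist(pair[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Toy data, made up for this example.
ages = [22, None, 35, 41, None, 29]
colours = ["red", None, "blue", "red", None, "green"]
print(impute_continuous(ages))
print(impute_categorical(colours))

# Complete rows (two numeric features each) with known labels,
# plus one row whose label is missing.
rows = [(1.0, 1.0), (1.2, 0.9), (5.0, 5.1), (4.8, 5.3), (0.9, 1.1)]
labels = ["a", "a", "b", "b", "a"]
print(knn_impute(rows, labels, query=(1.1, 1.0)))
```

In practice the features would probably need scaling before measuring distances, otherwise the feature with the largest range dominates the neighbour search.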
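
The outlier options above (delete, bin, impute) could look something like the sketch below. The notes do not say how to detect outliers in the first place, so I am assuming the common 1.5 × IQR rule here; that rule, the function names and the income figures are my own choices for illustration.

```python
from bisect import bisect_right
from statistics import median, quantiles


def iqr_bounds(values, k=1.5):
    """Return (low, high) fences using the k * IQR rule (an assumed detection rule)."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outliers(values):
    """Delete: keep only the values inside the fences."""
    low, high = iqr_bounds(values)
    return [v for v in values if low <= v <= high]


def impute_outliers(values):
    """Impute: replace each outlier with the median of the inliers."""
    low, high = iqr_bounds(values)
    fill = median(v for v in values if low <= v <= high)
    return [v if low <= v <= high else fill for v in values]


def bin_values(values, edges):
    """Bin: map each value to the index of its bin, so extremes share the end bins."""
    return [bisect_right(edges, v) for v in values]


# Toy data, made up for this example: one value is far from the rest.
incomes = [32, 35, 38, 40, 41, 43, 45, 400]
print(drop_outliers(incomes))
print(impute_outliers(incomes))
print(bin_values(incomes, edges=[35, 45]))
```

Which option is appropriate depends on whether the extreme value is a recording error or a genuine observation; the code only shows the mechanics.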
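
For the feature engineering bullets, a rough sketch of the three ideas (scaling to 0-1, a log transform to reduce skew, and deriving the day of the week from a date) is below. I am assuming min-max scaling for the 0-1 step and log1p (log of 1 + x) so that zeros are handled; the function names and sample values are hypothetical.

```python
from datetime import date
from math import log1p

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def min_max_scale(values):
    """Rescale values linearly so the smallest becomes 0 and the largest 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def log_transform(values):
    """Compress a right-skewed, non-negative column with log(1 + x)."""
    return [log1p(v) for v in values]


def day_of_week(d):
    """Derive a new feature from a date: the name of the weekday."""
    return DAY_NAMES[d.weekday()]


# Toy data, made up for this example.
print(min_max_scale([10, 20, 30]))
print(log_transform([0, 9, 99, 999]))
print(day_of_week(date(2024, 1, 1)))
```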
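
The feature selection bullets mention p-values and other metrics. As a simpler stand-in (not a p-value test), the sketch below ranks features by the absolute Pearson correlation with the target. The helper names and the toy columns are my own assumptions.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; assumes neither column is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def rank_features(features, target):
    """features maps name -> column; return names sorted by |correlation| with target."""
    scores = {name: abs(pearson(col, target)) for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy data, made up for this example.
target = [1, 2, 3, 4, 5]
features = {
    "noise": [3, 1, 4, 1, 5],
    "signal": [2, 4, 6, 8, 10],
}
print(rank_features(features, target))
```

Correlation only picks up linear relationships, so a low score does not prove a feature is useless; domain knowledge and visualisation still matter.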
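
Tuning parameters usually means trying several candidate values and keeping the one that scores best on data the model has not seen. Below is a small sketch of that idea, assuming a one-feature k-nearest-neighbour classifier and leave-one-out validation; both of those choices, along with the names and data, are mine rather than the article's.

```python
from collections import Counter


def knn_predict(train, x, k):
    """train is a list of (feature, label) pairs with one numeric feature."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def loo_accuracy(data, k):
    """Leave-one-out accuracy: predict each point from all of the others."""
    hits = 0
    for i, (x, y) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        hits += knn_predict(rest, x, k) == y
    return hits / len(data)


def tune_k(data, candidates):
    """Score every candidate k and return the best one plus all the scores."""
    scores = {k: loo_accuracy(data, k) for k in candidates}
    best = max(scores, key=scores.get)
    return best, scores


# Toy data, made up for this example: labels mostly switch from 0 to 1 as x grows.
data = [(1, 0), (2, 0), (3, 0), (4, 1), (5, 0), (6, 1), (7, 1), (8, 1)]
best_k, scores = tune_k(data, candidates=[1, 3, 5])
print(best_k, scores)
```

The same loop works for comparing different algorithms: swap knn_predict for another predictor and compare the validation scores.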
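
Finally, a sketch of the ensemble idea. This covers bagging only (boosting is not shown): each weak predictor is a one-threshold "decision stump" trained on a bootstrap sample, and the ensemble predicts by majority vote. The stump learner, the 25 models, the fixed seed and the data are all assumptions I picked to keep the example small.

```python
import random
from collections import Counter


def fit_stump(xs, ys):
    """Weak learner: choose the threshold and direction with the fewest training errors."""
    best = None
    for t in sorted(set(xs)):
        for above in (0, 1):
            # Predict `above` when x >= t, and the other class otherwise.
            errors = sum((above if x >= t else 1 - above) != y
                         for x, y in zip(xs, ys))
            if best is None or errors < best[0]:
                best = (errors, t, above)
    _, t, above = best
    return lambda x: above if x >= t else 1 - above


def bagging_fit(xs, ys, n_models=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def bagging_predict(models, x):
    """Combine the weak predictors by majority vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


# Toy data, made up for this example: one noisy label around the class boundary.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]
models = bagging_fit(xs, ys)
print([bagging_predict(models, x) for x in xs])
```

Using an odd number of models avoids tied votes for a two-class problem.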
About the author (Jack Simpson): Computational biology PhD candidate at the Australian National University. I love writing (both articles and software), learning more about the world around us, and beekeeping. I also write for BioSky.co