Deep Learning PyData Talk

by Jack Simpson January 7, 2017

Deep learning is a type of machine learning based on neural networks, which were inspired by neurons in the brain. The difference between a deep neural network and a normal neural network is the number of ‘hidden layers’ between the input and output layers.

I recently watched an excellent presentation on Deep Learning by Roelof Pieters titled ‘Python for image and text understanding; One model to rule them all!’ I recommend watching it, and I’ve written this post to record a few of my own bullet points from the talk for future reference.

Roelof had a five-point process for training a deep neural network:

Preprocess the Data

Try to do as little as possible: the more transformations you apply, the less you allow the network to come up with its own representations of the data. The rawer the data, the better
Mean subtraction normalisation
Divide by standard deviation
If your data is noisy you may want to do some PCA and whitening to reduce the dimensions
Compute the statistics on the training data only, but apply them to all (training and test) data
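
As a minimal sketch of the mean subtraction and standard deviation steps above (my own illustration, not code from the talk), the snippet below computes the statistics on the training rows only and then reuses them for the test rows. The function names `fit_standardiser` and `standardise` and the toy data are assumptions made for the example.

```python
import statistics


def fit_standardiser(train_rows):
    """Compute the per-feature mean and standard deviation from training rows only."""
    columns = list(zip(*train_rows))
    means = [statistics.fmean(col) for col in columns]
    # Population standard deviation; fall back to 1.0 for constant
    # features so we never divide by zero.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return means, stds


def standardise(rows, means, stds):
    """Subtract the mean and divide by the standard deviation, feature by feature."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


# Toy data: two features per row.
train = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
test = [[4.0, 40.0]]

# The statistics come from the training data only...
means, stds = fit_standardiser(train)

# ...but are applied to both the training and the test data.
train_scaled = standardise(train, means, stds)
test_scaled = standardise(test, means, stds)
```

Reusing the training statistics keeps information about the test set from leaking into the preprocessing.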

Choose the architecture

Three choices:
- Deep Belief Network (DBN): Series of restricted Boltzmann machines (RBM). Useful for hierarchical data like medical or audio datasets
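- Convolutional Net (CNN): Convolutional layers slide small filters over crops of the image, multiplying and summing the values under each filter. Useful for images.
- Recurrent Net (RNN): Carries a hidden state from one step of a sequence to the next, loosely comparable to a Hidden Markov Model. Useful for natural language processing.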
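
To make the convolutional idea concrete, here is a rough sketch of a single filter being slid over a tiny image. It is my own illustration rather than anything shown in the talk: the `conv2d_valid` helper, the toy image and the hand-picked `vertical_edge` filter are all assumptions. In a real network the filter values are learned, and strictly speaking this is a cross-correlation, which is what deep learning libraries usually call a convolution.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over every position where it fits fully inside the image."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Multiply the crop under the filter by the filter weights and sum.
            total = 0.0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


# Toy image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-picked filter that responds to a dark-to-bright vertical edge.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d_valid(image, vertical_edge)
# feature_map is [[0.0, 2.0, 0.0], [0.0, 2.0, 0.0]]: the filter only fires on the edge.
```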

Train

Assign the layer definitions and layer parameters, the learning rate, etc.
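
To show where a parameter like the learning rate enters training, here is a minimal sketch of gradient descent. It assumes a single linear unit instead of a deep network, and the names `train_linear`, `learning_rate` and `epochs` and their default values are my own choices for illustration, not settings from the talk.

```python
def train_linear(xs, ys, learning_rate=0.05, epochs=500):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    losses = []
    for _ in range(epochs):
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Record the loss so the loss curve can be inspected afterwards.
        losses.append(sum(e * e for e in errors) / n)
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        # The learning rate scales how far each update moves the parameters.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b, losses


# Toy data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

w, b, losses = train_linear(xs, ys)
```

The `losses` list is the kind of per-epoch record you would plot as a loss curve in the next step.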

Optimise/Regularise

Move back and forth between the Optimise/Regularise step and the training step as the model improves
Visualise loss curve – lasagne comes with functions to achieve this
Visualise accuracy
Can visualise weights: with images you want to see edges in the first layer
Can optimise hyperparameters
- Grid Search (Won’t work for millions of parameters)
- Random search (Takes a long time)
- Bayesian optimisation (seems to work the best, spearmint and hypergrad libraries available)
Data augmentation: With images you can scale, rotate, adjust the contrast, or flip
Dropout: randomly switch off nodes during training, so the network has to adapt and cannot rely on any single node
Batch normalisation
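
Two of the ideas above, random search and dropout, are easy to sketch in a few lines. This is my own illustration under stated assumptions: `toy_validation_loss` is a made-up stand-in for training and evaluating a real network, the search ranges are arbitrary, and the dropout shown is the common "inverted" variant that rescales the surviving activations.

```python
import random


def random_search(objective, space, n_trials=50, seed=0):
    """Sample hyperparameters uniformly at random and keep the lowest-scoring set."""
    rng = random.Random(seed)
    best_params, best_score = None, float("inf")
    for _ in range(n_trials):
        params = {name: rng.uniform(low, high) for name, (low, high) in space.items()}
        score = objective(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


def dropout(activations, drop_prob, rng):
    """Inverted dropout: zero each activation with probability drop_prob
    and rescale the survivors so the expected value stays the same."""
    keep_prob = 1.0 - drop_prob
    return [a / keep_prob if rng.random() < keep_prob else 0.0 for a in activations]


# Hypothetical validation loss; a real one would train and evaluate a network.
def toy_validation_loss(params):
    return (params["learning_rate"] - 0.01) ** 2 + (params["dropout"] - 0.5) ** 2


# Illustrative search ranges (assumptions, not recommendations).
space = {"learning_rate": (0.0001, 0.1), "dropout": (0.0, 0.9)}
best_params, best_score = random_search(toy_validation_loss, space)

# With drop_prob=0.5 every activation of 1.0 becomes either 0.0 or 2.0.
dropped = dropout([1.0] * 10, drop_prob=0.5, rng=random.Random(0))
```

Dropout is only applied during training; at prediction time all nodes stay switched on.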

Tips/Tricks

Ensembles: Train multiple models and let them vote on the prediction, or take the average of their outputs for continuous data. Ensure the classifiers are not correlated.
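
A minimal sketch of the two ways of combining an ensemble, assuming each model has already produced one prediction per sample. The helper names `majority_vote` and `average_predictions` and the toy predictions are my own, not from the talk.

```python
from collections import Counter
from statistics import fmean


def majority_vote(predictions):
    """predictions holds one list of labels per model; return one label per sample."""
    # zip(*predictions) regroups the labels so each tuple holds every model's
    # vote for one sample. On a tie, the label seen first wins.
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def average_predictions(predictions):
    """predictions holds one list of numbers per model; return the per-sample mean."""
    return [fmean(values) for values in zip(*predictions)]


# Three classifiers, three samples each.
class_preds = [
    ["cat", "dog", "dog"],
    ["cat", "cat", "dog"],
    ["dog", "cat", "dog"],
]
voted = majority_vote(class_preds)

# Three regressors, two samples each.
reg_preds = [
    [1.0, 2.0],
    [2.0, 4.0],
    [3.0, 6.0],
]
averaged = average_predictions(reg_preds)
```

Voting only helps when the models make different mistakes, which is why the classifiers should not be correlated.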
