Research Blog

How to Train Your ResNet

September 24, 2018

The introduction to a series of posts investigating how to train Residual networks efficiently on the CIFAR10 image classification dataset. By the fourth post, we can train to the 94% accuracy threshold of the DAWNBench competition in 79 seconds on a single V100 GPU.

How to Train Your ResNet 1: Baseline

September 24, 2018

We establish a baseline for training a Residual network to 94% test accuracy on CIFAR10, which takes 297s on a single V100 GPU.

How to Train Your ResNet 2: Mini-batches

September 24, 2018

We investigate the effects of mini-batch size on training and use larger batches to reduce training time to 256s.

How to Train Your ResNet 3: Regularisation

September 24, 2018

We identify a performance bottleneck and add regularisation to reduce the training time further to 154s.

How to Train Your ResNet 4: Architecture

September 24, 2018

We search for more efficient network architectures and find a 9-layer network that trains in 79s.

How to Train Your ResNet 5: Hyperparameters

November 28, 2018

We develop some heuristics for hyperparameter tuning.


September 20, 2018

Are GPUs a good target for speech synthesis? Is Baidu's GPU implementation of WaveNet the best you can do on a GPU? We run some tests, discuss latency, and find out.