Deep Learning on GPU Servers

I read with interest the recent paper out of Baidu about scaling up image recognition. In it they talk about creating a supercomputer to carry out the learning phase of training a deep convolutional network. Training such things is terribly slow, with their typical example taking 212 hours on a single GPU machine because of the enormous number of weight computations that need to be evaluated and the slow stochastic gradient process over large training sets.

Baidu has built a dedicated machine with 36 servers connected by an InfiniBand switch, each server with four GPUs. In the paper they describe different ways of partitioning the problem to run on this machine. They end up being able to train the model using 32 GPUs in 8.6 hours.
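To put those numbers in perspective, here is a quick back-of-the-envelope calculation of the speedup and parallel efficiency they imply. It only uses the figures quoted above (212 hours on one GPU, 8.6 hours on 32 GPUs); the variable names are my own, and it assumes the two runs are directly comparable, which may not hold exactly.

```python
# Figures as quoted above from the Baidu paper.
single_gpu_hours = 212.0   # training time on a single GPU machine
multi_gpu_hours = 8.6      # training time on the cluster
num_gpus = 32              # GPUs used for the 8.6 hour run

# Speedup: how many times faster the multi-GPU run is.
speedup = single_gpu_hours / multi_gpu_hours

# Parallel efficiency: fraction of ideal linear scaling achieved.
# Assumes both runs train the same model to the same point.
efficiency = speedup / num_gpus

print(f"speedup: {speedup:.1f}x")
print(f"parallel efficiency: {efficiency:.0%}")
```

So the run is roughly 25 times faster on 32 GPUs, or a bit under 80% of perfect linear scaling, with the rest presumably lost to communication and synchronization between GPUs.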

One thing that is good to see is that GPU machines can really accelerate learning on these networks. They say that three servers, each with 4 GPUs, have performance equivalent to the 16,000 CPU cores used in the Google Brain project.

NVIDIA now has GPU cards that contain 5,760 CUDA cores, and claims that only three are needed to beat the Google Brain supercomputer. They already support some tools for machine learning such as cuda-convnet.

This is good because one of the barriers to getting into this area of machine learning for new businesses is being able to actually train the models without needing to spend vast sums of money on hardware or cloud compute.

Recently I have been doing some work using OpenCL for this kind of thing.

For people who want to get going with deep learning and immediately make use of GPU-based training, I would recommend trying out the Caffe learning framework out of UC Berkeley.

For people interested in the kind of features that modern deep image recognition systems can learn once you get through the hours of training, check out the work by Matt Zeiler and Rob Fergus on convolutional networks.

For more detailed theoretical understanding of what unsupervised networks learn, and a fantastic introduction to image statistics, I wholeheartedly recommend the book by Aapo Hyvarinen, Jarmo Hurri, and Patrik O. Hoyer.

Twitter: @robotbugs