What is Deep Learning?
We can consider Deep Learning as a new area of Machine Learning research whose objective is to move Machine Learning closer to Artificial Intelligence, one of its original goals. Our research group has been working in Machine Learning for a long time, thanks to Ricard Gavaldà, who introduced us to this wonderful world. It was during the summer of 2006, together with Toni Moreno, Josep Ll. Berral and Nico Poggi. Unforgettable moments! Now, eight years later, we are taking a step forward and starting to work with Deep Learning. It was during a group retreat held last September that I realised, thanks to Jordi Nin, that Deep Learning was an interesting topic.
Deep Learning comes from neural networks, conceived in the 1940s and inspired by the synaptic structure of the human brain. But early neural networks could simulate only a very limited number of neurons at once, so they could not recognise patterns of great complexity. Neural networks had a resurgence in the 1980s, when researchers sparked a revival of interest with new algorithms, but complex speech or image recognition still required more computing power than was then available.
In the last decade researchers made some fundamental conceptual breakthroughs, but until a few years ago computers were not fast or powerful enough to process the enormous collections of data that these kinds of algorithms require. Right now, companies like Google, Facebook, Baidu, Yahoo and Microsoft are using Deep Learning to better match products with consumers by building more effective recommendation engines.
Deep Learning attempts to mimic the activity in the layers of neurons of the neocortex in software. The software creates a set of virtual neurons and then assigns random weights to the connections between them. These weights determine how each simulated neuron responds to a digitised feature. The system is trained by blitzing it with digitised versions of images containing the objects of interest. Importantly, the system can do all of this without a human providing labels for the objects (as is often the case with traditional Machine Learning tools). Whenever the system fails to accurately recognise a particular pattern, an algorithm automatically adjusts the weights of the neurons.
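The adjust-the-weights idea can be sketched with a single simulated neuron. This is a hypothetical toy example, not anything from the post: the learning rate of 0.1, the hard-threshold activation and the AND pattern are my own choices. The neuron starts with random weights and, each time it misclassifies a pattern, a simple rule nudges the weights towards the right answer.

```python
import random

# Toy sketch (my own example): one simulated neuron with random initial
# weights, trained to respond to a simple digitised pattern (logical AND).
random.seed(0)
weights = [random.uniform(-1, 1) for _ in range(2)]
bias = random.uniform(-1, 1)

def neuron(x):
    # Weighted sum of the inputs followed by a hard threshold.
    s = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if s > 0 else 0

# Digitised training examples: inputs and the response we want.
data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]

# If the neuron misclassifies a pattern, adjust its weights slightly
# in the direction that reduces the error (the classic perceptron rule).
for _ in range(20):
    for x, target in data:
        error = target - neuron(x)
        for i in range(len(weights)):
            weights[i] += 0.1 * error * x[i]
        bias += 0.1 * error

print([neuron(x) for x, _ in data])  # → [0, 0, 0, 1]
```

Real deep networks use many neurons, smooth activations and gradient-based updates, but the principle is the same: wrong answers trigger small automatic corrections to the weights.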
The first layer of neurons learns primitive features, like an edge in an image. It does this by finding combinations of digitised pixels that occur together more often than they would by chance. Once that layer accurately recognises those features, its outputs are fed to the next layer, which trains itself to recognise more complex features, like a corner. The process is repeated in successive layers until the system can reliably recognise objects or phonemes. An interesting paper that Jordi Nin sent me comes from Google, which used a neural network with a billion connections. The authors consider the problem of building high-level, class-specific feature detectors from unlabelled data alone, training a 9-layer network of virtual neurons (the model has one billion connections) on a dataset of 10 million images. Training the many layers of virtual neurons in the experiment required 16,000 computer cores! Is it clear now why our research group is entering this amazing world?
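The edges-then-corners idea can be illustrated with a hand-wired sketch. To be clear, this is my own invented example: in a real deep network these weights are learned, not set by hand, and the 2x2 patch and detector names are hypothetical. Layer 1 detects primitive edges in a tiny binary patch; layer 2 combines layer-1 outputs into a more complex corner detector.

```python
def step(x):
    # Hard threshold, standing in for a neuron's activation.
    return 1 if x > 0 else 0

def detect_corner(image):
    # image is a 2x2 binary patch: [[a, b], [c, d]].
    (a, b), (c, d) = image

    # Layer 1: primitive feature detectors.
    top_edge = step(a + b - 1.5)   # fires when the whole top row is on
    left_edge = step(a + c - 1.5)  # fires when the whole left column is on

    # Layer 2: combines layer-1 outputs into a more complex feature:
    # a top-left corner = top edge AND left edge AND bottom-right pixel off.
    return step(top_edge + left_edge - d - 1.5)

print(detect_corner([[1, 1], [1, 0]]))  # corner → 1
print(detect_corner([[1, 1], [0, 0]]))  # just a horizontal edge → 0
```

Each layer only ever looks at the outputs of the layer below it, which is exactly why stacking layers lets the system build up from pixels to edges to corners to whole objects.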
(*) Picture from Andrew Ng (Stanford)