This week, Barcelona hosts the 30th Annual Conference on Neural Information Processing Systems (NIPS 2016), a machine learning and computational neuroscience conference that includes invited talks, demonstrations and oral and poster presentations. It covers topics ranging from deep learning and computer vision to cognitive science and reinforcement learning.
NIPS is one of the top machine learning and artificial intelligence conferences in the world: #NIPS2016 has brought 6,000 attendees to Barcelona and sold out weeks before its start date. NIPS has become the AI conference for academia and industry alike, growing near-exponentially over the past decade. And yet, incredibly, the local mass media are paying no attention to the event.
Tickets for the main conference, despite nearly doubling in quantity since last year, sold out more than 6 weeks before the event.
Tomorrow, December 9th, we will present our contribution on hierarchical object detection with deep reinforcement learning, a pipeline that locates objects by analyzing just a few regions. You can meet us at the Deep Reinforcement Learning Workshop (Area 1).
We propose a method for performing hierarchical object detection in images guided by a deep reinforcement learning agent. The key idea is to focus on those parts of the image that contain richer information and zoom in on them. We train an intelligent agent that, given an image window, is capable of deciding where to focus the attention among five different predefined region candidates (smaller windows). This procedure is iterated, providing a hierarchical image analysis. We compare two different candidate proposal strategies to guide the object search: with and without overlap.
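The top-down search described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation from the paper: the layout of the five candidates (four corners plus a centre window), the `scale` parameter, and the names `child_windows`, `hierarchical_search` and `choose` are all hypothetical. In particular, `choose` is a placeholder for the trained agent.

```python
from typing import Callable, List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def child_windows(parent: Box, scale: float) -> List[Box]:
    """Return five candidate sub-windows of `parent`.

    Assumed layout (not taken from the source): four corner windows plus
    one centred window, each `scale` times the parent's width and height.
    With scale=0.5 the corner windows tile the parent without overlapping
    each other; a larger scale such as 0.75 makes them overlap.
    """
    x0, y0, x1, y1 = parent
    w, h = (x1 - x0) * scale, (y1 - y0) * scale
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    return [
        (x0, y0, x0 + w, y0 + h),                          # top-left
        (x1 - w, y0, x1, y0 + h),                          # top-right
        (x0, y1 - h, x0 + w, y1),                          # bottom-left
        (x1 - w, y1 - h, x1, y1),                          # bottom-right
        (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2),  # centre
    ]


def hierarchical_search(
    image_box: Box,
    choose: Callable[[Box, List[Box]], Optional[int]],
    scale: float,
    max_depth: int,
) -> List[Box]:
    """Zoom in repeatedly, letting `choose` pick one of five candidates.

    `choose` stands in for the agent: it returns the index of the window
    to zoom into, or None to stop the search at the current window.
    """
    path = [image_box]
    window = image_box
    for _ in range(max_depth):
        candidates = child_windows(window, scale)
        action = choose(window, candidates)
        if action is None:
            break
        window = candidates[action]
        path.append(window)
    return path
```

For example, `hierarchical_search((0, 0, 100, 100), lambda w, c: 0, 0.5, 3)` always zooms into the top-left candidate and visits four windows in total, so only a handful of regions are ever analyzed.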
Moreover, our work compares two different strategies to extract features from a convolutional neural network for each region proposal: a first one that computes new feature maps for each region proposal, and a second one that computes the feature maps for the whole image to later generate crops for each region proposal.
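A rough back-of-the-envelope sketch shows how the two strategies differ in the spatial resolution of the features each region ends up with. The numbers used here (a 224-pixel network input and a total stride of 16) are assumptions chosen for illustration, and the function names are hypothetical; none of them are taken from the paper.

```python
def feature_map_side(input_side: int, stride: int) -> int:
    """Side length of the feature map for a square input of `input_side` pixels."""
    return input_side // stride


def per_region_side(net_input_side: int, stride: int) -> int:
    """Strategy 1: each region is resized to the network input and processed alone.

    Every region gets a full-resolution feature map, whatever its size.
    """
    return feature_map_side(net_input_side, stride)


def shared_map_side(region_side: int, image_side: int,
                    net_input_side: int, stride: int) -> int:
    """Strategy 2: one feature map for the whole image, cropped per region.

    A region only keeps the cells of the shared map that fall inside it,
    so smaller regions are described by fewer cells.
    """
    full = feature_map_side(net_input_side, stride)
    return max(1, full * region_side // image_side)


# Assumed setting: 224x224 image and network input, total stride 16.
for region in (224, 112, 56):
    print(region,
          per_region_side(224, 16),
          shared_map_side(region, 224, 224, 16))
```

Under these assumptions, a region covering half the image side keeps a 14x14 map when processed on its own but only a 7x7 crop of the shared map, and the gap widens with every zoom step.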
Experiments indicate better results for the overlapping candidate proposal strategy and a loss of performance for the cropped image features due to the loss of spatial resolution. We argue that, while this loss seems unavoidable when working with large numbers of object candidates, the much smaller number of region proposals generated by our reinforcement learning agent makes it feasible to extract features for each location without sharing convolutional computation among regions.