A framework for visual-context-aware object detection in still images
Computer Vision and Image Understanding, Special Issue on Multi-Camera and Multi-Modal Sensor Fusion, 2010
Visual context provides cues about an object's presence, position and size within the observed scene, which should be used to increase the performance of object detection techniques. However, in computer vision, object detectors typically ignore this information. We therefore present a framework for visual-context-aware object detection. Methods for extracting visual contextual information from still images are proposed, which are then used to calculate a prior for object detection. The concept is based on a sparse coding of contextual features, which are based on geometry and texture. In addition, bottom-up saliency and object co-occurrences are exploited, to define auxiliary visual context. To integrate the individual contextual cues with a local appearance-based object detector, a fully probabilistic framework is established. In contrast to other methods, our integration is based on modeling the underlying conditional probabilities between the different cues, which is done via kernel density estimation. This integration is a crucial part of the framework which is demonstrated within the detailed evaluation. Our method is evaluated using a novel demanding image data set and compared to a state-of-the-art method for context-aware object detection. An in-depth analysis is given discussing the contributions of the individual contextual cues and the limitations of visual context for object detection.