Figure 1. Overview of Object Bank Representation Extraction
Object bank is a novel image representation for high-level visual tasks that encodes the semantic and
spatial information of the objects within an image. In object
bank, an image is represented as a collection of scale-invariant response maps of a large
number of pre-trained generic object detectors. Figure 1 shows the feature extraction process.
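The idea of stacking detector response maps into one feature vector can be illustrated with a small toy sketch. Everything in it is an assumption made for illustration: the hand-written 2x2 kernels stand in for the pre-trained object detectors, rescaling is done by naive subsampling, and the grid max-pooling is just one simple way to retain coarse spatial information. It is not the code shipped in the package.

```python
# Toy sketch of an object-bank-style feature vector (not the released package).

def correlate(image, kernel):
    """Valid-mode 2-D cross-correlation: one response per window position."""
    kh, kw = len(kernel), len(kernel[0])
    rows = len(image) - kh + 1
    cols = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(cols)]
            for r in range(rows)]


def downsample(image, step):
    """Crude rescaling: keep every `step`-th pixel (assumption: no smoothing)."""
    return [row[::step] for row in image[::step]]


def grid_max(response, grid):
    """Max-pool a response map over a grid x grid layout (assumed pooling)."""
    rows, cols = len(response), len(response[0])
    pooled = []
    for gr in range(grid):
        for gc in range(grid):
            r0, r1 = gr * rows // grid, (gr + 1) * rows // grid
            c0, c1 = gc * cols // grid, (gc + 1) * cols // grid
            pooled.append(max(response[r][c]
                              for r in range(r0, r1) for c in range(c0, c1)))
    return pooled


def object_bank_features(image, detectors, steps=(1, 2), grid=2):
    """Concatenate pooled responses of every detector at every scale."""
    features = []
    for kernel in detectors:
        for step in steps:
            response = correlate(downsample(image, step), kernel)
            features.extend(grid_max(response, grid))
    return features


# Hypothetical 8x8 image with a bright 2x2 patch in the top-left corner.
image = [[1 if r < 2 and c < 2 else 0 for c in range(8)] for r in range(8)]
# Hypothetical stand-ins for pre-trained object filters.
detectors = [[[1, 1], [1, 1]], [[1, -1], [1, -1]]]
features = object_bank_features(image, detectors)
print(len(features))
```

The length of the resulting vector is the number of detectors times the number of scales times the number of grid cells, so adding detectors or scales grows the representation linearly.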
Using simple, off-the-shelf classifiers such as linear support
vector machines and logistic regression, we show that this high-level
image representation can be used effectively for high-level visual
tasks such as object and scene image classification, image annotation
and image retrieval. The results surpass reported
state-of-the-art performance on a number of standard benchmarks.
The executable takes an image in any standard image format (e.g., jpg) as input. Using the pre-trained object filters included in the package, it outputs the object bank representation in text format.
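Since the output is plain text, it can be loaded with a few lines of code. The sketch below assumes the file holds numbers separated by spaces or newlines; the exact layout of the real output is not described here, so treat the format, the helper name `load_feature_vector`, and the file name as hypothetical and check them against the package before relying on this.

```python
import os
import tempfile


def load_feature_vector(path):
    """Read whitespace- or newline-separated numbers into a flat list.

    Assumption: the text output contains only numeric tokens.
    """
    with open(path) as handle:
        return [float(token) for token in handle.read().split()]


# Write a made-up sample file so the sketch runs on its own.
tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, "example_features.txt")  # hypothetical name
with open(sample_path, "w") as handle:
    handle.write("0.12 -0.40 1.5\n0.0 2.25 -3.0\n")

vector = load_feature_vector(sample_path)
print(len(vector))
```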
- Source code is implemented in MATLAB and C++ with optimized time complexity.
- Pre-trained object filters are contained in the package, providing high flexibility on object filter selection.
- The response map of each object filter can be chosen as an optional output.
- The system accepts any number of pre-trained object filters; the current list can be found here.
- The MATLAB version takes approximately 7 seconds per image.
- feature extraction source code: C++ and MATLAB (7 seconds per image)
- classification source code: MATLAB
Below are two example benchmark results on MIT-Indoor and UIUC-Event using linear SVM (OB-SVM) and logistic regression (OB-LR).
The extracted object bank features of these two datasets can be downloaded here: MIT-Indoor and UIUC-Event.
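As a rough illustration of how extracted feature vectors can be fed to a simple classifier, here is a minimal binary logistic regression trained with batch gradient descent. It is a sketch under stated assumptions, not the classification code from the package: the three-dimensional feature vectors and labels are made up, there is no regularization, and the benchmarks above are multi-class problems, which would need a one-vs-rest or softmax extension.

```python
import math


def train_logistic_regression(samples, labels, epochs=200, lr=0.5):
    """Fit binary logistic regression with plain batch gradient descent."""
    dim = len(samples[0])
    weights = [0.0] * dim
    bias = 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(samples, labels):
            z = bias + sum(w * v for w, v in zip(weights, x))
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of class 1
            err = p - y
            for k in range(dim):
                grad_w[k] += err * x[k]
            grad_b += err
        # Average the gradient over the batch and take one descent step.
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(weights, bias, x):
    """Return 1 if the linear score is positive, else 0."""
    z = bias + sum(w * v for w, v in zip(weights, x))
    return 1 if z > 0 else 0


# Made-up feature vectors: class 1 responds strongly in dimension 0,
# class 0 responds strongly in dimension 1.
train_x = [[0.9, 0.1, 0.2], [0.8, 0.2, 0.1], [0.1, 0.9, 0.3], [0.2, 0.8, 0.2]]
train_y = [1, 1, 0, 0]
weights, bias = train_logistic_regression(train_x, train_y)
predictions = [predict(weights, bias, x) for x in train_x]
print(predictions)
```

With real object bank features the vectors are far longer, so a regularized solver from an established library would usually be a better choice than this hand-rolled loop.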
 Li-Jia Li*, Hao Su*, Eric P. Xing and Li Fei-Fei. Object Bank: A High-Level Image Representation for Scene Classification and Semantic Feature Sparsification. Proceedings of the Neural Information Processing Systems (NIPS), 2010. (*indicates equal contribution) [PDF]
 Li-Jia Li*, Hao Su*, Yongwhan Lim and Li Fei-Fei. Objects as Attributes for Scene Classification. Proceedings of the 12th European Conference on Computer Vision (ECCV), 1st International Workshop on Parts and Attributes, 2010. (* indicates equal contribution) [PDF]
 A. Oliva and A. Torralba. Modeling the shape of the scene: a holistic representation of the spatial envelope. International Journal of Computer Vision (IJCV), 42, 2001.
 D. Lowe. Object recognition from local scale-invariant features. Proceedings of International Conference on Computer Vision (ICCV), 1999.
 S. Lazebnik, C. Schmid, and J. Ponce. Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2006.
 A. Quattoni and A. Torralba. Recognizing Indoor Scenes. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009.
 Li-Jia Li and Li Fei-Fei. What, where and who? Classifying event by scene and object recognition. IEEE International Conference on Computer Vision (ICCV), 2007.