Object Bank

Figure 1. Overview of Object Bank Representation Extraction

Overview

The Object Bank representation is a novel image representation for high-level visual tasks that encodes semantic and spatial information about the objects within an image. In Object Bank, an image is represented as a collection of scale-invariant response maps from a large number of pre-trained generic object detectors. Figure 1 shows the feature extraction process. Using simple, off-the-shelf classifiers such as linear support vector machines and logistic regression, we show that this high-level image representation can be used effectively for high-level visual tasks such as object and scene image classification, image annotation, and image retrieval. The results surpass reported state-of-the-art performance on a number of standard benchmark datasets.
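As a rough illustration of how a collection of response maps can be turned into a single feature vector, the sketch below max-pools each map over a coarse-to-fine spatial grid and concatenates the results. The pooling rule, the grid levels, and the names `max_pool_grid` and `object_bank_vector` are assumptions made for this example; the overview above does not specify them, and the released package may work differently.

```python
from typing import List, Sequence

# One detector's response at one image scale, stored row-major.
ResponseMap = List[List[float]]


def max_pool_grid(response: ResponseMap, cells: int) -> List[float]:
    """Max-pool a 2-D response map over a cells x cells grid (assumed pooling rule)."""
    rows, cols = len(response), len(response[0])
    pooled = []
    for gy in range(cells):
        # Integer cell boundaries; each cell is non-empty when cells <= rows and cols.
        y0, y1 = gy * rows // cells, (gy + 1) * rows // cells
        for gx in range(cells):
            x0, x1 = gx * cols // cells, (gx + 1) * cols // cells
            pooled.append(
                max(response[y][x] for y in range(y0, y1) for x in range(x0, x1))
            )
    return pooled


def object_bank_vector(
    maps_per_detector: Sequence[Sequence[ResponseMap]],
    grid_levels: Sequence[int] = (1, 2),  # hypothetical choice of grid levels
) -> List[float]:
    """Concatenate pooled responses over detectors, scales, and grid levels."""
    feature: List[float] = []
    for scales in maps_per_detector:      # one entry per object detector
        for response in scales:           # one entry per image scale
            for cells in grid_levels:     # coarse-to-fine spatial grids
                feature.extend(max_pool_grid(response, cells))
    return feature
```

With two detectors, one scale each, and grid levels (1, 2), every response map contributes 1 + 4 pooled values, so the resulting vector has 10 entries.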

Software

The executable takes an image in any standard image format (e.g., jpg). Using the pre-trained object filters included in the package, it outputs the Object Bank representation in text format.
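The text output can be loaded for further processing with a few lines of code. The sketch below assumes the file holds whitespace-separated floating-point values, one feature vector per file; this layout is an assumption, so check it against a real output file from the package. The function name `load_feature_vector` is hypothetical.

```python
from pathlib import Path
from typing import List


def load_feature_vector(path: str) -> List[float]:
    """Read one feature vector from a text file.

    Assumes whitespace-separated numbers (spaces or newlines); the actual
    layout written by the package is not documented on this page.
    """
    text = Path(path).read_text()
    return [float(token) for token in text.split()]
```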

Highlights

Source code is implemented in MATLAB and C++ with optimized time complexity.
Pre-trained object filters are contained in the package, providing high flexibility on object filter selection.
The response map of each object filter is available as an optional output.
The system accepts any number of pre-trained object filters; the current list can be found here.
The MATLAB version takes approximately 7 seconds per image.

Downloads

Feature extraction source code: C++ and MATLAB (7 seconds per image)
Classification source code: MATLAB

Benchmark

Below are two example benchmark results on MIT-Indoor and UIUC-Event using a linear SVM (OB-SVM) and logistic regression (OB-LR). The extracted Object Bank features of these two datasets can be downloaded here: MIT-Indoor and UIUC-Event.
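To make the classification step concrete, the sketch below trains a plain binary logistic regression on feature vectors with batch gradient descent. It is not the released MATLAB classification code: it uses no regularization, handles only two classes, and the function names, learning rate, and epoch count are illustrative assumptions.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic_regression(
    X: Sequence[Sequence[float]],
    y: Sequence[int],
    lr: float = 0.5,       # illustrative step size
    epochs: int = 2000,    # illustrative number of passes
) -> Tuple[List[float], float]:
    """Fit weights w and bias b by batch gradient descent on the mean log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this sample: p(y=1 | x) - y.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w: Sequence[float], b: float, x: Sequence[float]) -> int:
    """Return 1 if the linear score is non-negative, else 0."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else 0
```

For multi-class datasets such as scene categories, one common option is to train one such classifier per class (one-vs-rest) and pick the class with the highest score.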

References

[1] Li-Jia Li*, Hao Su*, Eric P. Xing and Li Fei-Fei. Object Bank: A High-Level Image Representation for Scene Classification and Semantic Feature Sparsification. Proceedings of the Neural Information Processing Systems (NIPS), 2010. (*indicates equal contribution) [PDF]
[2] Li-Jia Li*, Hao Su*, Yongwhan Lim and Li Fei-Fei. Objects as Attributes for Scene Classification. Proceedings of the 12th European Conference on Computer Vision (ECCV), 1st International Workshop on Parts and Attributes, 2010. (*indicates equal contribution) [PDF]
[3] A. Oliva and A. Torralba. Modeling the shape of the scene: a holistic representation of the spatial envelope. International Journal of Computer Vision (IJCV), 42, 2001.
[4] D. Lowe. Object recognition from local scale-invariant features. Proceedings of International Conference on Computer Vision (ICCV), 1999.
[5] S. Lazebnik, C. Schmid, and J. Ponce. Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2006.
[6] A. Quattoni and A. Torralba. Recognizing Indoor Scenes. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009.
[7] Li-Jia Li and Li Fei-Fei. What, where and who? Classifying event by scene and object recognition. IEEE International Conference on Computer Vision (ICCV), 2007.

Li-Jia Li, Hao Su, Yongwhan Lim, Robert Cosgriff, Daniel Goodwin, and Li Fei-Fei

Vision Lab, Stanford University
