- We provide skeleton code in Matlab for each project. You can start your implementation from it.
- You may find code online released by the papers' authors. Reading this code is fine, and even encouraged, but directly copying code from existing implementations is an honor code violation. Note that the course instructors will read your code.
- Teams of one or two people are allowed. Each team hands in one copy of the code and write-up. Grading will be fair regardless of team size.
- Your project will be evaluated based on many factors, such as your understanding of the algorithm, performance of your algorithm, quality of your code, your creativity, write-up, and presentation.
- We estimate that each project will take 20-30 hours, depending on your familiarity with the algorithm and how far you want to take the project. The third project may take about 10 hours more.
- You are not expected to implement every detail of the algorithm or to achieve state-of-the-art performance. But if your algorithm works very well, you will get extra credit.
Grading Policy per Project:
- Technical approach and code: 35% (including correctness and completeness of your method, your innovation, etc)
- Experiment evaluation: 35% (including performance and results of your method, thoroughness of your experiments, insights and analysis that you draw from your results)
- Write-up quality: 20% (please read the write-up sample carefully)
- Project presentation: 10% (clarity and content of your project presentation)
We allow seven late days in total across the three projects. Once you have used up these late days, any project turned in late will be penalized 20% per late day.
Project 1: Pedestrian Detection with the Deformable Part Model
Implement a pedestrian detection algorithm using the HOG (Histogram of Oriented Gradients) feature representation and the Deformable Part Model.
- Refer to [Viola & Jones, 2001] for a general idea about object detection systems.
- Refer to [Dalal & Triggs, 2004] for the HOG (Histogram of Oriented Gradients) feature representation.
- Refer to [Felzenszwalb et al, 2008] for the Deformable Part Model for object detection. Felzenszwalb et al (PAMI 2010) is a more detailed version of the paper.
- Refer to [Felzenszwalb et al, 2010] for how to make your algorithm faster. (We encourage you to read this paper to get more ideas about how to improve a detection system. But you are not required to implement this paper.)
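To make the HOG idea concrete, here is a simplified Python/NumPy sketch (not the starter code's fast Matlab implementation). It assumes a grayscale image, 8-pixel cells, 9 unsigned orientation bins, hard binning without interpolation, and L2 normalization over 2x2 blocks of cells. The function names `hog_cells` and `normalize_blocks` are hypothetical, and the papers describe refinements that are omitted here.

```python
import numpy as np

def hog_cells(image, cell=8, bins=9):
    """Per-cell histograms of unsigned gradient orientation, weighted by magnitude."""
    image = np.asarray(image, dtype=np.float64)
    gy, gx = np.gradient(image)                      # derivatives along rows, then columns
    mag = np.hypot(gx, gy)                           # gradient magnitude per pixel
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0     # unsigned orientation in [0, 180]
    # Hard-assign each pixel to one orientation bin (no interpolation in this sketch).
    bin_idx = np.minimum((ang / (180.0 / bins)).astype(int), bins - 1)

    rows, cols = image.shape[0] // cell, image.shape[1] // cell
    hist = np.zeros((rows, cols, bins))
    for r in range(rows):
        for c in range(cols):
            sl = (slice(r * cell, (r + 1) * cell), slice(c * cell, (c + 1) * cell))
            hist[r, c] = np.bincount(bin_idx[sl].ravel(),
                                     weights=mag[sl].ravel(),
                                     minlength=bins)
    return hist

def normalize_blocks(hist, eps=1e-6):
    """L2-normalize overlapping 2x2 blocks of cells and concatenate their histograms."""
    rows, cols, bins = hist.shape
    out = np.zeros((rows - 1, cols - 1, 4 * bins))
    for r in range(rows - 1):
        for c in range(cols - 1):
            block = hist[r:r + 2, c:c + 2].ravel()
            out[r, c] = block / np.sqrt(np.sum(block ** 2) + eps ** 2)
    return out
```

For example, `normalize_blocks(hog_cells(img))` on a 128x64 window gives a 15x7 grid of 36-dimensional block descriptors under these assumptions.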
Dataset and Project Setup:
In this project, we will use the PASCAL VOC 2007 person detection dataset to train and test your program. Here is a brief introduction to the dataset (you only need to look at the "person" category). The performance of your method will be evaluated using the precision-recall curve and average precision (AP). Here is the criterion used to decide whether a specific detection result is correct.
We will provide code to evaluate the performance of your algorithm as part of the starter code. Please follow these steps to start your project:
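As an illustration of the correctness criterion, the standard PASCAL VOC rule counts a detection as correct when the intersection-over-union (IoU) of the predicted and ground-truth boxes exceeds 0.5. The Python sketch below assumes boxes are given as (xmin, ymin, xmax, ymax) in inclusive pixel coordinates; the function names are hypothetical, and the provided evaluation code remains the reference.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax).

    Coordinates are treated as inclusive pixel indices, hence the +1 terms.
    """
    inter_w = min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]) + 1
    inter_h = min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]) + 1
    if inter_w <= 0 or inter_h <= 0:
        return 0.0  # the boxes do not overlap
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0] + 1) * (box_a[3] - box_a[1] + 1)
    area_b = (box_b[2] - box_b[0] + 1) * (box_b[3] - box_b[1] + 1)
    return inter / (area_a + area_b - inter)

def is_correct_detection(detection, ground_truth, threshold=0.5):
    """True when the overlap exceeds the threshold (0.5 is assumed here)."""
    return iou(detection, ground_truth) > threshold
```

Note that in the VOC protocol, a second detection of an already-matched ground-truth box counts as a false positive, which this sketch does not handle.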
- Download the starter code here (7MB). Extract the file.
- Download the complete VOC2007 dataset here (878MB) and extract it into the 'VOCdevkit' folder. We also provide a smaller subset of the VOC dataset for some quick prototyping (DO NOT REPORT ANY RESULT ON THIS SUBSET!).
- The starter code includes a fast HOG feature implementation, plus learning and inference code for the root filter (including SVM training), in Matlab. Run "compile.m" from within Matlab and "make" from your favorite terminal to compile the HOG and SVM code.
- Run "pascal('person',2)" from within Matlab to train and evaluate the detector.
- The starter code should give you an AP of 0.126, which serves as a baseline (the reference implementation has an AP of 0.362).
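For reference, AP values such as those above are computed from the ranked list of detections. The Python sketch below shows the 11-point interpolated AP used by the VOC 2007 protocol, assuming each detection has already been labeled as a true or false positive. The function name and argument layout are hypothetical, and the evaluation code in the starter package is what your reported numbers should come from.

```python
def average_precision_11pt(scores, is_true_positive, num_ground_truth):
    """11-point interpolated average precision.

    scores:            confidence of each detection
    is_true_positive:  matching list of booleans
    num_ground_truth:  number of annotated objects (assumed > 0)
    """
    # Rank detections from most to least confident.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    tp = fp = 0
    recalls, precisions = [], []
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_ground_truth)
        precisions.append(tp / (tp + fp))

    # Average the best precision achievable at recall >= 0.0, 0.1, ..., 1.0.
    ap = 0.0
    for k in range(11):
        t = k / 10.0
        candidates = [p for r, p in zip(recalls, precisions) if r >= t]
        ap += max(candidates) if candidates else 0.0
    return ap / 11.0
```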
What is included in the starter code:
- A fast implementation of HOG.
- Training code for the root filter including the SVM ("Root Filter Initialization" in [Felzenszwalb et al, 2008]).
- Detection code for the root filter.
- Evaluation code for the VOC 2007 dataset.
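As a rough picture of what root filter detection involves, the Python/NumPy sketch below scores every placement of a filter on a single HOG feature map by taking the dot product of the filter weights with the features under it. This is an assumption-laden illustration, not the starter code: the names `root_filter_response` and `detect` are hypothetical, and a full detector would also scan a feature pyramid over scales and apply non-maximum suppression.

```python
import numpy as np

def root_filter_response(features, root_filter, bias=0.0):
    """Score every placement of root_filter on a feature map.

    features:    array of shape (H, W, D), e.g. a grid of HOG descriptors
    root_filter: array of shape (h, w, D) holding the learned weights
    Returns an (H - h + 1, W - w + 1) array of filter scores plus bias.
    """
    H, W, D = features.shape
    h, w, d = root_filter.shape
    assert d == D, "filter and feature map must have the same depth"
    out = np.empty((H - h + 1, W - w + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            window = features[y:y + h, x:x + w]
            out[y, x] = np.sum(window * root_filter) + bias
    return out

def detect(features, root_filter, bias, threshold):
    """Return (row, col, score) for every placement scoring above threshold."""
    resp = root_filter_response(features, root_filter, bias)
    ys, xs = np.nonzero(resp > threshold)
    return [(int(y), int(x), float(resp[y, x])) for y, x in zip(ys, xs)]
```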
What you need to implement:
- The authors of the deformable part model have released their code online: http://people.cs.uchicago.edu/~pff/latent/. You can read the code before you start your own implementation. But the authors' code has many tricks that are not fully covered by their paper, and you do not need to worry if you cannot fully understand it.
- Implement your method based on [Felzenszwalb et al, 2008], but you do not need to implement the mixture model or the dynamic programming for updating deformable parts. You can refer to [Felzenszwalb et al, 2010] for a better understanding of the method, but you do not need to implement the additional details in that paper.
- Directly copying the authors' code without mentioning it in your write-up is an honor code violation. But if you do have trouble implementing a specific function, you may refer to existing code, provided you state this clearly in your write-up. Note that implementing a function yourself, even with poor performance, is preferable to using existing code.
- Some parts of the method may be very time-consuming. You may use MEX files, which let you call C functions from Matlab, to accelerate your algorithm.
- If your machine has limited memory or your code is extremely slow, you do not need to use all training and testing images. You can use a subset of the VOC dataset (with a minimum of 1000 training and 1000 test images). However, we encourage you to use all training (2501) and testing (4952) images in your experiments.
- If you have any questions or trouble, feel free to ask on Piazza or email the course staff at "cs231b-spr1213-staff [at] lists [dot] stanford [dot] edu".
- Although you are essentially implementing an existing algorithm, the project is open-ended and you may try anything you can think of to achieve good performance. You do not need to worry too much if your algorithm does not do a perfect job. But we do encourage you to start early so that you have more time to experiment with your algorithm.
- Your project will not be evaluated based only on the performance of your algorithm. Show us that you have a good understanding of the problem and the algorithm, and try to draw deep insights from your experimental results.
- A short presentation of your work: Tue, Apr 23 (in class)
- Deadline for submitting the code and write-up: Wed, Apr 24 (5pm). Submission instructions will be posted later.
You need to have your implementation and evaluation ready for your presentation (Apr 23). You have an additional day to update your write-up based on the feedback you get in class.
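As a starting point for the part-based scoring in [Felzenszwalb et al, 2008], the Python/NumPy sketch below adds, to a root filter score, each part's best filter response minus a quadratic deformation cost for moving away from its anchor. Since dynamic programming is not required for this project, it searches displacements by brute force. Everything here is an illustrative assumption: the function names, the search radius, and the choice to write the deformation term as a subtracted cost (the papers' sign convention may differ).

```python
import numpy as np

def best_part_placement(part_response, anchor, deform, radius=4):
    """Find the displacement that maximizes response minus deformation cost.

    part_response: 2-D array of part filter scores at every location
    anchor:        (row, col) of the part's ideal position
    deform:        (ax, ay, bx, by); cost = ax*dx + ay*dy + bx*dx^2 + by*dy^2
    Returns (score, dy, dx); score is -inf if no placement is inside the map.
    """
    row0, col0 = anchor
    ax, ay, bx, by = deform
    H, W = part_response.shape
    best = (-np.inf, 0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = row0 + dy, col0 + dx
            if 0 <= y < H and 0 <= x < W:
                cost = ax * dx + ay * dy + bx * dx * dx + by * dy * dy
                score = part_response[y, x] - cost
                if score > best[0]:
                    best = (float(score), dy, dx)
    return best

def hypothesis_score(root_score, part_responses, anchors, deforms, bias=0.0):
    """Root score plus the best deformation-penalized score of every part."""
    total = root_score + bias
    placements = []
    for resp, anchor, deform in zip(part_responses, anchors, deforms):
        score, dy, dx = best_part_placement(resp, anchor, deform)
        total += score
        placements.append((dy, dx))
    return total, placements
```

The brute-force search costs O(radius^2) per part and anchor, which is simple to debug but slow; this is one of the places where a MEX file may help.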
Project 2: Interactive Image Segmentation with GrabCut
Dataset and Experiment:
Project 3: Locality-constrained Linear Coding for Scene and Action Classification
Dataset, Resources, and Experiment: