CS 231A Course Project
Open Project Overview
You need to propose an original research topic or replicate an existing paper. Either option requires the instructor's approval.
Project Ideas and Suggestions
Project Reports of Previous Years
Important Dates
Oct 16 (11:59pm): Finalizing team members : Maximum team size: 2. Send us an email with your team name and team members.
Oct 16 (11:59pm): Proposal submission : Submit a 0.5 page course project proposal in our provided template. Send a PDF file to cs231a2012@gmail.com
Nov 6 (11:59pm): Project milestone : Submit a 2-3 page course project milestone report.
Dec 3 (11:59pm): Code submission : No late days allowed.
Dec 4 (11:59pm): Writeup due : No late days allowed.
Dec 6 (time 1-3 pm): Course project presentation. Location: Packard Atrium
Grading Policy
Final Project: 40%
presentation: 5%
write-up: 10%
clarity, structure, language, references: 3%
background literature survey, good understanding of the problem: 3%
good insights and discussions of methodology, analysis, results, etc.: 4%
technical: 15%
correctness: 5%
depth: 5%
innovation: 5%
evaluation and results: 10%
sound evaluation metric: 3%
thoroughness in analysis and experimentation: 3%
results and performance: 4%
Project Submission Details
Write-up and Code submission: You must use our provided templates. Even if you're sharing this project with another class, we require that you use the write-up template for CS231a, especially focusing on the honor code section as detailed in the template and on the webpage. Email your project proposal, milestone report, final report and zipped code to cs231a2012@gmail.com, with the following format:
Subject Line: Course Project Proposal/Milestone/Report/Code
Body: Full names of all group members, SUNet IDs and Project title
Attachments: Write-up as LastName_LastName_Paper.pdf, Code as LastName_LastName_Code.zip, where the titles have all the last names of the group members.
Final Report Write-up Guidelines
Your final write-up should be between 8 - 10 pages using the template provided. After the class, we will post all the final reports online (restricted to CS231a students only) so that you can read about each other's work. If you do not want your writeup to be posted online, then please let us know at least a week in advance of the final writeup submission deadline. The following is a suggested structure for your report:
Title, Author(s)
Abstract: It should be no more than 300 words.
Introduction: This section introduces your problem and the overall plan for approaching it.
Background/Related Work: This section discusses relevant literature for your project.
Approach: This section details the framework of your project. Be specific: you might want to include equations, figures, plots, etc.
Experiment: This section begins with the kind of experiments you're doing, the dataset(s) you're using, and the way you measure or evaluate your results. It then shows your experimental results in detail, including both quantitative evaluations (numbers, figures, tables, etc.) and qualitative results (images, example results, etc.).
Conclusion: What have you learned? Suggest future ideas.
References: This is absolutely necessary. Reports without references will not receive a score higher than 20 points (total is 40 points).
Supplementary materials: This is NOT counted toward your 8-10 page limit. Please submit your code as supplementary materials.
Poster Presentation
On the presentation day, all open project teams should prepare a 2-by-3-foot poster describing their project. Bring it with you to the Packard Atrium at 12:55 pm, where you will be able to set it up. The instructor and TAs will be visiting each of the posters, so prepare a roughly 2-minute presentation so you can fully describe your project in the limited time available.
Prizes
The instructor and TAs will award two prizes to the best projects. The exact prize is a surprise, but in the past it has been an Amazon gift card.
Project Proposal
We have provided the template for your final write-up. Your proposal should follow the same template, and should be no more than 1 page. Your proposal should describe as clearly as possible the following:
What is the computer vision problem that you will be investigating? Why is it interesting?
What image or video data will you use? If you are collecting new datasets, how do you plan to collect them?
What method or algorithm are you proposing? If there are existing implementations, will you use them and how? How do you plan to improve or modify such implementations?
Which reading will you examine to provide context and background?
How will you evaluate your results? Qualitatively, what kind of results do you expect (e.g. plots or figures)? Quantitatively, what kind of analysis will you use to evaluate and/or compare your results (e.g. what performance metrics or statistical tests)?
Project Milestone
Your project milestone report should be between 2 - 3 pages using the template provided. The following is a suggested structure for your report:
Title, Author(s)
Introduction: This section introduces your problem and the overall plan for approaching it.
Problem statement: Describe your problem precisely, specifying the dataset to be used, expected results, and evaluation.
Technical Approach: Describe the methods you intend to apply to solve the given problem.
Intermediate/Preliminary Results: State and evaluate your results up to the milestone.
Honor Code
You may consult any papers, books, online references, or publicly available implementations (such as SIFT) for ideas and code that you may want to incorporate into your strategy or algorithm, so long as you clearly cite your sources in your code and your writeup. However, under no circumstances may you look at another group’s code or incorporate their code into your project.
If you are doing a similar project for another class, you must make this clear and write down the exact portion of the project that is being counted for CS231A. Pay close attention to this portion of the writeup template. Failure to do so is an honor code violation.
Project Reports of Previous Years
Autumn, 2011-2012
Object Detection Using Segmented Images
Optical Flow For Vision-Aided Navigation
Fully automated trimap generation for image matting with Kinect
Sign Language Recognition with Unsupervised Feature Learning
Tracking Based Semi Supervised Learning using Background Subtraction - Classification (BSC) Model
Semi-supervised learning for adaptive object recognition in RGBz images
Transparent Object Recognition Using Gradient Grids
Interpreting 3D Scenes Using Object-level Constraints
Data-driven Depth Inference from a Single Still Image
Exploring Features for Classification with Accuracy Guarantees
Cinemagraph: Automated Generation (CAG)
Self-Paced Learning for Semisupervised Image Classification
Recognizing Patient Names in Handwritten Clinical Notes in the Absence of Training Data
Hearing Sheet Music: Towards Visual Recognition of Printed Scores
Tracked-base semi-supervised learning on camera-based system
Object group model for scene studying
GPU Accelerated Image Super-Resolution
Real-Time Interactive Airbending
Unsupervised Learning of Text-sensitive Features For Large-scale Scene Classification
Tracking-Based Semi-Supervised Learning using Stationary Video
Photomosaic Mapmaking of the Seafloor using SIFT
Image Segmentation via Total Variation and Hypothesis Testing Methods
Scaling for Multimodal 3D Object Detection
Arrowsmith: Automatic Archery Scorer
ViFaI: A trained video face indexing scheme
Robust Tumbling Target Reconstruction through Fusion of Vision and LIDAR for Autonomous Rendezvous and Docking On-Orbit
Multiple Feature Learning for Action Classification
Enhancing LINE-MOD Object Recognition with Winner Take All and Fast Approximate Nearest Neighbor Search
Video understanding using part based object detection models
Robust Text Reading in Natural Scene Images
Automating Grab-Cut Selection for Single-Object Foreground Images
Large Scale Image Deduplication
Dense Object Detection in Indoor Scenes Using Depth Information
Winter, 2010-2011
3D Model Segmentation and Labeling
Generic Object/Scene recognition for the Smart Album Project
Text Detection and Character Recognition in Scene Images with Unsupervised Feature Learning
Learning Slow Features for Object Recognition
RoboGrader: Scoring Multiple Choice Tests with a Smartphone
Joint Subclassing and Classification
Heuristics for Decision Tree Selection and Weight Assignment in Random Forest for Fine-Grained Image Classification
Hole Filling Method Using Edge Based Interpolated Depth for View Synthesis
RGB-Z Segmentation of Objects in a Cluttered Scene Using a Kinect Sensor
KFace3D: Facial Recognition using RGBD Data
Comparison of Aircraft Tracking Using Top-Down and Bottom-Up Approaches
Geometric Understanding of Indoor Scenes
Object Pose Estimation using Optical Flow and POSIT
Real Time Subcutaneous Vein Recognition of Forearm Veins
Face Detection and Tracking for BabyCam
Image Retrieval, Semantic & Geographic Annotation using visual/multimedia representations and textual information
Smart Album: Face Recognition and Landmark Recognition in Album
Computer-assisted Detection of Defects during the Fabrication of PDMS chips
Unsupervised Learning of Invariances with Temporal Coherence
Image-based Web Page Classification
Winter, 2009-2010
Using a Functionality Model for Chair Detection
Fusing Multi-Channel Cues for Image Organization
Generalizing ImageNet to SmartPhones
Motion-sensitive Low-noise Imaging
Unsupervised Image Segmentation using Deep Belief Nets
The Retinal Algorithm to Detect, Segment and Track Moving Objects with Observer Motion
Unsupervised Feature Learning of Bi-modal Features
Efficient Classification and Segmentation of Specular Objects
Feature Descriptors for Tiny Image Categorization
A feature tracking approach to painted aperture
Baseline Scene Classifications
Camera Tracking with Fixed Point Math for Mobile Devices
Modeling Mutual Context of Object and Human Pose in Human-Object Interaction Activities
Sub-meter Indoor Localization in Unmodified Environments with Inexpensive Sensors
Segmentation of seismic images
Object Detecting in Images using Time Series Ensemble Methods
Learning Visual Invariance in a 2-Layer Neural Network
The "Find Mii" Challenge
Find Mii on Wii
"Find Mii" is a game on Nintendo Wii Play. It basically involves identifying certian avatars (Miis) from a bunch of them, standing still or moving around in various styles. If you are not yet proficient in this game, let Elvis show you how to play it in the videos below.
As we know computers are designed to work in certain areas of human endeavor that are not terribly challenging to human intelligence but sometimes beyond human patience. In this course project you will be programming the computer to play the game as good as Elvis did.
You don't have to worry about any "interface" issue (as some of you asked in class). Actually you will only be dealing with images instead of programming the Wii game controller.
Important Dates
Oct 16 (11:59pm): Finalizing team members : Maximum team size: 2. Send us an email with your team name and team members.
Oct 16 (11:59pm): Proposal submission : Submit a 0.5 page course project proposal in our provided template. Send a PDF file to cs231a2012@gmail.com
Nov 6 (11:59pm): Project milestone : Submit a 2-3 page course project milestone report. In the milestone, you should show some results on face detection or tracking of Miis.
Dec 3 (11:59pm): Code due for evaluation : No late days allowed.
Dec 4 (11:59pm): Writeup due : No late days allowed.
Dec 6 (time 1-3 pm): Course project presentation. Location: Packard Atrium
Project Mission
In the course project, you will be focusing on 4 tasks in "Find Mii" as described below.
Task 1 : Find this Mii! ... You will be given a reference picture of one Mii. Identify that one in a crowd.
Task 2 : Find 2 look-alikes! ... Pay attention to the faces (and hair styles?) as they might be wearing different sweaters.
Task 3 : Find n odd Miis out! ... Some Miis are odd in their styles of shaking heads or footsteps. Find them out.
Task 4 : Find the fastest Mii! ... Someone is running (or swimming) fast. Catch that one.
For each task you have 3 levels: easy, medium, and hard.
For each task & level we will give you a video file from gameplay (12 in total), and you need to identify a given number of Miis.
Minimum requirement: 3 different tasks, 1 level for each.
You will be handling different tasks & levels separately (of course you can share functionality among them; just follow the comments in the infrastructure code). You don't have to do all 12 task-level combinations; the minimum requirement is that you complete at least three different tasks, one level for each. However, you are strongly encouraged to do all 4 tasks at higher levels to earn more points in the final challenge! (See the scoring rules for details.)
Scoring
The performance of your program (your overall score in the challenge) will be calculated according to the following rules.
Score in a given task and level: A correct click on frame 1 is worth 1 point, and each frame thereafter is discounted by a factor of 0.99. You may NOT access (even read into memory) any frame after the one on which you made your final click for a task. The score is averaged over multiple clicks where applicable. For example, if a task requires two clicks, and you return the 1st click at frame 5 and the 2nd click at frame 20 (both correct), your score would be (0.99^4 + 0.99^19)/2 = 0.8934. In this case, you may NOT access any frame after frame 20.
Score in a given task: Your score in a given task is the highest score you achieve across the 3 levels. When comparing scores across levels, scores on level 2 are multiplied by 1.5, and scores on level 3 are multiplied by 2. For example, suppose for task 1 your score is 1.0 on level 1 and 0.77 on level 2, and you did not attempt level 3. Your final score for task 1 would be max(1*1, 0.77*1.5, 0*2) = 1.155.
Overall score in the challenge: Your overall score in the challenge is calculated by summing up your scores in all 4 tasks.
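The discounting and level-multiplier rules above can be sketched in a few lines of Python. This is only an illustration of the arithmetic; the function names are ours and are not part of the provided infrastructure code.

```python
def click_score(click_frames):
    """Score for one task & level: a correct click on frame 1 is
    worth 1 point, discounted by a factor of 0.99 for each later
    frame; the score is averaged over the required clicks."""
    scores = [0.99 ** (frame - 1) for frame in click_frames]
    return sum(scores) / len(scores)

def task_score(level_scores):
    """Score for a task: the best level score after applying the
    level multipliers (1x, 1.5x, 2x for levels 1, 2, 3)."""
    multipliers = (1.0, 1.5, 2.0)
    return max(s * m for s, m in zip(level_scores, multipliers))

# The worked examples from the rules above:
print(f"{click_score([5, 20]):.4f}")          # 0.8934
print(f"{task_score([1.0, 0.77, 0.0]):.4f}")  # 1.1550
```

The overall challenge score is then simply the sum of `task_score` over the 4 tasks.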
Important Notice: Your programs will be scored on videos slightly different from the ones we gave you, so you should make sure that your programs work properly on different gameplays for the same task and difficulty level. In other words, do NOT build into your program any highly biased assumptions that you get from watching the videos yourself, because that's not what computer vision is about. An easy way to check for this is to apply your algorithm to the same video starting from different frames; your method should still work properly.
Your project will be scored by its overall quality, including but not limited to the performance of your program, the quality of your code and writeup, and how innovative you are. Your writeup should include your general ideas of accomplishing the tasks, the high-level structure of your code, interesting experiments you did to validate any part of your method (with beautiful plots), and any other things that you think would interest us.
Grading Policy
Final Project: 40%
write-up: 10%
clarity, structure, language, references: 3%
background literature survey, good understanding of the problem: 3%
good insights and discussions of methodology, analysis, results, etc.: 4%
technical: 10%
correctness: 3%
depth: 3%
innovation: 4%
evaluation and results: 20%
competition score: 20%
Find Mii Project Submission Details
Write-up and Code submission: You must use our provided templates. Email your milestone report, final report, and zipped code to cs231a2012@gmail.com, with the following format:
Subject Line: Course Project Milestone/Report/Code
Body: Full names of all group members, SUNet IDs
Attachments: Write-up as LastName_LastName_Paper.pdf, Code as LastName_LastName_Code.zip, where the titles have all the last names of the group members. All code should be runnable on the corn cluster.
Provided Data and Code
The data and infrastructure code for this project can be found in the AFS path: /afs/ir/class/cs231a/findmii
If you use OpenCV ...
Because we need to compile and run all your programs within 1-2 days for the final competition, please connect your program to our Matlab infrastructure code (by following the comments therein), and make sure that your C++ program compiles and runs correctly on the 'corn' clusters with:
the header file path /afs/ir/class/cs231a/opencv/include
and the lib file path /afs/ir/class/cs231a/opencv/lib
Please follow the example given by 'Makefile' and 'opencvexample.cpp' in the project AFS path. Windows executables or Visual Studio projects will NOT work. We apologize for the inconvenience, but this is the only feasible way to have all your programs automatically scored in a short time.
Hints for Winning the Challenge
1. Take a look at the data and infrastructure code as early as possible.
2. Be as innovative as you can.
Prizes
The top two teams will receive a special prize to be announced on the presentation day.