Stanford University CS231n: Convolutional Neural Networks for Visual Recognition

This network is running live in your browser

The Convolutional Neural Network in this example is classifying images live in your browser, at about 10 milliseconds per image. It takes an input image and transforms it through a series of functions (e.g. convolution, rectification, pooling) into class probabilities at the end. The parameters of these functions are learned with backpropagation on a dataset of (image, label) pairs. In this class, you will learn how to build and train these networks. This particular 17-layer network classifies CIFAR-10 images into one of 10 classes and was trained with ConvNetJS.
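To make the chain of functions concrete, here is a minimal sketch of a forward pass in plain Python. It is not the ConvNetJS network from the demo: it assumes a single-channel 8x8 input, one convolution filter, and random (untrained) weights, and every function name in it is illustrative. It only shows how convolution, rectification, pooling and a final softmax turn an image into 10 class probabilities.

```python
import math
import random


def conv2d(image, kernel):
    """'Valid' 2D convolution (cross-correlation) of a 2D image with a 2D kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def relu(fmap):
    """Rectification: clamp negative activations to zero."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    """Non-overlapping max pooling over size x size windows."""
    out_h, out_w = len(fmap) // size, len(fmap[0]) // size
    return [[max(fmap[i * size + a][j * size + b]
                 for a in range(size) for b in range(size))
             for j in range(out_w)]
            for i in range(out_h)]


def softmax(scores):
    """Turn raw class scores into probabilities (shifted for numerical stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def forward(image, kernel, weights, biases):
    """conv -> ReLU -> max pool -> fully connected layer -> softmax."""
    fmap = max_pool(relu(conv2d(image, kernel)))
    features = [v for row in fmap for v in row]  # flatten the feature map
    scores = [sum(w * x for w, x in zip(w_row, features)) + b
              for w_row, b in zip(weights, biases)]
    return softmax(scores)


# Assumed toy setup: random 8x8 "image" and random, untrained parameters.
rng = random.Random(0)
image = [[rng.random() for _ in range(8)] for _ in range(8)]
kernel = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
# 8x8 input -> 6x6 after the 3x3 convolution -> 3x3 after pooling = 9 features
weights = [[rng.uniform(-1, 1) for _ in range(9)] for _ in range(10)]
biases = [0.0] * 10

probs = forward(image, kernel, weights, biases)
print(probs)  # 10 class probabilities that sum to 1
```

With random weights the probabilities are meaningless; training with backpropagation on (image, label) pairs is what adjusts `kernel`, `weights` and `biases` so that the correct class receives high probability.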

Course Description

Computer Vision has become ubiquitous in our society, with applications in search, image understanding, apps, mapping, medicine, drones, and self-driving cars. Core to many of these applications are the tasks of image classification, localization and detection. This course is a deep dive into the details of neural network architectures, with a focus on learning end-to-end models for these tasks, particularly image classification. During the 10-week course, students will learn to implement, train and debug their own neural networks and gain a detailed understanding of cutting-edge research in computer vision. The final assignment will involve training a multi-million parameter convolutional neural network and applying it to the largest image classification dataset (ImageNet). We will focus on teaching how to set up the problem of image recognition, the learning algorithms (e.g. backpropagation), and practical engineering tricks for training and fine-tuning the networks, and we will guide the students through hands-on assignments and a final course project. Much of the background and materials of this course will be drawn from the ImageNet Challenge.

Course Instructors

Fei-Fei Li

Andrej Karpathy

Course Assistants

TBA

Tentative Syllabus

Module 1: Visual Recognition and Machine Learning

Week 1:

Overview of visual recognition and image understanding, core tasks and data-driven approach

Week 2:

A simple solution: features, SVM/Softmax loss functions, optimization

Week 3:

Intro to neural networks and backpropagation. Overfitting, regularization, numerical gradient checks

Module 2: Convolutional Neural Networks

Week 4:

Convolution, pooling layers

Week 5:

Understanding convolutional neural networks: visualizations, backpropagation to images

Week 6:

Fine-tuning pretrained networks on smaller datasets

Module 3: Building an end-to-end system

Week 7:

Testing and evaluation of image classification, localization and detection. ImageNet Challenge.

Week 8:

Squeezing out the last few percent: hyperparameter optimization, data augmentation, multi-scale approaches, dropout, model averaging

Week 9:

Tour of the most popular neural network libraries (e.g. Caffe). Lectures may feature guest speakers from both academia and industry.

Week 10:

Student groups present and critique their end-to-end image classification systems.

Prerequisites

Proficiency in Python, familiarity with C/C++
College Calculus, Linear Algebra
Equivalent knowledge of CS229 (Machine Learning) and CS131 (Introduction to Computer Vision)

Class Time and Location

TBA

Grading Policy

TBA