Stanford University CS231n: Deep Learning for Computer Vision

Schedule

Lectures are held on Tuesdays and Thursdays from 12:00 to 1:20pm Pacific Time in NVIDIA Auditorium.
Discussion sections are (generally) held on Fridays from 12:30 to 1:20pm Pacific Time in NVIDIA Auditorium. Check Ed for any exceptions.

Updated lecture slides will be posted here shortly before each lecture. For ease of reading, we have color-coded the lecture category titles in blue, discussion sections (and final project poster session) in yellow, and the midterm exam in red. Note that the schedule is subject to change as the quarter progresses.

Date	Description	Course Materials	Events	Deadlines
04/02	Lecture 1: Introduction. Computer vision overview; Course overview; Course logistics. [slides 1] [slides 2]
———	Deep Learning Basics
04/04	Lecture 2: Image Classification with Linear Classifiers. The data-driven approach; K-nearest neighbor; Linear Classifiers; Algebraic / Visual / Geometric viewpoints; SVM and Softmax loss. [slides]	Image Classification Problem; Linear Classification
04/05	Python / Numpy Review Session [Colab] [Tutorial]	12:30-1:20pm PT	Assignment 1 out
04/09	Lecture 3: Regularization and Optimization. Regularization; Stochastic Gradient Descent; Momentum, AdaGrad, Adam; Learning rate schedules. [slides]	Optimization
04/11	Lecture 4: Neural Networks and Backpropagation. Multi-layer Perceptron; Backpropagation. [slides]	Backprop; Linear backprop example. Suggested Readings: Why Momentum Really Works; Derivatives notes; Efficient backprop. More backprop references: [1], [2], [3]
04/12	Backprop Review Session [Colab]	12:30-1:20pm PT
———	Perceiving and Understanding the Visual World
04/16	Lecture 5: Image Classification with CNNs. History; Higher-level representations, image features; Convolution and pooling. [slides]	Convolutional Networks
04/18	Lecture 6: CNN Architectures. Batch Normalization; Transfer learning; AlexNet, VGG, GoogLeNet, ResNet. [slides 1] [slides 2] [review]	AlexNet, VGGNet, GoogLeNet, ResNet
04/19	Final Project Overview and Guidelines	12:30-1:20pm PT	Assignment 2 out	Assignment 1 due
04/22				Project proposal due
04/23	Lecture 7: Recurrent Neural Networks. RNN, LSTM, GRU; Language modeling; Image captioning; Sequence-to-sequence. [slides]	Suggested Readings: DL book RNN chapter; Understanding LSTM Networks
04/25	Lecture 8: Attention and Transformers. Self-Attention; Transformers. [slides]	Suggested Readings: Attention is All You Need [Original Transformers Paper]; Attention? Attention [Blog by Lilian Weng]; The Illustrated Transformer [Blog by Jay Alammar]; ViT: Transformers for Image Recognition [Paper] [Blog] [Video]
04/26	PyTorch Review Session [Colab]	12:30-1:20pm PT
04/30	Lecture 9: Object Detection and Image Segmentation. Single-stage detectors; Two-stage detectors; Semantic/Instance/Panoptic segmentation. [slides]	FCN, R-CNN, Fast R-CNN, Faster R-CNN, YOLO; DETR: End-to-End Object Detection with Transformers [Paper] [Blog] [Video]
05/02	Lecture 10: Video Understanding. Video classification; 3D CNNs; Two-stream networks; Multimodal video understanding. [slides]
05/03	Midterm Review Session	12:30-1:20pm PT
05/06				Assignment 2 due
05/07	Lecture 11: Visualizing and Understanding. Feature visualization and inversion; Adversarial examples; DeepDream and style transfer. [slides]
05/09	In-Class Midterm	12:00-1:20pm
05/10	RNNs & Transformers [Colab]	12:30-1:20pm PT
———	Generative and Interactive Visual Intelligence
05/14	Lecture 12: Self-supervised Learning. Pretext tasks; Contrastive learning; Multisensory supervision. [slides]	Suggested Readings: Lilian Weng Blog Post; DINO: Emerging Properties in Self-Supervised Vision Transformers [Paper] [Blog] [Video]	Assignment 3 out	Project milestone due (Update: deadline moved to 5/17)
05/16	Lecture 13: Generative Models. Generative Adversarial Network; Diffusion models; Autoregressive models. [slides]
05/21	Lecture 14: OpenAI Sora. Guest Lecture by William (Bill) Peebles and Tim Brooks
05/23	Lecture 15: Robot Learning. Deep Reinforcement Learning; Model Learning; Robotic Manipulation. [slides]
05/28	Lecture 16: Human-Centered Artificial Intelligence			Assignment 3 due
05/30	Lecture 17: Guest Lecture by Prof. Serena Yeung-Levy
06/04	Lecture 18: 3D Vision. 3D shape representations; Shape reconstruction; Neural implicit representations. [slides]
06/05				Project final report due
06/12	Final Project Poster Session. Time: 12:00 PM - 4:30 PM; Location: AT&T Patio (Gates Building First Floor); Session A Check-in: 12:00 PM - 12:15 PM; Session A: 12:15 PM - 2:15 PM; Session B Check-in: 2:15 PM - 2:30 PM; Session B: 2:30 PM - 4:30 PM


Stanford - Spring 2024
