Schedule

Updated lecture slides will be posted here shortly before each lecture. For ease of reading, we have color-coded the lecture category titles in blue, the discussion sections (and the final project poster session) in yellow, and the midterm exam in red. Note that the schedule is subject to change as the quarter progresses.

Date | Description | Course Materials | Events | Deadlines
Mar 31 Lecture 1: Introduction
Computer vision overview
Course overview
Course logistics
[slides 1] [slides 2]
——— Deep Learning Basics
Apr 02 Lecture 2: Image Classification with Linear Classifiers
The data-driven approach
K-nearest neighbor
Linear Classifiers
Algebraic / Visual / Geometric viewpoints
Softmax loss
[slides]
Image Classification Problem
Linear Classification
Assignment 1 out
Apr 03 Python / Numpy Review Session
[Colab] [Tutorial]
TBD
Apr 07 Lecture 3: Regularization and Optimization
Regularization
Stochastic Gradient Descent
Momentum, AdaGrad, Adam
Learning rate schedules
[slides]
Optimization
Apr 09 Lecture 4: Neural Networks and Backpropagation
Multi-layer Perceptron
Backpropagation
Backprop
Linear backprop example
Suggested Readings:
  1. Why Momentum Really Works
  2. Derivatives notes
  3. Efficient backprop
  4. More backprop references: [1], [2], [3]
Apr 10 Backprop Review Session
TBD
——— Perceiving and Understanding the Visual World
Apr 14 Lecture 5: Image Classification with CNNs
History
Higher-level representations, image features
Convolution and pooling
Convolutional Networks
Apr 16 Lecture 6: CNN Architectures
Batch Normalization
Transfer learning
AlexNet, VGG, ResNet
AlexNet, VGGNet, GoogLeNet, ResNet
Project Proposal out
Assignment 1 due
Apr 17 Final Project Overview and Guidelines
TBD
Apr 21 Lecture 7: Recurrent Neural Networks
RNN, LSTM, GRU
Language modeling
Image captioning
Sequence-to-sequence
Suggested Readings:
  1. DL book RNN chapter
  2. Understanding LSTM Networks
Apr 23 Lecture 8: Attention and Transformers
Self-Attention
Transformers
Suggested Readings:
  1. Attention is All You Need [Original Transformers Paper]
  2. Attention? Attention! [Blog by Lilian Weng]
  3. The Illustrated Transformer [Blog by Jay Alammar]
  4. ViT: Transformers for Image Recognition [Paper] [Blog] [Video]
Assignment 2 out
Project Proposal due
Apr 24 PyTorch Review Session
TBD
Apr 28 Lecture 9: Object Detection, Image Segmentation, Visualizing and Understanding
Single-stage detectors
Two-stage detectors
Semantic/Instance/Panoptic segmentation
Feature visualization and inversion
Adversarial examples
DeepDream and style transfer
Suggested Readings:
  1. FCN, R-CNN, Fast R-CNN, Faster R-CNN, YOLO
  2. DETR: End-to-End Object Detection with Transformers [Paper] [Blog] [Video]
Apr 30 Lecture 10: Video Understanding
Video classification
3D CNNs
Two-stream networks
Multimodal video understanding
May 01 RNNs & Transformers
TBD
May 05 Lecture 11: Large Scale Distributed Training
Utilization, Parallelism, and Activation Checkpointing
——— Generative and Interactive Visual Intelligence
May 07 Lecture 12: Self-supervised Learning
Pretext tasks
Contrastive learning
Multisensory supervision
Suggested Readings:
  1. Lilian Weng Blog Post
  2. DINO: Emerging Properties in Self-Supervised Vision Transformers [Paper] [Blog] [Video]
Assignment 2 due
May 08 Midterm Review Session
TBD
May 12 In-Class Midterm
12:00-1:20pm PT
May 14 Lecture 13: Generative Models 1
Variational Autoencoders
Generative Adversarial Networks
Autoregressive Models
Suggested Readings:
  1. Blog: ELBO — What & Why
Assignment 3 out
May 19 Lecture 14: Generative Models 2
Diffusion models
May 21 Lecture 15: 3D Vision
3D shape representations
Shape reconstruction
Neural implicit representations
May 26 Lecture 16: Vision and Language
May 28 Lecture 17: Robot Learning
Deep Reinforcement Learning
Model Learning
Robotic Manipulation
Assignment 3 due
May 29 Project Milestone Check-Ins due
Jun 02 Lecture 18: Human-Centered AI
Jun 05 Final Report due
Jun 10 Final Project Poster Session