COMP423/523: Computer Vision for Autonomous Driving – 2022
Course Information
- Syllabus [pdf]
- TA: Ege Onat Özsüer (eozsuer16[at]
- Office Hours: Monday, 5-6 pm
- Join our Slack channel!
Autonomous driving is a dream shared across various disciplines and actors in the field of AI. In this course, we approach it from the computer vision perspective. We are mainly interested in teaching the car to perceive its environment and act on it, that is, to drive. Given the highly dynamic and largely unconstrained environments encountered in real-life scenarios, self-driving presents several challenges. We will cover the existing approaches to some of the main challenges in self-driving by following the two dominant paradigms: end-to-end learning of driving and the modular approach.
We will start with the end-to-end learning part by covering imitation learning, direct perception, and reinforcement learning. Then, we will continue with planning and prediction, including some examples from industry. We will begin the perception part with 3D perception, including reconstruction, motion estimation, mapping and localization, and sensor fusion. In the second part of perception, we will focus on semantics, including object detection, tracking, and segmentation. Finally, we will finish with a review of datasets and metrics frequently used in autonomous driving.
In this course, students will develop an understanding of the capabilities and the limitations of state-of-the-art computer vision solutions to autonomous driving. In addition, they will be able to implement and train simple models for end-to-end driving, prediction, 3D perception, and the perception of semantics.
There will be an assignment for each part:
- End-to-end driving
- Future prediction
- Perception: 3D and motion
- Perception: Semantics and Tracking
The assignments will contain pen-and-paper questions and programming problems based on the lecture notes and the additional readings (papers) for the corresponding lectures.
Basic programming skills in Python, prior experience with deep learning, and basic math skills including probability, linear algebra, and calculus are required. A keen interest in computer vision and machine learning problems and algorithms for the betterment of society is highly recommended.
Method | Description | Weight (%)
Participation | Participation in discussions or an oral exam at the end of the term | 10
Assignments | 4 assignments, each 10% | 40
Exam | At the end of the term | 20
Project | Proposal + Progress + Final Report & Presentation | 30
- Introduction [pdf]
  - Chapters 1 & 2 of the survey.
- Imitation Learning [pdf]
- Direct Perception [pdf]
- Reinforcement Learning [pdf] [practice]
- Planning [pdf]
  - End-to-end Interpretable Neural Motion Planner, CVPR 2019 [pdf]
    W. Zeng, W. Luo, S. Suo, A. Sadat, B. Yang, S. Casas, and R. Urtasun
  - End-to-End Urban Driving by Imitating a Reinforcement Learning Coach, ICCV 2021 [project]
    Z. Zhang, A. Liniger, D. Dai, F. Yu, and L. Van Gool
  - CVPR 2021 Tutorial: All About Self-Driving [link]
- Prediction [pdf1, pdf2]
  - LaneGCN: Learning Lane Graph Representations for Motion Forecasting, ECCV 2020 [pdf]
    M. Liang, B. Yang, R. Hu, Y. Chen, R. Liao, S. Feng, and R. Urtasun
  - VectorNet: Encoding HD Maps and Agent Dynamics from Vectorized Representation, CVPR 2020 [project]
    J. Gao, C. Sun, H. Zhao, Y. Shen, D. Anguelov, C. Li, and C. Schmid
  - Tutorial on Variational Autoencoders [pdf]
    C. Doersch
  - Stochastic Video Generation with a Learned Prior, ICML 2018 [pdf]
    E. Denton and R. Fergus
- Stereo Matching [pdf]