COMP423/523: Computer Vision for Autonomous Driving – 2022

Course Information

  • Syllabus [pdf]
  • TA: Ege Onat Özsüer (eozsuer16[at]
  • Office Hours: Monday, 5-6 pm
  • Join our Slack channel!

Autonomous driving is a dream shared across various disciplines and actors in the field of AI. In this course, we approach it from the computer vision perspective. We are mainly interested in teaching the car to perceive its environment and act on it, that is, to drive. Given the highly dynamic and largely unconstrained environments encountered in real-life scenarios, self-driving presents several challenges. We will cover existing approaches to some of the main challenges by following the two dominant paradigms: end-to-end learning of driving and the modular approach.

We will start with end-to-end learning, covering imitation learning, direct perception, and reinforcement learning. We will then continue with planning and prediction, including some examples from industry. The perception part begins with 3D perception, including reconstruction, motion estimation, mapping and localization, and sensor fusion. In the second part of perception, we will focus on semantics, including object detection, tracking, and segmentation. Finally, we will finish with a review of the datasets and metrics frequently used in autonomous driving.

In this course, students will develop an understanding of the capabilities and the limitations of state-of-the-art computer vision solutions to autonomous driving. In addition, they will be able to implement and train simple models for end-to-end driving, prediction, 3D perception, and the perception of semantics.


There will be an assignment for each part:

  1. End-to-end driving
  2. Future prediction
  3. Perception: 3D and motion
  4. Perception: Semantics and Tracking

The assignments will contain pen-and-paper questions and programming problems based on the lecture notes and the additional readings (papers) for the corresponding lectures.


Basic programming skills in Python, prior experience with deep learning, and basic math skills including probability, linear algebra, and calculus are required. A keen interest in computer vision and machine learning problems and algorithms for the betterment of society is highly recommended.




Component      Description                                                           Weight (%)

Participation  Participation in discussions or an oral exam at the end of the term
Assignments    4 assignments, each 10%                                               40
Exam           At the end of the term                                                20
Project        Proposal + Progress + Final Report & Presentation                     30


  • Introduction [pdf]
    • Chapter 1 & 2 of the survey.
  • Imitation Learning [pdf]
    • Chapter 15 of the survey.
    • Talk on “Learning Robust Driving Policies” by A. Geiger.
    • CARLA Autonomous Driving Challenge 2020: Talk
  • Direct Perception [pdf]
  • Reinforcement Learning [pdf] [practice]
  • Planning [pdf]
    • End-to-end Interpretable Neural Motion Planner, CVPR 2019 [pdf]
      W. Zeng, W. Luo, S. Suo, A. Sadat, B. Yang, S. Casas, and R. Urtasun
    • End-to-End Urban Driving by Imitating a Reinforcement Learning Coach, ICCV 2021 [project]
      Z. Zhang, A. Liniger, D. Dai, F. Yu, and L. Van Gool
    • CVPR 2021 Tutorial: All About Self-Driving [link]
  • Prediction [pdf1, pdf2]
    • LaneGCN: Learning Lane Graph Representations for Motion Forecasting, ECCV 2020 [pdf]
      M. Liang, B. Yang, R. Hu, Y. Chen, R. Liao, S. Feng, and R. Urtasun
    • VectorNet: Encoding HD Maps and Agent Dynamics from Vectorized Representation, CVPR 2020 [project]
      J. Gao, C. Sun, H. Zhao, Y. Shen, D. Anguelov, C. Li, and C. Schmid
    • Tutorial on Variational Autoencoders [pdf]
      C. Doersch
    • Stochastic Video Generation with a Learned Prior, ICML 2018 [pdf]
      E. Denton and R. Fergus
  • Stereo Matching [pdf]