Overview
Recent developments in neural networks (aka “deep learning”) have drastically advanced the performance of machine perception systems in a variety of areas, including drones, self-driving cars and intelligent UIs. This course is a deep dive into the details of deep learning algorithms and architectures for a variety of perceptual tasks.
Announcements
- 27.07.2018
- Released mock exam (see below).
- 07.03.2018
- Schedule update: Note that there is no class on March 15th. Also, in contrast to the previous schedule, a lecture will be held on April 26th. Please check the updated schedule below.
- 21.02.2018
- Project description and relevant papers online
- 15.02.2018
- Course website online
Learning Objectives
Students will learn about fundamental aspects of modern deep learning approaches for perception. Students will learn to implement, train and debug their own neural networks and gain a detailed understanding of cutting-edge research in learning-based computer vision, robotics and HCI. The final project assignment will involve training a complex neural network architecture and applying it to a real-world dataset.
The core competency acquired through this course is a solid foundation in deep-learning algorithms to process and interpret human input into computing systems. In particular, students should be able to develop systems that deal with the problem of recognizing people in images, detecting and describing body parts, inferring their spatial configuration, performing action/gesture recognition from still images or image sequences, also considering multi-modal data, among others.
Schedule
Wk. | Date | Content | Slides | Extra Material |
---|---|---|---|---|
1 | 22.02. | Introduction & Basic Concepts: Introduction to class contents & admin | slides | lecture notes |
2 | 01.03. | Image Classification & Input Recognition: Support Vector Machines | slides, slides (annotated) | lecture notes |
3 | 08.03. | Deep Learning Introduction: Feedforward Networks, Representation Learning | slides, slides (annotated) | Additional reading material (link to Stanford's course notes): CNNs for Visual Recognition |
4 | 15.03. | No Class | | |
5 | 22.03. | Layer Types, Training & Classification: Backpropagation | slides, slides (annotated) | |
6 | 29.03. | Convolutional Neural Networks | slides, slides (annotated) | |
7 | 05.04. | No Class (Easter) | | |
8 | 12.04. | Recurrent Neural Networks: LSTM, GRU, Backpropagation through time | slides, slides (annotated) | Additional reading material: Deep Learning Book, chapter on RNNs |
9 | 19.04. | Program Committee | | |
10 | 26.04. | Fully Convolutional CNNs: Segmentation | slides | |
11 | 03.05. | Generative Models Pt 1: Variational Autoencoders | slides, slides (annotated) | |
12 | 10.05. | No class (Ascension Day) | | |
13 | 17.05. | Generative Models Pt 2: Generative Adversarial Networks | slides, slides (annotated) | |
14 | 24.05. | Hands & Eyes | slides | |
15 | 31.05. | Reinforcement Learning | slides | |
Exercises
There will be 3 exercises (2 pen-and-paper exercises and 1 programming exercise), none of which are graded. The exercises will help you understand the course content in more depth and will also prepare you for the graded multi-week project (see below). You will have one week to complete each exercise; after that, we release the solutions and discuss them in the exercise session. Some exercise sessions will include additional tutorials with demos and Q&A that will be helpful for completing the exercises and the multi-week project. Exercise sessions are only scheduled until week 8; after that, the TAs will be present to answer questions. Please find a schedule of the exercise sessions below.
Exercise sheets and solutions will only be accessible from within the ETH network.
Wk. | Date | Content | Material |
---|---|---|---|
1 | 22.02. | Introduction to Azure and TensorFlow: Introduction on how to use Microsoft's Azure GPU cluster and example of how to implement a simple linear regression model in TensorFlow. | Tensorflow Tutorial (Azure Notebook) |
2 | 01.03. | Exercise 1: Release of exercise 1 (Support Vector Machines, pen-and-paper) | exercise sheet, slides, solutions |
3 | 08.03. | Exercise 1 solutions and TensorFlow-Tutorial: Discussion of exercise 1 and practical tips on how to train your neural network. | slides, animations |
4 | 15.03. | No class | |
5 | 22.03. | Exercise 2 and Regularization: Release of exercise 2 (Backpropagation, pen-and-paper) and practical tips on how to improve the generalization performance of your neural network. | exercise sheet, solutions, slides |
6 | 29.03. | Exercise 2 solutions: Discussion of exercise 2 | |
7 | 05.04. | No class | |
8 | 12.04. | Exercise 3 and TensorFlow-Tutorial 3: Practical tutorial on how to train a CNN and RNN model in TensorFlow and some additional (optional) programming exercises. | CNN Azure Notebook, RNN Azure Notebook |
9 | 19.04. | Exercise 3 Q&A and Project Report Guidelines: Q&A session for exercise 3 and some help and guidelines for the report to be handed in at the end of the project. | Report Writing |
... | ... | ... |
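The week 1 session covers a simple linear regression model in TensorFlow. As a framework-free illustration of the same idea, the sketch below fits a line by batch gradient descent in plain Python. It is not the course's notebook; the function name, learning rate, step count and toy data are all assumptions chosen for illustration.

```python
def fit_linear_regression(xs, ys, lr=0.05, steps=2000):
    """Fit y ~ w * x + b by batch gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of (1/n) * sum((w*x + b - y)^2) with respect to w and b.
        grad_w = sum(2.0 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2.0 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical, noise-free toy data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]
w, b = fit_linear_regression(xs, ys)
```

In a framework such as TensorFlow, the two gradient expressions would typically be obtained by automatic differentiation rather than written by hand.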
Project
There will be a multi-week project that gives you the opportunity to have some hands-on experience with training a neural network for a concrete application. The project is to be completed in groups of two and will be graded. The grade counts 50 % towards your final grade.
There are 4 projects available, for which you need to register at the beginning of the course (details are to be announced). As the various exercises will prepare you for this project, we do not expect you to work on it before week 8. There are no more activities in the exercise slots after week 8, so that you can use them to work on your project and/or ask questions about it.
The projects will be hosted on Kaggle, i.e., the grade will be dependent on the performance of your model. There are some baselines available that will guarantee a certain grade if your model outperforms them. You are also asked to hand in a short report (max. 3 pages) describing your best performing model. The report will be considered for your final grade. The final deadline is 2 weeks after the end of the semester (Friday, June 15th). More details will be announced in the lecture and here.
1) Gesture Recognition from Videos
This project involves classification of dynamic hand gestures from multi-modal data including RGB, depth, segmentation mask and skeletal information for videos.
Relevant papers:
- Pavlo Molchanov et al. (2016) Online Detection and Classification of Dynamic Hand Gestures with Recurrent 3D Convolutional Neural Networks
- Lionel Pigou et al. (2018) Beyond Temporal Pooling: Recurrence and Temporal Convolutions for Gesture Recognition in Video
2) Hand Joint Recognition
This project tackles the problem of 2D keypoint estimation of human hands. Given an image of a human hand, we would like to infer the 2D location of the joints.
Relevant papers:
- Shih-En Wei et al. (2016) Convolutional Pose Machines
- Alejandro Newell et al. (2016) Stacked Hourglass Networks for Human Pose Estimation
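Both papers listed for this project predict one heatmap per joint and read the joint location off the heatmap. The sketch below shows that decoding step only, in plain Python, under the assumption that each heatmap is a 2D list of scores indexed as `heatmap[y][x]`. The function names are hypothetical, and the project itself may use a different output representation.

```python
def heatmap_to_keypoint(heatmap):
    """Return (x, y, score) of the highest-scoring pixel in one heatmap."""
    best = (0, 0, heatmap[0][0])
    for y, row in enumerate(heatmap):
        for x, value in enumerate(row):
            if value > best[2]:
                best = (x, y, value)
    return best


def decode_keypoints(heatmaps):
    """Turn a list of per-joint heatmaps into a list of (x, y) locations."""
    return [heatmap_to_keypoint(h)[:2] for h in heatmaps]


# Two hypothetical 3x4 heatmaps, one per joint.
joint_a = [[0.0, 0.1, 0.0, 0.0],
           [0.0, 0.2, 0.9, 0.1],
           [0.0, 0.0, 0.1, 0.0]]
joint_b = [[0.7, 0.1, 0.0, 0.0],
           [0.1, 0.0, 0.0, 0.0],
           [0.0, 0.0, 0.0, 0.0]]
keypoints = decode_keypoints([joint_a, joint_b])
```

Taking the argmax limits the prediction to pixel resolution; sub-pixel refinements exist but are left out of this sketch.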
3) Eye Gaze Estimation
This project concerns the estimation of eye gaze direction. In other words, we try to determine the 3D direction in which a person's eye is looking, as seen from the perspective of a webcam. We use single eye images as input and regress two angles that represent eyeball yaw and pitch.
Relevant papers:
- Xucong Zhang et al. (2017) MPIIGaze: Real-World Dataset and Deep Appearance-Based Gaze Estimation
- Kyle Krafka et al. (2016) Eye Tracking for Everyone
- Ashish Shrivastava et al. (2017) Learning from Simulated and Unsupervised Images through Adversarial Training
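To make the two-angle representation concrete, the sketch below converts a (pitch, yaw) pair in radians into a 3D unit vector and measures the angle between a predicted and a true gaze direction. The sign and axis convention used here (camera looking along negative z) is an assumption, as are the function names; the project's own convention and evaluation metric may differ.

```python
import math


def angles_to_vector(pitch, yaw):
    """Convert pitch and yaw (radians) to a 3D unit gaze vector.

    Assumed convention: zero pitch and yaw point along the negative z-axis.
    """
    return (-math.cos(pitch) * math.sin(yaw),
            -math.sin(pitch),
            -math.cos(pitch) * math.cos(yaw))


def angular_error_deg(pred, target):
    """Angle in degrees between two gaze directions given as (pitch, yaw)."""
    a = angles_to_vector(*pred)
    b = angles_to_vector(*target)
    dot = sum(p * q for p, q in zip(a, b))
    # Clamp to guard against rounding slightly outside [-1, 1].
    dot = max(-1.0, min(1.0, dot))
    return math.degrees(math.acos(dot))


# Hypothetical prediction that is off by 10 degrees in yaw.
error = angular_error_deg((0.0, math.radians(10.0)), (0.0, 0.0))
```

Comparing directions as vectors rather than comparing the raw angles gives a single error value that does not depend on how the error is split between yaw and pitch.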
4) Human Motion Prediction
In this project you are given a sequence of full-body human poses, represented as 3D skeletal data, and the task is to predict how the motion continues for several frames in the future.
Relevant papers:
- Katerina Fragkiadaki et al. (2015) Recurrent Network Models for Human Dynamics
- Julieta Martinez et al. (2017) On Human Motion Prediction Using Recurrent Neural Networks
- Partha Ghosh et al. (2017) Learning Human Motion Models for Long-term Predictions
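A useful reference point for this task is the zero-velocity baseline discussed by Martinez et al. (2017), which simply repeats the last observed pose for every future frame. The sketch below implements that baseline in plain Python together with a simple error measure. The pose layout (a list of `(x, y, z)` joint positions per frame), the function names and the error measure are assumptions for illustration and not the project's official data format or metric.

```python
import math


def zero_velocity_prediction(observed_poses, num_future_frames):
    """Predict future frames by repeating the last observed pose."""
    if not observed_poses:
        raise ValueError("need at least one observed pose")
    last_pose = observed_poses[-1]
    return [list(last_pose) for _ in range(num_future_frames)]


def mean_joint_error(predicted, ground_truth):
    """Mean Euclidean distance per joint, averaged over all frames."""
    distances = [math.dist(p, g)
                 for pred_pose, true_pose in zip(predicted, ground_truth)
                 for p, g in zip(pred_pose, true_pose)]
    return sum(distances) / len(distances)


# Hypothetical two-joint skeleton observed for two frames.
observed = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
            [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]]
prediction = zero_velocity_prediction(observed, num_future_frames=3)

# Hypothetical ground truth in which every joint sits one unit higher in y.
truth = [[(0.0, 2.0, 0.0), (1.0, 2.0, 0.0)] for _ in range(3)]
baseline_error = mean_joint_error(prediction, truth)
```

A learned model is only worth its training cost if it beats this kind of trivial predictor, so it is a sensible first number to compute.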
Case Study
We will have an in-class case study, where we simulate a program committee meeting. This part of the course is optional and will not be graded. However, you will have the chance to discuss a paper that can be relevant for the project. In order to participate in the case study, you will have to register through a form that will be published here in due time.
Exam
The performance assessment is a written exam (2 hours) conducted during the examination session (Jul-Aug). It will constitute 50 % of the final grade.
To give you a rough idea of what to expect in the exam, we have released a mock exam, which you can download here: