Machine Perception - SS '19

Recent developments in neural networks (aka “deep learning”) have drastically advanced the performance of machine perception systems in a variety of areas including drones, self-driving cars and intelligent UIs. This course is a deep dive into the details of deep learning algorithms and architectures for a variety of perceptual tasks.


Overview

eDoz Course Nr.
263-3710-00L
Lecturer
O. Hilliges
Assistants
E. Aksan, X. Chen, M. Kaufmann, S. Park, J. Song, A. Spurr, X. Zhang
Lecture
Thu 10:00 - 12:00, CAB G 61
Exercises
Thu 13:00 - 15:00, NO C 6
Fri 13:00 - 15:00, NO C 6
Credits
5 ECTS
Recordings
The lectures are recorded. Credentials are available in the first week's slides.
Contact
Please address all questions regarding content, organisation etc. on Piazza. Please sign up to the forum using this link. The forum is closely monitored by us. For organisational reasons, we will not be able to respond to direct e-mails.



Announcements

16.04.2019
Schedule updated (see below).
01.04.2019
Pen & Paper Backpropagation: Instructions and solutions updated following Piazza posts 26 and 27
27.03.2019
Project descriptions online.
21.02.2019
Added link to lecture recordings (first recording will be available within 1-2 days).
18.02.2019
Please sign up to Piazza.
22.01.2019
Course website online, more information to follow.



Learning Objectives

Students will learn about fundamental aspects of modern deep learning approaches for perception. Students will learn to implement, train and debug their own neural networks and gain a detailed understanding of cutting-edge research in learning-based computer vision, robotics and HCI. The final project assignment will involve training a complex neural network architecture and applying it to a real-world dataset.

The core competency acquired through this course is a solid foundation in deep-learning algorithms to process and interpret human input into computing systems. In particular, students should be able to develop systems that recognize people in images, detect and describe body parts, infer their spatial configuration, and perform action/gesture recognition from still images or image sequences, including from multi-modal data.




Schedule

Subject to change. Materials only available from within ETH network.

Wk. Date Content Material Exercise Session
1 21.02.
Introduction

Introduction to class contents & admin

slides
2 28.02.
Deep Learning Introduction

Feedforward Networks, Representation Learning

slides
slides (annotated)
lecture notes
perceptron colab notebook

Tutorial Implement your own MLP

XOR Colab Notebook
Eye-Gaze Colab Notebook
Eye-Gaze Solutions Notebook
3 07.03.
Training & Classification

Backpropagation

slides (updated)

Class Tips for Training Part 1

slides
4 14.03.
-- No Class --

Tutorial Linear Regression in TensorFlow

slides
Colab Notebook

Pen & Paper Backpropagation

instructions (updated)
solutions (updated)
5 21.03.
Convolutional Neural Networks
slides
slides (annotated)

Tutorial CNNs in TensorFlow

slides
Colab Notebook

Pen & Paper CNNs

instructions
solutions (updated)
6 28.03.
Fully Convolutional Neural Networks

Segmentation

CNNs Pt 2 (updated)
CNNs Pt 2 (annotated)
Fully CNNs

Additional Reading Material:

AlexNet
VGG
GoogLeNet
ResNet

Class Tips for Training Part 2

slides (pdf)
slides (pptx for animations)
7 04.04.
Recurrent Neural Networks

LSTM, GRU, Backpropagation through time

slides
slides (annotated)

Tutorial RNNs in TensorFlow

slides
Colab Notebook

Pen & Paper RNNs

instructions
solutions
8 11.04.
Generative Models: Latent Variable Models

Sequence Modelling and Autoencoders.

slides
slides (annotated)
9 18.04.
Generative Models: Latent Variable Models

Variational Autoencoders.

slides
slides (annotated)
10 25.04.
-- No Class (Easter) --
11 02.05.
Generative Models: Implicit Models

Generative Adversarial Networks & Co

slides
slides (annotated)
12 09.05.
Generative Models: Autoregressive Models

PixelCNN, PixelRNN, WaveNet, Stochastic RNNs

slides
slides (annotated)

Additional Reading Material:

NADE MADE
PixelRNN PixelCNN
WaveNet VRNN
DeepWriting STCN
Deep Learning Ch. 20.10.7ff.

Pen & Paper Generative Models

instructions
solutions
13 16.05.
Applications: Eyes
Reinforcement Learning Pt. I
Eyes: slides
RL: slides
RL: slides (extended)
14 23.05.
Reinforcement Learning Pt. II

slides
slides (annotated)
15 30.05.
-- No Class (Ascension Day) --



Exercise Sessions

Please refer to the above schedule for an overview of the planned exercise slots. We will have three different types of activities in the exercise sessions:

  1. Tutorial: Interactive programming tutorial in Python taught by a TA. Code will be made available.
  2. Class: Lecture-style class taught by a TA to give you some tips on how to train your neural network in practice.
  3. Pen & Paper: There will be 4 pen & paper exercises. They are not graded but are helpful to prepare for the written exam. Solutions will be published on the website a week after the release and discussed in the exercise session if desired.

The exercises are meant to help you understand the course content in more depth and to prepare you for the graded multi-week project. Exercise sessions are primarily scheduled until week 8 so that you have time to work on the project after week 8.




Project

Overview

There will be a multi-week project that gives you the opportunity to have some hands-on experience with training a neural network for a concrete application. The project is to be completed in groups of two and will be graded. The project grade counts 40 % towards your final grade.

The project grade will be determined by two factors: 1) a competitive part based on how well your model fares compared to your fellow students' models and 2) the idea/novelty/innovativeness of your approach based on a written report to be handed in after the project deadline. For each project there will be baselines available that guarantee a certain grade for the competitive part if you surpass them. The competition will be hosted on an online platform - more details will be announced here.

At the beginning of the course we will provide 4 projects from which your group can choose one. You will have to register for a project around week 5. As the various exercises will prepare you for this project, we do not expect you to work on it before week 8. There are no more activities in the exercise slots after week 8, so that you can use them to work on your project. The final project deadline is 2 weeks after the end of the semester (Friday, June 14th).

Descriptions

  1. Dynamic Gesture Recognition
  2. 3D Human Pose Estimation
  3. Eye Gaze Estimation
  4. Human Motion Prediction

Submission

For the final submission, please write a report (using this LaTeX template) and upload your report along with your code. Details about the submission will be announced later. We will re-train and re-evaluate your model. Hence, your submission should provide easy-to-follow instructions that let us reproduce your final score on the leaderboard. Please include a readme with instructions on how to train and evaluate your model. Submissions whose results cannot be reproduced will be penalized accordingly.

Grading

This project constitutes 40% of the final course grade. Project grades will be determined by taking the average of public and private scores. We will provide two baselines. Beating the easy baseline guarantees a grade of 4. As we want to encourage novel solutions and ideas, we also ask you to write a short report (3 pages, excluding references) detailing the model you used for the final submission and your contributions. Depending on this report, we will weight the grade determined by your performance w.r.t. the baselines. In other words, the grade computed as mentioned above can go up or down, depending on the contributions of your project. If you passed the easy baseline, your final grade cannot go lower than 4 after the weighting.




Exam

The performance assessment is a written exam (2 hours) conducted during the examination session (Jul-Aug). It will constitute 60 % of the final grade.
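
As a rough illustration of how the two parts combine, the sketch below computes a weighted course grade from the 40 % project and 60 % exam weights stated on this page. The function name `final_grade` and the constants are hypothetical, and rounding rules or any separate passing conditions are not specified here, so none are applied.

```python
# Weights taken from this page: project 40 %, written exam 60 %.
PROJECT_WEIGHT = 0.4
EXAM_WEIGHT = 0.6


def final_grade(project_grade: float, exam_grade: float) -> float:
    """Return the weighted course grade (assumption: no rounding applied)."""
    return PROJECT_WEIGHT * project_grade + EXAM_WEIGHT * exam_grade


if __name__ == "__main__":
    # Example inputs are made up for illustration only.
    print(final_grade(project_grade=5.0, exam_grade=4.5))
```

Because the exam carries the larger weight, a given change in the exam grade moves the result 1.5 times as much as the same change in the project grade.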

To give you a rough idea what to expect for the exam, we release a mock exam which you can download here: