Machine Perception - SS '20

Recent developments in neural networks (aka “deep learning”) have drastically advanced the performance of machine perception systems in a variety of areas including drones, self-driving cars and intelligent UIs. This course is a deep dive into the details of deep learning algorithms and architectures for a variety of perceptual tasks.


Overview

eDoz Course Nr.
263-3710-00L
Lecturer
O. Hilliges
Assistants
E. Aksan, S. Christen, X. Chen, M. Kaufmann, S. Park, J. Song, A. Spurr, X. Zhang
Lecture
Thu 10:00 - 12:00, live streamed via Zoom
Exercises
Thu 13:00 - 15:00, live streamed via Zoom
Credits
5 ECTS
Contact
Please address all questions regarding content, organisation etc. on Piazza. Please sign up to the forum using this link. The forum is closely monitored by us. For organisational reasons, we will not be able to respond to direct e-mails.



Announcements

02.04.2020
Future video recordings will be released via the Zoom cloud service. The password is the same as the one used for the ETH video portal, which can be found in the first lecture’s slides.
01.04.2020
The exercise sessions on Tips for Training Part 1 & 2 will be held offline. We will release a recording of both sessions early next week and schedule a Q&A session on April 10 (Friday) at 13:00.
13.03.2020
Both the lecture and the exercise will be live-streamed via Zoom and recorded for offline viewing. Note that we will only live stream the Thursday slot of the exercise.
26.02.2020
The lecture and tutorial sessions in week 2 (27.02) are cancelled. Please refer to the schedule for updates. We apologize for the inconvenience.
12.02.2020
Please sign up to Piazza.
12.02.2020
Course website online, more information to follow.



Learning Objectives

Students will learn about fundamental aspects of modern deep learning approaches for perception. Students will learn to implement, train and debug their own neural networks and gain a detailed understanding of cutting-edge research in learning-based computer vision, robotics and HCI. The final project assignment will involve training a complex neural network architecture and applying it to a real-world dataset.

The core competency acquired through this course is a solid foundation in deep-learning algorithms to process and interpret human input into computing systems. In particular, students should be able to develop systems that deal with problems such as recognizing people in images, detecting and describing body parts, inferring their spatial configuration, and performing action/gesture recognition from still images or image sequences, including from multi-modal data.




Schedule

Subject to change. Materials are only available from within the ETH network.

Wk. Date Content Material Exercise Session
1 20.02.
-- No Class --
2 27.02.
-- No Class --
3 05.03.
Deep Learning Introduction

Class content & admin,
Feedforward Networks,
Representation Learning

slides
slides (annotated)

Perceptron Visualization Notebook

Tutorial Implement your own MLP

slides
XOR Notebook
XOR Solution
Eye-Gaze Notebook
Eye-Gaze Solution
4 12.03.
Training Neural Networks

Backpropagation

slides
slides (annotated)

Tutorial Linear Regression

slides
Linear Regression Notebook

Pen & Paper Backpropagation

exercise
solution
5 19.03.
Convolutional Neural Networks

slides
slides (annotated)
annotations
recording
recording (audio-only)

Additional material:
RSVP
Cortical Neuron

Tutorial CNNs in TensorFlow

slides
recording
recording (audio-only)
CNN Notebook

Pen & Paper CNN

exercise
solution
6 26.03.
Recurrent Neural Networks

LSTM, GRU, Backpropagation through time

slides
slides (annotated)
annotations
recording
recording (audio-only)

Tutorial RNNs in TensorFlow

slides
recording
recording (audio-only)
RNN Notebook

Pen & Paper RNN

exercise
solution
7 02.04.
Fully Convolutional Neural Networks

Advanced Vision Topics

slides CNN-Pt-II
slides FCNN
recording

Class Tips for Training Part 1

slides
recording
8 09.04.
Generative Models: VAE Pt. I

Variational Autoencoders, etc.

slides
slides (annotated)
annotations
recording

Additional material:
Denoising Autoencoders
Image Super Resolution
Celeste
VAE Hands
DeepWriting
Chemical Design
World Models

Class Tips for Training Part 2

slides
recording

Pen & Paper VAE

exercise
solution
9 16.04.
-- No Class (Easter) --
10 23.04.
Generative Models: VAE Pt. II & GANs Pt. I

Generative Adversarial Networks & Co

slides VAE Pt. II
slides GANs Pt. I
slides VAE Pt. II (anno.)
slides GANs Pt. I (anno.)
recording

Pen & Paper GAN

exercise
solution
11 30.04.
GANs Pt. II & Autoregressive Models Pt. I

PixelCNN, PixelRNN, WaveNet, Stochastic RNNs

slides GANs Pt. II
slides ARM Pt. I
slides GANs Pt. II (anno.)
slides ARM Pt. I (anno.)
recording

Pen & Paper Autoregressive VAE

exercise
solution
12 07.05.
Autoregressive Models Pt. II & Reinforcement Learning Pt. I
slides ARM Pt. II
slides RL Pt. I
slides ARM Pt. II (anno.)
slides RL Pt. I (anno.)
recording
13 14.05.
Reinforcement Learning Pt. II
slides RL Pt. II
slides RL Pt. II (anno.)
recording

Pen & Paper RL

exercise
solution
14 21.05.
-- No Class (Ascension Day) --
15 28.05.
Recent Research

slides
recording



Exercise Sessions

Please refer to the above schedule for an overview of the planned exercise slots. We will have three different types of activities in the exercise sessions:

  1. Tutorial: Interactive programming tutorial in Python taught by a TA. Code will be made available.
  2. Class: Lecture-style class taught by a TA to give you some tips on how to train your neural network in practice.
  3. Pen & Paper: Pen & paper exercises that are not graded but are helpful to prepare for the written exam. Solutions will be published on the website a week after the release and discussed in the exercise session if desired.




Project

Overview

There will be a multi-week project that gives you the opportunity to gain hands-on experience with training a neural network for a concrete application. The project is to be completed in groups of two or three and will be graded. The project grade counts 40 % towards your final grade.

The project grade will be determined by two factors: 1) a competitive part based on how well your model fares compared to your fellow students' models and 2) the idea/novelty/innovativeness of your approach based on a written report to be handed in after the project deadline. For each project there will be baselines available that guarantee a certain grade for the competitive part if you surpass them. The competition will be hosted on an online platform; more details will be announced here.

We provide four project topics. You will have to register for a project via this link by 29th March. As the various exercises will prepare you for this project, we do not expect you to work on it before week 8. There are no more activities in the exercise slots after week 8, so you can use them to work on your project. The final project deadline will be at the end of this semester (exact date to be announced).

Descriptions

  1. Dynamic Gesture Recognition
  2. 3D Human Pose Estimation
  3. Eye Gaze Estimation
  4. Human Motion Prediction