263-3710-00L  Machine Perception

Semester: Spring Semester 2018
Lecturers: O. Hilliges
Periodicity: yearly recurring course
Language of instruction: English
Comment: Students who have already taken 263-3700-00 User Interface Engineering are not allowed to register for this course.

Abstract: Recent developments in neural networks (also known as “deep learning”) have drastically advanced the performance of machine perception systems in a variety of areas including drones, self-driving cars and intelligent UIs. This course is a deep dive into the details of deep learning algorithms and architectures for a variety of perceptual tasks.
Objective: Students will learn about fundamental aspects of modern deep learning approaches for perception. Students will learn to implement, train and debug their own neural networks and gain a detailed understanding of cutting-edge research in learning-based computer vision, robotics and HCI. The final project assignment will involve training a complex neural network architecture and applying it to a real-world dataset of human motion.

The core competency acquired through this course is a solid foundation in deep-learning algorithms that process and interpret human input to computing systems. In particular, students should be able to develop systems that recognize people in images, detect and describe body parts, infer their spatial configuration, and perform action or gesture recognition from still images or image sequences, including from multi-modal data.
Content: We will focus on teaching how to set up the problem of machine perception, the learning algorithms (e.g. backpropagation), practical engineering aspects, as well as advanced deep learning algorithms including generative models.

The course covers the following main areas:
I) Machine-learning algorithms for input recognition, computer vision and image classification (human pose, object detection, gestures, etc.)
II) Deep-learning models for the analysis of time-series data (temporal sequences of motion)
III) Learning of generative models for synthesis and prediction of human activity.

Specific topics include: 
• Deep learning basics:
○ Neural Networks and training (i.e., backpropagation)
○ Feedforward Networks
○ Recurrent Neural Networks
• Deep Learning techniques for user input recognition:
○ Convolutional Neural Networks for classification
○ Fully Convolutional architectures for dense per-pixel tasks (e.g., segmentation)
○ LSTMs and related architectures for time-series analysis
○ Generative Models (GANs, Variational Autoencoders)
• Case studies from research in computer vision, HCI, robotics and signal processing
Literature: Deep Learning
Book by Ian Goodfellow, Yoshua Bengio and Aaron Courville
Prerequisites / Notice: This is an advanced graduate-level course that requires a background in machine learning. Students are expected to have a solid mathematical foundation, in particular in linear algebra, multivariate calculus, and probability. The course will focus on state-of-the-art research in deep learning and is not meant as an extensive tutorial on how to train deep networks with TensorFlow.

Please take note of the following conditions:
1) The number of participants is limited to 100 students (MSc and PhD).
2) Students must have taken the exam in Machine Learning (252-0535-00) or have acquired equivalent knowledge.
3) All practical exercises will require basic knowledge of Python and will use libraries such as TensorFlow, scikit-learn and scikit-image. We will provide introductions to TensorFlow and other libraries that are needed but will not provide introductions to basic programming or Python.

The following courses are strongly recommended as prerequisites:
* "Machine Learning"
* "Visual Computing" or "Computer Vision"

The course will be assessed by a final written examination in English. No course materials or electronic devices may be used during the examination. Note that the examination will be based on the contents of the lectures, the associated reading materials and the exercises.