365-1183-00L  Reinforcement Learning: Insights and Applications

Semester: Autumn Semester 2023
Lecturers: C. Cuchiero, B. J. Bergmann, J. Teichmann
Periodicity: yearly recurring course
Comment: Exclusively for MAS MTEC students (1st and 3rd semester).


365-1183-00 S  Reinforcement Learning: Insights and Applications
Two-day course.
Friday: 08:30-17:00; Saturday: 08:30-16:45.
16s hrs
13.10.  08:15-17:00  HG E 33.3
        08:15-17:00  HG E 33.5
14.10.  08:15-17:00  HG E 33.3
        08:15-17:00  HG E 33.5
C. Cuchiero, B. J. Bergmann, J. Teichmann


Abstract: Reinforcement learning (RL) is a field of machine learning that develops algorithms which, using novel machine learning technologies, enable an agent to learn optimal strategies through interaction with its environment. In this course we will cover the main building blocks of (deep) RL and discuss recent applications in finance and robotics.
Learning objective: After taking this course, students will
- understand the fundamentals of reinforcement learning (RL), including the definitions of agent, environment, and rewards.
- understand the idea of a Markov Decision Process (MDP), the mathematical framework used to model decision-making problems in RL.
- understand the concept of a value function, which measures the expected reward an agent can receive from a given state.
- review various techniques for optimizing an agent's policy to maximize its expected reward.
- get an idea of Deep Reinforcement Learning (DRL), that is, the use of deep neural networks in reinforcement learning and their advantages over traditional RL methods.
- understand the concept of Partially Observed Markov Decision Processes (POMDPs) and their relation to MDPs.
- optionally gain hands-on experience with RL algorithms for MDPs and POMDPs.
- see real-world applications of DRL, including finance and robotics.
Content: Reinforcement learning is a subfield of machine learning that focuses on developing algorithms that enable an agent to learn through trial and error by interacting with its environment. RL differs from other ML approaches such as supervised learning in that it needs neither labelled input/output pairs nor explicit correction of sub-optimal actions. Instead, the focus is on finding a balance between exploration (of uncharted territory) and exploitation (of current knowledge). The environment is typically stated in the form of a Markov decision process (MDP). In this course we will go through the main architecture of reinforcement learning and review some of its applications.
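
To make the exploration/exploitation balance concrete, here is a minimal sketch of an epsilon-greedy agent on a multi-armed bandit, which can be seen as an MDP with a single state. It is an illustration only and not taken from the course material: the function name `run_epsilon_greedy`, the reward probabilities, and the parameter values are all assumptions chosen for the example.

```python
import random


def run_epsilon_greedy(true_means, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy on a Bernoulli bandit (hypothetical toy setup)."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # how often each arm was pulled
    estimates = [0.0] * n_arms   # running mean reward per arm
    for _ in range(steps):
        if rng.random() < epsilon:
            # Exploration: try a uniformly random arm.
            arm = rng.randrange(n_arms)
        else:
            # Exploitation: pick the arm with the best current estimate.
            arm = max(range(n_arms), key=lambda i: estimates[i])
        # The environment returns reward 1 with the arm's (unknown) probability.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the running mean.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts


# Assumed reward probabilities for three arms; the agent never sees them directly.
estimates, counts = run_epsilon_greedy([0.2, 0.5, 0.8])
best_arm = max(range(len(estimates)), key=lambda i: estimates[i])
```

No labelled examples are provided here: the agent only observes the rewards of the arms it actually pulls, and `epsilon` controls how often it deviates from its current best guess.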

On Day 1 of the two-day course, the concept of a Markov Decision Process (MDP), its value function, and the Bellman equation are introduced and discussed. Several classical and ML-powered algorithms are introduced and showcases are presented. On Day 2, the concept of a partially observed Markov Decision Process (POMDP) is introduced. Aspects of filtering and of embedding POMDPs into the framework of MDPs are presented. Showcases from robotics and finance, with an emphasis on the latter, are presented in theory and applications.
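
As a rough illustration of how the value function and the Bellman equation fit together, the following sketch runs value iteration, one of the classical algorithms, on a tiny MDP. The two-state MDP, its rewards, and the discount factor are hypothetical and chosen only so that the result can be checked by hand; they are not taken from the course.

```python
# Hypothetical two-state MDP for illustration only.
# P[s][a] is a list of (probability, next_state, reward) triples.
P = {
    0: {0: [(1.0, 0, 0.0)],   # state 0, action 0 ("stay"): remain in 0, reward 0
        1: [(1.0, 1, 0.0)]},  # state 0, action 1 ("move"): go to 1, reward 0
    1: {0: [(1.0, 1, 1.0)],   # state 1, action 0 ("stay"): remain in 1, reward 1
        1: [(1.0, 0, 0.0)]},  # state 1, action 1 ("move"): go to 0, reward 0
}
GAMMA = 0.9  # assumed discount factor


def q_value(V, s, a, gamma=GAMMA):
    """Expected reward plus discounted value of the next state."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])


def value_iteration(gamma=GAMMA, tol=1e-10):
    """Apply the Bellman optimality update until the values stop changing."""
    V = {s: 0.0 for s in P}
    while True:
        new_V = {s: max(q_value(V, s, a, gamma) for a in P[s]) for s in P}
        if max(abs(new_V[s] - V[s]) for s in P) < tol:
            return new_V
        V = new_V


def greedy_policy(V, gamma=GAMMA):
    """Choose, in each state, the action with the highest q-value."""
    return {s: max(P[s], key=lambda a: q_value(V, s, a, gamma)) for s in P}


V = value_iteration()
policy = greedy_policy(V)
```

In this toy example, staying in state 1 earns reward 1 forever, so its value is 1 / (1 - 0.9) = 10, and the value of state 0 is 0.9 * 10 = 9 because the best action there is to move to state 1.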

An understanding of basic machine learning concepts is welcome but not mandatory (e.g. you took the class “Fundamentals on ML for Executives” or “AI for Executives”). At the beginning of the course, we will give a short primer on mathematics and statistics and on some fundamental aspects of machine learning. We will provide coding examples for those who would like to follow the code.

Assessment is ungraded (semester performance) and is based on active participation in class and a short written report after the course.
Subject-specific Competencies
- Concepts and Theories: fostered
- Techniques and Technologies: fostered
Method-specific Competencies
- Analytical Competencies: fostered
- Media and Digital Technologies: fostered
Social Competencies
- Cooperation and Teamwork: fostered
Personal Competencies
- Creative Thinking: fostered
- Critical Thinking: fostered


Performance assessment information (valid until the course unit is held again)
Performance assessment as a semester course
ECTS credits: 1 credit
Examiners: J. Teichmann, B. J. Bergmann, C. Cuchiero
Type: ungraded semester performance
Repetition: Repetition only possible after re-enrolling for the course unit.
Additional information on mode of examination: Credit points will only be assigned if the following criteria are met: full attendance on all course days and full completion of all course assignments.


No public learning materials available.
Only public learning materials are listed.


No information on groups available.


Places: 25 at most
Priority: Registration for the course unit is only possible for the primary target group
Primary target group: MAS ETH in Management, Technology, and Economics (365000) starting semester 01
Waiting list: until 08.10.2023
End of registration period: Registration only possible until 10.09.2023

Offered in

Programme: MAS in Management, Technology, and Economics; Section: Electives, 1st and 3rd Semester; Type: W