Search result: Catalogue data in Spring Semester 2021

Statistics Master
The following courses belong to the curriculum of the Master's Programme in Statistics. The corresponding credits do not count as external credits even for course units where an enrolment at ETH Zurich is not possible.
Master Studies (Programme Regulations 2020)
Core Courses
Statistical Modelling
Course units are offered in the autumn semester.
Applied Statistics
Number | Title | Type | ECTS | Hours | Lecturers
401-3632-00L | Computational Statistics | W | 8 credits | 3V + 1U | M. Mächler
Abstract: We discuss modern statistical methods for data analysis, including methods for data exploration, prediction and inference. We pay attention to algorithmic aspects, theoretical properties and practical considerations. The class is hands-on and methods are applied using the statistical programming language R.
Objective: The student obtains an overview of modern statistical methods for data analysis, including their algorithmic aspects and theoretical properties. The methods are applied using the statistical programming language R.
Content: See the class website.
Prerequisites / Notice: At least one semester of (basic) probability and statistics.

Programming experience is helpful but not required.
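As a flavour of the methods treated, a least-squares fit and prediction can be sketched in a few lines; the course itself works in R, so this Python version is purely illustrative:

```python
# Closed-form simple linear regression, the most basic prediction method
# the course builds on (illustrative sketch in Python; the course uses R).

def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y ~ a + b*x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    a = my - b * mx
    return a, b

x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]   # exactly y = 1 + 2x
a, b = fit_line(x, y)
print(a, b)                # -> 1.0 2.0
```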
Mathematical Statistics
Course units are offered in the autumn semester.
Subject Specific Electives
Number | Title | Type | ECTS | Hours | Lecturers
252-3900-00L | Big Data for Engineers | W | 6 credits | 2V + 2U + 1A | G. Fourny
This course is not intended for Computer Science and Data Science MSc students!
Abstract: This course is part of the series of database lectures offered to all ETH departments, together with Information Systems for Engineers. It introduces the most recent advances in the database field: how do we scale storage and querying to petabytes of data, with trillions of records? How do we deal with heterogeneous data sets? How do we deal with alternative data shapes such as trees and graphs?
Objective: This lecture is complementary to Information Systems for Engineers, as the two cover different periods of database history and practice -- you can even take both lectures at the same time.

The key challenge of the information society is to turn data into information, information into knowledge, knowledge into value. This has become increasingly complex. Data comes in larger volumes, diverse shapes, from different sources. Data is more heterogeneous and less structured than forty years ago. Nevertheless, it still needs to be processed fast, with support for complex operations.

This combination of requirements, together with the technologies that have emerged in order to address them, is typically referred to as "Big Data." This revolution has led to a completely new way to do business, e.g., develop new products and business models, but also to do science -- which is sometimes referred to as data-driven science or the "fourth paradigm".

Unfortunately, the quantity of data produced and available -- now in the Zettabyte range (that's 21 zeros) per year -- keeps growing faster than our ability to process it. Hence, new architectures and approaches for processing it were and are still needed. Harnessing them must involve a deep understanding of data not only in the large, but also in the small.

The field of databases evolves at a fast pace. In order to be prepared, to the extent possible, for the (r)evolutions that will take place in the next few decades, the emphasis of the lecture will be on paradigms and core design ideas, while today's technologies will serve as supporting illustrations.

After attending this lecture, you should have gained an overview and understanding of the Big Data landscape, which is the basis on which one can make informed decisions, i.e., pick and orchestrate the relevant technologies for addressing each business use case efficiently and consistently.
Content: This course gives an overview of database technologies and of the most important database design principles that lay the foundations of the Big Data universe.

It specifically targets students with a scientific or engineering background, rather than a computer science background.

We take the monolithic, one-machine relational stack from the 1970s, smash it down and rebuild it on top of large clusters: starting with distributed storage, and all the way up to syntax, models, validation, processing, indexing, and querying. A broad range of aspects is covered with a focus on how they fit all together in the big picture of the Big Data ecosystem.

No data is harmed during this course; however, please be psychologically prepared that our data may not always be in normal form.

- physical storage: distributed file systems (HDFS), object storage (S3), key-value stores

- logical storage: document stores (MongoDB), column stores (HBase)

- data formats and syntaxes (XML, JSON, RDF, CSV, YAML, protocol buffers, Avro)

- data shapes and models (tables, trees)

- type systems and schemas: atomic types, structured types (arrays, maps), set-based type systems (?, *, +)

- an overview of functional, declarative programming languages across data shapes (SQL, JSONiq)

- the most important query paradigms (selection, projection, joining, grouping, ordering, windowing)

- paradigms for parallel processing, two-stage (MapReduce) and DAG-based (Spark)

- resource management (YARN)

- what a data center is made of and why it matters (racks, nodes, ...)

- underlying architectures (internal machinery of HDFS, HBase, Spark)

- optimization techniques (functional and declarative paradigms, query plans, rewrites, indexing)

- applications.

Large-scale analytics and machine learning are outside the scope of this course.
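The two-stage (MapReduce) paradigm listed above can be sketched as a word count in plain Python; this is only an illustration of the map -> shuffle -> reduce stages, not how the course's cluster frameworks are actually invoked:

```python
# A minimal MapReduce-style word count: map emits (word, 1) pairs,
# shuffle groups them by key, reduce sums each group.
from collections import defaultdict

def map_phase(docs):
    for doc in docs:
        for word in doc.split():
            yield word, 1

def shuffle(pairs):
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    return {key: sum(values) for key, values in groups.items()}

docs = ["big data", "big clusters"]
counts = reduce_phase(shuffle(map_phase(docs)))
print(counts)   # -> {'big': 2, 'data': 1, 'clusters': 1}
```

On a real cluster the shuffle is what the framework does between the two user-supplied stages; here it is a plain dictionary.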
Literature: Papers from scientific conferences and journals. References will be given as part of the course material during the semester.
Prerequisites / Notice: This course is not intended for Computer Science and Data Science students. Computer Science and Data Science students interested in Big Data MUST attend the Master's level Big Data lecture, offered in Fall.

Requirements: programming knowledge (Java, C++, Python, PHP, ...) as well as basic knowledge of databases (SQL). If you have already built your own website with a backend SQL database, that is perfect.

Attendance is especially recommended for those who attended Information Systems for Engineers last Fall, which introduced the "good old databases of the 1970s" (SQL, tables and cubes). However, this is not a strict requirement, and it is also possible to take the two lectures in reverse order.
252-0220-00L | Introduction to Machine Learning | W | 8 credits | 4V + 2U + 1A | A. Krause, F. Yang
Limited number of participants. Preference is given to students in programmes in which the course is offered. All other students will be waitlisted. Please do not contact Prof. Krause with questions in this regard; if necessary, please contact Link.
Abstract: The course introduces the foundations of learning and making predictions based on data.
Objective: The course will introduce the foundations of learning and making predictions from data. We will study basic concepts such as trading off goodness of fit against model complexity. We will discuss important machine learning algorithms used in practice and provide hands-on experience in a course project.
Content:
- Linear regression (overfitting, cross-validation/bootstrap, model selection, regularization, [stochastic] gradient descent)
- Linear classification: Logistic regression (feature selection, sparsity, multi-class)
- Kernels and the kernel trick (Properties of kernels; applications to linear and logistic regression); k-nearest neighbor
- Neural networks (backpropagation, regularization, convolutional neural networks)
- Unsupervised learning (k-means, PCA, neural network autoencoders)
- The statistical perspective (regularization as prior; loss as likelihood; learning as MAP inference)
- Statistical decision theory (decision making based on statistical models and utility functions)
- Discriminative vs. generative modeling (benefits and challenges in modeling joint vs. conditional distributions)
- Bayes' classifiers (Naive Bayes, Gaussian Bayes; MLE)
- Bayesian approaches to unsupervised learning (Gaussian mixtures, EM)
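The "[stochastic] gradient descent" item above can be illustrated with a minimal one-parameter least-squares example (a sketch, not course material):

```python
# Gradient descent on the mean squared error of the model y = w*x:
# repeatedly step against the gradient until w converges.

def gradient_descent(x, y, lr=0.05, steps=100):
    w = 0.0
    n = len(x)
    for _ in range(steps):
        # gradient of (1/n) * sum (y_i - w*x_i)^2 with respect to w
        grad = -2.0 / n * sum(xi * (yi - w * xi) for xi, yi in zip(x, y))
        w -= lr * grad
    return w

x = [0.0, 1.0, 2.0, 3.0]
y = [0.0, 3.0, 6.0, 9.0]   # generated from y = 3x
w = gradient_descent(x, y)
print(round(w, 3))          # -> 3.0
```

The stochastic variant would estimate the gradient from one randomly chosen data point per step instead of the full sum.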
Literature: Textbook: Kevin Murphy, Machine Learning: A Probabilistic Perspective, MIT Press.
Prerequisites / Notice: Designed to provide a basis for the following courses:
- Advanced Machine Learning
- Deep Learning
- Probabilistic Artificial Intelligence
- Seminar "Advanced Topics in Machine Learning"
401-4632-15L | Causality | W | 4 credits | 2G | C. Heinze-Deml
Abstract: In statistics, we are used to searching for the best predictors of some random variable. In many situations, however, we are interested in predicting a system's behavior under manipulations. For such an analysis, we require knowledge about the underlying causal structure of the system. In this course, we study concepts and theory behind causal inference.
Objective: After this course, you should be able to
- understand the language and concepts of causal inference
- know the assumptions under which one can infer causal relations from observational and/or interventional data
- describe and apply different methods for causal structure learning
- given data and a causal structure, derive causal effects and predictions of interventional experiments
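The gap between prediction and intervention can be illustrated with a small simulation (a hypothetical structural model, not course material): a confounder Z drives both X and Y, so the naive regression slope of Y on X overstates the true causal effect of 2, while adjusting for Z recovers it:

```python
# Hypothetical structural model: Z -> X, Z -> Y, X -> Y (effect 2).
# The naive slope of Y on X is biased; adjusting for Z recovers 2.
import random

random.seed(0)
n = 20000
Z = [random.gauss(0, 1) for _ in range(n)]
X = [z + random.gauss(0, 1) for z in Z]
Y = [2 * x + 3 * z + random.gauss(0, 1) for x, z in zip(X, Z)]

def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)

naive = cov(X, Y) / cov(X, X)              # ~3.5: inflated by the confounder

# Coefficient of X when regressing Y on both X and Z (2x2 normal equations)
sxx, szz, sxz = cov(X, X), cov(Z, Z), cov(X, Z)
sxy, szy = cov(X, Y), cov(Z, Y)
det = sxx * szz - sxz ** 2
adjusted = (sxy * szz - szy * sxz) / det   # ~2.0: the causal effect

print(round(naive, 1), round(adjusted, 1))
```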
Prerequisites / Notice: Basic knowledge of probability theory and regression.
401-3602-00L | Applied Stochastic Processes | W | 8 credits | 3V + 1U | V. Tassion
Abstract: Poisson processes; renewal processes; Markov chains in discrete and in continuous time; some applications.
Objective: Stochastic processes are a way to describe and study the behaviour of systems that evolve in some random way. In this course, the evolution will be with respect to a scalar parameter interpreted as time, so that we discuss the temporal evolution of the system. We present several classes of stochastic processes, analyse their properties and behaviour, and show by some examples how they can be used. The main emphasis is on theory; in that sense, "applied" should be understood to mean "applicable".
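As a small illustration of the Markov chains covered here, a two-state chain converging to its stationary distribution (an illustrative sketch, not course material):

```python
# A discrete-time Markov chain: iterating the transition matrix drives
# any starting distribution to the stationary distribution (5/6, 1/6).

P = [[0.9, 0.1],   # transition probabilities from state 0
     [0.5, 0.5]]   # transition probabilities from state 1

def step(dist, P):
    return [sum(dist[i] * P[i][j] for i in range(len(P)))
            for j in range(len(P))]

dist = [1.0, 0.0]            # start deterministically in state 0
for _ in range(100):
    dist = step(dist, P)

print([round(p, 4) for p in dist])   # -> [0.8333, 0.1667]
```

The limit solves pi = pi * P; here the balance equation 0.1 * pi_0 = 0.5 * pi_1 gives (5/6, 1/6).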
Literature: R. N. Bhattacharya and E. C. Waymire, "Stochastic Processes with Applications", SIAM (2009), available online: Link
R. Durrett, "Essentials of Stochastic Processes", Springer (2012), available online: Link
M. Lefebvre, "Applied Stochastic Processes", Springer (2007), available online: Link
S. I. Resnick, "Adventures in Stochastic Processes", Birkhäuser (2005)
Prerequisites / Notice: Familiarity with (measure-theoretic) probability theory as it is treated in the course "Probability Theory" (401-3601-00L).
401-3642-00L | Brownian Motion and Stochastic Calculus | W | 10 credits | 4V + 1U | W. Werner
Abstract: This course covers some basic objects of stochastic analysis. In particular, the following topics are discussed: construction and properties of Brownian motion, stochastic integration, Ito's formula and applications, stochastic differential equations and their connection with partial differential equations.
Objective: This course covers some basic objects of stochastic analysis. In particular, the following topics are discussed: construction and properties of Brownian motion, stochastic integration, Ito's formula and applications, stochastic differential equations and their connection with partial differential equations.
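Brownian motion can be simulated by summing independent Gaussian increments, so that the endpoint W(1) is approximately N(0, 1); a sketch (not course material):

```python
# Simulating Brownian motion on [0, 1]: each increment over a step of
# length dt is N(0, dt), so W(1) is a sum of n_steps such increments.
import random

random.seed(1)
n_steps = 100
dt = 1.0 / n_steps

def brownian_endpoint():
    w = 0.0
    for _ in range(n_steps):
        w += random.gauss(0.0, dt ** 0.5)   # increment ~ N(0, dt)
    return w

samples = [brownian_endpoint() for _ in range(2000)]
var = sum(w * w for w in samples) / len(samples)
print(round(var, 2))   # close to 1, the variance of W(1)
```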
Lecture notes: Lecture notes will be distributed in class.
Literature:
- J.-F. Le Gall, Brownian Motion, Martingales, and Stochastic Calculus, Springer (2016).
- I. Karatzas, S. Shreve, Brownian Motion and Stochastic Calculus, Springer (1991).
- D. Revuz, M. Yor, Continuous Martingales and Brownian Motion, Springer (2005).
- L.C.G. Rogers, D. Williams, Diffusions, Markov Processes and Martingales, vol. 1 and 2, Cambridge University Press (2000).
- D.W. Stroock, S.R.S. Varadhan, Multidimensional Diffusion Processes, Springer (2006).
Prerequisites / Notice: Familiarity with measure-theoretic probability as in the standard D-MATH course "Probability Theory" will be assumed. Textbook accounts can be found, for example, in
- J. Jacod, P. Protter, Probability Essentials, Springer (2004).
- R. Durrett, Probability: Theory and Examples, Cambridge University Press (2010).
401-6228-00L | Programming with R for Reproducible Research | W | 1 credit | 1G | M. Mächler
Abstract: Deeper understanding of R: function calls, rather than "commands".
Reproducible research and data analysis via Sweave and Rmarkdown.
Limits of floating point arithmetic.
Understanding how functions work. Environments, packages, namespaces.
Closures, i.e., functions returning functions.
Lists and [mc]lapply() for easy parallelization.
Performance measurement and improvements.
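The closure idea (functions returning functions) is treated in the course in R; the same concept can be sketched in Python for illustration:

```python
# A closure: the returned function captures 'exponent' from the
# enclosing call's environment, just as an R function captures its
# defining environment.

def make_power(exponent):
    def power(x):
        return x ** exponent   # 'exponent' lives on after make_power returns
    return power

square = make_power(2)
cube = make_power(3)
print(square(5), cube(2))   # -> 25 8
```

The R analogue would be `makePower <- function(exponent) function(x) x^exponent`.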
Objective: Learn to understand R as a (very versatile and flexible) programming language and learn about some of its lower-level functionalities, which are needed to understand *why* R works the way it does.
Content: See "Skript": Link
Lecture notes: Material available from GitHub:
Link

(typically will be updated during course)
LiteratureNorman Matloff (2011) The Art of R Programming - A tour of statistical software design.
no starch press, San Francisco. on stock at Polybuchhandlung (CHF 42.-).

More material, notably H. Wickham's "Advanced R": see my ProgRRR GitHub page.
Prerequisites / Notice: R knowledge on the same level as after *both* parts of the ETH lecture 401-6217-00L Using R for Data Analysis and Graphics.
Link

An interest in digging deeper than average R users do.

Bring your own laptop with a recent version of R installed.
401-4627-00L | Empirical Process Theory and Applications | W | 4 credits | 2V | S. van de Geer
Abstract: Empirical process theory provides a rich toolbox for studying the properties of empirical risk minimizers, such as least squares and maximum likelihood estimators, support vector machines, etc.
Objective:
Content: In this series of lectures, we will start by considering exponential inequalities, including concentration inequalities, for the deviation of averages from their mean. We furthermore present some notions from approximation theory, because this enables us to assess the modulus of continuity of empirical processes. We introduce, e.g., the Vapnik-Chervonenkis dimension: a combinatorial concept (from learning theory) of the "size" of a collection of sets or functions. As statistical applications, we study consistency and exponential inequalities for empirical risk minimizers, and asymptotic normality in semi-parametric models. We moreover examine regularization and model selection.
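A first example of such an exponential inequality is Hoeffding's bound P(|mean - mu| >= t) <= 2 exp(-2 n t^2) for i.i.d. variables in [0, 1], which can be checked against simulation (an illustrative sketch, not course material):

```python
# Hoeffding's inequality for means of i.i.d. Uniform(0, 1) variables:
# the simulated deviation frequency should stay below the bound.
import math
import random

random.seed(0)
n, t, trials = 100, 0.1, 5000
bound = 2 * math.exp(-2 * n * t ** 2)   # = 2 * exp(-2) ~ 0.2707

deviations = 0
for _ in range(trials):
    mean = sum(random.random() for _ in range(n)) / n   # mu = 0.5
    if abs(mean - 0.5) >= t:
        deviations += 1

freq = deviations / trials
print(round(bound, 4), freq)   # empirical frequency stays below the bound
```

The bound is distribution-free, which is why it is typically loose: for uniforms the true deviation probability is far smaller.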
401-4637-67L | On Hypothesis Testing | W | 4 credits | 2V | F. Balabdaoui
Abstract: This course is a review of the main results in decision theory.
Objective: The goal of this course is to present a review of the most fundamental results in statistical testing. This entails reviewing the Neyman-Pearson Lemma for simple hypotheses and the Karlin-Rubin Theorem for parametric families with monotone likelihood ratio. The students will also encounter the important concept of p-values and their use in some multiple testing situations. Further methods for constructing tests will also be presented, including likelihood ratio and chi-square tests. Some non-parametric tests will be reviewed, such as the Kolmogorov goodness-of-fit test and the two-sample Wilcoxon rank test. The most important theoretical results will be re-proved and illustrated via different examples. Four exercise sessions will be scheduled (the students will be handed an exercise sheet a week before the solutions are discussed in class).
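As a small illustration of the p-value concept mentioned above, the two-sided p-value of a z statistic can be computed from the normal tail (a sketch, not course material):

```python
# Two-sided p-value for a standard normal test statistic, via the
# complementary error function: P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
import math

def two_sided_p(z):
    """P(|Z| >= |z|) for Z ~ N(0, 1)."""
    return math.erfc(abs(z) / math.sqrt(2.0))

print(round(two_sided_p(1.96), 3))   # -> 0.05
```

Rejecting when this p-value falls below the chosen level reproduces the familiar |z| > 1.96 rule at level 5%.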
Literature:
- Statistical Inference (Casella & Berger)
- Testing Statistical Hypotheses (Lehmann and Romano)
401-3629-00L | Quantitative Risk Management | W | 4 credits | 2V + 1U | P. Cheridito
Abstract: This course introduces methods from probability theory and statistics that can be used to model financial risks. Topics addressed include loss distributions, risk measures, extreme value theory, multivariate models, copulas, dependence structures and operational risk.
Objective: The goal is to learn the most important methods from probability theory and statistics used in financial risk modeling.
Content:
1. Introduction
2. Basic Concepts in Risk Management
3. Empirical Properties of Financial Data
4. Financial Time Series
5. Extreme Value Theory
6. Multivariate Models
7. Copulas and Dependence
8. Operational Risk
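Two of the risk measures treated here, empirical Value-at-Risk and expected shortfall, can be sketched on a toy loss sample (conventions for empirical quantiles vary; this is one common choice, not course material):

```python
# Empirical VaR at level alpha: the empirical alpha-quantile of the
# losses. Expected shortfall: the average of losses at or beyond VaR.
import math

def var_es(losses, alpha):
    s = sorted(losses)
    k = math.ceil(alpha * len(s)) - 1   # index of the empirical quantile
    var = s[k]
    tail = s[k:]                        # losses at or beyond the VaR
    return var, sum(tail) / len(tail)

losses = list(range(1, 101))            # toy losses 1..100
var95, es95 = var_es(losses, 0.95)
print(var95, es95)                      # -> 95 97.5
```

Expected shortfall is always at least as large as VaR, since it averages over the tail beyond the quantile.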
Lecture notes: Course material is available on Link.
Literature: Quantitative Risk Management: Concepts, Techniques and Tools
AJ McNeil, R Frey and P Embrechts
Princeton University Press, Princeton, 2015 (Revised Edition)
Link
Prerequisites / Notice: The course corresponds to the Risk Management requirement for the SAA ("Aktuar SAV Ausbildung") as well as for the Master of Science UZH-ETH in Quantitative Finance.
261-5110-00L | Optimization for Data Science | W | 10 credits | 3V + 2U + 4A | B. Gärtner, D. Steurer, N. He
Abstract: This course provides an in-depth theoretical treatment of optimization methods that are particularly relevant in data science.
Objective: Understanding the theoretical guarantees (and their limits) of relevant optimization methods used in data science. Learning general paradigms to deal with optimization problems arising in data science.
Content: This course provides an in-depth theoretical treatment of optimization methods that are particularly relevant in machine learning and data science.

In the first part of the course, we will give a brief introduction to convex optimization, with some basic motivating examples from machine learning. Then we will analyse classical and more recent first- and second-order methods for convex optimization: gradient descent, Nesterov's accelerated method, proximal and splitting algorithms, subgradient descent, stochastic gradient descent, variance-reduced methods, Newton's method, and quasi-Newton methods. The emphasis will be on analysis techniques that occur repeatedly in convergence analyses for various classes of convex functions. We will also discuss some classical and recent theoretical results for nonconvex optimization.

In the second part, we discuss convex programming relaxations as a powerful and versatile paradigm for designing efficient algorithms to solve computational problems arising in data science. We will learn about this paradigm and develop a unified perspective on it through the lens of the sum-of-squares semidefinite programming hierarchy. As applications, we will discuss non-negative matrix factorization, compressed sensing and sparse linear regression, matrix completion and phase retrieval, as well as robust estimation.
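A building block of the proximal algorithms mentioned above is the proximal operator of the L1 norm, i.e. soft-thresholding (an illustrative sketch):

```python
# Soft-thresholding: the closed-form proximal operator of lam * |v|,
# which shrinks v toward zero by lam and clips small values to zero.
# This is the update that turns gradient descent into a lasso solver.

def soft_threshold(v, lam):
    """prox_{lam * |.|}(v)."""
    if v > lam:
        return v - lam
    if v < -lam:
        return v + lam
    return 0.0

print(soft_threshold(3.0, 1.0), soft_threshold(-0.5, 1.0))   # -> 2.0 0.0
```

The clipping to exactly zero is what makes proximal methods produce sparse solutions for L1-regularized problems.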
Prerequisites / Notice: As background, we require material taught in the course "252-0209-00L Algorithms, Probability, and Computing". It is not necessary that participants have actually taken that course, but they should be prepared to catch up if necessary.
252-0526-00L | Statistical Learning Theory | W | 8 credits | 3V + 2U + 2A | J. M. Buhmann, C. Cotrini Jimenez
Abstract: The course covers advanced methods of statistical learning:

- Variational methods and optimization.
- Deterministic annealing.
- Clustering for diverse types of data.
- Model validation by information theory.
Objective: The course surveys recent methods of statistical learning. The fundamentals of machine learning, as presented in the courses "Introduction to Machine Learning" and "Advanced Machine Learning", are expanded from the perspective of statistical learning.
Content: - Variational methods and optimization. We consider optimization approaches for problems where the optimizer is a probability distribution. We will discuss concepts like maximum entropy, information bottleneck, and deterministic annealing.

- Clustering. This is the problem of sorting data into groups without using training samples. We discuss alternative notions of "similarity" between data points and adequate optimization procedures.

- Model selection and validation. This refers to the question of how complex the chosen model should be. In particular, we present an information theoretic approach for model validation.

- Statistical physics models. We discuss approaches for approximately optimizing large systems, which originate in statistical physics (free energy minimization applied to spin glasses and other models). We also study sampling methods based on these models.
Lecture notes: A draft of a script will be provided. Lecture slides will be made available.
Literature: Hastie, Tibshirani, Friedman: The Elements of Statistical Learning, Springer, 2001.

L. Devroye, L. Györfi, and G. Lugosi: A Probabilistic Theory of Pattern Recognition. Springer, New York, 1996.
Prerequisites / Notice: Knowledge of machine learning ("Introduction to Machine Learning" and/or "Advanced Machine Learning") and basic knowledge of statistics.
227-0432-00L | Learning, Classification and Compression | W | 4 credits | 2V + 1U | E. Riegler
Abstract: The course takes a theoretical approach to learning theory and classification and gives an introduction to lossy and lossless compression for general sets and measures. We will mainly focus on a probabilistic approach, where an underlying distribution must be learned/compressed. The concepts acquired in the course are of broad and general interest in the data sciences.
Objective: After attending this lecture and participating in the exercise sessions, students will have acquired a working knowledge of learning theory, classification, and compression.
Content:
1. Learning Theory
(a) Framework of Learning
(b) Hypothesis Spaces and Target Functions
(c) Reproducing Kernel Hilbert Spaces
(d) Bias-Variance Tradeoff
(e) Estimation of Sample and Approximation Error

2. Classification
(a) Binary Classifier
(b) Support Vector Machines (separable case)
(c) Support Vector Machines (nonseparable case)
(d) Kernel Trick

3. Lossy and Lossless Compression
(a) Basics of Compression
(b) Compressed Sensing for General Sets and Measures
(c) Quantization and Rate Distortion Theory for General Sets and Measures
Lecture notes: Detailed lecture notes will be provided.
Prerequisites / Notice: This course is aimed at students with a solid background in measure theory and linear algebra and basic knowledge of functional analysis.
252-3005-00L | Natural Language Processing | W | 5 credits | 2V + 1U + 1A | R. Cotterell
Number of participants limited to 400.
Abstract: This course presents topics in natural language processing with an emphasis on modern techniques, primarily focusing on statistical and deep learning approaches. The course provides an overview of the primary areas of research in language processing as well as a detailed exploration of the models and techniques used both in research and in commercial natural language systems.
Objective: The objective of the course is to learn the basic concepts in the statistical processing of natural languages. The course will be project-oriented so that the students can also gain hands-on experience with state-of-the-art tools and techniques.
Content: This course presents an introduction to general topics and techniques used in natural language processing today, primarily focusing on statistical approaches. The course provides an overview of the primary areas of research in language processing as well as a detailed exploration of the models and techniques used both in research and in commercial natural language systems.
Literature: Jacob Eisenstein: Introduction to Natural Language Processing (Adaptive Computation and Machine Learning series)
636-0702-00L | Statistical Models in Computational Biology | W | 6 credits | 2V + 1U + 2A | N. Beerenwinkel
Abstract: The course offers an introduction to graphical models and their application to complex biological systems. Graphical models combine a statistical methodology with efficient algorithms for inference in settings of high dimension and uncertainty. The unifying graphical model framework is developed and used to examine several classical and topical computational biology methods.
Objective: The goal of this course is to establish the common language of graphical models for applications in computational biology and to see this methodology at work for several real-world data sets.
Content: Graphical models are a marriage between probability theory and graph theory. They combine the notion of probabilities with efficient algorithms for inference among many random variables. Graphical models play an important role in computational biology because they explicitly address two features that are inherent to biological systems: complexity and uncertainty. We will develop the basic theory and the common underlying formalism of graphical models and discuss several computational biology applications. Topics covered include conditional independence, Bayesian networks, Markov random fields, Gaussian graphical models, the EM algorithm, the junction tree algorithm, model selection, Dirichlet process mixtures, causality, the pair hidden Markov model for sequence alignment, probabilistic phylogenetic models, phylo-HMMs, microarray experiments and gene regulatory networks, protein interaction networks, learning from perturbation experiments, time series data and dynamic Bayesian networks. Some of the biological applications will be explored in small data analysis problems as part of the exercises.
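The "efficient algorithms for inference" point can be illustrated on the smallest possible example: a chain A -> B -> C, where the factorization P(A)P(B|A)P(C|B) lets us compute marginals by eliminating one variable at a time (all probabilities below are made up):

```python
# Variable elimination in a tiny chain Bayesian network A -> B -> C.
# Instead of summing the full joint over all 2^3 configurations, we
# eliminate A, then B, reusing intermediate tables.

p_a = {0: 0.6, 1: 0.4}
p_b_given_a = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}   # [a][b]
p_c_given_b = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.4, 1: 0.6}}   # [b][c]

# Eliminate A: P(B=b) = sum_a P(A=a) P(B=b|A=a)
p_b = {b: sum(p_a[a] * p_b_given_a[a][b] for a in p_a) for b in (0, 1)}
# Eliminate B: P(C=c) = sum_b P(B=b) P(C=c|B=b)
p_c = {c: sum(p_b[b] * p_c_given_b[b][c] for b in p_b) for c in (0, 1)}

print(round(p_c[1], 3))   # -> 0.35
```

On long chains this is linear in the number of variables, while brute-force summation over the joint is exponential; the junction tree algorithm generalizes the same idea beyond chains.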
Lecture notes: none
Literature:
- Airoldi EM (2007) Getting started in probabilistic graphical models. PLoS Comput Biol 3(12): e252. doi:10.1371/journal.pcbi.0030252
- Bishop CM. Pattern Recognition and Machine Learning. Springer, 2007.
- Durbin R, Eddy S, Krogh A, Mitchison G. Biological Sequence Analysis. Cambridge University Press, 2004
701-0104-00L | Statistical Modelling of Spatial Data | W | 3 credits | 2G | A. J. Papritz
Abstract: In the environmental sciences one often deals with spatial data. When analysing such data, the focus is either on exploring their structure (dependence on explanatory variables, autocorrelation) and/or on spatial prediction. The course provides an introduction to geostatistical methods that are useful for such analyses.
Objective: The course will provide an overview of the basic concepts and stochastic models that are used to model spatial data. In addition, participants will learn a number of geostatistical techniques and acquire familiarity with R software that is useful for analysing spatial data.
Content: After an introductory discussion of the types of problems and the kinds of data that arise in environmental research, an introduction to linear geostatistics (models: stationary and intrinsic random processes, modelling large-scale spatial patterns by linear regression, modelling autocorrelation by the variogram; kriging: mean-square prediction of spatial data) will be taught. The lectures will be complemented by data analyses that the participants have to do themselves.
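The empirical semivariogram used for modelling autocorrelation can be sketched on a regular 1-D grid (an illustration, not course material; the course itself uses R):

```python
# Empirical semivariogram gamma(h): half the average squared difference
# between observations a lag of h grid cells apart.

def semivariance(z, lag):
    diffs = [(z[i + lag] - z[i]) ** 2 for i in range(len(z) - lag)]
    return 0.5 * sum(diffs) / len(diffs)

z = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]   # a perfect linear spatial trend
print(semivariance(z, 1), semivariance(z, 2))   # -> 0.5 2.0
```

The growth of gamma(h) with the lag h is what a variogram model fits; kriging then uses the fitted model to weight neighbouring observations in prediction.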
Lecture notes: Slides, descriptions of the problems for the data analyses and solutions to them will be provided.
Literature: P. J. Diggle & P. J. Ribeiro Jr. 2007. Model-based Geostatistics. Springer.
Prerequisites / Notice: Familiarity with linear regression analysis (e.g. equivalent to the first part of the course 401-0649-00L Applied Statistical Regression) and with the software R (e.g. 401-6215-00L Using R for Data Analysis and Graphics (Part I), 401-6217-00L Using R for Data Analysis and Graphics (Part II)) is required for attending the course.
401-6222-00L | Robust and Nonlinear Regression | W | 2 credits | 1V + 1U
Does not take place this semester.
Abstract: In the first part, the basic ideas of robust fitting techniques are explained theoretically and practically using regression models and explorative multivariate analysis.

The second part addresses the challenges of fitting nonlinear regression functions and finding reliable confidence intervals.
Objective: Participants are familiar with common robust fitting methods for linear regression models as well as for exploratory multivariate analysis and are able to assess their suitability for the data at hand.

They know the challenges that arise in fitting of nonlinear regression functions, and know the difference between classical and profile based methods to determine confidence intervals.

They can apply the discussed methods in practice by using the statistics software R.
Content: Robust fitting: influence function, breakdown point, regression M-estimation, regression MM-estimation, robust inference, covariance estimation with high breakdown point, application in principal component analysis and linear discriminant analysis.

Nonlinear regression: the nonlinear regression model, estimation methods, approximate tests and confidence intervals, profile t plot, profile traces, parameter transformations, prediction and calibration
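The idea behind regression M-estimation can be glimpsed from the Huber loss: quadratic for small residuals, linear for large ones, so outliers have bounded influence (an illustrative sketch):

```python
# Huber's rho function: behaves like least squares near zero but grows
# only linearly beyond the tuning constant k, capping outlier influence.

def huber(r, k=1.345):
    """Huber loss of a residual r with tuning constant k."""
    if abs(r) <= k:
        return 0.5 * r ** 2
    return k * (abs(r) - 0.5 * k)

print(huber(0.5, 1.0), huber(3.0, 1.0))   # -> 0.125 2.5
```

With squared-error loss the second residual would contribute 4.5 instead of 2.5; the flattened tails are what makes M-estimators robust.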
Lecture notes: Lecture notes are available.
Prerequisites / Notice: This is a block course held on three Mondays in June.
401-8618-00L | Statistical Methods in Epidemiology (University of Zurich) | W | 5 credits | 3G | University lecturers
No enrolment in this course at ETH Zurich. Book the corresponding module directly at UZH.
UZH module code: STA408

Mind the enrolment deadlines at UZH:
Link
Abstract: Analysis of case-control and cohort studies. The most relevant measures of effect (odds and rate ratios) are introduced, and methods for adjusting for confounders (Mantel-Haenszel, regression) are thoroughly discussed. Advanced topics such as measurement error and propensity score adjustments are also covered. We will outline statistical methods for case-crossover and case series studies etc.
Objective:
401-4626-00L | Advanced Statistical Modelling: Mixed Models | W | 4 credits | 2V | M. Mächler
Does not take place this semester.
Abstract: Mixed models = (generalized / nonlinear / linear) mixed-effects models extend traditional regression models by adding "random effect" terms.

In applications, such models are called "hierarchical models", "repeated measures" or "split-plot designs". Mixed models are widely used and appropriate in an era of complex data measured on living creatures, from biology to the human sciences.
Objective:
- Becoming aware of how mixed models are more realistic and more powerful in many cases than traditional ("fixed-effects only") regression models.

- Learning to fit such models to data correctly, critically interpreting the results of such model fits, and hence learning to work the creative cycle of responsible statistical data analysis:
"fit -> interpret & diagnose -> modify the fit -> interpret & ...."

- Becoming aware of computational and methodological limitations of these models, even when using state-of-the-art software.
Content: The lecture will build on various examples and use R, notably the `lme4` package, to illustrate concepts. The relevant R scripts are made available online.

Inference (significance of factors, confidence intervals) will focus on the more realistic *un*balanced situation where classical (ANOVA, sum of squares etc) methods are known to be deficient. Hence, Maximum Likelihood (ML) and its variant, "REML", will be used for estimation and inference.
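The qualitative difference from fixed-effects fits can be seen in the BLUP shrinkage of a random intercept (a sketch with variances assumed known; software such as lme4 estimates them by ML/REML):

```python
# Random-intercept prediction in a mixed model: the group mean is not
# used as-is (as a fixed effect would be) but shrunk toward the overall
# mean by a factor depending on the variance components.

def shrinkage_factor(sigma2_b, sigma2_e, n_i):
    """BLUP shrinkage for a random intercept with n_i observations:
    sigma2_b / (sigma2_b + sigma2_e / n_i)."""
    return sigma2_b / (sigma2_b + sigma2_e / n_i)

# Between-group variance 1, residual variance 1, 4 observations per group:
print(shrinkage_factor(1.0, 1.0, 4))   # -> 0.8
```

Small or noisy groups are shrunk more strongly, which is exactly why mixed models behave better than fixed-effects fits in unbalanced designs.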
Lecture notes: We will work with an unfinished book proposal by Prof. Douglas Bates, Wisconsin, USA, which itself is a mixture of theory and worked R code examples.

These lecture notes and all R scripts are made available from
Link
Literature: (see web page and lecture notes)
Prerequisites / Notice: - We assume a good working knowledge of multiple linear regression ("the general linear model") and an intermediate (not beginner's) knowledge of model-based statistics (estimation, confidence intervals, ...).

Typically this means at least two classes of (math-based) statistics, say
1. Intro to probability and statistics
2. (Applied) regression, including matrix-vector notation Y = Xb + E

- Basic (1 semester) "Matrix calculus" / linear algebra is also assumed.

- If familiarity with [R](Link) is not given, it should be acquired during the course (by the student, on their own initiative).
401-8628-00LSurvival Analysis (University of Zurich)
No enrolment to this course at ETH Zurich. Book the corresponding module directly at UZH.
UZH Module Code: STA425

Mind the enrolment deadlines at UZH:
Link
W3 credits1.5GUniversity lecturers
AbstractThe analysis of survival times, or in more general terms, the analysis of time-to-event variables, is concerned with models for censored observations. Because we cannot always wait until the event of interest actually happens, the methods discussed here are required for an appropriate handling of incomplete observations where we only know that the event of interest did not happen within ...
Objective
ContentDuring the course, we will study the most important methods and models for censored data, including
- general concepts of censoring,
- simple summary statistics,
- estimation of survival curves,
- frequentist inference for two and more groups, and
- regression models for censored observations.
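The estimation of survival curves can be made concrete with a minimal, self-contained Python sketch (illustrative only, not course material) of the Kaplan-Meier estimator for right-censored data.

```python
# Kaplan-Meier estimator from right-censored observations (time, event),
# where event = 1 if the event was observed and 0 if the subject was censored.
def kaplan_meier(data):
    """Return [(t, S(t))] at each observed event time."""
    data = sorted(data)
    n_at_risk = len(data)
    surv, curve = 1.0, []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = sum(1 for (u, e) in data if u == t and e == 1)
        removed = sum(1 for (u, e) in data if u == t)
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk   # multiply by survival fraction
            curve.append((t, surv))
        n_at_risk -= removed                    # deaths and censorings leave risk set
        while i < len(data) and data[i][0] == t:
            i += 1
    return curve

# 6 subjects; '+' marks censoring in the usual notation: 2, 3+, 4, 5+, 6, 7
obs = [(2, 1), (3, 0), (4, 1), (5, 0), (6, 1), (7, 1)]
print(kaplan_meier(obs))
```

Note how the censored times (3+ and 5+) do not drop the curve but still shrink the risk set, which is exactly the "appropriate handling of incomplete observations" the abstract refers to.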
227-0434-10LMathematics of Information Information W8 credits3V + 2U + 2AH. Bölcskei
AbstractThe class focuses on mathematical aspects of

1. Information science: Sampling theorems, frame theory, compressed sensing, sparsity, super-resolution, spectrum-blind sampling, subspace algorithms, dimensionality reduction

2. Learning theory: Approximation theory, greedy algorithms, uniform laws of large numbers, Rademacher complexity, Vapnik-Chervonenkis dimension
ObjectiveThe aim of the class is to familiarize the students with the most commonly used mathematical theories in data science, high-dimensional data analysis, and learning theory. The class consists of the lecture, exercise sessions with homework problems, and of a research project, which can be carried out either individually or in groups. The research project consists of either 1. software development for the solution of a practical signal processing or machine learning problem or 2. the analysis of a research paper or 3. a theoretical research problem of suitable complexity. Students are welcome to propose their own project at the beginning of the semester. The outcomes of all projects have to be presented to the entire class at the end of the semester.
ContentMathematics of Information

1. Signal representations: Frame theory, wavelets, Gabor expansions, sampling theorems, density theorems

2. Sparsity and compressed sensing: Sparse linear models, uncertainty relations in sparse signal recovery, super-resolution, spectrum-blind sampling, subspace algorithms (ESPRIT), estimation in the high-dimensional noisy case, Lasso

3. Dimensionality reduction: Random projections, the Johnson-Lindenstrauss Lemma

Mathematics of Learning

4. Approximation theory: Nonlinear approximation theory, best M-term approximation, greedy algorithms, fundamental limits on compressibility of signal classes, Kolmogorov-Tikhomirov epsilon-entropy of signal classes, optimal compression of signal classes

5. Uniform laws of large numbers: Rademacher complexity, Vapnik-Chervonenkis dimension, classes with polynomial discrimination
Lecture notesDetailed lecture notes will be provided at the beginning of the semester.
Prerequisites / NoticeThis course is aimed at students with a background in basic linear algebra, analysis, statistics, and probability.

We encourage students who are interested in mathematical data science to take both this course and "401-4944-20L Mathematics of Data Science" by Prof. A. Bandeira. The two courses are designed to be complementary.

H. Bölcskei and A. Bandeira
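As a flavour of topic 3 (dimensionality reduction), the following illustrative NumPy sketch (not course material) checks the Johnson-Lindenstrauss phenomenon empirically: a scaled random Gaussian projection approximately preserves pairwise distances.

```python
# Random projection from d = 1000 down to k = 300 dimensions; by the
# Johnson-Lindenstrauss lemma, pairwise distance ratios stay close to 1
# with high probability.
import numpy as np

rng = np.random.default_rng(0)
d, k, n = 1000, 300, 20
X = rng.standard_normal((n, d))                # n points in R^d
P = rng.standard_normal((k, d)) / np.sqrt(k)   # scaled Gaussian projection
Y = X @ P.T                                    # projected points in R^k

ratios = []
for i in range(n):
    for j in range(i + 1, n):
        ratios.append(np.linalg.norm(Y[i] - Y[j]) / np.linalg.norm(X[i] - X[j]))
print(min(ratios), max(ratios))   # both close to 1
```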
401-4944-20LMathematics of Data Science
Does not take place this semester.
W8 credits4GA. Bandeira
AbstractMostly self-contained, but fast-paced, introductory masters level course on various theoretical aspects of algorithms that aim to extract information from data.
ObjectiveIntroduction to various mathematical aspects of Data Science.
ContentThese topics lie in overlaps of (Applied) Mathematics with: Computer Science, Electrical Engineering, Statistics, and/or Operations Research. Each lecture will feature a couple of Mathematical Open Problems related to Data Science. The main mathematical tools used will be Probability and Linear Algebra, and a basic familiarity with these subjects is required. There will also be some Graph Theory, Representation Theory, and Applied Harmonic Analysis, among others (although knowledge of these tools is not assumed). The topics treated will include Dimension reduction, Manifold learning, Sparse recovery, Random Matrices, Approximation Algorithms, Community detection in graphs, and several others.
Lecture notesLink
Prerequisites / NoticeThe main mathematical tools used will be Probability, Linear Algebra (and real analysis), and a working knowledge of these subjects is required. In addition to these prerequisites, this class requires a certain degree of mathematical maturity, including abstract thinking and the ability to understand and write proofs.


We encourage students who are interested in mathematical data science to take both this course and "227-0434-10L Mathematics of Information" taught by Prof. H. Bölcskei. The two courses are designed to be complementary.
A. Bandeira and H. Bölcskei
263-5300-00LGuarantees for Machine Learning Information Restricted registration - show details
Number of participants limited to 30.

Last cancellation/deregistration date for this graded semester performance: 17 March 2021! Please note that after that date no deregistration will be accepted and a "no show" will appear on your transcript.
W7 credits3G + 3AF. Yang
AbstractThis course is aimed at advanced master and doctorate students who want to conduct independent research on theory for modern machine learning (ML). It teaches classical and recent methods in statistical learning theory commonly used to prove theoretical guarantees for ML algorithms. The knowledge is then applied in independent project work that focuses on understanding modern ML phenomena.
ObjectiveLearning objectives:

- acquire enough mathematical background to understand a good fraction of theory papers published in the typical ML venues. For this purpose, students will learn common mathematical techniques from statistics and optimization in the first part of the course and apply this knowledge in the project work
- critically examine recently published work in terms of relevance and determine impactful (novel) research problems. This will be an integral part of the project work and involves experimental as well as theoretical questions
- find and outline an approach (some subproblem) to prove a conjectured theorem. This will be practiced in lectures/exercises and homework, and potentially in the final project.
- effectively communicate and present the problem motivation, new insights and results to a technical audience. This will be primarily learned via the final presentation and report as well as during peer-grading of peer talks.
ContentThis course covers foundational methods in statistical learning theory aimed at proving theoretical guarantees for machine learning algorithms, touching on the following topics:
- concentration bounds
- uniform convergence and empirical process theory
- high-dimensional statistics (e.g. sparsity)
- regularization for non-parametric statistics (e.g. in RKHS, neural networks)
- implicit regularization via gradient descent (e.g. margins, early stopping)
- minimax lower bounds

The project work focuses on current theoretical ML research that aims to understand modern phenomena in machine learning, including but not limited to
- how overparameterization could help generalization (RKHS, NN)
- how overparameterization could help optimization (non-convex optimization, loss landscape)
- complexity measures and approximation-theoretic properties of randomly initialized and trained NN
- generalization of robust learning (adversarial robustness, standard and robust error tradeoff, distribution shift)
Prerequisites / NoticeIt is absolutely necessary for students to have a strong mathematical background (basic real analysis, probability theory, linear algebra) and good knowledge of core concepts in machine learning taught in courses such as “Introduction to Machine Learning”, “Regression”/“Statistical Modelling”. In addition to these prerequisites, this class requires a high degree of mathematical maturity, including abstract thinking and the ability to understand and write proofs.

Students have usually taken a subset of Fundamentals of Mathematical Statistics, Probabilistic AI, Neural Network Theory, Optimization for Data Science, Advanced ML, Statistical Learning Theory, Probability Theory (D-MATH)
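The first Content topic (concentration bounds) can be previewed with a small illustrative Python sketch (not course material): Hoeffding's inequality gives P(|mean - 1/2| >= t) <= 2 exp(-2 n t^2) for n i.i.d. Bernoulli(1/2) variables, and the bound can be checked against the empirical tail frequency.

```python
# Empirical check of Hoeffding's inequality for Bernoulli(1/2) sample means.
import math
import random

random.seed(0)
n, t, trials = 200, 0.1, 5000
hoeffding = 2 * math.exp(-2 * n * t ** 2)     # theoretical upper bound

exceed = 0
for _ in range(trials):
    mean = sum(random.random() < 0.5 for _ in range(n)) / n
    if abs(mean - 0.5) >= t:
        exceed += 1
freq = exceed / trials                         # empirical tail probability
print(freq, "<=", round(hoeffding, 4))
```

The empirical frequency is in fact far below the bound, illustrating that Hoeffding is distribution-free and therefore conservative for any particular distribution.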
401-6102-00LMultivariate Statistics
Does not take place this semester.
W4 credits2Gnot available
AbstractMultivariate Statistics deals with joint distributions of several random variables. This course introduces the basic concepts and provides an overview over classical and modern methods of multivariate statistics. We will consider the theory behind the methods as well as their applications.
ObjectiveAfter the course, you should be able to:
- describe the various methods and the concepts and theory behind them
- identify adequate methods for a given statistical problem
- use the statistical software "R" to efficiently apply these methods
- interpret the output of these methods
ContentVisualization / Principal component analysis / Multidimensional scaling / The multivariate Normal distribution / Factor analysis / Supervised learning / Cluster analysis
Lecture notesNone
LiteratureThe course will be based on class notes and books that are available electronically via the ETH library.
Prerequisites / NoticeTarget audience: This course is the more theoretical version of "Applied Multivariate Statistics" (401-0102-00L) and is targeted at students with a math background.

Prerequisite: A basic course in probability and statistics.

Note: The courses 401-0102-00L and 401-6102-00L are mutually exclusive. You may register for at most one of these two course units.
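The principal component analysis listed in the Content can be sketched in a few lines; the following is an illustrative NumPy version via eigendecomposition of the sample covariance matrix (the course itself uses R, where one would call, e.g., `prcomp`).

```python
# PCA on correlated 2-d data whose variance concentrates along (1, 1)/sqrt(2).
import numpy as np

rng = np.random.default_rng(0)
z = rng.standard_normal(500)
X = np.column_stack([z + 0.1 * rng.standard_normal(500),
                     z + 0.1 * rng.standard_normal(500)])

Xc = X - X.mean(axis=0)                 # center the data
cov = Xc.T @ Xc / (len(X) - 1)          # sample covariance matrix
eigval, eigvec = np.linalg.eigh(cov)    # eigenvalues in ascending order
pc1 = eigvec[:, -1]                     # first principal component
explained = eigval[-1] / eigval.sum()   # fraction of variance explained
print(pc1, round(explained, 3))
```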
Free Electives
Several further courses offered at the University of Zurich belong to the curriculum of the Master's Programme in Statistics. With the consent by the Advisor (Link) such a course is eligible as a free elective.
» Course Catalogue
Master Studies (Programme Regulations 2014)
Core Courses
In each subject area, the core courses offered are normally mathematical as well as application-oriented in content. For each subject area, only one of these is recognised for the Master degree.
Regression
No offering in this semester (401-3622-00L Statistical Modelling is offered in the autumn semester).
Analysis of Variance and Design of Experiments
No offering in this semester
Multivariate Statistics
NumberTitleTypeECTSHoursLecturers
401-6102-00LMultivariate Statistics
Does not take place this semester.
W4 credits2Gnot available
AbstractMultivariate Statistics deals with joint distributions of several random variables. This course introduces the basic concepts and provides an overview over classical and modern methods of multivariate statistics. We will consider the theory behind the methods as well as their applications.
ObjectiveAfter the course, you should be able to:
- describe the various methods and the concepts and theory behind them
- identify adequate methods for a given statistical problem
- use the statistical software "R" to efficiently apply these methods
- interpret the output of these methods
ContentVisualization / Principal component analysis / Multidimensional scaling / The multivariate Normal distribution / Factor analysis / Supervised learning / Cluster analysis
Lecture notesNone
LiteratureThe course will be based on class notes and books that are available electronically via the ETH library.
Prerequisites / NoticeTarget audience: This course is the more theoretical version of "Applied Multivariate Statistics" (401-0102-00L) and is targeted at students with a math background.

Prerequisite: A basic course in probability and statistics.

Note: The courses 401-0102-00L and 401-6102-00L are mutually exclusive. You may register for at most one of these two course units.
401-0102-00LApplied Multivariate StatisticsW5 credits2V + 1UF. Sigrist
AbstractMultivariate statistics analyzes data on several random variables simultaneously. This course introduces the basic concepts and provides an overview of classical and modern methods of multivariate statistics, including visualization, dimension reduction, and supervised and unsupervised learning for multivariate data. The emphasis is on applications and on solving problems with the statistical software R.
ObjectiveAfter the course, you are able to:
- describe the various methods and the concepts behind them
- identify adequate methods for a given statistical problem
- use the statistical software R to efficiently apply these methods
- interpret the output of these methods
ContentVisualization, multivariate outliers, the multivariate normal distribution, dimension reduction, principal component analysis, multidimensional scaling, factor analysis, cluster analysis, classification, multivariate tests and multiple testing
Lecture notesNone
Literature1) "An Introduction to Applied Multivariate Analysis with R" (2011) by Everitt and Hothorn
2) "An Introduction to Statistical Learning: With Applications in R" (2013) by Gareth, Witten, Hastie and Tibshirani

Electronic versions (pdf) of both books can be downloaded for free from the ETH library.
Prerequisites / NoticeThis course is targeted at students with a non-math background.

Requirements:
==========
1) Introductory course in statistics (min: t-test, regression; ideal: conditional probability, multiple regression)
2) Good understanding of R (if you don't know R, it is recommended that you study chapters 1-5 of "Introductory Statistics with R" by Peter Dalgaard, which is freely available online from the ETH library)

An alternative course with more emphasis on theory is 401-6102-00L "Multivariate Statistics" (only every second year).

401-0102-00L and 401-6102-00L are mutually exclusive. You can register for only one of these two courses.
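As a small taste of the cluster analysis topic, here is an illustrative pure-Python sketch of Lloyd's k-means algorithm on two well-separated 1-d groups (in R one would simply call `kmeans()`).

```python
# Lloyd's k-means with k = 2 on two well-separated 1-d clusters.
import random

random.seed(0)
data = [random.gauss(0.0, 0.5) for _ in range(50)] + \
       [random.gauss(5.0, 0.5) for _ in range(50)]

centers = [min(data), max(data)]             # crude but safe initialisation
for _ in range(10):                          # Lloyd iterations
    clusters = [[], []]
    for x in data:
        # bool indexes as 0/1: False -> closer to centers[0]
        clusters[abs(x - centers[0]) > abs(x - centers[1])].append(x)
    centers = [sum(c) / len(c) for c in clusters]
print([round(c, 2) for c in centers])        # near the true means 0 and 5
```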
Time Series and Stochastic Processes
NumberTitleTypeECTSHoursLecturers
401-6624-11LApplied Time SeriesW5 credits2V + 1UM. Dettling
AbstractThe course starts with an introduction to time series analysis (examples, goal, mathematical notation). In the following, descriptive techniques, modeling and prediction as well as advanced topics will be covered.
ObjectiveGetting to know the mathematical properties of time series, as well as the requirements, descriptive techniques, models, advanced methods and software that are necessary such that the student can independently run an applied time series analysis.
ContentThe course starts with an introduction to time series analysis that comprises examples and goals. We continue with notation and descriptive analysis of time series. A major part of the course will be dedicated to modeling and forecasting of time series using the flexible class of ARMA models. More advanced topics that will be covered thereafter are time series regression, time series classification and spectral analysis.
Lecture notesA script will be available.
Prerequisites / Notice
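The ARMA modelling theme can be previewed with a minimal illustrative sketch (the course itself works in R): simulate an AR(1) process and recover its coefficient from the lag-1 sample autocorrelation.

```python
# Simulate x_t = phi * x_{t-1} + e_t and estimate phi from the data.
import random

random.seed(0)
phi, n = 0.7, 20000
x, xs = 0.0, []
for _ in range(n):
    x = phi * x + random.gauss(0.0, 1.0)   # AR(1) recursion with N(0,1) noise
    xs.append(x)

mean = sum(xs) / n
num = sum((xs[t] - mean) * (xs[t - 1] - mean) for t in range(1, n))
den = sum((v - mean) ** 2 for v in xs)
phi_hat = num / den                        # lag-1 sample autocorrelation
print(round(phi_hat, 2))                   # close to the true phi = 0.7
```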
Mathematical Statistics
No offering in this semester
Specialization Areas and Electives
Statistical and Mathematical Courses
NumberTitleTypeECTSHoursLecturers
401-4632-15LCausality Information W4 credits2GC. Heinze-Deml
AbstractIn statistics, we are used to searching for the best predictors of some random variable. In many situations, however, we are interested in predicting a system's behavior under manipulations. For such an analysis, we require knowledge about the underlying causal structure of the system. In this course, we study concepts and theory behind causal inference.
ObjectiveAfter this course, you should be able to
- understand the language and concepts of causal inference
- know the assumptions under which one can infer causal relations from observational and/or interventional data
- describe and apply different methods for causal structure learning
- given data and a causal structure, derive causal effects and predictions of interventional experiments
Prerequisites / NoticePrerequisites: basic knowledge of probability theory and regression
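The role of causal structure can be previewed with an illustrative NumPy sketch (not course material): with a confounder Z influencing both X and Y, regressing Y on X alone gives a biased coefficient, while adjusting for Z recovers the causal effect (here the true effect of X on Y is 1).

```python
# Confounded system: Z -> X, Z -> Y, and X -> Y with causal effect 1.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
z = rng.standard_normal(n)                 # confounder
x = z + rng.standard_normal(n)             # Z -> X
y = x + 2 * z + rng.standard_normal(n)     # X -> Y and Z -> Y

naive = np.polyfit(x, y, 1)[0]             # regress Y on X only: biased (~2)
D = np.column_stack([x, z, np.ones(n)])
adjusted = np.linalg.lstsq(D, y, rcond=None)[0][0]  # adjust for Z: (~1)
print(round(naive, 2), round(adjusted, 2))
```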
401-4627-00LEmpirical Process Theory and ApplicationsW4 credits2VS. van de Geer
AbstractEmpirical process theory provides a rich toolbox for studying the properties of empirical risk minimizers, such as least squares and maximum likelihood estimators, support vector machines, etc.
Objective
ContentIn this series of lectures, we will start with considering exponential inequalities, including concentration inequalities, for the deviation of averages from their mean. We furthermore present some notions from approximation theory, because this enables us to assess the modulus of continuity of empirical processes. We introduce, e.g., the Vapnik-Chervonenkis dimension: a combinatorial concept (from learning theory) of the "size" of a collection of sets or functions. As statistical applications, we study consistency and exponential inequalities for empirical risk minimizers, and asymptotic normality in semi-parametric models. We moreover examine regularization and model selection.
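The uniform-deviation theme can be previewed with a tiny illustrative sketch (not course material): over a finite class of indicator functions, the largest gap between empirical and true means shrinks as the sample grows.

```python
# Finite class of indicators f_a(x) = 1{x <= a} evaluated on Uniform(0,1)
# data, for which the true mean of f_a is exactly a.
import random

random.seed(0)
thresholds = [i / 10 for i in range(1, 10)]   # finite function class

def sup_dev(n):
    xs = [random.random() for _ in range(n)]
    return max(abs(sum(x <= a for x in xs) / n - a) for a in thresholds)

small, large = sup_dev(100), sup_dev(100_000)
print(round(small, 3), round(large, 3))       # the supremum shrinks with n
```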
401-4637-67LOn Hypothesis TestingW4 credits2VF. Balabdaoui
AbstractThis course is a review of the main results in decision theory.
ObjectiveThe goal of this course is to present a review of the most fundamental results in statistical testing. This entails reviewing the Neyman-Pearson Lemma for simple hypotheses and the Karlin-Rubin Theorem for monotone likelihood ratio parametric families. The students will also encounter the important concept of p-values and their use in some multiple testing situations. Further methods for constructing tests will also be presented, including likelihood ratio and chi-square tests. Some non-parametric tests will be reviewed, such as the Kolmogorov goodness-of-fit test and the two-sample Wilcoxon rank test. The most important theoretical results will be re-proved and also illustrated via different examples. Four sessions of exercises will be scheduled (the students will be handed an exercise sheet a week before discussing solutions in class).
Literature- Statistical Inference (Casella & Berger)
- Testing Statistical Hypotheses (Lehmann & Romano)
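The p-value concept can be made concrete with a small stdlib-only Python sketch (illustrative, not course material): the two-sided p-value of a one-sample z-test with known variance, using the standard normal CDF.

```python
# Two-sided p-value of a one-sample z-test for H0: mu = mu0, known sigma.
from math import erf, sqrt

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def z_test_p(sample_mean, mu0, sigma, n):
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    return 2.0 * (1.0 - norm_cdf(abs(z)))

# n = 25 observations with sample mean 0.49, testing H0: mu = 0 with sigma = 1
p = z_test_p(0.49, 0.0, 1.0, 25)
print(round(p, 3))   # z = 2.45, so H0 is rejected at the 5% level
```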
401-3632-00LComputational StatisticsW8 credits3V + 1UM. Mächler
AbstractWe discuss modern statistical methods for data analysis, including methods for data exploration, prediction and inference. We pay attention to algorithmic aspects, theoretical properties and practical considerations. The class is hands-on and methods are applied using the statistical programming language R.
ObjectiveThe student obtains an overview of modern statistical methods for data analysis, including their algorithmic aspects and theoretical properties. The methods are applied using the statistical programming language R.
ContentSee the class website
Prerequisites / NoticeAt least one semester of (basic) probability and statistics.

Programming experience is helpful but not required.
401-3602-00LApplied Stochastic Processes Information W8 credits3V + 1UV. Tassion
AbstractPoisson processes; renewal processes; Markov chains in discrete and in continuous time; some applications.
ObjectiveStochastic processes are a way to describe and study the behaviour of systems that evolve in some random way. In this course, the evolution will be with respect to a scalar parameter interpreted as time, so that we discuss the temporal evolution of the system. We present several classes of stochastic processes, analyse their properties and behaviour and show by some examples how they can be used. The main emphasis is on theory; in that sense, "applied" should be understood to mean "applicable".
LiteratureR. N. Bhattacharya and E. C. Waymire, "Stochastic Processes with Applications", SIAM (2009), available online: Link
R. Durrett, "Essentials of Stochastic Processes", Springer (2012), available online: Link
M. Lefebvre, "Applied Stochastic Processes", Springer (2007), available online: Link
S. I. Resnick, "Adventures in Stochastic Processes", Birkhäuser (2005)
Prerequisites / NoticePrerequisites are familiarity with (measure-theoretic) probability theory as it is treated in the course "Probability Theory" (401-3601-00L).
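The Poisson process from the Abstract can be previewed with a short illustrative sketch (not course material): interarrival times are i.i.d. exponential, so the number of points in [0, T] averages about lambda * T.

```python
# Homogeneous Poisson process of rate lam, simulated via exponential
# interarrival times; the count on [0, T] has mean lam * T.
import random

random.seed(0)
lam, T, runs = 2.0, 10.0, 2000

def poisson_count(lam, T):
    t, count = 0.0, 0
    while True:
        t += random.expovariate(lam)   # exponential interarrival time
        if t > T:
            return count
        count += 1

avg = sum(poisson_count(lam, T) for _ in range(runs)) / runs
print(round(avg, 1))   # close to lam * T = 20
```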
401-3642-00LBrownian Motion and Stochastic Calculus Information W10 credits4V + 1UW. Werner
AbstractThis course covers some basic objects of stochastic analysis. In particular, the following topics are discussed: construction and properties of Brownian motion, stochastic integration, Ito's formula and applications, stochastic differential equations and connection with partial differential equations.
ObjectiveThis course covers some basic objects of stochastic analysis. In particular, the following topics are discussed: construction and properties of Brownian motion, stochastic integration, Ito's formula and applications, stochastic differential equations and connection with partial differential equations.
Lecture notesLecture notes will be distributed in class.
Literature- J.-F. Le Gall, Brownian Motion, Martingales, and Stochastic Calculus, Springer (2016).
- I. Karatzas, S. Shreve, Brownian Motion and Stochastic Calculus, Springer (1991).
- D. Revuz, M. Yor, Continuous Martingales and Brownian Motion, Springer (2005).
- L.C.G. Rogers, D. Williams, Diffusions, Markov Processes and Martingales, vol. 1 and 2, Cambridge University Press (2000).
- D.W. Stroock, S.R.S. Varadhan, Multidimensional Diffusion Processes, Springer (2006).
Prerequisites / NoticeFamiliarity with measure-theoretic probability as in the standard D-MATH course "Probability Theory" will be assumed. Textbook accounts can be found for example in
- J. Jacod, P. Protter, Probability Essentials, Springer (2004).
- R. Durrett, Probability: Theory and Examples, Cambridge University Press (2010).
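Two basic facts treated in the course, W(1) ~ N(0, 1) and the quadratic variation of Brownian paths on [0, t] being t, can be checked numerically with an illustrative stdlib-only sketch (not course material).

```python
# Discretised Brownian paths on [0, 1]: increments dw ~ N(0, dt).
import random

random.seed(0)
n, paths = 500, 1000
dt = 1.0 / n

endpoints, qv = [], 0.0
for _ in range(paths):
    w, q = 0.0, 0.0
    for _ in range(n):
        dw = random.gauss(0.0, dt ** 0.5)
        w += dw
        q += dw * dw                    # quadratic variation increment
    endpoints.append(w)
    qv += q / paths                     # average quadratic variation

var_w1 = sum(e * e for e in endpoints) / paths   # sample variance of W(1)
print(round(var_w1, 2), round(qv, 2))            # both close to 1
```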
401-6228-00LProgramming with R for Reproducible Research Information W1 credit1GM. Mächler
AbstractDeeper understanding of R: Function calls, rather than "commands".
Reproducible research and data analysis via Sweave and Rmarkdown.
Limits of floating point arithmetic.
Understanding how functions work. Environments, packages, namespaces.
Closures, i.e., functions returning functions.
Lists and [mc]lapply() for easy parallelization.
Performance measurement and improvements.
ObjectiveLearn to understand R as a (very versatile and flexible) programming language and learn about some of its lower level functionalities which are needed to understand *why* R works the way it does.
ContentSee "Skript": Link
Lecture notesMaterial available from Github
Link

(typically will be updated during course)
LiteratureNorman Matloff (2011) The Art of R Programming: A Tour of Statistical Software Design.
No Starch Press, San Francisco. In stock at Polybuchhandlung (CHF 42.-).

More material, notably H. Wickham's "Advanced R": see my ProgRRR github page.
Prerequisites / NoticeR knowledge at the same level as after *both* parts of the ETH lecture
401-6217-00L Using R for Data Analysis and Graphics
Link

An interest in digging deeper than average R users do.

Bring your own laptop with a recent version of R installed.
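The "closures, i.e., functions returning functions" item from the Abstract, shown here as an analogous Python sketch (in R this would be, e.g., `makePow <- function(p) function(x) x^p`):

```python
# A closure: the inner function captures p from the enclosing scope.
def make_pow(p):
    def f(x):
        return x ** p      # p lives on in f's closure after make_pow returns
    return f

square, cube = make_pow(2), make_pow(3)
print(square(4), cube(2))  # 16 8
```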
401-3629-00LQuantitative Risk Management Information W4 credits2V + 1UP. Cheridito
AbstractThis course introduces methods from probability theory and statistics that can be used to model financial risks. Topics addressed include loss distributions, risk measures, extreme value theory, multivariate models, copulas, dependence structures and operational risk.
ObjectiveThe goal is to learn the most important methods from probability theory and statistics used in financial risk modeling.
Content1. Introduction
2. Basic Concepts in Risk Management
3. Empirical Properties of Financial Data
4. Financial Time Series
5. Extreme Value Theory
6. Multivariate Models
7. Copulas and Dependence
8. Operational Risk
Lecture notesCourse material is available on Link
LiteratureQuantitative Risk Management: Concepts, Techniques and Tools
AJ McNeil, R Frey and P Embrechts
Princeton University Press, Princeton, 2015 (Revised Edition)
Link
Prerequisites / NoticeThe course corresponds to the Risk Management requirement for the SAA ("Aktuar SAV Ausbildung") as well as for the Master of Science UZH-ETH in Quantitative Finance.
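Two of the basic risk measures from chapter 2 can be illustrated with a tiny Python sketch (illustrative only; conventions vary across texts, and here VaR is taken as an upper empirical quantile of the loss sample, with expected shortfall the mean loss beyond it).

```python
# Empirical Value-at-Risk and expected shortfall of a loss sample.
def var_es(losses, alpha=0.95):
    losses = sorted(losses)
    k = int(alpha * len(losses))
    var = losses[k]                    # empirical alpha-quantile of losses
    tail = losses[k:]
    es = sum(tail) / len(tail)         # mean loss beyond VaR
    return var, es

losses = [-1.0, 0.5, 1.2, 2.0, 0.1, 3.5, -0.4, 0.9, 1.5, 4.0]
v, e = var_es(losses, alpha=0.8)
print(v, e)                            # 3.5 and 3.75 for this sample
```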
401-4658-00LComputational Methods for Quantitative Finance: PDE Methods Information Restricted registration - show details W6 credits3V + 1UC. Marcati, A. Stein
AbstractIntroduction to principal methods of option pricing, with emphasis on PDE-based methods. Prerequisites: MATLAB and Python programming and knowledge of numerical mathematics at ETH BSc level.
ObjectiveIntroduce the main methods for efficient numerical valuation of derivative contracts in Black-Scholes markets as well as in markets that are incomplete due to Lévy processes or stochastic volatility models. Develop implementations of pricing methods in MATLAB and Python. Finite-difference and finite-element based methods for the solution of the pricing integro-differential equation.
Content1. Review of option pricing. Wiener and Lévy price process models. Deterministic, local and stochastic volatility models.
2. Finite difference methods for option pricing. Relation to bi- and multinomial trees. European contracts.
3. Finite difference methods for Asian, American and barrier-type contracts.
4. Finite element methods for European and American style contracts.
5. Pricing under local and stochastic volatility in Black-Scholes markets.
6. Finite element methods for option pricing under Lévy processes. Treatment of integro-differential operators.
7. Stochastic volatility models for Lévy processes.
8. Techniques for multidimensional problems. Baskets in a Black-Scholes setting and stochastic volatility models in Black-Scholes and Lévy markets.
9. Introduction to sparse grid option pricing techniques.
Lecture notesThere will be English lecture notes as well as MATLAB or Python software for registered participants in the course.
LiteratureMain reference (course text):
N. Hilber, O. Reichmann, Ch. Schwab and Ch. Winter: Computational Methods for Quantitative Finance, Springer Finance, Springer, 2013.

Supplementary texts:
R. Cont and P. Tankov : Financial Modelling with Jump Processes, Chapman and Hall Publ. 2004.

Y. Achdou and O. Pironneau : Computational Methods for Option Pricing, SIAM Frontiers in Applied Mathematics, SIAM Publishers, Philadelphia 2005.

D. Lamberton and B. Lapeyre : Introduction to Stochastic Calculus Applied to Finance (second edition), Chapman & Hall/CRC Financial Mathematics Series, Taylor & Francis Publ. Boca Raton, London, New York 2008.

J.-P. Fouque, G. Papanicolaou and K.-R. Sircar : Derivatives in Financial Markets with Stochastic Volatility, Cambridge University Press, Cambridge, 2000.
Prerequisites / NoticeKnowledge of Numerical Analysis/ Scientific Computing Techniques
corresponding roughly to BSc MATH or BSc RW/CSE at ETH is expected.
Basic programming skills in MATLAB or Python are required for the exercises,
and are _not_ taught in this course.
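Topic 2's "relation to bi- and multinomial trees" can be previewed with an illustrative stdlib-only Python sketch (not course material): a Cox-Ross-Rubinstein binomial tree for a European call, converging to the closed-form Black-Scholes price as the number of steps grows.

```python
# CRR binomial tree vs. the Black-Scholes formula for a European call.
from math import exp, log, sqrt, erf

def bs_call(S, K, r, sigma, T):
    N = lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0)))   # standard normal CDF
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * N(d1) - K * exp(-r * T) * N(d2)

def crr_call(S, K, r, sigma, T, n):
    dt = T / n
    u, d = exp(sigma * sqrt(dt)), exp(-sigma * sqrt(dt))
    p = (exp(r * dt) - d) / (u - d)          # risk-neutral up-probability
    disc = exp(-r * dt)
    # terminal payoffs after j up-moves and n - j down-moves
    values = [max(S * u ** j * d ** (n - j) - K, 0.0) for j in range(n + 1)]
    for step in range(n, 0, -1):             # backward induction
        values = [disc * (p * values[j + 1] + (1 - p) * values[j])
                  for j in range(step)]
    return values[0]

exact = bs_call(100, 100, 0.05, 0.2, 1.0)
tree = crr_call(100, 100, 0.05, 0.2, 1.0, 1000)
print(round(exact, 3), round(tree, 3))       # both near 10.45
```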
401-2284-00LMeasure and Integration Information Restricted registration - show details W6 credits3V + 2UF. Da Lio
AbstractIntroduction to abstract measure and integration theory, including the following topics: Caratheodory extension theorem, Lebesgue measure, convergence theorems, L^p-spaces, Radon-Nikodym theorem, product measures and Fubini's theorem, measures on topological spaces
ObjectiveBasic acquaintance with the abstract theory of measure and integration
ContentIntroduction to abstract measure and integration theory, including the following topics: Caratheodory extension theorem, Lebesgue measure, convergence theorems, L^p-spaces, Radon-Nikodym theorem, product measures and Fubini's theorem, measures on topological spaces
Lecture notesNew lecture notes in English will be made available during the course.
Literature1. L. Evans and R.F. Gariepy " Measure theory and fine properties of functions"
2. Walter Rudin "Real and complex analysis"
3. R. Bartle The elements of Integration and Lebesgue Measure
4. The lecture notes by Prof. Michael Struwe, Spring Semester 2013: Link
5. The lecture notes by Prof. Urs Lang, Spring Semester 2019: Link
6. P. Cannarsa & T. D'Aprile: Lecture notes on Measure Theory and Functional Analysis: Link
401-4944-20LMathematics of Data Science
Does not take place this semester.
W8 credits4GA. Bandeira
AbstractMostly self-contained, but fast-paced, introductory masters level course on various theoretical aspects of algorithms that aim to extract information from data.
ObjectiveIntroduction to various mathematical aspects of Data Science.
ContentThese topics lie in overlaps of (Applied) Mathematics with: Computer Science, Electrical Engineering, Statistics, and/or Operations Research. Each lecture will feature a couple of Mathematical Open Problems related to Data Science. The main mathematical tools used will be Probability and Linear Algebra, and a basic familiarity with these subjects is required. There will also be some Graph Theory, Representation Theory, and Applied Harmonic Analysis, among others (although knowledge of these tools is not assumed). The topics treated will include Dimension reduction, Manifold learning, Sparse recovery, Random Matrices, Approximation Algorithms, Community detection in graphs, and several others.
Lecture notesLink
Prerequisites / NoticeThe main mathematical tools used will be Probability, Linear Algebra (and real analysis), and a working knowledge of these subjects is required. In addition to these prerequisites, this class requires a certain degree of mathematical maturity, including abstract thinking and the ability to understand and write proofs.


We encourage students who are interested in mathematical data science to take both this course and "227-0434-10L Mathematics of Information" taught by Prof. H. Bölcskei. The two courses are designed to be complementary.
A. Bandeira and H. Bölcskei
227-0434-10LMathematics of Information Information W8 credits3V + 2U + 2AH. Bölcskei
AbstractThe class focuses on mathematical aspects of

1. Information science: Sampling theorems, frame theory, compressed sensing, sparsity, super-resolution, spectrum-blind sampling, subspace algorithms, dimensionality reduction

2. Learning theory: Approximation theory, greedy algorithms, uniform laws of large numbers, Rademacher complexity, Vapnik-Chervonenkis dimension
ObjectiveThe aim of the class is to familiarize the students with the most commonly used mathematical theories in data science, high-dimensional data analysis, and learning theory. The class consists of the lecture, exercise sessions with homework problems, and of a research project, which can be carried out either individually or in groups. The research project consists of either 1. software development for the solution of a practical signal processing or machine learning problem or 2. the analysis of a research paper or 3. a theoretical research problem of suitable complexity. Students are welcome to propose their own project at the beginning of the semester. The outcomes of all projects have to be presented to the entire class at the end of the semester.
ContentMathematics of Information

1. Signal representations: Frame theory, wavelets, Gabor expansions, sampling theorems, density theorems

2. Sparsity and compressed sensing: Sparse linear models, uncertainty relations in sparse signal recovery, super-resolution, spectrum-blind sampling, subspace algorithms (ESPRIT), estimation in the high-dimensional noisy case, Lasso

3. Dimensionality reduction: Random projections, the Johnson-Lindenstrauss Lemma

Mathematics of Learning

4. Approximation theory: Nonlinear approximation theory, best M-term approximation, greedy algorithms, fundamental limits on compressibility of signal classes, Kolmogorov-Tikhomirov epsilon-entropy of signal classes, optimal compression of signal classes

5. Uniform laws of large numbers: Rademacher complexity, Vapnik-Chervonenkis dimension, classes with polynomial discrimination
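As a small illustration of the dimensionality-reduction topic above, the Johnson-Lindenstrauss idea can be sketched in a few lines of plain Python (a toy demonstration, not course material; the dimensions and data below are arbitrary choices): projecting onto a random Gaussian matrix approximately preserves pairwise distances.

```python
import math
import random

random.seed(0)

def random_projection(x, R):
    """Project x onto the rows of R, scaled by 1/sqrt(k) so that
    squared distances are preserved in expectation."""
    k = len(R)
    return [sum(r_i * x_i for r_i, x_i in zip(row, x)) / math.sqrt(k)
            for row in R]

d, k = 1000, 200              # ambient and target dimensions
R = [[random.gauss(0.0, 1.0) for _ in range(d)] for _ in range(k)]

x = [random.gauss(0.0, 1.0) for _ in range(d)]
y = [random.gauss(0.0, 1.0) for _ in range(d)]

dist = math.dist(x, y)
dist_proj = math.dist(random_projection(x, R), random_projection(y, R))
ratio = dist_proj / dist      # concentrates near 1 with high probability
```

The lemma makes this quantitative: for n points, a target dimension of order log(n)/eps^2 suffices to preserve all pairwise distances up to a factor 1 ± eps.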
Lecture notesDetailed lecture notes will be provided at the beginning of the semester.
Prerequisites / NoticeThis course is aimed at students with a background in basic linear algebra, analysis, statistics, and probability.

We encourage students who are interested in mathematical data science to take both this course and "401-4944-20L Mathematics of Data Science" by Prof. A. Bandeira. The two courses are designed to be complementary.

H. Bölcskei and A. Bandeira
261-5110-00LOptimization for Data Science Information W10 credits3V + 2U + 4AB. Gärtner, D. Steurer, N. He
AbstractThis course provides an in-depth theoretical treatment of optimization methods that are particularly relevant in data science.
ObjectiveUnderstanding the theoretical guarantees (and their limits) of relevant optimization methods used in data science. Learning general paradigms to deal with optimization problems arising in data science.
ContentThis course provides an in-depth theoretical treatment of optimization methods that are particularly relevant in machine learning and data science.

In the first part of the course, we will give a brief introduction to convex optimization, with some basic motivating examples from machine learning. Then we will analyse classical and more recent first and second order methods for convex optimization: gradient descent, Nesterov's accelerated method, proximal and splitting algorithms, subgradient descent, stochastic gradient descent, variance-reduced methods, Newton's method, and Quasi-Newton methods. The emphasis will be on analysis techniques that occur repeatedly in convergence analyses for various classes of convex functions. We will also discuss some classical and recent theoretical results for nonconvex optimization.

In the second part, we discuss convex programming relaxations as a powerful and versatile paradigm for designing efficient algorithms to solve computational problems arising in data science. We will learn about this paradigm and develop a unified perspective on it through the lens of the sum-of-squares semidefinite programming hierarchy. As applications, we will discuss non-negative matrix factorization, compressed sensing and sparse linear regression, matrix completion and phase retrieval, as well as robust estimation.
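The flavour of the first part can be previewed with a minimal gradient descent sketch in plain Python (a toy illustration, not course material; the objective and step size below are made up, with the step chosen as 1/L for smoothness constant L):

```python
# f(x) = (x1 - 3)^2 + 10 * (x2 + 1)^2, a smooth, strongly convex function
# with minimizer (3, -1) and largest curvature L = 20.
def grad(x):
    return [2 * (x[0] - 3), 20 * (x[1] + 1)]

def gradient_descent(x0, step, iters):
    """Plain gradient descent: x <- x - step * grad(x)."""
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x

# With step = 1/L, the error contracts geometrically for such functions.
x_star = gradient_descent([0.0, 0.0], step=1.0 / 20, iters=500)
# x_star is (numerically) the minimizer (3, -1)
```

The course's convergence analyses make the contraction rate precise for each class of convex functions (smooth, strongly convex, nonsmooth, stochastic).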
Prerequisites / NoticeAs background, we require material taught in the course "252-0209-00L Algorithms, Probability, and Computing". It is not necessary that participants have actually taken the course, but they should be prepared to catch up if necessary.
252-0220-00LIntroduction to Machine Learning Information Restricted registration - show details
Limited number of participants. Preference is given to students in programmes in which the course is being offered. All other students will be waitlisted. Please do not contact Prof. Krause for any questions in this regard. If necessary, please contact Link
W8 credits4V + 2U + 1AA. Krause, F. Yang
AbstractThe course introduces the foundations of learning and making predictions based on data.
ObjectiveThe course will introduce the foundations of learning and making predictions from data. We will study basic concepts such as the trade-off between goodness of fit and model complexity. We will discuss important machine learning algorithms used in practice, and provide hands-on experience in a course project.
Content- Linear regression (overfitting, cross-validation/bootstrap, model selection, regularization, [stochastic] gradient descent)
- Linear classification: Logistic regression (feature selection, sparsity, multi-class)
- Kernels and the kernel trick (Properties of kernels; applications to linear and logistic regression); k-nearest neighbor
- Neural networks (backpropagation, regularization, convolutional neural networks)
- Unsupervised learning (k-means, PCA, neural network autoencoders)
- The statistical perspective (regularization as prior; loss as likelihood; learning as MAP inference)
- Statistical decision theory (decision making based on statistical models and utility functions)
- Discriminative vs. generative modeling (benefits and challenges in modeling joint vs. conditional distributions)
- Bayes' classifiers (Naive Bayes, Gaussian Bayes; MLE)
- Bayesian approaches to unsupervised learning (Gaussian mixtures, EM)
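As a taste of the unsupervised-learning topic in the list above, here is a minimal sketch of Lloyd's k-means algorithm in plain Python (a toy version for intuition; the 1-d data and seed below are made up):

```python
import random

random.seed(1)

def kmeans(points, k, iters=50):
    """Lloyd's algorithm: alternate cluster assignment and centroid update."""
    centers = random.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # assign each point to its nearest center
            j = min(range(k), key=lambda j: (p - centers[j]) ** 2)
            clusters[j].append(p)
        # recompute each center as the mean of its cluster
        # (keeping the old center if a cluster goes empty)
        centers = [sum(c) / len(c) if c else centers[j]
                   for j, c in enumerate(clusters)]
    return sorted(centers)

# Two well-separated 1-d clusters around 0 and 10.
data = ([random.gauss(0, 1) for _ in range(50)]
        + [random.gauss(10, 1) for _ in range(50)])
centers = kmeans(data, k=2)   # centers land near 0 and 10
```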
LiteratureTextbook: Kevin Murphy, Machine Learning: A Probabilistic Perspective, MIT Press
Prerequisites / NoticeDesigned to provide a basis for following courses:
- Advanced Machine Learning
- Deep Learning
- Probabilistic Artificial Intelligence
- Seminar "Advanced Topics in Machine Learning"
252-0526-00LStatistical Learning Theory Information W8 credits3V + 2U + 2AJ. M. Buhmann, C. Cotrini Jimenez
AbstractThe course covers advanced methods of statistical learning:

- Variational methods and optimization.
- Deterministic annealing.
- Clustering for diverse types of data.
- Model validation by information theory.
ObjectiveThe course surveys recent methods of statistical learning. The fundamentals of machine learning, as presented in the courses "Introduction to Machine Learning" and "Advanced Machine Learning", are expanded from the perspective of statistical learning.
Content- Variational methods and optimization. We consider optimization approaches for problems where the optimizer is a probability distribution. We will discuss concepts like maximum entropy, information bottleneck, and deterministic annealing.

- Clustering. This is the problem of sorting data into groups without using training samples. We discuss alternative notions of "similarity" between data points and adequate optimization procedures.

- Model selection and validation. This refers to the question of how complex the chosen model should be. In particular, we present an information theoretic approach for model validation.

- Statistical physics models. We discuss approaches for approximately optimizing large systems, which originate in statistical physics (free energy minimization applied to spin glasses and other models). We also study sampling methods based on these models.
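The maximum-entropy and deterministic-annealing ideas above can be illustrated with a small sketch (a toy example, not course material; the energy values are made up): the Gibbs distribution minimizes the free energy F = &lt;E&gt; - T*H(p), interpolating between the uniform distribution at high temperature and a point mass on the minimum-energy state at low temperature.

```python
import math

def gibbs(energies, T):
    """Gibbs distribution p_i proportional to exp(-E_i / T): the
    maximum-entropy distribution at temperature T, i.e. the minimizer
    of the free energy F = <E> - T * H(p)."""
    m = min(energies)                        # stabilise the exponentials
    w = [math.exp(-(e - m) / T) for e in energies]
    z = sum(w)                               # partition function
    return [wi / z for wi in w]

energies = [1.0, 2.0, 5.0]
p_hot = gibbs(energies, T=100.0)   # nearly uniform: entropy dominates
p_cold = gibbs(energies, T=0.01)   # nearly all mass on the minimum energy
```

Deterministic annealing tracks this family of distributions while lowering T, so that the hard (T -> 0) assignment is reached through a sequence of smooth intermediate problems.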
Lecture notesA draft of a script will be provided. Lecture slides will be made available.
LiteratureHastie, Tibshirani, Friedman: The Elements of Statistical Learning, Springer, 2001.

L. Devroye, L. Gyorfi, and G. Lugosi: A probabilistic theory of pattern recognition. Springer, New York, 1996
Prerequisites / NoticeKnowledge of machine learning (introduction to machine learning and/or advanced machine learning)
Basic knowledge of statistics.
227-0432-00LLearning, Classification and Compression Information W4 credits2V + 1UE. Riegler
AbstractThe course takes a theoretical approach to learning theory and classification and gives an introduction to lossy and lossless compression for general sets and measures. We will mainly focus on a probabilistic setting, where an underlying distribution must be learned/compressed. The concepts acquired in the course are of broad and general interest in the data sciences.
ObjectiveAfter attending this lecture and participating in the exercise sessions, students will have acquired a working knowledge of learning theory, classification, and compression.
Content1. Learning Theory
(a) Framework of Learning
(b) Hypothesis Spaces and Target Functions
(c) Reproducing Kernel Hilbert Spaces
(d) Bias-Variance Tradeoff
(e) Estimation of Sample and Approximation Error

2. Classification
(a) Binary Classifier
(b) Support Vector Machines (separable case)
(c) Support Vector Machines (nonseparable case)
(d) Kernel Trick

3. Lossy and Lossless Compression
(a) Basics of Compression
(b) Compressed Sensing for General Sets and Measures
(c) Quantization and Rate Distortion Theory for General Sets and Measures
Lecture notesDetailed lecture notes will be provided.
Prerequisites / NoticeThis course is aimed at students with a solid background in measure theory and linear algebra and basic knowledge in functional analysis.
252-3005-00LNatural Language Processing Information Restricted registration - show details
Number of participants limited to 400.
W5 credits2V + 1U + 1AR. Cotterell
AbstractThis course presents topics in natural language processing with an emphasis on modern techniques, primarily focusing on statistical and deep learning approaches. The course provides an overview of the primary areas of research in language processing as well as a detailed exploration of the models and techniques used both in research and in commercial natural language systems.
ObjectiveThe objective of the course is to learn the basic concepts in the statistical processing of natural languages. The course will be project-oriented so that the students can also gain hands-on experience with state-of-the-art tools and techniques.
ContentThis course presents an introduction to general topics and techniques used in natural language processing today, primarily focusing on statistical approaches. The course provides an overview of the primary areas of research in language processing as well as a detailed exploration of the models and techniques used both in research and in commercial natural language systems.
LiteratureJacob Eisenstein: Introduction to Natural Language Processing (Adaptive Computation and Machine Learning series)
252-3900-00LBig Data for Engineers Information
This course is not intended for Computer Science and Data Science MSc students!
W6 credits2V + 2U + 1AG. Fourny
AbstractThis course is part of the series of database lectures offered to all ETH departments, together with Information Systems for Engineers. It introduces the most recent advances in the database field: how do we scale storage and querying to Petabytes of data, with trillions of records? How do we deal with heterogeneous data sets? How do we deal with alternate data shapes like trees and graphs?
ObjectiveThis lesson is complementary with Information Systems for Engineers as they cover different time periods of database history and practices -- you can even take both lectures at the same time.

The key challenge of the information society is to turn data into information, information into knowledge, knowledge into value. This has become increasingly complex. Data comes in larger volumes, diverse shapes, from different sources. Data is more heterogeneous and less structured than forty years ago. Nevertheless, it still needs to be processed fast, with support for complex operations.

This combination of requirements, together with the technologies that have emerged in order to address them, is typically referred to as "Big Data." This revolution has led to a completely new way to do business, e.g., develop new products and business models, but also to do science -- which is sometimes referred to as data-driven science or the "fourth paradigm".

Unfortunately, the quantity of data produced and available -- now in the Zettabyte range (that's 21 zeros) per year -- keeps growing faster than our ability to process it. Hence, new architectures and approaches for processing it were and are still needed. Harnessing them must involve a deep understanding of data not only in the large, but also in the small.

The field of databases evolves at a fast pace. In order to be prepared, to the extent possible, for the (r)evolutions that will take place in the next few decades, the emphasis of the lecture will be on the paradigms and core design ideas, while today's technologies will serve as supporting illustrations thereof.

After visiting this lecture, you should have gained an overview and understanding of the Big Data landscape, which is the basis on which one can make informed decisions, i.e., pick and orchestrate the relevant technologies together for addressing each business use case efficiently and consistently.
ContentThis course gives an overview of database technologies and of the most important database design principles that lay the foundations of the Big Data universe.

It specifically targets students with a scientific or engineering background, but not a Computer Science one.

We take the monolithic, one-machine relational stack from the 1970s, smash it down and rebuild it on top of large clusters: starting with distributed storage, and all the way up to syntax, models, validation, processing, indexing, and querying. A broad range of aspects is covered with a focus on how they fit all together in the big picture of the Big Data ecosystem.

No data is harmed during this course; however, please be psychologically prepared that our data may not always be in normal form.

- physical storage: distributed file systems (HDFS), object storage (S3), key-value stores

- logical storage: document stores (MongoDB), column stores (HBase)

- data formats and syntaxes (XML, JSON, RDF, CSV, YAML, protocol buffers, Avro)

- data shapes and models (tables, trees)

- type systems and schemas: atomic types, structured types (arrays, maps), set-based type systems (?, *, +)

- an overview of functional, declarative programming languages across data shapes (SQL, JSONiq)

- the most important query paradigms (selection, projection, joining, grouping, ordering, windowing)

- paradigms for parallel processing, two-stage (MapReduce) and DAG-based (Spark)

- resource management (YARN)

- what a data center is made of and why it matters (racks, nodes, ...)

- underlying architectures (internal machinery of HDFS, HBase, Spark)

- optimization techniques (functional and declarative paradigms, query plans, rewrites, indexing)

- applications.

Large scale analytics and machine learning are outside of the scope of this course.
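To make the two-stage (MapReduce) paradigm from the list above concrete, here is an in-memory word-count sketch in plain Python; a real framework would distribute the map, shuffle, and reduce phases across a cluster, but the structure is the same (the documents below are made up):

```python
from collections import defaultdict
from itertools import chain

def map_phase(documents):
    # Map: emit a (word, 1) pair for every word in every document.
    return chain.from_iterable(
        ((word, 1) for word in doc.split()) for doc in documents)

def shuffle_phase(pairs):
    # Shuffle: group values by key, as the framework does between stages.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: aggregate each key's values independently (hence in parallel).
    return {key: sum(values) for key, values in groups.items()}

docs = ["big data is big", "data is data"]
counts = reduce_phase(shuffle_phase(map_phase(docs)))
# counts == {"big": 2, "data": 3, "is": 2}
```

DAG-based engines such as Spark generalize this pipeline to arbitrary chains of such stages, keeping intermediate results in memory.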
LiteraturePapers from scientific conferences and journals. References will be given as part of the course material during the semester.
Prerequisites / NoticeThis course is not intended for Computer Science and Data Science students. Computer Science and Data Science students interested in Big Data MUST attend the Master's level Big Data lecture, offered in Fall.

Requirements: programming knowledge (Java, C++, Python, PHP, ...) as well as basic knowledge on databases (SQL). If you have already built your own website with a backend SQL database, this is perfect.

Attendance is especially recommended to those who attended Information Systems for Engineers last Fall, which introduced the "good old databases of the 1970s" (SQL, tables and cubes). However, this is not a strict requirement, and it is also possible to take the lectures in reverse order.
263-5300-00LGuarantees for Machine Learning Information Restricted registration - show details
Number of participants limited to 30.

Last cancellation/deregistration date for this graded semester performance: 17 March 2021! Please note that after that date no deregistration will be accepted and a "no show" will appear on your transcript.
W7 credits3G + 3AF. Yang
AbstractThis course is aimed at advanced master and doctorate students who want to conduct independent research on theory for modern machine learning (ML). It teaches classical and recent methods in statistical learning theory commonly used to prove theoretical guarantees for ML algorithms. The knowledge is then applied in independent project work that focuses on understanding modern ML phenomena.
ObjectiveLearning objectives:

- acquire enough mathematical background to understand a good fraction of theory papers published in the typical ML venues. For this purpose, students will learn common mathematical techniques from statistics and optimization in the first part of the course and apply this knowledge in the project work
- critically examine recently published work in terms of relevance and determine impactful (novel) research problems. This will be an integral part of the project work and involves experimental as well as theoretical questions
- find and outline an approach (some subproblem) to prove a conjectured theorem. This will be practiced in lectures / exercise and homeworks and potentially in the final project.
- effectively communicate and present the problem motivation, new insights and results to a technical audience. This will be primarily learned via the final presentation and report as well as during peer-grading of peer talks.
ContentThis course touches upon foundational methods in statistical learning theory aimed at proving theoretical guarantees for machine learning algorithms, touching on the following topics
- concentration bounds
- uniform convergence and empirical process theory
- high-dimensional statistics (e.g. sparsity)
- regularization for non-parametric statistics (e.g. in RKHS, neural networks)
- implicit regularization via gradient descent (e.g. margins, early stopping)
- minimax lower bounds

The project work focuses on current theoretical ML research that aims to understand modern phenomena in machine learning, including but not limited to
- how overparameterization could help generalization (RKHS, NN)
- how overparameterization could help optimization (non-convex optimization, loss landscape)
- complexity measures and approximation theoretic properties of randomly initialized and trained NN
- generalization of robust learning (adversarial robustness, standard and robust error tradeoff, distribution shift)
Prerequisites / NoticeIt’s absolutely necessary for students to have a strong mathematical background (basic real analysis, probability theory, linear algebra) and good knowledge of core concepts in machine learning taught in courses such as “Introduction to Machine Learning”, “Regression”/ “Statistical Modelling”. In addition to these prerequisites, this class requires a high degree of mathematical maturity—including abstract thinking and the ability to understand and write proofs.

Students have usually taken a subset of Fundamentals of Mathematical Statistics, Probabilistic AI, Neural Network Theory, Optimization for Data Science, Advanced ML, Statistical Learning Theory, Probability Theory (D-MATH)
636-0702-00LStatistical Models in Computational BiologyW6 credits2V + 1U + 2AN. Beerenwinkel
AbstractThe course offers an introduction to graphical models and their application to complex biological systems. Graphical models combine a statistical methodology with efficient algorithms for inference in settings of high dimension and uncertainty. The unifying graphical model framework is developed and used to examine several classical and topical computational biology methods.
ObjectiveThe goal of this course is to establish the common language of graphical models for applications in computational biology and to see this methodology at work for several real-world data sets.
ContentGraphical models are a marriage between probability theory and graph theory. They combine the notion of probabilities with efficient algorithms for inference among many random variables. Graphical models play an important role in computational biology, because they explicitly address two features that are inherent to biological systems: complexity and uncertainty. We will develop the basic theory and the common underlying formalism of graphical models and discuss several computational biology applications. Topics covered include conditional independence, Bayesian networks, Markov random fields, Gaussian graphical models, EM algorithm, junction tree algorithm, model selection, Dirichlet process mixture, causality, the pair hidden Markov model for sequence alignment, probabilistic phylogenetic models, phylo-HMMs, microarray experiments and gene regulatory networks, protein interaction networks, learning from perturbation experiments, time series data and dynamic Bayesian networks. Some of the biological applications will be explored in small data analysis problems as part of the exercises.
Lecture notesno
Literature- Airoldi EM (2007) Getting started in probabilistic graphical models. PLoS Comput Biol 3(12): e252. doi:10.1371/journal.pcbi.0030252
- Bishop CM. Pattern Recognition and Machine Learning. Springer, 2007.
- Durbin R, Eddy S, Krogh A, Mitchison G. Biological Sequence Analysis. Cambridge University Press, 2004
701-0104-00LStatistical Modelling of Spatial DataW3 credits2GA. J. Papritz
AbstractIn environmental sciences one often deals with spatial data. When analysing such data the focus is either on exploring their structure (dependence on explanatory variables, autocorrelation) and/or on spatial prediction. The course provides an introduction to geostatistical methods that are useful for such analyses.
ObjectiveThe course will provide an overview of the basic concepts and stochastic models that are used to model spatial data. In addition, participants will learn a number of geostatistical techniques and acquire familiarity with R software that is useful for analyzing spatial data.
ContentAfter an introductory discussion of the types of problems and the kind of data that arise in environmental research, an introduction into linear geostatistics (models: stationary and intrinsic random processes, modelling large-scale spatial patterns by linear regression, modelling autocorrelation by variogram; kriging: mean square prediction of spatial data) will be taught. The lectures will be complemented by data analyses that the participants have to do themselves.
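Although the course works in R, the variogram estimation mentioned above can be sketched in plain Python (Matheron's classical estimator; the coordinates, distance bins, and white-noise data below are made up for illustration):

```python
import math
import random

random.seed(2)

def empirical_variogram(coords, values, bins):
    """Matheron's estimator: gamma(h) is the average of (z_i - z_j)^2 / 2
    over all point pairs whose separation distance falls in the bin."""
    sums = [0.0] * len(bins)
    counts = [0] * len(bins)
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            h = math.dist(coords[i], coords[j])
            for b, (lo, hi) in enumerate(bins):
                if lo <= h < hi:
                    sums[b] += 0.5 * (values[i] - values[j]) ** 2
                    counts[b] += 1
                    break
    return [s / c if c else float("nan") for s, c in zip(sums, counts)]

# For spatially independent noise the variogram is flat at the process
# variance (a "pure nugget" model) -- no autocorrelation at any lag.
coords = [(random.random() * 10, random.random() * 10) for _ in range(200)]
values = [random.gauss(0, 1) for _ in coords]
gamma = empirical_variogram(coords, values, bins=[(0, 2), (2, 4), (4, 6)])
```

For autocorrelated data the estimated gamma would instead rise with distance up to a sill; fitting a parametric variogram model to these points is the input to kriging.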
Lecture notesSlides, descriptions of the problems for the data analyses and solutions to them will be provided.
LiteratureP.J. Diggle & P.J. Ribeiro Jr. 2007. Model-based Geostatistics. Springer.
Prerequisites / NoticeFamiliarity with linear regression analysis (e.g. equivalent to the first part of the course 401-0649-00L Applied Statistical Regression) and with the software R (e.g. 401-6215-00L Using R for Data Analysis and Graphics (Part I), 401-6217-00L Using R for Data Analysis and Graphics (Part II)) are required for attending the course.
401-6222-00LRobust and Nonlinear Regression Information Restricted registration - show details
Does not take place this semester.
W2 credits1V + 1U
AbstractIn a first part, the basic ideas of robust fitting techniques are explained theoretically and practically using regression models and explorative multivariate analysis.

The second part addresses the challenges of fitting nonlinear regression functions and finding reliable confidence intervals.
ObjectiveParticipants are familiar with common robust fitting methods for the linear regression models as well as for exploratory multivariate analysis and are able to assess their suitability for the data at hand.

They know the challenges that arise in fitting of nonlinear regression functions, and know the difference between classical and profile based methods to determine confidence intervals.

They can apply the discussed methods in practice by using the statistics software R.
ContentRobust fitting: influence function, breakdown point, regression M-estimation, regression MM-estimation, robust inference, covariance estimation with high breakdown point, application in principal component analysis and linear discriminant analysis.

Nonlinear regression: the nonlinear regression model, estimation methods, approximate tests and confidence intervals, profile t plot, profile traces, parameter transformation, prediction and calibration
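To give a flavour of the robust-fitting part, here is a minimal M-estimation sketch in plain Python (the course itself uses R; the data and tuning constant below are illustrative): a Huber-type location estimate computed by iteratively reweighted averaging, which bounds the influence of outliers.

```python
def huber_location(x, c=1.345, iters=50):
    """M-estimate of location with Huber's psi: an iteratively
    reweighted mean that downweights points further than c from
    the current estimate."""
    mu = sorted(x)[len(x) // 2]          # start from the median
    for _ in range(iters):
        w = [1.0 if abs(xi - mu) <= c else c / abs(xi - mu) for xi in x]
        mu = sum(wi * xi for wi, xi in zip(w, x)) / sum(w)
    return mu

# A sample centred near 0 with one gross outlier: the sample mean is
# dragged to about 11.2, while the Huber estimate stays near 0.
data = [-1.2, -0.5, -0.1, 0.0, 0.2, 0.4, 0.9, 1.1, 100.0]
mean = sum(data) / len(data)
mu = huber_location(data)
```

The influence function and breakdown point from the lecture quantify exactly this behaviour: the mean has unbounded influence, the Huber estimate does not.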
Lecture notesLecture notes are available
Prerequisites / NoticeThis is a block course held on three Mondays in June.
401-8618-00LStatistical Methods in Epidemiology (University of Zurich)
No enrolment to this course at ETH Zurich. Book the corresponding module directly at UZH.
UZH Module Code: STA408

Mind the enrolment deadlines at UZH:
Link
W5 credits3GUniversity lecturers
AbstractAnalysis of case-control and cohort studies. The most relevant measures of effect (odds and rate ratios) are introduced, and methods for adjusting for confounders (Mantel-Haenszel, regression) are thoroughly discussed. Advanced topics such as measurement error and propensity score adjustments are also covered. We will outline statistical methods for case-crossover and case series studies etc.
Objective
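As a small illustration of the Mantel-Haenszel adjustment mentioned in the abstract, the pooled odds ratio across confounder strata can be computed in a few lines (plain Python; the two strata below are made-up numbers, each constructed to have odds ratio 4):

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio over 2x2 tables.
    Each stratum is (a, b, c, d): exposed cases, exposed controls,
    unexposed cases, unexposed controls."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

# Two hypothetical strata of a case-control study; within each stratum
# the odds ratio is (a*d)/(b*c) = 4, so the pooled estimate is 4 as well.
strata = [(20, 10, 15, 30), (8, 4, 6, 12)]
or_mh = mantel_haenszel_or(strata)
```

When the stratum-specific odds ratios differ, the estimator returns a weighted compromise; the crude odds ratio of the collapsed table can differ substantially, which is exactly the confounding the adjustment removes.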
401-4626-00LAdvanced Statistical Modelling: Mixed Models
Does not take place this semester.
W4 credits2VM. Mächler
AbstractMixed Models = (*| generalized| non-) linear Mixed-effects Models, extend traditional regression models by adding "random effect" terms.

In applications, such models are called "hierarchical models", "repeated measures" or "split plot designs". Mixed models are widely used and appropriate in an era of complex data measured on living creatures, from biology to the human sciences.
Objective- Becoming aware how mixed models are more realistic and more powerful in many cases than traditional ("fixed-effects only") regression models.

- Learning to fit such models to data correctly, critically interpreting results for such model fits, and hence learning to work the creative cycle of responsible statistical data analysis:
"fit -> interpret & diagnose -> modify the fit -> interpret & ...."

- Becoming aware of computational and methodological limitations of these models, even when using state-of-the art software.
ContentThe lecture will build on various examples, use R and notably the `lme4` package, to illustrate concepts. The relevant R scripts are made available online.

Inference (significance of factors, confidence intervals) will focus on the more realistic *un*balanced situation where classical (ANOVA, sum of squares etc) methods are known to be deficient. Hence, Maximum Likelihood (ML) and its variant, "REML", will be used for estimation and inference.
Lecture notesWe will work with an unfinished book proposal from Prof Douglas Bates, Wisconsin, USA which itself is a mixture of theory and worked R code examples.

These lecture notes and all R scripts are made available from
Link
Literature(see web page and lecture notes)
Prerequisites / Notice- We assume a good working knowledge about multiple linear regression ("the general linear model") and an intermediate (not beginner's) knowledge about model-based statistics (estimation, confidence intervals, ...).

Typically this means at least two classes of (math based) statistics, say
1. Intro to probability and statistics
2. (Applied) regression including Matrix-Vector notation Y = X b + E

- Basic (1 semester) "Matrix calculus" / linear algebra is also assumed.

- If familiarity with [R](Link) is not given, it should be acquired during the course (by the student on their own initiative).
447-6236-00LStatistics for Survival Data Restricted registration - show details
Does not take place this semester.
W2 credits1V + 1U
AbstractThe primary purpose of a survival analysis is to model and analyze time-to-event data; that is, data that have as a principal endpoint the length of time for an event to occur. This block course introduces the field of survival analysis without getting too embroiled in the theoretical technicalities.
ObjectivePresented here are some frequently used parametric models and methods, including accelerated failure time models; and the newer nonparametric procedures which include the Kaplan-Meier estimate of survival and the Cox proportional hazards regression model. The statistical tools treated are applicable to data from medical clinical trials, public health, epidemiology, engineering, economics, psychology, and demography as well.
ContentThe primary purpose of a survival analysis is to model and analyze time-to-event data; that is, data that have as a principal endpoint the length of time for an event to occur. Such events are generally referred to as "failures." Some examples are time until an electrical component fails, time to first recurrence of a tumor (i.e., length of remission) after initial treatment, time to death, time to the learning of a skill, and promotion times for employees.

In these examples we can see that it is possible that a "failure" time will not be observed either by deliberate design or due to random censoring. This occurs, for example, if a patient is still alive at the end of a clinical trial period or has moved away. The necessity of obtaining methods of analysis that accommodate censoring is the primary reason for developing specialized models and procedures for failure time data. Survival analysis is the modern name given to the collection of statistical procedures which accommodate time-to-event censored data. Prior to these new procedures, incomplete data were treated as missing data and omitted from the analysis. This resulted in the loss of the partial information obtained and in introducing serious systematic error (bias) in estimated quantities. This, of course, lowers the efficacy of the study. The procedures discussed here avoid bias and are more powerful as they utilize the partial information available on a subject or item.

This block course introduces the field of survival analysis without getting too embroiled in the theoretical technicalities. Models for failure times describe either the survivor function or hazard rate and their dependence on explanatory variables. Presented here are some frequently used parametric models and methods, including accelerated failure time models; and the newer nonparametric procedures which include the Kaplan-Meier estimate of survival and the Cox proportional hazards regression model. The statistical tools treated are applicable to data from medical clinical trials, public health, epidemiology, engineering, economics, psychology, and demography as well.
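The Kaplan-Meier estimate mentioned above can be sketched in plain Python (a toy implementation for intuition; real analyses would use dedicated survival software, and the six subjects below are made up, with '+' marking censoring):

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate of the survival function S(t).
    events[i] is 1 if the event was observed at times[i], 0 if censored.
    Returns (event_time, S(t)) pairs at each observed event time."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    n_at_risk = len(times)
    s, curve = 1.0, []
    i = 0
    while i < len(order):
        t = times[order[i]]
        deaths = at_t = 0
        # group all subjects (events and censorings) tied at time t
        while i < len(order) and times[order[i]] == t:
            at_t += 1
            deaths += events[order[i]]
            i += 1
        if deaths:
            # multiply in the conditional survival probability at t
            s *= 1 - deaths / n_at_risk
            curve.append((t, s))
        n_at_risk -= at_t
    return curve

# Six subjects with observation times 1, 2+, 3, 4, 4, 5+.
times = [1, 2, 3, 4, 4, 5]
events = [1, 0, 1, 1, 1, 0]
curve = kaplan_meier(times, events)
```

The censored subjects contribute to the risk sets before they drop out, which is precisely the partial information a naive "complete cases" analysis would throw away.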
401-8628-00LSurvival Analysis (University of Zurich)
No enrolment to this course at ETH Zurich. Book the corresponding module directly at UZH.
UZH Module Code: STA425

Mind the enrolment deadlines at UZH:
Link
W3 credits1.5GUniversity lecturers
AbstractThe analysis of survival times, or in more general terms, the analysis of time to event variables is concerned with models for censored observations. Because we cannot always wait until the event of interest actually happens, the methods discussed here are required for an appropriate handling of incomplete observations where we only know that the event of interest did not happen within ...
Objective
ContentDuring the course, we will study the most important methods and models
for censored data, including
- general concepts of censoring,
- simple summary statistics,
- estimation of survival curves,
- frequentist inference for two and more groups, and
- regression models for censored observations
Application Areas
Students select one area of application and look for suitable courses in which quantitative methods and modeling play a role. They need the consent by the Advisor (Link) that the chosen courses are eligible in the category "Application Areas".

For the category assignment of eligible courses keep the choice "no category" and take contact with the Study Administration Office (Link) after having received the credits. The Study Administration Office needs the Advisor's consent.
Seminar or Semester Paper
NumberTitleTypeECTSHoursLecturers
401-3620-21LStudent Seminar in Statistics: Statistical Network Modeling Information Restricted registration - show details
Number of participants limited to 48.
Mainly for students from the Mathematics Bachelor and Master Programmes who, in addition to the introductory course unit 401-2604-00L Probability and Statistics, have heard at least one core or elective course in statistics. Also offered in the Master Programmes Statistics resp. Data Science.
W4 credits2SP. L. Bühlmann, M. Azadkia
AbstractNetwork models can be used to analyze non-iid data because their structure incorporates interconnectedness between the individuals. We introduce networks, describe them mathematically, and consider applications.
ObjectiveNetwork models can be used to analyze non-iid data because their structure incorporates interconnectedness between the individuals. The participants of the seminar acquire knowledge to formulate and analyze network models and to apply them in examples.
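A minimal illustration of the kind of model treated in this seminar is the stochastic block model, in which edge probabilities depend only on community membership (a hedged sketch; the function and parameters are invented for illustration, not taken from the course):

```python
import random

def sample_sbm(sizes, probs, seed=0):
    """Sample an undirected graph from a stochastic block model.

    sizes: number of nodes per community, e.g. [3, 3]
    probs: matrix of edge probabilities between communities
    Returns (labels, edges), where labels[i] is node i's community.
    """
    rng = random.Random(seed)
    labels = [k for k, size in enumerate(sizes) for _ in range(size)]
    n = len(labels)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            # edge probability depends only on the two community labels
            if rng.random() < probs[labels[i]][labels[j]]:
                edges.append((i, j))
    return labels, edges
```

Because the nodes are interconnected, the sampled edges are not i.i.d. observations, which is precisely why the network-specific methodology of the seminar is needed.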
LiteratureE. D. Kolaczyk and G. Csárdi. Statistical analysis of network data with R. Springer, Cham, Switzerland, second edition, 2020.

Tianxi Li, Elizaveta Levina, and Ji Zhu. Network cross-validation by edge sampling, 2020. Preprint arXiv:1612.04717.

Tianxi Li, Elizaveta Levina, and Ji Zhu. Community models for partially observed networks from surveys, 2020. Preprint arXiv:2008.03652.

Tianxi Li, Elizaveta Levina, and Ji Zhu. Prediction Models for Network-Linked Data, 2018. Preprint arXiv:1602.01192.
Prerequisites / NoticeEvery class will consist of an oral presentation by a pair of students highlighting key ideas of selected book chapters. Another two students will be responsible for asking questions during the presentation and for providing a discussion of the presented concepts and ideas, including pros and cons, at the end. Finally, an additional two students are responsible for evaluating the quality of the presentations/discussions and for providing constructive feedback for improvement.
401-3620-20LStudent Seminar in Statistics: Inference in Non-Classical Regression Models Restricted registration - show details
Does not take place this semester.
Number of participants limited to 24.
Mainly for students from the Mathematics Bachelor and Master Programmes who, in addition to the introductory course unit 401-2604-00L Probability and Statistics, have attended at least one core or elective course in statistics. Also offered in the Master's Programmes in Statistics and Data Science.
W4 credits2SF. Balabdaoui
AbstractReview of some non-standard regression models and the statistical properties of estimation methods in such models.
ObjectiveThe main goal is for the students to discover some lesser-known regression models which either generalize the well-known linear model (for example, monotone regression) or violate some of its most fundamental assumptions (as in shuffled or unlinked regression models).
ContentLinear regression is one of the most used models for prediction and hence one of the best understood in the statistical literature. However, linearity might be too simplistic to capture the actual relationship between some response and given covariates. Also, there are many real data problems where linearity is plausible but the actual pairing between the observed covariates and responses is completely or at least partially lost. In this seminar, we review some of the non-classical regression models and the statistical properties of the estimation methods considered by well-known statisticians and machine learners. This will encompass:
1. Monotone regression
2. Single index model
3. Unlinked regression
4. Partially unlinked regression
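For orientation on item 1: monotone (isotonic) least-squares regression has a classical linear-time fitting algorithm, pool-adjacent-violators (PAVA). A rough sketch in plain Python (illustrative only, not part of the seminar material):

```python
def isotonic_regression(y):
    """Pool-adjacent-violators algorithm (PAVA) for isotonic regression.

    Returns the non-decreasing fit minimising sum((y_i - f_i)^2).
    """
    blocks = []  # each block is [sum, count]; its fitted value is sum/count
    for v in y:
        blocks.append([v, 1])
        # merge backwards while adjacent block means violate monotonicity
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fit = []
    for s, c in blocks:
        fit.extend([s / c] * c)
    return fit
```

Each merged block is replaced by its mean, so decreasing stretches of the data are flattened into constant pieces, which is the shape-restricted analogue of ordinary least squares studied in references 1 and 2 below.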
Lecture notesNo script is necessary for this seminar
LiteratureThe following material will be read and studied by each pair of students (all the items listed below are available through the ETH electronic library or arXiv):

1. Chapter 2 from the book "Nonparametric estimation under shape constraints" by P. Groeneboom and G. Jongbloed, 2014, Cambridge University Press

2. "Nonparametric shape-restricted regression" by A. Guntuboyina and B. Sen, 2018, Statistical Science, Volume 33, 568-594

3. "Asymptotic distributions for two estimators of the single index model" by Y. Xia, 2006, Econometric Theory, Volume 22, 1112-1137

4. "Least squares estimation in the monotone single index model" by F. Balabdaoui, C. Durot and H. K. Jankowski, Bernoulli, 2019, Volume 25(4B), 3276-3310

5. "Least angle regression" by B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani, 2004, Annals of Statistics, Volume 32, 407-499.

6. "Sharp thresholds for high dimensional and noisy sparsity recovery using l1-constrained quadratic programming (Lasso)" by M. Wainwright, 2009, IEEE Transactions on Information Theory, Volume 55, 1-19

7. "Denoising linear models with permuted data" by A. Pananjady, M. Wainwright and T. A. Courtade, 2017, IEEE International Symposium on Information Theory, 446-450.

8. "Linear regression with shuffled data: statistical and computational limits of permutation recovery" by A. Pananjady, M. Wainwright and T. A. Courtade, 2018, IEEE Transactions on Information Theory, Volume 64, 3286-3300

9. "Linear regression without correspondence" by D. Hsu, K. Shi and X. Sun, 2017, NIPS

10. "A pseudo-likelihood approach to linear regression with partially shuffled data" by M. Slawski, G. Diao, E. Ben-David, 2019, arXiv.

11. "Uncoupled isotonic regression via minimum Wasserstein deconvolution" by P. Rigollet and J. Weed, 2019, Information and Inference, Volume 00, 1-27
401-4620-00LStatistics Lab Restricted registration - show details
Number of participants limited to 27.
W6 credits2SM. Kalisch, M. H. Maathuis, M. Mächler, L. Meier, N. Meinshausen
Abstract"Statistics Lab" is an Applied Statistics Workshop in Data Analysis. It
provides a learning environment in a realistic setting.

Students lead a regular consulting session at the Seminar für Statistik
(SfS). After the session, the statistical data analysis is carried out and
a written report and results are presented to the client. The project is
also presented in the course's seminar.
Objective- gain initial experience in the consultancy process
- carry out a consultancy session and produce a report
- apply theoretical knowledge to an applied problem

After the course, students will have practical knowledge about statistical
consulting. They will have determined the scientific problem and its
context, enquired about the design of the experiment or data collection, and
selected the appropriate methods to tackle the problem. They will have
deepened their statistical knowledge and applied their theoretical
knowledge to the problem. They will have gathered experience in explaining
the relevant mathematical and software issues to a client. They will have
performed a statistical analysis using R (or SPSS), and they will have improved
their skills in writing a report and presenting statistical issues in a talk.
ContentStudents participate in consulting meetings at the SfS. Several consulting
dates are available for student participation. These are arranged
individually.

- During the first meeting the student mainly observes and participates in
the discussion. During the second meeting (with a different client), the
student leads the meeting. A member of the consulting team oversees
(and contributes to) the meeting.

- After the meeting, the student performs the recommended analysis, produces
a report and presents the results to the client.

- Finally, the student presents the case in a talk in the weekly course
seminar. All students are required to attend the seminar regularly.
Lecture notesn/a
LiteratureThe required literature will depend on the specific statistical problem
under investigation. Some introductory material can be found below.
Prerequisites / NoticePrerequisites:
Sound knowledge in basic statistical methods, especially regression and, if
possible, analysis of variance. Basic experience in Data Analysis with R.
401-3630-04LSemester Paper Restricted registration - show details
Successful participation in the course unit 401-2000-00L Scientific Works in Mathematics is required.
For more information, see Link
W4 credits6ASupervisors
AbstractSemester papers serve to delve into a problem in statistics and to study it with the appropriate methods or to compile and clearly exhibit a case study of a statistical evaluation.
Objective
401-3630-06LSemester Paper Restricted registration - show details
Successful participation in the course unit 401-2000-00L Scientific Works in Mathematics is required.
For more information, see Link
W6 credits9ASupervisors
AbstractSemester papers serve to delve into a problem in statistics and to study it with the appropriate methods or to compile and clearly exhibit a case study of a statistical evaluation.
Objective
363-1100-00LRisk Case Study Challenge Restricted registration - show details
Does not take place this semester.
W3 credits2S
AbstractThis seminar provides master students at ETH with the challenging opportunity to work on a real risk-modelling and risk-management case in close collaboration with a Risk Center Corporate Partner. The Corporate Partner for the Spring 2021 Edition will be announced soon.
ObjectiveDuring the challenge students acquire a basic understanding of
o The insurance and reinsurance business
o Risk management and risk modelling
o The role of operational risk management

as well as learn to frame a real risk-related business case together with a case manager from the Corporate Partner. Students learn to coordinate as a group. They also learn to integrate business insights in order to develop a solution for their case. Finally, students communicate their solution to an assembly of professionals from the Corporate Partner.
ContentStudents work on a real-world, risk-related case. The case is based on a business-relevant topic. Topics are provided by experts from the Risk Center's Corporate Partners. While gaining substantial insights into the industry's risk modeling and management, students explore the case or problem on their own. They work in teams and develop solutions. The cases allow students to use logical problem-solving skills with an emphasis on evidence and application. Cases offer students the opportunity to apply their scientific knowledge. Typically, the risk-related cases can be complex, contain ambiguities, and may be addressed in more than one way. During the seminar, students visit the Corporate Partner’s headquarters, conduct interviews with members of the management team as well as internal and external experts, and finally present their results in a professional manner.
Prerequisites / NoticePlease apply for this course via the official website (Link). Apply no later than February 20, 2021.
The number of participants is limited to 16.
GESS Science in Perspective
Two credits are needed from the "Science in Perspective" programme; language courses are excluded if three credits from language courses have already been recognised for the Bachelor's degree.
see Link (Eight credits must be acquired in this category: normally six during the Bachelor’s degree programme, and two during the Master’s degree programme. A maximum of three credits from language courses from the range of the Language Center of the University of Zurich and ETH Zurich may be recognised. In addition, only advanced courses (level B2 upwards) in the European languages English, French, Italian and Spanish are recognised. German language courses are recognised from level C2 upwards.)
» see Science in Perspective: Type A: Enhancement of Reflection Capability
» Recommended Science in Perspective (Type B) for D-MATH
» see Science in Perspective: Language Courses ETH/UZH
Master's Thesis
NumberTitleTypeECTSHoursLecturers
401-2000-00LScientific Works in Mathematics
Target audience:
Third year Bachelor students;
Master students who cannot document that they have received adequate training in working scientifically.
O0 creditsM. Burger
AbstractIntroduction to scientific writing for students, with a focus on publication standards and ethical issues, especially in the case of citations (references to works of others).
ObjectiveLearn the basic standards of scientific works in mathematics.
Content- Types of mathematical works
- Publication standards in pure and applied mathematics
- Data handling
- Ethical issues
- Citation guidelines
Lecture notesMoodle of the Mathematics Library: Link
Prerequisites / NoticeDirective Link
401-2000-01LLunch Sessions – Thesis Basics for Mathematics Students
Details and registration for the optional MathBib training course: Link
Z0 creditsSpeakers
AbstractOptional course "Recherchieren in der Mathematik" (held in German) by the Mathematics Library.
Objective
401-4990-02LMaster's Thesis Restricted registration - show details
Only students who fulfil the following criteria are allowed to begin with their Master's thesis:
a. successful completion of the Bachelor's programme;
b. fulfilling of any additional requirements necessary to gain admission to the Master's programme;
c. acquisition of at least 16 credits in the category “Core courses” (Programme Regulations 2014) or 40 credits in the category “Main Areas” (Programme Regulations 2020).

Successful participation in the course unit 401-2000-00L Scientific Works in Mathematics is required.
For more information, see Link
O30 credits57DSupervisors
AbstractThe Master's thesis concludes the study programme. Thesis work should prove the students' ability to work independently, in a structured and scientific manner.
ObjectiveWith the Master's thesis, which concludes the degree programme, students demonstrate their ability to work independently and in a structured, scientific manner.
Course Units for Additional Admission Requirements
The courses below are only available for MSc students with additional admission requirements.
NumberTitleTypeECTSHoursLecturers
406-0173-AALLinear Algebra I and II
Enrolment ONLY for MSc students with a decree declaring this course unit as an additional admission requirement.

Any other students (e.g. incoming exchange students, doctoral students) CANNOT enrol for this course unit.
E-6 credits13RN. Hungerbühler
AbstractLinear algebra is an indispensable tool of engineering mathematics. The course is an introduction to basic methods and fundamental concepts of linear algebra and its applications to engineering sciences.
ObjectiveAfter completion of this course, students are able to recognize linear structures and to apply adequate tools from linear algebra in order to solve corresponding problems from theory and applications. In addition, students have a basic knowledge of the software package Matlab.
ContentSystems of linear equations, Gaussian elimination, solution space, matrices, LR decomposition, determinants, structure of linear spaces, normed vector spaces, inner products, method of least squares, QR decomposition, introduction to MATLAB, applications.
Linear maps, kernel and image, coordinates and matrices, coordinate transformations, norm of a matrix, orthogonal matrices, eigenvalues and eigenvectors, algebraic and geometric multiplicity, eigenbasis, diagonalizable matrices, symmetric matrices, orthonormal basis, condition number, linear differential equations, Jordan decomposition, singular value decomposition, examples in MATLAB, applications.
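The first topics in the syllabus, Gaussian elimination with back substitution, can be sketched as follows (a hedged toy implementation in Python rather than the MATLAB used in the course; in practice one would call MATLAB's backslash operator or a library solver):

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting.

    A: list of n rows of n floats; b: list of n floats (both left unmodified).
    """
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix [A | b]
    for col in range(n):
        # partial pivoting: bring the largest entry into the pivot row
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]  # eliminate below the pivot
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution on the triangular system
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x
```

Partial pivoting keeps the elimination numerically stable, which connects to the condition-number topic listed above.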

Reading:

Gilbert Strang "Introduction to linear algebra", Wellesley-Cambridge Press: Chapters 1-6, 7.1-7.3, 8.1, 8.2, 8.6

A Practical Introduction to MATLAB: Link

Matlab Primer: Link
Literature- Gilbert Strang: Introduction to linear algebra. Wellesley-Cambridge Press

- A Practical Introduction to MATLAB: Link

- Matlab Primer: Link

- K. Nipp / D. Stoffer, Lineare Algebra, vdf Hochschulverlag, 5. Auflage 2002

- K. Meyberg / P. Vachenauer, Höhere Mathematik 1, Springer 2003
406-0243-AALAnalysis I and II Information
Enrolment ONLY for MSc students with a decree declaring this course unit as an additional admission requirement.

Any other students (e.g. incoming exchange students, doctoral students) CANNOT enrol for this course unit.
E-14 credits30RM. Akveld
AbstractMathematical tools for the engineer
ObjectiveMathematics as a tool to solve engineering problems. Mathematical formulation of technical and scientific problems. Basic mathematical knowledge for engineers.
ContentShort introduction to mathematical logic.
Complex numbers.
Calculus for functions of one variable with applications.
Simple types of ordinary differential equations.
Simple mathematical models in engineering.

Multi variable calculus: gradient, directional derivative, chain rule, Taylor expansion. Multiple integrals: coordinate transformations, path integrals, integrals over surfaces, divergence theorem, applications in physics.
LiteratureTextbooks in English:
- J. Stewart: Calculus, Cengage Learning, 2009, ISBN 978-0-538-73365-6
- J. Stewart: Multivariable Calculus, Thomson Brooks/Cole (e.g. Appendix G on complex numbers)
- V. I. Smirnov: A course of higher mathematics. Vol. II. Advanced calculus
- W. L. Briggs, L. Cochran: Calculus: Early Transcendentals: International Edition, Pearson Education
Textbooks in German:
- M. Akveld, R. Sperb: Analysis I, vdf
- M. Akveld, R. Sperb: Analysis II, vdf
- L. Papula: Mathematik für Ingenieure und Naturwissenschaftler, Vieweg Verlag
- L. Papula: Mathematik für Ingenieure 2, Vieweg Verlag
406-0603-AALStochastics (Probability and Statistics)
Enrolment ONLY for MSc students with a decree declaring this course unit as an additional admission requirement.

Any other students (e.g. incoming exchange students, doctoral students) CANNOT enrol for this course unit.
E-4 credits9RM. Kalisch
AbstractIntroduction to basic methods and fundamental concepts of statistics and
probability theory for non-mathematicians. The concepts are presented on
the basis of some descriptive examples. The course will be based on the
book "Statistics for research" by S. Dowdy et al. and on the
book "Introductory Statistics with R" by P. Dalgaard.
ObjectiveThe objective of this course is to build a solid fundament in probability
and statistics. The student should understand some fundamental concepts and
be able to apply these concepts to applications in the real
world. Furthermore, the student should have a basic knowledge of the
statistical programming language "R". The main topics of the course are:
- Introduction to probability
- Common distributions
- Binomial test
- z-test, t-test
- Regression
ContentFrom "Statistics for research":
Ch 1: The Role of Statistics
Ch 2: Populations, Samples, and Probability Distributions
Ch 3: Binomial Distributions
Ch 6: Sampling Distribution of Averages
Ch 7: Normal Distributions
Ch 8: Student's t Distribution
Ch 9: Distributions of Two Variables [Regression]

From "Introductory Statistics with R":
Ch 1: Basics
Ch 2: Probability and distributions
Ch 3: Descriptive statistics and tables
Ch 4: One- and two-sample tests
Ch 5: Regression and correlation
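The one- and two-sample tests of Ch 4 boil down to a t statistic. As a hedged sketch (in Python rather than the course's R; the function name is invented):

```python
from statistics import mean, stdev

def one_sample_t(x, mu0):
    """t statistic and degrees of freedom for H0: population mean == mu0."""
    n = len(x)
    # stdev is the sample standard deviation (divisor n - 1)
    t = (mean(x) - mu0) / (stdev(x) / n ** 0.5)
    return t, n - 1
```

In R the same test is run with t.test(x, mu = mu0); the statistic is then compared to Student's t distribution with n - 1 degrees of freedom (Ch 8).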
Literature"Statistics for research" by S. Dowdy et al. (3rd
edition); Print ISBN: 9780471267355; Online ISBN: 9780471477433; DOI:
10.1002/0471477435;
From within the ETH, this book is freely available online at:
Link

"Introductory Statistics with R" by Peter Dalgaard; ISBN
978-0-387-79053-4; DOI: 10.1007/978-0-387-79054-1
From within the ETH, this book is freely available online at:
Link
406-2604-AALProbability and Statistics
Enrolment ONLY for MSc students with a decree declaring this course unit as an additional admission requirement.

Any other students (e.g. incoming exchange students, doctoral students) CANNOT enrol for this course unit.
E-7 credits15RJ. Teichmann
Abstract- Statistical models
- Method of moments
- Maximum likelihood estimation
- Hypothesis testing
- Confidence intervals
- Introductory Bayesian statistics
- Linear regression model
- Rudiments of high-dimensional statistics
ObjectiveThe goal of this part of the course is to provide a solid introduction to statistics. It offers a wide overview of the main tools used in statistical inference. The course will start with an introduction to statistical models and end with some notions of high-dimensional statistics. Some time will be spent on proving certain important results. Tools from probability and measure theory are assumed to be known and will only occasionally be recalled.
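For a flavour of the maximum-likelihood topic listed in the abstract: for an i.i.d. exponential sample, the log-likelihood l(lam) = n log(lam) - lam * sum(x_i) is maximised at lam = n / sum(x_i), the reciprocal of the sample mean (an illustrative sketch, not lecture material):

```python
import math

def exp_mle(x):
    """MLE of the rate of an i.i.d. exponential sample.

    Setting the derivative n/lam - sum(x) of the log-likelihood
    to zero gives lam = n / sum(x), i.e. 1 / sample mean.
    """
    return len(x) / sum(x)

def log_lik(lam, x):
    """Exponential log-likelihood l(lam) = n log(lam) - lam * sum(x)."""
    return len(x) * math.log(lam) - lam * sum(x)
```

Checking that the log-likelihood is indeed largest at the closed-form estimate is a useful sanity check when moving on to models without closed-form maximisers.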
Lecture notesScript of Prof. Dr. S. van de Geer
LiteratureThese references can be used as complementary sources:

G. Casella and R. Berger, Statistical Inference
J. A. Rice, Mathematical Statistics and Data Analysis
L. Wasserman, All of Statistics