Search result: Catalogue data in Spring Semester 2020

Data Science Master
Interdisciplinary Electives
Number | Title | Type | ECTS | Hours | Lecturers
851-0739-01L | Sequencing Legal DNA: NLP for Law and Political Economy
Particularly suitable for students of D-INFK, D-ITET, D-MTEC
W | 3 credits | 2V | E. Ash
Abstract: This course explores the application of natural language processing techniques to texts in law, politics, and the news media. Students will put these tools to work in a course project.
Objective: Law is embedded in language. An essential task for a judge, therefore, is reading legal texts to interpret case facts and apply legal rules. Can an artificial intelligence learn to do these tasks? The recent and ongoing breakthroughs in natural language processing (NLP) hint at this possibility.

Meanwhile, a vast and growing corpus of legal documents is being digitized and put online for use by the public. No single human could hope to read all of them, yet many of these documents remain untouched by NLP techniques. This course invites students to participate in these new explorations applying NLP to the law -- that is, sequencing legal DNA.
Content: NLP technologies have the potential to assist judges in their decisions by making them more efficient and consistent. On the other hand, legal language choices -- as in legal choices more generally -- could be biased toward some groups, and automated systems could entrench those biases. We will explore, critique, and integrate the emerging set of tools for debiasing language models and think carefully about how notions of fairness should be applied in this domain.

More generally, we will explore the use of NLP for social science research, not just in the law but also in politics, the economy, and culture. In a semester paper, students (individually or in groups) will conceive and implement their own research project applying natural language tools to legal or political texts.
Prerequisites / Notice: Some programming experience in Python is required, and some experience with NLP is highly recommended.
851-0739-02L | Sequencing Legal DNA: NLP for Law and Political Economy (Course Project)
This is the optional course project for "Building a Robot Judge: Data Science for the Law."

Please register only if attending the lecture course or with consent of the instructor.

Some programming experience in Python is required, and some experience with text mining is highly recommended.
W | 2 credits | 2V | E. Ash
Abstract: This companion course offers extra credit for a more substantial project in the course "Sequencing Legal DNA: NLP for Law and Political Economy".
Objective
851-0740-00L | Big Data, Law, and Policy (restricted registration)
Number of participants limited to 35

Students will be informed by 1.3.2020 at the latest.
W | 3 credits | 2S | S. Bechtold
Abstract: This course introduces students to societal perspectives on the big data revolution. Discussing important contributions from machine learning and data science, the course explores their legal, economic, ethical, and political implications in the past, present, and future.
Objective: This course is intended both for students of machine learning and data science who want to reflect on the societal implications of their field, and for students from other disciplines who want to explore the societal impact of data sciences. The course will first discuss some of the methodological foundations of machine learning, followed by a discussion of research papers and real-world applications where big data and societal values may clash. Potential topics include the implications of big data for privacy, liability, insurance, health systems, voting, and democratic institutions, as well as the use of predictive algorithms for price discrimination and the criminal justice system. Guest speakers, weekly readings and reaction papers ensure a lively debate among participants from various backgrounds.
363-1100-00L | Risk Case Study Challenge (restricted registration)
Does not take place this semester.
W | 3 credits | 2S | A. Bommier, S. Feuerriegel
Abstract: This seminar gives Master's students at ETH the challenging opportunity to work on a real risk modelling and risk management case in close collaboration with a Risk Center partner company. For the Spring 2019 edition, the partner will be Zurich Insurance Group.
Objective: Students work on a real risk-related case on a business-relevant topic provided by experts from Risk Center partners. While gaining substantial insights into risk modelling and management in the industry, students explore the case or problem on their own, working in teams, and develop possible solutions. The cases allow students to use logical problem-solving skills with emphasis on evidence and application, and involve the integration of scientific knowledge. Typically, the risk-related cases are complex, involve ambiguities, and may be addressed in more than one way. During the seminar, students visit the partners’ headquarters, conduct interviews with members of the management team as well as internal and external experts, and present their results.
Content: Get a basic understanding of:
- The insurance and reinsurance business
- Risk management and risk modelling
- The role of operational risk management

Get in contact with industry experts and conduct interviews on the topic.

Conduct a small empirical study and present findings to the company
Prerequisites / Notice: Please apply for this course via the official website. Apply no later than February 15, 2019.
The number of participants is limited to 14.
Seminar
Number | Title | Type | ECTS | Hours | Lecturers
261-5113-00L | Computational Challenges in Medical Genomics (restricted registration)
Number of participants limited to 20.
W | 2 credits | 2S | A. Kahles, G. Rätsch
Abstract: This seminar discusses recent relevant contributions to the fields of computational genomics, algorithmic bioinformatics, statistical genetics and related areas. Each participant will hold a presentation and lead the subsequent discussion.
Objective: Preparing and holding a scientific presentation in front of peers is a central part of working in the scientific domain. In this seminar, the participants will learn how to efficiently summarize the relevant parts of a scientific publication, critically reflect on its contents, and summarize it for presentation to an audience. The skills needed to successfully present the key points of existing research work are the same as those needed to communicate one's own research ideas.
In addition to holding a presentation, each student will both contribute to and lead a discussion section on the topics presented in the class.
Content: The topics covered in the seminar are related to recent computational challenges that arise from the fields of genomics and biomedicine, including but not limited to genomic variant interpretation, genomic sequence analysis, compressive genomics tasks, single-cell approaches, privacy considerations, and statistical frameworks.
The selected papers include both recently published works contributing novel ideas to the areas mentioned above and seminal contributions from the past.
Prerequisites / Notice: Knowledge of algorithms and data structures and interest in applications in genomics and computational biomedicine.
263-3840-00L | Hardware Architectures for Machine Learning
The deadline for deregistering expires at the end of the second week of the semester. Students who are still registered after that date, but do not attend the seminar, will officially fail the seminar.
W | 2 credits | 2S | G. Alonso, T. Hoefler, C. Zhang
Abstract: The seminar covers recent results in the increasingly important field of hardware acceleration for data science and machine learning, both in dedicated machines and in data centers.
Objective: The seminar is aimed at students interested in the system aspects of machine learning who are willing to bridge the gap across traditional disciplines: machine learning, databases, systems, and computer architecture.
Content: The seminar is intended to cover recent results in the increasingly important field of hardware acceleration for data science and machine learning, both in dedicated machines and in data centers.
Prerequisites / Notice: The seminar should be of special interest to students intending to complete a master's thesis or a doctoral dissertation in related topics.
263-5225-00L | Advanced Topics in Machine Learning and Data Science (restricted registration)
Number of participants limited to 20.

The deadline for deregistering expires at the end of the fourth week of the semester. Students who are still registered after that date, but do not attend the seminar, will officially fail the seminar.
W | 2 credits | 2S | F. Perez Cruz
Abstract: In this seminar, recent papers from the machine learning and data science literature are presented and discussed. Possible topics cover statistical models, machine learning algorithms and their applications.
Objective: The seminar “Advanced Topics in Machine Learning and Data Science” familiarizes students with recent developments in machine learning and data science. Recently published articles, as well as influential papers, have to be presented and critically reviewed. The students will learn how to structure a scientific presentation, which covers the motivation, key ideas and main results of a scientific paper. An important goal of the seminar presentation is to summarize the essential ideas of the paper in sufficient depth for the audience to be able to follow its main conclusion, especially why the article is (or is not) worth attention. The presentation style will play an important role and should reach the level of professional scientific presentations.
Content: The seminar will cover a number of recent papers which have emerged as important contributions to the machine learning and data science literature. The topics will vary from year to year but they are centered on methodological issues in machine learning and its application, not only to text or images, but also to other scientific domains such as medicine, climate or physics.
Literature: The papers will be presented in the first session of the seminar.
401-3620-20L | Student Seminar in Statistics: Inference in Non-Classical Regression Models (restricted registration)
Number of participants limited to 24.
Mainly for students from the Mathematics Bachelor and Master Programmes who, in addition to the introductory course unit 401-2604-00L Probability and Statistics, have taken at least one core or elective course in statistics. Also offered in the Master Programmes Statistics and Data Science.
W | 4 credits | 2S | F. Balabdaoui
Abstract: Review of some non-standard regression models and the statistical properties of estimation methods in such models.
Objective: The main goal is for students to discover some lesser-known regression models which either generalize the well-known linear model (for example monotone regression) or violate some of the most fundamental assumptions (as in shuffled or unlinked regression models).
Content: Linear regression is one of the most widely used models for prediction and hence one of the best understood in the statistical literature. However, linearity might be too simplistic to capture the actual relationship between a response and given covariates. Also, there are many real data problems where linearity is plausible but the actual pairing between the observed covariates and responses is completely or at least partially lost. In this seminar, we review some of the non-classical regression models and the statistical properties of the estimation methods considered by well-known statisticians and machine learning researchers. This will encompass:
1. Monotone regression
2. Single index model
3. Unlinked regression
4. Partially unlinked regression
Lecture notes: No script is necessary for this seminar.
Literature: The following material will be read and studied by each pair of students (all the items listed below are available through the ETH electronic library or arXiv):

1. Chapter 2 from the book "Nonparametric estimation under shape constraints" by P. Groeneboom and G. Jongbloed, 2014, Cambridge University Press

2. "Nonparametric shape-restricted regression" by A. Guntuboyina and B. Sen, 2018, Statistical Science, Volume 33, 568-594

3. "Asymptotic distributions for two estimators of the single index model" by Y. Xia, 2006, Econometric Theory, Volume 22, 1112-1137

4. "Least squares estimation in the monotone single index model" by F. Balabdaoui, C. Durot and H. K. Jankowski, 2019, Bernoulli, Volume 25(4B), 3276-3310

5. "Least angle regression" by B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani, 2004, Annals of Statistics, Volume 32, 407-499

6. "Sharp thresholds for high dimensional and noisy sparsity recovery using l1-constrained quadratic programming (Lasso)" by M. Wainwright, 2009, IEEE Transactions on Information Theory, Volume 55, 1-19

7. "Denoising linear models with permuted data" by A. Pananjady, M. Wainwright and T. A. Courtade, 2017, IEEE International Symposium on Information Theory, 446-450

8. "Linear regression with shuffled data: statistical and computational limits of permutation recovery" by A. Pananjady, M. Wainwright and T. A. Courtade, 2018, IEEE Transactions on Information Theory, Volume 64, 3286-3300

9. "Linear regression without correspondence" by D. Hsu, K. Shi and X. Sun, 2017, NIPS

10. "A pseudo-likelihood approach to linear regression with partially shuffled data" by M. Slawski, G. Diao, E. Ben-David, 2019, arXiv.

11. "Uncoupled isotonic regression via minimum Wasserstein deconvolution" by P. Rigollet and J. Weed, 2019, Information and Inference, Volume 00, 1-27
GESS Science in Perspective
Number | Title | Type | ECTS | Hours | Lecturers
851-0740-00L | Big Data, Law, and Policy (restricted registration)
Number of participants limited to 35

Students will be informed by 1.3.2020 at the latest.
W | 3 credits | 2S | S. Bechtold
Abstract: This course introduces students to societal perspectives on the big data revolution. Discussing important contributions from machine learning and data science, the course explores their legal, economic, ethical, and political implications in the past, present, and future.
Objective: This course is intended both for students of machine learning and data science who want to reflect on the societal implications of their field, and for students from other disciplines who want to explore the societal impact of data sciences. The course will first discuss some of the methodological foundations of machine learning, followed by a discussion of research papers and real-world applications where big data and societal values may clash. Potential topics include the implications of big data for privacy, liability, insurance, health systems, voting, and democratic institutions, as well as the use of predictive algorithms for price discrimination and the criminal justice system. Guest speakers, weekly readings and reaction papers ensure a lively debate among participants from various backgrounds.
» see Science in Perspective: Type A: Enhancement of Reflection Capability
» Recommended Science in Perspective (Type B) for D-INFK
» see Science in Perspective: Language Courses ETH/UZH
Master's Thesis
Number | Title | Type | ECTS | Hours | Lecturers
261-0800-00L | Master's Thesis
The minimal prerequisites for the Master’s thesis registration are:
- Completed Bachelor’s program
- All additional requirements completed (additional requirements, if any, are listed in the admission decree)
- Minimum degree requirements fulfilled in the course categories Data Analysis and Data Management, and 50 credits overall obtained in the course category Core Courses
- Data Science Lab (14 credits) completed
O | 30 credits | 64D | Professors
Abstract: The Master's thesis concludes the study program and demonstrates the students' ability to use the knowledge and skills acquired during Master’s studies to solve a complex data science problem.
Objective: To work independently and to produce a scientifically structured work.