Valentina Boeva: Catalogue data in Spring Semester 2023

Name: Prof. Dr. Valentina Boeva
Field: Biomedical Informatics
Address:
Professur für Biomedizininformatik
ETH Zürich, CAB G 32.2
Universitätstrasse 6
8092 Zürich
SWITZERLAND
Telephone: +41 44 633 66 87
E-mail: valentina.boeva@inf.ethz.ch
Department: Computer Science
Relationship: Assistant Professor (Tenure Track)

Number | Title | ECTS | Hours | Lecturers
252-0868-00L | Data Science for Medicine (restricted registration) | 4 credits | 4V | J. Vogt, V. Boeva, M. Kuznetsova
Abstract: Machine learning (ML) methods have been shown to have a profound impact on medical applications, where the great variety of tasks and data types lets us benefit from ML algorithms in many different ways. In this course we will review the most relevant methods and applications of ML in medicine, and work on practical projects to solve medical problems with the help of ML.
Objective: The course will start with a general introduction to ML, covering supervised and unsupervised learning techniques such as classification and regression models, feature selection and data preprocessing, clustering, and dimensionality reduction. After introducing these basic methodologies, we will continue with the most relevant applications of ML in medicine, such as working with time series, medical notes, and medical images.
Content: Over the last few years, we have observed a rapid growth of machine learning (ML) in medicine. ML methods have been shown to have a profound impact on medical applications, where the great variety of tasks and data types lets us benefit from ML algorithms in many different ways. In this course we will review the most relevant methods and applications of ML in medicine, discuss the main challenges they present and their current technical solutions, and work on practical projects to solve medical problems with the help of ML.
Prerequisites / Notice: Prerequisite: attendance/exam of 252-0866-00 Digital Medicine I
252-0945-16L | Doctoral Seminar Machine Learning (FS23) | 2 credits | 1S | N. He, V. Boeva, J. M. Buhmann, R. Cotterell, T. Hofmann, A. Krause, M. Sachan, J. Vogt, F. Yang
Only for Computer Science Ph.D. students.

This doctoral seminar is intended for PhD students affiliated with the Institute for Machine Learning. Other PhD students who work on machine learning projects or related topics need approval from at least one of the organizers to register for the seminar.
Abstract: An essential aspect of any research project is the dissemination of its findings. Here we focus on oral communication, which includes appropriate selection of material, preparation of visual aids (slides and/or posters), and presentation skills.
Objective: Seminar participants should learn how to prepare and deliver scientific talks and how to handle technical questions. Participants are also expected to contribute actively to discussions during others' presentations, thereby learning and practicing critical thinking skills.
Prerequisites / Notice: This doctoral seminar of the Machine Learning Laboratory of ETH is intended for PhD students who work on a machine learning project, i.e., for the PhD students of the ML lab.
261-5120-00L | Machine Learning for Health Care (restricted registration) | 5 credits | 2V + 2A | V. Boeva, J. Vogt, M. Kuznetsova
Abstract: The course will review the most relevant methods and applications of machine learning in biomedicine, and discuss the main challenges they present and their current technical solutions.
Objective: In recent years, we have observed a rapid growth in the field of machine learning (ML), mainly due to improvements in ML algorithms, increased data availability, and reduced computing costs. This growth is having a profound impact on biomedical applications, where the great variety of tasks and data types lets us benefit from ML algorithms in many different ways. In this course we will review the most relevant methods and applications of ML in biomedicine, and discuss the main challenges they present and their current technical solutions.
Content: The course will consist of four topic clusters covering the most relevant applications of ML in biomedicine:
1) Structured time series: Time series of structured data often appear in biomedical datasets and present challenges such as variables with different periodicities, conditioning on static data, etc.
2) Medical notes: A vast amount of medical observations is stored as free text. We will analyze strategies for extracting knowledge from it.
3) Medical images: Images are a fundamental source of information in many medical disciplines. We will study how to train ML algorithms on them.
4) Genomics data: ML in genomics is still an emerging subfield, but given that genomics data are arguably the most extensive and complex datasets in biomedicine, many relevant ML applications are expected to arise in the near future. We will review and discuss current applications and challenges.
Prerequisites / Notice: Data Structures & Algorithms, Introduction to Machine Learning, Statistics/Probability, Programming in Python, Unix Command Line

Relation to Course 261-5100-00 Computational Biomedicine: This course is a continuation of the previous course with new topics related to medical data and machine learning. The format of Computational Biomedicine II will also be different. It is helpful but not essential to attend Computational Biomedicine before attending Computational Biomedicine II.
263-3300-00L | Data Science Lab (restricted registration) | 14 credits | 9P | A. Ilic, V. Boeva, R. Cotterell, J. Vogt, F. Yang
Abstract: In this class, we bring together data science applications provided by ETH researchers outside computer science and teams of computer science master's students. Two to three students will form a team working on data science/machine learning-related research topics provided by scientists in a diverse range of domains such as astronomy, biology, social sciences, etc.
Objective: The goal of this class is for students to gain experience in dealing with data science and machine learning applications "in the wild". Students are expected to go through the full process, starting from data cleaning, modeling, execution, debugging, error analysis, and quality/performance refinement.
Prerequisites / Notice: Prerequisites: At least 8 credits (KP) must have been obtained under Data Analysis and at least 8 credits (KP) must have been obtained under Data Management and Processing.
263-5351-00L | Machine Learning for Genomics (restricted registration) | 5 credits | 2V + 1U + 1A | V. Boeva
The deadline for deregistering expires at the end of the third week of the semester. Students who are still registered after that date, but do not provide project work, do not participate in paper presentation sessions and/or do not show up for the exam, will officially fail the course.
Abstract: The course reviews solutions that machine learning provides to the most challenging questions in human genomics.
Objective: Over the last few years, the parallel development of machine learning methods and molecular profiling technologies for human cells, such as sequencing, has created an extremely powerful tool for gaining insight into cellular mechanisms in healthy and diseased contexts. In this course, we will discuss state-of-the-art machine learning methodology that solves, or attempts to solve, common problems in human genomics. At the end of the course, you will be familiar with (1) classical and advanced machine learning architectures used in genomics, (2) bioinformatics analysis of human genomic and transcriptomic data, and (3) data types used in this field.
Content:
- Short introduction to major concepts of molecular biology: DNA, genes, genome, central dogma, transcription factors, epigenetic code, DNA methylation, signaling pathways
- Prediction of transcription factor binding sites, open chromatin, histone marks, promoters, nucleosome positioning (convolutional neural networks, position weight matrices)
- Prediction of variant effects and gene expression (hidden Markov models, topic models)
- Deconvolution of mixed signal
- DNA, RNA and protein folding (RNN, LSTM, transformers)
- Data imputation for single cell RNA-seq data, clustering and annotation (diffusion and methods on graphs)
- Batch correction (autoencoders, optimal transport)
- Survival analysis (Cox proportional hazard model, regularization penalties, multi-omics, multi-tasking)
Prerequisites / Notice: Introduction to Machine Learning, Statistics/Probability, Programming in Python, Unix Command Line