Valentina Boeva: Catalogue data in Spring Semester 2022
|Prof. Dr. Valentina Boeva
Professur für Biomedizininformatik
ETH Zürich, CAB G 32.2
|+41 44 633 66 87
|Assistant Professor (Tenure Track)
|Data Science for Medicine
Only for Human Medicine BSc
|J. Vogt, V. Boeva, G. Rätsch
|Machine Learning (ML) methods have been shown to have a profound impact on medical applications, where the great variety of tasks and data types lets us benefit from ML algorithms in many different ways. In this course we will review the most relevant methods and applications of ML in medicine, and work on practical projects to solve medical problems with the help of ML.
|The course will start with a general introduction to ML, covering supervised and unsupervised learning techniques such as classification and regression models, feature selection and data preprocessing, clustering, and dimensionality reduction. After introducing these basic methodologies, we will continue with the most relevant applications of ML in medicine, such as working with time series, medical notes and medical images.
|Over the last few years, we have observed a rapid growth of Machine Learning (ML) in medicine. ML methods have been shown to have a profound impact on medical applications, where the great variety of tasks and data types lets us benefit from ML algorithms in many different ways. In this course we will review the most relevant methods and applications of ML in medicine, discuss the main challenges they present and their current technical solutions, and work on practical projects to solve medical problems with the help of ML.
|Prerequisites / Notice
Attendance/exam of 252-0866-00 Digital Medicine I
|Machine Learning for Health Care
Number of participants limited to 150.
|2V + 2A
|V. Boeva, G. Rätsch, J. Vogt
|The course will review the most relevant methods and applications of Machine Learning in Biomedicine, discuss the main challenges they present and their current technical solutions.
|Over the last few years, we have observed a rapid growth in the field of Machine Learning (ML), mainly due to improvements in ML algorithms, increased data availability and reduced computing costs. This growth is having a profound impact on biomedical applications, where the great variety of tasks and data types lets us benefit from ML algorithms in many different ways. In this course we will review the most relevant methods and applications of ML in biomedicine, discuss the main challenges they present and their current technical solutions.
|The course will consist of four topic clusters that will cover the most relevant applications of ML in Biomedicine:
1) Structured time series: Time series of structured data often appear in biomedical datasets. They present challenges such as variables with different periodicities and dependence on static data.
2) Medical notes: A vast number of medical observations are stored as free text. We will analyze strategies for extracting knowledge from them.
3) Medical images: Images are a fundamental piece of information in many medical disciplines. We will study how to train ML algorithms with them.
4) Genomics data: ML in genomics is still an emerging subfield, but given that genomics data are arguably the most extensive and complex datasets that can be found in biomedicine, it is expected that many relevant ML applications will arise in the near future. We will review and discuss current applications and challenges.
|Prerequisites / Notice
|Data Structures & Algorithms, Introduction to Machine Learning, Statistics/Probability, Programming in Python, Unix Command Line
Relation to Course 261-5100-00 Computational Biomedicine: This course is a continuation of the previous course with new topics related to medical data and machine learning. The format of Computational Biomedicine II will also be different. It is helpful but not essential to attend Computational Biomedicine before attending Computational Biomedicine II.
|Data Science Lab
Only for Data Science MSc.
|C. Zhang, V. Boeva, R. Cotterell, J. Vogt, F. Yang
|In this class, we bring together data science applications
provided by ETH researchers outside computer science and
teams of computer science master's students. Two to three
students will form a team working on data science/machine
learning-related research topics provided by scientists in
a diverse range of domains such as astronomy, biology,
social sciences etc.
|The goal of this class is for students to gain experience
in dealing with data science and machine learning applications
"in the wild". Students are expected to go through the full
process starting from data cleaning, modeling, execution,
debugging, error analysis, and quality/performance refinement.
|Prerequisites / Notice
|Prerequisites: At least 8 credits (KP) must have been obtained under Data Analysis and at least 8 credits (KP) under Data Management and Processing.
|Machine Learning for Genomics
The deadline for deregistering expires at the end of the second week of the semester. Students who are still registered after that date, but do not provide project work and/or do not show up for the exam, will officially fail the course.
Number of participants limited to 75.
|2V + 1U + 1A
|The course reviews solutions that machine learning provides to the most challenging questions in human genomics.
|Over the last few years, the parallel development of machine learning methods and molecular profiling technologies for human cells, such as sequencing, has created an extremely powerful tool for gaining insights into cellular mechanisms in healthy and diseased contexts. In this course, we will discuss state-of-the-art machine learning methodology that solves or attempts to solve common problems in human genomics. At the end of the course, you will be familiar with (1) classical and advanced machine learning architectures used in genomics, (2) bioinformatics analysis of human genomic and transcriptomic data, and (3) data types used in this field.
|- Short introduction to major concepts of molecular biology: DNA, genes, genome, central dogma, transcription factors, epigenetic code, DNA methylation, signaling pathways
- Prediction of transcription factor binding sites, open chromatin, histone marks, promoters, nucleosome positioning (convolutional neural networks, position weight matrices)
- Prediction of variant effects and gene expression (hidden Markov models, topic models)
- Deconvolution of mixed signals
- DNA, RNA and protein folding (RNN, LSTM, transformers)
- Data imputation for single cell RNA-seq data, clustering and annotation (diffusion and methods on graphs)
- Batch correction (autoencoders, optimal transport)
- Survival analysis (Cox proportional hazards model, regularization penalties, multi-omics, multi-tasking)
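To illustrate one technique named in the list above, position weight matrices for transcription factor binding site prediction, here is a minimal Python sketch. It is not course material: the aligned sites, the pseudocount, the uniform background frequency of 0.25 and all function names are assumptions chosen purely for illustration.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from aligned, equal-length sites.

    The pseudocount and the uniform background are illustrative assumptions.
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(BASES)
    pwm = []
    for pos in range(length):
        column = [site[pos] for site in sites]
        # Log2 ratio of the smoothed base frequency to the background frequency.
        pwm.append({
            base: math.log2((column.count(base) + pseudocount) / total / background)
            for base in BASES
        })
    return pwm


def score_window(pwm, window):
    """Sum the per-position log-odds scores for one window of the sequence."""
    return sum(column[base] for column, base in zip(pwm, window))


def best_hit(pwm, sequence):
    """Return (start, score) of the highest-scoring window in the sequence."""
    width = len(pwm)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the motif")
    scored = [
        (score_window(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    ]
    score, start = max(scored)
    return start, score


# Hypothetical aligned binding sites, made up for this example.
toy_sites = ["TATAAT", "TATAAT", "TACAAT", "TATGAT"]
pwm = build_pwm(toy_sites)
start, score = best_hit(pwm, "GGCGTATAATGCC")
print(start, round(score, 2))
```

Positive scores mark windows that resemble the motif more than the background does. A real analysis would typically also scan the reverse strand and calibrate a score threshold, both omitted here for brevity.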
|Prerequisites / Notice
|Introduction to Machine Learning, Statistics/Probability, Programming in Python, Unix Command Line; having taken Computational Biomedicine is highly recommended