263-5351-00L  Machine Learning for Genomics

Semester: Spring Semester 2022
Lecturers: V. Boeva
Periodicity: yearly recurring course
Language of instruction: English
Comment: The deadline for deregistering expires at the end of the second week of the semester. Students who are still registered after that date, but do not provide project work and/or do not show up for the exam, will officially fail the course.

Number of participants limited to 75.

Abstract: The course reviews solutions that machine learning provides to the most challenging questions in human genomics.
Objective: Over the last few years, the parallel development of machine learning methods and molecular profiling technologies for human cells, such as sequencing, has created a powerful tool for gaining insight into cellular mechanisms in healthy and diseased contexts. In this course, we will discuss state-of-the-art machine learning methods that solve, or attempt to solve, common problems in human genomics. At the end of the course, you will be familiar with (1) classical and advanced machine learning architectures used in genomics, (2) bioinformatics analysis of human genomic and transcriptomic data, and (3) data types used in this field.
Content:
- Short introduction to major concepts of molecular biology: DNA, genes, genome, central dogma, transcription factors, epigenetic code, DNA methylation, signaling pathways
- Prediction of transcription factor binding sites, open chromatin, histone marks, promoters, nucleosome positioning (convolutional neural networks, position weight matrices)
- Prediction of variant effects and gene expression (hidden Markov models, topic models)
- Deconvolution of mixed signals
- DNA, RNA and protein folding (RNN, LSTM, transformers)
- Data imputation for single-cell RNA-seq data, clustering and annotation (diffusion and graph-based methods)
- Batch correction (autoencoders, optimal transport)
- Survival analysis (Cox proportional hazards model, regularization penalties, multi-omics, multi-task learning)
Prerequisites / Notice: Introduction to Machine Learning, Statistics/Probability, Programming in Python, Unix Command Line. Having taken Computational Biomedicine is highly recommended.