227-0085-36L  Projects & Seminars: Genome Sequencing on Mobile Devices

Semester: Spring Semester 2021
Lecturers: M. H. K. Alser, J. Gómez Luna
Periodicity: every semester recurring course
Language of instruction: English
Comment: Only for Electrical Engineering and Information Technology BSc.

The course unit can only be taken once. Repeated enrollment in a later semester is not creditable.

Abstract: The category of "Laboratory Courses, Projects, Seminars" includes courses and laboratories in various formats designed to impart practical knowledge and skills. Moreover, these classes encourage independent experimentation and design, allow for explorative learning, and teach the methodology of project work.
Objective: Genome analysis is the foundation of many scientific and medical discoveries, and serves as a key enabler of personalized medicine. This analysis is currently limited by the inability of existing technologies to read an organism’s complete genome. Instead, a dedicated machine (called a sequencer) extracts a large number of shorter random fragments of an organism’s DNA sequence, known as reads. Small, handheld sequencers such as the ONT MinION and Flongle make it possible to sequence bacterial and viral genomes in the field, thus facilitating the analysis of disease outbreaks such as COVID-19, Ebola, and Zika. However, large, capable computers are still needed to perform genome assembly, which tries to reassemble read fragments back into an entire genome sequence. This limits the benefits of mobile sequencing and may pose problems in rapid diagnosis of infectious diseases, tracking outbreaks, and near-patient testing. The problem is exacerbated in developing countries and during crises, where access to the internet, cloud services, or data centers is even more limited.

In this course, we will cover the basics of genome analysis to understand the speed-accuracy tradeoff between computationally lightweight heuristics and accurate but computationally expensive algorithms. Such heuristic algorithms typically operate on a smaller dataset that can fit in the memory of today’s mobile devices. Students will experimentally evaluate different heuristic algorithms and observe their effect on the end results. This evaluation will give the students the chance to carry out a hands-on project to implement one or more of these heuristic algorithms on their smartphones and help society by enabling on-site analysis of genomic data.
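
As a rough illustration of this speed-accuracy tradeoff, the following C sketch compares two ways of measuring how different two short DNA strings are. The choice of measures and the function names `hamming_distance` and `edit_distance` are assumptions made for this example and are not taken from the course material. The Hamming-style count is a cheap heuristic that ignores insertions and deletions, while the dynamic-programming edit distance is exact but needs time proportional to the product of the two lengths.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Cheap heuristic (illustrative assumption): count mismatching positions and
 * treat any length difference as extra mismatches. O(n) time, O(1) memory.
 * It never underestimates the edit distance, but it can overestimate it
 * badly when the strings differ by an insertion or deletion. */
size_t hamming_distance(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    size_t la = strlen(a), lb = strlen(b);
    size_t n = la < lb ? la : lb;
    size_t d = la > lb ? la - lb : lb - la;
    for (size_t i = 0; i < n; i++)
        if (a[i] != b[i])
            d++;
    return d;
}

/* Accurate but expensive: Levenshtein edit distance by dynamic programming.
 * O(la * lb) time, O(lb) memory using two rolling rows.
 * Returns (size_t)-1 if memory allocation fails. */
size_t edit_distance(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    size_t la = strlen(a), lb = strlen(b);
    size_t *prev = malloc((lb + 1) * sizeof *prev);
    size_t *curr = malloc((lb + 1) * sizeof *curr);
    if (!prev || !curr) {
        free(prev);
        free(curr);
        return (size_t)-1;
    }

    /* Row 0: turning the empty prefix of a into b[0..j) takes j insertions. */
    for (size_t j = 0; j <= lb; j++)
        prev[j] = j;

    for (size_t i = 1; i <= la; i++) {
        curr[0] = i; /* i deletions turn a[0..i) into the empty string */
        for (size_t j = 1; j <= lb; j++) {
            size_t sub = prev[j - 1] + (a[i - 1] != b[j - 1]);
            size_t del = prev[j] + 1;
            size_t ins = curr[j - 1] + 1;
            size_t best = sub < del ? sub : del;
            curr[j] = best < ins ? best : ins;
        }
        /* The row just computed becomes the previous row. */
        size_t *tmp = prev;
        prev = curr;
        curr = tmp;
    }

    size_t result = prev[lb];
    free(prev);
    free(curr);
    return result;
}
```

For the strings "ACGTACGT" and "CGTACGTA", which differ only by a one-base shift, `edit_distance` returns 2 while `hamming_distance` returns 8. The heuristic is much cheaper to compute, but in this case it would wrongly suggest that the two sequences are unrelated.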

Prerequisites of the course:
- No prior knowledge of bioinformatics or genome analysis is required.
- Good knowledge of the C programming language and of programming in general is required.
- Interest in making things efficient and solving problems is expected.

The course is conducted in English.

Course website: https://safari.ethz.ch/projects_and_seminars/doku.php?id=genome_seq_mobile

Learning Materials
1. A survey on accelerating genome analysis: https://arxiv.org/pdf/2008.00961

2. A detailed survey on the state-of-the-art algorithms for sequencing data: https://arxiv.org/pdf/2003.00110

3. An example of how to accelerate genomic sequence matching by two orders of magnitude with the help of FPGAs or GPUs: https://arxiv.org/abs/1910.09020

4. An example of how to accelerate the read mapping step by an order of magnitude without using hardware acceleration: https://arxiv.org/pdf/1912.08735

5. An example of using a different computing paradigm to accelerate the read mapping step and improve its energy consumption: https://arxiv.org/pdf/1708.04329

6. Two examples of using software/hardware co-design to accelerate genomic sequence matching by two orders of magnitude: https://arxiv.org/abs/1604.01789 and https://arxiv.org/abs/1809.07858

7. An example of a purely software method for fast genome sequence analysis: http://www.biomedcentral.com/content/pdf/1471-2164-14-S1-S13.pdf