376-1723-00L  Big Data Analysis in Biomedical Research

Semester: Spring Semester 2022
Lecturers: E. Araldi, M. Ristow
Periodicity: yearly recurring course
Language of instruction: English


Abstract: Biomedical datasets are increasing in size and complexity, and discoveries arising from their analysis have important implications for human health and biotechnology. While the potential of biomedical dataset analysis is considerable, preclinical researchers often lack the computational tools to analyze these datasets. This course provides the foundations for analyzing large biomedical data.
Learning objective: This course aims to provide practical tools for analyzing large biomedical datasets. It is tailored to experimental researchers in the life sciences who have minimal prior programming experience but a strong interest in exploring big data to solve their own research problems. Through theoretical classes, practical demonstrations, in-class exercises and homework, participants will master computational methods to independently manipulate large datasets, visualize big data effectively, and analyze it with appropriate statistical tools and machine learning approaches. For the final assessment, students will conduct an independent data analysis project on a biomedical problem of their choosing, using publicly available population-based biomedical datasets.
Content: While learning the programming skills needed to manipulate and visualize data, participants will learn the statistical and modeling approaches used in big data analysis. The course will cover:
•Basics of Python programming and UNIX;
•High-performance computing;
•Manipulation and cleaning of large datasets with Pandas;
•Visualization tools (Matplotlib, Seaborn);
•Machine learning and numerical libraries (SciPy, NumPy, Statsmodels, Scikit-Learn);
•Statistical analysis and modeling of big data, and applications to biomedical datasets (statistical learning, distributions, linear and logistic regression, principal component analysis, clustering, classification, time series analysis, tree-based methods, predictive models).
Prerequisites / Notice: Basic understanding of mathematics and statistics, as taught in introductory courses at the Bachelor's level.