# 261-5110-00L Optimization for Data Science

Semester | Spring Semester 2021 |

Lecturers | B. Gärtner, D. Steurer, N. He |

Periodicity | yearly recurring course |

Language of instruction | English |

### Courses

Number | Title | Hours | Lecturers |
---|---|---|---|
261-5110-00 V | Optimization for Data Science | 3 hrs | B. Gärtner, D. Steurer, N. He |
261-5110-00 U | Optimization for Data Science | 2 hrs | B. Gärtner, D. Steurer, N. He |
261-5110-00 A | Optimization for Data Science | 4 hrs | B. Gärtner, D. Steurer, N. He |

### Catalogue data

Abstract | This course provides an in-depth theoretical treatment of optimization methods that are particularly relevant in data science. |

Objective | Understanding the theoretical guarantees (and their limits) of relevant optimization methods used in data science. Learning general paradigms to deal with optimization problems arising in data science. |

Content | This course provides an in-depth theoretical treatment of optimization methods that are particularly relevant in machine learning and data science. In the first part of the course, we give a brief introduction to convex optimization, with some basic motivating examples from machine learning. We then analyse classical and more recent first- and second-order methods for convex optimization: gradient descent, Nesterov's accelerated method, proximal and splitting algorithms, subgradient descent, stochastic gradient descent, variance-reduced methods, Newton's method, and quasi-Newton methods. The emphasis is on analysis techniques that occur repeatedly in convergence analyses for various classes of convex functions. We also discuss some classical and recent theoretical results for nonconvex optimization. In the second part, we discuss convex programming relaxations as a powerful and versatile paradigm for designing efficient algorithms to solve computational problems arising in data science. We learn about this paradigm and develop a unified perspective on it through the lens of the sum-of-squares semidefinite programming hierarchy. As applications, we discuss non-negative matrix factorization, compressed sensing and sparse linear regression, matrix completion and phase retrieval, as well as robust estimation. |

Prerequisites / Notice | As background, we require material taught in the course "252-0209-00L Algorithms, Probability, and Computing". It is not necessary that participants have actually taken the course, but they should be prepared to catch up if necessary. |

### Performance assessment

Performance assessment information (valid until the course unit is held again) | |

Performance assessment as a semester course | |

ECTS credits | 10 credits |

Examiners | B. Gärtner, N. He, D. Steurer |

Type | session examination |

Language of examination | English |

Repetition | The performance assessment is offered every session. Repetition possible without re-enrolling for the course unit. |

Mode of examination | written 180 minutes |

Additional information on mode of examination | Twice during the semester, we will hand out specially marked exercises or term projects (compulsory continuous performance assessments). The written part of the solutions is expected to be typeset in LaTeX or similar. Solutions will be graded, and the grades will account for 20% of the final grade. Assignments can be discussed with colleagues, but we expect an independent writeup. |

Written aids | None |

This information can be updated until the beginning of the semester; information on the examination timetable is binding. |

### Learning materials

Main link | Information |

Only public learning materials are listed. |

### Groups

No information on groups available. |

### Restrictions

There are no additional restrictions for the registration. |