**Teaching**: 2019, Spring (Master 2 Data Science course @ University of Paris-Saclay).

Structured Data: Learning, Prediction, Dependency, Testing:

- Goal:
- Many real-world applications involve objects with an explicit or implicit structure. Social networks, protein-protein interaction networks, molecules, DNA sequences and syntactic tags are instances of explicitly structured data, while texts, images, videos and biomedical signals are examples with implicit structure. The focus of the course is solving learning and prediction tasks, estimating dependency measures, and testing hypotheses under this complex/structured assumption.
- While structured inputs in learning and prediction problems have been investigated for about three decades, structural assumptions on the output side form a significantly more challenging and less understood area of statistical learning. The first part of the course provides a transversal and comprehensive overview of the recent advances and tools for this rapidly growing field of structured output learning, including graphical models, max-margin approaches as well as deep learning. The covered methods are mainly energy-based techniques, among which are (1) max-margin approaches, conditional random fields, deep structured learning, and (2) structured output regression algorithms. These approaches, illustrated on real applications, make it possible to cope with complex problems such as question answering, automatic captioning and molecule identification. Finally, both families of methods (1) and (2) will be studied from the angle of generalization and consistency.
- The second part of the course gives an alternative view on the structured problem family, dealing with topics on *dependency estimation and hypothesis testing*. Emerging methods in these fields can not only lead to state-of-the-art algorithms in several application areas (such as blind signal separation, feature selection, outlier-robust image registration, regression problems on probability distributions), but they also come with elegant performance guarantees, complementing the regular statistical tools restricted to unstructured Euclidean domains. We are going to construct features of probability distributions which will enable us to define easy-to-estimate independence measures and distances of random variables. As a byproduct, we will get nonparametric extensions of the classical t-test (two-sample test) and the Pearson correlation test (independence test).
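
To make the idea of a kernel-based two-sample test concrete, here is a minimal, dependency-free sketch of the quadratic-time unbiased estimator of the squared maximum mean discrepancy (MMD), calibrated by a permutation test. The Gaussian kernel, the fixed `bandwidth`, the permutation-based calibration and all function names (`gaussian_kernel`, `mmd2_unbiased`, `permutation_p_value`) are assumptions made for illustration; this is not the course material or the API of any toolbox.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two points given as sequences of floats."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd2_unbiased(xs, ys, bandwidth=1.0):
    """Unbiased, quadratic-time estimate of the squared MMD between two samples.

    Diagonal terms k(x_i, x_i) and k(y_j, y_j) are excluded, so the estimate
    can be slightly negative when the two distributions are close.
    """
    m, n = len(xs), len(ys)
    k_xx = sum(gaussian_kernel(xs[i], xs[j], bandwidth)
               for i in range(m) for j in range(m) if i != j)
    k_yy = sum(gaussian_kernel(ys[i], ys[j], bandwidth)
               for i in range(n) for j in range(n) if i != j)
    k_xy = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return k_xx / (m * (m - 1)) + k_yy / (n * (n - 1)) - 2.0 * k_xy / (m * n)


def permutation_p_value(xs, ys, num_permutations=200, bandwidth=1.0, seed=0):
    """Permutation p-value for H0: both samples come from the same distribution."""
    rng = random.Random(seed)
    observed = mmd2_unbiased(xs, ys, bandwidth)
    pooled = list(xs) + list(ys)
    m = len(xs)
    exceed = 0
    for _ in range(num_permutations):
        # Under H0 the sample labels are exchangeable, so reshuffle them.
        rng.shuffle(pooled)
        if mmd2_unbiased(pooled[:m], pooled[m:], bandwidth) >= observed:
            exceed += 1
    # The +1 terms keep the p-value strictly positive.
    return (exceed + 1) / (num_permutations + 1)
```

Each sample is a list of points, for example `[[0.1], [-0.4], ...]` for one-dimensional data; a small p-value from `permutation_p_value` suggests that the two samples come from different distributions. In practice the bandwidth is often chosen from the data (for instance by a median heuristic), which this sketch leaves out.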

- Lecturers: Florence d'Alché-Buc, Zoltán Szabó, Slim Essid.
- Place: Télécom ParisTech (1st part), École Polytechnique (2nd = my part).
- Prerequisites:
- The course requires a basic knowledge of kernel methods, graphical models, deep learning, optimization and functional analysis.

- Exam:
- 2 Projects.
- Topics: link prediction, question answering, image/document understanding, drug activity prediction, molecule prediction, functional prediction, information theoretical optimization (including two-sample and independence testing).

- 2nd part (Zoltán, Feb. 25-):
- Keywords:
- Kernel canonical correlation analysis, mean embedding, maximum mean discrepancy, integral probability metric, characteristic/universal kernel, Hilbert-Schmidt independence criterion, covariance operator, Hilbert-Schmidt norm.
- Kernel-based two-sample and independence tests. Quadratic and linear-time methods.
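
As an illustration of the independence-testing keywords above, the following is a minimal sketch of the biased, quadratic-time estimator of the Hilbert-Schmidt independence criterion (HSIC), computed as trace(K H L H) / n² with centering matrix H, together with a permutation test. The Gaussian kernels, the shared `bandwidth`, and the names `gram_matrix`, `center`, `hsic_biased` and `hsic_permutation_p_value` are illustrative assumptions, not the lecturers' code.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two points given as sequences of floats."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def gram_matrix(points, bandwidth=1.0):
    """Symmetric Gram matrix K[i][j] = k(points[i], points[j])."""
    return [[gaussian_kernel(p, q, bandwidth) for q in points] for p in points]


def center(gram):
    """Return H K H for a symmetric Gram matrix K, with H = I - (1/n) 11^T."""
    n = len(gram)
    row_means = [sum(row) / n for row in gram]
    total_mean = sum(row_means) / n
    return [[gram[i][j] - row_means[i] - row_means[j] + total_mean
             for j in range(n)] for i in range(n)]


def hsic_biased(xs, ys, bandwidth=1.0):
    """Biased HSIC estimate trace(K H L H) / n^2 from paired samples."""
    n = len(xs)
    k_centered = center(gram_matrix(xs, bandwidth))
    l_gram = gram_matrix(ys, bandwidth)
    # trace(H K H L) equals the elementwise sum because both are symmetric.
    return sum(k_centered[i][j] * l_gram[i][j]
               for i in range(n) for j in range(n)) / n ** 2


def hsic_permutation_p_value(xs, ys, num_permutations=200, bandwidth=1.0, seed=0):
    """Permutation p-value for H0: the paired variables are independent."""
    rng = random.Random(seed)
    observed = hsic_biased(xs, ys, bandwidth)
    shuffled = list(ys)
    exceed = 0
    for _ in range(num_permutations):
        # Shuffling ys breaks the pairing while keeping both marginals.
        rng.shuffle(shuffled)
        if hsic_biased(xs, shuffled, bandwidth) >= observed:
            exceed += 1
    return (exceed + 1) / (num_permutations + 1)
```

Here `xs[i]` and `ys[i]` form the i-th observed pair. Recomputing both Gram matrices in every permutation keeps the sketch short; a more careful version would compute them once and permute indices instead.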

- Slides:
- Feb. 25, Mar. 4, 18, 25: main, supplement (kernel, RKHS).

- Code:
- Information Theoretical Estimators (ITE) toolbox in Python, Matlab.
- Two-sample test, independence test, goodness-of-fit test.

- Keywords: