Syllabus for Big Data, Machine Learning And Astrophysical Data:

This introductory course covers a wide range of methods and applications of Machine Learning in Astrophysics. What is Machine Learning? What problems does it try to solve? What are the main categories and fundamental concepts of Machine Learning systems?

Learning by fitting a model to data. Optimizing a cost function. Handling, cleaning, and preparing data. Selecting a model and tuning hyper-parameters using cross-validation.
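As a rough illustration of "fitting a model by optimizing a cost function", the sketch below fits a straight line to a handful of points by gradient descent on the mean squared error. It is not taken from the course material: the function names `mse_cost` and `fit_line`, the toy data, the learning rate, and the number of steps are all assumptions chosen for the example.

```python
def mse_cost(slope, intercept, xs, ys):
    """Mean squared error of the line y = slope * x + intercept."""
    n = len(xs)
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / n


def fit_line(xs, ys, learning_rate=0.05, n_steps=2000):
    """Minimize the MSE cost with plain gradient descent.

    learning_rate and n_steps are illustrative values, not tuned ones.
    """
    slope, intercept = 0.0, 0.0
    n = len(xs)
    for _ in range(n_steps):
        # Residuals of the current model on every training point.
        residuals = [slope * x + intercept - y for x, y in zip(xs, ys)]
        # Partial derivatives of the MSE with respect to each parameter.
        grad_slope = 2.0 * sum(r * x for r, x in zip(residuals, xs)) / n
        grad_intercept = 2.0 * sum(residuals) / n
        # Step against the gradient to reduce the cost.
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


# Toy, noise-free data generated from y = 2x + 1 (made up for this sketch).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]

slope, intercept = fit_line(xs, ys)
print(slope, intercept, mse_cost(slope, intercept, xs, ys))
```

With real, noisy data the cost would not reach zero, and the learning rate is exactly the kind of hyper-parameter one would tune, for instance with cross-validation.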

The main challenges of Machine Learning. Reducing the dimensionality of the training data to fight the curse of dimensionality. The most common learning algorithms: Linear and Polynomial Regression, Logistic Regression, k-Nearest Neighbors, Support Vector Machines, Decision Trees, Random Forests, and Ensemble methods.
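To give a flavour of one of the algorithms listed above, here is a minimal sketch of a k-Nearest Neighbors classifier written with the Python standard library only. The function name `knn_predict`, the two-dimensional toy points, and the class labels are hypothetical and serve only to show the idea; they do not come from any astrophysical catalog.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points closest to query."""
    # Pair every training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Keep the labels of the k nearest neighbours and take a majority vote.
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Made-up two-feature training set with two well-separated classes.
train_points = [
    (0.0, 0.0), (0.2, 0.1), (0.1, 0.3),
    (2.0, 2.0), (2.1, 1.8), (1.9, 2.2),
]
train_labels = ["star", "star", "star", "quasar", "quasar", "quasar"]

label_a = knn_predict(train_points, train_labels, (0.1, 0.1))
label_b = knn_predict(train_points, train_labels, (2.0, 2.1))
print(label_a, label_b)
```

The number of neighbours `k` is a hyper-parameter; in practice it would be chosen by cross-validation rather than fixed by hand as it is here.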

The lessons are accompanied by plenty of laboratory hours. In the lab, students learn to program in Python and write their own routines to apply the concepts introduced in the lessons to astrophysical datasets.

Intro: What this course is for and how it is organized

2h What is in the course? How and when will the lessons take place? How does the exam work? Concepts of Big Data and Machine Learning.

Section I: Searching for Structure in Point Data

2h Density Estimation (1 theory + 1 implementation)

2h Clustering (1 theory + 1 implementation)

Section II: Dimensionality Reduction

2h Reducing the Dimensionality (1 theory + 1 implementation)

Section III: Regression and Model Fitting

2h Regression for Linear Models (1 theory + 1 implementation)

2h Overfitting, Underfitting, and Cross-Validation (1 theory + 1 implementation)

Section IV: Classification

2h Classifier Methods (KNN, GMM, SVM, etc.) (1 theory + 1 implementation)

2h Evaluating Classifier Methods (ROC curves) (1 theory + 1 implementation)

In the LAB:

Data Access and Plots: (2 LAB)

SDSS Filters

SDSS Moving Object Catalog

Density estimation and Clustering: (2 LAB)

Bayesian Blocks for Histograms

Reducing the Dimensionality: (1 LAB)

SDSS Spectrum Example + SDSS Spectra with PCA

Regression and Fitting: (1 LAB)

K-Neighbors for Photometric Redshifts

Classifier: (2 LAB)

H-R Diagram + GMM