Information Theory And Data Mining 2016/2017
Course taught in English.
Information Theory: random variables and processes, the concept of information, self-information, Shannon entropy, alternative entropy measures, relative entropies, Kullback–Leibler divergence, Jensen–Shannon divergence, conditional entropy, joint entropy, mutual information, total correlation, differential entropy, Markov chains.
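As an illustration of the measures listed above, the following is a minimal sketch of Shannon entropy, Kullback–Leibler divergence and mutual information for discrete distributions. It is not course material: the course experiments use Matlab, and Python is used here only for illustration. The function names (`entropy`, `kl_divergence`, `mutual_information`) are hypothetical, and all quantities are expressed in bits.

```python
import math

def entropy(p):
    """Shannon entropy H(X) in bits of a discrete distribution p."""
    # Terms with zero probability contribute nothing (0 * log 0 is taken as 0).
    return -sum(x * math.log2(x) for x in p if x > 0)

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in bits.

    Assumes q[i] > 0 wherever p[i] > 0; otherwise the divergence is infinite.
    """
    return sum(x * math.log2(x / y) for x, y in zip(p, q) if x > 0)

def mutual_information(joint):
    """Mutual information I(X;Y) in bits from a joint probability table.

    joint[i][j] is P(X = i, Y = j); rows index X, columns index Y.
    """
    px = [sum(row) for row in joint]          # marginal of X
    py = [sum(col) for col in zip(*joint)]    # marginal of Y
    return sum(
        pxy * math.log2(pxy / (px[i] * py[j]))
        for i, row in enumerate(joint)
        for j, pxy in enumerate(row)
        if pxy > 0
    )
```

For example, a fair coin has an entropy of one bit, two independent fair coins have zero mutual information, and two perfectly correlated fair coins share exactly one bit.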
Applications to Communications Systems: coding of discrete sources, first Shannon theorem, Kraft inequality, Huffman coding, discrete communication channels, channel capacity, error probability, Fano inequality, second Shannon theorem, elements of channel coding and cryptography.
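To make the source-coding topics concrete, here is a minimal sketch of how Huffman codeword lengths can be computed and checked against the Kraft inequality. It is an illustrative assumption, not the course's own implementation (the course experiments use Matlab); the function names `huffman_code_lengths` and `kraft_sum` are hypothetical.

```python
import heapq

def huffman_code_lengths(probs):
    """Return binary Huffman codeword lengths for the given symbol probabilities."""
    if len(probs) == 1:
        return [1]  # a single symbol still needs one bit
    # Heap entries: (probability, unique tie-breaker, symbol indices in the subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    counter = len(probs)
    while len(heap) > 1:
        # Merge the two least probable subtrees.
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        # Every symbol under the merged node moves one level deeper.
        for s in s1 + s2:
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, counter, s1 + s2))
        counter += 1
    return lengths

def kraft_sum(lengths):
    """Left-hand side of the Kraft inequality for a binary code: sum of 2^(-l_i)."""
    return sum(2.0 ** -l for l in lengths)
```

For the dyadic source with probabilities 0.5, 0.25, 0.125, 0.125 the lengths come out as 1, 2, 3, 3, the Kraft sum equals 1, and the average codeword length (1.75 bits) coincides with the source entropy.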
Applications to Data Mining: basic concepts of data mining, definition of dataset and attribute, data types, multivariate analysis, basic statistical description of data, case studies, information-theoretic metrics in data mining tasks, data preparation, data cleaning, discretization of attributes, dimensionality reduction, association rules (unidimensional and multidimensional), classification algorithms (ID3, C4.5, Bayes), classification trees, anomaly detection, clustering, training and testing of algorithms, data visualization.
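As one example of an information-theoretic metric in a data mining task, the sketch below computes the information gain that ID3 uses to choose a splitting attribute. It is a simplified illustration under stated assumptions: the dataset is a list of dictionaries with categorical values, and the names `entropy_of_labels` and `information_gain` are hypothetical. The course experiments themselves use Matlab.

```python
import math
from collections import Counter

def entropy_of_labels(labels):
    """Shannon entropy in bits of the empirical class distribution."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, attribute, target):
    """Information gain of splitting `rows` on `attribute` with respect to `target`.

    rows is a list of dicts; attribute and target are keys with categorical values.
    """
    base = entropy_of_labels([r[target] for r in rows])
    # Group the target labels by the value taken by the attribute.
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(
        len(g) / len(rows) * entropy_of_labels(g) for g in groups.values()
    )
    return base - remainder
```

ID3 evaluates this quantity for every candidate attribute and splits on the one with the largest gain. C4.5 uses the gain ratio instead, which normalizes the gain by the entropy of the split itself.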
Computer experiments: introduction to Matlab, applications of information theory to communications systems, applications of data mining algorithms.