The official webpage of the book
by Charles BOUVEYRON, Gilles CELEUX, T. Brendan MURPHY and Adrian E. RAFTERY
Ours will certainly be the century of the data revolution. Our digital world creates masses of data every day, and the volume of generated data doubles every two years according to the most recent estimates. This wealth of available data offers hope for uses that may lead to great advances in areas such as health, science, transportation or defense. However, manipulating, analyzing and extracting information from these data is made difficult by the volume and nature (high-dimensional data, networks, time series, ...) of modern data.
Within the broad field of statistical and machine learning, model-based techniques for clustering and classification hold a central position for anyone interested in exploiting these data. This textbook focuses on recent developments in model-based clustering and classification while providing a comprehensive introduction to the field. It is aimed at advanced undergraduates, graduate students or first-year PhD students in data science, as well as researchers and practitioners.
The book covers the following topics:
1. Introduction
2. Model-based Clustering: Basic Ideas
3. Dealing with Difficulties
4. Model-based Classification
5. Semi-supervised Clustering and Classification
6. Discrete Data Clustering
7. Variable Selection
8. High-dimensional Data
9. Non-Gaussian Model-based Clustering
10. Network Data
11. Model-based Clustering with Covariates
12. Other Topics
The book is supported by extensive examples on data, with 72 listings of code drawing on more than 30 software packages, all of which can be run by the reader. The code is written in R, one of the most popular languages for data science.
The book is part of the Statistical and Probabilistic Mathematics Series of Cambridge University Press and can be bought from most specialized bookshops and e-commerce websites.
The book can be bought on the Cambridge University Press (CUP) website. Additional information, such as citation metrics, is also available there.
See the book on the CUP website
The book can also be bought on many e-commerce websites, such as Amazon.
Buy the book on Amazon
The 12 chapters of the book can be downloaded separately using the links below:
The book is supported by extensive examples on data, with 72 listings of R code drawing on more than 30 software packages. It is accompanied by a dedicated R package (the MBCbook package) that can be installed directly from CRAN from within R. In particular, the MBCbook package provides all the datasets used in the book, along with some original functions.
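As a minimal sketch of getting started with the MBCbook package, the lines below install it from CRAN if it is not already present, load it, and list the datasets it ships with. This assumes a working R installation and an internet connection. Only standard R functions are used here; no dataset or function names from the package itself are assumed.

```r
# Install MBCbook from CRAN only if it is not already installed
# (requires an internet connection and a configured CRAN mirror).
if (!requireNamespace("MBCbook", quietly = TRUE)) {
  install.packages("MBCbook")
}

# Load the package for the current session.
library(MBCbook)

# List the datasets bundled with the package (name and short title).
data(package = "MBCbook")$results[, c("Item", "Title")]
```

Any dataset shown in that listing can then be loaded with `data(<name>)`, where `<name>` is one of the entries in the `Item` column.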
To make the code provided in the book easier to reproduce, the scripts for all 12 chapters can also be downloaded below: