Notes on a very good machine learning guide book from STHDA.
Supervised learning
Regression analysis: predict a continuous variable
Different methods for regression analysis:
- Ordinary least squares (Chapter @ref(linear-regression))
  - Simple linear regression
  - Multiple linear regression
- Model selection methods:
  - Best subsets regression (Chapter @ref(best-subsets-regression))
  - Stepwise regression (Chapter @ref(stepwise-regression))
- Principal component-based methods (Chapter @ref(pcr-and-pls-regression)):
  - Principal component regression (PCR)
  - Partial least squares regression (PLS)
- Penalized regression (Chapter @ref(penalized-regression)):
  - Ridge regression
  - Lasso regression
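To make the contrast between ordinary least squares and penalized regression concrete, here is a minimal sketch in plain Python (the guide itself works in R, so this is an illustration rather than its code). The function names `ols_fit` and `ridge_slope` and the toy data are hypothetical. It assumes a single predictor and an unpenalized intercept, in which case the ridge slope reduces to Sxy / (Sxx + lambda).

```python
def ols_fit(x, y):
    """Simple linear regression y = b0 + b1 * x by ordinary least squares."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = my - b1 * mx
    return b0, b1


def ridge_slope(x, y, lam):
    """Ridge slope for one predictor; the intercept is left unpenalized."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    # The penalty lam inflates the denominator, shrinking the slope toward 0.
    return sxy / (sxx + lam)


# Toy data lying exactly on y = 1 + 2x (made up for illustration).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0]

b0, b1 = ols_fit(x, y)
print(b0, b1)                    # intercept and slope from OLS
print(ridge_slope(x, y, 10.0))   # slope after shrinkage
```

With `lam = 0` the ridge slope equals the OLS slope; larger values of `lam` pull it toward zero.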
Regression Analysis
Regression Model Diagnostics
Regression Model Validation
Model Selection Methods
Classification: predict a class/group variable
- Logistic regression, for binary classification tasks (Chapter @ref(logistic-regression))
- Stepwise and penalized logistic regression for variable selection (Chapters @ref(stepwise-logistic-regression) and @ref(penalized-logistic-regression))
- Logistic regression assumptions and diagnostics (Chapter @ref(logistic-regression-assumptions-and-diagnostics))
- Multinomial logistic regression, an extension of logistic regression for multiclass classification tasks (Chapter @ref(multinomial-logistic-regression))
- Discriminant analysis, for binary and multiclass classification problems (Chapter @ref(discriminant-analysis))
- Naive Bayes classifier (Chapter @ref(naive-bayes-classifier))
- Support vector machines (Chapter @ref(support-vector-machine))
- Classification model evaluation (Chapter @ref(classification-model-evaluation))
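Whichever classifier from the list above is used, evaluation usually starts from the confusion matrix. The following is a minimal Python sketch (not the guide's R code) that derives accuracy, sensitivity, and specificity from predicted and actual labels. The function names and the toy labels are hypothetical, and the code assumes a binary problem in which both classes are present.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        else:
            if a == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def evaluate(actual, predicted, positive=1):
    """Return accuracy, sensitivity (recall) and specificity."""
    tp, fp, tn, fn = confusion_counts(actual, predicted, positive)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # share of positives detected
        "specificity": tn / (tn + fp),   # share of negatives detected
    }


# Made-up labels for illustration only.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 0, 1, 0, 1]

metrics = evaluate(actual, predicted)
print(metrics)
```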
Logistic Regression
Evaluation of Classification Model Accuracy
ROC curve
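A ROC curve plots the true positive rate against the false positive rate as the classification threshold moves from strict to lenient, and the area under it (AUC) summarizes the ranking quality in one number. Below is a minimal Python sketch under stated assumptions: labels are coded 0/1, both classes occur, and the function names and toy scores are hypothetical.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) points, sweeping the threshold from high to low."""
    pos = sum(1 for l in labels if l == 1)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        s = ranked[i][0]
        # Consume all observations tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == s:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Made-up predicted probabilities for illustration only.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

pts = roc_points(labels, scores)
print(pts)
print(auc(pts))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect separation of the two classes.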
Advanced machine learning methods
Unsupervised learning
Principal component analysis
Cluster analysis
Part I. Cluster Analysis Basics:
- Data Preparation and Essential R Packages for Cluster Analysis
- Clustering Distance Measures Essentials
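Clustering results depend heavily on the distance measure chosen. As a rough illustration in Python (not the guide's R code), the sketch below implements three commonly used measures: Euclidean, Manhattan, and a correlation-based distance defined as 1 minus the Pearson correlation. The function names and the two toy vectors are hypothetical.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def pearson_distance(a, b):
    """1 - Pearson correlation: small when two profiles rise and fall together."""
    n = len(a)
    ma = sum(a) / n
    mb = sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return 1 - cov / (sa * sb)


# Two made-up profiles: b is a shifted and scaled copy of a.
a = [1.0, 2.0, 3.0]
b = [4.0, 6.0, 8.0]

print(euclidean(a, b))
print(manhattan(a, b))
print(pearson_distance(a, b))
```

The two vectors are far apart in Euclidean and Manhattan terms yet have a correlation distance of essentially zero, which is why the choice of measure matters when the shape of a profile is more important than its magnitude.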
Part II. Partitioning Clustering methods:
- K-Means Clustering Essentials
- K-Medoids Essentials: PAM clustering
- CLARA - Clustering Large Applications
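The core idea behind k-means can be sketched in a few lines of plain Python (the guide uses R; this is only an illustration). The sketch below follows Lloyd's algorithm, alternating an assignment step and a center-update step. The function name `kmeans`, the toy points, and the choice to pass the initial centers explicitly (instead of drawing them at random) are assumptions made to keep the example deterministic.

```python
def kmeans(points, centers, max_iter=100):
    """Lloyd's algorithm: alternate assignment and center update until stable."""
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center
        # (squared Euclidean distance).
        labels = [
            min(
                range(len(centers)),
                key=lambda j: sum((p - c) ** 2 for p, c in zip(pt, centers[j])),
            )
            for pt in points
        ]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for j in range(len(centers)):
            members = [pt for pt, l in zip(points, labels) if l == j]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break  # converged: centers no longer move
        centers = new_centers
    return labels, centers


# Two made-up, well-separated groups of 2-D points.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
labels, centers = kmeans(points, centers=[points[0], points[3]])
print(labels)
print(centers)
```

K-medoids differs in that each cluster is represented by an actual observation (a medoid) rather than by a mean.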
Part III. Hierarchical Clustering:
- Agglomerative Clustering
  - Algorithm and steps
  - Verify the cluster tree
  - Cut the dendrogram into different groups
- Divisive Clustering
- Compare Dendrograms
  - Visual comparison of two dendrograms
  - Correlation matrix between a list of dendrograms
- Visualize Dendrograms
  - Case of small data sets
  - Case of dendrogram with large data sets: zoom, sub-tree, PDF
  - Customize dendrograms using dendextend
- Heatmap: Static and Interactive
  - R base heat maps
  - Pretty heat maps
  - Interactive heat maps
  - Complex heatmap
  - Real application: gene expression data
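The agglomerative clustering steps listed under Part III (start with every observation as its own cluster, repeatedly merge the two closest clusters, then cut the tree into groups) can be sketched as follows. This is a naive Python illustration, not the guide's R workflow. The function name `agglomerative`, the toy points, and the shortcut of stopping at `k` clusters instead of building the full dendrogram are assumptions of the sketch.

```python
import math


def agglomerative(points, k, linkage=min):
    """Merge the two closest clusters until k remain.

    linkage=min gives single linkage, linkage=max gives complete linkage.
    Stopping at k clusters plays the role of cutting the tree into k groups.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Distance between two clusters under the chosen linkage.
                d = linkage(
                    math.dist(points[i], points[j])
                    for i in clusters[a]
                    for j in clusters[b]
                )
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Made-up 2-D points: two tight pairs and one isolated point.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
print(agglomerative(points, k=3))
print(agglomerative(points, k=2))
```

The result lists the indices of the observations in each cluster. Recomputing every pairwise distance at each merge is slow and only suitable for small examples.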