Machine Learning Advanced Methods
Some problems cannot be solved using basic Classification / Regression alone:
- Forecasting the demand for a product.
- Ranking the options shown to users, from the most likely to be chosen to the least.
- Using the probability output of a classifier for decision making.
- Adjusting the classification loss function to handle imbalanced data and real-world costs.
- Feature Engineering beyond the correlation with the target.
- Optimizing hyper parameters with a limited time budget.
Many real-world problems like these are not covered in introductory Machine Learning courses. This course is built as a second course in Machine Learning, presenting practical and advanced methods.
Overview
The course:
${\color{lime}\surd}$ Covers advanced concepts in Classification, Regression, Feature Engineering and Hyper Parameter Optimization.
${\color{lime}\surd}$ Introduces the Supervised Learning task of Ranking.
${\color{lime}\surd}$ Introduces Time Series and the task of Forecasting.
${\color{lime}\surd}$ Introduces the powerful Supervised Learning method: Gaussian Processes.
${\color{lime}\surd}$ Introduces advanced Feature Engineering methods, including the handling of missing values.
${\color{lime}\surd}$ Introduces advanced Hyper Parameter Optimization methods to optimize the score of the model.
${\color{lime}\surd}$ Provides practical tools for solving data science tasks.
${\color{lime}\surd}$ Builds hands-on experience and intuition through interactive visualizations.
${\color{lime}\surd}$ Targets practitioners who need a deep understanding of ML for their daily tasks.
${\color{lime}\surd}$ Is accompanied by more than 30 notebooks, provided in a dedicated GitHub repository.
Main Topics
| Topic | Details |
|---|---|
| Advanced Classification | Calibration (sketched below the table), Custom Loss Function, Cost Sensitive Classifier |
| Advanced Regression | Kernel Regression, Local Kernel Regression, Isotonic Regression |
| Gaussian Processes | The Model, Regression, Classification (Probabilistic) |
| Feature Engineering | Predictive Score, AutoML, Pipeline |
| Hyper Parameter Optimization | Random Grid Search, Bayesian Methods |
| Supervised Learning: Ranking | The Model, Pointwise, Pairwise, Listwise |
| Supervised Learning: Forecasting | Time Series, Models, Feature Engineering, Forecasting |
| Interpretability / Explainability | Overview, Challenges, LIME, SHAP, Pipeline |
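As a taste of the Advanced Classification topics, the sketch below calibrates a classifier with scikit-learn's `CalibratedClassifierCV`. The data and model choices are illustrative, not taken from the course notebooks.

```python
# Minimal calibration sketch: wrap an uncalibrated classifier so that its
# scores become usable probabilities (synthetic, illustrative data).
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# LinearSVC exposes only a decision function; isotonic calibration maps its
# scores onto the probability scale using cross validated fits.
model = CalibratedClassifierCV(LinearSVC(), method="isotonic", cv=5)
model.fit(X_train, y_train)
proba = model.predict_proba(X_test)  # calibrated P(class = 1 | x)
```

Isotonic regression is one of the calibration methods covered; Platt scaling (`method="sigmoid"`) is the other common choice.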
Goals
- The participants will be able to match the proper approach to a given problem.
- The participants will be able to implement, adjust, fine-tune and benchmark the chosen method.
- The participants will be able to build a pipeline with AutoML, Hyper Parameter Optimization and an explainability module.
- The participants will be able to calibrate a classifier.
- The participants will be able to create a custom loss function for a classifier.
- The participants will be able to train and use a Forecasting / Ranking model.
- The participants will be able to train and use a Gaussian Process based model.
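For example, the Gaussian Process goal corresponds to code along the lines of the minimal scikit-learn sketch below; the toy data and the kernel choice are illustrative assumptions.

```python
# Minimal Gaussian Process regression sketch: every prediction comes with
# an uncertainty estimate (toy data; the kernel choice is illustrative).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(50, 1))
y = np.sin(X).ravel() + 0.1 * rng.standard_normal(50)

# RBF models the smooth signal, WhiteKernel the observation noise; the
# kernel hyper parameters are fitted by maximizing the marginal likelihood.
gpr = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), random_state=0)
gpr.fit(X, y)
X_new = np.linspace(0, 10, 100).reshape(-1, 1)
mean, std = gpr.predict(X_new, return_std=True)  # mean and uncertainty band
```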
Pre-Built Syllabus
We have given this course in various lengths to different audiences. The final syllabus will be customized according to the audience, the allotted time and other needs.
| Day | Subject | Details |
|---|---|---|
| 1 | Course Overview | Motivation, Agenda, Notations |
| | Classification Recap | Geometric Interpretation, The Decision Function, Probabilistic Classification, Loss vs. Score |
| | Classifiers Recap | Linear Classifier, SVM, Kernel SVM, Logistic Regression, Decision Tree, Ensemble Methods |
| | Notebook 001 | Classification with SVM |
| | Ensemble Methods | Stacking, Random Forest, AdaBoost, Gradient Boosting, XGBoost, LightGBM |
| | Notebook 002 | Classification with LightGBM |
| | Ordinal Classification | Use Case, Limitations of the Classifier, Definition, Loss Function |
| | Notebook 003 | Ordinal Classification |
| 2 | Cost Sensitive Classifier | Imbalanced Data & Scores, Loss Matrix Model, Cost Sensitive Classifier, Weights Adjustment |
| | Notebook 004 | Cost Sensitive Logistic Regression Classifier |
| | Custom Loss Function | Test Case, Implementation in Python, Using XGBoost (see the sketch after the table) |
| | Notebook 005 | Custom Loss Function in XGBoost |
| | Notebook 006 | Custom Loss Function in LightGBM |
| | Classifier Calibration | Motivation (Decision based on Probability), Calibration Methods |
| | Notebook 007 | Classifier Calibration |
| 3 | Random Process Recap | Probability, The Gaussian Distribution, Random Process, Stationarity, Auto Correlation Function |
| | Local Regressors | Kernel Regression, Weighted Local Kernel Regression |
| | Notebook 008 | Kernel Regression |
| | Notebook 009 | Local Kernel Regression |
| | Gaussian Process | The Model, The Parameters, Fitting, Gaussian Process Regressor, Gaussian Process Classifier |
| | Notebook 010 | The Gaussian Process Regressor |
| | Notebook 011 | The Gaussian Process Classifier |
| | Isotonic Regression | Use Cases, The Model, The Loss Function, Fitting |
| | Notebook 012 | Isotonic Regression |
| 4 | Feature Engineering | Discrete Features, Issues with One Hot Encoding, Ordinal Features, Cyclic Features |
| | Notebook 013 | Feature Engineering for Discrete Data |
| | Feature Transforms | Cyclic Features, Cyclic Objectives, Unsupervised Methods & LDA |
| | Notebook 014 | Feature Engineering for Cyclic Features |
| | Notebook 015 | Feature Engineering for a Cyclic Target (Regression) |
| | Notebook 016 | Feature Engineering with Unsupervised Methods |
| | Feature Imputation | Statistics Based, Feature Based, Model Based |
| | Notebook 017 | Missing Data Imputation |
| | AutoML | Concept, Frameworks, Pipeline |
| | Notebook 018 | AutoML |
| | Feature Selection | Univariate Methods, Multivariate Methods, Sparsity, Issues with Correlation, Predictive Score |
| | Notebook 019 | Feature Selection Methods |
| | Notebook 020 | Predictive Score for Feature Analysis & Selection |
| 5 | Hyper Parameter Optimization - Grid Methods | Cross Validation, Uniform Search, Random Search, Prior |
| | Notebook 021 | Cross Validation (Leave One Out) & Uniform Grid Search |
| | Notebook 022 | Random Grid Search |
| | Hyper Parameter Optimization - Bayesian Methods | Concept, Conversion from Discrete to Pseudo Smooth, Optimization |
| | Notebook 023 | Bayesian Optimization with the Weights & Biases Framework |
| 6 | Ranking | Motivation, Use Cases, Ranking vs. Classification |
| | Ranking Models | Pointwise, Pairwise, Listwise |
| | Learning to Rank | Data Preparation, XGBoost, LightGBM (see the sketch below) |
| | Notebook 024 | Ranking Basics |
| | Notebook 025 | Product Recommendation (Recommendation System) |
| 7 | Time Series & Forecasting | Stationarity, Seasonality, Noise, Forecasting Loss |
| | Time Series Concepts | Differencing, Exponential Smoothing |
| | Time Series Generation Models | MA, AR, ARMA, ARIMA, SARIMA |
| | Time Series Forecasting | Statistical Models, Kalman Filter, Learning Models |
| | Notebook 025 | Forecasting with Statistical Models |
| | Notebook 026 | Forecasting with the Kalman Filter |
| | Notebook 027 | Forecasting with a Learning Model |
| | Forecasting by Regression | Feature Engineering (AutoML), The MiniRocket Feature Extractor, Models |
| | Notebook 028 | Forecasting by Regression (XGBoost / LightGBM) |
| 8 | Explainability & Interpretability | Motivation, Decision Explaining, Results Analysis, Results Investigation |
| | Models | The LIME Model, The SHAP Model, Integration into a Pipeline |
| | Notebook 029 | Explainability by LIME |
| | Notebook 030 | Explainability by SHAP |
| | Notebook 031 | Integration into a Pipeline |
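As an illustration of the hands-on level of Day 2, a custom loss function in gradient boosting boils down to supplying the gradient and Hessian of the objective with respect to the raw scores. The class-weighted logistic loss below is a minimal LightGBM sketch; the weight value is an illustrative assumption, not the exact loss used in the notebooks.

```python
# Minimal custom loss sketch for LightGBM's scikit-learn API: a class
# weighted logistic loss, returned as (gradient, hessian) of the raw score.
import numpy as np
import lightgbm as lgb
from sklearn.datasets import make_classification

POS_WEIGHT = 5.0  # hypothetical cost ratio: missing a positive is 5x worse

def weighted_logloss(y_true, y_pred):
    p = 1.0 / (1.0 + np.exp(-y_pred))           # sigmoid of the raw scores
    w = np.where(y_true == 1, POS_WEIGHT, 1.0)  # up-weight the rare class
    grad = w * (p - y_true)                     # d(loss) / d(raw score)
    hess = w * p * (1.0 - p)                    # second derivative
    return grad, hess

X, y = make_classification(n_samples=2000, weights=[0.9], random_state=0)
clf = lgb.LGBMClassifier(objective=weighted_logloss, n_estimators=50)
clf.fit(X, y)
raw = clf.predict(X, raw_score=True)  # custom objectives yield raw scores
```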
Some of the notebooks take the form of guided, interactive exercises.
The day boundaries are a rough partitioning; in practice, some subjects take more than a day and some less.
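Similarly, the Day 6 Learning to Rank material uses models such as LightGBM's `LGBMRanker`. The sketch below trains it on synthetic queries; the group sizes and relevance labels are illustrative.

```python
# Minimal learning-to-rank sketch: LambdaRank is a pairwise objective that
# optimizes an NDCG style criterion (synthetic queries, illustrative labels).
import numpy as np
import lightgbm as lgb

rng = np.random.default_rng(0)
n_queries, per_query = 40, 10
X = rng.standard_normal((n_queries * per_query, 5))
y = rng.integers(0, 4, size=n_queries * per_query)  # relevance grades 0..3
group = [per_query] * n_queries                     # items per query

ranker = lgb.LGBMRanker(objective="lambdarank", n_estimators=50)
ranker.fit(X, y, group=group)
scores = ranker.predict(X[:per_query])  # higher score = ranked higher
```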
Audience
Experienced developers who use Machine Learning: Algorithm Engineers, Data Engineers, Data Scientists.
Prerequisites
- Mathematical: Linear Algebra (Basic), Probability / Statistics (Basic).
- Machine Learning: Classification, Regression.
- Programming: Python.
- Experience with SciKit Learn.
In case any of the prerequisites are not met, we can offer a half-day sprint on Linear Algebra, Calculus, Probability and Python.