Machine Learning Advanced Methods

Some problems cannot be solved using basic Classification / Regression:

  • Forecasting the demand for a product.
  • Ranking the options presented to a user, from the most likely to be chosen to the least.
  • Using the probability estimates of a classifier for decision making.
  • Adjusting the classification loss function to handle imbalanced data and other real-world requirements.
  • Feature Engineering beyond correlation with the target.
  • Optimizing hyper parameters within a limited time budget.

Many real-world problems are not covered in introductory Machine Learning courses. This course is built as a second course in Machine Learning, presenting practical and advanced methods.

Overview

The course:
${\color{lime}\surd}$ Covers advanced concepts in Classification, Regression, Feature Engineering and Hyper Parameter Optimization.
${\color{lime}\surd}$ Introduces the Supervised Learning task of Ranking.
${\color{lime}\surd}$ Introduces Time Series and the task of Forecasting.
${\color{lime}\surd}$ Introduces Gaussian Processes, a powerful Supervised Learning method.
${\color{lime}\surd}$ Introduces advanced Feature Engineering methods, including the handling of missing values.
${\color{lime}\surd}$ Introduces advanced Hyper Parameter Optimization methods to optimize the model's score.
${\color{lime}\surd}$ Provides practical tools for solving data science tasks.
${\color{lime}\surd}$ Builds hands-on experience and intuition through interactive visualizations.
${\color{lime}\surd}$ Targets practitioners who need a deep understanding of ML for their daily tasks.
${\color{lime}\surd}$ Is accompanied by more than 30 notebooks in a dedicated GitHub repository.

Main Topics

| Topic | Details |
|-------|---------|
| Advanced Classification | Calibration, Custom Loss Function, Cost Sensitive Classifier |
| Advanced Regression | Kernel Regression, Local Kernel Regression, Isotonic Regression |
| Gaussian Processes | The Model, Regression, Classification (Probabilistic) |
| Feature Engineering | Predictive Score, AutoML, Pipeline |
| Hyper Parameter Optimization | Random Grid Search, Bayesian Methods |
| Supervised Learning: Ranking | The Model, Pointwise, Pairwise, Listwise |
| Supervised Learning: Forecast | Time Series, Models, Feature Engineering, Forecasting |
| Interpretability / Explainability | Overview, Challenges, LIME, SHAP, Pipeline |

Goals

Upon completion, participants will be able to:

  • Match the proper approach to a given problem.
  • Implement, adjust, fine-tune and benchmark the chosen method.
  • Build a pipeline with AutoML, Hyper Parameter Optimization and an explainability module.
  • Calibrate a classifier (a minimal sketch follows this list).
  • Create a custom loss function for a classifier.
  • Train and use a Forecasting / Ranking model.
  • Train and use a Gaussian Process based model.
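
A minimal sketch of classifier calibration (Day 2, Notebook 007 in the syllabus below) using SciKit Learn's `CalibratedClassifierCV`; the synthetic data set, the class imbalance and the choice of the sigmoid (Platt) method are illustrative assumptions, not taken from the course notebooks:

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import brier_score_loss
from sklearn.model_selection import train_test_split

# Illustrative imbalanced synthetic data (not from the course notebooks)
X, y = make_classification(n_samples=2000, n_features=20, weights=[0.9, 0.1], random_state=0)
x_train, x_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Wrap a base classifier with sigmoid (Platt) calibration, fitted with 5-fold CV
base_clf = LogisticRegression(max_iter=1000)
calibrated = CalibratedClassifierCV(base_clf, method='sigmoid', cv=5)
calibrated.fit(x_train, y_train)

# A lower Brier score indicates better calibrated probabilities
proba = calibrated.predict_proba(x_test)[:, 1]
print(f'Brier score: {brier_score_loss(y_test, proba):.4f}')
```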

Pre-Built Syllabus

We have delivered this course in various lengths, targeting different audiences. The final syllabus is customized according to the audience, the allotted time and other needs.

| Day | Subject | Details |
|-----|---------|---------|
| 1 | Course Overview | Motivation, Agenda, Notations |
| | Classification Recap | Geometric Interpretation, The Decision Function, Probabilistic Classification, Loss vs. Score |
| | Classifiers Recap | Linear Classifier, SVM, Kernel SVM, Logistic Regression, Decision Tree, Ensemble Methods |
| | Notebook 001 | Classification with SVM |
| | Ensemble Methods | Stacking, Random Forest, AdaBoost, Gradient Boosting, XGBoost, LightGBM |
| | Notebook 002 | Classification with LightGBM |
| | Ordinal Classification | Use Case, Limitations of the Classifier, Definition, Loss Function |
| | Notebook 003 | Ordinal Classification |
| 2 | Cost Sensitive Classifier | Imbalanced Data & Scores, Loss Matrix Model, Cost Sensitive Classifier, Weights Adjustment |
| | Notebook 004 | Cost Sensitive Logistic Regression Classifier |
| | Custom Loss Function | Test Case, Implementation in Python, Using XGBoost |
| | Notebook 005 | Custom Loss Function in XGBoost |
| | Notebook 006 | Custom Loss Function in LightGBM |
| | Classifier Calibration | Motivation (Decision Based on Probability), Calibration Methods |
| | Notebook 007 | Classifier Calibration |
| 3 | Random Process Recap | Probability, The Gaussian Distribution, Random Process, Stationarity, Auto Correlation Function |
| | Local Regressors | Kernel Regression, Weighted Local Kernel Regression |
| | Notebook 008 | Kernel Regression |
| | Notebook 009 | Local Kernel Regression |
| | Gaussian Process | The Model, The Parameters, Fitting, Gaussian Process Regressor, Gaussian Process Classifier |
| | Notebook 010 | The Gaussian Process Regressor |
| | Notebook 011 | The Gaussian Process Classifier |
| | Isotonic Regression | Use Cases, The Model, The Loss Function, Fitting |
| | Notebook 012 | Isotonic Regression |
| 4 | Feature Engineering | Discrete Features, Issues with One Hot Encoding, Ordinal Features, Cyclic Features |
| | Notebook 013 | Feature Engineering for Discrete Data |
| | Feature Transforms | Cyclic Features, Cyclic Objectives, Unsupervised Methods & LDA |
| | Notebook 014 | Feature Engineering for Cyclic Features |
| | Notebook 015 | Feature Engineering for a Cyclic Target (Regression) |
| | Notebook 016 | Feature Engineering with Unsupervised Methods |
| | Feature Imputation | Statistics Based, Feature Based, Model Based |
| | Notebook 017 | Missing Data Imputation |
| | AutoML | Concept, Frameworks, Pipeline |
| | Notebook 018 | AutoML |
| | Feature Selection | Univariate Methods, Multivariate Methods, Sparsity, Issues with Correlation, Predictive Score |
| | Notebook 019 | Feature Selection Methods |
| | Notebook 020 | Predictive Score for Feature Analysis & Selection |
| 5 | Hyper Parameter Optimization - Grid Methods | Cross Validation, Uniform Search, Random Search, Prior |
| | Notebook 021 | Cross Validation (Leave One Out) & Uniform Grid Search |
| | Notebook 022 | Random Grid Search |
| | Hyper Parameter Optimization - Bayesian Methods | Concept, Conversion from Discrete to Pseudo Smooth, Optimization |
| | Notebook 023 | Bayesian Optimization with the Weights & Biases Framework |
| 6 | Ranking | Motivation, Use Cases, Ranking vs. Classification |
| | Ranking Models | Pointwise, Pairwise, Listwise |
| | Learning to Rank | Data Preparation, XGBoost, LightGBM |
| | Notebook 024 | Ranking Basics |
| | Notebook 025 | Product Recommendation (Recommendation System) |
| 7 | Time Series & Forecasting | Stationarity, Seasonality, Noise, Forecasting Loss |
| | Time Series Concepts | Differencing, Exponential Smoothing |
| | Time Series Generation Models | MA, AR, ARMA, ARIMA, SARIMA |
| | Time Series Forecasting | Statistical Models, Kalman Filter, Learning Models |
| | Notebook 025 | Forecasting with Statistical Models |
| | Notebook 026 | Forecasting with the Kalman Filter |
| | Notebook 027 | Forecasting with a Learning Model |
| | Forecasting by Regression | Feature Engineering (AutoML), The MiniRocket Feature Extractor, Models |
| | Notebook 028 | Forecasting by Regression (XGBoost / LightGBM) |
| 8 | Explainability & Interpretability | Motivation, Decision Explaining, Results Analysis, Results Investigation |
| | Models | The LIME Model, The SHAP Model, Integration into a Pipeline |
| | Notebook 029 | Explainability by LIME |
| | Notebook 030 | Explainability by SHAP |
| | Notebook 031 | Integration into a Pipeline |

Some of the notebooks are in the form of guided and interactive exercises.

The days are a rough partition; in practice, some subjects take more than a day and some less. The short sketches below illustrate, under simplified assumptions, a few of the techniques listed in the syllabus.
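
For the Day 2 material on custom loss functions (Notebooks 005 / 006), here is a minimal sketch of a class-weighted logistic loss plugged into XGBoost's low-level training API; the weight of 4.0 on the positive class and the synthetic data are illustrative assumptions:

```python
import numpy as np
import xgboost as xgb
from sklearn.datasets import make_classification

# Illustrative imbalanced synthetic data
X, y = make_classification(n_samples=1000, weights=[0.85, 0.15], random_state=0)
d_train = xgb.DMatrix(X, label=y)

def weighted_log_loss(preds, dtrain):
    """Gradient and Hessian of a class-weighted logistic loss."""
    labels = dtrain.get_label()
    p = 1.0 / (1.0 + np.exp(-preds))       # sigmoid of the raw margin
    w = np.where(labels == 1.0, 4.0, 1.0)  # up-weight the positive class (illustrative)
    grad = w * (p - labels)
    hess = w * p * (1.0 - p)
    return grad, hess

# Pass the custom objective through the `obj` argument
booster = xgb.train({'max_depth': 3, 'eta': 0.1}, d_train, num_boost_round=50, obj=weighted_log_loss)
```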
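
For the Gaussian Process material of Day 3 (Notebook 010), a minimal regression sketch with SciKit Learn; the noisy sine data and the kernel choice are illustrative:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Illustrative 1D data: a noisy sine
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=(40, 1))
y = np.sin(x).ravel() + 0.1 * rng.standard_normal(40)

# RBF kernel plus a white-noise term; the kernel hyper parameters
# are fitted by maximizing the log marginal likelihood
kernel = 1.0 * RBF(length_scale=1.0) + WhiteKernel(noise_level=0.1)
gpr = GaussianProcessRegressor(kernel=kernel, random_state=0).fit(x, y)

# The GP returns a predictive mean and an uncertainty estimate per point
x_new = np.linspace(0, 10, 100).reshape(-1, 1)
y_mean, y_std = gpr.predict(x_new, return_std=True)
```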
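
For the Day 5 grid methods (Notebook 022), a minimal random grid search sketch; the SVM model, the log-uniform priors and the budget of 25 samples are illustrative:

```python
from scipy.stats import loguniform
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

# Sample hyper parameters from log-uniform priors instead of a fixed uniform grid
param_distributions = {'C': loguniform(1e-2, 1e3), 'gamma': loguniform(1e-4, 1e0)}
search = RandomizedSearchCV(SVC(), param_distributions, n_iter=25, cv=5, random_state=0)
search.fit(X, y)

print(search.best_params_, search.best_score_)
```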
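
For the Learning to Rank material of Day 6 (Notebook 024), a minimal pairwise (LambdaRank) sketch with LightGBM; the query structure and relevance grades are randomly generated for illustration:

```python
import numpy as np
import lightgbm as lgb

# Illustrative data: 100 queries with 10 candidate items each
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 5))
y = rng.integers(0, 4, size=1000)  # integer relevance grades 0..3
group = [10] * 100                 # number of items per query, in order

# The LambdaRank objective optimizes an NDCG-like ranking criterion
ranker = lgb.LGBMRanker(objective='lambdarank', n_estimators=100)
ranker.fit(X, y, group=group)

# A higher score means a higher rank within a query
scores = ranker.predict(X[:10])
```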
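
For the statistical forecasting models of Day 7, a minimal SARIMA sketch using statsmodels; the simulated monthly series and the model orders are illustrative:

```python
import numpy as np
from statsmodels.tsa.statespace.sarimax import SARIMAX

# Illustrative monthly series: trend + yearly seasonality + noise
rng = np.random.default_rng(0)
t = np.arange(120)
y = 0.05 * t + 2.0 * np.sin(2 * np.pi * t / 12) + rng.standard_normal(120)

# SARIMA(1, 1, 1) x (1, 0, 1, 12): the differencing handles the trend,
# the seasonal terms handle the yearly period
model = SARIMAX(y, order=(1, 1, 1), seasonal_order=(1, 0, 1, 12))
result = model.fit(disp=False)

forecast = result.forecast(steps=12)  # point forecast for the next 12 months
```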
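
Finally, for the Day 8 explainability material (Notebook 030), a minimal SHAP sketch over a tree ensemble; the data set and model choice are illustrative, and the `shap` package is assumed to be installed:

```python
import shap
import xgboost as xgb
from sklearn.datasets import load_breast_cancer

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
model = xgb.XGBClassifier(n_estimators=100, max_depth=3).fit(X, y)

# TreeExplainer computes exact SHAP values for tree ensembles:
# one additive contribution per feature per sample
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)
```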

Audience

Experienced developers who use Machine Learning: Algorithm Engineers, Data Engineers, Data Scientists.

Prerequisites

  • Mathematical: Linear Algebra (Basic), Probability / Statistics (Basic).
  • Machine Learning: Classification, Regression.
  • Programming: Python.
  • Experience with SciKit Learn.

In case any of the prerequisites are not met, we can offer a half-day sprint on Linear Algebra, Calculus, Probability and Python.