Machine Learning from Scratch

Content from Machine Learning from Scratch, covering statistical methods, machine learning algorithms, and practical implementations with mathematical foundations.

66 items
Hypothesis Testing Summary & Practical Guide: Reporting, Test Selection & scipy.stats
Data, Analytics & AI · Machine Learning · Machine Learning from Scratch
Jan 10, 2026 · 18 min read

Practical reporting guidelines, summary of key concepts, test selection parameters table, multiple comparison corrections table, and scipy.stats functions reference. Complete reference guide for hypothesis testing.

Multiple Comparisons: FWER, FDR, Bonferroni, Holm & Benjamini-Hochberg
Data, Analytics & AI · Machine Learning · Machine Learning from Scratch
Jan 9, 2026 · 28 min read

Family-wise error rate, false discovery rate, Bonferroni correction, Holm's method, and Benjamini-Hochberg procedure. Learn how to control error rates when conducting multiple hypothesis tests.

Effect Sizes and Statistical Significance: Cohen's d & Practical Significance
Data, Analytics & AI · Machine Learning · Machine Learning from Scratch
Jan 8, 2026 · 20 min read

Cohen's d, practical significance, interpreting effect sizes, and why tiny p-values can mean tiny effects. Learn to distinguish statistical significance from practical importance.

Sample Size, Minimum Detectable Effect & Power: Power Analysis & MDE Calculation
Data, Analytics & AI · Machine Learning · Machine Learning from Scratch
Jan 7, 2026 · 31 min read

Power analysis, sample size determination, MDE calculation, and avoiding underpowered studies. Learn how to design studies with adequate sensitivity to detect meaningful effects.

Type I and Type II Errors: False Positives, False Negatives & Statistical Power
Data, Analytics & AI · Machine Learning · Machine Learning from Scratch
Jan 6, 2026 · 33 min read

Understanding false positives, false negatives, statistical power, and the tradeoff between error types. Learn how to balance Type I and Type II errors in study design.

ANOVA (Analysis of Variance): One-Way ANOVA, Post-Hoc Tests & Assumptions
Data, Analytics & AI · Machine Learning · Machine Learning from Scratch
Jan 5, 2026 · 24 min read

One-way ANOVA, post-hoc tests, assumptions, and when to use ANOVA. Learn how to compare means across three or more groups while controlling Type I error rates.

The F-Test and F-Distribution: Comparing Variances, Regression & Nested Models
Data, Analytics & AI · Machine Learning · Machine Learning from Scratch
Jan 4, 2026 · 25 min read

F-distribution, F-test for comparing variances, F-test in regression, and nested model comparison. Learn how F-tests extend hypothesis testing beyond means to variance analysis and model comparison.

The T-Test: One-Sample, Two-Sample (Pooled & Welch), Paired Tests & Decision Framework
Data, Analytics & AI · Machine Learning · Machine Learning from Scratch
Jan 3, 2026 · 31 min read

Complete guide to t-tests including one-sample, two-sample (pooled and Welch), paired tests, assumptions, and decision framework. Learn when to use each variant and how to check assumptions.

Confidence Intervals and Test Assumptions
Data, Analytics & AI · Machine Learning · Machine Learning from Scratch
Jan 2, 2026 · 25 min read

Mathematical equivalence between confidence intervals and hypothesis tests, test assumptions (independence, normality, equal variances), and choosing between z and t tests. Learn how to validate assumptions and select appropriate tests.

The Z-Test: One-Sample, Two-Sample & Proportion Tests Complete Guide
Data, Analytics & AI · Machine Learning · Machine Learning from Scratch
Jan 2, 2026 · 24 min read

Complete guide to z-tests including one-sample, two-sample, and proportion tests. Learn when to use z-tests, how to calculate test statistics, and interpret results when population variance is known.

P-values and Hypothesis Test Setup
Data, Analytics & AI · Machine Learning · Machine Learning from Scratch
Jan 1, 2026 · 22 min read

Foundation of hypothesis testing covering p-values, null and alternative hypotheses, one-sided vs two-sided tests, and test statistics. Learn how to set up and interpret hypothesis tests correctly.

DBSCAN Clustering: Density-Based Algorithm for Finding Arbitrary Shapes
Data, Analytics & AI · Software Engineering · Machine Learning · Machine Learning from Scratch
Sep 10, 2025 · 60 min read

Master DBSCAN (Density-Based Spatial Clustering of Applications with Noise), the algorithm that discovers clusters of any shape without requiring predefined cluster counts. Learn core concepts, parameter tuning, and practical implementation.

Quadratic Programming for Portfolio Optimization: Complete Guide with Python Implementation
Machine Learning from Scratch · Machine Learning · Data, Analytics & AI
Sep 7, 2025 · 45 min read

Learn quadratic programming (QP) for portfolio optimization, including the mean-variance framework, efficient frontier construction, and scipy implementation with practical examples.

Vehicle Routing Problem with Time Windows: Complete Guide to VRPTW Optimization with OR-Tools
Data, Analytics & AI · Machine Learning · Machine Learning from Scratch · optimization
Sep 4, 2025 · 65 min read

Master the Vehicle Routing Problem with Time Windows (VRPTW), including mathematical formulation, constraint programming, and practical implementation using Google OR-Tools for logistics optimization.

Minimum Cost Flow Slotting: Complete Guide to Network Flow Optimization & Resource Allocation
Data, Analytics & AI · Software Engineering · Machine Learning · Machine Learning from Scratch
Sep 1, 2025 · 71 min read

Learn minimum cost flow optimization for slotting problems, including network flow theory, mathematical formulation, and practical implementation with OR-Tools. Master resource allocation across time slots, capacity constraints, and cost structures.

Mixed Integer Linear Programming (MILP) for Factory Optimization: Complete Guide with Mathematical Foundations & Implementation
Data, Analytics & AI · Software Engineering · Machine Learning · Machine Learning from Scratch
Aug 29, 2025 · 69 min read

Complete guide to Mixed Integer Linear Programming (MILP) for factory optimization, covering mathematical foundations, constraint modeling, branch-and-bound algorithms, and practical implementation with Google OR-Tools. Learn how to optimize production planning with discrete setup decisions and continuous quantities.

CP-SAT Rostering: Complete Guide to Constraint Programming for Workforce Scheduling
Data, Analytics & AI · Software Engineering · Machine Learning · Machine Learning from Scratch
Aug 26, 2025 · 60 min read

Learn CP-SAT rostering using Google OR-Tools to solve complex workforce scheduling problems with binary decision variables, coverage constraints, and employee availability. Master constraint programming for optimal employee shift assignments.

NHITS: Neural Hierarchical Interpolation for Time Series Forecasting with Multi-Scale Decomposition & Implementation
Data, Analytics & AI · Machine Learning · Machine Learning from Scratch · time-series · deep-learning
Aug 23, 2025 · 72 min read

Master NHITS (Neural Hierarchical Interpolation for Time Series), a deep learning architecture for multi-scale time series forecasting. Learn hierarchical decomposition, neural interpolation, and how to implement NHITS for complex temporal patterns in retail, energy, and financial data.

N-BEATS: Neural Basis Expansion Analysis for Time Series Forecasting
Data, Analytics & AI · Software Engineering · Machine Learning · Machine Learning from Scratch
Aug 20, 2025 · 56 min read

Complete guide to N-BEATS, an interpretable deep learning architecture for time series forecasting. Learn how N-BEATS decomposes time series into trend and seasonal components, understand the mathematical foundation, and implement it in PyTorch.

HDBSCAN Clustering: Complete Guide to Hierarchical Density-Based Clustering with Automatic Cluster Selection
Data, Analytics & AI · Software Engineering · Machine Learning · Machine Learning from Scratch
Aug 17, 2025 · 39 min read

Complete guide to HDBSCAN clustering algorithm covering density-based clustering, automatic cluster selection, noise detection, and handling variable density clusters. Learn how to implement HDBSCAN for real-world clustering problems.

Hierarchical Clustering: Complete Guide with Dendrograms, Linkage Criteria & Implementation
Data, Analytics & AI · Software Engineering · Machine Learning · Machine Learning from Scratch
Aug 14, 2025 · 55 min read

Comprehensive guide to hierarchical clustering, including dendrograms, linkage criteria (single, complete, average, Ward), and scikit-learn implementation. Learn how to build cluster hierarchies and interpret dendrograms.

SARIMA: Complete Guide to Seasonal Time Series Forecasting with Implementation
Data, Analytics & AI · Software Engineering · Machine Learning · Machine Learning from Scratch
Aug 11, 2025 · 35 min read

Learn SARIMA (Seasonal AutoRegressive Integrated Moving Average) for forecasting time series with seasonal patterns. Includes mathematical foundations, step-by-step implementation, and practical applications.

Exponential Smoothing (ETS): Complete Guide to Time Series Forecasting with Weighted Averages & Holt-Winters
Data, Analytics & AI · Software Engineering · Machine Learning · Machine Learning from Scratch
Aug 8, 2025 · 60 min read

Learn exponential smoothing for time series forecasting, including simple, double (Holt's), and triple (Holt-Winters) methods. Master weighted averages, smoothing parameters, and practical implementation in Python.

Prophet Time Series Forecasting: Complete Guide with Trend, Seasonality & Holiday Effects
Data, Analytics & AI · Software Engineering · Machine Learning · Machine Learning from Scratch
Aug 5, 2025 · 41 min read

Learn Prophet time series forecasting including additive decomposition, trend modeling, seasonal patterns, and holiday effects. Master Facebook's powerful forecasting tool for business applications.

K-means Clustering: Complete Guide with Algorithm, Implementation & Best Practices
Data, Analytics & AI · Machine Learning · Machine Learning from Scratch · unsupervised-learning
Aug 2, 2025 · 92 min read

Master K-means clustering from mathematical foundations to practical implementation. Learn the algorithm, initialization strategies, optimal cluster selection, and real-world applications.

t-SNE: Complete Guide to Dimensionality Reduction & High-Dimensional Data Visualization
Data, Analytics & AI · Machine Learning · Machine Learning from Scratch
Jul 30, 2025 · 34 min read

A comprehensive guide covering t-SNE (t-Distributed Stochastic Neighbor Embedding), including mathematical foundations, probability distributions, KL divergence optimization, and practical implementation. Learn how to visualize complex high-dimensional datasets effectively.

LIME Explainability: Complete Guide to Local Interpretable Model-Agnostic Explanations
Data, Analytics & AI · Machine Learning · Machine Learning from Scratch
Jul 27, 2025 · 32 min read

A comprehensive guide covering LIME (Local Interpretable Model-Agnostic Explanations), including mathematical foundations, implementation strategies, and practical applications. Learn how to explain any machine learning model's predictions with interpretable local approximations.

UMAP: Complete Guide to Uniform Manifold Approximation and Projection for Dimensionality Reduction
Data, Analytics & AI · Machine Learning · Machine Learning from Scratch
Jul 24, 2025 · 34 min read

A comprehensive guide covering UMAP dimensionality reduction, including mathematical foundations, fuzzy simplicial sets, manifold learning, and practical implementation. Learn how to preserve both local and global structure in high-dimensional data visualization.

PCA (Principal Component Analysis): Complete Guide with Mathematical Foundation & Implementation
Data, Analytics & AI · Machine Learning · Machine Learning from Scratch
Jul 21, 2025 · 21 min read

A comprehensive guide covering Principal Component Analysis, including mathematical foundations, eigenvalue decomposition, and practical implementation. Learn how to reduce dimensionality while preserving maximum variance in your data.

XGBoost: Complete Guide to Extreme Gradient Boosting with Mathematical Foundations, Optimization Techniques & Python Implementation
Data, Analytics & AI · Software Engineering · Machine Learning · Machine Learning from Scratch
Jul 18, 2025 · 76 min read

A comprehensive guide to XGBoost (eXtreme Gradient Boosting), including second-order Taylor expansion, regularization techniques, split gain optimization, ranking loss functions, and practical implementation with classification, regression, and learning-to-rank examples.

SHAP (SHapley Additive exPlanations): Complete Guide to Model Interpretability
Data, Analytics & AI · Software Engineering · Machine Learning · Machine Learning from Scratch
Jul 15, 2025 · 55 min read

A comprehensive guide to SHAP values covering mathematical foundations, feature attribution, and practical implementations for explaining any machine learning model.

LightGBM: Fast Gradient Boosting with Leaf-wise Tree Growth - Complete Guide with Math Formulas & Python Implementation
Data, Analytics & AI · Software Engineering · Machine Learning · Machine Learning from Scratch
Jul 12, 2025 · 53 min read

A comprehensive guide covering LightGBM gradient boosting framework, including leaf-wise tree growth, histogram-based binning, GOSS sampling, exclusive feature bundling, mathematical foundations, and Python implementation. Learn how to use LightGBM for large-scale machine learning with speed and memory efficiency.

CatBoost: Complete Guide to Categorical Boosting with Target Encoding, Symmetric Trees & Python Implementation
Data, Analytics & AI · Software Engineering · Machine Learning · Machine Learning from Scratch
Jul 9, 2025 · 40 min read

A comprehensive guide to CatBoost (Categorical Boosting), including categorical feature handling, target statistics, symmetric trees, ordered boosting, regularization techniques, and practical implementation with mixed data types.

Isolation Forest: Complete Guide to Unsupervised Anomaly Detection with Random Trees & Path Length Analysis
Data, Analytics & AI · Software Engineering · Machine Learning · Machine Learning from Scratch
Jul 6, 2025 · 45 min read

A comprehensive guide to Isolation Forest covering unsupervised anomaly detection, path length calculations, harmonic numbers, anomaly scoring, and implementation in scikit-learn. Learn how to detect rare outliers in high-dimensional data with practical examples.

Boosted Trees: Complete Guide to Gradient Boosting Algorithm & Implementation
Data, Analytics & AI · Software Engineering · Machine Learning · Machine Learning from Scratch
Jul 3, 2025 · 47 min read

A comprehensive guide to boosted trees and gradient boosting, covering ensemble learning, loss functions, sequential error correction, and scikit-learn implementation. Learn how to build high-performance predictive models using gradient boosting.

Random Forest: Complete Guide to Ensemble Learning with Bootstrap Sampling & Feature Selection
Data, Analytics & AI · Software Engineering · Machine Learning · Machine Learning from Scratch
Jun 30, 2025 · 42 min read

A comprehensive guide to Random Forest covering ensemble learning, bootstrap sampling, random feature selection, bias-variance tradeoff, and implementation in scikit-learn. Learn how to build robust predictive models for classification and regression with practical examples.

CART Decision Trees: Complete Guide to Classification and Regression Trees with Mathematical Foundations & Python Implementation
Data, Analytics & AI · Software Engineering · Machine Learning · Machine Learning from Scratch
Jun 27, 2025 · 44 min read

A comprehensive guide to CART (Classification and Regression Trees), including mathematical foundations, Gini impurity, variance reduction, and practical implementation with scikit-learn. Learn how to build interpretable decision trees for both classification and regression tasks.

Logistic Regression: Complete Guide with Mathematical Foundations & Python Implementation
Data, Analytics & AI · Software Engineering · Machine Learning · Machine Learning from Scratch
Jun 24, 2025 · 45 min read

A comprehensive guide to logistic regression covering mathematical foundations, the logistic function, optimization algorithms, and practical implementation. Learn how to build binary classification models with interpretable results.

Poisson Regression: Complete Guide to Count Data Modeling with Mathematical Foundations & Python Implementation
Data, Analytics & AI · Software Engineering · Machine Learning · Machine Learning from Scratch
Jun 21, 2025 · 47 min read

A comprehensive guide to Poisson regression for count data analysis. Learn mathematical foundations, maximum likelihood estimation, rate ratio interpretation, and practical implementation with scikit-learn. Includes real-world examples and diagnostic techniques.

Spline Regression: Complete Guide to Non-Linear Modeling with Mathematical Foundations & Python Implementation
Data, Analytics & AI · Software Engineering · Machine Learning · Machine Learning from Scratch
Jun 18, 2025 · 65 min read

A comprehensive guide to spline regression covering B-splines, knot selection, natural cubic splines, and practical implementation. Learn how to model complex non-linear relationships with piecewise polynomials.

Multinomial Logistic Regression: Complete Guide with Mathematical Foundations & Python Implementation
Data, Analytics & AI · Software Engineering · Machine Learning · Machine Learning from Scratch
Jun 15, 2025 · 49 min read

A comprehensive guide to multinomial logistic regression covering mathematical foundations, softmax function, coefficient estimation, and practical implementation in Python with scikit-learn.

Elastic Net Regularization: Complete Guide with Mathematical Foundations & Python Implementation
Data, Analytics & AI · Software Engineering · Machine Learning · Machine Learning from Scratch
Jun 12, 2025 · 52 min read

A comprehensive guide covering Elastic Net regularization, including mathematical foundations, geometric interpretation, and practical implementation. Learn how to combine L1 and L2 regularization for optimal feature selection and model stability.

Polynomial Regression: Complete Guide with Math, Implementation & Best Practices
Data, Analytics & AI · Software Engineering · Machine Learning · Machine Learning from Scratch
Jun 9, 2025 · 37 min read

A comprehensive guide covering polynomial regression, including mathematical foundations, implementation in Python, bias-variance trade-offs, and practical applications. Learn how to model non-linear relationships using polynomial features.

Ridge Regression (L2 Regularization): Complete Guide with Mathematical Foundations & Implementation
Data, Analytics & AI · Software Engineering · Machine Learning · Machine Learning from Scratch
Jun 6, 2025 · 35 min read

A comprehensive guide covering Ridge regression and L2 regularization, including mathematical foundations, geometric interpretation, bias-variance tradeoff, and practical implementation. Learn how to prevent overfitting in linear regression using coefficient shrinkage.

Variable Relationships: Complete Guide to Covariance, Correlation & Regression Analysis
Data, Analytics & AI · Machine Learning · Machine Learning from Scratch
Jun 3, 2025 · 27 min read

A comprehensive guide covering relationships between variables, including covariance, correlation, simple and multiple regression. Learn how to measure, model, and interpret variable associations while understanding the crucial distinction between correlation and causation.

Data Quality & Outliers: Complete Guide to Measurement Error, Missing Data & Detection Methods
Data, Analytics & AI · Machine Learning from Scratch
May 31, 2025 · 35 min read

A comprehensive guide covering data quality fundamentals, including measurement error, systematic bias, missing data mechanisms, and outlier detection. Learn how to assess, diagnose, and improve data quality for reliable statistical analysis and machine learning.

Statistical Modeling Guide: Model Fit, Overfitting vs Underfitting & Cross-Validation
Data, Analytics & AI · Machine Learning · Machine Learning from Scratch
May 28, 2025 · 21 min read

A comprehensive guide covering statistical modeling fundamentals, including measuring model fit with R-squared and RMSE, understanding the bias-variance tradeoff between overfitting and underfitting, and implementing cross-validation for robust model evaluation.

Data Visualization Guide: Histograms, Box Plots & Scatter Plots for Exploratory Analysis
Data, Analytics & AI · data-science · Machine Learning from Scratch
May 25, 2025 · 19 min read

A comprehensive guide to foundational data visualization techniques including histograms, box plots, and scatter plots. Learn how to understand distributions, identify outliers, reveal relationships, and build intuition before statistical analysis.

Gauss-Markov Assumptions: Foundation of Linear Regression & OLS Estimation
Data, Analytics & AI · Machine Learning · Machine Learning from Scratch
May 22, 2025 · 16 min read

A comprehensive guide to the Gauss-Markov assumptions that underpin linear regression. Learn the five key assumptions, how to test them, consequences of violations, and practical remedies for reliable OLS estimation.

Normalization: Complete Guide to Feature Scaling with Min-Max Implementation
Data, Analytics & AI · Machine Learning · Machine Learning from Scratch
May 19, 2025 · 14 min read

A comprehensive guide to normalization in machine learning, covering min-max scaling, proper train-test split implementation, when to use normalization vs standardization, and practical applications for neural networks and distance-based algorithms.

Sampling: From Populations to Observations - Complete Guide to Statistical Sampling Methods
Data, Analytics & AI · Machine Learning from Scratch
May 16, 2025 · 16 min read

A comprehensive guide to sampling theory and methods in data science, covering simple random sampling, stratified sampling, cluster sampling, sampling error, and uncertainty quantification. Learn how to design effective sampling strategies and interpret results from sample data.

Probability Distributions: Complete Guide to Normal, Binomial, Poisson & More for Data Science
Data, Analytics & AI · Machine Learning from Scratch · Machine Learning
May 13, 2025 · 18 min read

A comprehensive guide covering probability distributions for data science, including normal, t-distribution, binomial, Poisson, exponential, and log-normal distributions. Learn when and how to apply each distribution with practical examples and visualizations.

Statistical Inference: Drawing Conclusions from Data - Complete Guide with Estimation & Hypothesis Testing
Data, Analytics & AI · Machine Learning from Scratch · Machine Learning
May 10, 2025 · 25 min read

A comprehensive guide covering statistical inference, including point and interval estimation, confidence intervals, hypothesis testing, p-values, Type I and Type II errors, and common statistical tests. Learn how to make rigorous conclusions about populations from sample data.

Descriptive Statistics: Complete Guide to Summarizing and Understanding Data with Python
Data, Analytics & AI · Machine Learning · Machine Learning from Scratch
May 7, 2025 · 26 min read

A comprehensive guide covering descriptive statistics fundamentals, including measures of central tendency (mean, median, mode), variability (variance, standard deviation, IQR), and distribution shape (skewness, kurtosis). Learn how to choose appropriate statistics for different data types and apply them effectively in data science.

Central Limit Theorem: Foundation of Statistical Inference & Sampling Distributions
Data, Analytics & AI · Machine Learning from Scratch
May 4, 2025 · 49 min read

A comprehensive guide to the Central Limit Theorem covering convergence to normality, standard error, sample size requirements, and practical applications in statistical inference. Learn how CLT enables confidence intervals, hypothesis testing, and machine learning methods.

Probability Basics: Foundation of Statistical Reasoning & Key Concepts
Data, Analytics & AI · Machine Learning · Machine Learning from Scratch
May 1, 2025 · 50 min read

A comprehensive guide to probability theory fundamentals, covering random variables, probability distributions, expected value and variance, independence and conditional probability, Law of Large Numbers, and Central Limit Theorem. Learn how to apply probabilistic reasoning to data science and machine learning applications.

Types of Data: Complete Guide to Data Classification - Quantitative, Qualitative, Discrete & Continuous
Data, Analytics & AI · Machine Learning · Machine Learning from Scratch
Apr 28, 2025 · 15 min read

Master data classification with this comprehensive guide covering quantitative vs. qualitative data, discrete vs. continuous data, and the data type hierarchy including nominal, ordinal, interval, and ratio scales. Learn how to choose appropriate analytical methods, avoid common pitfalls, and apply correct preprocessing techniques for data science and machine learning projects.

Standardization: Normalizing Features for Fair Comparison - Complete Guide with Math Formulas & Python Implementation
Data, Analytics & AI · Software Engineering · Machine Learning · Machine Learning from Scratch
Apr 25, 2025 · 11 min read

A comprehensive guide to standardization in machine learning, covering mathematical foundations, practical implementation, and Python examples. Learn how to properly standardize features for fair comparison across different scales and units.

Sum of Squared Errors (SSE): Complete Guide to Measuring Model Performance

Apr 22, 2025 · 20 min read

A comprehensive guide to the Sum of Squared Errors (SSE) metric in regression analysis. Learn the mathematical foundation, visualization techniques, practical applications, and limitations of SSE with Python examples and detailed explanations.

L1 Regularization (LASSO): Complete Guide with Math, Examples & Python Implementation

Apr 19, 2025 · 62 min read

A comprehensive guide to L1 regularization (LASSO) in machine learning, covering mathematical foundations, optimization theory, practical implementation, and real-world applications. Learn how LASSO performs automatic feature selection through sparsity.

Multiple Linear Regression: Complete Guide with Formulas, Examples & Python Implementation

Apr 16, 2025 · 40 min read

A comprehensive guide to multiple linear regression, including mathematical foundations, intuitive explanations, worked examples, and Python implementation. Learn how to fit, interpret, and evaluate multiple linear regression models with real-world applications.

Multicollinearity in Regression: Complete Guide to Detection, Impact & Solutions

Apr 13, 2025 · 42 min read

Learn about multicollinearity in regression analysis with this practical guide, which covers VIF analysis, correlation matrices, coefficient stability testing, and approaches such as Ridge regression, Lasso, and principal component regression (PCR). Includes Python code examples, visualizations, and useful techniques for working with correlated predictors in machine learning models.

Ordinary Least Squares (OLS): Complete Mathematical Guide with Formulas, Examples & Python Implementation

Apr 10, 2025 · 34 min read

A comprehensive guide to Ordinary Least Squares (OLS) regression, including mathematical derivations, matrix formulations, step-by-step examples, and Python implementation. Learn the theory behind OLS, understand the normal equations, and implement OLS from scratch using NumPy and scikit-learn.

Simple Linear Regression: Complete Guide with Formulas, Examples & Python Implementation

Apr 7, 2025 · 52 min read

A complete hands-on guide to simple linear regression, including formulas, intuitive explanations, worked examples, and Python code. Learn how to fit, interpret, and evaluate a simple linear regression model from scratch.

R-squared (Coefficient of Determination): Formula, Intuition & Model Fit in Regression

Apr 4, 2025 · 8 min read

A comprehensive guide to R-squared, the coefficient of determination. Learn what R-squared means, how to calculate it, interpret its value, and use it to evaluate regression models. Includes formulas, intuitive explanations, practical guidelines, and visualizations.

Generalized Linear Models: Complete Guide with Mathematical Foundations & Python Implementation

Apr 1, 2025 · 53 min read

A comprehensive guide to Generalized Linear Models (GLMs), covering logistic regression, Poisson regression, and maximum likelihood estimation. Learn how to model binary outcomes, count data, and non-normal distributions with practical Python examples.
