Writing

Mostly thoughts on data, AI, software engineering—and where they intersect with finance, business, and entrepreneurship.

Backpropagation - Training Deep Neural Networks
Data, Analytics & AI • Machine Learning • LLM and GenAI
Oct 1, 2025 • 20 min read

In the 1980s, neural networks hit a wall: nobody knew how to train deep models. That changed when Rumelhart, Hinton, and Williams popularized backpropagation in 1986. Their use of the chain rule let researchers work out which parts of a network deserved credit or blame for an error, making it practical to train multi-layer networks. That breakthrough underpins everything from word embeddings to transformer-based language models.

BLEU Metric - Automatic Evaluation for Machine Translation
Data, Analytics & AI • Machine Learning • LLM and GenAI
Oct 1, 2025 • 5 min read

In 2002, IBM researchers introduced BLEU (Bilingual Evaluation Understudy), revolutionizing machine translation evaluation by providing the first widely adopted automatic metric that correlated well with human judgments. By comparing n-gram overlap with reference translations and adding a brevity penalty, BLEU enabled rapid iteration and development, establishing automatic evaluation as a fundamental principle across all language AI.

Convolutional Neural Networks - Revolutionizing Feature Learning
Data, Analytics & AI • Machine Learning • LLM and GenAI
Oct 1, 2025 • 4 min read

In 1988, Yann LeCun introduced Convolutional Neural Networks at Bell Labs, forever changing how machines process visual information. While initially designed for computer vision, CNNs introduced automatic feature learning, translation invariance, and parameter sharing. These principles would later revolutionize language AI, inspiring text CNNs, 1D convolutions for sequential data, and even attention mechanisms in transformers.

Conditional Random Fields - Structured Prediction for Sequences
Data, Analytics & AI • Machine Learning • LLM and GenAI
Oct 1, 2025 • 5 min read

In 2001, Lafferty and colleagues introduced CRFs, a probabilistic framework that advanced structured prediction by modeling entire sequences jointly rather than making independent predictions. By capturing dependencies between adjacent elements through conditional probability and feature functions, CRFs became essential for part-of-speech tagging and named entity recognition, and established principles that would influence later sequence models.

ELIZA - The First Conversational AI Program
Data, Analytics & AI • Machine Learning • LLM and GenAI
Oct 1, 2025 • 3 min read

Joseph Weizenbaum's ELIZA, created in 1966, became the first computer program to hold something resembling a conversation. Using clever pattern-matching techniques, its famous DOCTOR script simulated a Rogerian psychotherapist. ELIZA showed that even simple tricks could create the illusion of understanding, bridging theory and practice in language AI.

Katz Back-off - Handling Sparse Data in Language Models
Data, Analytics & AI • Machine Learning • LLM and GenAI
Oct 1, 2025 • 4 min read

In 1987, Slava Katz solved one of statistical language modeling's biggest problems. When your model encounters word sequences it has never seen before, what do you do? His elegant solution was to "back off" to shorter sequences, a technique that made n-gram models practical for real-world applications. By redistributing probability mass and using shorter contexts when longer ones lack data, Katz back-off allowed language models to handle the infinite variety of human language with finite training data.

Long Short-Term Memory - Solving the Memory Problem
Data, Analytics & AI • Machine Learning • LLM and GenAI
Oct 1, 2025 • 6 min read

In 1997, Hochreiter and Schmidhuber addressed the vanishing gradient problem with LSTMs, introducing gated memory mechanisms that could selectively remember and forget information across long sequences. This breakthrough enabled practical language modeling, machine translation, and speech recognition, and established principles of gated information flow that would influence later sequence models.

MADALINE - Multiple Adaptive Linear Neural Networks
Data, Analytics & AI • Machine Learning • LLM and GenAI
Oct 1, 2025 • 6 min read

Bernard Widrow and Marcian Hoff built MADALINE at Stanford in 1962, taking neural networks beyond the perceptron's limitations. This adaptive architecture could tackle real-world engineering problems in signal processing and pattern recognition, proving that neural networks weren't just theoretical curiosities but practical tools for solving complex problems.

The Perceptron - Foundation of Modern Neural Networks
Data, Analytics & AI • Machine Learning • LLM and GenAI
Oct 1, 2025 • 7 min read

In 1957, Frank Rosenblatt created the perceptron at Cornell University, the first artificial neural network that could actually learn to classify patterns. This groundbreaking algorithm proved that machines could learn from examples, not just follow rigid rules. It established the foundation for modern deep learning and every neural network we use today.

Recurrent Neural Networks - Machines That Remember
Data, Analytics & AI • Machine Learning • LLM and GenAI
Oct 1, 2025 • 4 min read

In 1995, RNNs revolutionized sequence processing by introducing neural networks with memory: connections that loop back on themselves, allowing machines to process information that unfolds over time. This breakthrough enabled speech recognition and language modeling, and established the sequential processing paradigm that would influence LSTMs, GRUs, and eventually transformers.

Shannon's N-gram Model - The Foundation of Statistical Language Processing
Data, Analytics & AI • Machine Learning • LLM and GenAI
Oct 1, 2025 • 5 min read

Claude Shannon's 1948 work on information theory introduced n-gram models, one of the most foundational concepts in natural language processing. These deceptively simple statistical models predict language patterns by looking at sequences of words. They laid the groundwork for everything from autocomplete to machine translation in modern language AI.

SHRDLU - Understanding Language Through Action
Data, Analytics & AI • Machine Learning • LLM and GenAI
Oct 1, 2025 • 4 min read

Terry Winograd's 1968 SHRDLU system took a revolutionary approach to language understanding. Instead of just pattern matching, it created genuine comprehension within a simulated blocks world. SHRDLU could parse complex sentences, track what was happening, and demonstrate that computers could truly understand language when grounded in a physical context.

IBM Statistical Machine Translation - From Rules to Data
Data, Analytics & AI • Machine Learning • LLM and GenAI
Oct 1, 2025 • 4 min read

In 1991, IBM researchers revolutionized machine translation by introducing the first comprehensive statistical approach. Instead of hand-crafted linguistic rules, they treated translation as a statistical problem of finding word correspondences from parallel text data. This breakthrough established principles like data-driven learning, probabilistic modeling, and word alignment that would transform not just translation, but all of natural language processing.

From Symbolic Rules to Statistical Learning - The Paradigm Shift in NLP
Data, Analytics & AI • Machine Learning • LLM and GenAI
Oct 1, 2025 • 7 min read

Natural language processing underwent a fundamental shift from symbolic rules to statistical learning. Early systems relied on hand-crafted grammars and formal linguistic theories, but their limitations became clear. The statistical revolution of the 1980s transformed language AI by letting computers learn patterns from data instead of following rigid rules.

Time Delay Neural Networks - Processing Sequential Data with Temporal Convolutions
Data, Analytics & AI • Machine Learning • LLM and GenAI
Oct 1, 2025 • 6 min read

In 1987, Alex Waibel introduced Time Delay Neural Networks, a revolutionary architecture that changed how neural networks process sequential data. By introducing weight sharing across time and temporal convolutions, TDNNs laid the groundwork for modern convolutional and recurrent networks. This breakthrough enabled end-to-end learning for speech recognition and established principles that remain fundamental to language AI today.

The Turing Test - A Foundational Challenge for Language AI
Data, Analytics & AI • Machine Learning • LLM and GenAI
Oct 1, 2025 • 5 min read

In 1950, Alan Turing proposed a deceptively simple test for machine intelligence, originally called the Imitation Game. Could a machine fool a human judge into thinking it was human through conversation alone? This thought experiment shaped decades of AI research and remains surprisingly relevant today as we evaluate modern language models like GPT-4 and Claude.

WordNet - A Semantic Network for Language Understanding
Data, Analytics & AI • Machine Learning • LLM and GenAI
Oct 1, 2025 • 4 min read

In 1995, Princeton University released WordNet 1.0, a revolutionary lexical database that represented words not as isolated definitions, but as interconnected concepts in a semantic network. By capturing relationships like synonymy, hypernymy, and meronymy, WordNet established the principle that meaning is relational, influencing everything from word sense disambiguation to modern word embeddings and knowledge graphs.

Multicollinearity in Regression: Complete Guide to Detection, Impact & Solutions
Data, Analytics & AI • Software Engineering • Machine Learning
Sep 29, 2025 • 27 min read

Learn about multicollinearity in regression analysis with this practical guide. It covers VIF analysis, correlation matrices, coefficient stability testing, and remedies such as Ridge regression, Lasso, and principal component regression (PCR). Includes Python code examples, visualizations, and useful techniques for working with correlated predictors in machine learning models.

Ordinary Least Squares (OLS): Complete Mathematical Guide with Formulas, Examples & Python Implementation
Data, Analytics & AI • Software Engineering • Machine Learning
Sep 28, 2025 • 26 min read

A comprehensive guide to Ordinary Least Squares (OLS) regression, including mathematical derivations, matrix formulations, step-by-step examples, and Python implementation. Learn the theory behind OLS, understand the normal equations, and implement OLS from scratch using NumPy and scikit-learn.

Simple Linear Regression: Complete Guide with Formulas, Examples & Python Implementation
Data, Analytics & AI • Software Engineering • Machine Learning
Sep 26, 2025 • 38 min read

A complete hands-on guide to simple linear regression, including formulas, intuitive explanations, worked examples, and Python code. Learn how to fit, interpret, and evaluate a simple linear regression model from scratch.

R-squared (Coefficient of Determination): Formula, Intuition & Model Fit in Regression
Data, Analytics & AI • Machine Learning
Sep 25, 2025 • 6 min read

A comprehensive guide to R-squared, the coefficient of determination. Learn what R-squared means, how to calculate it, interpret its value, and use it to evaluate regression models. Includes formulas, intuitive explanations, practical guidelines, and visualizations.

Building Intelligent Agents with LangChain and LangGraph: Part 2 - Agentic Workflows
Data, Analytics & AI • Software Engineering • LLM and GenAI
Aug 2, 2025 • 11 min read

Learn how to build agentic workflows with LangChain and LangGraph.

Understanding Market Crashes: Where Does the Money Go and How Do Markets Recover?
Economics & Finance
Aug 1, 2025 • 5 min read

An in-depth look at what happens to money during market crashes, how wealth is redistributed, and the mechanisms behind market recovery.

The Mathematics Behind LLM Fine-Tuning: A Beginner's Guide to how and why finetuning works
Data, Analytics & AI • Software Engineering • LLM and GenAI
Jul 28, 2025 • 11 min read

Understand the mathematical foundations of LLM fine-tuning with clear explanations and minimal prerequisites. Learn how gradient descent, weight updates, and Transformer architectures work together to adapt pre-trained models to new tasks.

Adapting LLMs: Off-the-Shelf vs. Context Injection vs. Fine-Tuning — When and Why
Data, Analytics & AI • Software Engineering • LLM and GenAI
Jul 22, 2025 • 12 min read

A comprehensive guide to choosing the right approach for your LLM project: using pre-trained models as-is, enhancing them with context injection and RAG, or specializing them through fine-tuning. Learn the trade-offs, costs, and when each method works best.

Building Intelligent Agents with LangChain and LangGraph: Part 1 - Core Concepts
Data, Analytics & AI • Software Engineering • LLM and GenAI
Jul 21, 2025 • 5 min read

Learn the foundational concepts of LLM workflows - connecting language models to tools, handling responses, and building intelligent systems that take real-world actions.

Simulating stock market returns using Monte Carlo
Data, Analytics & AI • Software Engineering • Machine Learning
Jul 19, 2025 • 10 min read

Learn how to use Monte Carlo simulation to model and analyze stock market returns, estimate future performance, and understand the impact of randomness in financial forecasting. This tutorial covers the fundamentals, practical implementation, and interpretation of simulation results.

What are AI Agents, Really?
Data, Analytics & AI • Software Engineering • LLM and GenAI
May 27, 2025 • 8 min read

A comprehensive guide to understanding AI agents, their building blocks, and how they differ from agentic workflows and agent swarms.

Understanding the Model Context Protocol (MCP)
Data, Analytics & AI • Software Engineering • LLM and GenAI
May 22, 2025 • 5 min read

A deep dive into how MCP makes tool use with LLMs easier, cleaner, and more standardized.

Why Temperature=0 Doesn't Guarantee Determinism in LLMs
Data, Analytics & AI • Software Engineering • LLM and GenAI
May 18, 2025 • 10 min read

An exploration of why setting temperature to zero doesn't eliminate all randomness in large language model outputs.

