Publications

My research contributions span machine learning, natural language processing, speech recognition, human-computer interaction, and distributed systems. I am not actively publishing new research, but I am always open to collaboration.

Research Areas

  • Machine Learning (ML)
  • Natural Language Processing (NLP)
  • Automatic Speech Recognition (ASR)
  • Human-Computer Interaction (HCI)
  • Distributed Systems, Database Systems, Blockchain

Academic Affiliations

  • IBM Research
  • UC Berkeley - Electrical Engineering & Computer Science
  • Collaborations with MIT, Stanford, and CMU researchers
  • Industry partnerships in AI/ML research

Current Research Interests

While not actively conducting academic research, I am interested in the intersection of AI/ML with finance, private equity, and software engineering, and in its broader impact on the entrepreneurship ecosystem.

Books

Data Science Handbook: A Complete Guide to Machine Learning, Optimization and AI

Michael Brenndoerfer

In Progress • 2025

A comprehensive guide covering the mathematical foundations and practical implementations of machine learning, optimization, and artificial intelligence. From fundamental concepts to advanced techniques, this handbook provides both theoretical depth and real-world applications for data scientists, ML engineers, researchers, and students.

Read online

Language AI: A Practitioner's Guide from Fundamentals to State-of-the-Art

Michael Brenndoerfer

In Progress • 2025

A comprehensive guide to language AI that bridges the gap between fundamental concepts and cutting-edge applications. From the early symbolic foundations to modern transformer architectures, this book provides practical insights for practitioners working with language models, NLP systems, and AI applications.

Read online

Conference Papers

Articles & Other Publications

Recent Blog Content

Backpropagation - Training Deep Neural Networks

In the 1980s, neural networks hit a wall—nobody knew how to train deep models. That changed when Rumelhart, Hinton, and Williams introduced backpropagation in 1986. Their clever use of the chain rule finally let researchers figure out which parts of a network deserved credit or blame, making deep learning work in practice. Thanks to this breakthrough, we now have everything from word embeddings to powerful language models like transformers.

Data, Analytics & AI · Machine Learning · LLM and GenAI
Open notebook
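The chain-rule bookkeeping described above can be shown in a few lines. This is a toy sketch (not the notebook's code): a one-hidden-unit network y = w2 · tanh(w1 · x) trained on a single example, with both gradients derived by hand.

```python
import numpy as np

# Toy backpropagation: y_hat = w2 * tanh(w1 * x), loss L = (y_hat - y)^2.
# The chain rule assigns "credit or blame" to each weight.

x, y = 0.5, 1.0     # single training example
w1, w2 = 0.8, -0.3  # initial weights
lr = 0.1

for step in range(500):
    # forward pass
    h = np.tanh(w1 * x)
    y_hat = w2 * h
    loss = (y_hat - y) ** 2

    # backward pass (chain rule)
    dL_dyhat = 2 * (y_hat - y)
    dL_dw2 = dL_dyhat * h                    # credit for w2
    dL_dw1 = dL_dyhat * w2 * (1 - h**2) * x  # credit for w1, through tanh

    # gradient descent update
    w1 -= lr * dL_dw1
    w2 -= lr * dL_dw2
```

After training, the loss has shrunk to nearly zero; the same forward/backward pattern scales to networks of any depth.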

BLEU Metric - Automatic Evaluation for Machine Translation

In 2002, IBM researchers introduced BLEU (Bilingual Evaluation Understudy), revolutionizing machine translation evaluation by providing the first widely adopted automatic metric that correlated well with human judgments. By comparing n-gram overlap with reference translations and adding a brevity penalty, BLEU enabled rapid iteration and development, establishing automatic evaluation as a fundamental principle across all language AI.

Data, Analytics & AI · Machine Learning · LLM and GenAI
Open notebook
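The two ingredients mentioned above, clipped n-gram overlap and the brevity penalty, fit in a short sketch. This is a simplified toy version (unigrams and bigrams, a single reference), not the official BLEU implementation, which averages log precisions over 1- to 4-grams and supports multiple references.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def clipped_precision(candidate, reference, n):
    cand = Counter(ngrams(candidate, n))
    ref = Counter(ngrams(reference, n))
    # clip each candidate n-gram count by its count in the reference
    overlap = sum(min(c, ref[g]) for g, c in cand.items())
    return overlap / max(sum(cand.values()), 1)

def bleu(candidate, reference, max_n=2):
    precisions = [clipped_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0
    # brevity penalty punishes candidates shorter than the reference
    bp = min(1.0, math.exp(1 - len(reference) / len(candidate)))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

cand = "the cat sat on the mat".split()
ref = "the cat is on the mat".split()
score = bleu(cand, ref)
```

Here the unigram precision is 5/6, the bigram precision 3/5, and the candidate is not shorter than the reference, so the score is the geometric mean of the two precisions.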

Convolutional Neural Networks - Revolutionizing Feature Learning

In 1988, Yann LeCun introduced Convolutional Neural Networks at Bell Labs, forever changing how machines process visual information. While initially designed for computer vision, CNNs introduced automatic feature learning, translation invariance, and parameter sharing. These principles would later revolutionize language AI, inspiring text CNNs, 1D convolutions for sequential data, and even attention mechanisms in transformers.

Data, Analytics & AI · Machine Learning · LLM and GenAI
Open notebook

Conditional Random Fields - Structured Prediction for Sequences

In 2001, Lafferty and colleagues introduced CRFs, a powerful probabilistic framework that revolutionized structured prediction by modeling entire sequences jointly rather than making independent predictions. By capturing dependencies between adjacent elements through conditional probability and feature functions, CRFs became essential for part-of-speech tagging, named entity recognition, and established principles that would influence all future sequence models.

Data, Analytics & AI · Machine Learning · LLM and GenAI
Open notebook

ELIZA - The First Conversational AI Program

Joseph Weizenbaum's ELIZA, created in 1966, became the first computer program to hold something resembling a conversation. Using clever pattern-matching techniques, its famous DOCTOR script simulated a Rogerian psychotherapist. ELIZA showed that even simple tricks could create the illusion of understanding, bridging theory and practice in language AI.

Data, Analytics & AI · Machine Learning · LLM and GenAI
Open notebook

Katz Back-off - Handling Sparse Data in Language Models

In 1987, Slava Katz solved one of statistical language modeling's biggest problems. When your model encounters word sequences it has never seen before, what do you do? His elegant solution was to "back off" to shorter sequences, a technique that made n-gram models practical for real-world applications. By redistributing probability mass and using shorter contexts when longer ones lack data, Katz back-off allowed language models to handle the infinite variety of human language with finite training data.

Data, Analytics & AI · Machine Learning · LLM and GenAI
Open notebook
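The core back-off move described above can be sketched in a few lines. This toy version uses a fixed discount d in place of the Good-Turing estimates Katz actually derived, so it illustrates the redistribution idea rather than the exact method.

```python
from collections import Counter

corpus = "the cat sat on the mat the cat ran".split()
unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))
total = sum(unigrams.values())
d = 0.5  # fixed discount (assumption; Katz derives this from Good-Turing)

def p_unigram(w):
    return unigrams[w] / total

def p_backoff(prev, w):
    if bigrams[(prev, w)] > 0:
        # seen bigram: discounted maximum-likelihood estimate
        return (bigrams[(prev, w)] - d) / unigrams[prev]
    # unseen bigram: mass freed by discounting, redistributed over
    # unseen continuations in proportion to their unigram probability
    seen = {v for (u, v) in bigrams if u == prev}
    leftover = d * len(seen) / unigrams[prev]
    unseen_mass = sum(p_unigram(v) for v in unigrams if v not in seen)
    return leftover * p_unigram(w) / unseen_mass

p_seen = p_backoff("the", "cat")   # seen bigram "the cat"
p_unseen = p_backoff("cat", "mat")  # backs off to the unigram P(mat)
```

Because the discounted mass is redistributed exactly, the conditional probabilities for any context still sum to one.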

Long Short-Term Memory - Solving the Memory Problem

In 1997, Hochreiter and Schmidhuber solved the vanishing gradient problem with LSTMs, introducing sophisticated gated memory mechanisms that could selectively remember and forget information across long sequences. This breakthrough enabled practical language modeling, machine translation, and speech recognition while establishing principles of gated information flow that would influence all future sequence models.

Data, Analytics & AI · Machine Learning · LLM and GenAI
Open notebook

MADALINE - Multiple Adaptive Linear Neural Networks

Bernard Widrow and Marcian Hoff built MADALINE at Stanford in 1962, taking neural networks beyond the perceptron's limitations. This adaptive architecture could tackle real-world engineering problems in signal processing and pattern recognition, proving that neural networks weren't just theoretical curiosities but practical tools for solving complex problems.

Data, Analytics & AI · Machine Learning · LLM and GenAI
Open notebook

The Perceptron - Foundation of Modern Neural Networks

In 1957, Frank Rosenblatt created the perceptron at Cornell University, the first artificial neural network that could actually learn to classify patterns. This groundbreaking algorithm proved that machines could learn from examples, not just follow rigid rules. It established the foundation for modern deep learning and every neural network we use today.

Data, Analytics & AI · Machine Learning · LLM and GenAI
Open notebook
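Rosenblatt's learning rule fits in a few lines: predict with a linear threshold, then nudge the weights toward misclassified examples. This is an illustrative sketch on logical AND, a linearly separable toy problem, not the notebook's code.

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])  # logical AND

w = np.zeros(2)
b = 0.0
lr = 0.1

for epoch in range(20):
    for xi, yi in zip(X, y):
        pred = 1 if xi @ w + b > 0 else 0
        # update only on mistakes: w <- w + lr * (y - pred) * x
        w += lr * (yi - pred) * xi
        b += lr * (yi - pred)

preds = [1 if xi @ w + b > 0 else 0 for xi in X]
```

On separable data like this, the perceptron convergence theorem guarantees the loop stops making mistakes after finitely many updates.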

Recurrent Neural Networks - Machines That Remember

In 1995, RNNs revolutionized sequence processing by introducing neural networks with memory—connections that loop back on themselves, allowing machines to process information that unfolds over time. This breakthrough enabled speech recognition, language modeling, and established the sequential processing paradigm that would influence LSTMs, GRUs, and eventually transformers.

Data, Analytics & AI · Machine Learning · LLM and GenAI
Open notebook

Shannon's N-gram Model - The Foundation of Statistical Language Processing

Claude Shannon's 1948 work on information theory introduced n-gram models, one of the most foundational concepts in natural language processing. These deceptively simple statistical models predict language patterns by looking at sequences of words. They laid the groundwork for everything from autocomplete to machine translation in modern language AI.

Data, Analytics & AI · Machine Learning · LLM and GenAI
Open notebook

SHRDLU - Understanding Language Through Action

Terry Winograd's 1968 SHRDLU system took a revolutionary approach to language understanding. Instead of just pattern matching, it created genuine comprehension within a simulated blocks world. SHRDLU could parse complex sentences, track what was happening, and demonstrate that computers could truly understand language when grounded in a physical context.

Data, Analytics & AI · Machine Learning · LLM and GenAI
Open notebook

IBM Statistical Machine Translation - From Rules to Data

In 1991, IBM researchers revolutionized machine translation by introducing the first comprehensive statistical approach. Instead of hand-crafted linguistic rules, they treated translation as a statistical problem of finding word correspondences from parallel text data. This breakthrough established principles like data-driven learning, probabilistic modeling, and word alignment that would transform not just translation, but all of natural language processing.

Data, Analytics & AI · Machine Learning · LLM and GenAI
Open notebook

From Symbolic Rules to Statistical Learning - The Paradigm Shift in NLP

Natural language processing underwent a fundamental shift from symbolic rules to statistical learning. Early systems relied on hand-crafted grammars and formal linguistic theories, but their limitations became clear. The statistical revolution of the 1980s transformed language AI by letting computers learn patterns from data instead of following rigid rules.

Data, Analytics & AI · Machine Learning · LLM and GenAI
Open notebook

Time Delay Neural Networks - Processing Sequential Data with Temporal Convolutions

In 1987, Alex Waibel introduced Time Delay Neural Networks, a revolutionary architecture that changed how neural networks process sequential data. By introducing weight sharing across time and temporal convolutions, TDNNs laid the groundwork for modern convolutional and recurrent networks. This breakthrough enabled end-to-end learning for speech recognition and established principles that remain fundamental to language AI today.

Data, Analytics & AI · Machine Learning · LLM and GenAI
Open notebook

The Turing Test - A Foundational Challenge for Language AI

In 1950, Alan Turing proposed a deceptively simple test for machine intelligence, originally called the Imitation Game. Could a machine fool a human judge into thinking it was human through conversation alone? This thought experiment shaped decades of AI research and remains surprisingly relevant today as we evaluate modern language models like GPT-4 and Claude.

Data, Analytics & AI · Machine Learning · LLM and GenAI
Open notebook

WordNet - A Semantic Network for Language Understanding

In 1995, Princeton University released WordNet 1.0, a revolutionary lexical database that represented words not as isolated definitions, but as interconnected concepts in a semantic network. By capturing relationships like synonymy, hypernymy, and meronymy, WordNet established the principle that meaning is relational, influencing everything from word sense disambiguation to modern word embeddings and knowledge graphs.

Data, Analytics & AI · Machine Learning · LLM and GenAI
Open notebook

Multicollinearity in Regression: Complete Guide to Detection, Impact & Solutions

Learn about multicollinearity in regression analysis with this practical guide. VIF analysis, correlation matrices, coefficient stability testing, and approaches such as Ridge regression, Lasso, and PCR. Includes Python code examples, visualizations, and useful techniques for working with correlated predictors in machine learning models.

Data, Analytics & AI · Software Engineering · Machine Learning
Open notebook
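One of the detection tools the guide covers, VIF, is easy to sketch with plain NumPy: regress each predictor on the others and compute VIF_j = 1 / (1 − R²_j). This is an illustrative snippet on synthetic data, not the notebook's code; values above roughly 5-10 are the usual rule-of-thumb flag for multicollinearity.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.1, size=n)  # nearly collinear with x1
x3 = rng.normal(size=n)                  # independent predictor
X = np.column_stack([x1, x2, x3])

def vif(X):
    vifs = []
    for j in range(X.shape[1]):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(len(y)), others])  # add intercept
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        r2 = 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))
        vifs.append(1 / (1 - r2))
    return vifs

v = vif(X)  # large for x1 and x2, near 1 for x3
```

The near-duplicate pair x1/x2 produces very large VIFs, while the independent x3 stays close to the ideal value of 1.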

Simulating stock market returns using Monte Carlo

Learn how to use Monte Carlo simulation to model and analyze stock market returns, estimate future performance, and understand the impact of randomness in financial forecasting. This tutorial covers the fundamentals, practical implementation, and interpretation of simulation results.

Data, Analytics & AI · Software Engineering · Machine Learning
Open notebook
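The basic simulation loop can be sketched under a geometric Brownian motion assumption. The drift and volatility below are illustrative placeholders, not estimates from the tutorial.

```python
import numpy as np

rng = np.random.default_rng(42)
s0 = 100.0              # starting price
mu, sigma = 0.07, 0.2   # assumed annual drift and volatility
days, n_paths = 252, 10_000
dt = 1 / days

# daily log-returns, cumulatively summed and exponentiated into price paths
shocks = rng.normal((mu - 0.5 * sigma**2) * dt,
                    sigma * np.sqrt(dt),
                    size=(n_paths, days))
paths = s0 * np.exp(shocks.cumsum(axis=1))

final = paths[:, -1]
expected = final.mean()       # sample estimate of E[S_T] = s0 * exp(mu)
p_loss = (final < s0).mean()  # probability of ending below the start
```

With 10,000 paths, the sample mean lands close to the analytic expectation (about 107.25 here), and the full distribution of `final` can be used for percentiles, value-at-risk, or loss probabilities.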

What are AI Agents, Really?

A comprehensive guide to understanding AI agents, their building blocks, and how they differ from agentic workflows and agent swarms.

Data, Analytics & AI · Software Engineering · LLM and GenAI
Read article
