Publications
Nowadays, I write primarily to learn. There's something about the act of explaining a concept that exposes gaps in my own understanding: the vague intuitions that felt solid in my head often fall apart when I try to put them into words.
This process regularly humbles me. I'll think I understand something, start writing about it, and realize I need to go back to the basics. Some topics have taken me years and multiple attempts before they finally clicked.
My hope is to create the kind of resources I wished I had when I was learning: clear explanations with context, working code examples, helpful visualizations, and most importantly, the steps and reasoning behind decisions, not just the final results.
Research Background
Research Areas
Academic Affiliations
Current Interests
While I'm not actively doing academic research, I'm interested in how AI/ML intersects with finance, private equity, and software engineering, and in its overall impact on the entrepreneurship ecosystem.
Books

Machine Learning from Scratch: A Complete Guide to Machine Learning, Optimization and AI: Mathematical Foundations and Practical Implementations
Michael Brenndoerfer
Released • 2025
A comprehensive guide covering the mathematical foundations and practical implementations of machine learning, optimization, and artificial intelligence. From fundamental concepts to advanced techniques, this handbook provides both theoretical depth and real-world applications.
Learn quadratic programming (QP) for portfolio optimization, including the mean-variance framework, efficient frontier construction, and scipy implementation with practical examples.
Master the Vehicle Routing Problem with Time Windows (VRPTW), including mathematical formulation, constraint programming, and practical implementation using Google OR-Tools for logistics optimization.
Learn minimum cost flow optimization for slotting problems, including network flow theory, mathematical formulation, and practical implementation with OR-Tools. Master resource allocation across time slots, capacity constraints, and cost structures.
Complete guide to Mixed Integer Linear Programming (MILP) for factory optimization, covering mathematical foundations, constraint modeling, branch-and-bound algorithms, and practical implementation with Google OR-Tools. Learn how to optimize production planning with discrete setup decisions and continuous quantities.

Language AI Handbook: A Complete Guide to Natural Language Processing and Large Language Models: From Classical NLP and Transformer Architecture to Pre-training, Fine-tuning, and Production Deployment
Michael Brenndoerfer
In Progress • 2025
A comprehensive guide from the fundamentals of natural language processing to cutting-edge large language models and the latest research breakthroughs. Learn about NLP, transformers, GPT, BERT, and modern language AI.
Discover how sparse models decouple capacity from compute using conditional computation and mixture of experts to achieve efficient scaling.
Explore grokking: how neural networks suddenly generalize long after memorization. Learn about phase transitions, theories, and training implications.
Explore why larger language models sometimes perform worse on specific tasks. Learn about distractor tasks, sycophancy, and U-shaped scaling patterns.
Explore whether LLM emergent capabilities are genuine phase transitions or measurement artifacts. Learn how discontinuous metrics create artificial emergence.
Discover how chain-of-thought reasoning emerges in large language models. Learn CoT prompting techniques, scaling behavior, and self-consistency methods.

AI Agent Handbook: A Complete Guide to Building Autonomous AI Systems: From Language Models and Memory Architecture to Tool Integration, Multi-Agent Coordination, and Production Deployment
Michael Brenndoerfer
Released • 2025
A comprehensive guide to building and deploying intelligent autonomous agents that can reason, act, and learn. Covers the full stack from foundational models to advanced reasoning techniques, memory systems, tool integration, evaluation methods, and operational best practices.
Learn how to scale AI agents from single users to thousands while maintaining performance and controlling costs. Covers horizontal scaling, load balancing, monitoring, cost controls, and prompt optimization strategies.
Learn how to dramatically reduce AI agent API costs without sacrificing capability. Covers model selection, caching, batching, prompt optimization, and budget controls with practical Python examples.
Learn practical techniques to make AI agents respond faster, including model selection strategies, response caching, streaming, parallel execution, and prompt optimization for reduced latency.
Learn how to maintain and update AI agents safely, manage costs, respond to user feedback, and keep your system healthy over months and years of operation.
Learn how to monitor your deployed AI agent's health, handle errors gracefully, and build reliability through health checks, metrics tracking, error handling, and scaling strategies.

History of Language AI: How We Taught Machines to Read, Write, and Reason Through a Hundred Years of Discovery
Michael Brenndoerfer
Released • November 2025
A journey through the history of language AI, from the early days of information theory to modern large language models. Discover the key breakthroughs, influential figures, and technological advances that shaped how machines understand and generate human language.
A comprehensive guide to hybrid retrieval systems introduced in 2024. Learn how hybrid systems combine sparse retrieval for fast candidate generation with dense retrieval for semantic reranking, leveraging complementary strengths to create more effective retrieval solutions.
A comprehensive guide covering structured outputs introduced in language models during 2024. Learn how structured outputs enable reliable data extraction, eliminate brittle text parsing, and make language models production-ready. Understand schema specification, format constraints, validation guarantees, practical applications, limitations, and the transformative impact on AI application development.
A comprehensive guide to multimodal integration in 2024, the breakthrough that enabled AI systems to seamlessly process and understand text, images, audio, and video within unified model architectures. Learn how unified representations and cross-modal attention mechanisms transformed multimodal AI and enabled true multimodal fluency.
A comprehensive guide covering advanced parameter-efficient fine-tuning methods introduced in 2024, including AdaLoRA, DoRA, VeRA, and other innovations. Learn how these techniques addressed LoRA's limitations through adaptive rank allocation, magnitude-direction decomposition, parameter sharing, and their impact on research and industry deployments.
A comprehensive guide covering continuous post-training, including parameter-efficient fine-tuning with LoRA, catastrophic forgetting prevention, incremental model updates, continuous learning techniques, and efficient adaptation strategies for keeping language models current and responsive.

Quantitative Finance: Pricing, Portfolios, and Execution End to End: Academic Foundations, Design, Calibration, Backtesting and Deployment
Michael Brenndoerfer
In Progress • 2025
A comprehensive guide to quantitative finance covering the complete workflow from academic foundations to practical deployment. Learn about pricing models, portfolio construction, execution strategies, model calibration, backtesting methodologies, and production deployment of quantitative trading systems.
Master option strategies by combining basic building blocks. Learn to construct spreads, straddles, and iron condors to visualize payoffs and manage risk.
Learn to measure and manage bond interest rate risk using duration, convexity, and immunization. Master portfolio hedging and liability-driven investing.
Master yield curve construction through zero rates, forward rates, and bootstrapping. Learn to interpret curve shapes and build production-quality curves.
Master financial data handling with pandas, NumPy, and Numba. Learn time series operations, return calculations, and visualization for quant finance.
Learn bond pricing through present value calculations, yield to maturity analysis, and price-yield relationships. Master fixed income fundamentals.
Conference Papers
A Free Synthetic Corpus for Speaker Diarization Research
Erik Edwards, Michael Brenndoerfer, Amanda Robinson, Najmeh Sadoughi, Gregory Finley, Maxim Korenevsky, Nico Axtmann, Mark Miller, David Suendermann-Oeft
SPECOM 2018 (20th International Conference on Speech and Computer) • 2018
Semi-Supervised Acoustic Model Retraining for Medical ASR
Gregory Finley, Erik Edwards, Wael Salloum, Amanda Robinson, Najmeh Sadoughi, Nico Axtmann, Maxim Korenevsky, Michael Brenndoerfer, Mark Miller, David Suendermann-Oeft
SPECOM 2018 (20th International Conference on Speech and Computer) • 2018
Detecting Section Boundaries in Medical Dictations: Toward Real-Time Conversion of Medical Dictations to Clinical Reports
Najmeh Sadoughi, Gregory Finley, Erik Edwards, Amanda Robinson, Maxim Korenevsky, Michael Brenndoerfer, Nico Axtmann, Mark Miller, David Suendermann-Oeft
SPECOM 2018 (20th International Conference on Speech and Computer) • 2018
An Automated Assistant for Medical Scribes
Gregory Finley, Erik Edwards, Amanda Robinson, Najmeh Sadoughi, James Fone, Mark Miller, David Suendermann-Oeft, Michael Brenndoerfer, Nico Axtmann
INTERSPEECH 2018 • 2018
From dictations to clinical reports using machine translation
Gregory Finley, Wael Salloum, Najmeh Sadoughi, Erik Edwards, Amanda Robinson, Nico Axtmann, Michael Brenndoerfer, Mark Miller, David Suendermann-Oeft
NAACL-HLT 2018 (Industry Track) • 2018
An automated medical scribe for documenting clinical encounters
Gregory Finley, Erik Edwards, Amanda Robinson, Michael Brenndoerfer, Najmeh Sadoughi, James Fone, Nico Axtmann, Mark Miller, David Suendermann-Oeft
NAACL-HLT 2018 (Demonstrations) • 2018
RemindMe: Plugging a Reminder Manager into Email for Enhancing Workplace Responsiveness
Casey Dugan, Aabhas Sharma, Michael Muller, Di Lu, Michael Brenndoerfer, Werner Geyer
INTERACT 2017 (IFIP Conference on Human-Computer Interaction) • 2017
What Did I Ask You to Do, by When, and for Whom?: Passion and Compassion in Request Management
Michael Muller, Casey Dugan, Michael Brenndoerfer, Megan Monroe, Werner Geyer
CSCW 2017 (ACM Conference on Computer-Supported Cooperative Work and Social Computing) • 2017
Articles & Other Publications
An introduction to multi-threading and multi-processing
Michael Brenndoerfer
LinkedIn Article • 2021
Why the Bitcoin price is surging
Michael Brenndoerfer
LinkedIn Article • 2021
Deep Sentiment Analysis
Michael Brenndoerfer, Stefan Palombo, Vinitra Swamy
Distill Literature Review • 2018
How I started a company while going to school full-time
Michael Brenndoerfer, by Jessie Ying
Berkeley Master of Engineering (Medium) • 2018
Op-Ed: Why we need to talk about the applications of blockchain technology on the financial market
Michael Brenndoerfer, edited by Maya Rector
Berkeley Master of Engineering (Medium) • 2018
Recent Blog Content
Learn how to transform raw text into structured data through tokenization, normalization, and cleaning techniques. Discover best practices for different NLP tasks and understand when to apply aggressive versus minimal preprocessing strategies.
Learn TF-IDF and Bag of Words, including term frequency, inverse document frequency, vectorization, and text classification. Master classical NLP text representation methods with Python implementation.
Complete guide to word embeddings covering Word2Vec skip-gram, GloVe matrix factorization, negative sampling, and co-occurrence statistics. Learn how to implement embeddings from scratch and understand how semantic relationships emerge from vector space geometry.
Learn how to build agentic workflows with LangChain and LangGraph.
Understand the mathematical foundations of LLM fine-tuning with clear explanations and minimal prerequisites. Learn how gradient descent, weight updates, and Transformer architectures work together to adapt pre-trained models to new tasks.
A comprehensive guide to choosing the right approach for your LLM project: using pre-trained models as-is, enhancing them with context injection and RAG, or specializing them through fine-tuning. Learn the trade-offs, costs, and when each method works best.
Learn the foundational concepts of LLM workflows - connecting language models to tools, handling responses, and building intelligent systems that take real-world actions.
A comprehensive guide to understanding AI agents, their building blocks, and how they differ from agentic workflows and agent swarms.
Learn how to use Monte Carlo simulation to model and analyze stock market returns, estimate future performance, and understand the impact of randomness in financial forecasting. This tutorial covers the fundamentals, practical implementation, and interpretation of simulation results.
An in-depth look at what happens to money during market crashes, how wealth is redistributed, and the mechanisms behind market recovery.
A guide to Plato's foundational dialogue on epistemology, exploring three definitions of knowledge and why the question 'what do we actually know?' still haunts philosophy, science, and everyday life.
A guide to Plato's profound dialogue on the immortality of the soul, the nature of death, and why philosophers should welcome rather than fear their mortality.
A guide to Locke's revolutionary theory of natural rights, limited government, and the right to revolution, ideas that shaped democratic constitutions and continue to frame debates about liberty and authority.
A guide to Hume's revolutionary investigation into the foundations of morality, revealing why reason alone cannot motivate action and how sentiment shapes our deepest ethical convictions.
A guide to Hobbes' revolutionary theory of political authority, exploring why we need an absolute sovereign to escape the war of all against all, and what this dark vision reveals about human nature and modern governance.