
1968: SHRDLU
Two years after ELIZA demonstrated the power of pattern matching, Terry Winograd's SHRDLU took a fundamentally different approach to natural language understanding. Rather than using shallow tricks to simulate conversation, SHRDLU attempted to create genuine understanding within a constrained domain—a simulated world of colored blocks.
SHRDLU doesn't stand for anything—it's not an acronym. The name comes from the nonsense phrase "ETAOIN SHRDLU," the approximate frequency order of the twelve most common letters in English, traditionally produced by Linotype operators running a finger down the first two columns of keys to fill out a faulty line of type. Winograd chose this whimsical name to reflect the system's focus on language and text processing.
The choice was both practical and philosophical: just as "ETAOIN SHRDLU" represented the building blocks of written language, Winograd's SHRDLU represented an attempt to build language understanding from fundamental computational blocks.
Understanding Through Action
SHRDLU operated in a "blocks world"—a simplified environment containing pyramids, blocks, and boxes of various colors that could be manipulated by a simulated robot arm. What made SHRDLU revolutionary was its ability to:
- Parse complex sentences using formal grammar rules
- Maintain world state by tracking the positions and properties of objects
- Execute actions based on linguistic commands
- Answer questions about its world and previous actions
Unlike ELIZA's pattern-matching approach, SHRDLU represented a genuine attempt at understanding. It built an internal model of its world, tracked changes over time, and could reason about spatial relationships and object properties. This was the first system to demonstrate that a computer could understand language in the context of a physical (albeit simulated) environment.
The Blocks World: A Perfect Testing Ground
The blocks world wasn't chosen arbitrarily—it provided the perfect balance of complexity and constraint:
- Finite vocabulary: Objects had clear, unambiguous names and properties
- Clear semantics: Actions like "pick up," "put down," and "stack" had precise meanings
- Visual feedback: The system could verify its understanding through action
- Spatial reasoning: Required understanding of concepts like "on," "under," "beside"
This constrained environment allowed Winograd to focus on the core challenges of language understanding without getting lost in the infinite complexity of open-domain conversation.
Sample Interaction
Here's an abridged version of the classic exchange (condensed from Winograd's published demonstration dialogue) that shows SHRDLU's capabilities:
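```
Person: Pick up a big red block.
SHRDLU: OK.
Person: Grasp the pyramid.
SHRDLU: I DON'T UNDERSTAND WHICH PYRAMID YOU MEAN.
Person: Find a block which is taller than the one you are holding
        and put it into the box.
SHRDLU: BY "IT", I ASSUME YOU MEAN THE BLOCK WHICH IS TALLER THAN
        THE ONE I AM HOLDING.
SHRDLU: OK.
Person: What does the box contain?
SHRDLU: THE BLUE PYRAMID AND THE BLUE BLOCK.
Person: How many blocks are not in the box?
SHRDLU: FOUR OF THEM.
```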
What Made This Interaction Revolutionary
This seemingly simple conversation demonstrates multiple breakthroughs in language understanding:
1. Reference Resolution
When the human says "Grasp the pyramid," SHRDLU recognizes the ambiguity—there are multiple pyramids in its world. Rather than guessing or failing silently, it explicitly requests clarification. This shows genuine understanding of the reference problem.
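SHRDLU's real implementation expressed this as procedures in Lisp and Micro-Planner, but the underlying idea can be shown with a minimal sketch (hypothetical Python with invented object names, not Winograd's code): match the description against the world model and ask for clarification whenever more than one object qualifies.

```python
# Minimal illustration of reference resolution with a clarification request.
# Hypothetical sketch; object names and fields are invented for the example.
world = [
    {"id": "pyramid-1", "type": "pyramid", "color": "green"},
    {"id": "pyramid-2", "type": "pyramid", "color": "blue"},
    {"id": "block-1",   "type": "block",   "color": "red"},
]

def resolve_reference(description, objects):
    """Return the unique object matching description, or ask for clarification."""
    matches = [o for o in objects
               if all(o.get(k) == v for k, v in description.items())]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        return "I don't see any object like that."
    return "I don't understand which one you mean."  # ambiguous reference

print(resolve_reference({"type": "pyramid"}, world))
# -> I don't understand which one you mean.   (two pyramids qualify)
```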
2. Complex Spatial Reasoning
The command "Find a block which is taller than the one you are holding" requires SHRDLU to:
- Remember what it's currently holding
- Compare heights of other objects
- Identify the target object
- Plan and execute the movement
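In the same spirit, a toy version of the comparative search (hypothetical Python, with invented block names and heights) reduces to filtering the world model against the height of the currently held object:

```python
# Toy comparative query over a blocks-world model (illustrative only).
blocks = {
    "block-1": {"height": 2},
    "block-2": {"height": 4},
    "block-3": {"height": 3},
}
holding = "block-1"  # assume the arm currently holds block-1

def find_taller_than_held(blocks, holding):
    held_height = blocks[holding]["height"]
    return [name for name, props in blocks.items()
            if name != holding and props["height"] > held_height]

print(find_taller_than_held(blocks, holding))  # -> ['block-2', 'block-3']
```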
3. Pronoun Resolution
When SHRDLU says "By 'it', I assume you mean the block which is taller than the one I am holding," it demonstrates sophisticated understanding of pronoun reference. It doesn't just resolve the pronoun—it explains its reasoning.
4. World State Maintenance
SHRDLU tracks the contents of containers, spatial relationships, and the history of its actions. When asked "What does the box contain?" it provides an accurate inventory based on its current world model.
5. Logical Reasoning
The question "How many blocks are not in the box?" requires SHRDLU to:
- Count all blocks in its world
- Identify which ones are in the box
- Perform subtraction
- Provide the answer in natural language
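A toy version of that computation, assuming a hypothetical world model that simply records which objects sit inside the box, is a set difference followed by a count:

```python
# Illustrative only: counting the blocks outside the box via set difference.
all_blocks = {"block-1", "block-2", "block-3", "block-4", "block-5"}
in_the_box = {"block-5"}

outside = all_blocks - in_the_box     # blocks not contained in the box
print(f"{len(outside)} of them.")     # -> 4 of them.
```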
Technical Architecture: Engineering Language Understanding
SHRDLU's architecture represented a masterful integration of multiple AI components working together to achieve genuine language understanding. Unlike simple pattern-matching approaches, SHRDLU demonstrated how sophisticated linguistic processing could be grounded in explicit models of both language structure and world knowledge.
The Sophisticated Parser
At SHRDLU's heart lay a parser that converted natural language into structured meaning representations. This wasn't crude keyword matching: SHRDLU employed a comprehensive grammar, based on Halliday's systemic grammar, that handled complex syntactic structures, embedded clauses, and ambiguous constructions.
When a human said "Find a block which is taller than the one you are holding," the parser constructed a complete logical representation capturing the comparative relationship, reference to the held object, and search request. This allowed SHRDLU to handle genuinely complex linguistic constructions and recognize when clarification was needed.
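SHRDLU's actual parser produced procedural meaning representations (Micro-Planner programs) rather than simple data structures; the schematic below, in hypothetical Python with invented field names, only illustrates the kind of information such a parse has to capture for that sentence:

```python
# Schematic stand-in (not SHRDLU's actual output) for the parsed meaning of:
# "Find a block which is taller than the one you are holding."
parse = {
    "command": "find",
    "target": {
        "type": "block",
        "constraint": {
            "relation": "taller-than",
            "reference": {"role": "currently-held-object"},
        },
    },
}
print(parse["target"]["constraint"]["relation"])  # -> taller-than
```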
World Model: Complete Environmental Awareness
SHRDLU's world model maintained a complete, constantly updated representation of its blocks world environment. This tracked object properties (color, size, shape, position), spatial relationships ("on top of," "underneath," "beside"), and historical states enabling sophisticated reasoning.
The system maintained a complete history of actions and state changes, allowing it to answer questions about past events and maintain consistency over time. It also understood causation—knowing that moving a supporting block would cause objects on top to fall.
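A minimal sketch of the state such a world model has to track (object properties, support relations, and an action history), again in hypothetical Python rather than SHRDLU's actual Lisp structures:

```python
# Hypothetical blocks-world model with support relations and an action history.
world = {
    "objects": {
        "block-red":    {"type": "block",   "color": "red",   "size": 2},
        "block-green":  {"type": "block",   "color": "green", "size": 2},
        "pyramid-blue": {"type": "pyramid", "color": "blue",  "size": 1},
    },
    # supports[x] == y means object x is resting on y
    "supports": {"pyramid-blue": "block-red",
                 "block-red": "table",
                 "block-green": "table"},
    "history": [],  # chronological record of (action, object, destination)
}

def move(world, obj, destination):
    """Move obj onto destination and record the change in the history."""
    world["supports"][obj] = destination
    world["history"].append(("move", obj, destination))

move(world, "pyramid-blue", "box")
print(world["supports"]["pyramid-blue"])  # -> box
print(world["history"][-1])               # -> ('move', 'pyramid-blue', 'box')
```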
The Strategic Planner
SHRDLU's planner broke high-level commands into sequences of primitive actions. Given "put the red block on the green block," it could identify obstacles and plan intermediate steps, working backward from the goal state to identify necessary preconditions.
The planner handled failure gracefully, generating alternative sequences when actions failed or conditions changed. It also demonstrated common-sense reasoning about physical constraints and impossible actions.
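The flavor of this goal-directed planning can be conveyed with a small sketch: to put one block on another, first clear anything resting on the destination (or on the block itself), then pick up and place. This is a simplified stand-in for SHRDLU's Micro-Planner procedures, with hypothetical action names:

```python
# Simplified goal-directed planner for "put A on B" (illustrative only).
def plan_put_on(block, destination, supports):
    """Return primitive actions that achieve (block resting on destination)."""
    steps = []
    # Precondition: nothing may be resting on the destination or on the block.
    for obj, support in supports.items():
        if support in (destination, block):
            steps.append(("move-to-table", obj))  # clear the obstacle first
    steps.append(("pick-up", block))
    steps.append(("put-on", block, destination))
    return steps

supports = {"pyramid-blue": "block-green",  # a pyramid sits on the destination
            "block-red": "table",
            "block-green": "table"}
print(plan_put_on("block-red", "block-green", supports))
# -> [('move-to-table', 'pyramid-blue'),
#     ('pick-up', 'block-red'),
#     ('put-on', 'block-red', 'block-green')]
```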
The Executor: Language to Action
The executor bridged linguistic understanding and physical action through a simulated robot arm. This integration was revolutionary—previous systems were purely conversational, but SHRDLU demonstrated that language understanding became more powerful when grounded in environmental interaction.
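In spirit (and only in spirit; the primitive names below are invented), the executor walked through each planned step and applied it to the simulated arm and world state:

```python
# Hypothetical executor mapping planned steps onto simulated arm primitives.
def execute(steps, world):
    for step in steps:
        action, *args = step
        if action == "pick-up":
            world["holding"] = args[0]
        elif action == "put-on":
            obj, destination = args
            world["supports"][obj] = destination
            world["holding"] = None
        elif action == "move-to-table":
            world["supports"][args[0]] = "table"
        world["history"].append(step)  # keep the action record up to date

world = {"supports": {"block-red": "table"}, "holding": None, "history": []}
execute([("pick-up", "block-red"), ("put-on", "block-red", "box")], world)
print(world["supports"]["block-red"])  # -> box
```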
Limitations: Boundaries of the Rule-Based Approach
Despite impressive demonstrations, SHRDLU's limitations revealed fundamental challenges with rule-based language understanding.
Domain Brittleness
SHRDLU's most severe limitation was its complete dependence on the blocks world domain. Adding new object types or concepts required extensive modifications to the grammar, world model, and planner. This brittleness reflected a deeper issue related to the frame problem: completely specifying what holds in the world, and what changes after each action, becomes intractable in complex domains.
Exponential Scaling
Attempts to expand SHRDLU's capabilities encountered exponential scaling problems: each new sentence pattern required new grammar rules, and each new concept needed new reasoning procedures. Complexity ballooned with every addition, making further extension practically impossible.
Common Sense Knowledge Gap
SHRDLU's blocks world avoided real-world common sense knowledge, but this created a fundamental limitation. Real language understanding requires vast background knowledge about how the world works—knowledge that proved impossible to encode explicitly in rule-based systems.
Natural Language Complexity
SHRDLU struggled with full natural language complexity—metaphor, idioms, contextual ambiguity, and the creative flexibility that characterizes human communication. Rule-based parsers couldn't capture this fundamental creativity.
Historical Impact and Legacy
SHRDLU fundamentally changed how researchers thought about language understanding and influenced AI development for decades.
Establishing Language Understanding Standards
SHRDLU provided the first convincing demonstration of genuine computer language understanding. It showed machines could handle situated, contextual communication and established benchmarks for understanding through action, consistency, and explanation.
Illuminating Symbol Grounding
SHRDLU highlighted the symbol grounding problem—how linguistic symbols connect to real-world meaning. Its success came from grounding symbols in a simulated physical world, but this revealed the challenge of scaling to real-world complexity.
Inspiring Microworlds Methodology
SHRDLU established the "microworlds" approach that dominated AI research for two decades. While this methodology generated important insights, it also revealed that intelligence doesn't scale smoothly from simple to complex domains.
Anticipating Embodied AI
SHRDLU's integration of language with perception and action presaged modern embodied AI research. Its demonstration that language understanding benefits from grounding in sensorimotor experience remains highly relevant to contemporary robotics and multimodal AI.
Transition to New Paradigms
SHRDLU represented both the culmination and the beginning of the end of rule-based natural language processing. Its limitations pointed toward the need for statistical and machine learning approaches that could learn language patterns from data rather than requiring hand-crafted rules.
The system's core insight—that language understanding requires grounding in action and world knowledge—would remain a guiding principle through the evolution to neural networks and modern language models, continuing to influence research in embodied AI today.