What is the difference between LR, SLR, and LALR parsers?

parsing techniques

LR parsers

SLR parsers

LALR parsers

compiler design

What is the difference between LR, SLR, and LALR parsers?

Master System Design with Codemia

Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.

Start Practicing Learn More

Introduction

LR family parsers are bottom up parsers used in many compilers and parser generators. SLR, LALR, and canonical LR all follow the same shift reduce engine, but they differ in how much lookahead precision they keep while building parse tables. That difference directly affects table size, conflict frequency, and how easily a grammar can be maintained.

Shared LR Mechanics

All three parser types use the same runtime loop:

Keep a stack of parser states.
Read one token of lookahead.
Consult an action table for shift, reduce, accept, or error.
Use a goto table after reductions.

text

1loop:
2  a = ACTION[top_state, lookahead]
3  if a is SHIFT s: push token and state s
4  if a is REDUCE A -> beta: pop len(beta), then goto on A
5  if a is ACCEPT: success
6  if a is ERROR: reject

So the runtime machinery is not the main distinction. The distinction is how parser states and reduction decisions are built.

Canonical LR(1): Maximum Context Precision

Canonical LR(1) items carry one token of lookahead for each item. That means reduce decisions are made with precise context tied to each state item.

Practical consequence:

Fewer spurious conflicts.
Better handling of grammars with subtle prefix overlap.
Larger state space and larger parse tables.

For complex language grammars, canonical LR(1) often succeeds where weaker methods need grammar rewrites.

The cost is memory and generator complexity. Historically that was expensive, though modern tools reduce this issue with table compression.

SLR(1): Simpler Construction, Coarser Decisions

SLR builds LR(0) states and uses global FOLLOW sets to decide where reductions apply. FOLLOW sets are grammar wide, not item specific.

That coarse granularity can introduce conflicts even when a grammar is unambiguous under canonical LR(1).

python

1# Example of FOLLOW-style idea used by SLR table construction
2FOLLOW = {
3    "Expr": [")", "$"],
4    "Term": ["+", ")", "$"],
5    "Factor": ["*", "+", ")", "$"],
6}

SLR is valuable for teaching and small DSL projects because the construction is conceptually straightforward. For larger grammars it often becomes fragile.

LALR(1): Practical Compromise

LALR starts from LR(1) style lookahead and merges states that share the same LR(0) core. This dramatically shrinks tables while preserving much of LR(1) power.

This is why classic tools such as Yacc and many Bison defaults are LALR oriented.

Benefits:

Much smaller tables than canonical LR(1).
Usually enough power for many programming language grammars.
Strong tooling support and predictable runtime performance.

Tradeoff:

State merges can reintroduce conflicts that canonical LR(1) would not have.

In practice, grammar authors often resolve these with precedence declarations or targeted grammar refactors.

Conflict Behavior in Real Projects

When a parser generator reports shift reduce or reduce reduce conflicts, that report is design feedback. Common workflow:

Reproduce conflict on minimal grammar fragment.
Inspect item sets and conflicting lookahead tokens.
Decide whether precedence declaration is semantically correct.
Refactor grammar only when precedence would hide real ambiguity.

Example precedence declaration in a Yacc style grammar:

yacc

1%left '+' '-'
2%left '*' '/'
3
4expr:
5    expr '+' expr
6| expr '*' expr | '(' expr ')' | NUMBER ; ``` This can resolve expected arithmetic ambiguities cleanly, but it should not be used to mask unrelated grammar problems. ## Choosing the Right Variant A useful rule of thumb: - Use SLR for teaching or very small grammars. - Use LALR for most practical compiler front ends where toolchain support is mature. - Use canonical LR(1) when grammar complexity and correctness pressure justify larger tables. If your tool offers both LALR and canonical LR(1), test both early. The conflict report difference can save days of grammar debugging. ## Common Pitfalls - Assuming all LR parsers accept the same grammars in practice. - Treating conflict warnings as harmless noise. - Adding precedence directives without checking whether they match intended language semantics. - Choosing SLR for a growing grammar and discovering late stage conflict explosions. - Ignoring generator mode defaults, which can change behavior between tools. ## Summary - SLR, LALR, and canonical LR share one shift reduce runtime model. - Canonical LR(1) has highest context precision and typically fewest conflicts, with larger tables. - SLR is simplest but most limited due to global FOLLOW based reductions. - LALR is the common middle ground with good power and compact tables. - Conflict reports should guide grammar design decisions, not be suppressed blindly.