Macro hygiene and scope management in Lean and how expansions go wrong

Introduction

This article aims to examine macro hygiene and scope management in Lean from the perspective of expansion semantics and name resolution.

Lean’s macros are hygienic by construction, but that guarantee is mediated through syntax objects carrying scope annotations rather than trivial textual substitution. Expansion introduces fresh macro scopes and identifiers are resolved relative to these scopes during name lookup and not during expansion itself. The boundary with the elaborator is where this becomes nontrivial, since generated syntax must align with local contexts, implicit binders and metavariables.

The focus here is on where this machinery fails in practice: scope misalignment across nested expansions, accidental capture through antiquotation and identifiers that appear well scoped but resolve incorrectly under elaboration.

Example

Consider a macro that introduces a binder and reuses a user supplied identifier inside the expansion:

macro "wrapLet " x:ident " := " val:term " in " body:term : term =>
  `(let $x := $val; $body)

At a glance this looks fine. The intent is to splice x into both the binder and the body. The problem is that $x in $body isn’t automatically tied to the binding occurrence introduced by let. It retains whatever scope information it had at the call site.

def x := 10

#eval wrapLet x := 20 in x

You might expect 20 but depending on how scopes align this can resolve to the outer x. The binder introduces a fresh scoped identifier but the occurrence in $body may still point at the original one.

The hygienic behaviour here is correct. The macro system is preventing accidental capture. But from the macro author’s perspective this is a scope mismatch between the introduced binder and the reused identifier. The usual repair is to make sure that the identifier in the body is the one produced at the binding site and not the original syntax node. That means you’d need to construct the body in a way that reuses the bound identifier after expansion, rather than re-injecting the caller’s x verbatim.

A more explicit variant shows the issue under nesting:

macro "doubleWrap " x:ident " := " val:term " in " body:term : term =>
  `(wrapLet $x := $val in wrapLet $x := $x + 1 in $body)

Here two binders are introduced with the same surface name. Without careful scope handling the inner $x in $x + 1 may not refer to the immediately preceding binder and the $body occurrence may refer to neither. The expansion is structurally correct but the scope graph no longer matches the intended binding structure.

This is the core pattern. Hygiene preserves correctness by avoiding capture but macros that intend to simulate binding constructs must actively re establish the correct scope relationships.

Scope Propagation and Macro Expansion

The next layer is how scopes actually propagate across nested expansions. Each macro expansion introduces a fresh macro scope and effectively a new namespace layer that’s attached to every identifier produced during that expansion. These scopes accumulate, not overwrite, so identifiers end up carrying a stack of scope marks.

Something like this would make it visible:

macro "m1 " x:ident : term =>
  `(fun $x => $x)

macro "m2 " x:ident : term =>
  `(m1 $x)

Then expand:

#check m2 x

What happens isn’t a trivial inlining of m1. The $x passed into m1 carries the callers scope and then m1 has a fresh scope for its own expansion. The binder fun $x => ... gets a new scoped identifier but the $x inside the body is the same syntax node as the parameter to m1, just augmented with the additional macro scope.

The key point I’m making here is that scope stacks differ between occurrences even when names match textually. The binders x is a freshly introduced identifier with a scope marking it as local to that lambda. The occurrence $x in the body is only correctly linked because the expansion process rewrites occurrences under that binder to share the same scope context.

Where this breaks is when expansion boundaries do not align with binding structure. For example:

macro "capture " x:ident : term =>
  `(fun y => $x)

and

#check (capture y)

Here $x is instantiated with y from the call site. The macro introduces a binder y but the occurrence inside the body refers to the outer y and not the newly bound one. Hygiene prevents capture by design and the inner binder doesn’t retroactively rewrite the identifier introduced earlier.

This shows that there’s a general invariant of scope flow forward through expansion but binding only affects syntax generated within the same expansion context. If a macro wants an identifier to be captured by a binder it introduces then it has to construct that identifier within the same scoped expansion and not splice in an existing one.

Nested macros amplify this. If m2 expands into m1 then the scopes from both layers accumulate and resolution depends on the full stack. Subtle bugs arise when two identifiers differ only by one scope layer making them distinct even though they print identically. At that point debugging basically means inspecting scope annotations rather than relying on surface syntax.

Antiquotation and Controlled Hygiene Breaks

Antiquotation is where the abstraction boundary around hygiene becomes porous. Up to this point scopes have been introduced and propagated by the macro system itself. With antiquotation you’re explicitly reusing syntax nodes originating from a different scope context, and the system will not normalise those scopes for you.

A simple example shows the shape of the problem:

macro "constFun " x:ident " => " body:term : term =>  
`(fun y => $body)

Now consider:

def y := 5

#eval constFun x => y

The y inside body is antiquoted directly. It carries the caller’s scope and not the scope introduced by fun y =>. As a result the occurrence resolves to the outer y, not the binder. This is consistent with hygiene but it violates the intuitive reading.

If the goal is to deliberately capture then you have to write identifiers in the target scope and not reuse the caller’s syntax. In Lean this usually means avoiding raw $x splicing in binding positions and instead working with APIs that rebind or reintroduce identifiers under the current macro scope. Otherwise you’re stitching together fragments of syntax that live in incompatible scope graphs.

The failure mode becomes more subtle when mixing quotation and syntax:

macro "badLet " x:ident " := " val:term " in " body:term : term => do
  let stx ← `(let $x := $val; $body)
  pure stx

This looks identical to the earlier example but the staging matters. If body itself was produced by another macro it’d already carry multiple scope layers. When you splice it here, those scopes are preserved. The new binder for x lives in the current expansion scope but occurrences inside body can still refer to earlier bindings with similar names. You get a term that is syntactically coherent but semantically misaligned.

A more explicit capture attempt shows what you shouldn’t rely on:

macro "forceCapture " x:ident : term =>
  `(fun $x => $x)

as well as

#check forceCapture y

This works only because both occurrences of $x are introduced within the same quotation. If you try to split this across quotations or pass x through intermediate macros the scopes diverge and the apparent capture disappears. The system doesn’t unify identifiers based on spelling, only on scope identity.

The practical takeaway is that antiquotation isn’t just interpolation but that it’s scope reuse. Every time you splice syntax you’re importing its entire scope stack into a new context. If that stack doesn’t match the binding structure you’re writing then the resolver will do exactly what the scopes dictate, not what the surface form suggests.

What makes this a nuisance is that failures rarely show up as hard errors. More often the term elaborates against a different binding than intended, especially once implicit arguments and type class search start interacting with those names. At that point the only reliable model is to treat scopes as first class and reason about where each identifier was introduced and how many expansion layers it has passed through.