Parametricity breakdown under inlining and unboxed representations

Parametricity is one of the “useful” lies we tell ourselves when reasoning about polymorphic programs. If a function has type forall a. a -> a -> bool then in a source language with no eliminator for arbitrary a, it can’t inspect its arguments. The usual free theorem follows from the shape of the type. The implementation may return a constant, diverge or just pass its arguments around abstractly.

Optimising compilers complicate that story since a compiler rarely preserves the source calculus literally. It specialises polymorphic functions, inlines definitions, lowers boxed values to unboxed representations and rewrites calling conventions. The interesting question is whether the whole pipeline still preserves the relation that justified the free theorem.

This blog post describes a small artefact that makes two failure modes explicit. Representation exposure occurs when inlining introduces a primitive observer such as eq_int. Strictness shift occurs when unboxing changes when a value is forced.

The source is a small call by value calculus with explicit polymorphism, recursive values, bottom, pairs, sums and ground equality.

types A = int | bool | A * A | A + A | A -> A | forall a. A | mu a. A
terms e = x | fun x -> e | e e | /\a. e | e [A]
          | roll[A] e | unroll e | fix f. e | eq_int e e | eq_bool e e

The restriction is that equality only exists at ground types. There’s no eq_a for an abstract type variable so a term of this type is constrained by abstraction.

forall a. a -> a -> bool

It can’t distinguish its two arguments. A representative source term is the constant false function.

/\a. fun x -> fun y -> false

This term is boring in the source which is kind of the point. Since x and y have abstract type the body can’t inspect them and the semantics is small step call by value, with divergence observable through bounded traces. The target is monomorphic and representation aware. It has unboxed int# and bool#, unboxed tuples and sums, explicit box and unbox and boxed/unboxed calling conventions.

Worker wrapper lowering is modelled explicitly and a boxed wrapper receives the source shaped argument and projects it to then call an unboxed worker. For a function over (int * bool) the shape looks like this:

wrapper boxed =
  let p = unbox boxed in
  worker (proj0 p) (proj1 p)

worker n# b# =
  ...

This transformation isn’t inherently wrong since it’s justified by a cross language logical relation that relates a source pair (n, b) to a target tuple (#n#, b##). The obligation is pointwise. Projections and reconstruction must preserve the relation.

Now specialise the polymorphic constant function at int:

let poly_const_false =
  /\a. fun x -> fun y -> false
in
  ((poly_const_false [int]) 1) 1

In the source this evaluates to:

false

A safe specialisation pass may clone the int instance and still preserve the theorem. The problem begins when an aggressive inliner replaces the specialised body with a representation primitive:

eq_int# 1# 1#

The target evaluates to:

true#

This isn’t a failure of beta reduction. It’s a failure of the optimisations semantic assumption. The source type said that the implementation couldn’t inspect values of type a. The optimised target did inspect them because this instantiation became int. Once eq_int# appears underneath a formerly parametric binding the baseline logical relation no longer applies.

The second failure mode is different. Consider this program:

let x = roll[mu a. int] (fix loop. loop) in
0

In the source this terminates. roll forms a recursive value without evaluating the payload, so the diverging fix is never forced

An unsafe lowering pass may decide that the recursive payload should be represented directly:

let x =
  let forced_roll_payload =
    letrec loop = loop in loop
  in
  roll# forced_roll_payload
in
0#

The target diverges before reaching 0# so the issue is that lowering moved evaluation across a boundary that was observationally non-strict in the source semantics.

The distinction matters because exposing representation introduces a new observable, while the strictness shift perturbs contextual equivalence via evaluation order.

Worker wrapper can preserve a relational interpretation if the compiler materialises an explicit correspondence between boxed and unboxed forms, specialisation remains sound when it is purely duplicative and inlining is admissible provided it doesn’t leak representation primitives across abstraction boundaries.

Breakage happens when a pass implicitly strengthens its invariants, so if parametric reasoning is to be preserved each optimisation requires a preservation theorem with respect to the appropriate logical relation where for worker wrapper the relation identifies boxed and unboxed representations, for polymorphic code it enforces uniformity and for recursive definitions it is divergence sensitive.

I feel like it’s important to say that parametricity isn’t confined to the type system and must be maintained through specialisation, inlining, lowering, and calling convention transformations, since without an explicit account of which relation each pass preserves an optimisation can collapse a free theorem into an empirical artefact and leave the source semantics unable to justify the outcome.