diff --git a/README.md b/README.md index a6ada8a..4b81958 100644 --- a/README.md +++ b/README.md @@ -1,39 +1,126 @@ # vanity -executable artefact about parametricity failure under specialisation and inlining with representation lowering +vanity is an executable model for auditing optimisation failures caused by +specialisation, inlining, and representation lowering. -this project models a small typed source calculus and then lowers it into an explicit representation calculus and audits whether the source observation and target observation remain related. the surface is semantic rather than structural as its about abstraction theorems, representation relations, strictness shifts, and the proof obligations that an optimiser assumes when it rewrites polymorphic code +the project describes a small typed source calculus, lowers it into an explicit target IR, and checks whether source and target observations remain related across pass boundaries. the interesting cases are not parser or codegen bugs but failures of the abstraction theorem that an optimiser implicitly assumes when it rewrites polymorphic code -the target language makes representation explicit and forces each transition to be spelled out in the term. values are either boxed or unboxed and can be projected from products or injected into sums as well as unpacked when needed and moved across worker wrapper boundaries as part of calling convention shifts +## the idea -the target calculus is intentionally lower level than the source and trades abstraction for control over layout and calling. it has tuple and sum representations and uses explicit box and unbox operations to mediate between them. equality on integers and booleans is primitive rather than encoded as well as recursive binding being built in rather than derived +the source language keeps representation abstract. it has functions, type +abstraction and application, products, sums, recursive types, recursive +bindings and an explicit `tick`. the effect is intentionally small and just enough to make DCE rewrites observable. -the audit path records pass boundary observations and each corpus case is checked against relation preservation, target arity and shape validation, termination class preservation and effect trace preservation. unsafe profiles intentionally refute the invariants so that optimiser bugs show up as small source target deltas rather than vague failures +the target IR makes layout explicit. values can be boxed or unboxed, products and sums have concrete representations, functions carry a boxed or unboxed calling convention, and worker wrapper lowering is represented as a term rather than hidden in the compiler -the source calculus also has an explicit `tick` effect which is enough to expose unsound dce style rewrites. eliminating a dead binding is only valid when the eliminated term is effect free +the audit asks a simple question at each boundary: -## example(s) +- did the source and target terminate in the same class? +- did the effect trace remain equivalent? +- did representation lowering preserve the selected relation? +- did the target IR validate its arity and shape constraints? +- did the pass emit the proof obligation needed to justify the rewrite? -a representation exposure witness starts from a polymorphic constant false term +## pipeline + +the pipeline is controlled by optimisation flags: ```ocaml -forall a. function a function a bool +type optimisation_flags = { + specialise : bool; + inline : bool; + repr_lower : bool; + unsafe_repr_eq : bool; + unsafe_strict_unroll : bool; + unsafe_effect_drop : bool; +} ``` -under a safe profile instantiating this preserves the baseline relation. under an unsafe inline profile instantiating can expose integer equality and refute the abstraction theorem +compilation records each intermediate term: -a strictness witness places a divergent recursive computation +```text +source -> specialise -> inline -> effect rewrite -> repr lower -> target IR +``` + +each pass may also emit obligations: + +- preserve the source relation at a rewrite site +- account for exposed representation specific tests +- prove that strictness did not shift termination +- preserve effect observations +- justify worker wrapper argument and result correspondence + +safe profiles leave those obligations discharged or merely assumed where the +model has no full proof engine. unsafe profiles deliberately refute them so the failure appears as a small source target delta. + +## relations + +the comparison engine supports three relations: + +- compare source observations without representation lowering +- relate boxed source values to unboxed target values +- restrict comparison to ground integer and boolean values + +these relations are intentionally small. the point is to make the failure mode visible, not to model a compiler front end + +## corpus + +the checked corpus contains witnesses for both preserved and broken rewrites + +### safe polymorphic instantiation + +this case instantiates a polymorphic constant function at `bool` + +```ocaml +forall a. a -> a -> bool +``` + +specialisation alone should preserve the baseline relation. no +representation specific primitive is introduced, so the free theorem remains +intact + +### representation exposure + +this case instantiates the same polymorphic shape at `int` and then permits an unsafe inline profile to rewrite the abstract branch into `eq_int` + +the source term is parametric. the target term is no longer parametric because it can observe integer representation + +### strictness shift + +this case places a divergent recursive computation behind a `roll` boundary ```ocaml roll mu a. int loop ``` -the source observation can keep the recursive payload latent and a strict lowering path can force it during construction and change termination +a safe lowering keeps the payload latent and an unsafe unroll path forces it early to change termination -a worker wrapper witness uses the explicit source type representation +### effect shift + +this case uses an effectful dead binding: ```ocaml -TArrow (TPair (TInt, TBool), TPair (TInt, TBool)) +let _ = tick "alloc" 0 in 1 ``` -the target worker receives `RInt` and `RBool` while the wrapper preserves the boxed source interface IRs +the result value is independent of the binding but the observation is not. +dropping the binding is only valid if the eliminated term is effect free + +### worker wrapper + +this case lowers a boxed pair function through an unboxed worker + +```ocaml +int * bool -> int * bool +``` + +the wrapper preserves the boxed source interface while the worker receives +`RInt` and `RBool`. the target validator checks that worker arity and result shape still match the representation relation. + +## notes + +- the model is call by value. +- target validation is structural and representation aware +- generated campaigns sample typed source terms and shrink failing specimens +- unsafe profiles are supposed to fail; they are witnesses, not broken tests +- this is a semantic audit harness and not a complete optimiser