M8 — Type metadata system
Status: design / pre-implementation
Owner: M8 milestone — user-decided HIGHEST PRIORITY (2026-05-02)
Closes: M8 — Type metadata system deferred row + the JS legacy bans row’s instanceof / typeof / class / type X = … items.
1. Why this exists
Tu’s current type story (post-M2) compiles .tu to a TypeScript shadow and lets tsserver infer types. But all of that is erased at runtime. Three real-world UI use cases force users back into JS / Zod / external JS:
- Form input parsing (untrusted user → typed model)
- API response checking (untrusted JSON → typed model)
- Defensive prop validation (component-boundary contracts)
We’re also banning instanceof and typeof (per the JS-bans deferred row), so we owe users a strictly-better replacement before those bans land.
The plan: types stop being purely compile-time and become first-class runtime metadata. Every interface declaration produces both a TS type AND a JS value carrying its descriptor. type.of(v) and type.is(v, I) operate on those descriptors.
2. Core decisions (locked 2026-05-02)
- Strict duck typing. Structural matching, but every declared field must be present and correctly typed. Excess properties are tolerated for now (matches TS semantics); future opt-in for “strict-no-extras” mode.
- Only
interfacefor compound types.- No
class(banned, was never needed for UI). - No
struct. - No
type X = …keyword (REMOVED — current Tu’s contextualtypekeyword goes away).
- No
- No
extends, no extension, no pipeline. Interfaces are flat field declarations. Composition happens at the value level via spread. - No new symbols. API uses standard member access:
type.of(v),type.is(v, I). No::namespace operator. - Interface = runtime value.
interface Foo { x: number; y: string }declares ONE identifierFoothat is both a TS type AND a JS value. - Anonymous interfaces. Untyped
let a = { x: 1 }triggers compiler-synthesized anonymous interface descriptor. Required for end-to-end metadata propagation. - Primitives can’t be extended.
type.Number,type.String, etc. are sealed. - Object construction is ONLY
let a: Interface = {}. NoObject.create, nonew Foo()for user types. - Spread inheritance.
let b = {...a, extra: 1}synthesizes a descriptor combininga’s fields withextra.
3. Surface
3.1 Interface declaration
interface User {
id: number
name: string
email: string
}
Compiles to:
- TS shadow:
export interface User { id: number; name: string; email: string } - JS value:
export const User = type.struct("User", { id: type.Number, name: type.String, email: type.String })
The same User identifier serves both purposes — TS picks up the interface declaration for inference; JS code sees the const User.
3.2 Construction
let alice: User = { id: 1, name: "Alice", email: "alice@example.com" }
Compiler emits a runtime-tagged object:
const alice = type.tag(User, { id: 1, name: "Alice", email: "alice@example.com" })
type.tag either:
- Attaches a non-enumerable
[type.symbol]field pointing at the descriptor (option A — heaviest, most reflective), OR - Returns the object verbatim, registers
(value, descriptor)in a WeakMap (option B — non-invasive, garbage-collected with the value), OR - Returns the object verbatim, no tag at all —
type.of(v)walks the structure (option C — fastest, but lossy because shape ↔ descriptor isn’t bijective for primitives like{ x: number }vs{ x: number; y?: never }).
Default decision: option B (WeakMap registry). Best balance of zero-overhead-when-unused (no tag eats memory if the user never calls type.of) and accurate-shape-recovery (registry remembers the original descriptor).
3.3 Spread composition
let admin: Admin = { ...alice, role: "admin" }
If Admin is declared explicitly, the : Admin annotation drives the descriptor. If untyped:
let admin = { ...alice, role: "admin" }
// Compiler synthesizes anon interface from sources:
// anon = merge(descriptorOf(alice), { role: type.String })
// Registers admin → anon in the descriptor WeakMap.
The compiler must trace spread sources and merge their descriptors. When a source is statically unknown (e.g. a function-returned object), the synthesized descriptor falls back to type.Object (the open-ended root descriptor).
3.4 The runtime API
Exported from @tu-lang/std:
import { type } from "@tu-lang/std"
// Primitives — sealed module-level constants:
type.Number // descriptor for JS number
type.String // descriptor for JS string
type.Boolean // descriptor for JS boolean
type.Null // descriptor for null (Tu unifies null and undefined)
type.Function // descriptor for any function
type.Array(T) // constructor: descriptor for "array of T"
type.Object // descriptor for "any object" (open-ended root)
// Introspection:
type.of(v) // returns the descriptor — known interface > shape match > primitive > Object
type.is(v, I) // structural check: every required field of I must be present + match type recursively
// User construction:
type.struct(name, fields) // creates a new interface descriptor (compiler emits this for `interface` decls)
type.is is recursive — for a field declared : number[], the check walks the array elements.
3.5 What type.of returns
Priority order:
- Tagged: if
vis in the WeakMap registry, return that descriptor. - Primitive:
typeof vJS check →type.Number/type.String/ etc. - Array:
Array.isArray(v)→type.Array(elementType)where elementType comes from sampling the first element (ortype.Objectfor empty arrays). - Plain object: walk own-enumerable keys, build an anonymous shape descriptor on the fly. This is the lossy fallback — duck-typed shape recovery.
- null:
type.Null.
This means type.of(v) always returns a descriptor (no null / undefined returns), and the shape may be lossy when the value isn’t tagged.
3.6 Built-in JS types
@tu-lang/std/type ships descriptors for the JS built-ins Tu still allows construction of:
type.Promise // for `new Promise(…)`
type.Map // for `new Map()`
type.Set
type.Error
type.AbortController
type.RegExp
Each carries an instanceof-equivalent check internally (since these are nominal in JS), so type.is(p, type.Promise) actually does p instanceof Promise. Important: this is the ONLY place instanceof runs — Tu source NEVER writes instanceof directly.
4. Compiler changes
4.1 Lexer
- Add
interfacekeyword (replaces contextualtypekeyword from M2.4). - Remove
typekeyword recognition (becomes a normal identifier — for the runtime API).
4.2 Parser
- New AST node:
InterfaceDecl { name, fields: { name, type, optional }[], start, end }. - Drop
TypeAliasAST node + every parser branch that produces it. - Object literal parser: when no
: Iannotation present, mark for compiler-side anon-interface synthesis pass.
4.3 Codegen
- For
interface Foo { … }:- Emit
export interface Foo { … }to the TS shadow (drives tsserver inference). - Emit
export const Foo = type.struct("Foo", { … })to BOTH JS and TS modes.
- Emit
- For
let a: I = { … }:- Emit
type.tag(I, { … })so the registry catches the value.
- Emit
- For
let a = { … }(untyped):- Synthesize anon descriptor at module-level (with shape interning for repeats).
- Emit
type.tag(__anon_42, { … }).
- For
let b = {...a, extra: 1}:- Compute merged descriptor from spread sources at compile time.
- Emit
type.tag(__merged, { ... }).
- Banned constructs throw with directive errors:
typeof v→ “usetype.of(v)”.v instanceof T→ “usetype.is(v, T)”.type X = …→ “useinterface X { … }”.
4.4 Migration
The codebase has many type X = … aliases (in examples, tu-xing, playground). Each becomes interface X { … }. Audit scope:
examples/typed/Typed.tuexamples/js-compat/JsCompat.tuexamples/suspense/Page.tuplayground/src/live-cases.tu(CaseDefinition,CaseFile)playground/src/Sidebar.tu(DemoLinkProps)packages/tu-xing/src/components/*.tu(every Props type)
Done in the same commit as the parser change so nothing’s left in type X = … form after.
5. Performance considerations
- WeakMap registry is GC-friendly: when a tagged value is collected, its descriptor entry goes too.
- Shape interning: same anonymous shape (same field names + same field types in same order) → same module-level descriptor constant. Prevents allocation explosion for
{ x: 1 }literals appearing many times. - Hot-path opt-out: future optimization —
type.tagcan no-op when--releasemode is on (all type checks become identity, the registry is dropped, structural recovery is the only path). Don’t ship the opt-out in v1; first prove the registry isn’t a bottleneck.
6. Phases (re-stated)
- Phase 0 — this design doc. Lands first.
- Phase 1 —
@tu-lang/std/typeprimitives +of/isfor JS primitives + arrays + plain objects. Standalone;type.of(1) === type.Numberworks without compiler changes. - Phase 2 —
interfacekeyword + codegen + repo migration (the big bang). - Phase 3 — anonymous interface synthesis + shape interning.
- Phase 4 — wire the parser bans (
typeof,instanceof,type X = …) with directive errors. - Phase 5 — built-in JS-type descriptors (Promise, Map, Set, Error, AbortController) + Temporal types from
@tu-lang/std/time(submodule of std, decided 2026-05-03 not to carve a separate package). - Phase 6 (M9) — generics (
interface Box<T>) + unions (union(A, B)runtime constructor or syntax) + recursive interfaces.
7. Open questions (resolve during implementation)
- Union types — coexistence with
type X = …(CRITICAL — surfaced by Phase 2 audit). The repo has many string-literal unions (type ButtonVariant = "primary" | "secondary" | …) and nullable shapes (email: string | null). Pureinterfacedoesn’t cover these — interfaces are object shapes. Recommended resolution: keep the existingtype X = …keyword EXCLUSIVELY for unions/aliases (non-object compound types); addinterface X { … }as the new keyword for object shapes (with runtime descriptor). They coexist — pick based on what you’re declaring. Phase 2 then stops being blocked on the union design (which lives in M9 / Phase 6 with theunion(A, B)runtime constructor). - Excess-property handling: tolerate (TS-style) or strict-reject (Tu-strict)? Default: tolerate, with a future
type.strict(I)wrapper for strict mode. - Interface name collision with primitives:
interface Number { … }should error. - Reflection on functions:
type.of(() => 42)returnstype.Function. Do we capture parameter / return types? Not in v1 — too expensive. - Cross-
.tuinterface references: tsserver already handles this via the import graph. Runtime descriptors must also be importable;interface Foo { … }exports cleanly so this is automatic.
8. Banned things this directly closes
typeof v(operator) → permanently banned, replaced bytype.of(v).v instanceof T→ permanently banned, replaced bytype.is(v, T).type X = …(Tu’s M2.4 alias keyword) → permanently removed, replaced byinterface X { … }.class→ permanently banned (already was; M8 cements the alternative).
9. Open compatibility migrations (for the implementation commit)
When Phase 2 lands, every .tu file in packages/, examples/, playground/, docs/ must be migrated:
type Foo = { … }(object shape) →interface Foo { … }. Affectsexamples/typed/Typed.tu,examples/js-compat/JsCompat.tu,examples/suspense/Page.tu,playground/src/Sidebar.tu,playground/src/live-cases.tu(Todo, User, ShuffleResult, CaseFile, CaseDefinition),packages/tu-xing/src/components/*.tu(every Props type).type X = "a" | "b" | …(union) → STAYS astype X = …per §7 resolution. Affectspackages/tu-xing/src/components/Button.tu(ButtonVariant, ButtonSize),Badge.tu(BadgeVariant),Input.tu(InputSize). These are union-aliases, not interfaces.email: string | null(inline union in interface field) — STAYS. The interface declaration uses TS union syntax verbatim; the runtime descriptor’sOptional(String)covers the nullable case.typeof x→type.of(x)(in user code; compiler-emittedtypeoffor type guards stays in TS shadow).x instanceof Y→type.is(x, Y).
The audit at Phase 2 time picks up specifics; this doc lists the categories.