NFC: BridgeJS: Descriptor-driven codegen, decoupling type knowledge from code generation #622
krodak wants to merge 6 commits into swiftwasm:main
I think the schema-driven approach makes sense to me. On the other hand, I think what we really need to think more about is how to compose the type descriptors. E.g. `Optional` should have a different descriptor depending on wrapped type T, so we need to define a general rule for that. Or define a fallback convention that works without knowing T's descriptor by using Stack ABI and define specialized descriptors for known cases. I still haven't checked the entire changes yet so I might be missing something 🙇
@kateinoigakukun thanks for looking at this, I'll think on your feedback and try to progress in this direction when I can; in parallel I'll look for more simplifications around intrinsics like we are both currently doing 👌🏻
…to use descriptor-driven dispatch
…scriptor, unify return-type switches
…ring, and collapse liftExpression
…r cleanup failures
@kateinoigakukun updated PR description and made some more changes and fixes, no rush on this, but the PR should be in a good state to have a look for further discussion; let me know if some of the changes would partially satisfy your earlier remarks
BridgeJS: Descriptor-driven codegen - decoupling type knowledge from code generation
Context
Following the discussion in #496 - @kateinoigakukun raised a concern about adding more `BridgeType` cases before paying down complexity debt in the code generator. Specifically, the ad-hoc per-type handling should be moved out of codegen, and extending supported types shouldn't require invasive changes.
In my reply, I suggested we could define pairs of Swift/TS type bridging declaratively and generalize codegen logic - similar in spirit to how stack-based types already work.
This PR is an attempt at that approach. It's experimental - I'm completely open to feedback on the direction. We can rework it, split parts into separate PRs (e.g. the JS glue changes), or close this entirely if this isn't the right path. The goal was to explore what it takes to centralize type-specific ABI knowledge so codegen operates generically rather than switching on every type.
The idea
Today, each codegen function (`lowerParameter`, `liftReturn`, `optionalLowerReturn`, etc.) has its own `switch` over `BridgeType` encoding the same structural facts - "this type uses one i32", "this type returns via the stack", "this type's optional uses a side-channel function" - in slightly different ways. Adding a new simple type means touching many of these switches.
This PR introduces three declarative abstractions:
BridgeTypeDescriptor
Defined once per `BridgeType`, this captures the Wasm ABI shape as a struct:
- `wasmParams` - core Wasm parameter types for export direction (e.g. string = `[i32, i32]`, bool = `[i32]`, struct = `[]`)
- `importParams` - parameter types for import direction (defaults to `wasmParams`; string overrides to `[i32]` since imports use object IDs)
- `wasmReturnType` / `importReturnType` - return type in export vs import direction
- `optionalConvention` - how `Optional<T>` is handled (see below)
- `nilSentinel` - bit pattern that represents `nil` for types with "extra inhabitants" (see below)
- `usesStackLifting` - whether multi-param stack lifting needs LIFO variable reordering
- `accessorTransform` - how to access the bridgeable value from a Swift accessor (identity, `.jsObject` member access, protocol downcast)
- `lowerMethod` - which bridge protocol method to call (stackReturn, fullReturn, pushParameter, none)
Codegen functions read descriptor fields instead of switching per type. For example, `lowerStatements` uses `descriptor.lowerMethod` and `descriptor.accessorTransform` rather than matching on `.jsObject`, `.swiftProtocol`, `.swiftStruct`, etc. individually. Return-type switches in `renderCallStatement`, `callStaticProperty`, and `callPropertyGetter` are unified via `accessorTransform.applyToReturnBinding()`.
OptionalConvention
Captures how `Optional<T>` is lowered/lifted for a given wrapped type T. Part of `BridgeTypeDescriptor`, the convention defaults to `nil` in the init and is derived from T's `wasmParams` when not explicitly specified:
- `wasmParams` -> `.stackABI` (stack-based types)
- `wasmParams` -> `.inlineFlag` (scalar types)
- `.sideChannelReturn` needs explicit specification
This means new types get the correct optional convention automatically from their descriptor shape.
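To make the descriptor idea concrete, here is a minimal sketch of how such a struct could derive its optional convention from its parameter shape. It is not the PR's actual code: `WasmCoreType`, the reduced field set, and the three `...Descriptor` constants are hypothetical, and the derivation rule (empty `wasmParams` means `.stackABI`, anything else means `.inlineFlag`) is an assumption inferred from the "struct = `[]`" example above.

```swift
// Hypothetical sketch: names mirror the description above, not the real BridgeJS sources.
enum WasmCoreType: Equatable { case i32, i64, f32, f64 }

enum OptionalConvention: Equatable {
    case stackABI           // optional value travels over the stack
    case inlineFlag         // an extra isSome flag accompanies the payload
    case sideChannelReturn  // never derived; must be requested explicitly
}

struct BridgeTypeDescriptor {
    let wasmParams: [WasmCoreType]    // export direction
    let importParams: [WasmCoreType]  // import direction
    let optionalConvention: OptionalConvention

    init(
        wasmParams: [WasmCoreType],
        importParams: [WasmCoreType]? = nil,
        optionalConvention: OptionalConvention? = nil
    ) {
        self.wasmParams = wasmParams
        // importParams falls back to wasmParams unless overridden.
        self.importParams = importParams ?? wasmParams
        // ASSUMPTION: no direct Wasm params means a stack-based type,
        // otherwise the type is treated as an inline-flag scalar.
        self.optionalConvention = optionalConvention
            ?? (wasmParams.isEmpty ? .stackABI : .inlineFlag)
    }
}

// Shapes taken from the examples in the field list above.
let boolDescriptor = BridgeTypeDescriptor(wasmParams: [.i32])
let stringDescriptor = BridgeTypeDescriptor(wasmParams: [.i32, .i32], importParams: [.i32])
let structDescriptor = BridgeTypeDescriptor(wasmParams: [])
```

With this shape, a codegen function can read `descriptor.optionalConvention` or `descriptor.importParams` directly instead of switching over every type, which is the decoupling the PR is after.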
NilSentinel
Inspired by Swift's "extra inhabitant" concept (bit patterns that are never valid values for a type), the `NilSentinel` enum captures whether a type has a value that can represent `nil` without an extra `isSome` flag.
Types with sentinels:
- `jsObject`, `swiftProtocol` - sentinel 0 (object IDs start at 2)
- `swiftHeapObject` - sentinel null pointer
- `caseEnum`, `associatedValueEnum` - sentinel -1 (never a valid case index)
The sentinel is used in `optionalLowerReturn` for the JS import direction - types with sentinels use a generic sentinel-based return path instead of per-type switch cases. The inner `lowerReturn` fragment is composed into an `isSome ? <lowered> : <sentinel>` pattern automatically.
JSScalarCoercion
JS-side coercion info for simple scalar types - lift/lower transforms, variable hints, optional return storage/function names. Types that return non-nil `jsCoercion` are handled through a generic `scalarFragments()` builder that returns a `(lift, lower)` pair (scalar `lowerParameter` is always `.identity` since JS auto-coerces). This replaces all per-type fragment functions (`boolLowerParameter`, `uintLiftReturn`, etc., which are all removed).
Compositional optional handling
`optionalLowerParameter` and `optionalLiftParameter` no longer contain per-type switches. They compose T's existing `lowerParameter` / `liftParameter` fragment inside an `isSome` conditional:
- `optionalLowerParameter`: Runs T's `lowerParameter` fragment into a buffer printer, captures results into outer variables, and wraps any cleanup in a scoped closure. The buffer approach lets us detect whether the inner fragment actually produces cleanup code - the `innerCleanup` variable and its associated emission are only generated when the inner fragment has cleanup lines, avoiding dead code in the generated JS.
- `optionalLiftParameter`: Runs T's `liftParameter` fragment into a buffer. If the buffer is empty (pure expression), uses a ternary `isSome ? expr : null`. If it has side effects, wraps in an `if/else` block.
New types added to `lowerParameter` or `liftParameter` automatically get correct optional handling.
The same compositional approach extends to struct fields - `structFieldLowerFragment` for nullable wrapped types delegates to the inner type's non-optional lowering fragment, wrapping it in an `isSome` conditional with placeholder pushes in the else branch. The same conditional cleanup emission applies here.
Import/export param unification via `importParams`
Types like `string`, `rawValueEnum(.string)`, and `swiftStruct` have different parameter shapes depending on direction, for example a `(bytes, length)` pair in one direction versus a single `(value)` in the other.
The `importParams` field on the descriptor captures this difference. `loweringParameterInfo` (used when Swift calls JS) always reads `importParams`, while `liftParameterInfo` (used when JS calls Swift) reads `wasmParams`. This eliminated the per-type parameter info switches - the default path now reads directly from the descriptor.
`liftExpression` collapse
In `StackCodegen.liftExpression`, 15 types all generated the same `TypeName.bridgeJSLiftParameter()` pattern. These are collapsed to a `default` case, keeping only the types that need genuinely different codegen: `.jsObject(className?)` (wrapping constructor), `.nullable` / `.array` / `.dictionary` (delegation), `.closure` (uses `JSObject`), and `.void` / `.namespaceEnum` (literal `()`).
Similarly, `liftNullableExpression` is collapsed from a 15-type explicit case list to a 2-case switch: named `jsObject` needing a `.map` wrapper vs everything else using direct lift.
What's not fully unified (and why)
This is an incremental step. Some per-type logic remains where types have genuinely different codegen semantics:
- `optionalLiftReturn` and `optionalLowerReturn` still have per-type cases for non-sentinel types - these are genuinely heterogeneous (some read from side-channel storage, some pop stack flags, some use different JS APIs)
- `liftParameter` / `lowerParameter` / `liftReturn` / `lowerReturn` for complex types (string, jsObject, jsValue, closures, etc.) still need bespoke fragments since their JS mechanics differ fundamentally - these are dispatch tables mapping types to pre-built fragments, which is the right abstraction level
What adding a new type (e.g. UUID) would look like after this PR
- `BridgeType` case - e.g. `.uuid` in the `BridgeType` enum
- `descriptor` - UUID maps to string on the wire (optional convention `.inlineFlag`)
- `jsCoercion` - same as string (no JS-side coercion needed beyond what string does), or `nil` if it reuses string's bespoke JS codegen path
- Runtime conformances to `BridgeJSLowerStackReturn` / `BridgeJSLiftParameter` that convert between `UUID` and its wire representation (`UUID(uuidString:)!` / `.uuidString`)
- Recognize `Foundation.UUID` and emit `.uuid`
Codegen functions (`lowerStatements`, `liftReturn`, `optionalLower*`, `optionalLift*`, JS glue generation) pick up the new type automatically through the descriptor and coercion info - no need to add cases to each of them individually.
For a type like URL, the same pattern applies (URL also maps to string on the wire via `.absoluteString` / `URL(string:)!`). The Foundation-gating (`#if canImport(Foundation)`) would go in the runtime conformances for Embedded Swift compatibility.