Lyng Bytecode VM Spec v0 (Draft)

This document describes a register-like (3-address) bytecode for Lyng with dynamic slot width (8/16/32-bit slot IDs), a slot-tail argument model, and typed lanes for Obj/Int/Real/Bool. The VM is intended to run as a suspendable interpreter and fall back to the existing AST execution when needed.

1) Frame & Slot Model

Frame metadata

  • localCount: number of local slots for this function (fixed at compile time).
  • argCount: number of arguments passed at call time.
  • argBase = localCount.

Slot layout

  • slots[0 .. localCount-1]: locals
  • slots[localCount .. localCount+argCount-1]: arguments

Typed lanes

  • slotType[]: UNKNOWN/OBJ/INT/REAL/BOOL
  • objSlots[], intSlots[], realSlots[], boolSlots[]
  • A slot is a logical index; the active lane for a slot is selected by its slotType entry.

Parameter access

  • param i => slot localCount + i
  • variadic extra => slot localCount + declaredParamCount + k
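
The layout above can be sketched as follows. This is a minimal illustration, not the VM's actual data structure: the `Frame` and `SlotType` classes and their method names are hypothetical, and every lane is simply allocated at full frame size.

```python
from enum import Enum


class SlotType(Enum):
    UNKNOWN = 0
    OBJ = 1
    INT = 2
    REAL = 3
    BOOL = 4


class Frame:
    """Hypothetical frame: locals first, then arguments, one lane per type."""

    def __init__(self, local_count: int, arg_count: int):
        self.local_count = local_count
        self.arg_count = arg_count
        self.arg_base = local_count  # argBase = localCount
        size = local_count + arg_count
        self.slot_type = [SlotType.UNKNOWN] * size
        self.obj_slots = [None] * size
        self.int_slots = [0] * size
        self.real_slots = [0.0] * size
        self.bool_slots = [False] * size

    def param_slot(self, i: int) -> int:
        # param i => slot localCount + i
        return self.local_count + i

    def set_int(self, slot: int, value: int) -> None:
        self.int_slots[slot] = value
        self.slot_type[slot] = SlotType.INT  # mark the INT lane as active

    def get(self, slot: int):
        # Read from whichever lane slotType marks as active.
        t = self.slot_type[slot]
        if t is SlotType.INT:
            return self.int_slots[slot]
        if t is SlotType.REAL:
            return self.real_slots[slot]
        if t is SlotType.BOOL:
            return self.bool_slots[slot]
        if t is SlotType.OBJ:
            return self.obj_slots[slot]
        raise ValueError(f"slot {slot} is UNKNOWN")


frame = Frame(local_count=3, arg_count=2)
frame.set_int(frame.param_slot(1), 42)  # second parameter lives in slot 4
```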

2) Slot ID Width

Per frame, select:

  • 8-bit if localCount + argCount < 256
  • 16-bit if localCount + argCount < 65536
  • 32-bit otherwise

The decoder uses a dedicated loop per width. All slot operands are expanded to Int internally.
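
As a sketch of the selection rule, assuming the width is expressed in bytes (the function name `select_slot_width` is hypothetical):

```python
def select_slot_width(local_count: int, arg_count: int) -> int:
    """Return the slot ID width in bytes (1, 2 or 4) for one frame."""
    total = local_count + arg_count
    if total < 256:
        return 1  # 8-bit slot IDs
    if total < 65536:
        return 2  # 16-bit slot IDs
    return 4      # 32-bit slot IDs
```

For example, a frame with 200 locals and 55 arguments still fits 8-bit slot IDs, while one more argument pushes it to 16-bit.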

3) CALL Semantics (Model A)

Instruction: CALL_DIRECT fnId, argBase, argCount, dst

Behavior:

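  • Allocate a callee frame sized localCount + argCount, where localCount is the callee's local count.
  • Copy caller slots [argBase .. argBase+argCount-1] into callee slots [localCount .. localCount+argCount-1]. Here argBase is the CALL operand (a caller slot), not the callee's frame argBase.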
  • Callee returns via RET slot or RET_VOID.
  • Caller stores return value to dst.
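
The steps above might look like this in outline. It is a simplified sketch: slots are a single untyped list rather than typed lanes, the `functions` table and the Python callable standing in for the callee's bytecode are hypothetical, and suspension is ignored.

```python
def call_direct(functions, caller_slots, fn_id, arg_base, arg_count, dst):
    """Model A call: copy the caller's argument range into a fresh callee frame."""
    local_count, body = functions[fn_id]
    callee_slots = [None] * (local_count + arg_count)
    # caller [argBase .. argBase+argCount-1]
    #   -> callee [localCount .. localCount+argCount-1]
    args = caller_slots[arg_base:arg_base + arg_count]
    callee_slots[local_count:local_count + arg_count] = args
    # The callable stands in for running the callee until RET / RET_VOID.
    caller_slots[dst] = body(callee_slots)


# Hypothetical callee with one local: adds its two parameters.
def add_body(slots):
    local_count = 1
    slots[0] = slots[local_count] + slots[local_count + 1]
    return slots[0]  # RET slot 0


functions = {7: (1, add_body)}
caller = [None, 10, 32, None]  # arguments staged in caller slots 1..2
call_direct(functions, caller, fn_id=7, arg_base=1, arg_count=2, dst=3)
```

After the call, the caller's slot 3 holds the callee's return value and the staged arguments in slots 1 and 2 are unchanged.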

Other calls:

  • CALL_VIRTUAL recvSlot, methodId, argBase, argCount, dst
  • CALL_FALLBACK stmtId, argBase, argCount, dst

4) Binary Encoding Layout

All instructions are: [opcode:U8] [operands...]

Operand widths:

  • slotId: S = 1/2/4 bytes (per frame slot width)
  • constId: K = 2 bytes (U16), extend to 4 if needed
  • ip: I = 2 bytes (U16) or 4 bytes (U32) per function size
  • fnId/methodId/stmtId: F/M/T = 2 bytes (U16) unless extended
  • argCount: C = 2 bytes (U16), extend to 4 if needed

Endianness: little-endian for multi-byte operands.

Common operand patterns:

  • S: one slot
  • SS: two slots
  • SSS: three slots
  • K S: constId + dst slot
  • S I: slot + jump target
  • I: jump target
  • F S C S: fnId, argBase slot, argCount, dst slot
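
To make the layout concrete, here is a sketch that encodes one CALL_DIRECT instruction in the F S C S pattern with the default U16 widths for fnId and argCount. The opcode value `0x50` is made up for illustration, since this spec does not assign opcode numbers, and `encode_call_direct` is a hypothetical helper.

```python
import struct

# struct codes for each slot width in bytes; "<" below selects little-endian
# with no padding.
WIDTH_CODE = {1: "B", 2: "H", 4: "I"}

OP_CALL_DIRECT = 0x50  # hypothetical opcode number, not defined by the spec


def encode_call_direct(fn_id, arg_base, arg_count, dst, slot_width):
    """Encode CALL_DIRECT as [opcode:U8] F S C S, with F and C as U16."""
    s = WIDTH_CODE[slot_width]
    fmt = "<BH" + s + "H" + s
    return struct.pack(fmt, OP_CALL_DIRECT, fn_id, arg_base, arg_count, dst)


encoded = encode_call_direct(fn_id=7, arg_base=1, arg_count=2, dst=3, slot_width=1)
```

With 8-bit slot IDs the instruction takes 7 bytes (1 + 2 + 1 + 2 + 1). With 16-bit slot IDs it takes 9.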

5) Opcode Table

Note: Any operation not yet implemented in a VM pass can instead be compiled to a fallback op (CALL_FALLBACK or EVAL_FALLBACK).

Data movement

  • NOP
  • MOVE_OBJ S -> S
  • MOVE_INT S -> S
  • MOVE_REAL S -> S
  • MOVE_BOOL S -> S
  • CONST_OBJ K -> S
  • CONST_INT K -> S
  • CONST_REAL K -> S
  • CONST_BOOL K -> S
  • CONST_NULL -> S

Numeric conversions

  • INT_TO_REAL S -> S
  • REAL_TO_INT S -> S
  • BOOL_TO_INT S -> S
  • INT_TO_BOOL S -> S

Arithmetic: INT

  • ADD_INT S, S -> S
  • SUB_INT S, S -> S
  • MUL_INT S, S -> S
  • DIV_INT S, S -> S
  • MOD_INT S, S -> S
  • NEG_INT S -> S
  • INC_INT S
  • DEC_INT S

Arithmetic: REAL

  • ADD_REAL S, S -> S
  • SUB_REAL S, S -> S
  • MUL_REAL S, S -> S
  • DIV_REAL S, S -> S
  • NEG_REAL S -> S

Bitwise: INT

  • AND_INT S, S -> S
  • OR_INT S, S -> S
  • XOR_INT S, S -> S
  • SHL_INT S, S -> S
  • SHR_INT S, S -> S
  • USHR_INT S, S -> S
  • INV_INT S -> S

Comparisons (typed)

  • CMP_EQ_INT S, S -> S
  • CMP_NEQ_INT S, S -> S
  • CMP_LT_INT S, S -> S
  • CMP_LTE_INT S, S -> S
  • CMP_GT_INT S, S -> S
  • CMP_GTE_INT S, S -> S
  • CMP_EQ_REAL S, S -> S
  • CMP_NEQ_REAL S, S -> S
  • CMP_LT_REAL S, S -> S
  • CMP_LTE_REAL S, S -> S
  • CMP_GT_REAL S, S -> S
  • CMP_GTE_REAL S, S -> S
  • CMP_EQ_BOOL S, S -> S
  • CMP_NEQ_BOOL S, S -> S

Mixed numeric comparisons

  • CMP_EQ_INT_REAL S, S -> S
  • CMP_EQ_REAL_INT S, S -> S
  • CMP_LT_INT_REAL S, S -> S
  • CMP_LT_REAL_INT S, S -> S
  • CMP_LTE_INT_REAL S, S -> S
  • CMP_LTE_REAL_INT S, S -> S
  • CMP_GT_INT_REAL S, S -> S
  • CMP_GT_REAL_INT S, S -> S
  • CMP_GTE_INT_REAL S, S -> S
  • CMP_GTE_REAL_INT S, S -> S

Boolean ops

  • NOT_BOOL S -> S
  • AND_BOOL S, S -> S
  • OR_BOOL S, S -> S

Control flow

  • JMP I
  • JMP_IF_TRUE S, I
  • JMP_IF_FALSE S, I
  • RET S
  • RET_VOID
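
A decode loop for a handful of these opcodes could be sketched as below. Several things here are assumptions rather than part of the spec: the opcode numbers, the treatment of jump targets as absolute byte offsets into the code, and the single untyped `slots` list used in place of typed lanes. The sketch fixes 8-bit slot IDs, U16 constIds and U16 jump targets.

```python
import struct

# Hypothetical opcode numbers; the spec does not assign values in v0.
CONST_INT, ADD_INT, CMP_LT_INT, JMP_IF_TRUE, RET = 1, 2, 3, 4, 5


def run(code: bytes, consts: list, slot_count: int):
    """Decode loop for 8-bit slot IDs, U16 constIds and U16 jump targets."""
    slots = [0] * slot_count
    ip = 0
    while True:
        op = code[ip]
        ip += 1
        if op == CONST_INT:  # K S
            k, dst = struct.unpack_from("<HB", code, ip)
            ip += 3
            slots[dst] = consts[k]
        elif op == ADD_INT:  # S S S
            a, b, dst = struct.unpack_from("<BBB", code, ip)
            ip += 3
            slots[dst] = slots[a] + slots[b]
        elif op == CMP_LT_INT:  # S S S
            a, b, dst = struct.unpack_from("<BBB", code, ip)
            ip += 3
            slots[dst] = slots[a] < slots[b]
        elif op == JMP_IF_TRUE:  # S I (target assumed to be a byte offset)
            cond, target = struct.unpack_from("<BH", code, ip)
            ip = target if slots[cond] else ip + 3
        elif op == RET:  # S
            return slots[code[ip]]
        else:
            raise ValueError(f"unknown opcode {op}")


# Slots: 0 = counter, 1 = step, 2 = limit, 3 = comparison result.
program = b"".join([
    struct.pack("<BHB", CONST_INT, 0, 0),
    struct.pack("<BHB", CONST_INT, 1, 1),
    struct.pack("<BHB", CONST_INT, 2, 2),
    struct.pack("<BBBB", ADD_INT, 0, 1, 0),      # loop start, byte offset 12
    struct.pack("<BBBB", CMP_LT_INT, 0, 2, 3),
    struct.pack("<BBH", JMP_IF_TRUE, 3, 12),
    struct.pack("<BB", RET, 0),
])
result = run(program, consts=[0, 1, 5], slot_count=4)
```

The program increments slot 0 until it is no longer below the limit in slot 2, then returns it.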

Calls

  • CALL_DIRECT F, S, C, S
  • CALL_VIRTUAL S, M, S, C, S
  • CALL_FALLBACK T, S, C, S

Object access (optional, later)

  • GET_FIELD S, M -> S
  • SET_FIELD S, M, S
  • GET_INDEX S, S -> S
  • SET_INDEX S, S, S

Fallback

  • EVAL_FALLBACK T -> S

6) Function Header (binary container)

Suggested layout for a bytecode function blob:

  • magic: U32 ("LYBC")
  • version: U16 (0x0001)
  • slotWidth: U8 (1,2,4)
  • ipWidth: U8 (2,4)
  • constIdWidth: U8 (2,4)
  • localCount: U32
  • codeSize: U32 (bytes)
  • constCount: U32
  • constPool: [const entries...]
  • code: [bytecode...]

Const pool entries are encoded as type-tagged values (Obj/Int/Real/Bool/String). The exact entry format is intentionally left unspecified in v0.
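
One possible reading of the fixed-size part of this header, as a sketch. It assumes the fields are packed in the listed order with no padding, that the header is little-endian like the instruction operands, and that the magic is stored as the four ASCII bytes "LYBC" in file order. None of these is stated above, and the helper names are hypothetical.

```python
import struct

# magic, version, slotWidth, ipWidth, constIdWidth, localCount, codeSize, constCount
HEADER = struct.Struct("<4sHBBBIII")


def pack_header(slot_width, ip_width, const_id_width,
                local_count, code_size, const_count) -> bytes:
    return HEADER.pack(b"LYBC", 0x0001, slot_width, ip_width, const_id_width,
                       local_count, code_size, const_count)


def unpack_header(blob: bytes) -> dict:
    (magic, version, slot_width, ip_width, const_id_width,
     local_count, code_size, const_count) = HEADER.unpack_from(blob)
    if magic != b"LYBC":
        raise ValueError("bad magic, expected LYBC")
    if slot_width not in (1, 2, 4) or ip_width not in (2, 4) \
            or const_id_width not in (2, 4):
        raise ValueError("unsupported operand width")
    return {
        "version": version,
        "slotWidth": slot_width,
        "ipWidth": ip_width,
        "constIdWidth": const_id_width,
        "localCount": local_count,
        "codeSize": code_size,
        "constCount": const_count,
    }


blob = pack_header(slot_width=1, ip_width=2, const_id_width=2,
                   local_count=4, code_size=26, const_count=3)
header = unpack_header(blob)
```

Under these assumptions the fixed header is 21 bytes, followed by the const pool and then the code.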

7) Notes

  • Mixed-mode is allowed: compiler can emit FALLBACK ops for unsupported nodes.
  • The VM must be suspendable; on suspension, store ip + minimal operand state.
  • Source mapping uses a separate ip->Pos table, not part of core bytecode.
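
The separate ip->Pos table could be looked up with a binary search, as in this sketch. The table shape is an assumption: entries are sorted by starting ip, each covers the code up to the next entry, and a position is a hypothetical (line, column) pair.

```python
from bisect import bisect_right

# Hypothetical table: (start ip, (line, column)), sorted by start ip.
pos_table = [(0, (1, 1)), (12, (2, 5)), (24, (3, 1))]
starts = [ip for ip, _ in pos_table]


def pos_for_ip(ip: int):
    """Return the source position of the entry covering this ip."""
    idx = bisect_right(starts, ip) - 1
    if idx < 0:
        raise ValueError(f"no source position for ip {ip}")
    return pos_table[idx][1]
```

Keeping the table outside the code stream means the decoder never has to skip over position data, and the table can be dropped when source positions are not needed.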