Add Lyng bytecode VM spec draft

This commit is contained in:
Sergey Chernov 2026-01-25 16:51:06 +03:00
parent b674ee7c18
commit bc9e557814

200
docs/BytecodeSpec.md Normal file
View File

@ -0,0 +1,200 @@
# Lyng Bytecode VM Spec v0 (Draft)
This document describes a register-like (3-address) bytecode for Lyng with
dynamic slot width (8/16/32-bit slot IDs), a slot-tail argument model, and
typed lanes for Obj/Int/Real/Bool. The VM is intended to run as a suspendable
interpreter and fall back to the existing AST execution when needed.
## 1) Frame & Slot Model
### Frame metadata
- localCount: number of local slots for this function (fixed at compile time).
- argCount: number of arguments passed at call time.
- argBase = localCount.
### Slot layout
slots[0 .. localCount-1] locals
slots[localCount .. localCount+argCount-1] arguments
### Typed lanes
- slotType[]: UNKNOWN/OBJ/INT/REAL/BOOL
- objSlots[], intSlots[], realSlots[], boolSlots[]
- A slot is a logical index; active lane is selected by slotType.
### Parameter access
- param i => slot localCount + i
- variadic extra => slot localCount + declaredParamCount + k
## 2) Slot ID Width
Per frame, select:
- 8-bit if localCount + argCount < 256
- 16-bit if < 65536
- 32-bit otherwise
The decoder uses a dedicated loop per width. All slot operands are expanded to
Int internally.
## 3) CALL Semantics (Model A)
Instruction:
CALL_DIRECT fnId, argBase, argCount, dst
Behavior:
- Allocate a callee frame sized localCount + argCount.
- Copy caller slots [argBase .. argBase+argCount-1] into callee slots
[localCount .. localCount+argCount-1].
- Callee returns via RET slot or RET_VOID.
- Caller stores return value to dst.
Other calls:
- CALL_VIRTUAL recvSlot, methodId, argBase, argCount, dst
- CALL_FALLBACK stmtId, argBase, argCount, dst
## 4) Binary Encoding Layout
All instructions are:
[opcode:U8] [operands...]
Operand widths:
- slotId: S = 1/2/4 bytes (per frame slot width)
- constId: K = 2 bytes (U16), extend to 4 if needed
- ip: I = 2 bytes (U16) or 4 bytes (U32) per function size
- fnId/methodId/stmtId: F/M/T = 2 bytes (U16) unless extended
- argCount: C = 2 bytes (U16), extend to 4 if needed
Endianness: little-endian for multi-byte operands.
Common operand patterns:
- S: one slot
- SS: two slots
- SSS: three slots
- K S: constId + dst slot
- S I: slot + jump target
- I: jump target
- F S C S: fnId, argBase slot, argCount, dst slot
## 5) Opcode Table
Note: Any opcode can be compiled to FALLBACK if not implemented in a VM pass.
### Data movement
- NOP
- MOVE_OBJ S -> S
- MOVE_INT S -> S
- MOVE_REAL S -> S
- MOVE_BOOL S -> S
- CONST_OBJ K -> S
- CONST_INT K -> S
- CONST_REAL K -> S
- CONST_BOOL K -> S
- CONST_NULL -> S
### Numeric conversions
- INT_TO_REAL S -> S
- REAL_TO_INT S -> S
- BOOL_TO_INT S -> S
- INT_TO_BOOL S -> S
### Arithmetic: INT
- ADD_INT S, S -> S
- SUB_INT S, S -> S
- MUL_INT S, S -> S
- DIV_INT S, S -> S
- MOD_INT S, S -> S
- NEG_INT S -> S
- INC_INT S
- DEC_INT S
### Arithmetic: REAL
- ADD_REAL S, S -> S
- SUB_REAL S, S -> S
- MUL_REAL S, S -> S
- DIV_REAL S, S -> S
- NEG_REAL S -> S
### Bitwise: INT
- AND_INT S, S -> S
- OR_INT S, S -> S
- XOR_INT S, S -> S
- SHL_INT S, S -> S
- SHR_INT S, S -> S
- USHR_INT S, S -> S
- INV_INT S -> S
### Comparisons (typed)
- CMP_EQ_INT S, S -> S
- CMP_NEQ_INT S, S -> S
- CMP_LT_INT S, S -> S
- CMP_LTE_INT S, S -> S
- CMP_GT_INT S, S -> S
- CMP_GTE_INT S, S -> S
- CMP_EQ_REAL S, S -> S
- CMP_NEQ_REAL S, S -> S
- CMP_LT_REAL S, S -> S
- CMP_LTE_REAL S, S -> S
- CMP_GT_REAL S, S -> S
- CMP_GTE_REAL S, S -> S
- CMP_EQ_BOOL S, S -> S
- CMP_NEQ_BOOL S, S -> S
### Mixed numeric comparisons
- CMP_EQ_INT_REAL S, S -> S
- CMP_EQ_REAL_INT S, S -> S
- CMP_LT_INT_REAL S, S -> S
- CMP_LT_REAL_INT S, S -> S
- CMP_LTE_INT_REAL S, S -> S
- CMP_LTE_REAL_INT S, S -> S
- CMP_GT_INT_REAL S, S -> S
- CMP_GT_REAL_INT S, S -> S
- CMP_GTE_INT_REAL S, S -> S
- CMP_GTE_REAL_INT S, S -> S
### Boolean ops
- NOT_BOOL S -> S
- AND_BOOL S, S -> S
- OR_BOOL S, S -> S
### Control flow
- JMP I
- JMP_IF_TRUE S, I
- JMP_IF_FALSE S, I
- RET S
- RET_VOID
### Calls
- CALL_DIRECT F, S, C, S
- CALL_VIRTUAL S, M, S, C, S
- CALL_FALLBACK T, S, C, S
### Object access (optional, later)
- GET_FIELD S, M -> S
- SET_FIELD S, M, S
- GET_INDEX S, S -> S
- SET_INDEX S, S, S
### Fallback
- EVAL_FALLBACK T -> S
## 6) Function Header (binary container)
Suggested layout for a bytecode function blob:
- magic: U32 ("LYBC")
- version: U16 (0x0001)
- slotWidth: U8 (1,2,4)
- ipWidth: U8 (2,4)
- constIdWidth: U8 (2,4)
- localCount: U32
- codeSize: U32 (bytes)
- constCount: U32
- constPool: [const entries...]
- code: [bytecode...]
Const pool entries are encoded as type-tagged values (Obj/Int/Real/Bool/String)
in a simple tagged format. This is intentionally unspecified in v0.
## 7) Notes
- Mixed-mode is allowed: compiler can emit FALLBACK ops for unsupported nodes.
- The VM must be suspendable; on suspension, store ip + minimal operand state.
- Source mapping uses a separate ip->Pos table, not part of core bytecode.