diff --git a/docs/BytecodeSpec.md b/docs/BytecodeSpec.md
new file mode 100644
index 0000000..c57bd43
--- /dev/null
+++ b/docs/BytecodeSpec.md
@@ -0,0 +1,200 @@
+# Lyng Bytecode VM Spec v0 (Draft)
+
+This document describes a register-like (3-address) bytecode for Lyng with
+dynamic slot width (8/16/32-bit slot IDs), a slot-tail argument model, and
+typed lanes for Obj/Int/Real/Bool. The VM is intended to run as a suspendable
+interpreter and fall back to the existing AST execution when needed.
+
+## 1) Frame & Slot Model
+
+### Frame metadata
+- localCount: number of local slots for this function (fixed at compile time).
+- argCount: number of arguments passed at call time.
+- argBase = localCount.
+
+### Slot layout
+slots[0 .. localCount-1]                    locals
+slots[localCount .. localCount+argCount-1]  arguments
+
+### Typed lanes
+- slotType[]: UNKNOWN/OBJ/INT/REAL/BOOL
+- objSlots[], intSlots[], realSlots[], boolSlots[]
+- A slot is a logical index; active lane is selected by slotType.
+
+### Parameter access
+- param i => slot localCount + i
+- variadic extra => slot localCount + declaredParamCount + k
+
+## 2) Slot ID Width
+
+Per frame, select:
+- 8-bit if localCount + argCount < 256
+- 16-bit if < 65536
+- 32-bit otherwise
+
+The decoder uses a dedicated loop per width. All slot operands are expanded to
+Int internally.
+
+## 3) CALL Semantics (Model A)
+
+Instruction:
+CALL_DIRECT fnId, argBase, argCount, dst
+
+Behavior:
+- Allocate a callee frame sized localCount + argCount.
+- Copy caller slots [argBase .. argBase+argCount-1] into callee slots
+  [localCount .. localCount+argCount-1].
+- Callee returns via RET slot or RET_VOID.
+- Caller stores return value to dst.
+
+Other calls:
+- CALL_VIRTUAL recvSlot, methodId, argBase, argCount, dst
+- CALL_FALLBACK stmtId, argBase, argCount, dst
+
+## 4) Binary Encoding Layout
+
+All instructions are:
+  [opcode:U8] [operands...]
+
+Operand widths:
+- slotId: S = 1/2/4 bytes (per frame slot width)
+- constId: K = 2 bytes (U16), extend to 4 if needed
+- ip: I = 2 bytes (U16) or 4 bytes (U32) per function size
+- fnId/methodId/stmtId: F/M/T = 2 bytes (U16) unless extended
+- argCount: C = 2 bytes (U16), extend to 4 if needed
+
+Endianness: little-endian for multi-byte operands.
+
+Common operand patterns:
+- S: one slot
+- SS: two slots
+- SSS: three slots
+- K S: constId + dst slot
+- S I: slot + jump target
+- I: jump target
+- F S C S: fnId, argBase slot, argCount, dst slot
+
+## 5) Opcode Table
+
+Note: Any opcode can be compiled to FALLBACK if not implemented in a VM pass.
+
+### Data movement
+- NOP
+- MOVE_OBJ S -> S
+- MOVE_INT S -> S
+- MOVE_REAL S -> S
+- MOVE_BOOL S -> S
+- CONST_OBJ K -> S
+- CONST_INT K -> S
+- CONST_REAL K -> S
+- CONST_BOOL K -> S
+- CONST_NULL -> S
+
+### Numeric conversions
+- INT_TO_REAL S -> S
+- REAL_TO_INT S -> S
+- BOOL_TO_INT S -> S
+- INT_TO_BOOL S -> S
+
+### Arithmetic: INT
+- ADD_INT S, S -> S
+- SUB_INT S, S -> S
+- MUL_INT S, S -> S
+- DIV_INT S, S -> S
+- MOD_INT S, S -> S
+- NEG_INT S -> S
+- INC_INT S
+- DEC_INT S
+
+### Arithmetic: REAL
+- ADD_REAL S, S -> S
+- SUB_REAL S, S -> S
+- MUL_REAL S, S -> S
+- DIV_REAL S, S -> S
+- NEG_REAL S -> S
+
+### Bitwise: INT
+- AND_INT S, S -> S
+- OR_INT S, S -> S
+- XOR_INT S, S -> S
+- SHL_INT S, S -> S
+- SHR_INT S, S -> S
+- USHR_INT S, S -> S
+- INV_INT S -> S
+
+### Comparisons (typed)
+- CMP_EQ_INT S, S -> S
+- CMP_NEQ_INT S, S -> S
+- CMP_LT_INT S, S -> S
+- CMP_LTE_INT S, S -> S
+- CMP_GT_INT S, S -> S
+- CMP_GTE_INT S, S -> S
+- CMP_EQ_REAL S, S -> S
+- CMP_NEQ_REAL S, S -> S
+- CMP_LT_REAL S, S -> S
+- CMP_LTE_REAL S, S -> S
+- CMP_GT_REAL S, S -> S
+- CMP_GTE_REAL S, S -> S
+- CMP_EQ_BOOL S, S -> S
+- CMP_NEQ_BOOL S, S -> S
+
+### Mixed numeric comparisons
+- CMP_EQ_INT_REAL S, S -> S
+- CMP_EQ_REAL_INT S, S -> S
+- CMP_LT_INT_REAL S, S -> S
+- CMP_LT_REAL_INT S, S -> S
+- CMP_LTE_INT_REAL S, S -> S
+- CMP_LTE_REAL_INT S, S -> S
+- CMP_GT_INT_REAL S, S -> S
+- CMP_GT_REAL_INT S, S -> S
+- CMP_GTE_INT_REAL S, S -> S
+- CMP_GTE_REAL_INT S, S -> S
+
+### Boolean ops
+- NOT_BOOL S -> S
+- AND_BOOL S, S -> S
+- OR_BOOL S, S -> S
+
+### Control flow
+- JMP I
+- JMP_IF_TRUE S, I
+- JMP_IF_FALSE S, I
+- RET S
+- RET_VOID
+
+### Calls
+- CALL_DIRECT F, S, C, S
+- CALL_VIRTUAL S, M, S, C, S
+- CALL_FALLBACK T, S, C, S
+
+### Object access (optional, later)
+- GET_FIELD S, M -> S
+- SET_FIELD S, M, S
+- GET_INDEX S, S -> S
+- SET_INDEX S, S, S
+
+### Fallback
+- EVAL_FALLBACK T -> S
+
+## 6) Function Header (binary container)
+
+Suggested layout for a bytecode function blob:
+- magic: U32 ("LYBC")
+- version: U16 (0x0001)
+- slotWidth: U8 (1,2,4)
+- ipWidth: U8 (2,4)
+- constIdWidth: U8 (2,4)
+- localCount: U32
+- codeSize: U32 (bytes)
+- constCount: U32
+- constPool: [const entries...]
+- code: [bytecode...]
+
+Const pool entries are encoded as type-tagged values (Obj/Int/Real/Bool/String)
+in a simple tagged format. This is intentionally unspecified in v0.
+
+## 7) Notes
+
+- Mixed-mode is allowed: compiler can emit FALLBACK ops for unsupported nodes.
+- The VM must be suspendable; on suspension, store ip + minimal operand state.
+- Source mapping uses a separate ip->Pos table, not part of core bytecode.
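
Illustration for sections 2 and 4 of the spec: a minimal Python sketch of per-frame slot-width selection and of decoding the `F S C S` operand pattern that follows a `CALL_DIRECT` opcode byte. The function names `select_slot_width` and `decode_call_direct` are hypothetical, and the sketch assumes the non-extended U16 widths for fnId and argCount; it is not the actual Lyng decoder.

```python
import struct

# struct codes for the operand widths named in the spec (1, 2 or 4 bytes).
_WIDTH_CODE = {1: "B", 2: "H", 4: "I"}


def select_slot_width(local_count: int, arg_count: int) -> int:
    """Return the slot ID width in bytes for one frame (rule from section 2)."""
    total = local_count + arg_count
    if total < 256:
        return 1
    if total < 65536:
        return 2
    return 4


def decode_call_direct(code: bytes, pos: int, slot_width: int):
    """Decode the F S C S operands located at `pos` (just after the opcode byte).

    Assumption: fnId and argCount use their default U16 encoding.
    Returns ((fn_id, arg_base, arg_count, dst), next_pos).
    """
    s = _WIDTH_CODE[slot_width]
    fmt = "<H" + s + "H" + s  # "<" = little-endian, no padding
    fields = struct.unpack_from(fmt, code, pos)
    return fields, pos + struct.calcsize(fmt)


# Usage: operands for CALL_DIRECT fnId=7, argBase=3, argCount=2, dst=0
# in a frame that uses 1-byte slot IDs.
operands = struct.pack("<HBHB", 7, 3, 2, 0)
fields, next_pos = decode_call_direct(operands, 0, select_slot_width(10, 2))
```

A real decoder would, per the spec, run a dedicated loop per width instead of building a format string on every instruction; the format string is used here only to keep the sketch short.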
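
Illustration for sections 1 and 3 of the spec: a rough Python sketch of the slot-tail argument model, where `CALL_DIRECT` copies a range of caller slots into the tail of a freshly allocated callee frame. `Frame` and `call_direct` are hypothetical names, and the typed lanes are collapsed into a single `slots` list for brevity, so this shows only the indexing, not the lane selection.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    local_count: int  # fixed at compile time
    arg_count: int    # known at call time
    slots: list = field(init=False)

    def __post_init__(self):
        # Locals occupy [0, local_count); arguments form the tail.
        self.slots = [None] * (self.local_count + self.arg_count)

    def param(self, i: int):
        """param i => slot local_count + i"""
        return self.slots[self.local_count + i]


def call_direct(caller: Frame, callee_local_count: int,
                arg_base: int, arg_count: int) -> Frame:
    """Build the callee frame and copy caller slots
    [arg_base .. arg_base+arg_count-1] into the callee's argument tail."""
    callee = Frame(callee_local_count, arg_count)
    for k in range(arg_count):
        callee.slots[callee_local_count + k] = caller.slots[arg_base + k]
    return callee


# Usage: the caller stages two arguments in its slots 2 and 3,
# then calls a function that declares three locals.
caller = Frame(local_count=4, arg_count=0)
caller.slots[2] = 10
caller.slots[3] = 20
callee = call_direct(caller, callee_local_count=3, arg_base=2, arg_count=2)
```

In this sketch the operand `arg_base` is a caller slot index, while the frame-level `argBase = localCount` from section 1 describes where arguments land in the callee.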
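
Illustration for section 6 of the spec: a possible way to pack and parse the fixed part of the suggested function header with Python's `struct` module. Two points are assumptions, since the spec states little-endian only for instruction operands: the header is also treated as little-endian, and the magic is stored as the four ASCII bytes `LYBC` in file order. `pack_header` and `parse_header` are hypothetical helper names, and the const pool is left out because its format is unspecified in v0.

```python
import struct

# magic, version, slotWidth, ipWidth, constIdWidth, localCount, codeSize, constCount
HEADER = struct.Struct("<4sHBBBIII")
MAGIC = b"LYBC"  # assumption: raw ASCII bytes, not a byte-swapped U32


def pack_header(slot_width, ip_width, const_id_width,
                local_count, code_size, const_count, version=0x0001) -> bytes:
    """Serialize the fixed-size header fields in the order listed in the spec."""
    return HEADER.pack(MAGIC, version, slot_width, ip_width, const_id_width,
                       local_count, code_size, const_count)


def parse_header(blob: bytes) -> dict:
    """Parse and sanity-check the fixed-size header at the start of `blob`."""
    (magic, version, slot_w, ip_w, const_w,
     local_count, code_size, const_count) = HEADER.unpack_from(blob, 0)
    if magic != MAGIC:
        raise ValueError("bad magic, not a bytecode function blob")
    if slot_w not in (1, 2, 4) or ip_w not in (2, 4) or const_w not in (2, 4):
        raise ValueError("unsupported operand width in header")
    return {
        "version": version,
        "slotWidth": slot_w,
        "ipWidth": ip_w,
        "constIdWidth": const_w,
        "localCount": local_count,
        "codeSize": code_size,
        "constCount": const_count,
    }


# Usage: round-trip a header for a small function.
blob = pack_header(slot_width=1, ip_width=2, const_id_width=2,
                   local_count=5, code_size=42, const_count=3)
header = parse_header(blob)
```

With this packing the fixed part occupies 21 bytes; the const pool and code would follow it.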