Pi Spigot JVM Baseline
Saved on April 4, 2026 before the List<Int> indexed-access follow-up fix.
Benchmark target:
Execution path:
- Python: python3 examples/pi-bench.py
- Lyng JVM: ./gradlew :lyng:runJvm --args='/home/sergeych/dev/lyng/examples/pi-bench.lyng'
- Constraint: do not use Kotlin/Native lyngCLI for perf comparisons
Baseline measurements:
- Python full script: 167 ms
- Lyng JVM full script: 1.287097604 s
- Python warm function average over 5 runs: 126.126 ms
- Lyng JVM warm function average over 5 runs: about 1071.6 ms
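For readers who want to reproduce a "warm function average over 5 runs" figure on the Python side, the sketch below shows one way such a number could be taken. The actual timing code in pi-bench.py is not reproduced in this note, so the helper name warm_average_ms, the single untimed warm-up call, and the stand-in workload are all assumptions made for illustration.

```python
import time


def warm_average_ms(fn, runs=5, warmup=1):
    """Average wall-clock time of fn() in milliseconds over `runs` timed calls.

    Hypothetical helper: the warm-up count and the use of perf_counter_ns
    are assumptions, not taken from the benchmark script.
    """
    for _ in range(warmup):
        fn()  # untimed call(s) so one-time startup cost is excluded
    total_ns = 0
    for _ in range(runs):
        start = time.perf_counter_ns()
        fn()
        total_ns += time.perf_counter_ns() - start
    return total_ns / runs / 1e6  # nanoseconds -> milliseconds


if __name__ == "__main__":
    # Stand-in workload; the real benchmark would call the spigot function.
    avg = warm_average_ms(lambda: sum(i * i for i in range(100_000)))
    print(f"warm average over 5 runs: {avg:.3f} ms")
```

Timing the function separately from the full script keeps interpreter or JVM startup out of the warm number, which is why the two rows above differ.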
Baseline ratio:
- Full script: about 7.7x slower on Lyng JVM
- Warm function only: about 8.5x slower on Lyng JVM
Primary finding at baseline:
- The hot reminders[j] accesses in piSpigot were still lowered through boxed object index ops and boxed arithmetic.
- Newly added GET_INDEX_INT and SET_INDEX_INT only reached pi, not reminders.
- Root cause: initializer element inference handled list literals, but not List.fill(boxes) { 2 }, so reminders did not become known List<Int> at compile time.
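To make it easier to see why reminders[j] dominates the run time, here is a sketch of a conventional Rabinowitz-Wagon style spigot in Python. The benchmark source is not reproduced in this note, so this is an assumed shape rather than the actual pi-bench code: only the names pi, reminders, boxes, and j come from the text above, and everything else (carried_over, held, the exact loop layout) is hypothetical.

```python
def pi_spigot(n):
    """Return the first n decimal digits of pi as a list of ints.

    Illustrative sketch only; the real benchmark may differ in detail.
    """
    boxes = n * 10 // 3
    reminders = [2] * boxes  # counterpart of List.fill(boxes) { 2 }
    pi = []
    held = 0  # digits that a later carry may still change
    for i in range(n):
        carried_over = 0
        total = 0
        # Hot loop: one read and one write of reminders[j] per step,
        # executed boxes times for every output digit.
        for j in range(boxes - 1, -1, -1):
            total = reminders[j] * 10 + carried_over
            denominator = 2 * j + 1
            reminders[j] = total % denominator
            carried_over = (total // denominator) * j
        reminders[0] = total % 10
        digit = total // 10
        if digit == 9:
            held += 1
        elif digit == 10:
            # Propagate the carry back through the held digits.
            digit = 0
            for k in range(1, held + 1):
                pi[i - k] = 0 if pi[i - k] == 9 else pi[i - k] + 1
            held = 1
        else:
            held = 1
        pi.append(digit)
    return pi


if __name__ == "__main__":
    print("".join(str(d) for d in pi_spigot(20)))
```

In a loop of this shape the inner body runs roughly n * boxes times, so any boxing on the reminders[j] read and write is paid on nearly every operation, while pi is only touched once per digit.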
After Optimizations 1-4
Follow-up change:
- propagate inferred lambda return class into bytecode compilation
- infer List.fill(...) element type from the fill lambda
- lower reminders[j] reads and writes to GET_INDEX_INT and SET_INDEX_INT
- add primitive-backed ObjList storage for all-int lists
- lower List.fill(Int) { Int } to LIST_FILL_INT
- stop boxing the integer index inside GET_INDEX_INT / SET_INDEX_INT
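The relationship between element-type inference and opcode selection described in the list above can be modelled with a small toy in Python. This is not the Lyng compiler's code: the classes ListLiteral and ListFill, the handle_fill switch, and the untyped fallback names GET_INDEX and SET_INDEX are all hypothetical, introduced only to show why a missing inference rule for List.fill kept reminders on the boxed path.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ListLiteral:
    """Toy initializer: a list literal with constant elements."""
    elements: Tuple[object, ...]


@dataclass(frozen=True)
class ListFill:
    """Toy initializer: List.fill(size) { ... } with the lambda's inferred return class."""
    size: int
    lambda_return_type: Optional[str]


def infer_element_type(init, handle_fill):
    """Return "Int" when the element type is statically known to be Int, else None."""
    if isinstance(init, ListLiteral):
        all_int = bool(init.elements) and all(type(e) is int for e in init.elements)
        return "Int" if all_int else None
    if isinstance(init, ListFill) and handle_fill:
        # Follow-up rule: take the element type from the fill lambda.
        return init.lambda_return_type
    return None  # unknown element type


def select_index_ops(element_type):
    """Pick typed ops for known List<Int>; fallback names are assumed."""
    if element_type == "Int":
        return ("GET_INDEX_INT", "SET_INDEX_INT")
    return ("GET_INDEX", "SET_INDEX")


if __name__ == "__main__":
    reminders_init = ListFill(size=10, lambda_return_type="Int")
    for handle_fill in (False, True):
        ops = select_index_ops(infer_element_type(reminders_init, handle_fill))
        print(f"handle_fill={handle_fill}: {ops}")
```

With handle_fill switched off the toy reproduces the baseline situation (literal-initialized lists get typed ops, fill-initialized lists do not); switching it on corresponds to the follow-up change.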
Verification:
- piSpigot disassembly now contains typed ops for reminders, for example:
  - GET_INDEX_INT s5(reminders), s10(j), ...
  - SET_INDEX_INT s5(reminders), s10(j), ...
Post-change measurements using jlyng:
- Full script: 655.819559 ms
- Warm 5-run total: 1.430945810 s
- Warm average per run: about 286.2 ms
Observed improvement vs baseline:
- Full script: about 1.96x faster (1.287 s -> 0.656 s)
- Warm function: about 3.74x faster (1071.6 ms -> 286.2 ms)
Residual gap vs Python baseline:
- Full script: Lyng JVM is still about 3.9x slower than Python (655.8 ms vs 167 ms)
- Warm function: Lyng JVM is still about 2.3x slower than Python (286.2 ms vs 126.126 ms)
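The ratios quoted in this note follow directly from the recorded timings. The short script below recomputes them as a cross-check; the variable names are chosen here for illustration, and the only inputs are the measurements listed above.

```python
# Recorded timings from this note, all converted to milliseconds.
python_full = 167.0
python_warm = 126.126
lyng_full_before = 1287.097604
lyng_warm_before = 1071.6
lyng_full_after = 655.819559
lyng_warm_total_after = 1430.945810  # total over 5 warm runs

lyng_warm_after = lyng_warm_total_after / 5  # about 286.2 ms per run

ratios = {
    "baseline_full": lyng_full_before / python_full,         # about 7.7x
    "baseline_warm": lyng_warm_before / python_warm,         # about 8.5x
    "speedup_full": lyng_full_before / lyng_full_after,      # about 1.96x
    "speedup_warm": lyng_warm_before / lyng_warm_after,      # about 3.74x
    "residual_full": lyng_full_after / python_full,          # about 3.9x
    "residual_warm": lyng_warm_after / python_warm,          # about 2.3x
}

if __name__ == "__main__":
    for name, value in ratios.items():
        print(f"{name}: {value:.2f}x")
```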
Current benchmark-test snapshot (n=200, JVM test harness):
- optimized-int-division-rval-off: 135 ms
- optimized-int-division-rval-on: 125 ms
- piSpigot bytecode now contains:
  - LIST_FILL_INT for both pi and reminders
  - GET_INDEX_INT / SET_INDEX_INT for the hot indexed loop