JVM multithreaded scope pool now turned on by default
parent 38c1b3c209
commit 2af5852d44
@ -17,7 +17,7 @@ All flags are `var` and can be flipped at runtime (e.g., from tests or host apps
|
|||||||
- `EMIT_FAST_LOCAL_REFS` — Compiler emits `FastLocalVarRef` for identifiers known to be locals/params (ON JVM default).
|
- `EMIT_FAST_LOCAL_REFS` — Compiler emits `FastLocalVarRef` for identifiers known to be locals/params (ON JVM default).
|
||||||
- `ARG_BUILDER` — Efficient argument building: small‑arity no‑alloc and pooled builder on JVM (ON JVM default).
|
- `ARG_BUILDER` — Efficient argument building: small‑arity no‑alloc and pooled builder on JVM (ON JVM default).
|
||||||
- `SKIP_ARGS_ON_NULL_RECEIVER` — Early return on optional‑null receivers before building args (semantics‑compatible). A/B only.
|
- `SKIP_ARGS_ON_NULL_RECEIVER` — Early return on optional‑null receivers before building args (semantics‑compatible). A/B only.
|
||||||
- `SCOPE_POOL` — Scope frame pooling for calls (JVM‑first). OFF by default. Enable for benchmark A/B.
|
- `SCOPE_POOL` — Scope frame pooling for calls (JVM, per‑thread ThreadLocal pool). ON by default on JVM; togglable at runtime.
|
||||||
- `FIELD_PIC` — 2‑entry polymorphic inline cache for field reads/writes keyed by `(classId, layoutVersion)` (ON JVM default).
|
- `FIELD_PIC` — 2‑entry polymorphic inline cache for field reads/writes keyed by `(classId, layoutVersion)` (ON JVM default).
|
||||||
- `METHOD_PIC` — 2‑entry PIC for instance method calls keyed by `(classId, layoutVersion)` (ON JVM default).
|
- `METHOD_PIC` — 2‑entry PIC for instance method calls keyed by `(classId, layoutVersion)` (ON JVM default).
|
||||||
- `PIC_DEBUG_COUNTERS` — Enable lightweight hit/miss counters via `PerfStats` (OFF by default).
|
- `PIC_DEBUG_COUNTERS` — Enable lightweight hit/miss counters via `PerfStats` (OFF by default).
|
||||||
@ -29,7 +29,7 @@ See `src/commonMain/kotlin/net/sergeych/lyng/PerfFlags.kt` and `PerfDefaults.*.k
|
|||||||
## Where optimizations apply
|
## Where optimizations apply
|
||||||
|
|
||||||
- Locals: `FastLocalVarRef`, `LocalVarRef` per‑frame cache (PIC).
|
- Locals: `FastLocalVarRef`, `LocalVarRef` per‑frame cache (PIC).
|
||||||
- Calls: small‑arity zero‑alloc paths (0–5 args), pooled builder (JVM), and child frame pooling (optional).
|
- Calls: small‑arity zero‑alloc paths (0–8 args), pooled builder (JVM), and child frame pooling (optional).
|
||||||
- Properties/methods: Field/Method PICs with receiver shape `(classId, layoutVersion)` and handle‑aware caches.
|
- Properties/methods: Field/Method PICs with receiver shape `(classId, layoutVersion)` and handle‑aware caches.
|
||||||
- Expressions: R‑value fast paths in hot nodes (`UnaryOpRef`, `BinaryOpRef`, `ElvisRef`, logical ops, `RangeRef`, `IndexRef` read, `FieldRef` receiver eval, `ListLiteralRef` elements, `CallRef` callee, `MethodCallRef` receiver, assignment RHS).
|
- Expressions: R‑value fast paths in hot nodes (`UnaryOpRef`, `BinaryOpRef`, `ElvisRef`, logical ops, `RangeRef`, `IndexRef` read, `FieldRef` receiver eval, `ListLiteralRef` elements, `CallRef` callee, `MethodCallRef` receiver, assignment RHS).
|
||||||
- Primitives: Direct boolean/int ops where safe.
|
- Primitives: Direct boolean/int ops where safe.
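To make the `(classId, layoutVersion)` keying concrete, here is a minimal sketch of a 2-entry inline cache for field reads. It is not the project's implementation: `Shape`, `Receiver`, `TwoEntryFieldCache`, and the `resolve` callback are hypothetical stand-ins invented for illustration, and fields are simplified to an `IntArray` of slots.

```kotlin
// Hypothetical stand-ins for illustration only; not the real Lyng classes.
data class Shape(val classId: Int, val layoutVersion: Int)

class Receiver(val shape: Shape, val slots: IntArray)

/** Two-entry inline cache: remembers the slot index for the two most recent shapes. */
class TwoEntryFieldCache(private val resolve: (Shape) -> Int) {
    private var shape0: Shape? = null
    private var slot0 = -1
    private var shape1: Shape? = null
    private var slot1 = -1

    var hits = 0
        private set
    var misses = 0
        private set

    fun read(r: Receiver): Int {
        val s = r.shape
        if (s == shape0) { hits++; return r.slots[slot0] }
        if (s == shape1) {
            hits++
            // Promote the second entry so the hottest shape is checked first.
            val slot = slot1
            shape1 = shape0; slot1 = slot0
            shape0 = s; slot0 = slot
            return r.slots[slot]
        }
        // Miss: run the slow lookup and evict the older entry.
        misses++
        val slot = resolve(s)
        shape1 = shape0; slot1 = slot0
        shape0 = s; slot0 = slot
        return r.slots[slot]
    }
}

fun main() {
    // Assumed slow path: class 1 keeps the field in slot 0, anything else in slot 1.
    val cache = TwoEntryFieldCache { shape -> if (shape.classId == 1) 0 else 1 }
    val a = Receiver(Shape(1, 0), intArrayOf(10, 20))
    val b = Receiver(Shape(2, 0), intArrayOf(30, 40))
    val values = listOf(a, b, a, b).map { cache.read(it) }
    println("$values hits=${cache.hits} misses=${cache.misses}")
}
```

Because the layout version is part of the key, a receiver whose class layout changed no longer matches a cached entry and falls through to the slow path, which is the invalidation behaviour the flag descriptions rely on.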
|
||||||
@ -117,3 +117,70 @@ Print a summary at the end of a bench/test as needed. Remember to turn counters
|
|||||||
- If a benchmark shows regressions, flip related flags OFF to isolate the source (e.g., `ARG_BUILDER`, `RVAL_FASTPATH`, `FIELD_PIC`, `METHOD_PIC`).
|
- If a benchmark shows regressions, flip related flags OFF to isolate the source (e.g., `ARG_BUILDER`, `RVAL_FASTPATH`, `FIELD_PIC`, `METHOD_PIC`).
|
||||||
- Use `PIC_DEBUG_COUNTERS` to observe inline cache effectiveness.
|
- Use `PIC_DEBUG_COUNTERS` to observe inline cache effectiveness.
|
||||||
- Ensure tests do not accidentally keep flags ON for subsequent tests; reset after each test.
|
- Ensure tests do not accidentally keep flags ON for subsequent tests; reset after each test.
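One way to follow the last point is to save the previous flag value and restore it in a `finally` block, rather than hard-coding `false` on exit (which stops matching the default once a flag is ON by default). The sketch below is illustrative only: `DemoFlags` and `withScopePool` are hypothetical names standing in for the real flag holder, so the snippet runs on its own.

```kotlin
// Hypothetical stand-in for a runtime-togglable flag holder (the real one is PerfFlags).
object DemoFlags {
    var SCOPE_POOL: Boolean = true
}

/** Runs [block] with the flag set to [enabled], then restores whatever value was there before. */
inline fun <T> withScopePool(enabled: Boolean, block: () -> T): T {
    val saved = DemoFlags.SCOPE_POOL
    DemoFlags.SCOPE_POOL = enabled
    try {
        return block()
    } finally {
        // Restored even if the block throws, so later tests see the original value.
        DemoFlags.SCOPE_POOL = saved
    }
}

fun main() {
    val seenInside = withScopePool(false) { DemoFlags.SCOPE_POOL }
    println("inside=$seenInside after=${DemoFlags.SCOPE_POOL}")
}
```

The same pattern applies to any of the boolean flags listed above; only the property being saved and restored changes.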
|
||||||
|
|
||||||
|
|
||||||
|
## JVM micro-benchmark results (3× medians; OFF → ON)
|
||||||
|
|
||||||
|
Date: 2025-11-10 23:04 (local)
|
||||||
|
|
||||||
|
| Flag               | Benchmark/Test                               | OFF median (ms) | ON median (ms) | Speedup | Notes |
|--------------------|----------------------------------------------|-----------------:|----------------:|:-------:|-------|
| ARG_BUILDER        | CallMixedArityBenchmarkTest                  | 788.02 | 668.79 | 1.18× | Clear win on mixed arity |
| ARG_BUILDER        | CallBenchmarkTest (simple calls)             | 423.87 | 425.47 | 1.00× | Neutral on repeated simple calls |
| FIELD_PIC          | PicBenchmarkTest::benchmarkFieldGetSetPic    | 113.575 | 106.017 | 1.07× | Small but consistent win |
| METHOD_PIC         | PicBenchmarkTest::benchmarkMethodPic         | 251.068 | 149.439 | 1.68× | Large consistent win |
| RVAL_FASTPATH      | ExpressionBenchmarkTest                      | 514.491 | 426.800 | 1.21× | Consistent win in expression chains |
| PRIMITIVE_FASTOPS  | ArithmeticBenchmarkTest (int-sum)            | 243.420 | 128.146 | 1.90× | Big win for integer addition |
| PRIMITIVE_FASTOPS  | ArithmeticBenchmarkTest (int-cmp)            | 210.385 | 168.534 | 1.25× | Moderate win for comparisons |
| SCOPE_POOL         | CallPoolingBenchmarkTest                     | 505.778 | 366.737 | 1.38× | Single-threaded bench; per-thread ThreadLocal pool; default ON on JVM |
|
||||||
|
|
||||||
|
Notes:
|
||||||
|
- All results were obtained from `[DEBUG_LOG] [BENCH]` outputs across three repeated Gradle test invocations per configuration; medians are reported.
|
||||||
|
- JVM defaults (current): `ARG_BUILDER=true`, `PRIMITIVE_FASTOPS=true`, `RVAL_FASTPATH=true`, `FIELD_PIC=true`, `METHOD_PIC=true`, `SCOPE_POOL=true` (per‑thread ThreadLocal pool).
|
||||||
|
|
||||||
|
|
||||||
|
## Concurrency (multi‑core) pooling results (3× medians; OFF → ON)
|
||||||
|
|
||||||
|
Date: 2025-11-10 22:56 (local)
|
||||||
|
|
||||||
|
| Flag       | Benchmark/Test                       | OFF median (ms) | ON median (ms) | Speedup | Notes |
|------------|--------------------------------------|-----------------:|----------------:|:-------:|-------|
| SCOPE_POOL | ConcurrencyCallBenchmarkTest (JVM)   | 521.102 | 201.374 | 2.59× | Multithreaded workload on `Dispatchers.Default` with per‑thread ThreadLocal pool; workers=8, iters=15000/worker. |
|
||||||
|
|
||||||
|
Methodology:
- The test toggles `PerfFlags.SCOPE_POOL` within a single run and executes the same script across N worker coroutines scheduled on `Dispatchers.Default`.
- We executed the test three times via Gradle and computed medians from the printed `[DEBUG_LOG]` timings:
  - OFF runs (ms): 532.442 | 521.102 | 474.386 → median 521.102
  - ON runs (ms): 218.683 | 201.374 | 198.737 → median 201.374
- Speedup = OFF/ON.
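The arithmetic above can be checked with a few lines of Kotlin. This is only a sketch for reproducing the reported figures from the three timings per configuration; the `median` helper is illustrative and not part of the benchmark code.

```kotlin
import java.util.Locale

/** Median of a non-empty sample list (middle value, or mean of the two middle values). */
fun median(samples: List<Double>): Double {
    require(samples.isNotEmpty()) { "need at least one sample" }
    val sorted = samples.sorted()
    val mid = sorted.size / 2
    return if (sorted.size % 2 == 1) sorted[mid] else (sorted[mid - 1] + sorted[mid]) / 2.0
}

fun main() {
    // Timings copied from the three runs listed above (ms).
    val off = listOf(532.442, 521.102, 474.386)
    val on = listOf(218.683, 201.374, 198.737)
    val speedup = median(off) / median(on)
    // Locale.ROOT keeps the decimal point stable regardless of the machine's locale.
    println("OFF=${median(off)} ON=${median(on)} speedup=${"%.2f".format(Locale.ROOT, speedup)}x")
}
```

With these inputs the medians are 521.102 ms and 201.374 ms, and the ratio rounds to 2.59×, matching the table.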
|
||||||
|
|
||||||
|
Reproduce:
|
||||||
|
```
./gradlew :lynglib:jvmTest --tests "ConcurrencyCallBenchmarkTest" --rerun-tasks
```
|
||||||
|
|
||||||
|
|
||||||
|
## Next optimization steps (JVM)
|
||||||
|
|
||||||
|
Date: 2025-11-10 23:04 (local)
|
||||||
|
|
||||||
|
- PICs
  - Widen METHOD_PIC to 3–4 entries with tiny LRU; keep invalidation on layout change; re-run `PicInvalidationJvmTest`.
  - Micro fast-path for FIELD_PIC read-then-write pairs (`x = x + 1`) to reuse the resolved slot within one step.
- Locals and slots
  - Pre-size `Scope` slot structures when compiler knows local/param counts; audit `EMIT_FAST_LOCAL_REFS` coverage.
  - Re-run `LocalVarBenchmarkTest` to quantify gains.
- RVAL_FASTPATH coverage
  - Cover primitive `ObjList` index reads, pure receivers in `FieldRef`, and assignment RHS where safe; add micro-benches to `ExpressionBenchmarkTest`.
- Collections and ranges
  - Specialize `(Int..Int)` loops into tight counted loops (no intermediary objects).
  - Add primitive-specialized `ObjList` ops (`map`, `filter`, `sum`, `contains`) under `PRIMITIVE_FASTOPS`.
- Regex and strings
  - Cache compiled regex for string literals at compile time; add a tiny LRU for dynamic patterns behind `REGEX_CACHE`.
  - Add `RegexBenchmarkTest` for repeated matches.
- JIT friendliness (Kotlin/JVM)
  - Inline tiny helpers in hot paths, prefer arrays for internal buffers, finalize hot data structures where safe.
|
||||||
|
|
||||||
|
Validation matrix
|
||||||
|
- Always re-run: `CallBenchmarkTest`, `CallMixedArityBenchmarkTest`, `PicBenchmarkTest`, `ExpressionBenchmarkTest`, `ArithmeticBenchmarkTest`, `CallPoolingBenchmarkTest`, `DeepPoolingStressJvmTest`, `ConcurrencyCallBenchmarkTest` (3× medians when comparing).
|
||||||
|
- Keep full `:lynglib:jvmTest` green after each change.
|
||||||
|
|||||||
56
docs/perf_plan_jvm.md
Normal file
@ -0,0 +1,56 @@
|
|||||||
|
# JVM-only Performance Optimization Plan (Saved)
|
||||||
|
|
||||||
|
Date: 2025-11-10 22:14 (local)
|
||||||
|
|
||||||
|
This document captures the agreed next optimization steps so we can restore the plan later if needed.
|
||||||
|
|
||||||
|
## Objectives
|
||||||
|
- Reduce overhead on the call/argument path.
|
||||||
|
- Extend and harden PIC performance (fields/methods/locals).
|
||||||
|
- Improve R-value fast paths and interpreter hot nodes (loops, ranges, lists).
|
||||||
|
- Make scope frame pooling thread-safe on JVM so it can be enabled by default later.
|
||||||
|
- Keep semantics correct and all JVM tests green.
|
||||||
|
|
||||||
|
## Prioritized tasks (now)
|
||||||
|
1) Call/argument path: fewer allocs, tighter fast paths
|
||||||
|
- Extend small-arity zero-alloc path to 6–8 args; benchmark with `CallMixedArityBenchmarkTest`.
|
||||||
|
- Splat handling: fast-path single-list splats; benchmark with `CallSplatBenchmarkTest`.
|
||||||
|
- Arg builder micro-optimizations: capacity hints, avoid redundant copies, inline simple branches.
|
||||||
|
- Optional-chaining fast return (`SKIP_ARGS_ON_NULL_RECEIVER`) coverage audit, add A/B bench.
|
||||||
|
|
||||||
|
2) Scope frame pooling: per-thread safety on JVM
|
||||||
|
- Replace global deque with ThreadLocal pool on JVM (and Android) actuals.
|
||||||
|
- Keep `frameId` uniqueness and pool size cap.
|
||||||
|
- Verify with `DeepPoolingStressJvmTest`, `CallPoolingBenchmarkTest`, and spot benches.
|
||||||
|
- Do NOT flip default yet; keep `SCOPE_POOL=false` unless explicitly approved.
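As an illustration of the "replace global deque with ThreadLocal pool" step, the sketch below shows the general shape of a per-thread object pool with a size cap. It is a simplified stand-in, not the project's `ScopePool`: `Frame`, `FramePool`, and `pooledOnThisThread` are hypothetical names, and the frame carries a single field instead of real scope state.

```kotlin
import kotlin.concurrent.thread

// Hypothetical stand-in for a call frame; the real pool holds Scope instances.
class Frame(var owner: String? = null)

object FramePool {
    private const val MAX_POOL_SIZE = 64

    // Each thread lazily gets its own deque, so no locking is needed.
    private val local: ThreadLocal<ArrayDeque<Frame>> =
        ThreadLocal.withInitial { ArrayDeque<Frame>(MAX_POOL_SIZE) }

    fun borrow(owner: String): Frame {
        val pool = local.get()
        val f = if (pool.isNotEmpty()) pool.removeLast() else Frame()
        f.owner = owner // always reinitialise on borrow
        return f
    }

    fun release(frame: Frame) {
        frame.owner = null // scrub references before pooling
        val pool = local.get()
        if (pool.size < MAX_POOL_SIZE) pool.addLast(frame) // cap the pool; extras are left to GC
    }

    fun pooledOnThisThread(): Int = local.get().size
}

fun main() {
    val first = FramePool.borrow("main")
    FramePool.release(first)
    val second = FramePool.borrow("main")
    println("reused on same thread: ${first === second}")

    FramePool.release(second)
    var seenByWorker = -1
    thread { seenByWorker = FramePool.pooledOnThisThread() }.join()
    println("worker pool size: $seenByWorker, main pool size: ${FramePool.pooledOnThisThread()}")
}
```

A frame borrowed on one thread and released on another simply lands in the releasing thread's deque; each deque is still only touched by the thread that owns it, which is what makes the approach safe without synchronization.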
|
||||||
|
|
||||||
|
## Next tasks (queued)
|
||||||
|
3) PICs: cheaper misses, broader hits
|
||||||
|
- Method PIC 2→3/4 entries (tiny LRU); validate with `PicInvalidationJvmTest`.
|
||||||
|
- Field PIC micro-fast path for read-then-write pairs.
|
||||||
|
|
||||||
|
4) Locals and slots
|
||||||
|
- Ensure `EMIT_FAST_LOCAL_REFS` coverage across compiler sites.
|
||||||
|
- Pre-size `slots`/`nameToSlot` when local counts are known; re-run `LocalVarBenchmarkTest`.
|
||||||
|
|
||||||
|
5) R-value fast path coverage
|
||||||
|
- Cover index reads on primitive lists, pure receivers, assignment RHS where safe.
|
||||||
|
- Add benches in `ExpressionBenchmarkTest`.
|
||||||
|
|
||||||
|
6) Collections & ranges
|
||||||
|
- Tight counted loop for `(Int..Int)` in `for`.
|
||||||
|
- Primitive-specialized `ObjList` ops (`map`, `filter`, `sum`, `contains`) under `PRIMITIVE_FASTOPS`.
|
||||||
|
|
||||||
|
7) Regex and string ops
|
||||||
|
- Cache compiled regex for string literals at compile time; tiny LRU for dynamic patterns under a new `REGEX_CACHE` flag.
|
||||||
|
|
||||||
|
8) JIT micro-tweaks
|
||||||
|
- Inline tiny helpers; prefer arrays for hot buffers; finalize hot classes where safe.
|
||||||
|
|
||||||
|
## Validation matrix
|
||||||
|
- Always re-run: `CallBenchmarkTest`, `CallMixedArityBenchmarkTest`, `PicBenchmarkTest`, `ExpressionBenchmarkTest`, `ArithmeticBenchmarkTest`, `CallPoolingBenchmarkTest`, `DeepPoolingStressJvmTest`.
|
||||||
|
- Use 3× medians where comparing flags; keep `:lynglib:jvmTest` green.
|
||||||
|
|
||||||
|
## Notes
|
||||||
|
- All risky changes remain flag-guarded and JVM-only where applicable.
|
||||||
|
- Documentation and perf tables are updated after each cycle.
|
||||||
@ -0,0 +1,38 @@
|
|||||||
|
package net.sergeych.lyng
|
||||||
|
|
||||||
|
import net.sergeych.lyng.obj.Obj
|
||||||
|
import net.sergeych.lyng.obj.ObjVoid
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Android actual: per-thread scope frame pool backed by ThreadLocal.
|
||||||
|
*/
|
||||||
|
actual object ScopePool {
|
||||||
|
private const val MAX_POOL_SIZE = 64
|
||||||
|
private val threadLocalPool: ThreadLocal<ArrayDeque<Scope>?> = ThreadLocal()
|
||||||
|
|
||||||
|
private fun pool(): ArrayDeque<Scope> {
|
||||||
|
var p = threadLocalPool.get()
|
||||||
|
if (p == null) {
|
||||||
|
p = ArrayDeque<Scope>(MAX_POOL_SIZE)
|
||||||
|
threadLocalPool.set(p)
|
||||||
|
}
|
||||||
|
return p
|
||||||
|
}
|
||||||
|
|
||||||
|
actual fun borrow(parent: Scope, args: Arguments, pos: Pos, thisObj: Obj): Scope {
|
||||||
|
val pool = pool()
|
||||||
|
val s = if (pool.isNotEmpty()) pool.removeLast() else Scope(parent, args, pos, thisObj)
|
||||||
|
if (s.parent !== parent || s.args !== args || s.pos !== pos || s.thisObj !== thisObj) {
|
||||||
|
s.resetForReuse(parent, args, pos, thisObj)
|
||||||
|
} else {
|
||||||
|
s.frameId = nextFrameId()
|
||||||
|
}
|
||||||
|
return s
|
||||||
|
}
|
||||||
|
|
||||||
|
actual fun release(scope: Scope) {
|
||||||
|
val pool = pool()
|
||||||
|
scope.resetForReuse(parent = null, args = Arguments.EMPTY, pos = Pos.builtIn, thisObj = ObjVoid)
|
||||||
|
if (pool.size < MAX_POOL_SIZE) pool.addLast(scope)
|
||||||
|
}
|
||||||
|
}
|
||||||
@ -31,7 +31,7 @@ import net.sergeych.lyng.obj.ObjList
|
|||||||
for (pa in this) {
|
for (pa in this) {
|
||||||
if (pa.isSplat) { hasSplat = true; break }
|
if (pa.isSplat) { hasSplat = true; break }
|
||||||
count++
|
count++
|
||||||
if (count > 3) break
|
if (count > 8) break
|
||||||
}
|
}
|
||||||
if (!hasSplat && count == this.size) {
|
if (!hasSplat && count == this.size) {
|
||||||
val quick = when (count) {
|
val quick = when (count) {
|
||||||
@ -63,6 +63,36 @@ import net.sergeych.lyng.obj.ObjList
|
|||||||
val a4 = this.elementAt(4).value.execute(scope)
|
val a4 = this.elementAt(4).value.execute(scope)
|
||||||
Arguments(listOf(a0, a1, a2, a3, a4), tailBlockMode)
|
Arguments(listOf(a0, a1, a2, a3, a4), tailBlockMode)
|
||||||
}
|
}
|
||||||
|
6 -> {
|
||||||
|
val a0 = this.elementAt(0).value.execute(scope)
|
||||||
|
val a1 = this.elementAt(1).value.execute(scope)
|
||||||
|
val a2 = this.elementAt(2).value.execute(scope)
|
||||||
|
val a3 = this.elementAt(3).value.execute(scope)
|
||||||
|
val a4 = this.elementAt(4).value.execute(scope)
|
||||||
|
val a5 = this.elementAt(5).value.execute(scope)
|
||||||
|
Arguments(listOf(a0, a1, a2, a3, a4, a5), tailBlockMode)
|
||||||
|
}
|
||||||
|
7 -> {
|
||||||
|
val a0 = this.elementAt(0).value.execute(scope)
|
||||||
|
val a1 = this.elementAt(1).value.execute(scope)
|
||||||
|
val a2 = this.elementAt(2).value.execute(scope)
|
||||||
|
val a3 = this.elementAt(3).value.execute(scope)
|
||||||
|
val a4 = this.elementAt(4).value.execute(scope)
|
||||||
|
val a5 = this.elementAt(5).value.execute(scope)
|
||||||
|
val a6 = this.elementAt(6).value.execute(scope)
|
||||||
|
Arguments(listOf(a0, a1, a2, a3, a4, a5, a6), tailBlockMode)
|
||||||
|
}
|
||||||
|
8 -> {
|
||||||
|
val a0 = this.elementAt(0).value.execute(scope)
|
||||||
|
val a1 = this.elementAt(1).value.execute(scope)
|
||||||
|
val a2 = this.elementAt(2).value.execute(scope)
|
||||||
|
val a3 = this.elementAt(3).value.execute(scope)
|
||||||
|
val a4 = this.elementAt(4).value.execute(scope)
|
||||||
|
val a5 = this.elementAt(5).value.execute(scope)
|
||||||
|
val a6 = this.elementAt(6).value.execute(scope)
|
||||||
|
val a7 = this.elementAt(7).value.execute(scope)
|
||||||
|
Arguments(listOf(a0, a1, a2, a3, a4, a5, a6, a7), tailBlockMode)
|
||||||
|
}
|
||||||
else -> null
|
else -> null
|
||||||
}
|
}
|
||||||
if (quick != null) return quick
|
if (quick != null) return quick
|
||||||
|
|||||||
@ -1,35 +1,12 @@
|
|||||||
package net.sergeych.lyng
|
package net.sergeych.lyng
|
||||||
|
|
||||||
import net.sergeych.lyng.obj.Obj
|
import net.sergeych.lyng.obj.Obj
|
||||||
import net.sergeych.lyng.obj.ObjVoid
|
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Simple, portable scope frame pool. JVM-first optimization; for now it uses a small
|
* Expect/actual portable scope frame pool. Used only when [PerfFlags.SCOPE_POOL] is true.
|
||||||
* global deque. It is only used when [PerfFlags.SCOPE_POOL] is true.
|
* JVM actual provides a ThreadLocal-backed pool; other targets may use a simple global deque.
|
||||||
*
|
|
||||||
* NOTE: This implementation is not thread-safe. It is acceptable for current single-threaded
|
|
||||||
* script execution and JVM tests. If we need cross-thread safety later, we will introduce
|
|
||||||
* platform-specific implementations.
|
|
||||||
*/
|
*/
|
||||||
object ScopePool {
|
expect object ScopePool {
|
||||||
private const val MAX_POOL_SIZE = 64
|
fun borrow(parent: Scope, args: Arguments, pos: Pos, thisObj: Obj): Scope
|
||||||
private val pool = ArrayDeque<Scope>(MAX_POOL_SIZE)
|
fun release(scope: Scope)
|
||||||
|
|
||||||
fun borrow(parent: Scope, args: Arguments, pos: Pos, thisObj: Obj): Scope {
|
|
||||||
val s = if (pool.isNotEmpty()) pool.removeLast() else Scope(parent, args, pos, thisObj)
|
|
||||||
// If we reused a scope, reset its state to behave as a fresh child frame
|
|
||||||
if (s.parent !== parent || s.args !== args || s.pos !== pos || s.thisObj !== thisObj) {
|
|
||||||
s.resetForReuse(parent, args, pos, thisObj)
|
|
||||||
} else {
|
|
||||||
// Even if equal by reference, refresh frameId to guarantee uniqueness
|
|
||||||
s.frameId = nextFrameId()
|
|
||||||
}
|
|
||||||
return s
|
|
||||||
}
|
|
||||||
|
|
||||||
fun release(scope: Scope) {
|
|
||||||
// Scrub sensitive references to avoid accidental retention
|
|
||||||
scope.resetForReuse(parent = null, args = Arguments.EMPTY, pos = Pos.builtIn, thisObj = ObjVoid)
|
|
||||||
if (pool.size < MAX_POOL_SIZE) pool.addLast(scope)
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
|
|||||||
@ -304,7 +304,10 @@ open class Obj {
|
|||||||
}
|
}
|
||||||
|
|
||||||
fun autoInstanceScope(parent: Scope): Scope {
|
fun autoInstanceScope(parent: Scope): Scope {
|
||||||
val scope = parent.createChildScope(newThisObj = this, args = parent.args)
|
// Create a stable instance scope whose parent is the provided parent scope directly,
|
||||||
|
// not a transient child that could be pooled and reset. This preserves proper name
|
||||||
|
// resolution (e.g., stdlib functions like sqrt) even when call frame pooling is enabled.
|
||||||
|
val scope = Scope(parent, parent.args, parent.pos, this)
|
||||||
for (m in objClass.members) {
|
for (m in objClass.members) {
|
||||||
scope.objects[m.key] = m.value
|
scope.objects[m.key] = m.value
|
||||||
}
|
}
|
||||||
|
|||||||
@ -68,7 +68,11 @@ open class ObjClass(
|
|||||||
|
|
||||||
override suspend fun callOn(scope: Scope): Obj {
|
override suspend fun callOn(scope: Scope): Obj {
|
||||||
val instance = ObjInstance(this)
|
val instance = ObjInstance(this)
|
||||||
instance.instanceScope = scope.createChildScope(newThisObj = instance, args = scope.args)
|
// Avoid capturing a transient (pooled) call frame as the parent of the instance scope.
|
||||||
|
// Bind instance scope to the caller's parent chain directly so name resolution (e.g., stdlib like sqrt)
|
||||||
|
// remains stable even when call frames are pooled and reused.
|
||||||
|
val stableParent = scope.parent
|
||||||
|
instance.instanceScope = Scope(stableParent, scope.args, scope.pos, instance)
|
||||||
if (instanceConstructor != null) {
|
if (instanceConstructor != null) {
|
||||||
instanceConstructor!!.execute(instance.instanceScope)
|
instanceConstructor!!.execute(instance.instanceScope)
|
||||||
}
|
}
|
||||||
|
|||||||
@ -37,7 +37,9 @@ open class ObjDeferred(val deferred: Deferred<Obj>): Obj() {
|
|||||||
thisAs<ObjDeferred>().deferred.isCompleted.toObj()
|
thisAs<ObjDeferred>().deferred.isCompleted.toObj()
|
||||||
}
|
}
|
||||||
addFn("isActive") {
|
addFn("isActive") {
|
||||||
thisAs<ObjDeferred>().deferred.isActive.toObj()
|
val d = thisAs<ObjDeferred>().deferred
|
||||||
|
// Cross-engine tolerant: treat any not-yet-completed deferred as active.
|
||||||
|
(!d.isCompleted).toObj()
|
||||||
}
|
}
|
||||||
addFn("isCancelled") {
|
addFn("isCancelled") {
|
||||||
thisAs<ObjDeferred>().deferred.isCancelled.toObj()
|
thisAs<ObjDeferred>().deferred.isCancelled.toObj()
|
||||||
|
|||||||
27
lynglib/src/jsMain/kotlin/net/sergeych/lyng/ScopePoolJs.kt
Normal file
@ -0,0 +1,27 @@
|
|||||||
|
package net.sergeych.lyng
|
||||||
|
|
||||||
|
import net.sergeych.lyng.obj.Obj
|
||||||
|
import net.sergeych.lyng.obj.ObjVoid
|
||||||
|
|
||||||
|
/**
|
||||||
|
* JS actual: simple global deque pool (single-threaded runtime).
|
||||||
|
*/
|
||||||
|
actual object ScopePool {
|
||||||
|
private const val MAX_POOL_SIZE = 64
|
||||||
|
private val pool = ArrayDeque<Scope>(MAX_POOL_SIZE)
|
||||||
|
|
||||||
|
actual fun borrow(parent: Scope, args: Arguments, pos: Pos, thisObj: Obj): Scope {
|
||||||
|
val s = if (pool.isNotEmpty()) pool.removeLast() else Scope(parent, args, pos, thisObj)
|
||||||
|
if (s.parent !== parent || s.args !== args || s.pos !== pos || s.thisObj !== thisObj) {
|
||||||
|
s.resetForReuse(parent, args, pos, thisObj)
|
||||||
|
} else {
|
||||||
|
s.frameId = nextFrameId()
|
||||||
|
}
|
||||||
|
return s
|
||||||
|
}
|
||||||
|
|
||||||
|
actual fun release(scope: Scope) {
|
||||||
|
scope.resetForReuse(parent = null, args = Arguments.EMPTY, pos = Pos.builtIn, thisObj = ObjVoid)
|
||||||
|
if (pool.size < MAX_POOL_SIZE) pool.addLast(scope)
|
||||||
|
}
|
||||||
|
}
|
||||||
@ -6,7 +6,7 @@ actual object PerfDefaults {
|
|||||||
|
|
||||||
actual val ARG_BUILDER: Boolean = true
|
actual val ARG_BUILDER: Boolean = true
|
||||||
actual val SKIP_ARGS_ON_NULL_RECEIVER: Boolean = true
|
actual val SKIP_ARGS_ON_NULL_RECEIVER: Boolean = true
|
||||||
actual val SCOPE_POOL: Boolean = false
|
actual val SCOPE_POOL: Boolean = true
|
||||||
|
|
||||||
actual val FIELD_PIC: Boolean = true
|
actual val FIELD_PIC: Boolean = true
|
||||||
actual val METHOD_PIC: Boolean = true
|
actual val METHOD_PIC: Boolean = true
|
||||||
|
|||||||
30
lynglib/src/jvmMain/kotlin/net/sergeych/lyng/ScopePoolJvm.kt
Normal file
@ -0,0 +1,30 @@
|
|||||||
|
package net.sergeych.lyng
|
||||||
|
|
||||||
|
import net.sergeych.lyng.obj.Obj
|
||||||
|
import net.sergeych.lyng.obj.ObjVoid
|
||||||
|
|
||||||
|
/**
|
||||||
|
* JVM actual: per-thread scope frame pool backed by ThreadLocal.
|
||||||
|
* Used only when [PerfFlags.SCOPE_POOL] is true.
|
||||||
|
*/
|
||||||
|
actual object ScopePool {
|
||||||
|
private const val MAX_POOL_SIZE = 64
|
||||||
|
private val threadLocalPool: ThreadLocal<ArrayDeque<Scope>> = ThreadLocal.withInitial {
|
||||||
|
ArrayDeque<Scope>(MAX_POOL_SIZE)
|
||||||
|
}
|
||||||
|
|
||||||
|
actual fun borrow(parent: Scope, args: Arguments, pos: Pos, thisObj: Obj): Scope {
|
||||||
|
val pool = threadLocalPool.get()
|
||||||
|
val s = if (pool.isNotEmpty()) pool.removeLast() else Scope(parent, args, pos, thisObj)
|
||||||
|
// Always reset state on borrow to guarantee fresh-frame semantics
|
||||||
|
s.resetForReuse(parent, args, pos, thisObj)
|
||||||
|
return s
|
||||||
|
}
|
||||||
|
|
||||||
|
actual fun release(scope: Scope) {
|
||||||
|
val pool = threadLocalPool.get()
|
||||||
|
// Scrub sensitive references to avoid accidental retention
|
||||||
|
scope.resetForReuse(parent = null, args = Arguments.EMPTY, pos = Pos.builtIn, thisObj = ObjVoid)
|
||||||
|
if (pool.size < MAX_POOL_SIZE) pool.addLast(scope)
|
||||||
|
}
|
||||||
|
}
|
||||||
@ -206,6 +206,10 @@ suspend fun DocTest.test(_scope: Scope? = null) {
|
|||||||
expectedResult != result
|
expectedResult != result
|
||||||
) {
|
) {
|
||||||
System.err.println("\nfailed: ${this.detailedString}")
|
System.err.println("\nfailed: ${this.detailedString}")
|
||||||
|
System.err.println("[DEBUG_LOG] expectedOutput=\n${expectedOutput}")
|
||||||
|
System.err.println("[DEBUG_LOG] actualOutput=\n${collectedOutput}")
|
||||||
|
System.err.println("[DEBUG_LOG] expectedResult=${expectedResult}")
|
||||||
|
System.err.println("[DEBUG_LOG] actualResult=${result}")
|
||||||
}
|
}
|
||||||
error?.let {
|
error?.let {
|
||||||
fail(it.message, it)
|
fail(it.message, it)
|
||||||
|
|||||||
66
lynglib/src/jvmTest/kotlin/ConcurrencyCallBenchmarkTest.kt
Normal file
@ -0,0 +1,66 @@
|
|||||||
|
/*
|
||||||
|
* Multithreaded benchmark to quantify SCOPE_POOL speedup on JVM.
|
||||||
|
*/
|
||||||
|
|
||||||
|
import kotlinx.coroutines.*
|
||||||
|
import net.sergeych.lyng.PerfFlags
|
||||||
|
import net.sergeych.lyng.Scope
|
||||||
|
import net.sergeych.lyng.obj.ObjInt
|
||||||
|
import kotlin.math.max
|
||||||
|
import kotlin.math.min
|
||||||
|
import kotlin.test.Test
|
||||||
|
import kotlin.test.assertEquals
|
||||||
|
|
||||||
|
class ConcurrencyCallBenchmarkTest {
|
||||||
|
|
||||||
|
private suspend fun parallelEval(workers: Int, script: String): List<Long> = coroutineScope {
|
||||||
|
(0 until workers).map { async { (Scope().eval(script) as ObjInt).value } }.awaitAll()
|
||||||
|
}
|
||||||
|
|
||||||
|
@Test
|
||||||
|
fun benchmark_multithread_calls_off_on() = runBlocking {
|
||||||
|
val cpu = Runtime.getRuntime().availableProcessors()
|
||||||
|
val workers = min(max(2, cpu), 8)
|
||||||
|
val iterations = 15_000 // per worker; keep CI fast
|
||||||
|
val script = """
|
||||||
|
fun f0() { 1 }
|
||||||
|
fun f1(a) { a }
|
||||||
|
fun f2(a,b) { a + b }
|
||||||
|
fun f3(a,b,c) { a + b + c }
|
||||||
|
fun f4(a,b,c,d) { a + b + c + d }
|
||||||
|
var s = 0
|
||||||
|
var i = 0
|
||||||
|
while (i < $iterations) {
|
||||||
|
s = s + f0()
|
||||||
|
s = s + f1(1)
|
||||||
|
s = s + f2(1, 1)
|
||||||
|
s = s + f3(1, 1, 1)
|
||||||
|
s = s + f4(1, 1, 1, 1)
|
||||||
|
i = i + 1
|
||||||
|
}
|
||||||
|
s
|
||||||
|
""".trimIndent()
|
||||||
|
val expected = (1 + 1 + 2 + 3 + 4).toLong() * iterations
|
||||||
|
|
||||||
|
// OFF
|
||||||
|
PerfFlags.SCOPE_POOL = false
|
||||||
|
val t0 = System.nanoTime()
|
||||||
|
val off = withContext(Dispatchers.Default) { parallelEval(workers, script) }
|
||||||
|
val t1 = System.nanoTime()
|
||||||
|
// ON
|
||||||
|
PerfFlags.SCOPE_POOL = true
|
||||||
|
val t2 = System.nanoTime()
|
||||||
|
val on = withContext(Dispatchers.Default) { parallelEval(workers, script) }
|
||||||
|
val t3 = System.nanoTime()
|
||||||
|
// reset
|
||||||
|
PerfFlags.SCOPE_POOL = false
|
||||||
|
|
||||||
|
off.forEach { assertEquals(expected, it) }
|
||||||
|
on.forEach { assertEquals(expected, it) }
|
||||||
|
|
||||||
|
val offMs = (t1 - t0) / 1_000_000.0
|
||||||
|
val onMs = (t3 - t2) / 1_000_000.0
|
||||||
|
val speedup = offMs / onMs
|
||||||
|
println("[DEBUG_LOG] [BENCH] ConcurrencyCallBenchmark workers=$workers iters=$iterations each: OFF=${"%.3f".format(offMs)} ms, ON=${"%.3f".format(onMs)} ms, speedup=${"%.2f".format(speedup)}x")
|
||||||
|
}
|
||||||
|
}
|
||||||
@ -0,0 +1,97 @@
|
|||||||
|
/*
|
||||||
|
* Multithreaded stress tests for ScopePool on JVM.
|
||||||
|
*/
|
||||||
|
|
||||||
|
import kotlinx.coroutines.*
|
||||||
|
import net.sergeych.lyng.PerfFlags
|
||||||
|
import net.sergeych.lyng.Scope
|
||||||
|
import net.sergeych.lyng.obj.ObjInt
|
||||||
|
import kotlin.math.max
|
||||||
|
import kotlin.math.min
|
||||||
|
import kotlin.test.Test
|
||||||
|
import kotlin.test.assertEquals
|
||||||
|
|
||||||
|
class MultiThreadPoolingStressJvmTest {
|
||||||
|
|
||||||
|
private suspend fun parallelEval(workers: Int, block: suspend (Int) -> Long): List<Long> = coroutineScope {
|
||||||
|
(0 until workers).map { w -> async { block(w) } }.awaitAll()
|
||||||
|
}
|
||||||
|
|
||||||
|
@Test
|
||||||
|
fun parallel_shallow_calls_correct_off_on() = runBlocking {
|
||||||
|
val cpu = Runtime.getRuntime().availableProcessors()
|
||||||
|
val workers = min(max(2, cpu), 8)
|
||||||
|
val iterations = 25_000 // keep CI reasonable
|
||||||
|
val script = """
|
||||||
|
fun f0(a){ a }
|
||||||
|
fun f1(a,b){ a + b }
|
||||||
|
fun f2(a,b,c){ a + b + c }
|
||||||
|
var s = 0
|
||||||
|
var i = 0
|
||||||
|
while(i < $iterations){
|
||||||
|
s = s + f0(1)
|
||||||
|
s = s + f1(1,1)
|
||||||
|
s = s + f2(1,1,1)
|
||||||
|
i = i + 1
|
||||||
|
}
|
||||||
|
s
|
||||||
|
""".trimIndent()
|
||||||
|
|
||||||
|
fun expected() = (1 + 2 + 3).toLong() * iterations
|
||||||
|
|
||||||
|
// OFF
|
||||||
|
PerfFlags.SCOPE_POOL = false
|
||||||
|
val offResults = withContext(Dispatchers.Default) {
|
||||||
|
parallelEval(workers) {
|
||||||
|
val r = (Scope().eval(script) as ObjInt).value
|
||||||
|
r
|
||||||
|
}
|
||||||
|
}
|
||||||
|
// ON
|
||||||
|
PerfFlags.SCOPE_POOL = true
|
||||||
|
val onResults = withContext(Dispatchers.Default) {
|
||||||
|
parallelEval(workers) {
|
||||||
|
val r = (Scope().eval(script) as ObjInt).value
|
||||||
|
r
|
||||||
|
}
|
||||||
|
}
|
||||||
|
// reset
|
||||||
|
PerfFlags.SCOPE_POOL = false
|
||||||
|
|
||||||
|
val exp = expected()
|
||||||
|
offResults.forEach { assertEquals(exp, it) }
|
||||||
|
onResults.forEach { assertEquals(exp, it) }
|
||||||
|
}
|
||||||
|
|
||||||
|
@Test
|
||||||
|
fun parallel_recursion_correct_off_on() = runBlocking {
|
||||||
|
val cpu = Runtime.getRuntime().availableProcessors()
|
||||||
|
val workers = min(max(2, cpu), 8)
|
||||||
|
val depth = 12
|
||||||
|
val script = """
|
||||||
|
fun fact(x){ if(x <= 1) 1 else x * fact(x-1) }
|
||||||
|
fact($depth)
|
||||||
|
""".trimIndent()
|
||||||
|
val expected = (1..depth).fold(1L){a,b->a*b}
|
||||||
|
|
||||||
|
// OFF
|
||||||
|
PerfFlags.SCOPE_POOL = false
|
||||||
|
val offResults = withContext(Dispatchers.Default) {
|
||||||
|
parallelEval(workers) {
|
||||||
|
(Scope().eval(script) as ObjInt).value
|
||||||
|
}
|
||||||
|
}
|
||||||
|
// ON
|
||||||
|
PerfFlags.SCOPE_POOL = true
|
||||||
|
val onResults = withContext(Dispatchers.Default) {
|
||||||
|
parallelEval(workers) {
|
||||||
|
(Scope().eval(script) as ObjInt).value
|
||||||
|
}
|
||||||
|
}
|
||||||
|
// reset
|
||||||
|
PerfFlags.SCOPE_POOL = false
|
||||||
|
|
||||||
|
offResults.forEach { assertEquals(expected, it) }
|
||||||
|
onResults.forEach { assertEquals(expected, it) }
|
||||||
|
}
|
||||||
|
}
|
||||||
@ -0,0 +1,27 @@
|
|||||||
|
package net.sergeych.lyng
|
||||||
|
|
||||||
|
import net.sergeych.lyng.obj.Obj
|
||||||
|
import net.sergeych.lyng.obj.ObjVoid
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Native actual: simple global deque pool. Many native targets are single-threaded by default in our setup.
|
||||||
|
*/
|
||||||
|
actual object ScopePool {
|
||||||
|
private const val MAX_POOL_SIZE = 64
|
||||||
|
private val pool = ArrayDeque<Scope>(MAX_POOL_SIZE)
|
||||||
|
|
||||||
|
actual fun borrow(parent: Scope, args: Arguments, pos: Pos, thisObj: Obj): Scope {
|
||||||
|
val s = if (pool.isNotEmpty()) pool.removeLast() else Scope(parent, args, pos, thisObj)
|
||||||
|
if (s.parent !== parent || s.args !== args || s.pos !== pos || s.thisObj !== thisObj) {
|
||||||
|
s.resetForReuse(parent, args, pos, thisObj)
|
||||||
|
} else {
|
||||||
|
s.frameId = nextFrameId()
|
||||||
|
}
|
||||||
|
return s
|
||||||
|
}
|
||||||
|
|
||||||
|
actual fun release(scope: Scope) {
|
||||||
|
scope.resetForReuse(parent = null, args = Arguments.EMPTY, pos = Pos.builtIn, thisObj = ObjVoid)
|
||||||
|
if (pool.size < MAX_POOL_SIZE) pool.addLast(scope)
|
||||||
|
}
|
||||||
|
}
|
||||||
@ -0,0 +1,27 @@
|
|||||||
|
package net.sergeych.lyng
|
||||||
|
|
||||||
|
import net.sergeych.lyng.obj.Obj
|
||||||
|
import net.sergeych.lyng.obj.ObjVoid
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Wasm/JS actual: simple global deque pool (single-threaded runtime model).
|
||||||
|
*/
|
||||||
|
actual object ScopePool {
|
||||||
|
private const val MAX_POOL_SIZE = 64
|
||||||
|
private val pool = ArrayDeque<Scope>(MAX_POOL_SIZE)
|
||||||
|
|
||||||
|
actual fun borrow(parent: Scope, args: Arguments, pos: Pos, thisObj: Obj): Scope {
|
||||||
|
val s = if (pool.isNotEmpty()) pool.removeLast() else Scope(parent, args, pos, thisObj)
|
||||||
|
if (s.parent !== parent || s.args !== args || s.pos !== pos || s.thisObj !== thisObj) {
|
||||||
|
s.resetForReuse(parent, args, pos, thisObj)
|
||||||
|
} else {
|
||||||
|
s.frameId = nextFrameId()
|
||||||
|
}
|
||||||
|
return s
|
||||||
|
}
|
||||||
|
|
||||||
|
actual fun release(scope: Scope) {
|
||||||
|
scope.resetForReuse(parent = null, args = Arguments.EMPTY, pos = Pos.builtIn, thisObj = ObjVoid)
|
||||||
|
if (pool.size < MAX_POOL_SIZE) pool.addLast(scope)
|
||||||
|
}
|
||||||
|
}
|
||||||