World Creation Performance Research (Personal Worlds)
TL;DR
This document investigates personal world initialization time under different concurrency scenarios and evaluates whether protocol/message-level optimizations are worthwhile.
Key takeaways:
- Baseline: 1 world init completes in ~36s on local host.
- Baseline concurrency issue: 5 worlds initialized “at the same time” show cumulative delays (48s → 187s).
- Optimization: batching (`SetChunkVerifiers`) dramatically reduces on-chain message count and gas overhead.
- After batching: 10 concurrent world inits complete consistently in ~29–37s (avg ~35s).
1) Background and Goal
In Akkadia, “world initialization” is an end-to-end workflow triggered after CreateWorld.
It includes off-chain processing (biome/chunk pipeline) and on-chain state writes (chunk verifier registration).
The on-chain bottleneck candidate was verifier writes.
Before optimization, writes were effectively many message-level calls (SetChunkVerifier style).
After optimization, writes were grouped into batched calls (SetChunkVerifiers), with BATCH_SIZE=100 in bridge message construction.
```go
// contracts/personal_world/data.gno (excerpt)
var (
	verifierStore = avl.NewTree() // worldID(string) -> map[chunkKey]verifier
)

// SetChunkVerifier registers a single chunk verifier for a world.
func SetChunkVerifier(cur realm, worldID uint32, chunkKey string, verifier string) {
	caller := runtime.PreviousRealm().Address()
	assertIsAdminOrOperator(caller)

	worldIDStr := ufmt.Sprintf("%d", worldID)
	var chunkMap map[string]string // chunkKey -> verifier
	if value, exists := verifierStore.Get(worldIDStr); exists {
		chunkMap = value.(map[string]string)
	} else {
		chunkMap = make(map[string]string)
		verifierStore.Set(worldIDStr, chunkMap)
	}
	chunkMap[chunkKey] = verifier
}

// SetChunkVerifiers registers many verifiers in one message. chunkKeys and
// verifiers are parallel comma-separated lists with equal element counts.
func SetChunkVerifiers(cur realm, worldID uint32, chunkKeys string, verifiers string) {
	caller := runtime.PreviousRealm().Address()
	assertIsAdminOrOperator(caller)
	if chunkKeys == "" { panic("chunkKeys must not be empty") }
	if verifiers == "" { panic("verifiers must not be empty") }

	// Fetch (or create) the per-world map once, before the scan loop.
	worldIDStr := ufmt.Sprintf("%d", worldID)
	var chunkMap map[string]string // chunkKey -> verifier
	if value, exists := verifierStore.Get(worldIDStr); exists {
		chunkMap = value.(map[string]string)
	} else {
		chunkMap = make(map[string]string)
		verifierStore.Set(worldIDStr, chunkMap)
	}

	// Direct single-pass scan parsing (no strings.Split): walk both strings
	// in lockstep and slice out each key/value pair between commas.
	keyStart, valStart := 0, 0
	keyIdx, valIdx := 0, 0
	for {
		for keyIdx < len(chunkKeys) && chunkKeys[keyIdx] != ',' { keyIdx++ }
		for valIdx < len(verifiers) && verifiers[valIdx] != ',' { valIdx++ }
		key := chunkKeys[keyStart:keyIdx]
		val := verifiers[valStart:valIdx]
		if key == "" { panic("empty chunkKey not allowed") }
		if val == "" { panic("empty verifier not allowed") }
		chunkMap[key] = val

		keyEnd := keyIdx >= len(chunkKeys)
		valEnd := valIdx >= len(verifiers)
		if keyEnd != valEnd { panic("chunkKeys and verifiers count mismatch") }
		if keyEnd { break }
		keyIdx++
		valIdx++
		keyStart = keyIdx
		valStart = valIdx
	}
}
```

This report focuses on how that message strategy changes completion latency under concurrent initialization requests.
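To make the parsing strategy concrete, here is a standalone Go sketch of the same single-pass scan parser, extracted from the realm context so it can be run and tested in isolation. `parsePairs` is a hypothetical helper name; it returns errors instead of panicking, but the scan logic mirrors the contract code above.

```go
package main

import (
	"errors"
	"fmt"
)

// parsePairs walks two comma-separated strings in lockstep (no strings.Split),
// slicing each key/value pair out of the originals without building
// intermediate slices. It fails if element counts differ or an element is empty.
func parsePairs(chunkKeys, verifiers string) (map[string]string, error) {
	if chunkKeys == "" || verifiers == "" {
		return nil, errors.New("inputs must not be empty")
	}
	out := make(map[string]string)
	keyStart, valStart := 0, 0
	keyIdx, valIdx := 0, 0
	for {
		for keyIdx < len(chunkKeys) && chunkKeys[keyIdx] != ',' {
			keyIdx++
		}
		for valIdx < len(verifiers) && verifiers[valIdx] != ',' {
			valIdx++
		}
		key := chunkKeys[keyStart:keyIdx]
		val := verifiers[valStart:valIdx]
		if key == "" || val == "" {
			return nil, errors.New("empty element not allowed")
		}
		out[key] = val
		keyEnd := keyIdx >= len(chunkKeys)
		valEnd := valIdx >= len(verifiers)
		if keyEnd != valEnd {
			return nil, errors.New("chunkKeys and verifiers count mismatch")
		}
		if keyEnd {
			break
		}
		keyIdx++ // skip the comma in both strings
		valIdx++
		keyStart = keyIdx
		valStart = valIdx
	}
	return out, nil
}

func main() {
	m, err := parsePairs("c1,c2,c3", "v1,v2,v3")
	fmt.Println(len(m), err) // 3 <nil>
	_, err = parsePairs("c1,c2", "v1")
	fmt.Println(err != nil) // true (count mismatch)
}
```

Because each key/value is a slice of the original argument string, the only allocations are the result map entries, which is where the gas advantage over `strings.Split` comes from.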
Goal
- Measure initialization duration for:
- 1 world (baseline)
- 5 worlds in parallel (baseline concurrency behavior)
- Identify an optimization that reduces bottlenecks (message count / gas / parsing overhead).
- Re-run concurrency benchmarks after applying the optimization.
2) Test Environment (Baseline Benchmarks)
- Date: 2026-02-05
- Execution: host (local machine)
- Gno RPC: http://localhost:26657
- Bridge: http://localhost:3020
Metric definition
- “Duration” = (completedAt - createdAt): the time until the world reaches COMPLETED.
3) Experiment A — Baseline Initialization Benchmarks
A-1. Single world initialization
| worldId | status | createdAt | completedAt | Duration |
|---|---|---|---|---|
| 8 | COMPLETED | 08:40:35.245 | 08:41:11.643 | 36s |
A-2. 5 worlds initialized concurrently
Procedure:
- Create 5 worlds individually.
- Send the init request for all 5 worlds at the same time.
| worldId | status | createdAt | completedAt | Duration |
|---|---|---|---|---|
| 12 | COMPLETED | 08:44:50.689 | 08:45:39.563 | 48s |
| 11 | COMPLETED | 08:44:50.733 | 08:46:18.696 | 87s |
| 10 | COMPLETED | 08:44:50.711 | 08:46:51.834 | 121s |
| 9 | COMPLETED | 08:44:50.768 | 08:47:24.948 | 154s |
| 13 | COMPLETED | 08:44:50.797 | 08:47:58.130 | 187s |
Observation
- Even though init requests are sent concurrently, completion times drift significantly.
- This suggests a bottleneck (or partial serialization) in message processing, RPC throughput, bridge handling, or on-chain execution costs.
4) Experiment B — “Parameter Readline” / Parsing & Gas Research
B-1. Hypothesis
If we avoid strings.Split(str, ",") and instead scan the string directly (handling delimiters as we encounter them), we may reduce gas usage by avoiding extra slice allocations and intermediate strings.
B-2. Implementation idea
- Implement `SetChunkVerifiers` using a single-pass scan parser (no `strings.Split`).
- Prefer batch processing over repeated individual calls.
B-3. Measurements
SetChunkVerifiers (batch, single-pass scan parsing)
| Items | Keys length (chars) | Vals length (chars) | Gas used | Gas / item | Storage delta |
|---|---|---|---|---|---|
| 3 | 11 | 194 | 5,512,197 | 1,837,399 | 1,905 bytes |
| 10 | 39 | 649 | 6,912,890 | 691,289 | 3,263 bytes |
| 25 | 114 | 1,624 | 9,956,000 | 398,240 | 6,188 bytes |
| 50 | 239 | 3,249 | 15,027,860 | 300,557 | 11,063 bytes |
| 100 | 489 | 6,499 | 25,171,592 | 251,716 | 20,815 bytes |
SetChunkVerifier (single item per call)
| Items | Gas used | Gas / item |
|---|---|---|
| 1 | 4,914,970 | 4,914,970 |
B-4. Comparative analysis
For 100 items:
| Approach | Total gas | Relative |
|---|---|---|
| Batch (`SetChunkVerifiers`) | 25,171,592 | 1.0x |
| 100 individual calls (`SetChunkVerifier` × 100) | 491,497,000 (estimated) | 19.5x |
Result: batching reduces total gas by ~95% (for 100 items), mainly by removing per-message overhead.
B-5. Gas growth per additional item (batch mode)
| Range | Added items | Incremental gas per item |
|---|---|---|
| 3 → 10 | 7 | 200,099 |
| 10 → 25 | 15 | 202,874 |
| 25 → 50 | 25 | 202,874 |
| 50 → 100 | 50 | 202,875 |
Conclusion: gas increases by approximately ~200,000 gas per item, showing a stable linear relationship.
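Given that linearity, total batch gas can be approximated by a fixed overhead plus a per-item cost. The sketch below fits both constants from the table above (the slope from the 50 → 100 row, the intercept from the 100-item total); this is an illustrative extrapolation for capacity planning, not a chain guarantee.

```go
package main

import "fmt"

// Linear gas model fitted to the batch measurements:
// gas(n) ≈ overheadGas + n * perItemGas.
const (
	perItemGas  = 202_875                       // incremental gas per item (50 → 100 row)
	overheadGas = 25_171_592 - 100*perItemGas   // = 4,884,092 fixed per-message overhead
)

// estimateBatchGas predicts total gas for a batch of n items.
func estimateBatchGas(n int) int {
	return overheadGas + n*perItemGas
}

func main() {
	fmt.Println(estimateBatchGas(50)) // 15027842 — measured value was 15,027,860
}
```

The 50-item prediction lands within ~20 gas of the measured 15,027,860, which supports the linear-scaling conclusion. Note the fitted fixed overhead (~4.88M) is close to the single-call `SetChunkVerifier` cost (4,914,970), suggesting most of a message's gas is per-message machinery rather than payload.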
B-6. Outcome
- Batching is the dominant win: ~95% gas reduction vs per-item calls.
- Single-pass scan parsing removes additional allocation overhead compared to `strings.Split`.
- Predictable scaling: per-item incremental gas is ~200k.
- Practical batch size: 50–100 items is efficient in terms of gas/item.
5) Experiment C — Benchmark After Applying Batch Messages
C-1. Change summary
- Replace individual `SetChunkVerifier` messages with batched `SetChunkVerifiers`.
- Batch size: 100
- Chunks: 249 total → 3 batch messages (100 + 100 + 49)
C-2. Test environment
- Date: 2026-02-05
- Execution: host (local machine)
- Biome: `verdant_hollow` (249 chunks)
C-3. 10 worlds initialized concurrently (after batching)
| worldId | status | Duration |
|---|---|---|
| 14 | COMPLETED | 37s |
| 15 | COMPLETED | 36s |
| 16 | COMPLETED | 36s |
| 17 | COMPLETED | 36s |
| 18 | COMPLETED | 29s |
| 19 | COMPLETED | 37s |
| 20 | COMPLETED | 37s |
| 21 | COMPLETED | 36s |
| 22 | COMPLETED | 29s |
| 23 | COMPLETED | 36s |
Average: ~35s
6) Before/After Comparison
| Scenario | Before | After |
|---|---|---|
| Single world | 36s | (not re-measured) |
| 5 concurrent worlds | 48–187s (cumulative delay) | (not re-measured) |
| 10 concurrent worlds | (not measured) | 29–37s (stable) |
7) Conclusions
- Applying batch messages eliminates the “cumulative delay” pattern under concurrency (in this environment and biome).
- With batching enabled, 10 concurrent inits remain close to single-world baseline time (~35s).
- Message count reduction is substantial: 249 → 3 (≈ 83× fewer messages), which aligns with the observed stabilization.
8) Next Steps
- Re-run the 5 concurrent worlds benchmark after batching for an apples-to-apples before/after comparison.
- Add repeated runs and report median / p95 durations instead of single measurements.
- Validate limits and safety:
- max batch size vs message size / gas limit
- failure handling for partial batches
- Measure on a production-like environment (network latency, node load) to confirm the improvement holds outside localhost.
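For the proposed median/p95 reporting, a small helper like the sketch below (nearest-rank percentile, illustrative naming) would replace the single-measurement averages used in this report:

```go
package main

import (
	"fmt"
	"sort"
)

// medianP95 returns the median and the p95 (nearest-rank method)
// of a set of run durations in seconds.
func medianP95(durations []float64) (median, p95 float64) {
	if len(durations) == 0 {
		return 0, 0
	}
	d := append([]float64(nil), durations...) // sort a copy
	sort.Float64s(d)
	median = d[len(d)/2]
	if len(d)%2 == 0 {
		median = (d[len(d)/2-1] + d[len(d)/2]) / 2
	}
	rank := (len(d)*95+99)/100 - 1 // ceil(0.95·n) − 1, in integer math
	p95 = d[rank]
	return
}

func main() {
	// Durations from Experiment C (10 concurrent worlds, seconds).
	runs := []float64{37, 36, 36, 36, 29, 37, 37, 36, 29, 36}
	m, p := medianP95(runs)
	fmt.Println(m, p) // 36 37
}
```

On the Experiment C data this gives median 36s / p95 37s, a more robust summary than the ~35s mean once repeated runs are collected.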