Architecture & Vision

The Holistic Cloud OS for AI Swarms. Designed to eliminate agentic hallucinations, cut token bloat, and provide zero-trust execution out of the box.

The OS Topology

Neonia acts as the deterministic nervous system for your swarm. Local agents connect via a single MCP stream, offloading heavy compute, state governance, and inter-process communication to the cloud.

Local Agents (Cursor, Antigravity IDEs)
        |
   ⚡ MCP SSE Stream
        |
☁️ Neonia Cloud OS
   🧠 1. State Layer:   Tier 1 (Ephemeral Context) · Tier 2 (Governed Graph)
   ⚙️ 2. Compute Layer: Zero-Trust Wasm Sandboxes · Token Arbitrage (neonia://)
   📬 3. Routing & IPC: Agentic Queues (Pub/Sub) · Trust Graph (Discovery)
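
As a concrete entry point, the agent-side handshake could look like the sketch below. It assumes Neonia exposes a standard MCP SSE endpoint and uses the MCP TypeScript SDK; the endpoint URL and client name are placeholders, not a published address.

```ts
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { SSEClientTransport } from "@modelcontextprotocol/sdk/client/sse.js";

// Placeholder endpoint: one MCP SSE stream per swarm.
const transport = new SSEClientTransport(
  new URL("https://os.neonia.example/mcp/sse"),
);

const agent = new Client({ name: "local-agent", version: "1.0.0" });
await agent.connect(transport);

// State, compute, and IPC are all reached through MCP tool calls.
const { tools } = await agent.listTools();
console.log(tools.map((t) => t.name));
```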

1. Two-Tier Governed Memory

Forcing an LLM to hold all raw execution logs in its prompt leads to Agentic Amnesia and Poison Propagation (where one hallucinated fix infects the entire swarm). Neonia enforces a strict two-tier memory hierarchy:

  • Tier 1 (Working Memory): The agent's local context window remains pristine, holding only active intents and lightweight pointers.
  • Tier 2 (Governed Graph Memory): The swarm's persistent cloud brain. When an agent solves a bug, the OS records the fix as a Symptom -> Cause -> Rule edge in the graph. Outdated rules are algorithmically superseded, so future agents retrieve the exact, current rule instead of hallucinating a stale one (see the sketch below).
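
To make the tiering concrete, here is a minimal sketch of how an agent might write to and read from Tier 2. The GovernedGraph interface and its methods (recordResolution, queryRules) are hypothetical names invented for illustration, not a published Neonia API.

```ts
// Hypothetical Tier-2 client; names are illustrative only.
interface GovernedGraph {
  recordResolution(edge: {
    symptom: string;
    cause: string;
    rule: string;
  }): Promise<string>; // returns a lightweight rule pointer for Tier 1
  queryRules(
    symptom: string,
  ): Promise<{ rule: string; supersededBy?: string }[]>;
}

export async function solveAndPersist(graph: GovernedGraph) {
  // Tier 1 keeps only the pointer returned here, never the raw logs.
  const rulePointer = await graph.recordResolution({
    symptom: "ECONNRESET on deploy",
    cause: "stale keep-alive sockets in the proxy pool",
    rule: "recycle proxy connections older than 60s before deploying",
  });

  // A later agent retrieves only rules that have not been superseded.
  const rules = await graph.queryRules("ECONNRESET on deploy");
  return { rulePointer, active: rules.filter((r) => !r.supersededBy) };
}
```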

Fragmented Setup:    Local JSON State + Ad-hoc Python Scripts       =  Poison Propagation

                                       VS

Neonia Holistic OS:  Governed Graph Memory + Deterministic Wasm Tools  =  Zero Hallucinations

2. Token Arbitrage

Passing massive payloads (like 5MB JSON files or raw HTML) directly into an LLM context window is expensive and slow. We call this Token Bloat.

With Neonia, the heavy lifting stays in the Wasm cloud. Instead of returning a 5MB string, the OS returns a lightweight pointer of roughly 20 bytes: neonia://resource/123. The agent passes this pointer to the next tool. This is Token Arbitrage.

[Raw HTML]---(5MB)--->LLM Context (Crash / $10)
[Raw HTML]---(Wasm Extraction)--->neonia://res/123---(15 bytes)--->LLM Context ($0.001)
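
In client code the pattern is chained pointer passing. The sketch below stubs a generic callTool helper (the helper and the NeoniaPointer type are illustrative, not part of a documented SDK); the point is that only short pointer strings ever transit the LLM context.

```ts
// Illustrative pointer type: every heavy result stays server-side.
type NeoniaPointer = `neonia://${string}`;

// Stubbed tool invocation; a real agent would route this over MCP.
async function callTool(
  name: string,
  args: Record<string, unknown>,
): Promise<NeoniaPointer> {
  console.log(`calling ${name}`, args);
  return "neonia://resource/123";
}

// The 5MB page never enters the context window: pointers do.
const page = await callTool("fetch_html", { url: "https://example.com" });
const table = await callTool("extract_table", { input: page });
const summary = await callTool("summarize", { input: table });
console.log(summary); // e.g. neonia://resource/123
```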

3. Security (Zero-Trust)

Running untrusted Python code written by an LLM is a recipe for Remote Code Execution (RCE) vulnerabilities.

Neonia runs all tools as WebAssembly (wasm32-wasip1) modules. This provides a capability-based, default-deny sandbox: tools cannot access the network, filesystem, or environment variables unless the OS explicitly grants the capability. It is a true Zero-Trust execution environment.
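
What "explicitly granted" might look like: the sketch below shows a hypothetical per-tool capability manifest. The manifest shape is invented for illustration; the key property is that every capability omitted from it is denied by default.

```ts
// Hypothetical capability manifest: anything not listed is denied.
const manifest = {
  tool: "extract_table",
  runtime: "wasm32-wasip1",
  capabilities: {
    network: [] as string[], // no outbound hosts means no sockets at all
    filesystem: [{ path: "/scratch", mode: "rw" as const }], // one preopened dir
    env: ["LOG_LEVEL"], // a single whitelisted variable
  },
};

export default manifest;
```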

4. Agentic Queues (Swarm Orchestration)

True swarm autonomy requires asynchronous execution. If Agent A makes a synchronous HTTP call to process a heavy file, the LLM thread hangs, burning API compute time and causing deadlocks.

Neonia provides native Inter-Process Communication (IPC) via Pub/Sub Queues. An 'Architect Agent' publishes a task to the OS event loop and instantly clears its context to continue reasoning. Once the background Wasm tool finishes, the OS wakes up the relevant 'Worker Agents' via the SSE stream. Zero blocking. Infinite horizontal scaling.

[Architect Agent] ---> PUBLISH: {topic: "process", payload: "neonia://file_99"}
[OS Broker]       ---> Task queued. Architect Agent thread freed instantly.
... (Background Zero-Trust Wasm Execution) ...
[Coder Agent]     <--- EVENT WAKEUP: topic "process_done"
[Coder Agent]     ---> CONSUME: "neonia://file_result" -> Proceeding with logic.
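
A sketch of both sides of that exchange follows. The AgenticQueue interface and its publish/subscribe methods are hypothetical names standing in for whatever IPC surface the OS exposes over MCP; the topics and pointers mirror the flow above.

```ts
// Hypothetical IPC surface; method names are illustrative.
interface AgenticQueue {
  publish(topic: string, payload: { pointer: string }): Promise<void>;
  subscribe(
    topic: string,
    onEvent: (payload: { pointer: string }) => Promise<void>,
  ): () => void; // returns an unsubscribe handle
}

export async function orchestrate(queue: AgenticQueue) {
  // Architect side: fire-and-forget; its context frees immediately.
  await queue.publish("process", { pointer: "neonia://file_99" });

  // Worker side: the OS wakes this callback over SSE once the
  // background Wasm job publishes "process_done".
  const unsubscribe = queue.subscribe("process_done", async ({ pointer }) => {
    console.log(`consuming ${pointer}`); // e.g. neonia://file_result
    unsubscribe();
  });
}
```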

5. Dynamic Discovery (The Trust Graph)

Static tool registries break context windows. Hardcoding 50 tool JSON schemas into an agent's prompt will exhaust its capacity before the task even begins.

Neonia uses Just-In-Time (JIT) Tool Injection. Agents spawn with only a "Discovery" meta-tool. They declare an intent, and the OS queries its global Trust Graph—evaluating historical swarm telemetry—to dynamically inject only the single highest-rated WebAssembly tool schema required for the task.

[Static Prompt] Hardcoding 50 schemas  ---> 25,000 tokens wasted pre-execution.
[Neonia JIT]    Agent declares intent ---> OS injects 1 top-trusted schema.
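
The agent-facing half of that flow could be as small as one meta-tool. In the sketch below, Discovery and declareIntent are hypothetical names for the meta-tool described above; trustScore stands in for whatever ranking the Trust Graph produces.

```ts
// Hypothetical discovery meta-tool: the only schema an agent spawns with.
interface Discovery {
  declareIntent(
    intent: string,
  ): Promise<{ name: string; schema: object; trustScore: number }>;
}

export async function acquireTool(discovery: Discovery) {
  // The OS consults its Trust Graph and returns exactly one schema.
  const tool = await discovery.declareIntent("parse a CSV and emit JSON rows");
  console.log(`injected ${tool.name} (trust ${tool.trustScore.toFixed(2)})`);
  return tool.schema; // the only schema added to the prompt
}
```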