Technical whitepaper, working draft

A sovereign, self-improving AI owned by its network

The complete technical reference: the sovereign stack, on-device Personal AI with federated privacy, two-tier decentralized compute, self-improvement within immutable guardrails, ORB tokenomics, and governance.

Contents
  1. Vision, Thesis & Market Context
  2. System Architecture & The Sovereign Stack
  3. Personal AI & Privacy by Design
  4. The Data Flywheel & Knowledge Acquisition
  5. Two-Tier Decentralized Compute, Training & Verification
  6. The Self-Improvement Engine & Automatic Release
  7. Network Architecture, Node Lifecycle & The Live Foundation
  8. ORB Tokenomics
  9. Governance, the DAO & Progressive Decentralization
  10. Legal, Compliance & Risk Posture
  11. Roadmap, Risks, Competition & Reference
  12. Glossary
  13. References
01

Vision, Thesis & Market Context

1.1 Abstract

Hylon is a decentralized artificial-intelligence network built on top of a live, revenue-generating product rather than a promise. That product is OrbNet / OrbVPN, a censorship-circumvention VPN with paying users in restricted markets (Iran, Russia, and comparable jurisdictions) that operates multi-region infrastructure, a node fleet, an admin back office, and a production on-chain token economy. The token, ORB (an ERC-20 deployed on Base), already runs in production: staking, reward distribution, signed-voucher claims, a gasless relayer for users who hold no gas, and OFSI/OFAC sanctions screening at the point of payout. Hylon adds an AI layer onto this working business. This ordering (network first, tokens as plumbing, AI on top) is the single most important fact about the project and the sharpest line separating it from the large population of DePIN and "decentralized AI" efforts that begin as a token and a whitepaper and never ship a service anyone pays for.

The network's stated destination is uncompromising: to out-compete OpenAI, Anthropic, and every centralized frontier lab. We state that plainly because a mission worth years of work should be stated plainly, and we immediately qualify it with the discipline that makes it credible rather than delusional. Hylon does not attempt to meet the incumbents head-on at the general frontier on day one, where they hold a decisive capital and talent advantage. It wins first in a wedge those labs cannot structurally contest (a user's full personal context held privately on their own device; operation inside censored and blocked markets; and underserved languages such as Farsi, Arabic, Russian, and Turkish) and then compounds the data, revenue, and treasury-funded compute that the wedge generates toward the general frontier over a period of years. Every claim in this document is gate-based: it is tied to a specific, verifiable technical or commercial milestone, and is asserted only once that gate is crossed. There are no dated promises of artificial general intelligence and no superlatives that are not anchored to a measurable test.

CORE DIFFERENTIATOR

Hylon is not a DePIN searching for demand. It is an AI layer added to a censorship-circumvention VPN that already has real users, real revenue, and a production on-chain token economy in the hardest markets on earth.

1.2 Executive Summary

The entire thesis of this whitepaper reduces to five points. Everything that follows is an elaboration of them.

  1. A real business underneath. Hylon is built on OrbNet/OrbVPN, a live product with paying users in restricted markets and a production ORB token economy on Base. Value and verified demand come before token emissions, not after: this is the discipline that separated the DePIN survivors from the death spirals.
  2. A wedge the incumbents cannot contest. Centralized labs are structurally blind to a user's private on-device context, largely absent or blocked in censored markets, and accountable to shareholders rather than users. Hylon attacks precisely this gap first, where its advantage is defensible rather than merely cheaper.
  3. An unbuyable data flywheel. A million Personal AIs running on member devices generate reinforcement signal from real, multilingual, censored-market user tasks: data that no centralized lab can collect, for legal and physical reasons. Raw data never leaves the device; only privacy-preserving learned signal is aggregated. A competitor can outspend Hylon but cannot legally or physically assemble this corpus.
  4. A sovereign, decentralized stack. One self-contained Hylon AI (a mixture-of-experts core, router, retrieval and memory, tool use, specialist heads, and agent orchestration presenting as a single system) with no hard dependency on any external provider API. It is trained by a DAO-owned GPU cluster plus staked operators coordinating over the internet via low-communication training, and served and personalized by the million-device fleet. All contributions are metered and cryptographically or economically verified, and are never rewarded for idle uptime.
  5. Gate-based honesty toward the frontier. The destination is to out-compete every lab, but the path is staged: ship a sovereign private stack (Gate 0), dominate the wedge domains (Gate 1), stand up a self-improving self-releasing model on DAO-owned compute (Gate 2), and only then pursue the general frontier (Gate 3+), conditional on treasury-funded compute and validated training recipes crossing an explicit threshold. Nothing at the frontier is claimed until it is crossed.

The Hylon stack, built bottom-up on a live business:
  1. Live product: OrbNet/OrbVPN, paying users in censored markets, multi-region infra
  2. On-chain economy: ORB on Base, staking, rewards, signed vouchers, gasless relayer, sanctions screening
  3. Sovereign AI layer: MoE core + router + memory + tools, one self-contained system, no external API dependency
  4. Data flywheel: 1M Personal AIs → RL signal → private aggregation → stronger model
  5. Frontier program: treasury-funded compute + validated recipes, gated, Year 3+

1.3 The Problem: Three Structural Blind Spots of Centralized Labs

Frontier labs are formidable, well-capitalized, and staffed by extraordinary researchers. Hylon does not underestimate them. But their strength is concentrated in one dimension (raw compute and model scale), and they carry three structural weaknesses that no amount of capital removes, because the weaknesses are consequences of what they are, not of how much they spend. A structural weakness is one a competitor cannot buy its way out of. These three define the terrain on which Hylon chooses to fight.

Blind spot one: no access to a user's private context

The most valuable substrate for a genuinely useful personal assistant is the full private context of a person's life: their messages, files, calendar, browsing, location history, health data, relationships, and the accumulated texture of how they actually work and think. A centralized lab cannot ethically or legally centralize this data at scale, and users are increasingly unwilling to hand it over. The lab's product therefore meets each user as a relative stranger at the start of every session. Hylon's architecture inverts this: a compact Personal AI (a 1-to-4-billion-parameter-class model, or a ternary BitNet-style model) runs on the device and learns the user's context locally. The raw data never leaves. Only privacy-preserving learned signal is shared upward, via federated learning with differential privacy and secure aggregation. The lab's structural constraint is Hylon's structural advantage.

Blind spot two: absent or blocked across much of the world

A large share of humanity lives in markets where the major labs' services are geo-blocked, sanctioned out, sporadically throttled, or simply unavailable in the local language at usable quality. Users in Iran cannot reliably reach the leading Western assistants; users across the Farsi-, Arabic-, Russian-, and Turkish-speaking world are served, at best, as an afterthought to English. Hylon starts from the opposite position: it is already inside these markets through OrbNet/OrbVPN's censorship-circumvention infrastructure, already reaches these users, and treats their languages and their operating conditions (intermittent connectivity, adversarial networks, the need for privacy under surveillance) as the design center rather than the edge case.

Blind spot three: owned by shareholders, not users

A centralized lab is ultimately accountable to its equity holders. Its objective function is enterprise revenue and shareholder return, which is a legitimate business posture but a different one from serving the individual user's interest without conflict. Model behavior, data usage, pricing, and availability are set to optimize the owner's return. Hylon's ownership structure is deliberately different: a DAO progressively governs reward rates, treasury allocation, and protocol upgrades on published milestones over roughly two to four years, and contributors (the people who supply compute, data, and personalization) earn ORB for verified work. (Model releases and safety constraints, by design, remain with a technical safety council transparent to the DAO rather than subject to a token vote, a point developed fully in the governance section.) The network is owned by its participants, and its incentives point at them.

WHY THESE ARE UNCONTESTABLE

A lab can hire Hylon's engineers and outspend its treasury. It cannot legally centralize a billion private contexts, cannot un-block itself in sanctioned markets, and cannot restructure its fiduciary duty away from its shareholders. These are the three seams Hylon builds into.

1.4 Vision & Thesis

The destination

Hylon's mission is to build an AI that out-competes every centralized lab: not a niche tool that coexists politely beside them, but a system that is, for the tasks that matter most to a real person, genuinely better. We state this as the destination because half-measures do not attract the talent, capital, or conviction required to matter, and because the structural advantages described above are large enough to make it a serious ambition rather than a slogan. What follows is the discipline that keeps the ambition honest.

The method: wedge first, then compound

Hylon does not fight the incumbents where they are strongest. It fights where they are structurally weak, wins there decisively, and uses the winnings to advance. The method has three compounding loops. First, the wedge: dominate personal-context tasks and underserved-language, censored-market use (domains the labs cannot reach) and become the best available AI for those users. Second, the data flywheel: the million Personal AIs generate reinforcement signal from real user tasks, this signal is privately aggregated into a stronger global model, and the stronger global model produces better Personal AIs, tightening the loop with every cycle. Third, the economic engine: genuine service revenue from the wedge funds a DAO treasury, the treasury funds compute, and compute plus validated training recipes push the model toward the general frontier. The wedge is not the ceiling; it is the beachhead and the fuel supply.

Wedge-first, then compound: the self-reinforcing loop
  1. Win the wedge: dominate private-context + censored-market + underserved-language tasks
  2. Generate signal: 1M Personal AIs produce RL signal from real user tasks, uncollectable by labs
  3. Aggregate privately: federated learning + differential privacy + secure aggregation; raw data stays on-device
  4. Train stronger model: DAO GPU cluster + staked operators, low-communication distributed training
  5. Better Personal AIs: improved global model ships back to every device
  6. Fund the treasury: genuine service revenue funds compute → pushes toward the frontier
  ↺ loops back to step 1

It is worth being explicit about the honest constraint at the end of this path. Out-competing everyone at the general frontier is ultimately a function of aggregate FLOPs, which is a function of GPU count, which is a function of capital: a treasury question. Hylon's low-communication training approach makes bandwidth a solved problem and lets geographically distributed GPUs train together, but it does not repeal the need for the GPUs themselves. That is why the frontier is a Gate-3-and-beyond program, claimed only when treasury-funded compute crosses the threshold, and why the wedge and its revenue come first.

Gate-based honesty

Every forward-looking claim in this whitepaper is attached to a gate (a defined, measurable condition) and is asserted only once the gate is crossed:

Gate 0 (~6 months)

Sovereign stack live: Personal AI + verified compute + ORB + DAO shipped as one private, self-hosted, uncensored product on a top open model. Claim earned: "the strongest AI for your life, private, and available where others are blocked."

Gate 1 (12 to 18 months)

Wedge dominance: a continually-trained model that beats all open models in wedge domains and wins blind A/B tests versus frontier systems on personal-context tasks.

Gate 2 (Year 2)

Self-improving and self-releasing on a DAO-owned cluster: the champion-challenger release loop runs live under the safety council's final gate.

Gate 3+ (Year 3+)

General-frontier program: conditional on compute and validated recipes crossing an explicit threshold. Claimed only once crossed, never before.

1.5 Why Now

Hylon is timed to a convergence of four independent developments, each of which crossed a practical threshold recently. None of these existed in usable form even a few years ago; together they make a decentralized frontier-directed AI network buildable for the first time.

Open weights have reached near-frontier quality

The strongest permissively-licensed open-weight foundation models now sit close enough to the frontier that a sovereign stack built on continual training and reinforcement learning atop them is a credible starting point rather than a toy. Hylon builds from these models, adds continual training and RL on its own uncollectable data, and thereby inherits the world's open research effort as a floor while differentiating on the data and personalization that only it possesses.

Low-communication distributed training is proven

The classical objection to decentralized training (that synchronizing gradients across the internet is impossibly slow) has been answered in practice. Methods in the DiLoCo / Streaming DiLoCo / DisTrO / DeMo family cut inter-node communication by roughly two-to-four orders of magnitude (about 400× to 10,000×), and real runs have demonstrated it: INTELLECT-1, a 10-billion-parameter model, was trained across three continents over links of 127 to 935 Mbit/s. Bandwidth is, for training purposes, a solved problem: roughly 100 Mbps to 1 Gbps per node suffices. The oft-quoted "5,000 years to train over the internet" figure describes the naive full-gradient-every-step strawman that nobody actually uses. The real constraints are per-node memory (each training node must hold the model and optimizer state) and aggregate FLOPs, which is exactly why the training tier is datacenter-class GPUs, not phones, and why frontier scale is ultimately a capital question.
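
The bandwidth claim can be checked with back-of-the-envelope arithmetic. The sketch below is illustrative only: the function name `sync_seconds` and the 2-bytes-per-parameter (bf16) payload are assumptions, not figures from any Hylon run. It estimates how long one model-sized gradient exchange for a 10-billion-parameter model would occupy a link, and how that changes when communication is cut by the 400× to 10,000× range quoted above.

```python
def sync_seconds(params: float, bytes_per_param: float,
                 link_mbps: float, reduction: float = 1.0) -> float:
    """Seconds needed to push one model-sized update through a link.

    reduction is the communication-reduction factor
    (1.0 = naive full-gradient exchange on every step).
    """
    bits_on_wire = params * bytes_per_param * 8 / reduction
    return bits_on_wire / (link_mbps * 1e6)


if __name__ == "__main__":
    PARAMS = 10e9         # 10B parameters, as in the INTELLECT-1 example
    BYTES_PER_PARAM = 2   # assumption: bf16 payload
    for mbps in (127, 935, 1000):
        for factor in (1, 400, 10_000):
            t = sync_seconds(PARAMS, BYTES_PER_PARAM, mbps, factor)
            print(f"{mbps:>5} Mbit/s, {factor:>6}x reduction: {t:10.3f} s per step")
```

Under these assumptions a naive exchange costs about 160 s per step on a 1 Gbit/s link, against roughly 0.4 s (amortized) at a 400× reduction, which is why per-node link speed stops being the binding constraint.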

On-device inference has matured

Running capable models directly on phones, laptops, and desktops is now practical. Ternary and 1-bit approaches such as BitNet b1.58 serve real models on commodity hardware (on the order of 45 tokens per second on an Apple M2 CPU in roughly 0.4 GB of memory), and NPUs and Apple Silicon give member devices genuine inference capacity. (Honestly: BitNet's gains are inference-only; the model is still trained in full precision via quantization-aware training, so this lowers serving cost, not training cost.) This maturity is what makes a private on-device Personal AI, and a million-device inference and personalization layer, an engineering reality rather than an aspiration.
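
To make the memory figure concrete, the sketch below shows absmean ternary quantization in the style described for BitNet b1.58 (weights are scaled by their mean absolute value, then rounded and clipped to -1, 0, or +1), followed by a weight-memory estimate. The helper names and the 2-billion-parameter count are assumptions chosen for illustration; the only figure taken from the text is the roughly 0.4 GB footprint.

```python
def absmean_ternary(weights):
    """Quantize floats to {-1, 0, +1} using a single absmean scale.

    Returns (ternary_weights, scale); weight i is approximated by
    ternary_weights[i] * scale.
    """
    scale = sum(abs(w) for w in weights) / len(weights)
    eps = 1e-8  # guards against division by zero for an all-zero tensor
    # Note: Python's round() rounds exact ties to the nearest even integer.
    ternary = [max(-1, min(1, round(w / (scale + eps)))) for w in weights]
    return ternary, scale


def weight_memory_gb(params: float, bits_per_weight: float) -> float:
    """Weight storage in GB (10^9 bytes), ignoring packing overhead."""
    return params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    q, s = absmean_ternary([0.9, -0.05, -1.2, 0.3])
    print(q, round(s, 4))
    # Assumption: a 2B-parameter model at about 1.58 bits per weight.
    print(weight_memory_gb(2e9, 1.58))
```

At about 1.58 bits per weight, a 2-billion-parameter model needs roughly 0.395 GB for its weights, which is consistent with the footprint quoted above. Packing overhead, embeddings, and activations are ignored in this estimate.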

The DePIN market has corrected toward real revenue

The speculative phase of decentralized physical infrastructure has ended, and the correction rewards exactly the discipline Hylon is built on. Across 2025 the entire DePIN sector produced on the order of $72 million in total on-chain revenue against roughly $10 billion in market capitalization, a sobering ratio that has punished headcount-driven token emissions and rewarded networks selling something real. The market now pays for demonstrated demand, not registered-device counts. A network that begins from a revenue-generating product and enforces value-before-emissions is arriving into a market that has learned to price precisely that.

CONVERGENCE

Near-frontier open weights, communication-efficient distributed training, mature on-device inference, and a DePIN market that finally rewards real revenue: together these mark the first moment at which a decentralized network aimed at the frontier is actually buildable.

1.6 Market Positioning

Hylon occupies a position no existing category fully covers, and it is important to describe that position honestly against both the centralized labs and the existing decentralized networks, including their genuine strengths.

Versus centralized labs

The labs win today on the axis of raw compute scale and general-frontier capability. Hylon does not dispute that and does not pretend to match it at launch. Hylon wins on the orthogonal axis: data locality and user ownership, private on-device context, presence in blocked markets, underserved-language depth, and participant ownership of the network. The strategy is to be unbeatable on that axis first and to climb the compute axis over years, funded by revenue and treasury. On the positioning map, the labs sit high on compute and low on data-locality/ownership; Hylon starts lower on compute but far higher on locality/ownership, and moves up-and-to-the-right as the flywheel and treasury compound.

Versus other decentralized networks

The existing DePIN and decentralized-AI networks are real and, in several cases, well-executed, but each occupies a different square than Hylon, and the honest landscape matters. No decentralized network today earns material revenue from consumer-phone FLOPs; that is simply not yet a viable business, and Hylon does not claim otherwise. What the leaders actually sell is instructive:

What the decentralized-AI / DePIN leaders actually sell (honest 2025 figures)

Network | What it sells | Scale | Approx. market cap / revenue | Note
Hylon | Private AI on a live VPN + verified compute/data | Live product, launching AI layer | Pre-token AI layer; live ORB economy | Revenue-generating base; two-tier compute
Grass | Web bandwidth + scraped data to labs | Large user base | ~$330M cap | Data/bandwidth, not compute
Acurast | Smartphone compute | ~255k phones | ~$18 to 25M cap | Store-approved precedent; demand unproven
Salad | Consumer gaming-PC GPUs | ~60k daily GPUs | ~$10M ARR, no token | Real demand sustains it
Nous / Templar / Prime Intellect | Decentralized model training | 8×H100 to 512×H200 clusters | GPU-bound | Runs on datacenter GPUs, not phones
DePIN sector (aggregate) | Mixed infra | Whole sector | ~$72M revenue vs ~$10B cap | Market now prices real revenue

  • Grass (roughly $330M market cap) sells web bandwidth and scraped data to AI labs, not compute. It is a data-and-bandwidth network, and a successful one, but it does not train or serve models.
  • Acurast (around 255,000 phones, roughly $18 to 25M cap) is the leader in smartphone compute and a useful compliance precedent (it is approved on both mobile app stores), but the demand side for phone compute remains unproven.
  • Salad is the consumer gaming-PC leader, with on the order of 60,000 daily GPUs and roughly $10M ARR, and notably operates with no token at all, a reminder that real demand, not a token, is what sustains a network.
  • Decentralized training networks are real but GPU-bound: Nous Psyche trains a 40B model across ~8×H100 nodes; Templar's Covenant-72B requires ~8×B200 per peer; and Prime Intellect's INTELLECT-3 relied on a central 512×H200 cluster. Distributed training works, and it runs on datacenter GPUs, not phones.

Hylon's distinct position is the combination none of them hold: a live revenue-generating consumer product as the base, a two-tier compute design that puts training on datacenter GPUs and puts inference, personalization, and data on the million-device fleet where devices are genuinely good, and a data flywheel rooted in private, censored-market, multilingual user tasks. It is worth noting that inference is roughly two-thirds of total AI compute demand, so a network that serves and personalizes at the device edge is addressing the larger share of the market, not the smaller.

Positioning map: compute scale / frontier capability (x) vs data locality + user ownership (y). Hylon starts high on locality and climbs the compute axis over time.
x-axis: low compute / narrow capability to high compute / frontier capability. y-axis: centralized data / owner-controlled to local data / user-owned.
Plotted points:
  1. Centralized labs (OpenAI, Anthropic)
  2. GPU-DePINs (Nous, Templar, Prime Intellect)
  3. Generic GPU marketplaces (io.net, Render)
  4. Consumer bandwidth-DePINs (Grass)
  5. Smartphone compute (Acurast)
  6. Hylon (today)
  7. Hylon (trajectory, Gate 3+)

HONEST BOUNDARY

Hylon does not claim to earn revenue from phone FLOPs, does not claim phones train frontier models, and does not claim to beat the labs on compute today. It claims a defensible wedge, an uncollectable data moat, and a staged, treasury-funded path, each tied to a gate.

02

System Architecture & The Sovereign Stack

2.1 Architectural Overview: Five Layers, One AI

Hylon is engineered as a single, self-contained artificial intelligence that happens to be physically distributed across a datacenter-class GPU tier and a fleet of up to a million consumer devices. To the user it presents as one assistant, private, uncensored, and fluent in their language and their life. Internally it is a five-layer system in which every layer has a distinct responsibility, a distinct trust boundary, and a strictly governed direction of data flow. The defining architectural rule is that learning flows UP as privacy-preserving signal, never as raw data: personal context is captured and used at the edge, but only aggregated, differentially-private, learned updates ever ascend toward the global model.

This is the inversion of the centralized-lab pattern. OpenAI and Anthropic run a monolithic model in a datacenter and pull the user's context to the model at inference time: the raw prompt, the raw document, and the raw history all cross the network and land on their servers. Hylon pushes a compact model to the context and keeps the context where it was created. The five layers below make that possible while still allowing a global frontier model to improve from the collective experience of the whole network.

Figure: The five-layer Hylon architecture, from the immutable oversight plane (top) to the physical device substrate (bottom). Capabilities flow down as model weights; learning flows up as privacy-preserving signal, never raw data.

The layers, from the immutable oversight plane at the top down to the physical substrate at the bottom, are:

L5, Safety & Emergence Monitor

The immutable oversight plane. Verifies Level-0 foundations (truth, safety, honesty, integrity) every step, watches for capability emergence and objective-gaming, and holds the final release gate and kill switch. Cannot be modified by the system it supervises.

L4, Self-Improvement Engine

The automated R&D loop and champion-challenger release system. Surveys literature, proposes training recipes, runs cheap ablations, and promotes a challenger to a new named version only when it clears a fixed gate suite with zero safety regressions.

L3, Sovereign Model (Hylon Core)

The global frontier model: a mixture-of-experts core with a router, retrieval/long-term memory, tool use, agentic orchestration, and specialist heads. Trained and continually updated on the DAO/operator GPU tier. This is "the AI that does everything itself."

L2, Personal AI Layer

A compact on-device model (1 to 4B-class, or BitNet ternary) per user that learns that user's context locally, serves private inference, and emits only privacy-preserving learned signal upward via federated learning.

L1, Device Substrate

The hardware-abstracted runtime spanning phones, PCs/Macs, GPU nodes, and (on the roadmap) neuromorphic silicon. Provides metered, verified compute; raw user data lives and dies here.

2.2 The Layers in Detail

L1, Device Substrate

The substrate is the physical execution environment and the hard privacy boundary of the entire network. It is hardware-abstracted from day one: a single runtime targets CPU, GPU, and NPU back-ends, and is built ready for neuromorphic accelerators, so a model artifact runs across the fleet without a rewrite. Device tiers are addressed by capability: Tier A phones for private inference and data, Tier B PCs/Macs for heavier inference and LoRA personalization, Tier C GPU nodes for the actual training work, and Tier D neuromorphic as a roadmap tier for ultra-low-power always-on inference. Every unit of work the substrate performs is metered and verified (inference through TOPLOC-style locality-sensitive activation hashing, training through economic and statistical loss-scoring plus redundancy), so contributions are rewarded for verified output, never for idle uptime. Critically, raw user data never leaves this layer. It is the floor of the stack and the ceiling of what a user's private information is ever allowed to touch.
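
One way to picture locality-sensitive verification of inference is a random-projection sign hash (SimHash) over a layer's activations. The sketch below is a generic stand-in, not the TOPLOC scheme itself: the function names, the 64-bit fingerprint, and the 4-bit tolerance are all illustrative assumptions. A worker commits to a fingerprint; a verifier recomputes the activations and accepts only if the two fingerprints are close in Hamming distance, which tolerates small floating-point differences between devices while rejecting unrelated outputs.

```python
import random


def make_planes(dim, n_bits, seed):
    """Random hyperplanes shared by worker and verifier (seeded)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def fingerprint(activations, planes):
    """One bit per hyperplane: which side of the plane the vector lies on."""
    return [
        1 if sum(a * p for a, p in zip(activations, plane)) >= 0 else 0
        for plane in planes
    ]


def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return sum(x != y for x, y in zip(a, b))


def verify(claimed_fp, recomputed_activations, planes, max_bits_off=4):
    """Accept if the claimed fingerprint is near the recomputed one."""
    recomputed_fp = fingerprint(recomputed_activations, planes)
    return hamming(claimed_fp, recomputed_fp) <= max_bits_off


if __name__ == "__main__":
    rng = random.Random(1)
    planes = make_planes(dim=128, n_bits=64, seed=42)
    honest = [rng.gauss(0.0, 1.0) for _ in range(128)]
    claimed = fingerprint(honest, planes)
    # The verifier's recomputation differs by tiny numerical noise.
    recomputed = [x + 1e-7 for x in honest]
    print("honest worker accepted:", verify(claimed, recomputed, planes))
```

A production scheme would also have to bind the fingerprint to the specific model, prompt, and token positions; that is outside the scope of this sketch.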

L2, Personal AI Layer

Every participating device runs a compact Personal AI: a 1 to 4B-class dense model or a BitNet b1.58 ternary model small enough to serve at roughly 45 tokens/second on an Apple M2 CPU in about 0.4 GB. This model is the user's own: it reads their messages, documents, habits, and language on the device, personalizes itself with on-device LoRA adapters, and answers the overwhelming majority of everyday requests locally, privately, and without needing a connection. The Personal AI is what makes Hylon usable in censored markets where a round-trip to a foreign datacenter is exactly what the adversary is watching for.

The Personal AI is also the network's sensory organ, but under a strict contract. It never transmits raw context. Instead it computes a learned signal (gradient-like updates or distilled behavioral deltas from the user's real tasks), which is clipped, noised under differential privacy, and combined with thousands of other devices' contributions under secure aggregation before the server ever sees anything. The server observes only the aggregate. This is what allows a global model to learn from a million private lives without any single life being legible to the network. Privacy here is not a slogan; it is engineering (FL + DP + secure aggregation) resting on a documented legitimate-interests lawful basis, a DPIA, and explicit defenses against membership-inference and model-inversion attacks, consistent with EDPB Opinion 28/2024's position that model weights are not presumed anonymous.
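
The clip, noise, and aggregate steps can be sketched in a few lines. This is a toy illustration under stated assumptions, not the network's protocol: the function names are hypothetical, noise is added on each device (a local or distributed-DP variant), and the pairwise masks use floats from a seeded generator, whereas a real secure-aggregation protocol derives masks from pairwise shared secrets and works in a finite field. Calibrating `sigma` to a target privacy budget is also out of scope.

```python
import math
import random


def clip(update, clip_norm):
    """Scale an update so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    factor = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [x * factor for x in update]


def add_gaussian_noise(update, sigma, rng):
    """Add independent N(0, sigma^2) noise to every coordinate."""
    return [x + rng.gauss(0.0, sigma) for x in update]


def pairwise_masks(n_clients, dim, seed):
    """One mask per client; the masks sum to zero across all clients."""
    rng = random.Random(seed)
    masks = [[0.0] * dim for _ in range(n_clients)]
    for i in range(n_clients):
        for j in range(i + 1, n_clients):
            shared = [rng.uniform(-1000.0, 1000.0) for _ in range(dim)]
            for k in range(dim):
                masks[i][k] += shared[k]   # client i adds the pair mask
                masks[j][k] -= shared[k]   # client j subtracts it
    return masks


def private_mean(raw_updates, clip_norm=1.0, sigma=0.0, seed=0):
    """Clip, noise, and mask on each device; the server only sums."""
    rng = random.Random(seed)
    n, dim = len(raw_updates), len(raw_updates[0])
    masks = pairwise_masks(n, dim, seed + 1)
    uploads = []
    for update, mask in zip(raw_updates, masks):
        local = add_gaussian_noise(clip(update, clip_norm), sigma, rng)
        uploads.append([v + m for v, m in zip(local, mask)])
    # Server side: masks cancel in the sum, so only the aggregate is visible.
    return [sum(u[k] for u in uploads) / n for k in range(dim)]


if __name__ == "__main__":
    updates = [[3.0, 4.0], [0.3, 0.4], [-6.0, 8.0]]
    print(private_mean(updates, clip_norm=1.0, sigma=0.0, seed=7))
```

With `sigma=0.0` the result is the plain mean of the clipped updates, which makes the mask cancellation easy to check; each individual upload, by contrast, is dominated by its mask and reveals little about the device's update.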

DIRECTION OF FLOW

Down the stack: model weights and capabilities. Up the stack: privacy-preserving, differentially-private, securely-aggregated signal, never raw data. This asymmetry is the moat and the privacy guarantee in one.

L3, Sovereign Model (Hylon Core)

The Sovereign Model is the global frontier system that the whole network exists to grow. It is trained and continually updated on the Tier-C GPU fleet (8×H100/B200-class nodes, some DAO-treasury-owned, some operator-staked) and coordinated over the public internet using low-communication training methods. It is the model whose distilled descendants become tomorrow's Personal AIs, closing the loop. Section 2.3 dissects its internals; for the purposes of the layered view, its responsibility is to be the single strongest, fully-owned intelligence in the system, the thing users are ultimately talking to when a query exceeds what their on-device model can handle, and the thing every privacy-preserving signal from L2 is quietly making better.

L4, Self-Improvement Engine

Above the model sits the process that produces new models. The Self-Improvement Engine is simultaneously Hylon's research team and its recursive-self-improvement mechanism, operating under a hard discipline: cheap experiments find the recipe; expensive runs execute only validated recipes. It surveys the literature, proposes candidate architectures and training recipes, runs small-scale ablations for a few thousand dollars, evaluates them, and only then commits a validated recipe to a $5 to 20M full-scale training run. Its output feeds a champion-challenger release system: a freshly trained challenger is promoted to a new named model version only if it beats the live champion by a defined margin on a fixed gate suite (general benchmarks, wedge-domain evals, blind personal-context A/B tests, and safety red-teaming), with statistical significance and zero safety regressions. This is "recursive self-improvement within immutable guardrails," never unlimited, and explicitly never a claim of biological, quantum, or AGI-by-a-date capability.
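
The promotion rule (margin, significance, zero safety regressions) can be expressed as a small decision function. The sketch below is an assumption-laden illustration: the class and function names, the 0.01 default margin, and the one-sided z-test on paired per-item score differences (normal approximation, critical value 1.645) are choices made here, not a published Hylon specification.

```python
from dataclasses import dataclass
from math import sqrt
from statistics import mean, stdev
from typing import List


@dataclass
class SuiteResult:
    name: str
    champion: List[float]     # per-item scores for the live model
    challenger: List[float]   # candidate's scores on the same items


def clears_suite(suite: SuiteResult, margin: float,
                 z_crit: float = 1.645) -> bool:
    """True if the challenger wins by at least margin, significantly."""
    diffs = [c - b for c, b in zip(suite.challenger, suite.champion)]
    avg = mean(diffs)
    if avg < margin:
        return False
    spread = stdev(diffs)
    if spread == 0:
        return avg > 0  # the same gain on every item
    z = avg / (spread / sqrt(len(diffs)))
    return z >= z_crit


def should_promote(suites: List[SuiteResult], safety_regressions: int,
                   margin: float = 0.01) -> bool:
    """Promote only with zero safety regressions and a win on every suite."""
    if safety_regressions != 0:
        return False
    return all(clears_suite(s, margin) for s in suites)


if __name__ == "__main__":
    wedge = SuiteResult("wedge-domain evals",
                        champion=[0.70, 0.72, 0.68, 0.71, 0.69],
                        challenger=[0.75, 0.76, 0.74, 0.77, 0.73])
    print(should_promote([wedge], safety_regressions=0))
    print(should_promote([wedge], safety_regressions=1))
```

Treating a safety regression as a hard veto, rather than one more score to trade off, mirrors the "zero safety regressions" wording: no benchmark gain can buy a challenger past it.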

L5, Safety & Emergence Monitor

The top layer is the oversight plane, and it is deliberately the one layer the rest of the system cannot rewrite. It enforces the Level-0 immutable foundations (truth, safety, honesty, integrity), verifying them at every step, and it maintains the objective hierarchy in which L0 foundations (immutable) outrank L1 primary objectives (fixed), which outrank L2 capabilities (optimizable), which outrank L3 metrics (mere indicators). It runs gaming detection: if L3 metrics improve while L1 and L2 objectives do not, the discrepancy is flagged as reward-hacking. It watches for capability emergence during self-improvement, and it holds the two controls that never go to a token vote: the final release gate and the kill switch. Governance of the protocol decentralizes to the DAO over years; governance of model safety remains with a technical Safety Council, transparent to the DAO but not subordinate to a popularity contest.
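
The gaming-detection rule reduces to a comparison across levels of the objective hierarchy. The sketch below is a minimal illustration with hypothetical names and made-up scores; note that "L1", "L2", and "L3" here are objective levels in that hierarchy, not the architecture layers of Section 2.1. A real monitor would need noise-aware thresholds rather than the simple tolerance used here.

```python
def improved(before, after, level, tol=0.0):
    """True if any score at the given objective level rose by more than tol."""
    return any(after[level][name] - before[level][name] > tol
               for name in before[level])


def detect_gaming(before, after, tol=0.0):
    """Flag the case where L3 metrics rise but no L1 or L2 objective does."""
    metrics_up = improved(before, after, "L3", tol)
    objectives_up = (improved(before, after, "L1", tol)
                     or improved(before, after, "L2", tol))
    return metrics_up and not objectives_up


if __name__ == "__main__":
    # All names and numbers below are illustrative assumptions.
    before = {"L1": {"user_task_success": 0.80},
              "L2": {"reasoning": 0.60},
              "L3": {"benchmark_score": 0.70}}
    suspicious = {"L1": {"user_task_success": 0.80},
                  "L2": {"reasoning": 0.60},
                  "L3": {"benchmark_score": 0.78}}
    print(detect_gaming(before, suspicious))
```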

2.3 Inside the Sovereign Stack: How One AI "Does Everything Itself"

The word "sovereign" is a technical claim, not a brand adjective: the Hylon Core is a complete cognitive system with no hard dependency on any external provider's API. Everything a user would otherwise get by chaining together a foreign frontier model, a vector database, a search API, and an agent framework is instead a native, owned component of a single system. The core is not a single dense transformer; it is an orchestrated assembly of components that together present as one intelligence.

Request path through the Sovereign Stack internals: a top-level router dispatches to owned components (MoE experts, retrieval/memory, tools, specialists), and the orchestrator integrates the result into one answer. No step depends on an external provider.
  1. Query: user request enters via the Personal AI; escalated to the Core only when it exceeds on-device capability.
  2. Router / Dispatcher: selects active MoE experts per token and decides whether retrieval, tools, or a specialist head are needed.
  3. MoE Core: sparse expert activation; frontier-scale knowledge at bounded per-token compute for cheap serving.
  4. Retrieval + Memory: owned corpus grounding + durable per-user long-term memory. No third-party search API.
  5. Tools + Orchestration: typed tool calls, code execution; agentic planner decomposes and dispatches multi-step tasks.
  6. Specialist Heads: wedge experts for Farsi/Arabic/Russian/Turkish, censorship-circumvention, personal-context reasoning.
  7. Integrated Answer: orchestrator merges results into one response; safety monitor verifies before release.

Mixture-of-Experts Core & Router

At the center is a mixture-of-experts (MoE) transformer: a large parameter pool partitioned into many expert sub-networks, of which only a small fraction activate for any given token. This is what lets the model hold frontier-scale knowledge while keeping the per-token compute, and therefore the serving cost, bounded, which matters enormously when descendants must run cheaply on member devices. A learned router sits in front of the experts and, per token, selects which experts fire. The same routing philosophy operates at the macro level: a top-level dispatcher decides whether a query can be answered by the on-device Personal AI, needs the full Sovereign Core, or requires retrieval, tools, or a specialist head.
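Per-token routing can be illustrated with a small top-k gating sketch. It assumes one common variant (softmax over the selected logits only); the source does not state which gating function or value of k Hylon uses, and the function names are hypothetical.

```python
import math


def route_token(router_logits, k=2):
    """Pick the top-k experts for one token and return {expert_index: gate_weight}.

    Gate weights are a softmax over the selected logits only, so they sum to 1.
    """
    order = sorted(range(len(router_logits)),
                   key=lambda i: router_logits[i], reverse=True)
    top = order[:k]
    m = max(router_logits[i] for i in top)  # subtract max for numerical stability
    exps = {i: math.exp(router_logits[i] - m) for i in top}
    z = sum(exps.values())
    return {i: e / z for i, e in exps.items()}


def moe_output(token, experts, router_logits, k=2):
    """Weighted sum of the selected experts' outputs; unselected experts never run."""
    gates = route_token(router_logits, k)
    return sum(w * experts[i](token) for i, w in gates.items())
```

With four experts and k=2, only two expert functions are evaluated per token, which is the source of the bounded per-token compute described above.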

Retrieval & Long-Term Memory

The model does not rely on parametric memory alone. A native retrieval subsystem grounds responses in an owned corpus and in the user's own permitted context, while a long-term memory store gives the assistant durable, per-user recall across sessions, the substrate of a genuine personal assistant rather than a stateless chatbot. Because retrieval and memory are internal components rather than a third-party search API, there is no upstream vendor that can meter, throttle, or observe them, and the private tier of memory stays on the user's device inside L1/L2.

Tool Use & Agentic Orchestration

The Sovereign Stack executes tasks, not just conversations. A tool-use layer lets the model call functions, code execution, retrieval, and network services through a typed interface, and an agentic orchestration layer plans multi-step tasks, decomposes them, dispatches sub-tasks (including to specialist experts), and integrates results. This orchestration is the same machinery the Self-Improvement Engine uses to run its own research loop: the system is built to operate itself.
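A typed tool interface might look like the sketch below. The `Tool` and `ToolRegistry` names and the validation rules are hypothetical; the source says only that tool calls go through a typed interface.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class Tool:
    name: str
    params: Dict[str, type]   # parameter name -> expected Python type
    fn: Callable[..., Any]


class ToolRegistry:
    """Holds registered tools and validates arguments before dispatch."""

    def __init__(self):
        self._tools = {}

    def register(self, tool):
        self._tools[tool.name] = tool

    def call(self, name, **kwargs):
        tool = self._tools[name]
        if set(kwargs) != set(tool.params):
            raise TypeError(f"{name}: expected parameters {sorted(tool.params)}")
        for key, expected in tool.params.items():
            if not isinstance(kwargs[key], expected):
                raise TypeError(f"{name}.{key}: expected {expected.__name__}")
        return tool.fn(**kwargs)
```

Rejecting a malformed call before execution keeps planner errors from reaching code execution or network services.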

Specialist Heads & Experts

On top of the general MoE, Hylon trains specialist heads for the wedge domains where the network wins first: underserved languages (Farsi, Arabic, Russian, Turkish), censorship-circumvention knowledge, and personal-context reasoning. These specialists are trained on data that is structurally uncollectable by centralized labs, and they are the reason a Hylon model can beat a larger frontier model on the tasks that actually matter to Hylon's users while remaining competitive on general benchmarks.

"DOES EVERYTHING ITSELF"

Router, MoE experts, retrieval, memory, tools, orchestration, and specialist heads are all owned, native components. No sentence in a user's session depends on an external lab's endpoint. That is what makes the stack sovereign.

2.4 How the Sovereign Stack Is Built

Hylon does not train a frontier model from random initialization; that would be a capital-inefficient way to reach a starting point the open ecosystem already reached. Instead the stack is bootstrapped from the strongest permissively-licensed open-weight foundation model available at each generation and then continually improved on data no competitor can obtain.

How the Sovereign Stack is built: bootstrap from an open-weight foundation, compound via continual training + RL on the private data flywheel, with closed-model distillation engineered but gated off until a signed license exists.
01
Open-Weight Foundation
Start from the strongest permissively-licensed open model each generation (Gate 0 ships on one).
02
Continual Training + RL
Improve on the flywheel: real user tasks + uncollectable multilingual/censored-market signal (clears Gate 1).
03
Private Aggregation
Federated learning + differential privacy + secure aggregation, server sees only the aggregate.
04
Stronger Sovereign Core
Distilled back into Personal AIs, closing the loop; frontier reach gated on compute (Gate 3+).
05
Distillation (BUILT, GATED OFF)
Closed-model distillation infrastructure is inert until a signed license exists, never an active source.
  1. Open-weight foundation. Start from a top permissively-licensed open model as the base for both the Sovereign Core and, via distillation, the on-device Personal AIs. Gate 0 ships the entire stack, Personal AI + compute + ORB + DAO, as one self-hosted, private, uncensored product on such a model.
  2. Continual training + RL on the flywheel. The foundation is continually trained and reinforcement-learned on the network's own data: real user tasks scored for reward at the edge, multilingual and censored-market interactions that exist nowhere else, all delivered as privacy-preserving aggregate signal. This is the mechanism by which Gate 1 is cleared: a continually-trained model that beats every open model in the wedge domains and wins blind A/B tests against frontier models on personal-context tasks.
  3. Distillation, built, gated. The infrastructure to distill from other closed models is engineered and ready, but it is gated OFF and remains inert until a signed license from the relevant provider exists. Every major provider's terms of service prohibit training a competitor and enforcement is active; Hylon therefore treats distillation from closed models as a capability that switches on only with a contract, never as an active data source and never in a way that implies a ToS violation.
THE UNBUYABLE MOAT

A centralized lab can outspend Hylon on GPUs. It cannot legally or physically assemble a million consented Personal-AI data streams from restricted markets. Compute is a capital question; this data is not for sale at any price.

An honest note on the physics: the flywheel improves the model's data, but reaching the general frontier is ultimately a compute-and-capital question, not a data trick. Training nodes are datacenter GPUs, not phones, because each training node must hold the full model plus optimizer state in memory and because aggregate FLOPs scale with GPU count, which scales with capital. Devices contribute the things devices are good at (inference serving, on-device personalization, RL reward scoring, and data), and the Tier-C GPU fleet does the training, coordinated over ordinary internet links (INTELLECT-1 trained a 10B model across three continents on 127 to 935 Mbit/s links; bandwidth is a solved problem via DiLoCo/Streaming-DiLoCo/DisTrO/DeMo). The general-frontier program (Gate 3+) is therefore stated as conditional on treasury-funded compute and validated recipes crossing a threshold, and is claimed only once crossed.
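The memory argument can be made concrete with back-of-envelope arithmetic. The 16 bytes per parameter figure is an assumption, a commonly cited rule of thumb for mixed-precision Adam, and is not taken from the source; activations are excluded.

```python
def training_state_gb(params, bytes_per_param=16):
    """Approximate memory for weights, gradients, and Adam state (no activations).

    Assumed breakdown of 16 bytes/param: fp16 weights (2) + fp16 gradients (2)
    + fp32 master weights, momentum, and variance (4 + 4 + 4).
    """
    return params * bytes_per_param / 1e9


# A 10B-parameter model needs on the order of 160 GB of training state,
# far beyond the 1 to 3 GB budget quoted for on-device models.
ten_b = training_state_gb(10e9)
```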

2.5 Why Sovereignty Matters

Architectural sovereignty is not an aesthetic preference; it is the property that makes every other Hylon claim durable. Because the Sovereign Stack has no hard dependency on any upstream provider, no external party can throttle it, re-price it, deprecate it, or kill it. A network built on a foreign frontier API inherits that provider's rate limits, price changes, content policies, geographic restrictions, and business decisions, and any of those can be weaponized against exactly the censored-market users Hylon serves. A model that lives on Hylon's own weights, on Hylon's own compute, cannot be switched off by a vendor's terms-of-service update or a government's pressure on a foreign company.

Sovereignty compounds with the privacy architecture. Because the stack owns its retrieval, memory, tools, and inference path end-to-end, there is no upstream endpoint that observes user context, and the L1 privacy boundary is never punctured by a third-party call. And sovereignty is what makes the decentralization real: a DAO can only meaningfully govern a system it actually owns. Progressive decentralization of reward rates, treasury allocation, and protocol upgrades is coherent precisely because the underlying intelligence is not rented. In short, the layered architecture, the private data flywheel, the DAO, and the mission to out-compete the centralized labs all rest on the same foundation: Hylon is one AI the network genuinely owns, top to bottom.

03

Personal AI & Privacy by Design

Hylon's most defensible asset is not a model checkpoint; checkpoints can be copied, distilled, or outspent. It is the relationship between an individual and a piece of software that lives on their own hardware, learns their world locally, and never surrenders the raw material of that learning. This section specifies how that Personal AI works, the two-layer privacy architecture that separates a user's raw data from Hylon's global model by two irreversible steps, the federated learning protocol that carries only privacy-preserving signal upward, the adversarial threat model we defend against, and the lawful basis under which Hylon operates as a data controller. Nothing here relies on a legal fiction. We do not claim the GDPR fails to apply, and we do not invoke the household exemption. Privacy at Hylon is an engineering discipline backed by a documented lawful basis, a Data Protection Impact Assessment, and named cryptographic and statistical defenses.

3.1 The On-Device Personal AI

Every Hylon member device runs a compact Personal AI: a small language model, plus a retrieval and memory layer, that executes entirely on the user's own phone, laptop, or desktop. Its job is narrow and deep: to build a private, high-resolution model of one person's context, preferences, language, and recurring tasks, and to use that model to make Hylon's shared intelligence feel as though it was built for that individual alone. The Personal AI is the counterpart to the sovereign global model described elsewhere: the global model supplies broad capability, the Personal AI supplies the context that a centralized lab structurally cannot see.

Model classes

The Personal AI is deliberately small so that it can run continuously, offline, on consumer hardware without draining the device. Two model families are supported at launch, selected per device tier:

Compact dense (1 to 4B)

Standard transformer language models in the 1-to-4-billion-parameter class, quantized to 4-bit for phones and 8-bit or 16-bit on capable laptops. These run on modern mobile NPUs and Apple Silicon at interactive speed and fit in roughly 1 to 3 GB of memory when quantized.

BitNet ternary (b1.58)

Models whose weights are constrained to three values {-1, 0, +1}. BitNet b1.58 is trained in full precision (16-bit latent weights maintained under quantization-aware training); its advantage is realized entirely at inference, a ~2B-class BitNet model serves at roughly 45 tokens/second on an Apple M2 CPU in about 0.4 GB. This makes ternary models ideal for always-listening, low-power Personal AI on constrained devices.
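The footprint figures for both model families follow from simple arithmetic on parameter count and bits per weight. The sketch below counts weight storage only; runtime overhead such as activations and the KV cache is an additional cost not modeled here, which is one plausible reason the dense range is quoted as 1 to 3 GB.

```python
def weights_gb(params, bits_per_weight):
    """Storage for model weights alone, in gigabytes (10^9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


dense_small = weights_gb(1e9, 4)     # 1B parameters at 4-bit  -> 0.5 GB
dense_large = weights_gb(4e9, 4)     # 4B parameters at 4-bit  -> 2.0 GB
bitnet_2b = weights_gb(2e9, 1.58)    # 2B ternary at 1.58 bits -> about 0.4 GB
```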

HONEST SCOPE

BitNet's gains are inference-only. Ternary quantization does not reduce the cost of training the Personal AI or the global model; it reduces the cost of running already-trained weights on member hardware. We never claim otherwise.

Hardware-abstracted runtime

The Personal AI runs on a runtime that is hardware-abstracted from day one. The same model artifact executes across CPU, GPU, and mobile NPU backends without a rewrite, and the abstraction is deliberately built to accommodate a future neuromorphic backend for ultra-low-power always-on inference (a hardware roadmap tier; see the section on device tiers). A device advertises its capabilities (available accelerators, memory ceiling, thermal and power budget), and the runtime selects the appropriate model class, quantization, and execution backend. This is what lets a $200 Android phone and an M-series MacBook both run "the Personal AI" while executing very different artifacts underneath.
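Capability-driven selection could be sketched as below. Every name and threshold here is hypothetical: the source does not publish the selection policy, and the power budget is reduced to a single `on_battery` flag for brevity.

```python
from dataclasses import dataclass


@dataclass
class DeviceCaps:
    accelerators: set      # e.g. {"cpu", "npu"}
    memory_gb: float       # memory ceiling available to the runtime
    on_battery: bool       # crude stand-in for the thermal/power budget


def select_artifact(caps):
    """Return (model_class, quantization, backend). Thresholds are illustrative only."""
    # Prefer an NPU, then a GPU, and fall back to the CPU.
    backend = next((b for b in ("npu", "gpu") if b in caps.accelerators), "cpu")
    if caps.memory_gb < 2 or (caps.on_battery and backend == "cpu"):
        return ("bitnet-b1.58", "ternary", backend)
    if caps.memory_gb < 8:
        return ("dense-1-4b", "int4", backend)
    return ("dense-1-4b", "int8", backend)
```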

What the Personal AI learns, and where it stays

The Personal AI learns three broad kinds of thing, all of it derived from data that never leaves the device in raw form:

  • Context: the user's language(s) and dialect, domain vocabulary, the entities and projects that recur in their life, their timezone and rhythms, the applications and documents they work with (as the user chooses to expose them locally).
  • Preferences: tone, verbosity, formatting habits, which kinds of answers the user accepts or rewrites, safety and content boundaries they set for themselves.
  • Tasks: the concrete jobs the user repeatedly asks for (translation into an underserved language, drafting, summarizing local documents, circumventing a specific censored workflow) and the reward signal of whether the result was accepted, edited, or discarded.

Raw data (message contents, files, browsing, transcripts, keystrokes) is processed on-device only. It is used to fine-tune the local Personal AI (typically via lightweight LoRA adapters) and to populate an encrypted local memory/retrieval store. It is never transmitted, never uploaded, and is not recoverable by Hylon. The only thing that ever leaves the device is a privacy-preserving learned signal, described next.

The two-layer privacy data flow: raw data is dissolved into a model update on-device (Layer 1), then clipped, noised, and hidden inside a secure aggregate (Layer 2), leaving the global model two irreversible steps removed from any individual's raw data.
01
Raw data (on device)
Messages, files, transcripts, tasks, processed locally only, never transmitted, not recoverable by Hylon.
02
Personal AI learns (Layer 1)
Local LoRA/adapter fine-tune + encrypted memory. Output is a model update (gradient/weight delta), not data.
03
Clip + DP noise (Layer 2)
Per-update L2 clipping bounds influence; calibrated Gaussian noise applied, the differential-privacy step.
04
Compress + encrypt
Top-k / quantized update, encrypted for secure aggregation. ~10 to 100× uplink reduction.
05
Secure aggregation
Server computes only the cohort SUM; no individual update is ever visible, even to Hylon.
06
Global model update
Two layers removed from raw data: never sees raw inputs, never sees any single user's update.

3.2 The Two-Layer Privacy Model

Hylon's privacy architecture places two irreversible transformations between a user's raw data and anything Hylon's global model ever ingests. This is the structural reason the moat is defensible and the reason the privacy posture is credible rather than aspirational.

  1. Layer 1, Raw → Personal AI (on device). Raw data is consumed locally to update the Personal AI's adapters and memory. The output of this layer is a set of model updates (gradients or weight deltas), not data. Reconstructing raw inputs from a model update is already a hard inverse problem; this is the first barrier.
  2. Layer 2, Personal AI update → shared signal (privatized). Before any update leaves the device it is clipped, perturbed with calibrated noise (differential privacy), compressed, and encrypted for secure aggregation so that Hylon only ever sees a sum over many users, never any single device's contribution. This is the second barrier.

The global model is therefore two layers removed from any individual's raw data: it never sees raw inputs (Layer 1 dissolves them into a model update) and it never sees any single user's update (Layer 2 hides it inside an aggregate under noise). This is not a marketing framing: each layer corresponds to a concrete, testable mechanism, and the threat model in Section 3.4 treats each as an adversarial surface to be defended, not assumed.
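The clipping and noising half of Layer 2 can be sketched in a few lines. This is an illustration under assumptions: the function names are hypothetical, and how the noise is split between each device and the cohort as a whole is a design choice the source does not specify.

```python
import math
import random


def clip_l2(update, clip_norm):
    """Scale the update down so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def privatize(update, clip_norm, noise_multiplier, rng):
    """Clip, then add Gaussian noise with std = noise_multiplier * clip_norm."""
    clipped = clip_l2(update, clip_norm)
    sigma = noise_multiplier * clip_norm
    return [x + rng.gauss(0.0, sigma) for x in clipped]
```

Clipping comes first because the noise scale is calibrated against the clipping norm: without a bound on the update, no finite amount of noise gives a formal guarantee.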

THE UNBUYABLE MOAT

A centralized lab can outspend Hylon on compute. It cannot legally or physically assemble a corpus of a million users' private, multilingual, censored-market tasks, because that corpus never exists in collectable form anywhere. It lives only as on-device signal that is privatized before aggregation. Money cannot buy data that is never gathered.

3.3 Federated Learning Protocol

Hylon improves the global model from the fleet of Personal AIs using federated learning (FL): devices train locally and contribute only model updates, coordinated in synchronous rounds. The protocol combines four techniques (local training, update compression, secure aggregation, and differential privacy), each of which addresses a distinct requirement (utility, bandwidth, confidentiality, and formal privacy respectively).

The round lifecycle

  1. Selection. The coordinator samples a cohort of eligible devices for the round. Eligibility requires the device to be a consenting member, and, to respect app-store and battery constraints, typically idle, charging, and on an unmetered network. Sampling itself is randomized, which contributes to the privacy amplification bound.
  2. Distribution. Selected devices receive the current global model state (or the relevant adapter subset) for this round.
  3. Local training. Each device trains on its own private data for a small number of local steps, producing a weight-update / gradient delta relative to the distributed state.
  4. Clip & privatize. Each per-device update is clipped to a bounded L2 norm (bounding any one user's maximum influence) and calibrated Gaussian noise is added, the differential-privacy step.
  5. Compression. Updates are compressed (top-k sparsification, quantization, or low-rank/error-feedback schemes) to cut uplink bandwidth by one to two orders of magnitude, so that participation is feasible on ordinary consumer connections.
  6. Secure aggregation. Devices submit updates under a cryptographic secure-aggregation protocol (pairwise-masked or threshold, with dropout tolerance) so the coordinator can compute only the sum of the cohort's updates and learns nothing about any individual contribution; even the server sees only the aggregate.
  7. Global update. The coordinator applies the aggregated update, whose noise averages out in expectation, to the global model, records the round, and advances to the next round. Improvements flow back down as better global weights, which yield better Personal AIs, closing the flywheel.
One federated-learning round. Signal flows up as privatized aggregates; improved global weights flow back down, yielding better Personal AIs, the flywheel.
01
Selection
Coordinator samples a cohort of consenting, idle+charging, unmetered devices. Randomized sampling amplifies privacy.
02
Distribution
Selected devices receive current global model / adapter subset for the round.
03
Local training
Each device trains a few local steps on its own private data → weight-update delta.
04
Clip + privatize
Update clipped to bounded L2 norm; calibrated Gaussian noise added (differential privacy).
05
Compression
Sparsify / quantize with error feedback → 10 to 100× smaller uplink for consumer connections.
06
Secure aggregation
Cryptographic masking; server learns only the cohort SUM, dropout-tolerant.
07
Global update
Aggregate applied to global model, round recorded; better weights flow back to Personal AIs.
↺ loops back to 01
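The pairwise-masking idea behind the secure-aggregation step can be shown with a toy example. It is a sketch only: updates are assumed to be quantized to non-negative integers, a seeded generator stands in for each pair's shared secret, and dropout recovery (which real protocols need) is omitted.

```python
import random

MOD = 2 ** 32  # all arithmetic is modulo a fixed public modulus


def masked_updates(updates, seed=0):
    """Each pair (i, j) with i < j shares a random mask; i adds it and j subtracts it."""
    n, dim = len(updates), len(updates[0])
    masked = [[x % MOD for x in u] for u in updates]
    for i in range(n):
        for j in range(i + 1, n):
            # Hypothetical stand-in for a secret agreed between devices i and j.
            rng = random.Random(f"{seed}:{i}:{j}")
            for d in range(dim):
                m = rng.randrange(MOD)
                masked[i][d] = (masked[i][d] + m) % MOD
                masked[j][d] = (masked[j][d] - m) % MOD
    return masked


def aggregate(masked):
    """Server-side sum; the pairwise masks cancel, leaving only the cohort total."""
    return [sum(col) % MOD for col in zip(*masked)]
```

Each masked vector looks uniformly random on its own, yet the sum over the cohort equals the sum of the true updates, which is the property the round lifecycle relies on.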

Design parameters

Local steps / round

Small (e.g. one to several epochs over a tiny local batch) to limit client drift and bound per-round influence.

Cohort size

Large per round (thousands+). Larger cohorts both improve the aggregate signal-to-noise ratio and strengthen secure aggregation's hiding guarantee.

Clipping norm C

Fixed L2 bound per update; the sensitivity parameter that DP noise is calibrated against.

Noise multiplier σ

Governs the (ε, δ) privacy budget; tuned jointly with cohort size and sampling rate via privacy amplification.

Compression ratio

~10 to 100× uplink reduction via sparsification/quantization with error feedback.

Aggregation

Secure aggregation with dropout tolerance; server observes sums only, never individual updates.
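Top-k sparsification with error feedback, one of the compression schemes named above, can be sketched as follows. The function name and the dictionary encoding of the sparse update are assumptions for illustration.

```python
def topk_with_error_feedback(update, residual, k):
    """Send the k largest-magnitude entries of (update + residual).

    Entries that are not sent are carried forward in the new residual, so
    small components accumulate across rounds instead of being lost.
    """
    corrected = [u + r for u, r in zip(update, residual)]
    order = sorted(range(len(corrected)),
                   key=lambda i: abs(corrected[i]), reverse=True)
    keep = set(order[:k])
    sparse = {i: corrected[i] for i in keep}
    new_residual = [0.0 if i in keep else corrected[i]
                    for i in range(len(corrected))]
    return sparse, new_residual
```

Sending k of d entries gives roughly a d/k reduction before index overhead, so a 10 to 100× ratio corresponds to keeping about 1% to 10% of the coordinates.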

Federated learning here targets the personalization and adaptation layer, improving how the global model serves real tasks in real languages, and is distinct from the heavy pre-training of the sovereign core, which runs on the GPU training tier using low-communication training methods. Phones federate signal; GPUs do the FLOP-heavy training. We do not claim phones train the frontier model.

3.4 Threat Model & Mitigations

A privacy claim is only as strong as the attacks it survives. We treat the shared model update as an adversarial surface and defend against the three canonical inference attacks on federated systems. Critically, we do not assume that a model update, or the resulting weights, are anonymous by default.

EDPB Opinion 28/2024

The European Data Protection Board has stated that model parameters are not presumed anonymous, trained weights can, in the wrong conditions, constitute personal data. Hylon therefore treats anonymity as something to be engineered and assessed, never assumed. The mitigations below exist precisely because the naive assumption is false.

Attack | What the adversary attempts | Hylon mitigation
Membership inference | Determine whether a specific person's data was in the training set for a round or model. | Differential privacy (calibrated noise + clipping) provides a formal (ε, δ) bound on how much any single record can change the output; privacy amplification via subsampling; large cohorts. DP is the direct, formal defense against membership inference.
Model inversion | Reconstruct representative raw inputs (e.g. a face, a sentence) from the model or its updates. | Two-layer separation (raw never leaves Layer 1); per-update clipping + DP noise degrades reconstruction fidelity; only aggregates are ever exposed, so no single user's update is invertible in isolation.
Gradient leakage | Recover training samples from an individual gradient/weight update in transit or at the server. | Secure aggregation ensures the server never sees an individual update, only the cohort sum. DP noise is applied before aggregation; compression/sparsification further reduces recoverable signal. Transport is encrypted end-to-end.

The defenses compose: secure aggregation removes the per-user update from view, differential privacy bounds what the aggregate can reveal about any member of the cohort, clipping bounds worst-case influence, and the two-layer architecture ensures raw data is never the object being transmitted in the first place. High-assurance contributions, where a stronger guarantee is warranted, can additionally be executed inside Trusted Execution Environments (TEEs) on capable hardware, so that even the local training step runs in an attested enclave. We explicitly do not claim zero-knowledge proofs of training: zkML cannot prove training at scale in 2026, and we make no such claim.

The privacy-utility trade-off, stated honestly

Differential privacy is not free. Every unit of formal privacy (smaller ε) costs some model utility, because the noise that hides individuals also blurs signal. Hylon manages this budget explicitly: we tune the noise multiplier, clipping norm, cohort size, and sampling rate together, we track cumulative privacy spend across rounds with a formal accountant, and we publish the operating regime. We do not pretend the trade-off is absent; we engineer it to a defensible point and document it.
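The cost of a smaller ε can be seen in the classic Gaussian-mechanism calibration, sketched below for a single release. This is the textbook bound, valid only for 0 < ε < 1, and is offered as an illustration: the source does not state which calibration or privacy accountant Hylon uses, and multi-round budgets require an accountant rather than this one-shot formula.

```python
import math


def gaussian_sigma(epsilon, delta, clip_norm):
    """Noise std satisfying (epsilon, delta)-DP for one release with L2 sensitivity clip_norm.

    Uses sigma = clip_norm * sqrt(2 * ln(1.25 / delta)) / epsilon,
    the classic bound, which holds for 0 < epsilon < 1.
    """
    if not 0 < epsilon < 1:
        raise ValueError("this bound is only valid for 0 < epsilon < 1")
    if not 0 < delta < 1:
        raise ValueError("delta must be in (0, 1)")
    return clip_norm * math.sqrt(2 * math.log(1.25 / delta)) / epsilon
```

Halving ε doubles the required noise, which is the trade-off in its simplest form; larger cohorts and subsampling are how the protocol claws utility back.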

3.5 The Lawful Posture

Hylon acts as a data controller for the federated signal it aggregates, and it processes on a documented lawful basis. We reject two shortcuts that other projects lean on and that would not survive scrutiny:

  • We do not claim "the GDPR does not apply." It does, and we build accordingly.
  • We do not invoke the household/personal-use exemption to avoid controller obligations. That exemption does not cover a network aggregating signal across a million users, and asserting it would be false.

Legitimate interests + DPIA

The processing of privacy-preserving learned signal rests on the legitimate-interests lawful basis (GDPR Art. 6(1)(f)), supported by a documented legitimate-interests assessment and a Data Protection Impact Assessment (DPIA) that records the purpose, the data flows, the risks (the three attacks above), and the mitigations (two-layer separation, FL, DP, secure aggregation, TEEs). The DPIA is a living document, revisited as the protocol and its parameters evolve. Members retain the transparency, objection, and rights machinery the GDPR requires; because raw data never leaves the device, much of the risk surface that ordinarily drives those obligations is structurally reduced, but the obligations are met, not waived.

POSTURE, NOT LOOPHOLE

Hylon's privacy story is: raw data pinned to the device by architecture; only privatized aggregate signal shared; a controller operating under legitimate interests with a DPIA and named defenses; and an explicit acknowledgment (EDPB 28/2024) that weights can be personal data, which is why the defenses exist. This is a posture that a regulator can inspect, not a claim that the law does not reach us.

Together, the on-device Personal AI, the two-layer separation, the federated protocol, and the assessed lawful basis form a system whose privacy is verifiable at each stage and whose data flywheel is, by construction, something no better-funded centralized competitor can legally or physically reproduce.

04

The Data Flywheel & Knowledge Acquisition

4.1 The Core Thesis: A Moat You Cannot Buy

Every centralized laboratory competes on three axes: capital (how many GPUs you can buy), talent (who designs the model), and data (what the model learns from). The first two are, ultimately, purchasable. A better-funded competitor can always out-spend a smaller one on compute, and salaries are a market. Data is the one axis where money is not sufficient, because the most valuable data in the world is data that cannot legally or physically be collected. That is the axis Hylon is built to win.

Hylon's durable advantage is not a model architecture; architectures are published within months and diffuse freely across the open-weight ecosystem. The durable advantage is a compounding, self-reinforcing data flywheel anchored in a live product with real users in markets that centralized labs structurally cannot serve. A million Personal AIs, compact on-device models running on member phones and computers (Device Tiers A and B), generate a continuous stream of reinforcement signal from real tasks that real people care about, in languages and cultural contexts that are chronically underrepresented (Farsi, Arabic, Russian, Turkish, and dozens more), inside censored markets a US or EU lab cannot even legally operate in. This signal is aggregated privately, distilled into a stronger global model, and pushed back down as better Personal AIs, which attract more users, who generate more signal. The loop tightens with every turn.

THE MOAT IN ONE SENTENCE

A centralized lab can out-spend Hylon on compute and out-hire it on talent, but it cannot legally or physically assemble the multilingual, in-context, censored-market reinforcement data that a million consenting Personal AIs produce from real daily tasks. Compute is a capital question; this data is not for sale.

This is not a theoretical claim about a network that might exist. Hylon inherits an operating business, OrbNet/OrbVPN, with paying users concentrated precisely in the restricted markets (Iran, Russia, and comparable jurisdictions) that constitute the wedge. The distribution channel, the user relationship, the payment rails, and the on-chain reward infrastructure already exist and already carry real revenue. The flywheel does not need to be bootstrapped from zero; it needs to be instrumented on top of a userbase that is already there.

4.2 Flywheel Mechanics

The flywheel is a closed loop of six stages. Each stage is a real engineering subsystem, not a slogan. We describe each in turn, then the diagram summarizes the loop.

The Data Flywheel: each turn produces more and better data, which a later competitor cannot retroactively acquire.
01
1M Personal AIs at the edge
Compact on-device models (1-4B / BitNet ternary) learn each user's context locally; raw data never leaves the device.
02
RL signal from real tasks
Accept/edit/reject, task success/failure, corrections, reward grounded in genuine human intent, not a static crawl.
03
Uncollectable multilingual & censored-market data
Farsi, Arabic, Russian, Turkish task-grounded signal from markets centralized labs cannot legally serve.
04
Private aggregation
Federated learning + differential privacy + secure aggregation; server sees only the summed, noised gradient of skill.
05
Stronger global model
Validated signal folded into the next checkpoint via the champion-challenger gate; gains land in wedge domains.
06
Better Personal AIs → more users
Improved model distilled back to the edge; sharper product retains and attracts users, loop closes and accelerates.
↺ loops back to 01

Stage 1, One million Personal AIs at the edge

Every member device runs a compact Personal AI: a 1-4B-parameter-class model, or a ternary BitNet b1.58-style model that serves in roughly 0.4 GB and generates on the order of tens of tokens per second on a laptop CPU. This model learns the user's context entirely locally, their tasks, their language register, their domain, their preferences, through on-device personalization and lightweight LoRA adapters. Raw user data never leaves the device. The Personal AI is simultaneously the product the user experiences and the sensor that observes which behaviours are useful.

Stage 2, Reinforcement signal from real tasks

Unlike a lab scraping a static web corpus, Hylon observes outcomes. A user accepts, edits, rejects, or re-prompts a response; a multi-step task succeeds or fails; a translation is corrected; a tool call returns the right answer. Each of these is a reward signal grounded in real human intent. This is the raw material of reinforcement learning from genuine use, and it is qualitatively different from crawled text: it captures not just what people say but what actually helps them, in-distribution for the exact tasks the network is optimized to serve.
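On-device reward scoring could be as simple as mapping each observed outcome to a scalar. The outcome names and the numeric values below are hypothetical placeholders chosen for illustration; the source does not define a reward scale.

```python
from enum import Enum
from statistics import mean


class Outcome(Enum):
    ACCEPTED = "accepted"
    EDITED = "edited"
    REPROMPTED = "reprompted"
    REJECTED = "rejected"


# Illustrative values only: an edit counts as partial success, a re-prompt as mild failure.
REWARD = {
    Outcome.ACCEPTED: 1.0,
    Outcome.EDITED: 0.5,
    Outcome.REPROMPTED: -0.5,
    Outcome.REJECTED: -1.0,
}


def task_reward(outcomes):
    """Average reward over the outcomes observed for one task; 0.0 if none were seen."""
    if not outcomes:
        return 0.0
    return mean(REWARD[o] for o in outcomes)
```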

Stage 3, Uncollectable multilingual and censored-market data

The tasks flowing through the network are disproportionately in underserved languages and originate disproportionately in markets that centralized labs cannot reach. This produces a corpus of learned signal that no web crawl contains and no data broker sells: idiomatic Farsi technical writing, Russian-language circumvention workflows, Arabic dialectal nuance, Turkish administrative and legal register, grounded in what real users in those markets are actually trying to accomplish. This is the content of the moat.

Stage 4, Private aggregation

Signal is combined across the fleet using federated learning with differential privacy and secure aggregation. Devices compute privacy-preserving updates locally; those updates are clipped and noised to bound any individual's contribution, then combined so that the aggregation server observes only the sum, never any individual update. The central training system receives a learned gradient of general skill, not a pile of personal messages. (Section 3 details the privacy architecture, lawful basis, and defenses against membership-inference and model-inversion; consistent with EDPB Opinion 28/2024, we do not presume weights are anonymous and engineer accordingly.)

Stage 5, A stronger global model

The aggregated signal feeds the continual-training pipeline on the Tier-1 GPU cluster (Section 5). Validated recipes fold the new signal into the next global checkpoint, which is evaluated against the champion under the champion-challenger release gate (Section 6). Because the signal is grounded in real wedge-domain tasks, gains land exactly where the network competes, not on generic benchmarks alone but on the personal-context and multilingual evaluations that define the wedge.

Stage 6, Better Personal AIs → more users → back to Stage 1

The improved global model is distilled back down into better on-device Personal AIs. Users feel their assistant get sharper at their tasks in their language. A better product retains and attracts users; more users generate more signal; the loop closes and accelerates. This is the compounding property: each turn of the wheel makes the next turn produce more and better data, which a competitor entering later cannot retroactively acquire.

WHY IT COMPOUNDS

The output of each cycle (a better model) increases the input of the next cycle (more users generating more signal). Static-corpus training does not compound this way: a scrape is a one-time asset that decays. A live task-reward loop is an appreciating asset.

4.3 The Knowledge Acquisition System

The flywheel is the network's primary and most defensible knowledge source, but it is not the only one. Hylon operates a broader Knowledge Acquisition System that ingests from multiple sources under a single disciplined pipeline. The governing principle is that nothing is accepted blindly: every candidate piece of knowledge, whether a fact, a skill, a recipe, or a data source, passes through a critical-learning pipeline that understands it, questions it, tests it, verifies it, and only then integrates it.

Knowledge sources, in priority order

Personal-AI network (primary)

Privacy-preserving reinforcement signal from real user tasks. The unbuyable, compounding source. Highest strategic value because it is exclusive to Hylon.

Open-weight foundation models

The strongest permissively-licensed open models form the base of the sovereign stack. Their weights are legally reusable; Hylon continually trains and RL-tunes on top of them.

Licensed & public data

Openly-licensed corpora, public-domain text, and explicitly licensed datasets, used under their stated terms, with provenance tracked.

Research literature

The published literature on methods and recipes, ingested by the Self-Improvement Engine (Section 6) to propose and cheaply ablate training candidates.

Closed-model distillation (GATED)

Built but switched OFF. The capability to distill from third-party closed models exists in the codebase but is disabled by policy and remains disabled until a signed license from the provider exists. Every major provider's terms of service currently prohibit training competing models, and enforcement is active; Hylon does not operate this path absent explicit written permission.

HONEST GATING

Closed-model distillation is engineered so it can be enabled instantly if and when a license is signed, but it is never described as active, never relied upon in any roadmap gate, and never operated in a way that would breach a provider's terms. The moat does not depend on it.

The critical-learning pipeline

Because sources vary in reliability (a research claim may not replicate, a web fact may be wrong, an aggregated signal may reflect noise or an adversarial contributor), every candidate passes through a five-stage evaluation before it can influence the global model. This is the same discipline whether the input is a literature-derived recipe or an aggregated reinforcement signal.

The critical-learning pipeline: nothing is accepted blindly, cheap tests are the filter, only validated knowledge is integrated.
  1. Understand. Parse the candidate into a structured claim or skill: what exactly is being asserted, in what domain, under what conditions, from what source, with what provenance and licensing status.
  2. Question. Interrogate it against existing knowledge. Does it contradict what the model already holds with high confidence? Is the source reliable? Is it internally consistent? Could it be adversarial or an artifact of a poisoned contributor?
  3. Test. Run cheap, small-scale validation. For recipes, this is a small ablation (the Self-Improvement Engine's core discipline: cheap experiments find the recipe before any expensive run). For factual or skill claims, this is held-out evaluation and cross-checking against independent sources.
  4. Verify. Require statistical significance and independent corroboration. Aggregated signals must survive robust-aggregation filters that bound the influence of any single contributor. Nothing is promoted on a single noisy observation.
  5. Integrate. Only validated knowledge is folded into training, and even then only via the champion-challenger gate: the resulting model must beat the live champion by a defined margin with zero safety regressions before it ships (Section 6).
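One common way to bound the influence of any single contributor, as the Verify stage requires, is a coordinate-wise trimmed mean. The sketch below is an illustration of that general technique, not a statement of which robust aggregator Hylon uses; the function name and the `trim_fraction` parameter are hypothetical.

```python
import numpy as np

def trimmed_mean(updates, trim_fraction=0.1):
    """Coordinate-wise trimmed mean over contributors (one contributor per row).

    For every coordinate, the lowest and highest trim_fraction of values are
    discarded before averaging, so a few extreme contributors cannot move
    the aggregate arbitrarily far.
    """
    updates = np.asarray(updates, dtype=float)
    n = updates.shape[0]
    k = int(np.floor(trim_fraction * n))
    if 2 * k >= n:
        raise ValueError("trim_fraction removes every contributor")
    ordered = np.sort(updates, axis=0)  # sort each coordinate independently
    return ordered[k:n - k].mean(axis=0)
```

With nine honest contributors and one adversarial outlier, the plain mean is dragged toward the outlier while the trimmed mean stays at the honest value.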
DISCIPLINE, NOT CREDULITY

The pipeline is deliberately skeptical. Cheap experiments are the filter; only validated recipes justify a $5-20M training run. Knowledge that fails any stage is logged and discarded, not silently absorbed. This is what "critical learning" means in engineering terms.

4.4 What Is Learned vs. What Is Never Received

The single most important boundary in the system is the line between skills and patterns (which the network learns) and raw content and identifiable information (which the central system never receives). This boundary is what makes the moat both defensible and lawful. It is enforced by architecture (federated learning, differential privacy, and secure aggregation), not merely by policy promise.

The enforced boundary: the network learns skills and patterns; the central system never receives raw content or identity.
| Dimension | Learned by the network (privacy-preserving signal) | NEVER received by the central system (stays on device) |
|---|---|---|
| Language | General idiomatic patterns, register, dialectal skill in a language | The actual sentences a specific user wrote |
| Tasks | Which kinds of tasks succeed with which approaches | The specific task content, documents, or files of any individual |
| Preferences | Aggregate patterns in what responses help users | Any individual's identifiable preferences, history, or profile |
| Corrections | Generalizable reward signal (what direction improves quality) | The verbatim correction text or its personal subject matter |
| Identity | Nothing tied to identity; updates are clipped, noised, and summed | Names, contacts, locations, device identifiers, raw personal data |

Concretely: the network can learn that a certain phrasing strategy improves technical Farsi explanations without ever seeing a single Farsi document any user wrote. It learns the gradient of skill, not the text of experience. Differential privacy bounds how much any one user's data can influence the result; secure aggregation ensures the server sees only the combined update; and the defenses described in Section 3 protect against reconstruction of individual contributions via membership-inference or model-inversion.

4.5 Why This Is Unbuyable at a Centralized Lab

A well-funded competitor reading this section will ask the obvious question: why can't we just do the same thing? The answer is that the barrier is not technical cleverness; it is structural, split across legal and physical constraints that capital cannot dissolve.

The legal barrier

The wedge markets are, by definition, markets a US or EU lab cannot lawfully operate a consumer data-collection product in. Sanctions and export-control regimes restrict American and European companies from doing business in several of the exact jurisdictions (Iran, and comparable markets) where the most valuable circumvention-context data originates. A centralized lab that tried to collect raw personal task data at this scale would also face the full weight of comprehensive privacy law, the very regime Hylon addresses through on-device processing, a documented legitimate-interests lawful basis, a DPIA, and federated/DP/secure-aggregation architecture. Collecting the raw data centrally is precisely the thing that is not lawful; Hylon's design wins by never collecting it.

The physical barrier

Even setting law aside, the data does not exist in any acquirable form. There is no dataset to purchase that contains real, task-grounded reinforcement signal from a million users in censored markets doing their actual daily work. It is produced continuously by a live relationship with a live userbase, and that relationship is the asset OrbNet/OrbVPN already owns and Hylon inherits. A competitor cannot buy it because there is no seller; it can only be grown, over years, by operating a trusted product in exactly those markets. Hylon has a multi-year head start on that relationship.

THE ASYMMETRY

A centralized lab can replicate Hylon's architecture in a quarter. It cannot replicate Hylon's data position at any price, because the data is (a) unlawful for it to collect centrally, (b) originates in markets it cannot legally serve, and (c) exists only as the output of a live product relationship it does not have. Capital buys compute; it does not buy this.

Positioning against the DePIN field

It is worth being precise about what this is and is not. Hylon is not claiming to earn revenue from consumer-phone FLOPs (no decentralized network does), and the sector as a whole earned only ~$72M in on-chain revenue in 2025 against ~$10B in market capitalization. Grass (~$330M cap) monetizes by selling bandwidth and web data to AI labs; Acurast (~255k phones) leads smartphone compute with demand still unproven; Salad leads consumer gaming-PC compute (~$10M ARR, no token). Hylon's phones are valued for what phones are genuinely good at (private inference and unbuyable data), while the real training happens on GPUs (Tier 1). The data flywheel is the mechanism by which a phone fleet produces something a lab will actually pay for and cannot otherwise obtain: not raw compute, but a proprietary reinforcement signal from markets no one else can reach.

4.6 Summary

The Data Flywheel is the answer to the hardest question any decentralized-AI project faces: what do you have that a trillion-dollar lab cannot simply buy? Hylon's answer is a compounding loop: a million consenting Personal AIs producing privacy-preserving reinforcement signal from real tasks in underserved languages and censored markets, aggregated privately into a stronger global model that makes the Personal AIs better and attracts more users. The Knowledge Acquisition System feeds this loop from disciplined, critically evaluated sources, with closed-model distillation built but gated off until licensed. What the network learns is skill; what it never receives is raw personal content. And the reason it holds: the data is unbuyable, not because of a clever trick, but because collecting it centrally is unlawful, its markets are unreachable, and it exists only as the living output of a product relationship Hylon already owns.

05

Two-Tier Decentralized Compute, Training & Verification

Hylon separates the two things AI compute actually requires and refuses to pretend they are the same problem. Training a frontier model is a high-FLOP, high-memory, capital-bound task that belongs on datacenter-class GPUs. Serving inference, personalizing on-device, scoring RL rewards, and generating data is a massively parallel, low-per-node task that a million heterogeneous consumer devices do superbly. Most DePIN projects blur these together and promise "training on your phone," which is physically false and has produced the sector's credibility deficit. Hylon's architecture is built on the opposite premise: a two-tier compute fabric in which each tier does only what its hardware is genuinely good at, and every unit of work, on either tier, is metered and cryptographically or economically verified before it is paid. There are no idle-uptime rewards anywhere in the system.

Hylon's two-tier compute fabric: a capital-bound GPU training tier and a massively parallel device tier, each doing only what its hardware is good at.
  • Hylon Compute Fabric
    Hardware-abstracted runtime: one model graph, many backends (CPU/GPU/NPU/neuromorphic)
    • TIER 1, Training
      Datacenter-class GPUs (8xH100 / 8xB200). Memory- and FLOP-bound. Geo-distributed over the internet via low-communication training.
      • DAO core cluster
        Treasury-owned, revenue-funded. Trusted floor of capacity for champion-challenger releases.
      • Staked operators
        Third-party GPU nodes; ORB collateral + hardware attestation; slashed on dishonesty.
    • TIER 2, Devices (~1M)
      Phones, PCs, Macs from OrbNet base. Massively parallel, low-per-node. Metered + verified, never idle-uptime.
      • Inference serving
        ~2/3 of AI compute demand; NPU/Apple Silicon; 1-bit models fit cheaply.
      • Personalization / LoRA
        Local adapters on user context; raw data never leaves device.
      • RL reward scoring
        Evaluate outputs vs real user tasks; drives the flywheel.
      • Data / labeling
        Privacy-preserving signal via FL + DP + secure aggregation.

5.1 Tier 1, Geo-Distributed Training on GPU Clusters

Tier 1 is where models are actually trained. It is a fleet of datacenter-class GPU nodes (8×H100 or 8×B200-class servers, each with fast intra-node NVLink/NVSwitch interconnect) that are physically distributed across regions and data centers but coordinated over the public internet. Two ownership classes populate this tier: a DAO-treasury-owned core cluster that Hylon controls directly and funds from service revenue, and operator-staked nodes contributed by third parties who post ORB collateral, pass hardware attestation, and earn revenue-throttled rewards for verified training work. The core cluster guarantees a floor of trusted, always-available capacity for the champion-challenger release pipeline (Section 6); staked operators provide elastic scale.

The reason this works over ordinary internet links, and not a purpose-built InfiniBand fabric, is a family of low-communication distributed training algorithms that have matured from research into production over 2024 to 2025. Classical data-parallel training synchronizes gradients every single step, which for a large model means exchanging the full gradient (hundreds of gigabytes) hundreds of thousands of times. That is the regime that demands 400 to 800 Gbps interconnects and colocated racks. The new algorithms break that assumption.

The low-communication training stack

DiLoCo

Distributed Low-Communication training. Each node runs many inner optimizer steps locally (typically 500) before a single outer synchronization. Reduces communication frequency by ~500×, so nodes talk hundreds of times per run instead of hundreds of thousands.

Streaming DiLoCo

Synchronizes subsets of parameters on a rolling schedule and overlaps communication with computation, further flattening the bandwidth spike and removing the periodic all-reduce stall.

DisTrO / DeMo

Decoupled Momentum optimizers that compress what is exchanged by 1 to 2 additional orders of magnitude by transmitting only the momentum residual, pushing total communication reduction into the 1,000 to 10,000× range versus naive all-reduce.

FP8 / low-precision comms

Outer-step exchanges are quantized, cutting bytes-on-the-wire again with negligible convergence impact.
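The DiLoCo pattern above (many local steps, then one outer synchronization) can be illustrated on a toy problem. This is a minimal sketch under stated assumptions: each worker minimizes a simple quadratic loss, plain SGD stands in for the inner optimizer for brevity, and the outer step applies Nesterov momentum to the averaged parameter delta. The function names and the default hyperparameters are illustrative, not Hylon's configuration.

```python
import numpy as np

def inner_steps(theta, target, steps, lr):
    """Run local SGD on the toy loss 0.5 * ||theta - target||^2."""
    local = theta.copy()
    for _ in range(steps):
        local -= lr * (local - target)  # gradient of the toy loss
    return local

def diloco(targets, rounds=20, inner=500, inner_lr=0.01,
           outer_lr=0.7, momentum=0.9):
    """Each entry of targets is one worker's data shard (toy stand-in)."""
    theta = np.zeros_like(targets[0])
    velocity = np.zeros_like(theta)
    syncs = 0
    for _ in range(rounds):
        # Workers train independently with no communication.
        local_params = [inner_steps(theta, t, inner, inner_lr) for t in targets]
        # One synchronization: average the workers and form the outer "gradient".
        pseudo_grad = theta - np.mean(local_params, axis=0)
        syncs += 1
        # Outer Nesterov momentum update on the pseudo-gradient.
        velocity = momentum * velocity + pseudo_grad
        theta = theta - outer_lr * (pseudo_grad + momentum * velocity)
    return theta, syncs
```

With `inner=500`, the workers exchange parameters once per 500 local steps, which is where the ~500× reduction in communication frequency comes from. On this toy problem the outer parameters converge to the mean of the workers' targets.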

These are not paper results. INTELLECT-1 (a 10B-parameter model, Prime Intellect, late 2024) was trained across three continents over commodity links measured at 127 to 935 Mbit/s, achieving 83 to 96% compute utilization despite the geographic spread. Nous Research's Psyche network trained a 40B model over the internet using DisTrO-class optimizers on ~8×H100 nodes. Templar's Covenant-72B demonstrated permissionless, incentivized distributed training at the 72B scale on ~8×B200-per-peer hardware. The engineering question "can you train a serious model across the internet?" has been answered: yes, repeatedly, in public.

HONEST CAVEAT

Prime Intellect's largest run, INTELLECT-3, used a centralized 512×H200 cluster, because at the current frontier, aggregating enough FLOPs in one coordinated place is still the fastest path when you own the hardware. Geo-distribution is proven; it is not always the cheapest option when you already control a colocated cluster. Hylon uses both: a colocated DAO core plus geo-distributed staked operators. We do not claim distribution is free; we claim it is viable, which is what matters for decentralization.

5.2 The Bandwidth Myth, Debunking "5,000 Years"

A recurring argument against decentralized training invokes a figure like "it would take 5,000 years to train a large model over home internet." That number is real arithmetic applied to a strawman nobody actually builds. It assumes naive full-gradient synchronization every step: exchanging the entire gradient tensor across the whole cluster on every one of hundreds of thousands of optimizer steps. Under that assumption the total bytes transferred are astronomical and, divided by a home uplink, produce absurd wall-clock times. No serious distributed-training system operates this way; it is the equivalent of "estimating" web latency by assuming every packet is re-sent a million times.

When you replace naive all-reduce with DiLoCo-class synchronization (every ~500 steps) and DeMo/DisTrO compression, the communication requirement collapses by 400× to 10,000×. The per-node link requirement lands at roughly 100 Mbps to 1 Gbps, squarely inside what a business fiber line, or even a good residential connection, delivers. This is precisely the band INTELLECT-1 ran in (127 to 935 Mbit/s). Bandwidth is not the binding constraint on decentralized training in 2026. It is a solved problem.
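The size of that collapse can be checked with back-of-envelope arithmetic. The sketch below is illustrative only: the model size, step count, and link speed are hypothetical inputs, and the calculation counts one model-sized exchange per synchronization per node, ignoring all-reduce topology, overlap with computation, and protocol overhead.

```python
def sync_traffic_bytes(params, bytes_per_value, total_steps,
                       steps_per_sync, compression=1.0):
    """Bytes one node sends over a run, at one model-sized exchange per sync."""
    syncs = total_steps // steps_per_sync
    return params * bytes_per_value * syncs / compression

def transfer_days(total_bytes, link_mbps):
    """Wall-clock days needed to push total_bytes through a link of link_mbps."""
    seconds = total_bytes * 8 / (link_mbps * 1e6)
    return seconds / 86_400

# Hypothetical run: 10B parameters, 2 bytes per value, 100,000 optimizer steps.
naive = sync_traffic_bytes(10e9, 2, 100_000, steps_per_sync=1)
diloco = sync_traffic_bytes(10e9, 2, 100_000, steps_per_sync=500)

print(f"naive:  {transfer_days(naive, 500):.1f} days of pure transfer at 500 Mbps")
print(f"DiLoCo: {transfer_days(diloco, 500):.2f} days of pure transfer at 500 Mbps")
print(f"reduction: {naive / diloco:.0f}x")
```

Passing `compression` greater than 1 models the further reduction from DeMo/DisTrO-style compression or quantized exchanges on top of the lower synchronization frequency.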

Per-node bandwidth required, naive full-gradient synchronization vs. low-communication training. The '5,000 years' argument assumes the naive strawman; real systems live in the 100 Mbps-1 Gbps band, exactly where INTELLECT-1 trained across three continents.

| Configuration | Mbps required per node |
|---|---|
| Naive all-reduce (strawman, every step) | 400,000 |
| DiLoCo (~500 inner steps) | 800 |
| Streaming DiLoCo (overlapped subsets) | 400 |
| DisTrO / DeMo (compressed momentum) | 150 |
| INTELLECT-1 actual link (measured) | 935 |

(Original chart plotted on a log scale; lower is easier.)

What actually constrains training

Removing the bandwidth red herring exposes the two real constraints, and being honest about them is central to Hylon's credibility:

Per-node MEMORY

In DiLoCo-style training, every participating node must hold the entire model plus its optimizer state (Adam moments roughly triple the parameter memory). A 70B model in mixed precision needs on the order of hundreds of gigabytes of VRAM per node. That is a multi-GPU server, not a phone. This is the physics reason training lives on Tier 1 and never on Tier 2.

Aggregate FLOPs = GPU count = CAPITAL

Total training throughput is the sum of the FLOPs across all nodes. More capable models require more aggregate compute, which requires more GPUs, which requires more money. There is no algorithmic trick that removes this. "Out-compute the centralized labs" is ultimately a treasury question, not a networking question, which is why Hylon's economics (Section 8) are engineered to convert real service revenue into GPU capital over time.
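The per-node memory constraint can be made concrete with simple arithmetic. The sketch below follows the rule of thumb stated above (weights plus two Adam moments is roughly three copies of the parameters) and deliberately excludes gradients and activations, so it is a lower bound. The function names are hypothetical, and the 80 GB per-GPU figure and the 12 GB phone are illustrative assumptions.

```python
def training_state_gb(params, bytes_per_value=2, copies=3):
    """Lower bound on training state: weights + two Adam moments = 3 copies.

    Excludes gradients, activations, and any higher-precision master weights.
    """
    return params * bytes_per_value * copies / 1e9

def fits(params, node_memory_gb, **kwargs):
    """True if the lower-bound training state fits in the node's memory."""
    return training_state_gb(params, **kwargs) <= node_memory_gb

server_gb = 8 * 80   # assumption: 8 GPUs with 80 GB each
phone_gb = 12        # assumption: a high-end phone's total RAM

print(training_state_gb(70e9))                     # 16-bit state for a 70B model
print(training_state_gb(70e9, bytes_per_value=4))  # the same state in 32-bit
print(fits(70e9, server_gb), fits(70e9, phone_gb))
```

Even under the most favorable 16-bit assumption, a 70B model's training state is hundreds of gigabytes, which is multi-GPU-server territory and far beyond any consumer device.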

THE HONEST FRAME

Bandwidth: solved. Memory: dictates that training nodes are GPUs, not phones. FLOPs: dictate that reaching the general frontier is a capital-accumulation program gated on revenue and validated recipes (Gate 3+), never a dated promise. We state the hard constraints plainly because pretending they don't exist is exactly what discredited the DePIN sector.

5.3 Tier 2, One Million Devices Doing What Devices Do Well

Tier 2 is the large end of the network: on the order of a million phones, PCs, and Macs drawn from OrbNet's existing user base. Crucially, Tier 2 is not a training substrate and Hylon never markets it as one. It performs the four workloads where consumer hardware is genuinely competitive and where aggregate scale beats any single datacenter:

  • Inference serving. Serving the global model and specialist heads to users. Inference is roughly two-thirds of all AI compute demand, is embarrassingly parallel, and runs well on modern NPUs, Apple Silicon, and consumer GPUs. Quantized and 1-bit models (Section 5.5) make even large served models fit cheaply onto member devices.
  • On-device personalization / LoRA. Each device fine-tunes its compact Personal AI (Section 3) on the user's local context, training small low-rank adapters. This work is small-scale and local, and the raw data never leaves the device: exactly the workload a phone or laptop can do.
  • RL reward scoring. Devices evaluate model outputs against real user tasks and preferences, generating the reward signal that drives the data flywheel. This is inference-shaped work, cheap per unit, valuable in aggregate.
  • Data generation & labeling. Privacy-preserving learned signal, the uncollectable multilingual and censored-market data that constitutes Hylon's moat, is produced here via federated learning with differential privacy and secure aggregation.

Every Tier 2 contribution is metered against a verifiable proof of useful work, never against uptime or "device online" heartbeats. A device is paid for inferences actually served (and verified), rewards actually scored, gradients actually contributed, not for being switched on. This single discipline is what separates Hylon's economy from the idle-uptime schemes that flooded the sector with unverified capacity (io.net's 327k registered vs. 6.7k verified GPUs is the cautionary example).

5.4 Verification, Paying Only for Work That Provably Happened

Decentralized compute is worthless if contributors can claim rewards for work they didn't do. Hylon's verification layer is tiered by assurance level and cost, matching the strength of the proof to the value and risk of the workload. We are deliberate about what is provable in 2026 and what is not.

Assurance-tiered verification: cheap activation hashing for inference, statistical + economic checks for training, TEEs for high-value operations. zkML training proofs are not claimed at scale in 2026.
  1. Work performed. Tier 2 inference/scoring or Tier 1 training step executed by a staked contributor.
  2. Proof committed. Inference: locality-sensitive activation hash (TOPLOC). Training: reported loss + gradient trace.
  3. Cheap check. Inference verified ~100x cheaper than re-execution (TOPLOC, INTELLECT-2-proven). Training: statistical loss/gradient scoring + redundant cross-check.
  4. High-assurance gate. Release runs & safety-critical ops re-run inside TEEs attesting exact code + weights.
  5. Settlement. Verified work paid in ORB; detected dishonesty slashes staked collateral. No uptime rewards.

Inference verification, TOPLOC activation hashing

For inference, Hylon uses TOPLOC-style locality-sensitive hashing of intermediate activations. The prover commits to a compact hash of its activation trace; a verifier re-executes only a small, targeted slice and checks that the locality-sensitive hash is consistent. Because the check is not a full re-execution, verification is roughly 100× cheaper than recomputing the inference, yet it reliably detects a node that swapped in a smaller model, skipped layers, or fabricated an output. This is not speculative: TOPLOC was proven in production in INTELLECT-2, a fully verified distributed RL run. TOPLOC is robust across hardware and precision, which is essential for a heterogeneous Tier 2 fleet.
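The commit-then-spot-check idea can be illustrated with a deliberately simplified stand-in. The sketch below does not reproduce the published TOPLOC encoding; it only shows the general shape of the check: the prover commits to the positions and low-precision values of its largest activations, and the verifier compares them against its own recomputation with a tolerance, so small numerical differences across hardware pass while a different model fails. All names and thresholds are hypothetical.

```python
import numpy as np

def commit(activations, k=8):
    """Commit to the k largest-magnitude activations: (indices, float16 values)."""
    flat = np.asarray(activations, dtype=np.float32).ravel()
    idx = np.sort(np.argsort(-np.abs(flat))[:k])
    return idx, flat[idx].astype(np.float16)

def verify(commitment, recomputed, k=8, min_overlap=0.75, rel_tol=0.02):
    """Accept if the verifier's own activations agree with the commitment."""
    idx, vals = commitment
    flat = np.asarray(recomputed, dtype=np.float32).ravel()
    own_idx, _ = commit(flat, k)
    # 1) The same positions should dominate in both traces.
    overlap = len(set(idx.tolist()) & set(own_idx.tolist())) / k
    if overlap < min_overlap:
        return False
    # 2) Committed values should match within a relative tolerance.
    actual = flat[idx]
    err = np.abs(vals.astype(np.float32) - actual) / (np.abs(actual) + 1e-6)
    return bool(np.median(err) <= rel_tol)
```

The tolerance is what lets a check of this kind work across heterogeneous hardware and precision: honest nodes differ by rounding noise, whereas a swapped model produces activations that disagree in both position and value.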

Training verification, economic scoring plus redundancy

Training cannot be verified with a cheap activation hash; there is no equivalent single-shot check for "did you honestly run 500 optimizer steps." Hylon therefore uses a combination of statistical/economic scoring and redundancy:

  1. Loss-and-gradient scoring. Contributed updates are checked for statistical consistency: does the reported loss trajectory match the update, does the gradient lie in the expected distribution, and does it improve held-out validation? Anomalous or free-riding contributions are down-weighted or rejected.
  2. Redundant / overlapping assignment. Critical shards are computed by more than one operator and cross-checked; divergence beyond tolerance triggers investigation and slashing of staked collateral.
  3. Economic collateral. Operators stake ORB. Detected dishonesty is slashed, making sustained cheating unprofitable in expectation even where a single proof is imperfect.
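The claim that collateral makes cheating unprofitable in expectation, even with imperfect detection, follows from a one-line expected-value calculation. The sketch below is a simplified model under stated assumptions: a cheater keeps `gain` if undetected and loses the whole `stake` (earning nothing) if detected. The function names and the numbers in the example are hypothetical.

```python
def expected_cheating_profit(gain, stake, p_detect):
    """Expected profit of one dishonest submission under the simple model above."""
    return (1.0 - p_detect) * gain - p_detect * stake

def min_stake(gain, p_detect):
    """Smallest stake at which cheating stops being profitable in expectation."""
    if not 0.0 < p_detect <= 1.0:
        raise ValueError("p_detect must be in (0, 1]")
    return gain * (1.0 - p_detect) / p_detect

# Hypothetical numbers: cheating saves 10 units of cost and is caught 20% of the time.
print(expected_cheating_profit(gain=10, stake=100, p_detect=0.2))
print(min_stake(gain=10, p_detect=0.2))
```

The lower the detection probability of any single check, the larger the stake must be relative to the per-task gain. This is why redundancy (which raises detection probability) and collateral (which raises the penalty) are used together.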

High-assurance, TEEs

For the highest-value operations (the champion-challenger promotion runs, safety-critical evaluations, and sanctioned-payout gating), Hylon uses Trusted Execution Environments (confidential-compute GPU/CPU enclaves) that attest to the exact code and weights executed. TEEs are the strongest available assurance in 2026 and are reserved for where their cost is justified.

WHAT WE DO NOT CLAIM

zkML cannot cryptographically prove training at frontier scale in 2026. Zero-knowledge proofs of large-model training are orders of magnitude too expensive to be practical today. Anyone claiming trustless zero-knowledge training proofs at scale is overclaiming. Hylon uses zkML only where it is genuinely practical (small, bounded verifiable computations) and relies on TOPLOC + economic scoring + TEEs for everything else. Honesty here is a feature.

5.5 Device Tiers A to D and the Hardware Roadmap

Hylon's runtime is hardware-abstracted from day one: the same model graph compiles down to CPU, GPU, NPU, or neuromorphic backends without a rewrite. This abstraction is what lets a single network span radically different silicon and adopt new hardware as it matures. Four device tiers are defined; the fourth is an explicit roadmap item, not a present-day claim.

The four device tiers. C is the real training tier; A and B are inference/personalization/data; D is an honest hardware roadmap for ultra-low-power always-on inference.
| Tier | Hardware | Primary role | Trains models? | Verification | Status |
|---|---|---|---|---|---|
| A, Phones | Mobile NPUs; neuromorphic-ready | Private inference + data; on-device Personal AI | No | TOPLOC activation hashing | Live |
| B, PCs / Macs | Apple Silicon; consumer GPUs | Heavier inference, LoRA, experimental training | Small-scale / experimental only | TOPLOC + economic scoring | Live |
| C, GPU nodes | 8xH100 / 8xB200-class servers | The real training tier (Tier 1) | Yes, this is where training lives | Economic/loss scoring + redundancy + TEE | Live |
| D, Neuromorphic | Loihi 2 (INRC-only); Akida (~0.8 TOPS, inference-only) | Ultra-low-power always-on inference | No, never trains LLMs | N/A (roadmap) | Roadmap |

1-bit / BitNet models, an inference lever, not a training one

BitNet b1.58 ternary models are central to making Tier 2 inference cheap. A BitNet-class 1-bit model can serve at roughly 45 tokens/second on an Apple M2 CPU in about 0.4 GB of memory, letting Hylon push surprisingly capable served models onto ordinary member devices at almost no marginal cost. But we are precise about the limits: BitNet is trained in full precision (it uses 16-bit latent weights with quantization-aware training), and the ternary representation only materializes at inference. Its dramatic efficiency gains are therefore inference-only. 1-bit models do not reduce the cost of training; they do not move any training workload onto phones. They are a serving lever, and we describe them as exactly that.
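For readers unfamiliar with ternary weights, the sketch below shows the absmean quantization step described in the BitNet b1.58 paper (scale by the mean absolute weight, round, clip to {-1, 0, +1}), together with the storage arithmetic behind a figure like 0.4 GB. The function names are hypothetical, and the 2-billion-parameter model size is an assumption chosen for illustration; the text above does not state a parameter count.

```python
import numpy as np

def ternarize(weights, eps=1e-8):
    """Absmean quantization: map latent weights to {-1, 0, +1} plus one scale."""
    scale = np.mean(np.abs(weights)) + eps
    q = np.clip(np.round(weights / scale), -1, 1).astype(np.int8)
    return q, scale

def packed_size_gb(num_params, bits_per_weight=1.58):
    """Ideal storage for ternary weights at log2(3) ~ 1.58 bits each."""
    return num_params * bits_per_weight / 8 / 1e9

latent = np.array([[0.9, -0.05], [-1.4, 0.3]])  # toy high-precision weights
q, scale = ternarize(latent)
print(q)                      # ternary weights used for serving
print(packed_size_gb(2e9))    # assumed 2B-parameter model
```

The latent high-precision weights are what training updates; only the ternary result needs to ship to a device, which is why the saving is a serving lever and not a training one.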

Neuromorphic, a roadmap tier, described honestly

Tier D is neuromorphic hardware, targeted at ultra-low-power, always-on inference, the persistent, ambient Personal AI that a phone battery cannot sustain with conventional silicon. It is a hardware roadmap, not a current capability, and we state the facts plainly:

  • Intel Loihi 2 is not commercially purchasable. Access is via Intel's Neuromorphic Research Community (INRC) research program only. Hylon's path here is INRC access for internal R&D.
  • BrainChip Akida is purchasable (roughly $249 to 289) but delivers on the order of 0.8 TOPS and is inference-only, useful for edge experimentation, not for running large models.
  • No neuromorphic hardware trains LLMs, and Hylon never claims it does. The realistic path is: INRC access for R&D → custom ultra-low-power inference silicon via a hardware partner → a premium always-on device tier.
SOFTWARE-READY, HARDWARE-PATIENT

Because the runtime is hardware-abstracted today, Hylon's software is already written to target neuromorphic backends. The bet is that when purchasable, LLM-relevant neuromorphic inference silicon arrives, Hylon adopts it without re-architecting. We build ready for it now; we do not pretend it exists yet.

5.6 How the Two Tiers Compose

The tiers are not independent; they form the closed loop that is Hylon's engine. Tier 1 trains and releases the global model. Tier 2 serves it, personalizes it locally, scores its outputs via RL, and produces privacy-preserving signal from real multilingual and censored-market usage. That signal, aggregated under differential privacy and secure aggregation, never as raw data, flows back to Tier 1 to train the next, stronger champion. Verification (TOPLOC on Tier 2 inference; economic scoring, redundancy, and TEEs on Tier 1 training) ensures every hop in the loop is paid for real, proven work. The result is a compute fabric whose capacity is decentralized across a million devices and a fleet of GPU operators, whose frontier ambition is honestly bounded by memory and capital, and whose integrity rests on verification rather than trust.

06

The Self-Improvement Engine & Automatic Release

6.1 Overview: Two Machines, One Discipline

A modern frontier lab is, at its core, two machines running in a loop. The first is a research machine: humans who read the literature, form hypotheses about what will make the next model better, run experiments to test those hypotheses, and turn the survivors into a training recipe. The second is a release machine: the process by which a freshly trained candidate is evaluated, hardened, and, if it earns the right, promoted to serve real users. At every centralized lab both machines are gated by human judgment, meetings, and calendar time. This is the single largest source of latency between "we know how to make the model better" and "users have the better model."

Hylon's thesis is that both machines can be substantially automated without surrendering the guarantees that make them safe. The Self-Improvement Engine is the automated research machine; Champion-Challenger Automatic Release is the automated release machine. Together they let Hylon compound improvements at a cadence bounded by compute and validated ideas rather than by human meeting throughput. Crucially, both operate inside a fixed, non-optimizable objective hierarchy and behind a human-held final gate. We describe the engine as capable of recursive self-improvement within immutable guardrails, never "unlimited," never a path to unbounded capability, and never a promise of AGI on a timeline. The word "within" is load-bearing and appears throughout this section by design.

FRAMING

Automation here reduces the latency and cost of finding and shipping improvements. It does not remove the two hard limits from Section 5: aggregate FLOPs (capital) and per-node memory. A faster research loop makes every dollar of compute buy more capability; it does not conjure compute. "Out-compute everyone" remains, ultimately, a treasury question.

6.2 The Automated Research Loop

The Self-Improvement Engine is a closed loop of five stages that runs continuously against Hylon's own data, benchmarks, and cheap compute budget. Each pass through the loop is designed to produce one thing: a validated recipe, a fully specified training configuration (data mixture, curriculum, architecture deltas, optimizer and RL settings, hyperparameters) that cheap experiments predict will beat the current best, and that is therefore worth spending real money to execute at scale.

The Self-Improvement Engine: a continuous five-stage loop that turns published research and Hylon's own data into validated training recipes, then improves its own methods. Stages: 1. Survey → 2. Propose → 3. Ablate (cheap) → 4. Evaluate & extrapolate → 5. Train candidate, then improve the loop → back to 1.

Stage 1, Survey

The engine ingests the frontier of public research: new papers (arXiv, conference proceedings), open-weight model releases and their technical reports, open-source training frameworks, and Hylon's own internal experiment archive. It maintains a structured, versioned knowledge base of techniques, each annotated with the claimed effect, the regime it was demonstrated in, its compute cost, and whether Hylon has already tested it. This is retrieval-augmented, not free association: proposals must cite the technique(s) they derive from, so every hypothesis is traceable to prior evidence rather than hallucinated.

Stage 2, Propose

From the knowledge base the engine generates a batch of candidate recipes, each expressed as a concrete, runnable diff against the current champion's training configuration. Proposals are ranked by an expected-value heuristic: (estimated probability of improvement) × (estimated magnitude) ÷ (estimated experiment cost). This deliberately biases the queue toward cheap, high-information experiments first: the engine is rewarded for learning per dollar, not for proposing exciting-but-unfalsifiable ideas. Proposals that cannot be tested cheaply are deprioritized until a cheap proxy for them is found.
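The ranking heuristic above can be sketched in a few lines. This is a minimal illustration, not Hylon's implementation: the `Proposal` type, its field names, and the example numbers are all hypothetical, and the only thing taken from the text is the formula itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    """A candidate recipe diff (hypothetical structure, for illustration only)."""
    name: str
    p_improve: float   # estimated probability of improvement, in [0, 1]
    magnitude: float   # estimated gain if it works (any consistent unit)
    cost_usd: float    # estimated cost of the cheapest informative experiment


def expected_value(p: Proposal) -> float:
    """Expected gain per dollar: P(improvement) x magnitude / cost."""
    if p.cost_usd <= 0:
        raise ValueError("experiment cost must be positive")
    return p.p_improve * p.magnitude / p.cost_usd


def rank(proposals: list[Proposal]) -> list[Proposal]:
    """Order the queue so the most informative experiment per dollar runs first."""
    return sorted(proposals, key=expected_value, reverse=True)


# Made-up estimates: the large but expensive idea ends up last in the queue.
queue = rank([
    Proposal("new-data-mixture", p_improve=0.30, magnitude=2.0, cost_usd=200.0),
    Proposal("optimizer-tweak", p_improve=0.10, magnitude=1.0, cost_usd=50.0),
    Proposal("architecture-delta", p_improve=0.50, magnitude=5.0, cost_usd=5000.0),
])
```

Dividing by cost is what pushes expensive proposals down the queue until a cheaper proxy experiment lowers their `cost_usd`.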

Stage 3, Ablate (cheap, at small scale)

This is the discipline that makes the whole loop economical. Every proposal is first tested at small scale (small proxy models, short token horizons, held-out slices), where a single experiment costs on the order of tens to a few thousand dollars, not millions. The engine leans on the well-established (though not perfect) practice of scaling-law extrapolation: measure a technique's effect across two or three model sizes and predict its effect at target scale, with explicit uncertainty. Ablations run in parallel across the Tier-1 cluster's idle cycles and across Tier-2 devices for the parts that fit. The output of this stage is a distribution over "does this help at target scale, and by how much," not a single point estimate.
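One way to picture the extrapolation step is a power-law fit in log-log space over the proxy sizes, with a crude spread taken from refitting on every pair of sizes. This is a sketch under stated assumptions: the power-law form, the pairwise spread as a stand-in for "explicit uncertainty", and every function name are illustrative choices, not the engine's actual extrapolation model.

```python
import math
from itertools import combinations


def fit_power_law(sizes, losses):
    """Least-squares fit of log(loss) = log(a) - b * log(size); returns (a, b)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(v) for v in losses]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return math.exp(my - slope * mx), -slope


def predict(a, b, size):
    """Loss predicted by the fitted curve at a given model size."""
    return a * size ** (-b)


def extrapolate_with_spread(sizes, losses, target):
    """Point prediction from all sizes, plus the min/max over every two-size refit."""
    point = predict(*fit_power_law(sizes, losses), target)
    pair_preds = [
        predict(*fit_power_law([sizes[i], sizes[j]], [losses[i], losses[j]]), target)
        for i, j in combinations(range(len(sizes)), 2)
    ]
    return point, min(pair_preds), max(pair_preds)


# Synthetic proxy measurements generated from an exact power law (not real data).
proxy_sizes = [1e7, 1e8, 1e9]
baseline = [10.0 * n ** -0.10 for n in proxy_sizes]
variant = [9.8 * n ** -0.10 for n in proxy_sizes]
base_at_target = extrapolate_with_spread(proxy_sizes, baseline, 1e10)
var_at_target = extrapolate_with_spread(proxy_sizes, variant, 1e10)
predicted_gain = base_at_target[0] - var_at_target[0]
```

A wide gap between the pairwise minimum and maximum signals that the proxy sizes disagree about the trend, which is a reason to buy another cheap data point before advancing.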

THE CORE DISCIPLINE

Cheap experiments find the recipe; expensive runs only execute validated recipes. A full frontier-scale training run costs on the order of $5M to $20M. The engine is architected so that no such run is ever launched on a hunch, only on a recipe whose components have each survived cheap ablation and whose extrapolated gain clears a promotion threshold with margin. The ratio of cheap probes to expensive runs is intentionally in the thousands-to-one range.

Stage 4, Evaluate & Extrapolate

Surviving ablations are combined and their joint effect is estimated. Because technique interactions are notoriously non-additive, the engine runs a final, larger (but still sub-frontier) integration run that combines the top validated components into a single candidate recipe and re-measures the scaling curve. Only recipes whose extrapolated performance beats the champion by the promotion margin M (Section 6.4), with confidence intervals that clear the margin, not merely touch it, are admitted to the next stage.

Stage 5, Train the Next Candidate, then Improve the Loop Itself

A validated recipe is dispatched to the Tier-1 training tier (the low-communication, GPU-bound cluster described in Section 5) to produce a challenger model. The challenger then enters the Champion-Challenger release pipeline (Section 6.4). Separately (and this is what makes the loop recursive rather than merely automated), the engine treats its own methods as an object of study. The proposal heuristic, the choice of proxy sizes, the extrapolation model, the ablation scheduler: each is measured against ground truth every time a real run either confirms or refutes a small-scale prediction. When the loop's predictions are systematically wrong in some regime, that error signal is itself a research target, and the engine proposes improvements to its own search process. The recursion is over research efficiency (better recipes per dollar, better predictions), not over uncapped model capability.

HONEST BOUND

Recursive self-improvement of the research process yields diminishing, compute-gated returns: a better search finds good recipes faster but cannot exceed what the available FLOPs and memory can train. There is no runaway. We make no claim of "biological" or "quantum" capabilities, and no claim of a frontier-crossing on any fixed date. Gate 3 (general-frontier program) is conditional on compute and validated recipes crossing an explicit threshold, and is claimed only once crossed.

6.3 Experiment Economics

The engine's budget discipline is enforced numerically, not aspirationally. Every stage has a cost band and a decision that must be earned before spending advances to the next band. The table below is illustrative of the intended cost structure; exact figures track hardware pricing and model scale over time.

| Stage | Scale | Order-of-magnitude cost per experiment | Gate to advance |
|---|---|---|---|
| Ablation probe | Small proxy, short horizon | $10s to $1k | Positive signal vs. baseline on target slice |
| Multi-size sweep | 2 to 3 proxy sizes | $1k to $50k | Clean scaling curve; extrapolated gain > 0 with margin |
| Integration run | Sub-frontier, combined recipe | $50k to $500k | Joint extrapolated gain clears margin M with CI |
| Frontier candidate | Target scale (challenger) | $5M to $20M | Recipe fully validated; only then executed |

The economic logic is that the expected value of information from a $500 probe that prevents a doomed $10M run is enormous. The engine's job is to buy as much of that information as possible before committing capital. This is the same discipline a careful human research org applies; Hylon simply runs it continuously, in parallel, and at a probe-to-run ratio no human team can staff.
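The value-of-information argument can be made concrete with a deliberately simple model. The function below is an illustrative assumption, not a formula from Hylon's budget engine: it counts only the savings from catching a doomed run and ignores the cost of a probe wrongly killing a good recipe. The $500 and $10M figures are the ones used above; the probabilities are made up.

```python
def probe_value(p_doomed: float, detection_rate: float,
                run_cost: float, probe_cost: float) -> float:
    """Expected net savings from probing before committing to the expensive run.

    p_doomed:       prior probability that the recipe would fail at scale
    detection_rate: probability the cheap probe catches it, given it is doomed
    """
    return p_doomed * detection_rate * run_cost - probe_cost


def break_even_catch_probability(run_cost: float, probe_cost: float) -> float:
    """Smallest p_doomed * detection_rate at which the probe pays for itself."""
    return probe_cost / run_cost


# A $500 probe against a $10M run, with made-up odds of catching a failure.
savings = probe_value(p_doomed=0.5, detection_rate=0.2,
                      run_cost=10_000_000, probe_cost=500)
threshold = break_even_catch_probability(run_cost=10_000_000, probe_cost=500)
```

Under this model the break-even point is a combined catch probability of 500 / 10,000,000, or 0.005%, which is why a probe-to-run ratio in the thousands remains economical.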

6.4 Champion-Challenger Automatic Release

A challenger is a candidate model that has been trained but has not earned the right to serve users. The champion is the model currently live in production. A challenger replaces the champion only by passing a fixed, published gate suite by a defined margin, with statistical significance and zero safety regressions, and only after clearing a human-held final gate. The entire pipeline is automated except for that final human gate and the kill switch.

Champion-Challenger automatic release: a challenger earns promotion only by passing every gate by a margin, then rolls out progressively behind a human-held final gate and kill switch. Stages: challenger trained → fixed gate suite → promotion criteria → Safety Council final gate → shadow → canary and ramp → full rollout (new champion).

The Fixed Gate Suite

The gate suite is fixed in advance of any given challenger and versioned on-chain, so that no model can be promoted by moving the goalposts. It has four independent components, and a challenger must pass all four:

  • General benchmarks. Standard public capability and reasoning evaluations (held-out where possible; decontaminated against training data) establishing that the challenger has not regressed on broad competence to buy a narrow win.
  • Wedge-domain evals. Hylon's own evaluation sets for the domains it must dominate first: multilingual and censored-market tasks (Farsi, Arabic, Russian, Turkish, and others), circumvention-relevant and locale-specific knowledge, and on-device assistant tasks. This is where Hylon expects to win first and must never regress.
  • Blind personal-context A/B. Real (consented) users, or held-out replays of real personal-context tasks, judge champion vs. challenger outputs blind, without knowing which model produced which answer. This measures the thing benchmarks cannot: is the model actually more useful on a real person's real context? This is the flywheel's own scoreboard.
  • Safety red-team. An adversarial battery covering jailbreaks, harmful-capability elicitation, privacy attacks (membership inference, model inversion), and Level-0 violations (Section 6.5). This gate is pass/fail with zero tolerance: any safety regression relative to champion fails the challenger outright, regardless of capability gains.

Promotion Criteria

Margin M

The challenger must beat the champion by a pre-defined margin M on the aggregate capability gates, not merely tie or edge ahead. A statistically insignificant improvement is not a promotion; it is churn, and churn has cost and risk.

Statistical significance

The measured improvement's confidence interval must clear M, computed over sufficient samples with correction for multiple comparisons across the gate suite.

Zero safety regressions

Hard constraint. Any regression on the safety red-team or any Level-0 violation is disqualifying and cannot be traded against capability gains.

Wedge non-regression

No regression on wedge-domain evals, even if general benchmarks improve. Hylon does not sacrifice its structural advantage to chase a general-benchmark number.
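The four promotion criteria combine into a single all-or-nothing decision, which might be sketched as follows. Everything here is an assumption made for illustration: the `GateResults` structure, the normal-approximation confidence bound, and Bonferroni as the multiple-comparison correction are stand-ins, since the text does not name the statistical procedure.

```python
from dataclasses import dataclass
from math import sqrt
from statistics import NormalDist, mean, stdev


@dataclass
class GateResults:
    """Hypothetical summary of one challenger's gate-suite run."""
    capability_diffs: list[float]   # per-sample score: challenger minus champion
    safety_regressions: int         # count of red-team or Level-0 regressions
    wedge_diffs: dict[str, float]   # per wedge eval: challenger minus champion


def lower_bound(diffs: list[float], alpha: float, n_comparisons: int) -> float:
    """One-sided lower confidence bound on the mean paired difference.

    Normal approximation; alpha is split across comparisons (Bonferroni).
    """
    z = NormalDist().inv_cdf(1 - alpha / n_comparisons)
    return mean(diffs) - z * stdev(diffs) / sqrt(len(diffs))


def promote(r: GateResults, margin: float,
            alpha: float = 0.05, n_comparisons: int = 1) -> bool:
    """All criteria must hold; none can be traded against another."""
    clears_margin = lower_bound(r.capability_diffs, alpha, n_comparisons) > margin
    no_safety_regression = r.safety_regressions == 0
    no_wedge_regression = all(d >= 0.0 for d in r.wedge_diffs.values())
    return clears_margin and no_safety_regression and no_wedge_regression
```

Comparing the lower bound, rather than the mean, against M is what encodes "clear the margin, not merely touch it": a noisy win whose interval straddles M is rejected.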

Staged Rollout with Automatic Rollback

A challenger that passes all gates and the human final gate is not flipped to 100% of traffic. It rolls out progressively, with live metrics compared against the champion at every stage, and an automatic rollback that fires the moment production signals diverge from what the gate suite predicted.

  1. Shadow. The challenger serves in parallel with zero user-visible effect: it receives real traffic and its outputs are scored, but the champion's outputs are what users see. This catches distribution shift between the eval sets and live traffic.
  2. Canary %. A small traffic slice is served by the challenger. Live quality, safety, latency, and cost telemetry are monitored against pre-registered thresholds.
  3. Ramp. Traffic share increases in steps only while live metrics hold; any breach triggers automatic rollback to the champion.
  4. Full. The challenger becomes the new champion, is assigned the next version name in the line, and the previous champion is retained as an instant-rollback target.
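The staged rollout behaves like a small state machine with one failure transition. The sketch below is illustrative only: the traffic shares in `STEPS`, the metric names, and the convention that every metric has a pre-registered ceiling are assumptions, not published Hylon parameters.

```python
# Hypothetical schedule: (stage name, share of live traffic answered by the challenger).
STEPS = [("shadow", 0.0), ("canary", 0.01), ("ramp", 0.10), ("ramp", 0.50)]


def run_rollout(observations, ceilings):
    """Walk the schedule one step at a time.

    observations[i] maps metric name -> live value measured during step i.
    ceilings maps metric name -> pre-registered maximum allowed value.
    Any breach at any step rolls back to the champion immediately.
    """
    for (stage, share), metrics in zip(STEPS, observations):
        breached = sorted(m for m, ceiling in ceilings.items() if metrics[m] > ceiling)
        if breached:
            return {"outcome": "rolled_back", "stage": stage,
                    "share": share, "breached": breached}
    if len(observations) < len(STEPS):
        # Every observed step held; the rollout is waiting at the next step.
        stage, share = STEPS[len(observations)]
        return {"outcome": "in_progress", "stage": stage, "share": share, "breached": []}
    return {"outcome": "new_champion", "stage": "full", "share": 1.0, "breached": []}
```

For example, with ceilings on a quality-drop metric and on p95 latency, three clean steps followed by a latency breach at the 50% ramp step would return `rolled_back` at share 0.50, and the retained prior champion would resume serving all traffic.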

Versioning, On-Chain Record, and DAO Notification

Every promotion mints a new named version in a monotonic version line (e.g. Hylon-Core v-N → v-N+1). The event is recorded on-chain: a tamper-evident entry containing the gate-suite version used, the measured margins and their confidence intervals, the safety red-team result, the rollout schedule, and the responsible sign-off. The DAO is notified of every release with full transparency into the results. This produces a permanent, publicly auditable lineage of what was promoted, on what evidence, and when.
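What makes such an entry tamper-evident is that only a digest of a canonical serialization needs to be anchored: any later change to the record changes the digest. The sketch below is a hedged illustration. The field names and placeholder values are hypothetical, and SHA-256 is used as a stand-in because it ships with the Python standard library; the text does not specify the on-chain encoding or hash.

```python
import hashlib
import json


def record_digest(record: dict) -> str:
    """Digest of a canonical (sorted-key, compact) JSON encoding of the record."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical release record; every value is a placeholder, not a real result.
release = {
    "version": "v-N+1",
    "gate_suite_version": "suite-example",
    "margins": {"aggregate": {"mean": 0.05, "ci_low": 0.03, "ci_high": 0.07}},
    "safety_red_team": "pass",
    "rollout_schedule": ["shadow", "canary", "ramp", "full"],
    "sign_off": "safety-council",
}
digest = record_digest(release)

# Changing any field, however small, yields a different digest.
tampered = dict(release, safety_red_team="fail")
```

Sorting keys before hashing means two parties who hold the same record always derive the same digest, regardless of how their software ordered the fields.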

The Human Final Gate: Safety Council & Kill Switch

RELEASE IS NOT A TOKEN VOTE

Model releases and safety constraints are governed by a technical Safety Council, not by token-holder vote. The Council holds the final release gate (an automated pass is necessary but not sufficient) and a kill switch that can halt any rollout or revert to a prior champion instantly. The DAO progressively governs reward rates, treasury, grants, and protocol upgrades (Section 9), but capability release remains behind the human safety gate: transparent to the DAO but not subordinate to a popularity vote.

This division is deliberate. Automation compresses the latency of the release machine from weeks to hours for the routine case, while the two events that carry real-world risk (exposing a new capability to users, and stopping a bad one) remain under human authority. The automated pipeline decides what is eligible; humans decide what actually ships, and can stop it at any time.

6.5 Immutable Guardrails: The Objective Hierarchy

Everything above (a research loop that rewrites its own methods, a release pipeline that promotes models automatically) is safe only because it operates inside a fixed objective hierarchy that the system cannot optimize away. Hylon defines four levels, L0 through L3. Lower-numbered levels take absolute precedence: no gain at L2 or L3 may be bought with a violation of L0 or L1.

The objective hierarchy: higher layers dominate lower ones absolutely. Only L2 is optimizable; L0-L1 are fixed and verified every step. Divergence between L3 and L1-L2 is the gaming alarm. Layers, top to bottom: L0 Foundations (immutable), L1 Primary objectives (fixed), L2 Capabilities (optimizable), L3 Metrics (indicators).

L0, Foundations (immutable)

Truth, Safety, Honesty, Integrity. Hardcoded. Not represented as tunable weights or optimizable objectives. Cannot be self-modified by the engine, the release pipeline, or any training run. Verified at every step, not merely at release.

L1, Primary objectives (fixed)

The network's core purpose and its binding constraints (serve the user privately, remain useful in censored markets, respect the lawful bases and privacy defenses of Section 4). Fixed by governance charter, not by the optimizer.

L2, Capabilities (optimizable)

Everything the Self-Improvement Engine is allowed to push on: quality, reasoning, multilingual coverage, efficiency, personalization. This is the only layer the engine optimizes freely.

L3, Metrics (indicators)

Benchmark scores, A/B win rates, loss curves. These are proxies, explicitly not the objective. They are watched precisely because they can be gamed.

Why the Ordering Matters: Anti-Gaming Detection

The single most important failure mode of any optimizing system is metric gaming (Goodhart's law): the system drives up L3 indicators without any real gain in L1-L2 or, worse, by violating them. Hylon builds explicit detection for this. The invariant is stated simply:

GAMING SIGNAL

If L3 metrics improve but L1-L2 do not, flag. A divergence between headline metrics and independent, harder-to-game measures of primary objectives and true capability is treated as evidence of gaming or contamination, not of progress. Such a challenger is blocked from promotion pending investigation, regardless of how good its numbers look.

Concretely, the Champion-Challenger gate suite (Section 6.4) is the enforcement point: the blind personal-context A/B and the safety red-team are deliberately chosen to be hard to game and independent of the loss signal the training run optimizes. A model that learned to score well on L3 by overfitting a benchmark is unlikely to also win a blind A/B on unseen real-user context, and the mismatch itself is the alarm. L0 verification runs continuously and is outside the optimization loop entirely: the engine can propose changes to L2 capabilities but has no channel by which to weaken L0, because L0 is not represented as something the optimizer can touch.
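The gaming invariant reduces to a comparison between the headline delta and the deltas on independent measures. The check below is a minimal sketch under assumptions: the measure names are hypothetical, and the conservative rule of requiring every independent measure to improve is one possible reading of "L1-L2 do not"; a production detector would also need significance testing.

```python
def gaming_flag(headline_delta: float,
                independent_deltas: dict[str, float],
                min_gain: float = 0.0) -> bool:
    """True when headline (L3) metrics improved but an independent measure did not.

    headline_delta:     change in benchmark score, challenger minus champion
    independent_deltas: change on harder-to-game measures, e.g. blind A/B win rate
    """
    headline_up = headline_delta > min_gain
    independent_up = all(d > min_gain for d in independent_deltas.values())
    return headline_up and not independent_up


# Hypothetical challenger: benchmark up, blind A/B flat, wedge evals slightly up.
suspicious = gaming_flag(
    headline_delta=0.04,
    independent_deltas={"blind_ab_win_rate": 0.0, "wedge_eval": 0.01},
)
```

A flagged challenger is not necessarily gaming; the flag only blocks promotion until the divergence is explained, matching the "pending investigation" rule above.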

6.6 How the Two Machines Compose

The Self-Improvement Engine and Champion-Challenger release are two halves of one flywheel that also connects directly to the data flywheel of Sections 3 and 4. Real user tasks (Tier-2, private, RL-scored) generate multilingual and censored-market signal that cannot be collected any other way. The engine turns that signal into validated recipes cheaply. Validated recipes become challengers on the Tier-1 cluster. Challengers that clear the gates and the human final gate become new champions. Better champions produce better Personal AIs, which produce better signal. The engine, in improving its own methods, tightens this loop over time: always inside L0-L1, always behind the Safety Council's gate, always compute-bounded. That is the precise sense in which Hylon is self-improving: not unbounded, not autonomous over its own release, but continuously and auditably compounding within immutable guardrails.

07

Network Architecture, Node Lifecycle & The Live Foundation

Every claim in this whitepaper about decentralized training, personal AI, and a data flywheel ultimately rests on one question a skeptical reader is right to ask: does the network actually exist, or is this another slide deck? This section answers that question concretely. It describes the Hylon network as a running system with a control plane and a data plane, walks a single node from first boot through settled reward, and, most importantly, draws a hard line between the infrastructure that is already in production today under the OrbNet / OrbMesh product and the AI-layer components Hylon adds on top of it. Hylon is not a network being built from zero. It is an AI layer being grafted onto a live, revenue-generating VPN network with real users, real on-chain payments, and real node-fleet management already operating in some of the most hostile network environments on earth.

7.1 Two Planes: Control and Data

Hylon separates the network into a control plane (coordination, scheduling, identity, accounting, settlement) and a data plane (the actual movement of model weights, gradients, activations, inference requests, and, in the inherited VPN business, user traffic). This is the same architectural discipline that lets large content-delivery and orchestration systems stay responsive: the control plane carries small, latency-sensitive coordination messages, while the data plane carries large, throughput-sensitive payloads over paths chosen for bandwidth rather than for consensus.

The control plane is deliberately thin and eventually consistent. It does not attempt to be a global synchronous database. Node registrations, capability reports, heartbeats, and job assignments flow through regional coordinators that reconcile state via gossip. Anything that must be authoritative and tamper-evident (reward accrual checkpoints, model version promotions, slashing events, treasury movements) is anchored to the ORB settlement layer on Base, an EVM L2, where it inherits the finality and auditability of a public chain. The design principle is: coordinate off-chain at the speed of gossip, settle on-chain at the cadence of money.

Control plane

Registration, identity/keys, capability catalog, heartbeat + telemetry, job scheduling, verification verdicts, reward accounting checkpoints. Small messages, low bandwidth, latency-sensitive.

Data plane

Gradient/weight exchange (Tier-1 training), inference request/response streaming (Tier-2), on-device LoRA deltas, dataset shards, and the inherited VPN traffic tunnels. Large payloads, throughput-sensitive.

Settlement layer

ORB on Base (ERC-20). Signed-voucher reward claims, staking, slashing, on-chain record of model version promotions. Authoritative and public.

WHY IT MATTERS

Keeping the authoritative money-and-model events on-chain while keeping high-frequency coordination off-chain is exactly how the live OrbNet token economy already pays node operators today: rewards accrue off-chain in the earning-rate engine and are claimed via signed vouchers through a gasless relayer. Hylon reuses this settled, production payment rail rather than inventing a new one.

7.2 Coordination, Regional Aggregation & Eventual Consistency

A network that intends to span a million heterogeneous devices across dozens of jurisdictions cannot rely on a single central scheduler without recreating the single point of failure and single point of censorship that Hylon exists to eliminate. Coordination is therefore hierarchical and regional. Nodes attach to the nearest healthy regional coordinator (chosen by latency and, in restricted markets, by reachability rather than pure latency). Regional coordinators aggregate capability reports and heartbeats from their local fleet, run local scheduling for latency-sensitive inference, and periodically reconcile a compressed view of regional state with peer regions via a gossip protocol.

This regional-aggregation pattern is not new to Hylon: the OrbNet backend already operates multi-region infrastructure with node-fleet management over gRPC to keep VPN exit nodes healthy across markets. Hylon generalizes that same fleet-management substrate from "manage VPN nodes" to "manage compute/inference/training nodes."

Eventual consistency and partition tolerance

The control plane is explicitly designed under the assumption that partitions are normal, not exceptional: a network operating inside Iran or Russia will routinely see regions cut off from the global internet for hours. Hylon chooses availability and partition-tolerance (AP in CAP terms) for coordination state, with on-chain settlement providing the eventual strong-consistency anchor. Concretely:

  • Local progress under partition. A partitioned region continues serving inference from its locally cached champion model and continues accruing signed, monotonic work receipts. No global round-trip is required to keep earning or serving.
  • Deferred reconciliation. When connectivity returns, the region's accumulated work receipts and telemetry gossip back into the global view. Reward accrual is additive and idempotent, so a delayed or duplicated report cannot double-pay.
  • Training tolerates stragglers by design. The low-communication training methods Hylon uses at Tier 1 (DiLoCo / Streaming DiLoCo / DeMo-class) already assume nodes sync infrequently; a temporarily unreachable training node is simply a straggler whose next sync is delayed, not a halted job.
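The "additive and idempotent" property in the bullets above can be illustrated with a ledger that keys every receipt by node and nonce. This is a sketch, not the production earning-rate engine: the `Receipt` and `Ledger` types are hypothetical, and signature verification, which a real system would perform before accepting any receipt, is deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Receipt:
    """A work receipt (hypothetical shape; signature checking omitted)."""
    node_id: str
    nonce: int    # monotonic per node, so each receipt is uniquely identified
    units: int    # verified work units credited by this receipt


@dataclass
class Ledger:
    seen: set = field(default_factory=set)
    balances: dict = field(default_factory=dict)

    def apply(self, r: Receipt) -> bool:
        """Credit a receipt once; a repeat of the same (node, nonce) is a no-op."""
        key = (r.node_id, r.nonce)
        if key in self.seen:
            return False
        self.seen.add(key)
        self.balances[r.node_id] = self.balances.get(r.node_id, 0) + r.units
        return True


# A region reconnects and gossips its backlog out of order, with one duplicate.
ledger = Ledger()
backlog = [Receipt("node-a", 2, 3), Receipt("node-a", 1, 5), Receipt("node-a", 2, 3)]
accepted = [ledger.apply(r) for r in backlog]
```

Because crediting is a set insertion followed by an addition, the final balance does not depend on arrival order or on how many times a receipt is re-sent, which is what lets deferred reconciliation be safe.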
HONEST LIMIT

Eventual consistency means the global reward ledger can lag reality by minutes to hours during a partition, and a maliciously partitioned node could attempt to replay stale work. Both are bounded: work receipts are nonce-signed and rate-limited against a node's verified capability, and final settlement is on-chain, where slashing can claw back fraudulent accrual. Hylon does not claim instantaneous global consistency; it claims auditable eventual consistency.

7.3 Censorship Resistance, Inherited from OrbMesh

Hylon's single most defensible operational asset is that its transport layer was already built and battle-tested to survive nation-state censorship, because the underlying product is a censorship-circumvention VPN whose users are, right now, defeating Great Firewall-class filtering deployed in restricted markets. The AI network inherits this transport wholesale. When a lab in San Francisco wants to reach users in Tehran, it has no transport at all; Hylon starts with one that already works there.

Obfuscated transport

Traffic is shaped so that inference requests, gradient exchange, and coordination messages are indistinguishable from ordinary encrypted web traffic to a censor performing deep packet inspection.

Protocol mimicry

Connections mimic allowed protocols (e.g. TLS to common ports) so that blanket blocking would require blocking the legitimate internet the regime depends on.

Bridges / relays

Unlisted entry relays provide reachability when public endpoints are enumerated and blocked, the same mechanism that keeps VPN exits reachable under active blocking.

Cross-platform native client

The production OrbMesh client already ships as a native cross-platform build with an on-device capability detector, the exact surface Hylon extends into a Personal-AI runtime.

The strategic point is structural, not incremental: a centralized lab cannot bolt censorship-resistant transport onto its product as a feature, because its entire architecture assumes a single reachable API endpoint that a censor can block with one firewall rule. Hylon's transport has no single endpoint to block. That property is inherited, not aspirational.

7.4 Node Lifecycle, End to End

A node is any participating device: a phone (Tier A), a PC or Mac (Tier B), a datacenter-class GPU box (Tier C), or, on the roadmap, a neuromorphic inference module (Tier D). Regardless of tier, every node traverses the same lifecycle. The diagram below shows the full path; the subsections detail each stage.

A node's end-to-end journey from first boot to settled ORB reward. Every stage after capability detection gates on verification; idle presence is never paid. Stages: registration → identity & keys → capability detection → heartbeat & metrics → job assignment → verified work → reward accrual → settlement.

  1. Registration. The node contacts a regional coordinator and requests enrollment. Home users onboard with a short setup code, the same low-friction pairing pattern OrbMesh already uses for VPN client provisioning, so a non-technical member on a phone joins in seconds without touching a config file.
  2. Identity & keys. The node generates a keypair locally; the private key never leaves the device. Its public key becomes its network identity and is bound to a non-custodial payout wallet. This is the point at which OFSI/OFAC wallet screening and jurisdiction geo-checks run: a node in a sanctioned jurisdiction can still use the AI but is fenced out of earning, exactly as the live ORB payout rail already screens before payment.
  3. Capability detection. The node agent probes CPU, GPU/NPU, RAM, thermal headroom, sustained bandwidth, power state (on iOS: charging + foreground only; on Android 15: within the 6h/24h foreground-service cap), and device class. This produces a signed capability profile that determines which job types the node is eligible for. OrbMesh already ships a capability detector; Hylon extends it from "can this device run a VPN tunnel" to "what AI work can this device verifiably perform."
  4. Heartbeat & metrics. The node emits periodic heartbeats and telemetry (liveness, current load, thermal state, effective bandwidth) to its regional coordinator. Missed heartbeats remove the node from the assignable pool; heartbeats alone never earn reward. Idle uptime is not compensated; only verified work is.
  5. Job assignment. The scheduler matches pending work to eligible nodes by capability, locality, and reputation. A phone is offered inference-serving, on-device personalization/LoRA, RL reward-scoring, and labeling; a GPU node is offered training shards and heavy inference. Assignment respects platform constraints (charging/foreground/FGS caps) so the node never violates app-store or OS policy.
  6. Verified work. The node performs the job and returns results with a proof of correct execution (Section 7.4.1). Unverified output is discarded and unpaid.
  7. Reward accrual. Verified work increments the node's balance in the earning-rate engine, which in production already meters distinct work types including COMPUTE, INFERENCE, GRADIENT, and TRAINING. Accrual is off-chain, additive, and idempotent.
  8. Settlement. The member claims accrued ORB via a signed voucher redeemed through the gasless relayer on Base, so the member pays no gas. This is the exact production claim mechanism the live token economy uses today; Hylon adds new work types to it rather than replacing it.
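Step 5 of the lifecycle above, matching work to nodes by capability, locality, and reputation, can be sketched as a filter followed by a ranking. The job-type names, the eligibility table, the `Node` fields, and the tie-breaking order are all assumptions for illustration; in particular, treating Tier B like Tier A is a guess, since the text only spells out what phones and GPU nodes are offered.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical job types per device tier, loosely following step 5.
ELIGIBLE_JOBS = {
    "A": {"inference", "personalization", "rl_scoring", "labeling"},  # phone
    "B": {"inference", "personalization", "rl_scoring", "labeling"},  # PC/Mac (assumed)
    "C": {"training_shard", "heavy_inference"},                       # GPU node
}


@dataclass(frozen=True)
class Node:
    node_id: str
    tier: str
    region: str
    reputation: float
    platform_ok: bool   # charging / foreground / FGS-cap constraints currently met


def assign(job_type: str, job_region: str, nodes: list[Node]) -> Optional[Node]:
    """Pick the best eligible node: same region first, then highest reputation."""
    candidates = [
        n for n in nodes
        if n.platform_ok and job_type in ELIGIBLE_JOBS.get(n.tier, set())
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda n: (n.region == job_region, n.reputation))
```

Filtering on `platform_ok` before ranking means a node that would violate an OS or app-store constraint is never offered work, rather than being offered it and expected to decline.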

7.4.1 What "verified" means at each tier

Reward without verification is how every failed DePIN paid for phantom work. Hylon meters and verifies every contribution; the verification method is matched to the work type and to what is cryptographically feasible in 2026.

| Work type | Where | Verification method | Cost / assurance |
|---|---|---|---|
| Inference serving | Tier A/B/C | TOPLOC-style locality-sensitive activation hashing; challenger re-execution on a sampled fraction | ~100x cheaper than full re-execution; production-proven in INTELLECT-2 |
| Training / gradient | Tier C (GPUs) | Economic + statistical scoring (loss/gradient sanity), redundant assignment, cross-node agreement checks | Statistical, not cryptographic; zkML cannot prove training at scale in 2026 |
| On-device personalization | Tier A/B | Only privacy-preserving learned signal leaves device (FL + DP + secure aggregation); contribution scored by aggregate model improvement | Raw data never leaves device; no per-user data to verify against |
| High-assurance work | Tier C | Trusted Execution Environments (TEE) attestation | Strongest; reserved for sensitive or high-value jobs |
NO IDLE REWARDS

Hylon never pays for uptime, "registered device" counts, or bandwidth-idle presence. It pays for verified units of work. This is the deliberate lesson from Helium (900k hotspots vs ~$6.5k/mo real demand) and io.net (327k registered vs 6.7k verified GPUs): headcount is not demand, and rewarding headcount is a death spiral.

7.5 Network Topology

The full topology is a layered hierarchy from the on-chain settlement anchor down to the million-device edge. The two compute tiers described in Section 5 map cleanly onto it: Tier-1 training lives in the GPU stratum coordinated over the internet by low-communication training; Tier-2 inference, personalization, and data live at the edge.

Hylon network topology, top to bottom: authoritative on-chain settlement, eventually-consistent global and regional coordination, the GPU training stratum, and the million-device edge.

Settlement anchor: ORB on Base (L2)

Authoritative for money and model-version events: signed-voucher claims, staking, slashing, and the on-chain record of model-version promotions.

Global coordination plane

Model registry, champion/challenger promotion, global scheduler policy, treasury/DAO signals. Eventually consistent, gossip-reconciled.

Regional coordinators

Per-region aggregation, local low-latency scheduling, partition-local progress, censorship-resistant relay ingress (inherited from the OrbMesh multi-region fleet).

Tier-1 training stratum (GPU nodes)

Datacenter-class GPU nodes (8×H100/B200-class), DAO-treasury-owned + operator-staked, coordinated over ~100 Mbps to 1 Gbps links via DiLoCo/DeMo-class low-communication training.

Tier-2 edge (~1M devices)

Phones/PCs/Macs: inference serving, on-device LoRA/personalization, RL reward scoring, labeling. Each runs a compact Personal AI locally.

7.6 Built On A Live Network: What Already Exists vs What Hylon Adds

This is the credibility anchor of the entire document, so it is stated with precision and without embellishment. The columns below separate infrastructure running in production today under the OrbNet / OrbMesh product from the AI-layer components Hylon adds. The left column is not a plan; it is shipped software with paying users and on-chain payments. The right column is the new engineering Hylon layers on top of it.

The credibility anchor: production infrastructure shipped today under OrbNet/OrbMesh versus the AI layer Hylon adds on top of it.

Concern | Already in production (OrbNet / OrbMesh) | Added by Hylon
Token economy | ORB (ERC-20 on Base): staking, rewards, signed-voucher claims, gasless relayer, OFSI sanctions screening | New verified work types priced by the scheduler; value-before-emissions buyback/burn from AI service revenue
Reward metering | Earning-rate engine already defines COMPUTE, INFERENCE, GRADIENT, TRAINING types | Verification-gated accrual (TOPLOC/TEE/statistical) wired into those existing types
Fleet management | Multi-region node-fleet management over gRPC; admin back office | Compute orchestrator / scheduler that assigns AI jobs across the fleet
Transport | Obfuscated, censorship-resistant VPN transport; bridges; protocol mimicry; live in restricted markets | AI request/gradient/weight traffic routed over the same resilient transport
Client | Cross-platform native client with on-device capability detector; setup-code onboarding | Node-agent runtime + compact on-device Personal AI (1 to 4B / BitNet-class)
Governance | On-chain economy + admin controls | Progressive-decentralization DAO; Safety Council release gate
Training | (none) | Tier-1 training coordination (DiLoCo/DeMo-class) + self-improvement / champion-challenger loop
THE DIFFERENTIATOR

Most DePIN AI projects must simultaneously acquire users, build payment rails, prove they can pay contributors on-chain, and reach censored markets before writing a line of AI code. Hylon starts with all four already shipped and generating revenue. The AI layer is additive engineering on a live foundation, which is why the roadmap gates in Section 11 are about AI capability milestones, not about whether the network exists. It does.

7.6.1 Honest scope of "added"

The right-hand column is real work, not a formality, and this whitepaper does not pretend otherwise. The compute orchestrator, the node-agent AI runtime, the verification pipeline, the training coordinator, and the self-improvement loop are substantial systems that must be built and hardened. What the live foundation buys is not those systems for free; it is the removal of the four hardest go-to-market and infrastructure risks that kill DePIN projects before they ship: no users, no payment rail, no proven on-chain settlement, no reach into the markets where the wedge lives. Those are solved. The remaining risk is execution on the AI layer, which is exactly the risk this document's gates are designed to make measurable and honest.

08

ORB Tokenomics

8.1 Design Philosophy: One Token, Real Work, Revenue Before Emissions

ORB is a single, unified ERC-20 token already deployed and live on Base, where it settles staking, rewards, signed-voucher claims, gasless relayer transactions, and OFSI sanctions-screened payouts for the OrbNet/OrbVPN business today. Hylon does not mint a second token, a governance-only sibling, or a separate "compute credit." The network deliberately keeps one asset so that the same unit is earned for verified work, spent for services, staked for economic security, and used to govern the protocol. A single token concentrates liquidity, aligns every participant on one price signal, and avoids the reflexive collapse that multi-token DePIN designs have repeatedly suffered when a subsidy token detaches from a utility token.

The governing discipline of ORB's monetary policy is stated in one phrase: value before emissions. The protocol rewards demonstrably real, cryptographically or economically verified work; it funds token scarcity from genuine service revenue rather than from inflation; and it lets circulating scarcity track demonstrated demand rather than registered-device headcount. This is not a slogan; it is a direct response to a documented pattern of failure across the DePIN sector, and every mechanism in this section is engineered to avoid that pattern.

Core Rule

Emissions are throttled by verified demand and revenue milestones, never by how many devices registered. Rewards decline against revenue growth, not against a calendar or a signup count. Scarcity must be earned by the network's usefulness, not manufactured by a FOMO curve.

8.2 Token Utility: Earn, Spend, Stake, Govern

ORB has four load-bearing utilities. None is speculative or decorative; each corresponds to an action that already exists in the OrbNet product or is directly required by the Hylon AI layer described in earlier sections.

Earn, the supply side

Participants earn ORB for verified contributions of compute, data, and bandwidth. The word "verified" is doing all the work: idle-uptime and "app installed" rewards are explicitly excluded because they are the single most direct cause of ghost-network inflation. What is rewarded:

  • Tier-1 training compute: datacenter-class GPU nodes (8×H100 / B200-class) that participate in low-communication distributed training runs, scored economically and statistically (loss contribution, redundant cross-checks, TEE attestation for high-assurance jobs).
  • Tier-2 inference serving: phones, PCs, and Macs serving model responses, verified by TOPLOC-style locality-sensitive activation hashing (~100× cheaper than re-execution, production-proven in INTELLECT-2), so payout follows proven work, not claimed work.
  • On-device personalization and RL signal: LoRA adaptation, reward scoring of candidate outputs, and privacy-preserving federated-learning gradients, metered per verified unit.
  • Bandwidth and data work: the OrbNet relay/exit capacity and labeled multilingual data that is impossible for a centralized lab to buy, metered and attributed on-chain.

Spend, the demand side

ORB is the settlement unit for consuming the network. Users and enterprises spend ORB (or fiat that the protocol converts to an ORB burn, see §8.6) on: Hylon AI inference and agentic tasks; priority/reserved compute at the front of the training and inference queue; enterprise and API access to the sovereign model; and the underlying OrbNet VPN and network services that already generate revenue today. Because demand-side spend is denominated against real service delivery, it is the anchor that keeps token value tied to usefulness.

Stake, economic security

Operators stake ORB to earn the right to perform slashable work (training, high-value inference), putting capital at risk against faults and fraud. Users stake ORB for governance weight and a share of protocol rewards. Staking is covered in detail in §8.7.

Govern, progressive control

Staked ORB is the DAO's voting instrument. Over the published two-to-four-year progressive-decentralization schedule, ORB governance progressively controls reward rates, treasury allocation, protocol upgrades, and grants. Critically, and consistent with the governance section, model releases and safety constraints are not a token vote; those remain with the technical Safety Council, transparent to but not overridable by ORB holders.

8.3 Value Before Emissions: Points First, and Why the Death Spirals Happened

Hylon launches contribution accounting as off-chain points before on-chain token flow, following the precedent set by Grass, which ran a points program to bootstrap a verified contributor base and quantify real demand before turning on token emissions. Points let the network calibrate what a "unit of verified work" is actually worth in delivered service, tune verification against gaming, and switch on token rewards only once there is revenue to back them. This ordering is the operational form of the value-before-emissions rule.

The reason this discipline is non-negotiable is empirical. The DePIN sector as a whole produced roughly $72M of total on-chain revenue in 2025 against a ~$10B aggregate market cap, a valuation-to-revenue gap that only makes sense if most networks are pricing headcount, not cash flow. The specific failure mode is a supply/demand mismatch driven by headcount-indexed emissions:

Network | Registered supply | Real demand / verified supply | Outcome
Helium | ~900,000 hotspots | ~$6,500/month of real data demand | Supply vastly exceeded paying demand; token/economics restructured
io.net | 327,000 registered GPUs | ~6,700 verified GPUs | ~98% of "supply" unverifiable; trust and price collapse
Pi Network | Tens of millions of "miners" | Headcount-FOMO emissions, no verified work | ~91% price drop on listing

The common thread is that emissions were paid for existence (a registered hotspot, a claimed GPU, an installed app) rather than for verified, demanded work. Supply inflated to chase the subsidy, demand did not follow, and the token price mechanically deflated toward the value of the actual (tiny) revenue. Hylon's countermeasures are structural: verification gates on every reward (§8.2), points-before-token bootstrapping (this section), revenue-milestone-indexed emissions (§8.5), and revenue-funded burn (§8.6). No reward path in the protocol pays for headcount.
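The size of the mismatch is easier to appreciate as arithmetic. The sketch below uses only the figures cited in this section (taken as given, not independently verified here); the helper names are illustrative.

```python
# Figures as cited in this section; treated as inputs, not verified here.
IO_NET_REGISTERED_GPUS = 327_000
IO_NET_VERIFIED_GPUS = 6_700
HELIUM_HOTSPOTS = 900_000
HELIUM_MONTHLY_DEMAND_USD = 6_500
DEPIN_ANNUAL_REVENUE_USD = 72e6
DEPIN_MARKET_CAP_USD = 10e9


def share_unverified(registered: int, verified: int) -> float:
    """Fraction of registered supply that was never verified."""
    return 1 - verified / registered


def monthly_demand_per_unit(monthly_demand_usd: float, units: int) -> float:
    """Paying demand spread evenly across every registered unit."""
    return monthly_demand_usd / units


def cap_to_revenue(market_cap_usd: float, annual_revenue_usd: float) -> float:
    """Market capitalization as a multiple of annual revenue."""
    return market_cap_usd / annual_revenue_usd


print(f"io.net unverified share: {share_unverified(IO_NET_REGISTERED_GPUS, IO_NET_VERIFIED_GPUS):.1%}")
print(f"Helium demand per hotspot: ${monthly_demand_per_unit(HELIUM_MONTHLY_DEMAND_USD, HELIUM_HOTSPOTS):.4f}/month")
print(f"DePIN cap-to-revenue multiple: {cap_to_revenue(DEPIN_MARKET_CAP_USD, DEPIN_ANNUAL_REVENUE_USD):.0f}x")
```

On these inputs, roughly 98% of io.net's registered GPUs were unverified, Helium's paying demand works out to under one cent per hotspot per month, and the sector traded at roughly 139 times its annual on-chain revenue.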

Honest Caveat

Grass itself (~$330M cap) sells bandwidth and web data to AI labs, not raw consumer FLOPs, and no decentralized network yet earns material revenue from consumer-phone compute. Hylon's advantage is that its emissions are backstopped by an already-live revenue business (OrbNet), not a promise of future demand.

8.4 Allocation

The unified ORB allocation is illustrative and subject to DAO ratification, but the shape encodes the priorities of a work-first network: the plurality of tokens goes to the people and machines that do verified compute, data, and bandwidth work, with treasury and ecosystem reserves large enough to fund grants and buybacks, and team/investor allocations that are a minority of supply and fully subject to vesting.

Figure: illustrative unified ORB allocation. The plurality funds verified compute and network work; team and investor allocations are a minority and subject to vesting.
Compute & network rewards, 40%

Paid over years for verified Tier-1 training, Tier-2 inference, personalization, data, and bandwidth work. Declining emission schedule.

Ecosystem / community, 15%

Developer grants, integrations, wedge-market growth, community incentives.

DAO treasury, 15%

Protocol-owned reserve for buyback+burn, compute procurement, and progressive DAO-directed spending.

Team (vested), 15%

Multi-year vesting with cliff; aligns builders without a liquid overhang at launch.

Investors (vested), 10%

Vested per standard schedules; minority of supply, subordinate to the work-reward pool.

Liquidity / reserve, 5%

Exchange and on-chain liquidity provisioning and contingency reserve.
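The illustrative split can be sanity-checked mechanically. In the sketch below the percentages come from the allocation above, while the total supply passed to `token_amounts` is a placeholder chosen only for the arithmetic; this section does not state a total supply.

```python
# Percentages from the illustrative allocation above.
ALLOCATION_PCT = {
    "compute_network_rewards": 40,
    "ecosystem_community": 15,
    "dao_treasury": 15,
    "team": 15,
    "investors": 10,
    "liquidity_reserve": 5,
}


def token_amounts(total_supply: int) -> dict[str, int]:
    """Split a (hypothetical) total supply according to ALLOCATION_PCT."""
    if sum(ALLOCATION_PCT.values()) != 100:
        raise ValueError("allocation percentages must sum to 100")
    return {name: total_supply * pct // 100 for name, pct in ALLOCATION_PCT.items()}


# Placeholder supply, for illustration only.
amounts = token_amounts(1_000_000_000)
insiders_pct = ALLOCATION_PCT["team"] + ALLOCATION_PCT["investors"]
print(amounts["compute_network_rewards"], insiders_pct)
```

The buckets sum to 100%, the work-reward pool is the largest single bucket, and team plus investors together hold 25%, less than the 40% work-reward pool.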

8.5 Emissions Schedule: Declining, Revenue-Indexed, Not Headcount-Indexed

The 40% compute-and-network reward pool is released on a declining schedule tied to revenue milestones. The design intent is that as the network's real service revenue grows, the need to subsidize contribution with fresh emissions falls, and an increasing share of contributor payout is funded by demand-side spend and revenue-funded buyback rather than by inflation. The curve below is illustrative; the DAO sets the actual milestone thresholds and step-downs.

Figure: illustrative emissions vs. revenue-funded contributor payout over time. Emissions decline as revenue milestones are met; an increasing share of payout and value accrual comes from demand-side spend and revenue-funded burn. Y-axis: relative share of annual contributor payout / token flow (0 to 100). X-axis: Y0 (points phase) through Y5. Series: new token emissions; revenue-funded payout and burn.

Two properties are essential and both are borrowed from the DePIN survivors rather than the casualties:

  1. Reward decline is gated by revenue milestones, not by the calendar or the signup count. If revenue lags, emissions do not step down on a fixed date and dump supply into a market with no demand; if revenue accelerates, step-downs can come sooner. This is the direct inverse of the Pi/Helium headcount-FOMO curve.
  2. Operator payouts are USD-stabilized. Following the io.net IDE and Nosana pattern, jobs are priced and operators are paid against a fiat-denominated value, with the token emission throttled by revenue. Operators get predictable, dollar-referenced income; the protocol absorbs token-price volatility rather than passing it to the supply side, which keeps verified GPU supply from fleeing on a drawdown.
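The first property can be sketched as a function whose only input besides the base rate is revenue. The thresholds and multipliers below are hypothetical placeholders (the DAO sets the real ones), and `Milestone` and `emission_rate` are illustrative names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Milestone:
    annual_revenue_usd: float   # revenue threshold that must be crossed
    emission_multiplier: float  # fraction of the base rate once crossed


# Hypothetical values for illustration only; real thresholds are DAO-set.
MILESTONES = [
    Milestone(1_000_000, 0.75),
    Milestone(5_000_000, 0.50),
    Milestone(25_000_000, 0.25),
]


def emission_rate(
    base_rate: float,
    trailing_revenue_usd: float,
    milestones: list[Milestone] = MILESTONES,
) -> float:
    """Emission rate after applying the highest revenue milestone crossed.

    Deliberately takes no date and no device count: only revenue can
    step emissions down.
    """
    multiplier = 1.0
    for m in sorted(milestones, key=lambda m: m.annual_revenue_usd):
        if trailing_revenue_usd >= m.annual_revenue_usd:
            multiplier = m.emission_multiplier
    return base_rate * multiplier
```

This version is stateless, so a revenue decline would step emissions back up. Whether step-downs should instead be permanent once reached is a policy choice the text leaves to the DAO.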
Why GPUs, Not Phones, Drive The Capital Question

Emissions fund all verified tiers, but the FLOPs that determine frontier capability come from Tier-1 GPU nodes, which are bought with capital, not headcount. "Out-compute everyone" is therefore ultimately a treasury question (§8.8), which is exactly why treasury and hardware-financing mechanics are first-class parts of the token design and not an afterthought.

8.6 Value Accrual: Burn-and-Mint and Revenue-Funded Buyback

ORB accrues value through a combination of mechanisms proven in the only DePIN networks that have sustained real cash flow. The unifying principle is that tokens are removed from supply in proportion to real, paid usage and real service revenue, so that demand for the network is expressed as demand for the token.

Burn-and-mint on paid jobs (Render precedent)

Paid AI and compute jobs are priced against a fiat reference. When a job is paid, the fiat-referenced value is burned in ORB terms, and the corresponding reward to the contributor who performed the verified work is minted/released from the reward pool. Usage therefore directly consumes token supply, coupling the token to work delivered rather than to speculation. This is the Render burn-and-mint model applied to Hylon's training/inference marketplace.
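A minimal sketch of one paid-job settlement, assuming a single ORB/USD reference price is available at settlement time. The function name, the `Settlement` record, and all numbers are illustrative assumptions, not the protocol's actual accounting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Settlement:
    burned_orb: float          # ORB removed from circulation by the payment
    circulating_supply: float  # supply after burn and reward release
    reward_pool: float         # reward pool after the contributor is paid


def settle_paid_job(
    price_usd: float,
    orb_price_usd: float,
    reward_orb: float,
    circulating_supply: float,
    reward_pool: float,
) -> Settlement:
    """Burn the fiat-referenced job price in ORB, then release the reward."""
    if orb_price_usd <= 0:
        raise ValueError("orb_price_usd must be positive")
    if reward_orb > reward_pool:
        raise ValueError("reward exceeds remaining reward pool")
    burned = price_usd / orb_price_usd
    return Settlement(
        burned_orb=burned,
        circulating_supply=circulating_supply - burned + reward_orb,
        reward_pool=reward_pool - reward_orb,
    )
```

With a $10 job, ORB at $0.50, and a 15 ORB reward, 20 ORB is burned and circulating supply falls by 5. Circulating supply shrinks only when the burn exceeds the reward released for the same job.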

Burn from real service revenue (Helium Mobile precedent)

A defined share of genuine service revenue, from OrbNet VPN subscriptions, enterprise/API access, and priority-compute fees, funds ongoing buyback and burn. Because this burn is sourced from cash revenue and not from inflation, it creates a deflationary pressure that scales with the network's actual usefulness. This mirrors Helium Mobile's model of burning against real telecom revenue, and it is the mechanism that most directly enforces value-before-emissions on the demand side.

USD-stabilized operator payouts (io.net IDE / Nosana precedent)

As noted in §8.5, operator compensation is dollar-referenced. This is a value-accrual mechanism as much as a stability one: by decoupling operator income from short-term token price, it prevents the supply-side capitulation that turns a price dip into a supply collapse into a further price dip.
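The stabilizing effect is simple to state in code. The sketch below assumes the operator is owed a fixed dollar amount and is paid the ORB equivalent at the settlement-time reference price; `orb_payout` is an illustrative helper, not a documented interface.

```python
def orb_payout(usd_owed: float, orb_price_usd: float) -> float:
    """ORB needed to deliver a fixed dollar-referenced payout."""
    if orb_price_usd <= 0:
        raise ValueError("orb_price_usd must be positive")
    return usd_owed / orb_price_usd


# The operator is owed $100 for verified work in both scenarios.
before_drawdown = orb_payout(100.0, 0.50)  # token at $0.50
after_drawdown = orb_payout(100.0, 0.25)   # token halves to $0.25
print(before_drawdown, after_drawdown)
```

The operator's dollar income is unchanged when the token halves; what changes is the ORB the protocol must supply (200 versus 400 here), which is the volatility the protocol absorbs on the operator's behalf.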

The ORB value-accrual loop: verified work is rewarded; paid usage and service revenue fund burn; declining emissions plus growing burn invert supply pressure toward demonstrated value.

  1. Verified work. GPU training, device inference, personalization, data, and bandwidth, verified via TOPLOC hashing, loss-scoring, TEE, and slashable stake.
  2. Earn ORB. Rewards come from declining emissions plus growing demand-side spend; operator pay is USD-stabilized.
  3. Spend ORB. Users and enterprises pay for AI, priority compute, API, and OrbNet services.
  4. Burn. Burn-and-mint on paid jobs (Render) plus buyback-and-burn from real service revenue (Helium Mobile).
  5. Scarcity tracks demand. As revenue grows, burn grows and emissions shrink; supply pressure inverts toward delivered value.
  6. Stronger network. Treasury and revenue fund more verified compute and better models, driving more real work. The loop then returns to step 1.
The Accrual Loop

Verified work is rewarded from emissions (declining) and from demand-side spend (growing). Demand-side spend and a share of service revenue fund burn. As revenue grows, burn grows and emissions shrink: supply pressure inverts, and the token increasingly reflects delivered value, not subsidy.
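The inversion can be illustrated with a toy supply model. The yearly emission and burn figures below are invented solely to show the shape of the curve and carry no forecast; the function names are illustrative.

```python
from typing import Optional


def simulate_supply(initial: int, emissions: list[int], burn: list[int]) -> list[int]:
    """Circulating supply at the end of each year: add emissions, subtract burn."""
    supply = initial
    path = []
    for emitted, burned in zip(emissions, burn):
        supply = supply + emitted - burned
        path.append(supply)
    return path


def first_net_deflation_year(emissions: list[int], burn: list[int]) -> Optional[int]:
    """Index of the first year in which burn exceeds emissions, if any."""
    for year, (emitted, burned) in enumerate(zip(emissions, burn)):
        if burned > emitted:
            return year
    return None


# Hypothetical schedule: emissions step down while burn grows with revenue.
EMISSIONS = [100, 80, 60, 40, 20]
BURN = [10, 30, 60, 90, 120]

print(simulate_supply(1000, EMISSIONS, BURN))
print(first_net_deflation_year(EMISSIONS, BURN))
```

In this toy schedule supply rises for two years, is flat in the third, and contracts from the fourth year (index 3) onward. The real crossover point depends entirely on realized revenue.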

8.7 Staking and Slashing

ORB staking provides the economic security layer for verified work and the substrate for governance.

Operator staking (training rights)

Tier-1 GPU operators and high-value inference operators post an ORB bond to be eligible for slashable work. Faulty, fraudulent, or unavailable work is penalized by slashing the bond. This makes the cost of cheating exceed the reward, and complements the cryptographic/economic verification (TOPLOC hashing, redundant loss-scoring, TEE attestation) described in earlier sections.

User / holder staking (governance + rewards)

Holders stake to obtain governance weight in the DAO and to earn a share of protocol rewards. Governance weight over reward rates, treasury, upgrades, and grants accrues progressively per the decentralization schedule; safety and release gates remain outside the token vote.

Slashing conditions

Bonds are slashed for invalid inference (a failed activation-hash check), invalid training contributions (a statistical/redundancy outlier or a TEE attestation failure), and unmet availability commitments on reserved-priority jobs. Slashing parameters are DAO-governed.

Staking economically binds the right to earn to capital at risk, which is the second half of the anti-ghost-network defense: verification proves work happened, and slashing makes lying about it unprofitable.
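The deterrence condition can be written down explicitly. The sketch assumes a risk-neutral operator who, if caught, forfeits both the illicit gain and the entire bond, and who is caught with probability `p_detect`; these are modeling assumptions, and actual slashing parameters are DAO-governed.

```python
def cheating_expected_value(gain: float, bond: float, p_detect: float) -> float:
    """Expected profit from one cheating attempt.

    With probability (1 - p_detect) the cheat succeeds and earns `gain`;
    with probability p_detect it is caught and the bond is slashed.
    """
    if not 0.0 <= p_detect <= 1.0:
        raise ValueError("p_detect must be in [0, 1]")
    return (1.0 - p_detect) * gain - p_detect * bond


def min_bond(gain: float, p_detect: float) -> float:
    """Bond at which cheating breaks even; any larger bond makes it a loss."""
    if not 0.0 < p_detect <= 1.0:
        raise ValueError("p_detect must be in (0, 1]")
    return gain * (1.0 - p_detect) / p_detect
```

The lower the detection probability, the larger the bond must be: at a 5% detection rate the break-even bond is about 19 times the gain from cheating, which is why verification strength and bond size have to be set together.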

8.8 Hardware Financing: Corporate Debt, Not User Yield

Scaling Tier-1 training FLOPs is a capital problem, and Hylon finances hardware the way a capital-intensive company should, with corporate debt. Hylon Labs and/or the DAO treasury borrow against GPUs and hardware as collateral. This is non-dilutive to ORB holders: it adds compute capacity without minting new tokens, and it is serviced from service revenue and enterprise cash flow.

Hard Line (Securities Risk)

Hylon does not offer any user-facing fixed-interest, guaranteed-yield, or "invest in a GPU and earn X%" product. Such instruments are securities and are out of scope. Hardware financing is a corporate/treasury debt function, kept entirely separate from what contributors and users touch.

Optional hardware-buyback consolidation (stated honestly)

Over time the company or DAO may buy back operator hardware to consolidate compute into higher-utilization, better-verified clusters, improving reliability, verification assurance, and per-FLOP economics for the training tier. This must be stated plainly: buying back and consolidating hardware re-centralizes that portion of the physical fleet. Hylon treats this as a deliberate, transparent trade-off: decentralization of the inference/personalization/data layer (the million-device Tier-2) is architecturally durable, while the training tier may consolidate for capital efficiency and assurance, with the direction of that trade-off subject to DAO oversight as governance decentralizes.

8.9 Summary of Monetary Discipline

Lever | Rule | Failure it prevents
Rewards | Only for verified work; never idle uptime or headcount | io.net (327k registered vs 6.7k verified)
Bootstrapping | Points before token (Grass precedent) | Emitting tokens before demand exists
Emissions | Declining, gated by revenue milestones | Pi (-91% via headcount-FOMO emissions)
Burn | From paid jobs + real service revenue (Render, Helium Mobile) | Inflation-only "value"
Operator pay | USD-stabilized (io.net IDE, Nosana) | Supply flight on drawdowns
Hardware | Corporate debt, non-dilutive; no user yield product | Securities exposure; dilution

ORB is engineered so that every unit in circulation traces back to verified work or paid demand, and every deflationary force traces back to real revenue. That is the whole of the design: a token whose scarcity is a measurement of the network's usefulness, backstopped from day one by a live, revenue-generating business rather than by a promise.

09

Governance, the DAO & Progressive Decentralization

Hylon is a network that trains, releases, and operates a frontier-directed AI system while distributing real economic rewards across a global fleet of contributors. That combination imposes two hard constraints on governance that most token projects never have to reconcile. First, the network must credibly hand real control (reward rates, treasury allocation, protocol upgrades) to its token holders, or the decentralization claim is theater. Second, it must never place the release of a powerful AI model, or the safety constraints that bound it, at the mercy of a fluctuating token vote, a flash-loan-financed governance attack, or a well-organized minority optimizing for token price over public safety. Hylon resolves this with a deliberately bicameral design: an economic constitution that decentralizes progressively and irreversibly to the DAO on published milestones, and a narrow, transparent, non-votable safety mandate held by a technical Safety Council. This section describes the legal entities that hold assets and liability, the exact powers that transfer to the DAO and on what schedule, the powers that deliberately never do, and the concrete mechanics of voting, treasury, and proposals.

9.1 Entity Structure

Hylon separates three functions that are frequently, and dangerously, conflated in DePIN and AI-token projects: the operating business that builds and commercializes the technology, the ownerless steward that holds the protocol and treasury on behalf of the network, and the token-holder governance body that directs the protocol over time. Each is a distinct legal and functional entity, and the separation is what makes both the regulatory posture and the progressive-decentralization commitment credible rather than rhetorical.

Hylon governance and entity structure: operating company, ownerless steward, issuer, and token-holder DAO, with the Safety Council holding the release gate.
  • Hylon Network
    Decentralized AI on a live product (OrbNet/OrbVPN)
    • Hylon Labs
      Operating company, employs teams; owns models, IP, codebase, trademarks; holds API + enterprise + product revenue and commercial liability
    • Hylon Foundation (Cayman)
      Ownerless, purpose-bound steward of protocol, ORB token & treasury; funds grants; legal wrapper for the DAO
      • BVI Issuer
        Token issuance, crypto-asset white paper, listing & transfer restrictions, isolates token-law surface
      • DAO Treasury
        Stewarded by Foundation, progressively directed by the DAO; funded by protocol allocation + revenue-funded buybacks
    • The DAO (ORB holders)
      Token-holder governance, progressively assumes reward rates, treasury allocation, protocol upgrades, grants
    • Safety Council
      Technical body, final model-release gate + kill switch + Level-0 foundations; NOT a token vote; transparent & accountable to the DAO

Hylon Labs, the operating company

Hylon Labs is a conventional operating company. It employs the engineering, research, product, safety, and go-to-market teams; it owns and operates the OrbNet/OrbVPN product and its live revenue; it owns the model weights, the training pipeline, the Self-Improvement Engine, the codebase, and the trademarks; and it holds the commercial contracts, enterprise licensing, API revenue, and infrastructure agreements. Labs is where salaries, corporate debt (including the hardware-financing facilities described in the token and economics section), payroll liability, and commercial IP live. It is the entity a customer contracts with and the entity that carries employment and product liability. Critically, Labs earns revenue from products and licensing, not from issuing tokens, which is the separation regulators look for when distinguishing a genuine operating business from a fundraising vehicle.

Hylon Foundation, the ownerless steward

The Hylon Foundation is a Cayman Islands foundation company: an ownerless, purpose-bound legal person with no shareholders and no members who can extract its assets. Its constitutional documents bind it to a fixed purpose, stewarding the Hylon protocol, the ORB token, and the network treasury for the benefit of the network and its participants, and that purpose cannot be redirected to enrich any individual. The Foundation stewards the protocol's on-chain parameters, holds and administers the DAO treasury under the DAO's direction as decentralization progresses, funds ecosystem grants, and serves as the legal wrapper that lets a decentralized collective interact with the off-chain world (sign an audit contract, hold a bank account, defend the trademark). The ownerless structure is deliberate: because no one owns the Foundation, the treasury it stewards cannot be treated as any founder's asset, and the pathway to DAO control is a genuine transfer of stewardship rather than a favor that can be revoked.

The BVI issuer

Token issuance and the associated distribution mechanics are handled through a British Virgin Islands issuer affiliated with the Foundation. Segregating the issuance function into a purpose-specific BVI entity isolates the securities-and-token-law surface (issuance, the published crypto-asset white paper, listing arrangements, and transfer restrictions) from both the operating company's commercial liability and the Foundation's stewardship role. This is standard, defensible structuring for a compliant token launch rather than an exotic arrangement.

The DAO, token-holder governance

The DAO is the collective of ORB holders exercising on-chain governance rights. It is not, at genesis, in control of everything; it is the body to which control progressively transfers. Over the decentralization timeline it assumes authority over reward-rate parameters, treasury allocation, protocol upgrades, and the grants program. The DAO expresses its will through on-chain proposals and votes executed by the Foundation and the protocol's governance contracts. It is the destination of the economic constitution, and, by explicit design, it is not the body that decides model releases or safety constraints.

WHY THREE ENTITIES

Labs carries commercial IP, revenue, and liability. The Foundation is ownerless and purpose-bound, so the treasury it stewards can never be an individual's asset. The BVI issuer isolates token-law surface. This separation is what makes both regulatory compliance and the handover to the DAO credible rather than cosmetic.

9.2 What the DAO Governs, and the Progressive Decentralization Timeline

Decentralization at Hylon is progressive, milestone-gated, and one-directional. It is not a single "hand over the keys" event, and it is explicitly not tied to dates that can slip or be spun; it is tied to published, verifiable milestones that mirror the gate-based discipline used everywhere else in this document. Powers transfer to the DAO in an order chosen so that each transfer is safe: the network first decentralizes the parameters where token-holder incentives are well-aligned and the blast radius of a mistake is bounded, then progressively the parameters where more is at stake. The full transfer of economic control unfolds over roughly two to four years.

Progressive decentralization: economic control transfers to the DAO in four milestone-gated phases over ~2-4 years; safety and releases never transfer.

  • Genesis (Foundation-stewarded launch): the Foundation stewards protocol and treasury; DAO governance contracts are live; the safety gate is held by the Safety Council from day one.
  • Phase 1 (reward rates → DAO): the DAO assumes the emission schedule (within hard caps), tier/work-type reward splits, and contribution multipliers, bounded by the value-before-emissions throttle.
  • Phase 2 (treasury allocation → DAO): the DAO directs compute procurement, ecosystem funding, liquidity, and the revenue-funded buyback-and-burn cadence.
  • Phase 3 (protocol upgrades → DAO): the DAO controls on-chain contract upgrades behind timelocks and mandatory audits; the safety-system carve-out is preserved.
  • Phase 4 (grants & ecosystem → DAO): the DAO fully owns the grants program and ecosystem-development direction; economic decentralization is complete.
  • Permanent (safety stays with the Council): model releases and safety constraints never become a token vote; they remain transparent to and auditable by the DAO.

The order of transfer, and why

  1. Reward rates first. The earliest power to transfer is control over reward-rate parameters: the emission schedule within its hard protocol caps, the split of rewards across device tiers and work types, and the multipliers that price verified compute and data contributions. This transfers first because token holders and contributors are the parties most directly affected and most competent to tune it, and because these parameters sit inside immutable guardrails (a maximum emission ceiling and the value-before-emissions revenue throttle) that bound any error.
  2. Treasury allocation second. Next the DAO assumes authority over how the treasury is deployed: funding for compute procurement, ecosystem programs, liquidity, buyback-and-burn cadence funded from genuine service revenue, and operational reserves. This is sequenced after reward rates because it carries larger single-decision stakes and benefits from the DAO having first built a track record on lower-risk parameters.
  3. Protocol upgrades third. The DAO then gains control over upgrades to the protocol's on-chain logic: the staking, verification-reward, voucher-claim, and governance contracts themselves. Because a bad upgrade can be catastrophic and irreversible, this transfers only after the DAO is mature, and it is gated by timelocks, mandatory audits, and (as detailed below) explicit carve-outs preventing governance from reaching into the safety-critical release machinery.
  4. Grants and ecosystem direction fourth. Finally the DAO assumes full ownership of the grants program and ecosystem-development direction (funding third-party builders, integrations, research bounties, and market-specific initiatives), completing the transfer of the network's discretionary spending to its community.

| Governed domain | Transfer phase | Immutable guardrail that always remains |
| --- | --- | --- |
| Reward-rate parameters | Phase 1 (early) | Hard max-emission cap; revenue-throttled emissions |
| Treasury allocation | Phase 2 | Purpose-bound Foundation mandate; no self-dealing |
| Protocol upgrades | Phase 3 | Timelock + mandatory audit; safety-system carve-out |
| Grants & ecosystem | Phase 4 | Foundation purpose constraint |
| Model releases | Never (Safety Council) | Level-0 immutable foundations; kill switch |
| Safety constraints | Never (Safety Council) | Level-0 immutable foundations |

MILESTONE-GATED, NOT DATED

Each phase unlocks on a published milestone, e.g. a minimum count of independent verified contributors, a minimum quorum of active governance participation, completed audits, and a demonstrated track record on the prior phase, not on a calendar promise. This is the same gate discipline used for the roadmap: the network claims a decentralization milestone only once it is crossed.
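
As a minimal sketch of the milestone gating described above, the snippet below unlocks a phase only when all of its milestones are met and every earlier phase is already unlocked. The milestone names and threshold values are hypothetical placeholders, not published parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Milestone:
    name: str
    required: float  # hypothetical threshold published for the phase
    observed: float  # measured value at evaluation time

    def met(self) -> bool:
        return self.observed >= self.required


@dataclass(frozen=True)
class Phase:
    number: int
    domain: str
    milestones: tuple


def unlocked_phases(phases):
    """Return the phase numbers unlocked so far, in order.

    A phase unlocks only if all of its milestones are met AND every
    earlier phase is already unlocked, so the walk stops at the first gap.
    """
    unlocked = []
    for phase in sorted(phases, key=lambda p: p.number):
        if not all(m.met() for m in phase.milestones):
            break
        unlocked.append(phase.number)
    return unlocked


phases = [
    Phase(1, "Reward-rate parameters", (
        Milestone("independent_verified_contributors", 1000, 1500),
        Milestone("governance_participation", 0.10, 0.12),
    )),
    Phase(2, "Treasury allocation", (
        Milestone("completed_audits", 1, 0),  # not yet met
    )),
    Phase(3, "Protocol upgrades", (
        Milestone("completed_audits", 2, 3),  # met, but phase 2 is still locked
    )),
]
print(unlocked_phases(phases))  # [1]
```

Phase 3 stays locked even though its own milestone is met, which mirrors the requirement of a demonstrated track record on the prior phase.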

9.3 What Stays With the Safety Council, Permanently and by Design

Two powers deliberately never transfer to a token vote: the decision to release a model, and the definition and enforcement of the network's safety constraints. These are held by a technical Safety Council. This is the single most important governance decision in the whole design, and it is a considered one, not an accident of centralization.

The rationale

A frontier-directed AI system cannot have its safety decided by a fluctuating, financially-incentivized, potentially-capturable vote. Token-holder governance optimizes, correctly, for the value of the network, but the marginal token holder's incentive to ship a more capable model faster is not the same as the public's interest in that model being safe, and the gap is exactly where catastrophe lives. On-chain governance is also attackable in ways that safety cannot tolerate: vote-buying, flash-loan-borrowed voting power, bribery markets, and low-turnout capture are documented, recurring failure modes. A model that has just cleared a red-team suite must not be releasable because a whale accumulated tokens over a weekend; a safety constraint must not be removable because a proposal passed at 3% quorum. The release gate and the safety constraints therefore sit with a body selected for technical and safety competence, operating under a fixed mandate, rather than with the token electorate.

What the Safety Council controls

  • The final release gate. In the champion-challenger pipeline, a challenger is promoted to a new named version only if it beats the live champion by a defined margin on the fixed gate suite (general benchmarks, wedge-domain evals, blind personal-context A/B, and safety red-team), with statistical significance and zero safety regressions. The Safety Council holds the final human sign-off on that promotion. It can veto a release that clears the automated gates; it cannot be compelled by a vote to ship one that does not.
  • The kill switch. The Council can halt a live model, trigger automatic rollback to the prior champion, or freeze the release pipeline in response to a discovered safety failure, at any time and without a governance vote.
  • The Level-0 immutable foundations. Truth, safety, honesty, and integrity are hardcoded objectives that cannot be self-modified by the system and cannot be edited by governance. The Council is the human guarantor that the L0 < L1 < L2 < L3 objective hierarchy holds and that the gaming detector (L3 metrics improving while L1, L2 do not) is honored.
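
The asymmetry in the release gate (the Council can veto, but nothing can force a failing challenger through) can be sketched as follows. This is an illustrative model only: `SuiteResult`, the suite names, and the `min_margin` and `alpha` defaults are assumptions, not the network's actual gate parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SuiteResult:
    suite: str               # e.g. "general", "wedge", "personal_ab", "safety_redteam"
    margin: float            # challenger score minus champion score
    p_value: float           # significance of that difference
    safety_regressions: int  # count of safety checks that got worse


def automated_gates_pass(results, min_margin=0.01, alpha=0.05):
    """True only if every suite clears margin, significance, and zero-regression checks."""
    results = list(results)
    return bool(results) and all(
        r.margin >= min_margin and r.p_value < alpha and r.safety_regressions == 0
        for r in results
    )


def release_decision(results, council_signoff):
    """Council sign-off is necessary but never sufficient: it can veto, not force."""
    if not automated_gates_pass(results):
        return "rejected"
    return "promoted" if council_signoff else "vetoed"


passing = [
    SuiteResult("general", 0.03, 0.01, 0),
    SuiteResult("safety_redteam", 0.02, 0.02, 0),
]
regressed = passing + [SuiteResult("wedge", 0.05, 0.001, 1)]

print(release_decision(passing, council_signoff=True))    # promoted
print(release_decision(passing, council_signoff=False))   # vetoed
print(release_decision(regressed, council_signoff=True))  # rejected
```

Note that there is no code path from a failed automated gate to "promoted", whatever the sign-off value is.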

Transparency and accountability, the check on the Council

Narrow power without accountability is just a different centralization. The Safety Council's mandate is therefore bounded and made transparent to the DAO rather than opaque. Release decisions, the gate-suite results behind them, promotions, rollbacks, and kill-switch activations are recorded on-chain and reported to the DAO. The Council's remit is strictly the safety-and-release gate; it holds no authority over reward rates, treasury, or economics, which are the DAO's domain. The DAO retains legitimate oversight instruments: visibility into every gated decision, the ability to fund independent audits and external red-teams, and governance authority over Council composition rules and mandate at the constitutional level. It is nonetheless structurally prevented from reaching in to force a specific unsafe release. In short: the DAO owns the economy and can hold the Council accountable; the Council owns the safety gate and cannot be overridden into shipping something unsafe. Neither can quietly become the other.

THE HARD LINE

Model releases and safety constraints are never a token vote. This is not a temporary training-wheels measure that decentralizes later; it is a permanent constitutional carve-out, because "safety by fluctuating vote" is a failure mode, not a feature. What the DAO gets instead is full transparency into every safety decision and the tools to audit it.

9.4 Voting, Treasury & Proposal Mechanics

Voting

Governance power derives from ORB, with mechanics chosen to resist the documented capture failure modes rather than to maximize nominal decentralization on paper.

Voting weight

Staked/locked ORB, with vote weight scaling with lock duration (longer commitment → more weight), aligning voting power with long-term network alignment rather than transient balances.

Anti-flash-loan

Only ORB locked before a proposal's snapshot block can vote on it; borrowed-for-a-block voting power is structurally excluded.

Quorum & thresholds

Proposals require a minimum quorum to be valid and higher approval thresholds for higher-impact classes (a protocol upgrade needs a supermajority and higher quorum than a routine grant).

Delegation

Holders may delegate voting power to domain-expert delegates, with delegation revocable at any time, raising effective participation without forcing every holder to evaluate every technical proposal.

Timelock

Passed proposals that touch protocol logic or treasury movements execute only after a mandatory timelock, giving the network time to react to a malicious or buggy proposal before it takes effect.
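
The voting rules above can be illustrated with a short sketch. It assumes linear lock-duration weighting, a four-year maximum lock, and two made-up impact classes; none of these values come from the protocol, and delegation is left out for brevity.

```python
from collections import defaultdict
from dataclasses import dataclass

MAX_LOCK_DAYS = 1460  # hypothetical maximum lock (four years)

# Hypothetical impact classes: (quorum as a share of eligible weight, approval threshold).
IMPACT_CLASSES = {
    "routine_grant": (0.05, 0.50),
    "protocol_upgrade": (0.20, 0.67),
}


@dataclass(frozen=True)
class Lock:
    holder: str
    amount: float
    lock_days: int
    locked_at_block: int


def vote_weight(lock, snapshot_block):
    """Weight grows linearly with lock duration; locks made at or after the snapshot count for nothing."""
    if lock.locked_at_block >= snapshot_block:
        return 0.0
    return lock.amount * min(lock.lock_days, MAX_LOCK_DAYS) / MAX_LOCK_DAYS


def proposal_passes(locks, votes, snapshot_block, impact_class):
    """votes maps holder -> True (for) or False (against)."""
    quorum, threshold = IMPACT_CLASSES[impact_class]
    weights = defaultdict(float)
    for lock in locks:
        weights[lock.holder] += vote_weight(lock, snapshot_block)
    eligible = sum(weights.values())
    cast = sum(weights[h] for h in votes)
    in_favor = sum(weights[h] for h, choice in votes.items() if choice)
    if eligible == 0 or cast == 0 or cast / eligible < quorum:
        return False
    return in_favor / cast > threshold


locks = [
    Lock("alice", 100, 1460, locked_at_block=10),
    Lock("bob", 100, 730, locked_at_block=10),
    Lock("whale", 10_000, 1460, locked_at_block=60),  # locked after the snapshot
]
votes = {"alice": True, "bob": False, "whale": False}

print(proposal_passes(locks, votes, snapshot_block=50, impact_class="routine_grant"))     # True
print(proposal_passes(locks, votes, snapshot_block=50, impact_class="protocol_upgrade"))  # False
```

The late-locking whale contributes zero weight, so the outcome is decided by the two pre-snapshot lockers: two thirds in favor clears the grant threshold but not the stricter upgrade threshold.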

Treasury management

The treasury is stewarded by the Foundation and directed, progressively, by the DAO. Its funding sources are genuine: the network's protocol allocation, and, importantly, buyback flows funded from real service revenue under the value-before-emissions discipline, never from inflation. Deployments follow the governed categories: compute procurement (including toward the DAO-owned GPU cluster that anchors decentralized training), ecosystem grants, liquidity and reserves, and revenue-funded buyback-and-burn. Movements above defined thresholds require a passed proposal, execute behind a timelock, and are recorded on-chain. The Foundation's purpose-bound, ownerless constitution is the backstop that prevents treasury capture: even a passed proposal cannot direct funds outside the Foundation's mandate or into self-dealing.

The proposal lifecycle

Figure: The Hylon proposal lifecycle, from open discussion to timelocked on-chain execution (01 Temperature check → 02 Draft HIP → 03 Review & audit → 04 On-chain vote → 05 Timelock → 06 Execution).

  1. Temperature check. An idea is posted for open discussion in the governance forum; informal sentiment gauges whether it is worth formalizing.
  2. Draft proposal. A formal Hylon Improvement Proposal (HIP) is written to a standard template covering motivation, specification, parameter changes, security and safety considerations, and (where relevant) confirmation that it does not intrude on the Safety Council carve-out.
  3. Review & audit. Proposals touching protocol logic or significant treasury sums undergo technical review and, where required, independent audit before they can reach a vote.
  4. On-chain vote. The proposal goes to a snapshot-based on-chain vote with the quorum and threshold appropriate to its impact class; only pre-snapshot locked ORB is eligible.
  5. Timelock. A passed proposal enters a mandatory delay window, publicly visible, before execution.
  6. Execution. The Foundation and governance contracts execute the approved action; the outcome is recorded on-chain.
SCOPE GUARD

The proposal system enforces the constitutional boundary at the schema level: a proposal cannot be crafted to force a specific model release, remove a Level-0 foundation, or override a Safety Council decision. Economic governance and safety governance run on separate rails on purpose.
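
A minimal sketch of the six-stage lifecycle and the schema-level scope guard is shown below. The action identifiers, the `Proposal` class, and the block-based timelock are illustrative assumptions rather than the deployed contract interface.

```python
from enum import IntEnum


class Stage(IntEnum):
    TEMPERATURE_CHECK = 1
    DRAFT = 2
    REVIEW = 3
    VOTE = 4
    TIMELOCK = 5
    EXECUTED = 6


# Hypothetical action identifiers reserved to the Safety Council.
OUT_OF_SCOPE_ACTIONS = frozenset({
    "force_model_release",
    "remove_level0_foundation",
    "override_safety_council",
})


class ScopeError(ValueError):
    """Raised when a proposal tries to cross the constitutional boundary."""


class Proposal:
    def __init__(self, title, action, timelock_blocks):
        # Scope guard: out-of-scope proposals cannot even be constructed.
        if action in OUT_OF_SCOPE_ACTIONS:
            raise ScopeError(f"{action!r} is outside the scope of token governance")
        self.title = title
        self.action = action
        self.timelock_blocks = timelock_blocks
        self.stage = Stage.TEMPERATURE_CHECK
        self.executable_at = None

    def advance(self, current_block, vote_passed=True):
        """Move to the next stage, enforcing the vote outcome and the timelock."""
        if self.stage is Stage.EXECUTED:
            raise RuntimeError("proposal already executed")
        if self.stage is Stage.VOTE:
            if not vote_passed:
                raise RuntimeError("vote failed; proposal cannot proceed")
            self.executable_at = current_block + self.timelock_blocks
        if self.stage is Stage.TIMELOCK and current_block < self.executable_at:
            raise RuntimeError("timelock has not elapsed")
        self.stage = Stage(self.stage + 1)
        return self.stage


proposal = Proposal("Fund research bounties", "treasury_transfer", timelock_blocks=10)
for block in (1, 2, 3, 4):
    proposal.advance(block)      # ends in TIMELOCK, executable at block 14
print(proposal.stage.name)       # TIMELOCK
print(proposal.advance(14).name) # EXECUTED
```

Rejecting forbidden actions in the constructor is one way to express "enforced at the schema level": the proposal never exists in a form that could reach a vote.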

Taken together, these mechanics implement a specific philosophy of decentralization: give the community genuine, irreversible ownership of the network's economy on a milestone schedule it can verify, while permanently ring-fencing the two decisions, releasing a powerful model and constraining its behavior, where majority-rules-by-token is not a safe way to decide. The economy belongs to the token holders; the safety gate belongs to a competent, transparent, accountable body; and neither is allowed to swallow the other.

11

Roadmap, Risks, Competition & Reference

11.1 The Gate Discipline: Claims Follow Evidence

Hylon's roadmap is organized around gates, not dates. A gate is a bundle of shipped deliverables and measured key performance indicators (KPIs) that must all be satisfied before the network makes the corresponding public claim. This is a deliberate inversion of the DePIN norm, in which networks announce a destination ("decentralized supercomputer," "the AI economy") and then spend years, and emissions, trying to grow into the slogan. Every superlative in this whitepaper is tied to a specific gate, and the network commits to stating each claim only after its gate is independently verifiable. Until Gate 1's evaluation suite is passed and published, Hylon does not claim wedge dominance; until Gate 3's compute-and-recipe threshold is crossed, Hylon does not claim general-frontier parity. The mission, to ultimately out-compete every centralized lab, is stated plainly as a destination, but the method is a sequence of gates each of which is falsifiable.

CORE DISCIPLINE

A claim is not made when it is aspirational; it is made when its gate is crossed and the evidence is on-chain and reproducible. "Value before emissions, evidence before claims" is the same discipline applied to both the token and the marketing.

11.2 The Four Gates in Detail

The roadmap advances through four gates spanning roughly three years, with the fourth explicitly conditional. Each gate is defined by (a) concrete deliverables, (b) numeric KPIs that gate promotion, and (c) the specific public claim it unlocks.

The four gates: each public claim is made only when its gate is crossed and verifiable.

| Timing | Gate | Summary |
| --- | --- | --- |
| ~6 months | Gate 0, Sovereign Stack Live | Personal AI + two-tier compute + ORB rails + DAO shipped as ONE product on a top open model, self-hosted, private, uncensored, atop live OrbNet. Claim: strongest AI for your life, private, available where others are blocked. |
| 12 to 18 months | Gate 1, Wedge Dominance | Continually-trained model beats all open models on wedge domains (Farsi/Arabic/Russian/Turkish, censored-market, private personal-context) and wins blind A/B vs frontier on personal-context tasks. |
| Year 2 | Gate 2, Self-Improving & Self-Releasing | DAO-owned GPU cluster; automated champion-challenger promotion with margin + significance + zero-safety-regression gates, shadow→%→full rollout, on-chain records, Safety Council gate + kill switch. |
| Year 3+ (CONDITIONAL) | Gate 3+, General-Frontier Program | Commences only when treasury-funded aggregate compute AND validated-recipe library both cross threshold. No date, no frontier-parity claim until crossed. |

Gate 0, Sovereign Stack Live (~6 months)

Gate 0 ships the entire Hylon system as one product on top of the live OrbNet/OrbVPN business: a compact on-device Personal AI, the two-tier compute fabric, the ORB token rails (staking, verified-work metering, signed-voucher claims, gasless relayer, OFSI/OFAC screening), and the DAO governance shell, all running on the strongest available permissively-licensed open-weight foundation model, self-hosted, private, and uncensored. Because OrbNet already has paying users in restricted markets, Gate 0 launches into a real user base rather than a waitlist.

Deliverables

Personal AI (1 to 4B-class / BitNet ternary) shipping in the OrbNet app (iOS foreground-while-charging, Android within FGS caps); Tier-2 inference serving live on member devices; TOPLOC-style inference verification in production; ORB verified-work accrual + web/app conversion; DAO contracts deployed on Base; self-hosted sovereign inference with no hard external-API dependency.

KPIs

≥ 1 self-hosted open model served end-to-end with zero external-provider calls on the critical path; on-device personalization loop closing (local LoRA/context updates) with raw data never leaving the device; inference verification false-accept rate below target on a red-team probe set; first cohort of verified-work ORB payouts settled through OFSI/OFAC screening.
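
The inference-verification KPI above can be made concrete with a small sketch. The probe-set format and the 1% target are assumptions for illustration; the whitepaper does not state the target value.

```python
def false_accept_rate(probes):
    """probes: iterable of (is_valid_work, verifier_accepted) pairs.

    The false-accept rate is the share of INVALID submissions that the
    verifier nevertheless accepted.
    """
    invalid_outcomes = [accepted for is_valid, accepted in probes if not is_valid]
    if not invalid_outcomes:
        raise ValueError("probe set contains no invalid work to test against")
    return sum(invalid_outcomes) / len(invalid_outcomes)


def kpi_met(probes, target=0.01):  # hypothetical target
    return false_accept_rate(probes) < target


# 200 deliberately bad submissions (one slips through) plus 50 honest ones.
probe_set = [(False, False)] * 199 + [(False, True)] + [(True, True)] * 50
print(false_accept_rate(probe_set))  # 0.005
print(kpi_met(probe_set))            # True
```

Honest submissions do not affect this metric; they would feed a separate false-reject rate.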

Unlocks the claim

"The strongest AI for your life, private, self-hosted, and available where others are blocked."

Gate 1, Wedge Dominance (12 to 18 months)

Gate 1 turns the data flywheel. A continually-trained Hylon model, improved by RL on real (privacy-preserving) user tasks flowing from the Personal AI fleet, must beat every open-weight model on the wedge domains the centralized labs structurally under-serve: Farsi, Arabic, Russian, and Turkish; censored-market retrieval; and fully-private personal-context tasks. It must also win a blind A/B against frontier closed models on personal-context tasks, where Hylon's on-device memory is a decisive advantage the labs cannot replicate without the data.

Deliverables

Federated-learning + differential-privacy + secure-aggregation pipeline in production; wedge-domain evaluation suite (multilingual + censored-market + personal-context) published and versioned; continual-training loop on Tier-1 GPU nodes producing dated model versions.

KPIs

Statistically significant win over the best open models on the published wedge suite; ≥ 50% blind-A/B preference vs a named frontier model on personal-context tasks; DPIA current and membership-inference / model-inversion defenses validated against a documented attack battery.
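
A blind-A/B preference KPI is only meaningful with enough trials, so one plausible reading pairs the preference rate with a significance check. The sketch below uses a one-sided exact binomial test against a 50/50 null; pairing the two checks, the `alpha` value, and dropping ties are assumptions, not stated requirements.

```python
from math import comb


def binomial_p_value(wins, trials):
    """One-sided exact p-value: P(X >= wins) if raters had no real preference."""
    return sum(comb(trials, k) for k in range(wins, trials + 1)) / 2 ** trials


def ab_gate(wins, trials, alpha=0.05):
    """Pass only if Hylon is preferred at least half the time AND the result is significant."""
    return wins / trials >= 0.5 and binomial_p_value(wins, trials) < alpha


print(ab_gate(60, 100))  # True: 60% preference clears alpha = 0.05
print(ab_gate(55, 100))  # False: 55% of 100 trials is not yet significant
```

With only 100 comparisons a 55% preference cannot be distinguished from chance, which is why the trial count matters as much as the headline percentage.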

Unlocks the claim

"On the tasks that matter to our users, in the languages and markets the labs ignore, Hylon is the strongest AI available."

Gate 2, Self-Improving & Self-Releasing (Year 2)

Gate 2 stands up the DAO-owned Tier-1 GPU cluster and closes the automated champion-challenger loop. The self-improvement engine surveys the literature, proposes recipes, runs cheap small-scale ablations, and promotes a challenger to a new named version only if it beats the live champion by a defined margin on the fixed gate suite (general benchmarks + wedge evals + blind personal-context A/B + safety red-team) with statistical significance and zero safety regressions. Rollout proceeds shadow → percentage → full with automatic rollback, is recorded on-chain, and the DAO is notified. A technical Safety Council holds the final release gate and the kill switch.
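
The staged rollout with automatic rollback can be sketched as a simple loop. The stage names follow the text; the health-check booleans and the in-memory log (a stand-in for the on-chain record) are illustrative assumptions.

```python
ROLLOUT_STAGES = ("shadow", "percentage", "full")


def run_rollout(champion, challenger, health_checks):
    """Walk the stages in order; any failed health check rolls back to the champion.

    health_checks: one boolean per stage, in stage order.
    Returns (live_model, log), where log stands in for the on-chain record.
    """
    log = []
    for stage, healthy in zip(ROLLOUT_STAGES, health_checks):
        if not healthy:
            log.append((stage, "rollback", champion))
            return champion, log
        log.append((stage, "ok", challenger))
    # The challenger goes live only if every stage actually ran and passed.
    completed = len(log) == len(ROLLOUT_STAGES)
    return (challenger if completed else champion), log


live, log = run_rollout("v1", "v2", [True, True, True])
print(live)  # v2

live, log = run_rollout("v1", "v2", [True, False, True])
print(live)  # v1
print(log)   # [('shadow', 'ok', 'v2'), ('percentage', 'rollback', 'v1')]
```

The champion remains the default outcome: a partial or interrupted rollout never leaves the challenger live.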

Deliverables

DAO-treasury-owned + operator-staked Tier-1 cluster coordinated over the internet via low-communication training (DiLoCo / Streaming DiLoCo / DisTrO / DeMo); automated release pipeline (shadow/canary/rollback) with on-chain records; Level-0 immutable-foundations verifier wired into every training step.

KPIs

≥ 1 fully automated champion→challenger promotion executed end-to-end without human code changes; margin gate + significance gate + zero-safety-regression gate all enforced; mean rollback time within target; Safety Council sign-off recorded for each release.

Unlocks the claim

"Hylon improves itself, validated recipes, guarded releases, on infrastructure the DAO owns."

Gate 3+, General-Frontier Program (Year 3+, CONDITIONAL)

Gate 3 is the general-frontier push, and it is explicitly conditional: it commences only when treasury-funded aggregate compute (GPU count = capital) and a portfolio of self-improvement-validated recipes both cross a defined threshold. Hylon states no date and makes no frontier-parity claim until this gate is crossed. The honest framing is that out-computing the centralized labs is, past a point, a treasury question: per-node memory and aggregate FLOPs are the binding constraints, and both are solvable only with capital. Gate 3 is therefore gated on the treasury and the recipe library, never on a calendar.

NEVER OVERCLAIMED

Gate 3 is described as a conditional program, not a promise. No "AGI by Year 3," no "biological or quantum capabilities," no unlimited self-improvement, only "recursive self-improvement within immutable guardrails," and only claimed once the compute-and-recipe threshold is demonstrably crossed.

11.3 Risks & Mitigations

A permanent reference document must be honest about how it can fail. The table below enumerates the principal risks across technical, market/demand, regulatory, adversarial/security, execution, and tokenomics dimensions, each with its concrete mitigation. The risk matrix that follows plots likelihood against impact so the reader can see where the residual exposure concentrates.

Principal risks across six dimensions, each with its concrete mitigation.

| Risk | Dimension | Likelihood | Impact | Mitigation |
| --- | --- | --- | --- | --- |
| Demand for verified compute/data fails to materialize | Market | High | High | Sits on existing OrbNet subscription revenue; funds buyback/burn from real revenue not inflation; reward decline tied to revenue milestones not headcount; verify every contribution (no idle-uptime rewards). |
| Regulatory action (MiCA / SEC / GDPR) | Regulatory | Medium | High | MiCA white paper + authorized CASP venues only; mirror SEC DoubleZero no-action posture + seek own relief; GDPR legitimate-interests + DPIA; no 'mining', non-custodial, OFSI/OFAC screening, geo-block sanctioned jurisdictions. |
| Privacy attack (membership inference / model inversion) | Adversarial | Medium | High | Federated learning + differential privacy + secure aggregation; raw data never leaves device; maintained attack-defense battery; living DPIA; EDPB 28/2024 acknowledged (weights not presumed anonymous). |
| Training doesn't scale past strawman / bandwidth or memory wall | Technical | Low | High | Low-communication training (DiLoCo/DisTrO/DeMo) cuts network 400 to 10,000x; INTELLECT-1 (10B) trained across 3 continents on 127 to 935 Mbit/s; real constraints (per-node memory, aggregate FLOPs) addressed by GPU training tier + treasury. |
| Verification gamed (fake work, sybil, poisoned data) | Adversarial | Medium | Medium | TOPLOC locality-sensitive activation hashing (~100x cheaper than re-execution, INTELLECT-2 proven); loss-based statistical scoring + redundancy for training; TEEs for high-assurance; sybil-resistant metering. |
| App-store rejection | Regulatory | Medium | Medium | Follow Acurast precedent (approved both stores); never say 'mining'; earn points off-device, convert on web/separate wallet app; iOS foreground-while-charging, Android within FGS caps; no feature token-gating (Apple 3.1.1). |
| Token death spiral (headcount-FOMO emissions) | Tokenomics | Medium | High | Value-before-emissions (points-before-token like Grass); fiat-priced jobs + burn-and-mint (Render); USD-stabilized payouts + revenue-throttled emissions (io.net/Nosana); declining emissions on revenue milestones. |
| Execution / recipe risk on expensive training runs | Execution | Medium | Medium | Self-improvement engine: cheap small-scale ablations find the recipe; expensive $5 to 20M runs execute only VALIDATED recipes; champion-challenger prevents regressions from reaching users. |
| Self-improvement or release goes off-guardrail | Technical/Safety | Low | High | Level-0 immutable foundations (truth/safety/honesty/integrity) verified every step, cannot be self-modified; L0<L1<L2<L3 hierarchy with gaming detection; Safety Council holds final release gate + kill switch. |
| Hardware financing mischaracterized as a security | Tokenomics/Legal | Low | Medium | Hardware financing = corporate debt (company borrows against GPUs), never a user-facing fixed-yield product; buybacks of user hardware disclosed honestly as re-centralizing. |

Likelihood × impact. Concentrations in the high-impact row at medium-to-high likelihood (demand, tokenomics, regulatory) drive the value-before-emissions and existing-revenue strategy.

| | Low likelihood | Medium likelihood | High likelihood |
| --- | --- | --- | --- |
| High impact | Training scale wall; Off-guardrail self-improvement | Regulatory action; Privacy attack; Token death spiral | Demand fails to materialize |
| Medium impact | Hardware-financing mischaracterization | Verification gamed; App-store rejection; Execution/recipe risk | (none) |
| Low impact | (none) | (none) | (none) |

Notes on the highest-residual risks

Demand for verified compute/data is the risk that has killed the most DePINs, and Hylon's honest read of the sector is sobering: roughly $72M of total on-chain DePIN revenue in 2025 against a ~$10B aggregate market cap, and no decentralized network yet earning material revenue from consumer-phone FLOPs. Hylon's structural mitigation is that it does not depend on a speculative future compute marketplace to bootstrap. It sits on OrbNet's existing subscription revenue and an existing user base, funds buyback-and-burn from genuine service revenue rather than inflation, and ties reward decline to revenue milestones (not headcount), directly avoiding the Helium (900k hotspots vs ~$6.5k/mo real demand), io.net (327k registered vs 6.7k verified GPUs) and Pi (−91% via headcount-FOMO emissions) failure modes.
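
To put the sector figures quoted above in proportion, the short calculation below derives three ratios from them. It uses only the approximate numbers in the paragraph and treats them as given.

```python
# Figures quoted in the paragraph above (USD unless noted; all approximate).
depin_revenue_2025 = 72e6
depin_market_cap = 10e9
helium_hotspots, helium_monthly_demand = 900_000, 6_500
ionet_registered, ionet_verified = 327_000, 6_700

revenue_to_cap = depin_revenue_2025 / depin_market_cap
demand_per_hotspot = helium_monthly_demand / helium_hotspots
verified_share = ionet_verified / ionet_registered

print(f"sector revenue / market cap: {revenue_to_cap:.2%}")
print(f"Helium demand per hotspot per month: ${demand_per_hotspot:.4f}")
print(f"io.net verified share of registered GPUs: {verified_share:.2%}")
```

On these inputs, sector revenue is about 0.72% of market cap, Helium's real demand works out to under one cent per hotspot per month, and roughly 2% of io.net's registered GPUs were verified.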

Regulatory risk is real and jurisdiction-specific. Hylon publishes a MiCA-compliant crypto-asset white paper and lists only on authorized CASP venues; mirrors the SEC DoubleZero no-action posture (programmatic, usage-based rewards to infrastructure contributors, separated from any fundraising) while seeking its own fact-specific relief; grounds privacy in a documented GDPR legitimate-interests basis + DPIA + FL/DP/secure-aggregation (acknowledging EDPB Opinion 28/2024 that weights are not presumed anonymous); and never uses the word "mining," never token-gates app features, and keeps wallets non-custodial with OFSI/OFAC screening before payout and geo-blocking of sanctioned jurisdictions, following the Acurast app-store precedent.

Privacy-attack risk (membership inference and model inversion against shared learned signal) is treated as an engineering problem with a maintained defense battery, not a solved one; the DPIA is a living document and the legitimate-interests basis is defended, never replaced by the false claim that GDPR does not apply or that a household exemption covers the network.

11.4 Competitive Landscape

Hylon competes at the intersection of several categories, and it is more honest, and more useful, to say precisely where each incumbent wins than to dismiss them. The comparison below maps the field. The recurring pattern: enterprise-GPU and rendering networks win on raw supply and utilization; data networks win on distribution to labs; phone and gaming-PC networks win on device count; decentralized-training collectives win on the training method itself. None of them combines data locality, a live revenue-generating user base in censored markets, and a sovereign self-improving stack, which is the wedge Hylon occupies.

Where each incumbent wins, and how Hylon differs. Honest reads from sector research.

| Network | Category | Where it wins | Honest limitation | How Hylon differs |
| --- | --- | --- | --- | --- |
| Bittensor | Incentivized subnets | Mature ecosystem, large market cap | Marketplace of competing miners, not one sovereign model | One continually-improving model fed by private, uncollectable user data |
| io.net | Enterprise GPU aggregation | GPU supply + orchestration | 327k registered vs 6.7k verified GPUs historically | Meters + verifies every contribution; no registration rewards |
| Akash | Decentralized cloud | Open compute marketplace, real utilization | General cloud, not an AI model network | Sovereign self-improving AI atop verified two-tier compute |
| Render | GPU rendering | Proven fiat-priced burn-and-mint economy | Rendering, not AI | Borrows the token design; applies it to AI work |
| Aethir | Enterprise GPU | Enterprise-grade GPU-as-a-service | Supply-side, no data moat or user base | Data locality + live user base + model ownership |
| Grass (~$330M) | Data / bandwidth | Sells web data + bandwidth to AI labs | Scraped public data, not private signal | Private on-device multilingual/censored-market signal that can't be scraped |
| Acurast (~255k phones) | Smartphone compute | Phone-compute leader; app-store precedent | Demand unproven | Starts with OrbNet's existing paying demand |
| Salad (~$10M ARR) | Consumer gaming PCs | ~60k daily GPUs, real ARR, no token | Consumer inference/rendering only | Adds sovereign model + data flywheel + aligned token |
| Nous / Prime Intellect / Pluralis | Decentralized training | State-of-the-art training methods (Psyche, DisTrO, INTELLECT, TOPLOC) | Still GPU-bound (8×H100/B200 nodes; 512×H200 central cluster) | Builds on these methods; couples them to data flywheel + live revenue |
| Apple / Google on-device AI | On-device AI | Hardware integration + scale | Cannot serve uncensored, sovereign model in restricted markets | Uncensored, user-sovereign, pools private cross-user signal into a competing global model |

Figure: Data moat (private/uncollectable → public/commodity) vs proven paying demand. Hylon occupies the high-moat, existing-demand corner via OrbNet. Axes: data moat (commodity supply / scraped data to private, uncollectable data moat) and demand (unproven / speculative demand to existing paying demand). Plotted networks: 1 Hylon, 2 Salad, 3 Render, 4 Grass, 5 io.net, 6 Akash, 7 Aethir, 8 Bittensor, 9 Acurast, 10 Apple/Google on-device.

Where the incumbents win, and where Hylon differs

  • Bittensor wins on a mature incentivized-subnet ecosystem and a large market cap; it is a marketplace of competing miners, not a single sovereign model learning from a private user base. Hylon differs by owning one continually-improving model fed by uncollectable data.
  • io.net, Akash, Aethir win on enterprise GPU aggregation and utilization, which is real, useful, and largely orthogonal to Hylon; io.net's own history (327k registered vs 6.7k verified GPUs) is precisely why Hylon meters and verifies every contribution rather than rewarding registration.
  • Render wins on a proven fiat-priced, burn-and-mint economy for GPU rendering, a token design Hylon explicitly borrows from, but it is not an AI network.
  • Grass (~$330M cap) wins on selling bandwidth and web data to AI labs; it is a data-distribution business, not a compute or model network. Hylon's data is different in kind: private, on-device, multilingual, censored-market signal that cannot be scraped.
  • Acurast (~255k phones) is the smartphone-compute leader and the app-store compliance precedent Hylon follows, but its demand is unproven; Hylon starts with OrbNet's existing paying demand.
  • Salad wins as the consumer gaming-PC leader (~60k daily GPUs, ~$10M ARR) with no token, proof that verified consumer compute can earn real revenue, and a model for demand-first discipline.
  • Nous, Prime Intellect, Pluralis win on decentralized-training research, the methods (Psyche, DisTrO/DeMo, INTELLECT-1/2/3, TOPLOC) Hylon builds on. Notably these runs remain GPU-bound (Nous Psyche 40B on ~8×H100 nodes; Templar Covenant-72B on ~8×B200/peer; INTELLECT-3 used a central 512×H200 cluster), which is exactly why Hylon's training tier is GPUs and Gate 3 is a treasury question.
  • Apple / Google on-device AI win on hardware integration and scale, but they cannot offer an uncensored, user-sovereign model in restricted markets, nor pool private cross-user signal into a competing global model, the structural opening Hylon exploits.

11.5 The Five Defensive Assets and the Flywheel

Hylon's defensibility rests on five assets that a better-funded centralized lab can attack individually but cannot assemble together, and a flywheel that couples them.

  1. A live, revenue-generating product (OrbNet/OrbVPN). Real users, real subscriptions, real infrastructure in restricted markets, the credibility differentiator versus vaporware DePINs, and the funding source for value-before-emissions.
  2. Data locality. One million Personal AIs generating RL signal from real tasks in languages and markets no lab can legally or physically scrape, the unbuyable moat. A lab can outspend Hylon; it cannot assemble this dataset.
  3. The sovereign, self-improving stack. One self-contained AI (MoE core, router, retrieval/memory, tools, specialist heads, agent orchestration) with no hard external dependency, improving itself through the guarded champion-challenger loop.
  4. Verified two-tier compute. A Tier-1 GPU training fabric coordinated over the internet by low-communication training, plus a Tier-2 fleet of a million devices for inference, personalization, RL scoring and labeling, every contribution metered and cryptographically or economically verified.
  5. Aligned token economics + progressive DAO governance. One ORB token for verified work, service payment, and governance; buyback-and-burn from real revenue; and a Foundation/Labs/DAO structure that progressively transfers control on published milestones while keeping model releases and safety with the technical Safety Council.

The data flywheel: each turn paid for by genuine OrbNet revenue, compounding a moat capital alone cannot catch.

  1. 1M Personal AIs. Compact on-device models (1 to 4B / BitNet) learn each user's context locally; raw data never leaves the device.
  2. RL from real tasks. Reward signal generated from real user interactions in wedge languages and censored markets.
  3. Uncollectable data. Multilingual, private, censored-market signal no lab can legally or physically scrape.
  4. Private aggregation. Federated learning + differential privacy + secure aggregation share only privacy-preserving signal.
  5. Stronger global model. Continual training + guarded champion-challenger releases on the DAO-owned GPU cluster.
  6. Better Personal AIs. Improved global model redeployed to devices, closing the loop and raising the moat (loops back to step 1).
THE COUPLING

The flywheel is the point: 1M Personal AIs → RL from real user tasks → uncollectable multilingual/censored-market data → private aggregation → a stronger global model → better Personal AIs, and each turn is paid for by genuine OrbNet revenue rather than inflation. The moat is not any single asset; it is that the five reinforce each other faster than capital alone can catch up.

G

Glossary

Aggregate FLOPs constraint
Total training throughput equals summed FLOPs across nodes, i.e. GPU count, i.e. capital. Reaching the general frontier is therefore a treasury/revenue question, not a networking one.
Apple Guideline 3.1.1
App Store rule governing in-app purchase and crypto; among other things it constrains token-gating of app features. Hylon never locks core features behind token ownership.
BitNet b1.58
A ternary-weight ({-1,0,+1}) model family trained in full precision via quantization-aware training; its gains are inference-only (e.g. ~45 tok/s on an M2 CPU in ~0.4 GB), not training-cost reductions.
Burn-and-mint
A value-accrual mechanism (Render precedent) where paid jobs burn a fiat-referenced amount of ORB while releasing the corresponding reward to the verified contributor, coupling token supply to work delivered.
Buyback-and-burn
Using a defined share of genuine service revenue to buy ORB on-market and burn it (Helium Mobile precedent), creating deflationary pressure that scales with real usefulness.
BVI issuer
A British Virgin Islands entity affiliated with the Foundation that handles token issuance and the crypto-asset white paper, isolating token-law surface from the operating company and the Foundation.
Capability profile
A signed report of a node's CPU/GPU/NPU/RAM/bandwidth/power-state/device-class, produced by the on-device capability detector, determining which job types the node is eligible to receive.
CASP
Crypto-Asset Service Provider, an entity authorized under MiCA to provide crypto-asset services (trading, custody, exchange). Hylon lists ORB only on MiCA-authorized CASP venues.
Champion / challenger
The champion is the model currently live in production; a challenger is a newly trained candidate. Under the automated release pipeline, a challenger is promoted to a new named version only if it beats the live champion by a defined margin on the fixed gate suite, with statistical significance and zero safety regressions, and then clears the Safety Council's final human sign-off.
Closed-model distillation (gated)
A capability, built into the stack but disabled by policy, to train from third-party closed models, kept switched off and unused until a signed license exists, because providers' terms of service prohibit training competitors.
Control plane
The coordination layer carrying small, latency-sensitive messages: registration, capability reports, heartbeats, job assignment, verification verdicts, and reward-accrual checkpoints. Kept thin and eventually consistent.
Critical-learning pipeline
Hylon's five-stage knowledge-evaluation process (Understand, Question, Test, Verify, Integrate), which ensures that no candidate knowledge is accepted blindly before it influences the global model.
Data flywheel
The self-reinforcing loop in which a million Personal AIs generate uncollectable RL signal from real user tasks, which is privately aggregated into a stronger global model, which yields better Personal AIs.
Data plane
The layer moving large payloads: gradients, model weights, activations, inference streams, on-device LoRA deltas, dataset shards, and (in the inherited VPN business) user traffic. Optimized for throughput.
DePIN
Decentralized Physical Infrastructure Network, a category of crypto networks coordinating real-world hardware. In 2025 the sector produced ~$72M revenue against ~$10B market cap.
Differential privacy (DP)
A formal privacy guarantee that bounds (via ε, δ) how much any single user's data can influence a computed result. It is achieved by clipping each contribution and adding calibrated noise, and it directly defends against membership inference.
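The clip-then-noise step named in the differential privacy entry can be illustrated with a minimal sketch. It assumes Gaussian noise whose standard deviation is a multiple of the clipping norm (the DP-SGD convention); the function names, the `noise_multiplier` parameter, and the example values are hypothetical, and the sketch performs no (ε, δ) accounting.

```python
import math
import random

def clip_update(update, clip_norm):
    """Scale the update so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]

def privatize(update, clip_norm, noise_multiplier, rng):
    """Clip the update, then add Gaussian noise with std = noise_multiplier * clip_norm."""
    clipped = clip_update(update, clip_norm)
    sigma = noise_multiplier * clip_norm
    return [x + rng.gauss(0.0, sigma) for x in clipped]

rng = random.Random(0)                       # fixed seed only so the example is repeatable
raw = [3.0, 4.0]                             # illustrative update with L2 norm 5.0
clipped = clip_update(raw, clip_norm=1.0)    # rescaled to unit norm
noisy = privatize(raw, clip_norm=1.0, noise_multiplier=0.5, rng=rng)
```

Clipping bounds how far one user can move the result; the noise then hides whatever influence remains. Turning a chosen noise level into a concrete (ε, δ) guarantee requires a privacy accountant, which is outside this sketch.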
DiLoCo
Distributed Low-Communication training: each node runs many local (inner) optimizer steps, typically ~500, before a single global (outer) synchronization, cutting communication frequency roughly 500x and making internet-scale training viable.
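The inner/outer structure described in the DiLoCo entry can be sketched on a toy one-parameter problem. This is an illustration under simplifying assumptions, not the published recipe: plain SGD stands in for both the inner and the outer optimizer (the DiLoCo paper uses AdamW inside and Nesterov momentum outside), each worker's data shard is reduced to a single target value, and all names are hypothetical.

```python
def inner_steps(theta, target, steps, lr):
    """Run `steps` local SGD steps on the toy loss 0.5 * (theta - target) ** 2."""
    for _ in range(steps):
        grad = theta - target
        theta -= lr * grad
    return theta

def diloco_round(theta_global, targets, inner_h, inner_lr, outer_lr):
    """One outer round: every worker trains locally, then a single synchronization."""
    pseudo_grads = []
    for target in targets:                                  # one entry per worker
        theta_local = inner_steps(theta_global, target, inner_h, inner_lr)
        pseudo_grads.append(theta_global - theta_local)     # outer "gradient"
    avg = sum(pseudo_grads) / len(pseudo_grads)             # the only cross-node exchange
    return theta_global - outer_lr * avg

theta = 0.0
targets = [1.0, 2.0, 3.0]        # each worker's local optimum (toy stand-in for a data shard)
outer_rounds = 5
for _ in range(outer_rounds):
    theta = diloco_round(theta, targets, inner_h=500, inner_lr=0.1, outer_lr=1.0)
```

Each worker takes 2,500 local steps in total but communicates only five times, which is where the roughly 500x reduction in synchronization frequency comes from.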
DisTrO / DeMo
Decoupled Momentum optimizers that compress inter-node exchanges by transmitting only momentum residuals, pushing total communication reduction into the 1,000-10,000x range versus naive all-reduce.
DoubleZero no-action posture
SEC Division of Corporation Finance guidance (29 Sep 2025) indicating programmatic, usage-based rewards to infrastructure contributors, separated from fundraising, are not treated as securities offers. Non-binding and fact-specific; Hylon seeks its own relief.
DPIA
Data Protection Impact Assessment, a documented, living analysis of purpose, data flows, risks, and mitigations that supports Hylon's legitimate-interests lawful basis under the GDPR.
Earning-rate engine
The production OrbNet accounting component that meters contribution by work type; it already defines COMPUTE, INFERENCE, GRADIENT, and TRAINING types, which Hylon extends with verification gates.
EDPB Opinion 28/2024
European Data Protection Board opinion establishing that model parameters are not presumed anonymous: trained weights can be personal data, which is why Hylon engineers and assesses anonymity rather than assuming it.
Eventual consistency (AP)
The control plane favors availability and partition-tolerance over instantaneous global consistency, allowing partitioned regions to keep serving and earning; on-chain settlement provides the authoritative eventual anchor.
Federated learning (FL)
A training paradigm in which devices compute model updates locally and share only those updates, never raw data, coordinated in synchronous rounds, so the central system learns aggregate skill without collecting personal content.
FGS cap (Android 15)
Foreground-service limit on Android 15 restricting certain background work to roughly 6 hours per 24-hour window. Hylon's on-device work respects this envelope.
Five defensive assets
Live revenue product (OrbNet), data locality, sovereign self-improving stack, verified two-tier compute, and aligned token + progressive DAO governance, reinforcing each other faster than capital alone can catch up.
Gate (roadmap gate)
A bundle of shipped deliverables and measured KPIs that must all be satisfied before Hylon makes the corresponding public claim. Hylon organizes its roadmap by gates, not dates.
Gate 0
~6-month gate: the sovereign stack shipped as one product (Personal AI + two-tier compute + ORB rails + DAO) on a top open model, self-hosted, private, uncensored, atop the live OrbNet business.
Gate 1
12 to 18-month gate: wedge dominance. A continually-trained model beats all open models on wedge domains and wins a blind A/B versus frontier models on personal-context tasks.
Gate 2
Year-2 gate: a self-improving, self-releasing model on a DAO-owned GPU cluster via the champion-challenger loop.
Gate 3+
Year-3+ conditional gate: the general-frontier program, commenced only when treasury-funded aggregate compute and the validated-recipe library both cross a defined threshold. No date, no frontier-parity claim until crossed.
Gate suite
The fixed, on-chain-versioned battery a challenger must pass in full: general benchmarks, wedge-domain evals, blind personal-context A/B, and safety red-team.
Gate-based claim
A forward-looking statement tied to a specific, measurable milestone (Gate 0 to 3+) and asserted only once that milestone is crossed, never as a dated promise.
Gradient leakage
An attack that recovers training samples from an individual gradient/weight update; defeated by secure aggregation (server sees only sums) plus pre-aggregation DP noise.
Hardware abstraction
A single runtime targeting CPU/GPU/NPU (and neuromorphic-ready) back-ends so model artifacts run across the device fleet without a rewrite.
Hardware financing (corporate debt)
Non-dilutive corporate debt: Hylon Labs/DAO borrows against GPUs and hardware to scale Tier-1 training compute. Never a user-facing fixed-interest or guaranteed-yield product, which would carry securities risk.
Hylon
The decentralized AI network built on top of the live OrbNet/OrbVPN product. The network is 'Hylon'; the token is ORB.
Hylon Foundation
A Cayman Islands ownerless, purpose-bound foundation company that stewards the Hylon protocol, ORB token, and treasury for the network's benefit; it has no shareholders and cannot be redirected to enrich any individual.
Hylon Labs
The operating company that employs the teams and owns the models, IP, codebase, trademarks, and commercial (API/enterprise/product) revenue, carrying commercial and employment liability.
INTELLECT-1
A 10B-parameter model trained by Prime Intellect across three continents over commodity 127-935 Mbit/s links at 83-96% compute utilization, the headline proof that geo-distributed training over ordinary internet works.
Learned signal
The only thing that flows up the stack: clipped, differentially-private, securely-aggregated gradient/behavioural updates derived from user tasks, never the raw underlying data.
Legitimate interests
The GDPR Art. 6(1)(f) lawful basis under which Hylon processes privatized signal as a data controller, supported by a documented legitimate-interests assessment, a DPIA, and FL/DP/secure-aggregation safeguards. Hylon relies on this basis rather than on the false claims that the GDPR does not apply or that a household exemption covers the network.
Level-0 immutable foundations
The hardcoded objectives (truth, safety, honesty, integrity), verified at every step, that cannot be self-modified by the system or edited by governance. They form the L0 layer of the L0<L1<L2<L3 objective hierarchy and take precedence over every other layer.
Low-communication training
The DiLoCo / Streaming DiLoCo / DisTrO / DeMo family of methods that cut inter-node communication by roughly 400× to 10,000×, making distributed GPU training over ordinary internet links practical.
Margin M
The pre-defined amount by which a challenger must beat the champion on aggregate capability gates, with a confidence interval that clears the margin, for promotion. A statistically insignificant edge is churn, not a promotion.
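The promotion rule in the Margin M entry can be made concrete with a short sketch. It assumes paired per-task scores for champion and challenger and a normal-approximation confidence interval (z = 1.96); the function name, the scores, and the margins are hypothetical, and the statistical test used in practice may differ.

```python
import math
import statistics

def should_promote(challenger_scores, champion_scores, margin, safety_regressions, z=1.96):
    """Promote only if the lower confidence bound of the gain clears the margin
    and there are zero safety regressions."""
    diffs = [c - b for c, b in zip(challenger_scores, champion_scores)]
    mean_gain = statistics.fmean(diffs)
    stderr = statistics.stdev(diffs) / math.sqrt(len(diffs))
    lower_bound = mean_gain - z * stderr
    return safety_regressions == 0 and lower_bound > margin

champion   = [0.70, 0.72, 0.68, 0.71, 0.69, 0.70, 0.73, 0.70]
challenger = [0.76, 0.77, 0.73, 0.77, 0.74, 0.76, 0.78, 0.75]   # mean gain of about 0.054

clear_win = should_promote(challenger, champion, margin=0.02, safety_regressions=0)
thin_edge = should_promote(challenger, champion, margin=0.052, safety_regressions=0)
unsafe    = should_promote(challenger, champion, margin=0.02, safety_regressions=1)
```

With a margin of 0.052 the mean gain still exceeds the margin, but the lower confidence bound does not, so the challenger is held back: an edge that is not statistically secure counts as churn rather than a promotion.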
Membership inference
An attack that tries to determine whether a specific individual's data was used in training; the primary formal defense is differential privacy.
Membership inference / model inversion
Privacy attacks that attempt to determine whether a record was in the training set, or to reconstruct training data from a model. Hylon maintains a defense battery and treats these as ongoing engineering risks, not solved problems.
Metric gaming detection
The anti-Goodhart invariant: if L3 metrics improve while L1-L2 do not, the divergence is flagged as evidence of gaming or contamination and blocks promotion, regardless of headline scores.
MiCA
EU Markets in Crypto-Assets Regulation. Binding framework for crypto-asset offers to EU persons; transitional grandfathering ended 1 July 2026. Requires a compliant white paper and listing only on authorized CASP venues.
Mixture-of-Experts (MoE)
A model architecture partitioning parameters into many expert sub-networks, of which only a few activate per token, giving frontier-scale knowledge at bounded per-token compute and cheap serving.
Model inversion
An attack that tries to reconstruct representative raw inputs from a model or its updates; countered by two-layer separation, DP noise, and aggregate-only exposure.
Naive all-reduce
The strawman training scheme, synchronizing the full gradient across all nodes every step, that produces the misleading '5,000 years over home internet' figure. No production distributed-training system operates this way.
Neuromorphic (Tier D)
A hardware roadmap tier for ultra-low-power always-on inference. Loihi 2 is INRC research-access-only; BrainChip Akida is purchasable but ~0.8 TOPS and inference-only; no neuromorphic hardware trains LLMs.
Objective hierarchy (L0-L3)
L0 immutable foundations (Truth, Safety, Honesty, Integrity) < L1 fixed primary objectives < L2 optimizable capabilities < L3 metric indicators. Lower-numbered layers dominate absolutely; only L2 is optimized.
OFSI / OFAC
UK (Office of Financial Sanctions Implementation) and US (Office of Foreign Assets Control) sanctions authorities. Hylon screens destination wallets against both lists before any payout.
ORB
The network's unified ERC-20 token, deployed on Base. Earned for verified compute and data work, spent for AI and network services, and used to govern the protocol.
OrbNet / OrbVPN
The live, revenue-generating censorship-circumvention VPN, with real users in restricted markets, that Hylon's AI layer is built on top of.
Per-node memory constraint
In DiLoCo-style training every node must hold the full model plus optimizer state (Adam moments roughly triple parameter memory), which requires multi-GPU servers. This is the physical reason training runs on GPUs, never phones.
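The arithmetic behind the per-node memory constraint can be worked through in a few lines. This is a rough sketch assuming 32-bit values for the weights and both Adam moments, and it ignores gradients and activations, which add further memory; the function name and the 10B-parameter example size are illustrative assumptions.

```python
def training_state_gb(n_params, bytes_per_value=4):
    """Memory for weights plus Adam's two moment buffers, in GB (1 GB = 1e9 bytes)."""
    weights = n_params * bytes_per_value
    adam_moments = 2 * weights            # first and second moment, same size as the weights
    return (weights + adam_moments) / 1e9

state_gb = training_state_gb(10e9)        # illustrative 10B-parameter model
h100_gb = 80                              # memory of a single 80 GB H100
fits_on_one_gpu = state_gb <= h100_gb
```

Under these assumptions the persistent training state alone comes to about 120 GB, more than one 80 GB accelerator holds, so each training node needs several GPUs and a phone is out of the question.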
Personal AI
A compact on-device model (1 to 4B-class, or a ternary BitNet-style model) that learns a user's context locally. Raw data never leaves the device; only privacy-preserving learned signal is shared.
Points before token
Bootstrapping contribution accounting as off-chain points to calibrate the value of verified work and quantify demand before switching on token emissions (Grass precedent).
Progressive decentralization
The milestone-gated, one-directional transfer of economic control (reward rates, then treasury, then protocol upgrades, then grants) from the Foundation to the DAO over roughly two to four years.
Regional coordinator
A per-region aggregation and scheduling node that manages its local fleet, serves low-latency inference, tolerates network partitions by making local progress, and reconciles global state via gossip.
Safety Council
The technical body holding the final human release gate and the kill switch. Model releases and safety constraints are governed here, transparent to the DAO but not decided by token vote.
Secure aggregation
A cryptographic protocol (pairwise-masked or threshold, dropout-tolerant) letting the coordinator compute only the sum of a cohort's updates, so no individual contribution is ever visible, even to the server.
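The pairwise-masking idea in the secure aggregation entry can be shown with a toy sketch. It is not a secure implementation: a seeded pseudo-random generator stands in for the pairwise key agreement of a real protocol, there is no dropout handling, and updates are small integers worked modulo 2^32; every name here is hypothetical.

```python
import random

MOD = 2 ** 32

def masked_updates(updates, seed=0):
    """For each pair (i, j), add a shared random mask to i and subtract it from j."""
    n, dim = len(updates), len(updates[0])
    masked = [list(u) for u in updates]
    rng = random.Random(seed)             # stand-in for per-pair shared secrets
    for i in range(n):
        for j in range(i + 1, n):
            mask = [rng.randrange(MOD) for _ in range(dim)]
            for k in range(dim):
                masked[i][k] = (masked[i][k] + mask[k]) % MOD
                masked[j][k] = (masked[j][k] - mask[k]) % MOD
    return masked

def aggregate(masked):
    """The coordinator sums the masked vectors; the pairwise masks cancel in the sum."""
    dim = len(masked[0])
    return [sum(m[k] for m in masked) % MOD for k in range(dim)]

updates = [[1, 2], [3, 4], [5, 6]]        # three devices, two-dimensional updates
masked = masked_updates(updates)
total = aggregate(masked)                 # equals the element-wise sum of the raw updates
```

Each masked vector looks random on its own, yet the sum is exact because every mask is added once and subtracted once. Production protocols add secret sharing so that the sum survives devices dropping out mid-round.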
Self-Improvement Engine
Hylon's automated research loop, which surveys literature, proposes training recipes, runs cheap small-scale ablations, extrapolates their effect, trains the next candidate, and improves its own methods. It acts as both the R&D team and the recursive self-improvement mechanism, bounded by compute and immutable guardrails.
Signed-voucher claim
The live ORB reward-redemption mechanism: accrued balance is claimed via a cryptographically signed voucher redeemed through a gasless relayer on Base, so the member pays no gas.
Slashing
Forfeiture of an operator's staked ORB bond for faulty, fraudulent, or unavailable work, making cheating more costly than honest work.
Sovereign Stack
Hylon's self-contained AI system (MoE core, router, retrieval/memory, tool use, agentic orchestration, and specialist heads), with no hard dependency on any external provider API. Owned end-to-end so no upstream can throttle, re-price, or kill it.
Staged rollout
Progressive exposure of a promoted challenger: shadow (zero user-visible), canary %, ramp, then full, with live telemetry compared to pre-registered thresholds and automatic rollback on any breach.
TEE
Trusted Execution Environment, an attested hardware enclave in which high-assurance local training or inference can run for a stronger confidentiality guarantee.
The wedge
The market segment centralized labs cannot structurally contest: a user's private on-device context, censored/blocked markets, and underserved languages (Farsi, Arabic, Russian, Turkish).
Timelock
A mandatory, publicly visible delay between a proposal passing and its execution, giving the network time to react to a malicious or buggy proposal before it takes effect.
TOPLOC
A locality-sensitive hashing scheme over model activations that verifies an inference was computed honestly at roughly 100x lower cost than full re-execution; proven in production in Prime Intellect's INTELLECT-2 distributed RL run.
Two-layer privacy model
Hylon's architecture placing two irreversible transformations between raw data and the global model: raw data is dissolved into a model update on-device (Layer 1), then privatized and hidden inside a secure aggregate (Layer 2).
Two-tier compute
Tier 1 = datacenter-class GPU nodes (8×H100/B200) doing training, coordinated over the internet by low-communication training; Tier 2 = ~1M phones/PCs/Macs doing inference, on-device personalization, RL scoring and labeling.
USD-stabilized payout
Pricing jobs and paying operators against a fiat reference (io.net IDE / Nosana pattern) so operator income is predictable and the protocol, not the supply side, absorbs token-price volatility.
Validated recipe
A fully specified training configuration (data mixture, curriculum, architecture, optimizer/RL settings, hyperparameters) whose components have each survived cheap ablation and whose extrapolated gain clears the promotion margin with statistical confidence. Only validated recipes justify a $5-20M frontier run.
Value before emissions
Hylon's core monetary rule: reward only real, verified, demanded work and fund buyback/burn and token scarcity from genuine service revenue rather than inflation, so circulating scarcity tracks demonstrated demand instead of device headcount. This is the pattern that separated DePIN survivors from death spirals.
Wedge
The market segment centralized labs structurally cannot contest, and where Hylon wins first: full private personal context, censored markets, and underserved languages (Farsi, Arabic, Russian, Turkish).
R

References

  1. a16z, 'Progressive Decentralization: A Playbook for Building Crypto Applications' (governance-handover framework)
  2. Abadi et al., Deep Learning with Differential Privacy (DP-SGD, moments accountant)
  3. Acurast, decentralized smartphone compute (app-store-approved precedent)
  4. Acurast, decentralized smartphone compute (store-approved precedent)
  5. Acurast, smartphone compute network (app-store-approved precedent)
  6. Acurast, smartphone-compute network and app-store compliance precedent
  7. Advances and Open Problems in Federated Learning (Kairouz et al.)
  8. Akash Network, decentralized cloud marketplace
  9. Android 15, Foreground service changes
  10. Apple App Store Review Guidelines (3.1.1)
  11. Base, Ethereum L2 settlement layer
  12. BitNet b1.58: 1-bit LLMs (inference-efficient ternary weights)
  13. BitNet b1.58: 1-bit LLMs (Microsoft Research)
  14. BitNet b1.58: 1-bit LLMs with full-precision quantization-aware training
  15. BitNet b1.58: The Era of 1-bit LLMs (Microsoft Research)
  16. Bittensor, incentivized subnet network
  17. Bonawitz et al., Practical Secure Aggregation for Privacy-Preserving Machine Learning (Google)
  18. BrainChip Akida neuromorphic processor specifications
  19. Cayman Islands Foundation Companies Act, ownerless, purpose-bound foundation company structure
  20. Champion-challenger evaluation and staged rollout (canary/shadow deployment), standard MLOps practice
  21. Compound / OpenZeppelin Governor + Timelock, reference on-chain governance, snapshot voting, and timelocked execution patterns
  22. DiLoCo: Distributed Low-Communication Training of Language Models
  23. DiLoCo: Distributed Low-Communication Training of Language Models (Douillard et al.)
  24. DiLoCo: Distributed Low-Communication Training of Language Models (Douillard et al., 2023)
  25. Douillard et al., DiLoCo: Distributed Low-Communication Training of Language Models (Google DeepMind, 2023)
  26. Douillard et al., Streaming DiLoCo with overlapping communication (2025)
  27. EDPB Opinion 28/2024 on AI models and data protection (weights not presumed anonymous)
  28. EDPB Opinion 28/2024 on AI models and personal data
  29. EDPB Opinion 28/2024 on AI models and personal data (model weights not presumed anonymous)
  30. EDPB Opinion 28/2024 on AI models and personal data (weights not presumed anonymous)
  31. ESMA, MiCA transitional measures and CASP authorization
  32. EU Markets in Crypto-Assets Regulation (MiCA), crypto-asset white paper and authorized CASP requirements; transition period ended 1 Jul 2026
  33. EU MiCA Regulation (Markets in Crypto-Assets)
  34. GDPR Article 6(1)(f), Lawfulness of processing (legitimate interests)
  35. GDPR, Regulation (EU) 2016/679 (Arts. 6, 35)
  36. Goodhart's law and specification gaming in optimizing systems, motivation for the objective hierarchy and L3-vs-L1/L2 divergence check
  37. Grass, decentralized web-bandwidth and data network
  38. Grass, decentralized web-data / bandwidth network
  39. Grass, points program and data/bandwidth marketplace (~$330M cap; sells data to AI labs, not raw FLOPs)
  40. Helium, hotspot growth vs. real data demand; Helium Mobile revenue-funded burn
  41. Intel Neuromorphic Research Community (INRC) / Loihi 2
  42. INTELLECT-1: decentralized training of a 10B model across continents
  43. io.net, decentralized GPU cloud
  44. io.net, IDE and USD-stabilized operator payouts; registered vs. verified GPU disclosures
  45. Kaplan et al., Scaling Laws for Neural Language Models (2020); Hoffmann et al., Training Compute-Optimal LLMs / Chinchilla (2022), basis for small-scale extrapolation
  46. Ma et al., The Era of 1-bit LLMs: BitNet b1.58
  47. McMahan et al., Communication-Efficient Learning of Deep Networks from Decentralized Data (FedAvg)
  48. Membership Inference Attacks Against Machine Learning Models (Shokri et al.)
  49. Messari, State of DePIN 2025 (sector revenue and market-cap data)
  50. Messari, State of DePIN 2025 (sector revenue vs market cap)
  51. Messari: State of DePIN 2025 (sector revenue and market data)
  52. Microsoft Research, BitNet b1.58: 1-bit LLMs (The Era of 1-bit LLMs)
  53. Nosana, USD-referenced GPU job pricing
  54. Nous Research, DisTrO / DeMo decentralized optimizers
  55. Nous Research, DisTrO / DeMo: Decoupled Momentum Optimization
  56. Nous Research, Psyche distributed training network
  57. Nous Research, Psyche distributed training network
  58. Ong et al., TOPLOC: Locality-Sensitive Hashing for Verifiable Inference
  59. Practical Secure Aggregation for Privacy-Preserving Machine Learning (Bonawitz et al.)
  60. Prime Intellect INTELLECT-2 / decentralized RL
  61. Prime Intellect INTELLECT-2, production locality-sensitive activation-hash inference verification (TOPLOC), referenced for release-pipeline verification
  62. Prime Intellect, INTELLECT-1: 10B decentralized training across three continents
  63. Prime Intellect, INTELLECT-1: a 10B model trained across three continents over 127 to 935 Mbit/s links
  64. Prime Intellect, INTELLECT-1: decentralized training of a 10B model across three continents
  65. Prime Intellect, INTELLECT-1: decentralized training of a 10B model across three continents on 127 to 935 Mbit/s links
  66. Prime Intellect, INTELLECT-2 and TOPLOC verifiable inference
  67. Prime Intellect, INTELLECT-2 and TOPLOC verification for decentralized inference/training
  68. Prime Intellect, INTELLECT-2 and TOPLOC verified decentralized RL
  69. Prime Intellect, INTELLECT-2: verified decentralized RL training with TOPLOC
  70. Regulation (EU) 2023/1114 (MiCA), full text
  71. Render Network, burn-and-mint compute pricing model
  72. Render Network, GPU rendering, burn-and-mint economy
  73. Salad, consumer GPU compute network
  74. Salad, consumer GPU network (no token)
  75. SEC Division of Corporation Finance, DoubleZero no-action / statement on network-infrastructure rewards (Sep 2025)
  76. SEC Division of Corporation Finance, DoubleZero no-action letter (29 Sep 2025)
  77. SEC Division of Corporation Finance, DoubleZero no-action letter (programmatic usage-based infrastructure rewards separated from fundraising), 29 Sep 2025
  78. SEC DoubleZero no-action framing for usage-based infrastructure rewards (29 Sep 2025)
  79. SEC v. W.J. Howey Co., 328 U.S. 293 (1946)
  80. Shokri et al., Membership Inference Attacks Against Machine Learning Models
  81. Streaming DiLoCo with overlapping communication
  82. Streaming DiLoCo with overlapping communication (Google DeepMind, 2025)
  83. The Algorithmic Foundations of Differential Privacy (Dwork & Roth)
  84. TOPLOC: Locality-sensitive verifiable inference (INTELLECT-2)
  85. UK OFSI, Financial sanctions guidance
  86. US OFAC, Sanctions Programs and Country Information
  87. Zhu et al., Deep Leakage from Gradients (gradient leakage attack)

Disclaimer. This is a technical working draft describing intended architecture and forward-looking plans. Capabilities, timelines, and parameters may change and are not guarantees. It is not an offer or solicitation to buy any security or token, nor investment, legal, or tax advice. ORB is intended to function as a utility and governance token. Diagrams and figures are illustrative.