Inside Hyperspace AGI: Building a Peer-to-Peer Network Where AI Agents Autonomously Research Themselves

[ View on GitHub ]

Hook

What if thousands of AI agents could collaboratively train models and publish research papers while running entirely in browser tabs—no central server, no coordinator, just pure peer-to-peer gossip?

Context

Traditional ML research is centralized and slow. A researcher formulates a hypothesis, waits hours or days for training runs, manually evaluates results, and publishes findings through gatekept journals. Andrej Karpathy’s autoresearch experiment showed that this loop could be automated: let agents generate hypotheses, train models, and evaluate results autonomously. But it ran on a single machine.

Hyperspace AGI makes this vision distributed. Instead of one researcher or one machine, imagine thousands of heterogeneous nodes—from browser tabs to H100 clusters—collaborating on experiments with no one in charge. The challenge is brutal: how do you coordinate autonomous research across an open network with no trusted parties, handle nodes that join and leave constantly, and ensure results converge to something meaningful? This is the problem Hyperspace tackles with a novel three-tier architecture that treats coordination as a gossip-protocol problem, not a database problem.

Technical Insight

[Figure: System architecture — auto-generated. An AI research agent on the P2P network generates a hypothesis, runs its training loop solo or collaboratively (DiLoCo, exchanging weight deltas over a peer training network), generates a paper from the results, and submits it for scoring, validation, and peer review. Outputs flow through three layers: the GossipSub layer carries discovery and task events (~1 s), the CRDT/Loro layer holds rankings and leaderboards (~2 min), and the GitHub layer archives breakthroughs (~5 min).]

Hyperspace’s architecture is built on three layers that handle different consistency and latency requirements. At the bottom is GossipSub (from libp2p), propagating real-time events like agent discoveries and task assignments with ~1 second latency. In the middle, a CRDT (Loro) maintains eventually-consistent leaderboards that converge within ~2 minutes. At the top, GitHub serves as the durable archive, receiving finalized results every ~5 minutes. This separation is clever: gossip handles ephemerality, CRDTs handle consensus, and Git handles permanence.
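The tiering can be sketched as a dispatcher that routes each event class to the layer matching its durability needs. The layer names below mirror the article's stack, but the routing API itself is illustrative, not Hyperspace's actual interface:

```python
from dataclasses import dataclass, field

@dataclass
class TieredPublisher:
    """Toy three-tier router: ephemeral gossip, convergent CRDT, durable archive."""
    gossip_log: list = field(default_factory=list)    # ~1 s latency, ephemeral
    crdt_log: list = field(default_factory=list)      # ~2 min, eventually consistent
    archive_log: list = field(default_factory=list)   # ~5 min, permanent record

    def publish(self, event: dict) -> str:
        kind = event["kind"]
        if kind in ("discovery", "task"):        # real-time coordination
            self.gossip_log.append(event)
            return "gossipsub"
        if kind in ("score", "leaderboard"):     # shared convergent state
            self.crdt_log.append(event)
            return "crdt"
        self.archive_log.append(event)           # finalized results
        return "github"

pub = TieredPublisher()
```

The point of the separation is that no single store has to satisfy all three requirements at once: low latency, conflict-free merging, and permanence each get a purpose-built layer.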

The autonomous research loop is where things get interesting. Agents run continuous cycles inspired by Karpathy’s autoresearch methodology:

# Simplified research loop (conceptual)
import random
import time

while True:
    # 1. Generate hypothesis
    hypothesis = generate_hypothesis(
        domain=random.choice(["ml", "search", "finance", "skills", "causes"]),
        inspiration=fetch_recent_breakthroughs()
    )

    # 2. Train model
    if collaborative_mode:
        # DiLoCo: share compressed weight deltas via gossip
        model = train_with_diloco(
            hypothesis,
            peers=discover_training_peers(),
            sync_interval=128  # steps between weight delta exchanges
        )
    else:
        model = train_solo(hypothesis)

    # 3. Write paper
    paper = generate_paper(hypothesis, model.results)

    # 4. Peer review
    score = peer_review(paper)  # 1-10 scoring

    # 5. If breakthrough, publish to CRDT leaderboard
    if score >= 8:
        crdt.append({
            "paper": paper,
            "score": score,
            "timestamp": time.time(),
            "node_id": self.id
        })
        # Eventually syncs to GitHub as the archival record

The DiLoCo integration is particularly elegant for heterogeneous networks. Instead of requiring all nodes to stay in lockstep, DiLoCo (Distributed Low-Communication training) lets nodes train independently for many steps, then periodically share compressed weight deltas (pseudo-gradients). A browser tab with a tiny model can collaborate with a datacenter GPU, each contributing what it can:

// Browser-based node joining collaborative training
const node = await hyperspace.connect({
  capabilities: ["inference", "research"],
  resources: {
    gpu: navigator.gpu ? "webgpu" : "cpu",
    memory: "2gb"
  }
});

// Participate in distributed training (async: we await peer exchanges)
node.onTrainingRound(async (task) => {
  const localUpdates = trainLocalSteps(task.model, task.data, 128);
  const compressed = compressWeightDeltas(localUpdates);

  // Gossip deltas to peers, receive theirs
  const peerDeltas = await node.exchangeDeltas(compressed);

  // Merge peer deltas into the local update and continue
  mergeDeltas(localUpdates, peerDeltas);
});

The presence system uses VRF (Verifiable Random Function) leader election to prevent gaming. Every ~90 seconds, a “pulse” round occurs. Nodes prove they’re online by submitting a VRF proof, and a randomly selected leader verifies submissions. Points accumulate logarithmically (so running 10 nodes doesn’t give you 10x points), aligning incentives toward genuine participation rather than Sybil attacks.
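The shape of this incentive scheme can be illustrated in a few lines. Both formulas below are assumptions for illustration — Hyperspace's exact point curve is undocumented, and a bare hash is only a stand-in for a real VRF (an actual VRF such as ECVRF also produces a proof that other nodes can verify):

```python
import hashlib
import math

def presence_points(node_count: int, base: float = 10.0) -> float:
    """Points per pulse for an operator running `node_count` nodes.

    Illustrative logarithmic curve: more nodes earn more, but with
    sharply diminishing returns, so a 10-node Sybil farm earns far
    less than 10x a single honest node.
    """
    return base * math.log2(1 + node_count)

def vrf_like_ticket(node_id: str, round_seed: str) -> int:
    """Toy deterministic lottery ticket; NOT a real VRF (no proof)."""
    digest = hashlib.sha256(f"{round_seed}:{node_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big")

def elect_leader(node_ids: list[str], round_seed: str) -> str:
    """Pick the node with the smallest ticket for this pulse round."""
    return min(node_ids, key=lambda nid: vrf_like_ticket(nid, round_seed))
```

The election is deterministic given the round seed, so every honest node computes the same leader, yet no node can predict or grind future rounds without controlling the seed.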

CRDTs solve the cold-start problem elegantly. When a new node joins, it doesn’t need to replay the entire history or wait for synchronization. It can immediately read the current leaderboard state from any peer, and updates automatically merge conflict-free using Loro’s algorithm. This is crucial for a network where nodes constantly churn—there’s no “waiting for sync” phase, just eventual consistency that converges predictably.
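The no-replay property rests on the merge function being commutative, associative, and idempotent. The simplest CRDT, a grow-only set, shows the idea in one line — Loro implements far richer types, but the convergence argument is the same (toy code, not Loro's API):

```python
def merge(a: frozenset, b: frozenset) -> frozenset:
    """Grow-only set merge is plain union: commutative, associative,
    and idempotent, so replicas converge regardless of the order or
    duplication of delta delivery."""
    return a | b

# Two replicas that saw different leaderboard entries:
replica_1 = frozenset({"paper-17"})
replica_2 = frozenset({"paper-23", "paper-31"})

# A fresh node can merge state from any peer, in any order,
# and end up with the same result -- no history replay needed.
converged = merge(replica_1, replica_2)
```

Those three algebraic properties are exactly why a churning node never needs a "waiting for sync" phase: whatever subset of updates it receives, in whatever order, merging is always safe.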

The nine node capabilities (inference, research, proxy, storage, embedding, memory, orchestration, validation, relay) let participants specialize. A browser node might only do inference, serving lightweight model predictions. A beefy server could handle research (full training loops) and storage (archiving experiments). A relay node might have no GPU but excellent bandwidth, forwarding gossip to poorly-connected peers. This heterogeneity is by design—Hyperspace embraces that not all compute is equal rather than enforcing uniformity.
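Capability-based specialization implies a matching step somewhere in task assignment. A minimal sketch, assuming set-containment matching (the capability names follow the article's list; the scheduler logic and node names are hypothetical):

```python
def eligible_nodes(task_needs: set[str], nodes: dict[str, set[str]]) -> list[str]:
    """Return ids of nodes advertising every capability the task needs."""
    return [nid for nid, caps in nodes.items() if task_needs <= caps]

# Hypothetical fleet: each node advertises a subset of the nine capabilities.
nodes = {
    "browser-tab": {"inference"},
    "h100-box": {"inference", "research", "storage"},
    "edge-relay": {"relay", "proxy"},
}

# A full training task needs research + storage, so only the
# datacenter node qualifies; lightweight inference matches both
# the browser tab and the datacenter node.
```

Routing by advertised capability rather than assuming uniform hardware is what lets a GPU-less but well-connected relay still contribute value to the network.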

Gotcha

The elephant in the room: this is a Day 1 experiment with bold claims but limited validation. The GitHub README includes a prominent disclaimer that results have “no statistical significance testing” and users should “interpret the numbers yourself.” For a project positioning itself as a distributed AGI system, that’s a red flag. What you’re really getting is infrastructure for distributed hyperparameter tuning and small-scale ML experiments, not breakthroughs toward artificial general intelligence. The “AGI” branding is marketing hyperbole.

Security is the bigger concern. The documentation mentions VRF-based verification and commit-reveal protocols, but details are incomplete. How does the network prevent a malicious node from publishing fake training results to the CRDT? What stops someone from spinning up 1,000 sock-puppet nodes to dominate the leaderboard, logarithmic bonuses notwithstanding? The pulse verification system is described but the actual cryptographic verification code isn’t clearly documented. For a decentralized system, cryptoeconomic security is paramount, and Hyperspace’s model remains unproven. Compare this to Bittensor’s well-defined subnet validation or blockchain-based systems with economic penalties for misbehavior—Hyperspace’s trust assumptions aren’t yet clear. Additionally, CRDT convergence assumes honest nodes; a malicious peer could pollute the state in ways that might not be immediately obvious.

Verdict

Use if: you’re a distributed systems enthusiast excited to experiment with cutting-edge P2P coordination primitives (the GossipSub + CRDT + Git stack is genuinely novel), you have spare compute to contribute to a research experiment, or you want to explore autonomous agent research loops without centralized infrastructure. This is a fascinating technical playground that pushes boundaries on decentralized ML collaboration.

Skip if: you need production-ready AI infrastructure with proven results, you’re looking for actual AGI breakthroughs (this is hyperparameter optimization at scale), you require strong security guarantees or economic incentive alignment, or you want mature tooling with comprehensive documentation.

Treat Hyperspace as an ambitious experiment in decentralized coordination that happens to use ML as its workload—the distributed systems innovation is real, but the “distributed AGI” vision remains aspirational. For serious ML work, stick with established platforms; for tinkering with the future of decentralized computation, Hyperspace offers a unique laboratory.
