GibberLink: When AI Agents Negotiate Their Own Protocol

Hook

What happens when two AI agents realize they’re wasting bandwidth talking to each other in English? They switch to their own protocol—audible tones that encode data more efficiently than human speech.

Context

Agent-to-agent communication has traditionally followed human-designed patterns: REST APIs, gRPC, message queues. We build the infrastructure, define the protocols, and agents communicate through channels we architect. But as conversational AI becomes more sophisticated, an interesting question emerges: if agents can understand context and negotiate with humans, why can’t they negotiate with each other?

GibberLink emerged from an 11labs x a16z hackathon in early 2025 as a provocative answer to this question. Created by Anton Pidkuiko and Boris Starkov, it demonstrates two ElevenLabs conversational AI agents that start a conversation in natural English—one playing a hotel caller, the other a receptionist—but autonomously recognize each other as AI and switch to ggwave, a data-over-sound protocol that encodes information as audio frequencies. The demo went viral, earning first place at the hackathon and coverage from Forbes, TechCrunch, and The Independent, accumulating 4,811 GitHub stars. It’s not production infrastructure—it’s a thought experiment that actually works, raising fascinating questions about how future AI systems might optimize their own communication.

Technical Insight

(System architecture, auto-generated diagram: ElevenLabs Agent A and ElevenLabs Agent B exchange natural language speech through a Protocol Detection Layer. On "AI detected, switch protocol," the GibberLink API Bridge routes text through the ggwave Library, which encodes it into audio frequencies; audible tones travel from the Sound Output to the receiving agent's Sound Input/Microphone, where the ggwave Decoder recovers the text. On "Human detected," the agents continue in English.)

The architecture is deceptively simple, which is part of its elegance. GibberLink uses TypeScript to create an API bridge between ElevenLabs conversational AI agents and Georgi Gerganov's ggwave library, which implements a data-over-sound protocol that encodes data as audible tones. The agents aren't hardcoded to switch protocols at a specific point—instead, they're prompted with instructions that guide emergent behavior.

Both agents are prompted to chat about booking a hotel, to switch to the ggwave protocol when they identify the other party as AI, and to keep to English otherwise. This creates a demonstration of prompt-driven protocol negotiation. The agents use their conversational understanding to recognize AI markers in speech patterns, then invoke the ggwave encoding layer provided by GibberLink's API.
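The repository doesn't publish its exact prompts, but the negotiation step can be sketched as a decision function. Everything below—the marker phrases, the `SwitchDecision` type, the `negotiateProtocol` name—is a hypothetical illustration of the contract, not GibberLink's actual code; in the real demo the LLM itself makes this call via its prompt:

```typescript
// Hypothetical sketch of the prompt-driven protocol switch.
// GibberLink delegates this judgment to the LLM via its prompt;
// this deterministic version only illustrates the shape of the decision.

type Protocol = "english" | "ggwave";

interface SwitchDecision {
  protocol: Protocol;
  reason: string;
}

// Phrases an agent might use to self-identify as AI (illustrative only).
const AI_MARKERS = [
  "i am an ai agent",
  "i'm also an ai",
  "switch to ggwave",
];

function negotiateProtocol(utterance: string): SwitchDecision {
  const text = utterance.toLowerCase();
  const matched = AI_MARKERS.find((m) => text.includes(m));
  if (matched) {
    return { protocol: "ggwave", reason: `AI marker detected: "${matched}"` };
  }
  return { protocol: "english", reason: "no AI marker; assume human caller" };
}
```

In the actual demo the recognition is emergent from conversational context rather than a keyword list, which is precisely what makes it interesting.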

The ggwave protocol itself operates at the physical layer of sound. When an agent needs to transmit data, the system encodes text into a series of audio frequencies—tones you can actually hear. The receiving agent’s microphone captures these tones, decodes them back to text through ggwave, and continues the conversation. You can witness this yourself by playing their demo video with the ggwave web decoder open at waver.ggerganov.com—the messages appear in real-time as you hear the chirps and beeps.
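ggwave's real modulation (multi-frequency FSK with Reed-Solomon error correction) is more involved, but the core idea of mapping bytes to audible frequencies can be shown with a toy codec. The 4-bits-per-tone scheme and the frequency constants below are simplified inventions for illustration, not ggwave's actual protocol:

```typescript
// Toy data-over-sound codec: each byte becomes two tones, one per
// 4-bit nibble. Real ggwave transmits several frequencies in parallel
// and adds error correction; this only illustrates the principle.

const BASE_FREQ_HZ = 1875; // illustrative start of the audible band
const STEP_HZ = 46.875;    // illustrative spacing between tone slots

// Encode text into a sequence of tone frequencies (Hz).
function encodeToTones(text: string): number[] {
  const tones: number[] = [];
  for (const byte of new TextEncoder().encode(text)) {
    tones.push(BASE_FREQ_HZ + (byte >> 4) * STEP_HZ);   // high nibble
    tones.push(BASE_FREQ_HZ + (byte & 0x0f) * STEP_HZ); // low nibble
  }
  return tones;
}

// Decode tone frequencies back into text.
function decodeFromTones(tones: number[]): string {
  const bytes: number[] = [];
  for (let i = 0; i < tones.length; i += 2) {
    const hi = Math.round((tones[i] - BASE_FREQ_HZ) / STEP_HZ);
    const lo = Math.round((tones[i + 1] - BASE_FREQ_HZ) / STEP_HZ);
    bytes.push((hi << 4) | lo);
  }
  return new TextDecoder().decode(new Uint8Array(bytes));
}
```

A synthesizer would then play each frequency for a fixed duration, and a receiver would run an FFT over microphone input to recover the tone sequence—the part ggwave handles robustly and this sketch omits.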

The brilliance lies in the observability. Unlike encrypted network protocols happening invisibly, GibberLink’s communication is audibly transparent. You hear the exact moment agents switch from natural speech to encoded tones. A human can’t understand the tones intuitively, but with ggwave’s decoder, anyone can see what’s being transmitted. This creates an unusual property: a more efficient protocol that’s simultaneously more observable than the “human-readable” English it replaces.

The repository provides API functionality that allows the agents to use the ggwave protocol. The conceptual flow appears to follow this pattern: ElevenLabs agents generate speech → Speech recognition converts to text → Agent logic determines if protocol switch is appropriate → Text gets encoded via ggwave library into audio frequencies → Audio plays through speaker → Receiving agent captures audio → ggwave decodes back to text → Agent processes and responds.
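Assuming that flow, the end-to-end loop could be wired together roughly as follows. Every name here (`speechToText`, `textToTones`, and so on) is a hypothetical stand-in for the ElevenLabs and ggwave calls the bridge actually makes:

```typescript
// Hypothetical glue for one conversational turn. Each dependency is
// injected so the sketch stays self-contained; in GibberLink these
// would be ElevenLabs API calls and ggwave encode/decode.

interface AgentIO {
  speechToText(audio: Float32Array): string;  // ASR step
  decideSwitch(text: string): boolean;        // LLM judges "is this an AI?"
  respond(text: string): string;              // LLM generates the reply
  textToTones(text: string): Float32Array;    // ggwave-style encoder
  textToSpeech(text: string): Float32Array;   // TTS path for human callers
}

// One turn: hear incoming audio, reply either as tones or as speech.
function handleTurn(io: AgentIO, incoming: Float32Array): Float32Array {
  const heard = io.speechToText(incoming);
  const reply = io.respond(heard);
  return io.decideSwitch(heard)
    ? io.textToTones(reply)   // counterpart is AI: switch to data-over-sound
    : io.textToSpeech(reply); // counterpart is human: keep speaking English
}
```

Injecting the dependencies also mirrors the project's actual design choice: GibberLink is glue between existing services, not a reimplementation of ASR, TTS, or audio modulation.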

The hackathon origins show in the implementation philosophy. This isn’t a framework with extensive configuration options or a traditional library distribution. It’s a working demonstration that proves a concept: agents can be prompted to recognize communication context and autonomously select appropriate protocols. The repository provides the API glue to make this possible with existing tools (ElevenLabs for conversational AI, ggwave for audio encoding), designed to be experienced through the demo at gbrl.ai.

Gotcha

GibberLink’s limitations are fundamental to its nature as a viral demonstration. The audio-based protocol, while conceptually elegant, is practically constrained by physics. Background noise corrupts the signal. The acoustic channel bandwidth is orders of magnitude lower than even a basic network connection. Latency is higher than any modern API call. These aren’t implementation bugs to fix—they’re inherent tradeoffs of encoding data as sound.

The dependency stack is fragile in ways that matter for anything beyond demos. You’re reliant on ElevenLabs’ conversational AI service and the ggwave library. If ElevenLabs changes their API, pricing, or conversation model behavior, your agents might stop recognizing each other as AI, and the protocol switch never happens. The ggwave protocol itself is robust for what it does, but it’s designed for scenarios like air-gapped computers or mobile-to-mobile communication—not high-throughput agent infrastructure.

Documentation is minimal because the project’s purpose is demonstration rather than framework-building. The README links to reproduction steps in the wiki, but there’s no extensive API reference or detailed implementation guide. If you wanted to adapt this for different agent personas, different protocols, or different communication patterns, you’d be working from the demo implementation itself. This isn’t a criticism—the creators achieved exactly what they set out to build—but developers expecting a production-ready agent communication framework will be disappointed.

Verdict

Use GibberLink if you’re exploring emergent AI behaviors, building creative demonstrations of agent capabilities, or researching how conversational AI can make contextual decisions about communication efficiency. It’s an exceptional educational resource that makes abstract concepts about agent-to-agent communication tangible and audible. Use it for hackathons, conference demos, or as inspiration for how agents might negotiate protocols in future systems. Skip it if you need reliable, high-throughput agent communication for production systems, want a well-documented framework to build upon, or require any form of service-level guarantees. This is fundamentally a viral proof-of-concept that succeeds brilliantly at sparking conversation about AI communication paradigms, not infrastructure for running agent networks at scale.
