
Bananas: Building Privacy-First Screen Sharing with WebRTC and Zero Servers

Hook

Most screen sharing sessions on Zoom, Meet, or Teams send your pixels through someone else's servers, even when your colleague sits three feet away. Bananas says that's ridiculous.

Context

The screen sharing market is dominated by services that funnel your data through centralized infrastructure. When you share your screen on Zoom, your pixels travel to their servers, get processed, then get relayed to your viewer—even if you're both on the same WiFi network. This architecture exists for good reasons: it handles NAT traversal, ensures reliability across networks, and enables features like recording and large group calls. But it comes with costs: privacy concerns, latency overhead, bandwidth consumption on both ends, and dependency on third-party infrastructure.

For developers doing pair programming or quick debugging sessions, this heavyweight approach feels like overkill. You don't need recording. You don't need 50 participants. You certainly don't need your localhost debugging session routed through servers in Virginia. Bananas emerged from this frustration: a cross-platform desktop app built with Svelte that establishes direct peer-to-peer connections using WebRTC. No accounts, no data storage, no server-side screen processing. Just two peers, a signaling handshake, and encrypted pixels flowing directly between machines.

Technical Insight

Bananas's architecture revolves around WebRTC media streams and data channels, with a minimal signaling server that only facilitates the initial connection handshake. The application ships as a desktop app. The repository structure and Svelte usage suggest Tauri rather than Electron, though the exact runtime requires confirmation. Either way, a desktop shell gives the app access to native screen capture APIs, and if the runtime is Tauri, bundle sizes stay comparatively small.

The core flow works like this: when you initiate a session, Bananas generates a session ID and connects to the signaling server via WebSocket. The signaling server—which could be self-hosted—doesn't see any screen data. It only exchanges SDP (Session Description Protocol) offers and ICE (Interactive Connectivity Establishment) candidates between peers. Here's a simplified version of what that peer connection setup looks like:

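// Assumes `signalingChannel` is an already-open wrapper around the signaling
// WebSocket that serializes outgoing messages; it is not defined in this snippet.
const peerConnection = new RTCPeerConnection({
  iceServers: [
    // Public STUN server, used to discover this peer's public address
    { urls: 'stun:stun.l.google.com:19302' },
    // TURN relay for restrictive networks (placeholder host and credentials)
    {
      urls: 'turn:your-turn-server.com:3478',
      username: 'user',
      credential: 'pass'
    }
  ]
});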

// Capture screen stream
const screenStream = await navigator.mediaDevices.getDisplayMedia({
  video: { cursor: 'always' },
  audio: false
});

// Add tracks to peer connection
screenStream.getTracks().forEach(track => {
  peerConnection.addTrack(track, screenStream);
});

// Handle ICE candidates
peerConnection.onicecandidate = (event) => {
  if (event.candidate) {
    // Send candidate to remote peer via signaling
    signalingChannel.send({
      type: 'ice-candidate',
      candidate: event.candidate
    });
  }
};

// Create and send offer
const offer = await peerConnection.createOffer();
await peerConnection.setLocalDescription(offer);
signalingChannel.send({ type: 'offer', sdp: offer });
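
To make the "signaling server never sees screen data" point concrete, here is a minimal in-memory sketch of the relay logic such a server might run. It is not taken from the Bananas codebase: `createSignalingRelay`, `join`, `leave`, and `relay` are hypothetical names, and the transport (for example a WebSocket per peer) is assumed to be handled elsewhere and passed in as a `send` callback.

```javascript
// Hypothetical relay logic; names are illustrative, not from Bananas.
// Only handshake messages are forwarded. Anything else is dropped.
const ALLOWED_TYPES = new Set(['offer', 'answer', 'ice-candidate']);

function createSignalingRelay() {
  // sessionId -> Map of peerId -> send(message) callback
  const sessions = new Map();

  return {
    join(sessionId, peerId, send) {
      if (!sessions.has(sessionId)) sessions.set(sessionId, new Map());
      sessions.get(sessionId).set(peerId, send);
    },

    leave(sessionId, peerId) {
      const peers = sessions.get(sessionId);
      if (!peers) return;
      peers.delete(peerId);
      if (peers.size === 0) sessions.delete(sessionId);
    },

    // Forward a handshake message to every other peer in the session.
    // Returns the number of peers that received it.
    relay(sessionId, fromPeerId, message) {
      if (!message || !ALLOWED_TYPES.has(message.type)) return 0;
      const peers = sessions.get(sessionId);
      if (!peers || !peers.has(fromPeerId)) return 0;
      let delivered = 0;
      for (const [peerId, send] of peers) {
        if (peerId === fromPeerId) continue;
        send({ ...message, from: fromPeerId });
        delivered += 1;
      }
      return delivered;
    }
  };
}
```

The relay holds no state beyond who is in which session, which is why a self-hosted instance can stay small: once both peers have exchanged an offer, an answer, and their ICE candidates, the server is no longer involved.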

The multi-cursor feature mentioned in the repository topics is particularly clever. Rather than just streaming pixels, Bananas likely uses WebRTC data channels (in addition to media streams) to transmit cursor positions and click events as structured data. This allows viewers to see exactly where the presenter is pointing without that information being baked into the video stream, enabling features like independent zooming or following.

// Separate data channel for cursor events
const cursorChannel = peerConnection.createDataChannel('cursor', {
  ordered: false,  // Cursor position doesn't need ordering
  maxRetransmits: 0  // Drop old positions, only care about latest
});

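cursorChannel.onopen = () => {
  // Note: this listener only sees the pointer while it is over the app's own
  // window, and the coordinates are normalized to that window, not to the
  // shared screen. Tracking the cursor across the whole captured display
  // would need a source outside the DOM.
  document.addEventListener('mousemove', (e) => {
    cursorChannel.send(JSON.stringify({
      x: e.clientX / window.innerWidth,
      y: e.clientY / window.innerHeight,
      timestamp: Date.now()
    }));
  });
};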
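
On the receiving side, an unordered and unreliable channel means messages can arrive late or out of order, so the viewer has to discard stale positions itself. The following is a sketch of how that might look, assuming the message shape from the snippet above (`x`, `y`, `timestamp`); `createCursorTracker` is a hypothetical helper, not part of Bananas.

```javascript
// Hypothetical viewer-side helper; the name is illustrative.
function createCursorTracker() {
  let lastTimestamp = -Infinity;

  // raw: string received from the data channel
  // viewWidth / viewHeight: pixel size of the element showing the remote screen
  // Returns pixel coordinates, or null if the message is stale or malformed.
  return function handleMessage(raw, viewWidth, viewHeight) {
    let msg;
    try {
      msg = JSON.parse(raw);
    } catch {
      return null;
    }
    const { x, y, timestamp } = msg ?? {};
    if (![x, y, timestamp].every(Number.isFinite)) return null;
    if (timestamp < lastTimestamp) return null; // older than what we already drew
    lastTimestamp = timestamp;

    // Clamp so a bad sender cannot place the cursor outside the view.
    const clamp = (v) => Math.min(1, Math.max(0, v));
    return { left: clamp(x) * viewWidth, top: clamp(y) * viewHeight };
  };
}
```

A viewer could call the returned function from the channel's `onmessage` handler and use the result to position an overlay element. Because the coordinates are normalized to the 0..1 range, the same message works regardless of how large the viewer renders the remote screen.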

The Svelte choice is noteworthy for a desktop application. While React is the common choice for Electron-based desktop apps, Svelte compiles components to plain JavaScript without a virtual DOM runtime, which typically yields smaller bundles and less runtime overhead. That matters when the machine is already busy capturing, encoding, and sending screen video. Svelte's reactive primitives also map naturally to WebRTC's event-driven architecture: connection state changes, ICE candidate gathering, and stream updates all fit Svelte's store-based reactivity model cleanly.
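
As an illustration of that fit, the sketch below wraps a peer connection's state in an object that satisfies the Svelte store contract: `subscribe` calls the callback immediately with the current value and returns an unsubscribe function. It is written without importing Svelte so it stays self-contained, and `connectionStateStore` is a hypothetical name rather than something found in the Bananas source.

```javascript
// Hypothetical helper; follows the Svelte store contract without importing Svelte.
// `peerConnection` is anything that exposes `connectionState` and emits
// 'connectionstatechange' events, such as an RTCPeerConnection.
function connectionStateStore(peerConnection) {
  const subscribers = new Set();

  const notify = () => {
    for (const fn of subscribers) fn(peerConnection.connectionState);
  };
  // Note: this sketch never removes the listener; a real app would do so on teardown.
  peerConnection.addEventListener('connectionstatechange', notify);

  return {
    subscribe(fn) {
      subscribers.add(fn);
      fn(peerConnection.connectionState); // emit the current value right away
      return () => subscribers.delete(fn);
    }
  };
}
```

A component that holds such a store in a variable named `connection` could then read `$connection` in its markup and re-render whenever the peer connection changes state, with no manual event wiring in the component itself.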

The P2P architecture has a beautiful side effect: adding users costs almost nothing. Bananas can serve 10 users or 10,000 with largely the same signaling infrastructure, because screen data never touches the server (sessions that fall back to a TURN relay are the exception). Compare this to traditional solutions, where every additional user multiplies server bandwidth and processing costs. The trade-off is scalability per session. In a full mesh, each peer maintains a connection to every other peer, so n participants need n(n-1)/2 connections in total, which grows as O(n²), and the presenter must upload a separate copy of the stream to each viewer. In practice this stops scaling well beyond 4-6 participants. But for pair programming and small team collaboration, this limitation is irrelevant.
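
The per-session limit is easy to see with a little arithmetic. The sketch below counts connections in a full mesh and estimates the presenter's upload when one stream goes to each viewer. Both function names are hypothetical, and the bitrate in the usage note is an assumed figure for illustration, not a measurement from Bananas.

```javascript
// Total peer connections in a full mesh of n participants: n(n-1)/2.
function meshConnections(participants) {
  return (participants * (participants - 1)) / 2;
}

// Upload needed by a presenter who sends one copy of the stream per viewer.
// streamMbps is an assumed per-viewer bitrate.
function presenterUploadMbps(viewers, streamMbps) {
  return viewers * streamMbps;
}
```

With two participants there is a single connection; with six there are already 15. Assuming a 2.5 Mbps screen stream, a presenter with five viewers would need roughly 12.5 Mbps of sustained upload, which is more than many home connections can spare.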

Gotcha

The biggest gotcha with Bananas is the same one that plagues all P2P applications: NAT traversal and firewall restrictions. WebRTC uses STUN servers to discover your public IP address and tries to establish a direct connection, but symmetric NATs and restrictive corporate firewalls can block P2P connections entirely. In these cases, you need a TURN server that relays the data, which gives up the latency and bandwidth benefits of a direct path. The media stays encrypted between the two peers (WebRTC uses DTLS-SRTP), so the relay cannot read your screen, but it does see who is talking to whom and how much traffic flows. Unlike centralized solutions with dedicated infrastructure, you're responsible for either running your own TURN servers or depending on third-party ones, which may have bandwidth limits or reliability issues.
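
Whether a session ended up direct or relayed can be read from the ICE candidates themselves: each candidate line carries a `typ` field whose value is `host`, `srflx` (discovered via STUN), `prflx`, or `relay` (allocated on a TURN server). The helper names below are hypothetical, and this is only a sketch of how an app might surface that information.

```javascript
// Extract the candidate type from an ICE candidate line, or null if absent.
function candidateType(candidateLine) {
  const match = /\btyp\s+(host|srflx|prflx|relay)\b/.exec(candidateLine);
  return match ? match[1] : null;
}

// Summarize the path given the types of the selected local and remote candidates.
function describePath(localType, remoteType) {
  if (localType === 'relay' || remoteType === 'relay') {
    return 'relayed through a TURN server';
  }
  if (localType === 'host' && remoteType === 'host') {
    return 'direct, same network';
  }
  return 'direct, across NAT';
}
```

In a browser context the same value is also exposed as the `type` property of an `RTCIceCandidate`, so parsing the string is only needed when working with raw candidate lines, for example in logs.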

Corporate environments are particularly problematic. Many companies implement strict egress filtering that only allows traffic on ports 80 and 443 to allowlisted destinations. Even if both peers are behind the same corporate firewall, they often can't establish direct connections because internal P2P traffic is blocked. In these scenarios, Bananas can fail to connect unless a reachable TURN relay is configured, while traditional client-server solutions work fine because they tunnel everything through HTTPS to known servers. The repository doesn't clearly document fallback strategies or connection diagnostics, so users may struggle to understand why a connection fails without diving into WebRTC's arcane ICE state machine. Sessions are also ephemeral: if someone's connection drops mid-session, you start over from scratch, which can be frustrating during a critical debugging session.
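
A small amount of diagnostics goes a long way here. The sketch below maps the standard `iceConnectionState` values to plain-language hints that an app could show to the user. The wording of the hints and the name `describeIceState` are assumptions for illustration and do not describe anything Bananas currently does.

```javascript
// Hypothetical mapping from RTCPeerConnection.iceConnectionState values
// to user-facing hints.
const ICE_STATE_HINTS = new Map([
  ['new', 'Waiting to start connectivity checks.'],
  ['checking', 'Testing possible network routes to the other peer.'],
  ['connected', 'A working route was found.'],
  ['completed', 'Connectivity checks finished; the connection is established.'],
  ['disconnected', 'The connection was interrupted and may recover on its own.'],
  ['failed', 'No working route was found. A firewall or symmetric NAT may be blocking direct traffic; a TURN relay may be required.'],
  ['closed', 'The connection has been closed.']
]);

function describeIceState(state) {
  return ICE_STATE_HINTS.get(state) ?? `Unknown ICE state: ${state}`;
}
```

An app could call `describeIceState(peerConnection.iceConnectionState)` from an `iceconnectionstatechange` listener. On `failed`, one option is to call `peerConnection.restartIce()` and renegotiate before giving up on the session.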

Verdict

Use if: You're doing pair programming or ad-hoc screen sharing with trusted peers, you value privacy over convenience, you're on relatively open networks (home, coworking spaces, developer-friendly offices), and you want zero account overhead for quick sessions. Bananas excels at impromptu "hey, look at this bug" scenarios where spinning up a full video conference feels like bureaucratic overhead.

Skip if: You work in corporate environments with restrictive firewalls, you need session recording or persistence, you're sharing with more than 3-4 people simultaneously, or you require guaranteed connectivity where IT-blessed solutions like Zoom are more pragmatic despite their privacy trade-offs. Also skip if you need remote control rather than just viewing: Bananas is screen sharing, not remote desktop.
