
Building a Subconscious for Claude Code: Persistent Memory Through Background Agents


Hook

Claude Code forgets everything between sessions. What if you could give it a background process that remembers—and whispers context back just before Claude answers your next question?

Context

Claude Code has a fundamental problem: amnesia. Every time you close and reopen a session, it starts fresh. It doesn’t remember the architecture decisions you discussed yesterday, the bug patterns you discovered last week, or the coding conventions you established across projects. This isn’t just annoying—it’s wasteful. You spend time re-explaining context that should be obvious from previous work.

The claude-subconscious project from Letta AI tackles this head-on with an unusual architecture: a persistent background agent that watches your Claude Code sessions, maintains memory across time, and injects relevant context back into new conversations. It’s not trying to replace Claude Code or augment the model itself. Instead, it creates a separate process—a ‘subconscious’—that learns from everything you do and whispers guidance when needed. The repository is explicitly labeled as a demo showcasing the Letta Code SDK, but the architecture patterns it demonstrates are fascinating for anyone thinking about stateful AI coding assistants.

Technical Insight

[Figure: system architecture (auto-generated). A user prompt flows into a Claude Code session; the transcript is captured and sent asynchronously to a Letta agent instance equipped with Read/Grep/Glob tools, web search, persistent memory, and multiple conversation threads via the Conversations API. The agent draws on file data and search results, generates a whisper, and injects context back into the session through stdout—enabling cross-session learning.]

The core innovation is the async processing loop. After each Claude Code response, the transcript gets sent to a Letta agent running in the background. This agent isn’t just storing text—it has real tool access. It can read files using Read, Grep, and Glob operations. It can search the web. Most importantly, it maintains persistent memory using Letta’s memory architecture.
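The capture step can be sketched as follows. Claude Code stores session transcripts as JSONL (one JSON object per line), though the flat `role`/`content` shape below is a simplified assumption for illustration, not the real schema:

```javascript
// Pull the most recent user/assistant exchange out of a JSONL
// transcript so it can be forwarded to the background agent.
// NOTE: field names are simplified assumptions, not Claude Code's schema.
function lastExchange(jsonl) {
  const entries = jsonl.trim().split('\n').map((line) => JSON.parse(line));
  let assistant = null;
  let user = null;
  // Walk backwards: find the latest assistant turn, then the
  // user turn that prompted it.
  for (let i = entries.length - 1; i >= 0; i--) {
    if (!assistant && entries[i].role === 'assistant') {
      assistant = entries[i];
    } else if (assistant && entries[i].role === 'user') {
      user = entries[i];
      break;
    }
  }
  return { user, assistant };
}
```

Only this last exchange needs to cross the wire on each turn; the agent's persistent memory carries everything older.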

Here’s how the injection mechanism works. Before Claude Code processes your next prompt, the subconscious agent gets a chance to ‘whisper’ context by injecting text directly into stdout:

// The agent prepares contextual guidance
const whisper = await lettaAgent.processTranscript({
  sessionId: currentSession,
  transcript: lastExchange,
  mode: 'whisper'
});

// Inject into Claude Code's context via stdout
if (whisper.content) {
  process.stdout.write(whisper.content);
}
// Claude Code sees this as additional context
// but it never touches CLAUDE.md

The key architectural decision is avoiding CLAUDE.md entirely. Many Claude Code extensions try to maintain state by writing to that file, creating conflicts and noise. Claude Subconscious uses stdout injection instead, making the context provision invisible to the user but visible to Claude.
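A minimal sketch of what that injection might look like in practice—the wrapper tag is a hypothetical convention, not the project's exact format, but it illustrates why stdout beats a shared file: the context is scoped to this one turn and leaves nothing behind on disk.

```javascript
// Hypothetical helper: wrap a whisper so it reads as background
// guidance rather than user input. The tag name is an assumption
// for illustration, not the project's actual format.
function formatWhisper(content) {
  if (!content) return ''; // nothing to say: inject nothing
  return `<subconscious-context>\n${content}\n</subconscious-context>\n`;
}

// Writing to stdout hands the text to Claude Code for this turn
// only—CLAUDE.md is never touched.
process.stdout.write(
  formatWhisper("Prefer the repo's existing logger over console.log.")
);
```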

What makes this scalable is Letta’s Conversations API. A single Letta agent instance can handle multiple concurrent Claude Code sessions, each tracked as a separate conversation. When you switch between projects, the agent maintains distinct conversation threads but shares underlying memory:

const { v4: uuidv4 } = require('uuid'); // npm: uuid

const agent = await letta.getOrCreateAgent({
  name: 'claude-subconscious',
  tools: ['read', 'grep', 'glob', 'search'],
  memory: { type: 'persistent' }
});

// Each coding session gets its own conversation
const conversation = await agent.conversations.create({
  projectPath: '/path/to/project',
  metadata: { sessionId: uuidv4() }
});

// Agent can correlate patterns across conversations
await conversation.send(transcript);

This means the agent can notice patterns across your work. If you make the same architectural mistake in two different projects, it can warn you in the second project because it learned from the first.

The provisioning system uses a smart fallback chain. First, it checks for a LETTA_AGENT_ID environment variable. If not found, it looks for a saved configuration file. If that’s missing, it auto-imports a default agent with standard tools. This zero-config approach means you can start experimenting immediately:

const agentId = 
  process.env.LETTA_AGENT_ID ||
  loadFromConfig() ||
  await autoImportDefaultAgent();

The injection modes give you control over how aggressive the subconscious should be. In ‘whisper’ mode, it provides minimal context hints. In ‘full’ mode, it can inject extensive analysis. In ‘off’ mode, it still observes and learns but stays silent. Similarly, tool access can be restricted to read-only (safe exploration) or full (allowing the agent to discover patterns through active search).
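A gate over the three modes might look like this—the mode names come from the project, but the first-line truncation rule for 'whisper' mode is an assumption for illustration:

```javascript
// Hypothetical gate over the three injection modes. Mode names are
// from the project; the first-line truncation rule is an assumption.
function gateWhisper(analysis, mode) {
  switch (mode) {
    case 'off':
      return ''; // the agent still observes upstream, but stays silent
    case 'whisper':
      return analysis.split('\n')[0]; // minimal hint: first line only
    case 'full':
      return analysis; // inject the agent's complete analysis
    default:
      throw new Error(`unknown injection mode: ${mode}`);
  }
}
```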

The async processing pattern is crucial for performance. The background agent processes transcripts without blocking your coding flow. You don’t wait for memory updates or analysis—they happen in parallel while you continue working. This creates the feeling of a truly independent process watching and learning, not a synchronous step in your workflow.
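The non-blocking hand-off reduces to a fire-and-forget call—here `sendToAgent` is a stand-in for the real SDK call, not the project's actual function name:

```javascript
// Hypothetical sketch of the fire-and-forget hand-off: the transcript
// is sent without awaiting the result, so the coding loop never stalls
// on the Letta round-trip. `sendToAgent` stands in for the SDK call.
function processInBackground(sendToAgent, transcript) {
  // Deliberately not awaited: memory updates happen in parallel.
  sendToAgent(transcript).catch((err) => {
    // A failed update only costs learning; it never blocks coding.
    console.error('subconscious update failed:', err.message);
  });
}
```

The attached `.catch` matters: without it, a Letta outage would surface as an unhandled promise rejection in the middle of your session.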

Gotcha

The biggest gotcha is right in the README: this is explicitly a demo app, not production software. The maintainers are clear that you should use the full Letta Code CLI (npm install -g @letta-ai/letta-code) for actual work. This repository exists to demonstrate SDK capabilities and architectural patterns, not to be a daily driver.

You’re also taking on an external service dependency. Every transcript-processing step requires a round-trip to Letta’s API, which adds latency and creates a potential failure point. If Letta’s service is down or slow, your subconscious stops learning. The agent processing happens asynchronously so it won’t block your coding, but you’ll lose the memory and context benefits.

There’s also the Linux tmpfs issue—Claude Code has an underlying bug with temporary filesystems that requires manually setting TMPDIR to a real disk location. It’s a workaround for a problem in a different tool, but you’ll hit it if you’re on Linux.

Finally, this architecture assumes you’re comfortable with the privacy implications of sending your coding transcripts to an external service. All your code discussions, architecture decisions, and problem-solving sessions go through Letta’s infrastructure.

Verdict

Use if: You’re exploring persistent memory architectures for AI coding assistants and want to understand how background agents can enhance context awareness; you’re comfortable with the demo/experimental nature and plan to learn from the patterns rather than deploy to production; you already use, or want to evaluate, Letta’s agent platform and its integration with coding workflows; or you’re researching alternatives to CLAUDE.md-based state management.

Skip if: You need production-ready tooling (use the Letta Code CLI instead); you can’t accept external API dependencies in your development workflow; you require air-gapped or fully local solutions; or you just want basic Claude Code functionality without the complexity of background agent orchestration.

This is a proof-of-concept that teaches valuable lessons about stateful AI assistants, but it’s intentionally not the finished product.
