Memoria: Building Agent Memory Systems That Remember Why, Not Just What
Hook
Most RAG systems give your AI agent a photographic memory of documents, but no ability to remember why it made a decision last Tuesday. Memoria fixes that by treating agent memory like human memory: connected, contextual, and capable of learning from experience.
Context
The current generation of AI agents suffers from institutional amnesia. Tools like AutoGPT and BabyAGI can plan multi-step tasks, but they forget everything between sessions. Current solutions bolt on vector databases for retrieval-augmented generation (RAG), which works brilliantly for document search but poorly for agent cognition. When your agent needs to remember "I tried this API integration approach and it failed because of rate limiting," storing that as an embedding loses critical context: the causal chain, the attempted solution, the failure mode, and the workaround that eventually succeeded.
Memoria takes a different architectural bet: that agent memory requires both semantic similarity (vectors) and relational context (graphs). Instead of treating memory as a passive document store, it models memory as a write-back system where agents store decisions, reasoning artifacts, preferences, and action outcomes. The project acknowledges it's pre-MVP, but the architecture reveals a more sophisticated understanding of what agent memory actually needs than most production RAG wrappers.
Technical Insight
At its core, Memoria orchestrates two complementary storage systems: Milvus for vector similarity search and Neo4j for graph-structured relationships. This isn't just architectural redundancy; the two stores support fundamentally different retrieval strategies. When an agent queries "how should I handle database connection failures?", the vector layer surfaces semantically similar past experiences, while the graph layer traverses relationship chains such as Decision -[CAUSED_BY]-> Error -[RESOLVED_BY]-> Workaround.
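To make the two retrieval strategies concrete, here is a minimal in-memory sketch. It does not use Milvus or Neo4j: the dictionary of embeddings, the edge triples, and the names `cosine`, `vector_search`, `expand`, and `hybrid_retrieve` are all hypothetical stand-ins chosen for illustration, not Memoria's API.

```python
import math

# Toy stand-in for the vector layer: memory id -> embedding (made-up values).
EMBEDDINGS = {
    "dec_pool_timeout": [0.9, 0.1, 0.0],
    "dec_api_retry": [0.2, 0.9, 0.1],
    "dec_ui_theme": [0.0, 0.1, 0.9],
}

# Toy stand-in for the graph layer: (source, relationship, target) triples.
EDGES = [
    ("dec_pool_timeout", "CAUSED_BY", "err_conn_refused"),
    ("err_conn_refused", "RESOLVED_BY", "wk_reconnect_with_backoff"),
    ("dec_api_retry", "CAUSED_BY", "err_rate_limit"),
]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def vector_search(query, limit):
    """Return the ids of the `limit` memories most similar to the query."""
    ranked = sorted(EMBEDDINGS, key=lambda k: cosine(query, EMBEDDINGS[k]), reverse=True)
    return ranked[:limit]


def expand(node, max_hops=2):
    """Follow outgoing edges from `node`, returning every path walked."""
    paths, frontier = [], [(node, [])]
    for _ in range(max_hops):
        next_frontier = []
        for current, path in frontier:
            for src, rel, dst in EDGES:
                if src == current:
                    new_path = path + [(rel, dst)]
                    paths.append(new_path)
                    next_frontier.append((dst, new_path))
        frontier = next_frontier
    return paths


def hybrid_retrieve(query, limit=1):
    """Vector search picks entry points; graph expansion adds causal context."""
    return {hit: expand(hit) for hit in vector_search(query, limit)}


result = hybrid_retrieve([1.0, 0.0, 0.0])
```

The design point is the division of labor: similarity search only chooses where to enter the graph, and the traversal supplies the cause and the workaround that an embedding alone would flatten away.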
The memory schema differentiates between artifact types that matter to agents. Rather than generic "documents," Memoria stores typed memories:
# Simplified representation of memory artifact types
class MemoryArtifact:
    # Core decision memory
    decision = {
        'context': 'What situation prompted this',
        'reasoning': 'Why this approach was chosen',
        'outcome': 'What actually happened',
        'timestamp': 'When this occurred'
    }
    # Methodological learning
    method = {
        'problem_type': 'Category of problem',
        'approach': 'Strategy that worked/failed',
        'conditions': 'When this applies',
        'effectiveness': 'How well it worked'
    }
    # User preferences and constraints
    preference = {
        'preference_type': 'What aspect of behavior',
        'value': 'Preferred approach',
        'strength': 'How strongly this applies',
        'context': 'When this preference matters'
    }
This schema enables queries that traditional RAG can't answer. An agent can ask "what approaches have I tried for this problem type?" and get not just similar text snippets, but structured provenance: which approaches were attempted, under what conditions, with what outcomes, and what the agent learned.
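As a rough illustration of that kind of structured query, the sketch below models the method memory as a dataclass and filters by problem type. The `MethodMemory` class, the `approaches_tried` helper, the sample records, and the choice to encode effectiveness as a float between 0 and 1 are all assumptions made for this example; the project may represent these fields differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MethodMemory:
    problem_type: str
    approach: str
    conditions: str
    effectiveness: float  # assumption: 0.0 (failed) to 1.0 (worked well)


def approaches_tried(memories, problem_type):
    """Return every recorded approach for a problem type, best first."""
    matches = [m for m in memories if m.problem_type == problem_type]
    return sorted(matches, key=lambda m: m.effectiveness, reverse=True)


# Hypothetical sample memories.
memories = [
    MethodMemory("rate_limiting", "fixed_retry", "burst traffic", 0.2),
    MethodMemory("rate_limiting", "exponential_backoff", "burst traffic", 0.9),
    MethodMemory("schema_migration", "blue_green", "zero downtime", 0.8),
]

for m in approaches_tried(memories, "rate_limiting"):
    print(f"{m.approach}: effectiveness={m.effectiveness} (conditions: {m.conditions})")
```

Because each field is typed rather than buried in free text, the failed approach is returned alongside the successful one, which is exactly the provenance a text-snippet search would struggle to reconstruct.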
The graph layer makes this queryable through relationship traversal. In Neo4j, a memory might look like:
// Creating a decision memory with relationships
CREATE (d:Decision {
  id: 'dec_20240115_api_retry',
  context: 'Third-party API returning 429 rate limit errors',
  reasoning: 'Implemented exponential backoff instead of fixed retry',
  outcome: 'Success after 3 retries average',
  timestamp: datetime()
})
CREATE (e:Error {type: 'RateLimitError', api: 'external_service'})
CREATE (s:Strategy {name: 'exponential_backoff', pattern: 'retry_with_backoff'})
CREATE (o:Outcome {success: true, avg_retries: 3})
CREATE (d)-[:RESPONDED_TO]->(e)
CREATE (d)-[:APPLIED]->(s)
CREATE (d)-[:RESULTED_IN]->(o)
CREATE (s)-[:RESOLVED]->(e)
Now when the agent encounters a rate limit error again, retrieval isn't just embedding similarity—it's graph traversal: "Find decisions that responded to RateLimitError, see what strategies they applied, check the outcomes, and retrieve the reasoning." This captures not just what worked, but why it worked and when it applies.
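The traversal described above could be phrased as a parameterized Cypher query. The sketch below wraps one possible query in a small Python helper; the helper name `decisions_for_error`, the `limit` parameter, and the choice of returned columns are assumptions for illustration, while the labels and relationship types are the ones from the CREATE example.

```python
# Hypothetical query over the Decision / Error / Strategy / Outcome nodes
# created in the example above. Not taken from the Memoria codebase.
DECISIONS_FOR_ERROR = """
MATCH (d:Decision)-[:RESPONDED_TO]->(e:Error {type: $error_type})
MATCH (d)-[:APPLIED]->(s:Strategy)
MATCH (d)-[:RESULTED_IN]->(o:Outcome)
RETURN d.reasoning AS reasoning,
       s.name AS strategy,
       o.success AS success,
       o.avg_retries AS avg_retries
ORDER BY d.timestamp DESC
LIMIT $limit
"""


def decisions_for_error(error_type, limit=5):
    """Build the query text and parameters for a given error type."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    return DECISIONS_FOR_ERROR.strip(), {"error_type": error_type, "limit": limit}


query, params = decisions_for_error("RateLimitError")
```

With the official Neo4j Python driver, the pair could be passed to `session.run(query, params)`; that wiring is left out so the block runs without a database. Passing the error type as a parameter rather than formatting it into the string keeps the query safe from injection and lets Neo4j reuse the query plan.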
The FastAPI backend exposes memory operations as RESTful endpoints, allowing agents to write memories during execution and query them for context. The write-back loop is critical: unlike pure RAG where humans curate the knowledge base, agents actively write their own experiences:
# Conceptual agent integration
class MemoriaAgent:
    def execute_task(self, task):
        # Query relevant past experiences
        similar_contexts = self.memoria.query_vector(
            embedding=self.embed(task.description),
            limit=5
        )
        related_decisions = self.memoria.query_graph(
            # Triple quotes let the Cypher pattern span multiple lines
            pattern="""
                MATCH (d:Decision)-[:RESPONDED_TO]->(e:Error)
                WHERE e.type = $error_type
                RETURN d, e
            """,
            params={'error_type': task.error_type}
        )
        # Make decision informed by memory
        decision = self.decide(task, similar_contexts, related_decisions)
        # Execute and observe outcome
        outcome = self.execute(decision)
        # Write back to memory
        self.memoria.store_artifact(
            type='decision',
            context=task.description,
            reasoning=decision.rationale,
            outcome=outcome.result,
            relationships=[
                ('RESPONDED_TO', task.trigger_event),
                ('APPLIED', decision.strategy),
                ('RESULTED_IN', outcome)
            ]
        )
This architecture enables compound learning: each execution enriches the memory, making future decisions more informed. The agent doesn't just retrieve relevant documents—it retrieves its own lived experience, complete with causal reasoning and outcome tracking.
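A toy simulation shows how the write-back loop compounds over repeated runs. Everything here is hypothetical: `InMemoryMemoria` is a list-backed stand-in rather than the real client, the candidate strategies are hard-coded, and the outcome is simulated instead of observed.

```python
class InMemoryMemoria:
    """Hypothetical stand-in for a Memoria client; keeps artifacts in a list."""

    def __init__(self):
        self.artifacts = []

    def store_artifact(self, **fields):
        self.artifacts.append(fields)

    def decisions_for(self, error_type):
        return [a for a in self.artifacts if a["error_type"] == error_type]


def choose_strategy(memoria, error_type):
    """Reuse a strategy that succeeded before; skip ones that failed."""
    past = memoria.decisions_for(error_type)
    for artifact in past:
        if artifact["success"]:
            return artifact["strategy"]
    failed = {a["strategy"] for a in past}
    for candidate in ("fixed_retry", "exponential_backoff"):
        if candidate not in failed:
            return candidate
    return "escalate_to_human"


def run_once(memoria, error_type, works):
    """One pass of the loop: recall, decide, act (simulated), write back."""
    strategy = choose_strategy(memoria, error_type)
    success = strategy in works  # simulated outcome, not a real API call
    memoria.store_artifact(error_type=error_type, strategy=strategy, success=success)
    return strategy, success


memoria = InMemoryMemoria()
works = {"exponential_backoff"}
history = [run_once(memoria, "RateLimitError", works) for _ in range(3)]
print(history)
```

The first run has no memory and picks the naive strategy, the second avoids the recorded failure, and the third goes straight to what worked. Nothing in the decision logic changed between runs; only the memory did.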
Gotcha
The project documentation explicitly warns that Memoria is pre-MVP, and this isn't false modesty. Core features around memory pruning and temporal decay are still in development, which is a serious limitation. Without pruning strategies, agent memory will accumulate noise—bad decisions, outdated context, and contradictory learnings. Human memory's strength isn't just remembering but strategically forgetting, and Memoria hasn't solved this yet.
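Since Memoria has not shipped a decay or pruning mechanism, the following is only one common way such a feature could work, not the project's design. It assumes each memory carries a relevance score and an age in days, and applies exponential decay with a configurable half-life; the function names, field names, and default values are all invented for this sketch.

```python
import math


def decayed_score(relevance, age_days, half_life_days=30.0):
    """Exponential decay: the score halves every `half_life_days`."""
    return relevance * math.exp(-math.log(2) * age_days / half_life_days)


def prune(memories, threshold=0.1, half_life_days=30.0):
    """Keep memories whose decayed score is still at or above the threshold."""
    return [
        m for m in memories
        if decayed_score(m["relevance"], m["age_days"], half_life_days) >= threshold
    ]


# Hypothetical memories with made-up relevance scores and ages.
memories = [
    {"id": "a", "relevance": 0.9, "age_days": 10},
    {"id": "b", "relevance": 0.9, "age_days": 120},
    {"id": "c", "relevance": 0.3, "age_days": 30},
]

kept = prune(memories)
print([m["id"] for m in kept])
```

A pure time-based decay like this would also discard old but still-valid lessons, so a real implementation would likely need to factor in how often a memory is retrieved or whether it has been contradicted by later outcomes.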
The lack of a deployed backend means you're spinning up Milvus and Neo4j locally for experimentation, which is fine for research but adds operational complexity. For production use, you'd need to solve deployment, backup, memory lifecycle management, and privacy isolation between agents—none of which are addressed. The React frontend exists primarily for visualization during development, not as a production admin interface. If you need agent memory today in a production system, you're essentially adopting Memoria's architecture as inspiration and building the implementation yourself, because the codebase isn't ready to run critical applications.
Verdict
Use if: You're researching agent memory architectures and want to experiment with hybrid vector-graph approaches, you're building an agent system where learning from past decisions is core to the value proposition, or you want to contribute to an open-source project tackling a genuinely hard problem in an architecturally interesting way.
Skip if: You need production-ready agent memory right now (use Mem0 or Zep instead), you're building simple chatbots where conversation buffer memory is sufficient, or you're uncomfortable running pre-release software that explicitly warns against production usage.
The ideas here are solid, but give it 2-3 months to mature before betting critical systems on it.