Tachikoma: Building Security Alerts on Diffs Instead of State
Hook
Most security monitoring tools force you to write complex stateful logic to avoid alert storms. Tachikoma flips this on its head: your analyzers only ever see what changed, making alert logic almost embarrassingly simple.
Context
Security monitoring tools typically suffer from a fundamental design problem: they force developers to implement their own change detection logic. You query your AWS infrastructure, your Slack channels, your user permissions—then you're stuck writing boilerplate code to compare current state against previous state, figure out what changed, and decide whether those changes matter. This leads to one of two outcomes: either teams skip the diffing logic entirely and drown in alert noise, or they write hundreds of lines of stateful comparison code that's tedious to maintain and nearly impossible to test.
Tachikoma, developed by the team at CaliDog, takes a radically different approach by making change detection a first-class architectural concern. Instead of treating diffing as something each analyzer implements independently, the framework builds it into the core pipeline. Generators collect data from various sources, the Differ component automatically compares current results against persisted historical state, and analyzers receive only the delta. This architectural decision transforms alert logic from complex stateful analysis into trivial conditional checks—if you see a diff in your analyzer, something actually changed.
Technical Insight
The elegance of Tachikoma's architecture lies in its three-phase pipeline with built-in persistence and namespace-based routing. Let's walk through how these components interact.
Generators are responsible for collecting data from external sources. They're simple functions (or coroutines) that return structured data. Here's what a hypothetical AWS IAM user generator might look like:
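import boto3
from tachikoma.generators import generator  # hypothetical import path


# boto3 is synchronous, so this is a plain function rather than a coroutine;
# calling it inside `async def` would block the event loop.
@generator('aws.iam.users')
def collect_iam_users():
    """Collect all IAM users and their attached policies."""
    iam = boto3.client('iam')
    result = []
    # list_users is paginated, so walk every page instead of only the first.
    for page in iam.get_paginator('list_users').paginate():
        for user in page['Users']:
            # Note: only the first page of attached policies is read here.
            policies = iam.list_attached_user_policies(
                UserName=user['UserName']
            )['AttachedPolicies']
            result.append({
                'username': user['UserName'],
                'created': user['CreateDate'].isoformat(),
                'policies': [p['PolicyName'] for p in policies],
            })
    return result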
The @generator decorator registers this function with the namespace aws.iam.users. When Tachikoma runs, it executes all generators and passes their results to the Differ component, which compares them against the previous run's persisted state and produces a structured diff object containing additions, deletions, and modifications. The new results are persisted so they can serve as the baseline for the next run.
Now here's where the magic happens—your analyzer only receives the diff, not the full state:
from tachikoma.analyzers import analyzer
from tachikoma.emitters import emit


@analyzer('aws.iam.users')
def detect_privilege_escalation(diff):
    """Alert when users gain admin-level policies."""
    dangerous_policies = ['AdministratorAccess', 'PowerUserAccess']

    # Brand-new users that already carry a dangerous policy.
    for addition in diff.added:
        for policy in addition['policies']:
            if policy in dangerous_policies:
                emit('slack.security', {
                    'severity': 'high',
                    'message': f"New user {addition['username']} created with {policy}"
                })

    # Existing users whose policy list gained a dangerous policy.
    for modification in diff.modified:
        old_policies = set(modification['old']['policies'])
        new_policies = set(modification['new']['policies'])
        added_policies = new_policies - old_policies
        for policy in added_policies:
            if policy in dangerous_policies:
                emit('slack.security', {
                    'severity': 'critical',
                    'message': f"User {modification['new']['username']} escalated to {policy}"
                })
Notice how clean this analyzer code is. No database queries, no manual state comparison, no caching logic—just straightforward conditional checks on the diff object. The diff.added list contains entirely new items, diff.modified contains items that existed before but changed (with both old and new versions), and diff.deleted contains items that disappeared.
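To make the shape of that diff object concrete, here is a minimal standalone sketch of how such a diff could be computed. It is not Tachikoma's actual Differ: the Diff class, the compute_diff function, and the idea of matching items by a key field are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Diff:
    """Hypothetical diff container mirroring the fields used in the article."""
    added: list = field(default_factory=list)
    deleted: list = field(default_factory=list)
    modified: list = field(default_factory=list)


def compute_diff(previous, current, key):
    """Compare two lists of dicts, matching items by the value of `key`."""
    old_by_key = {item[key]: item for item in previous}
    new_by_key = {item[key]: item for item in current}

    diff = Diff()
    for k, new_item in new_by_key.items():
        if k not in old_by_key:
            diff.added.append(new_item)
        elif old_by_key[k] != new_item:
            # Keep both versions so analyzers can see what changed.
            diff.modified.append({'old': old_by_key[k], 'new': new_item})
    for k, old_item in old_by_key.items():
        if k not in new_by_key:
            diff.deleted.append(old_item)
    return diff


previous = [
    {'username': 'alice', 'policies': ['ReadOnlyAccess']},
    {'username': 'bob', 'policies': []},
]
current = [
    {'username': 'alice', 'policies': ['ReadOnlyAccess', 'AdministratorAccess']},
    {'username': 'carol', 'policies': []},
]
diff = compute_diff(previous, current, key='username')
```

In this example carol lands in diff.added, bob in diff.deleted, and alice in diff.modified with both her old and new records, which is the structure the analyzer above iterates over.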
The namespace-based routing system enables sophisticated many-to-many relationships between components. An analyzer decorated with @analyzer('aws.*') would receive diffs from all AWS-related generators, while @analyzer('aws.iam.*') would only see IAM-related changes. This decoupling means you can add new generators without touching analyzer code, or create cross-cutting analyzers that correlate changes across multiple data sources.
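One way to picture wildcard routing is with shell-style pattern matching from Python's standard library. The registry below is a hypothetical stand-in, not Tachikoma's implementation; the ANALYZERS list, the analyzer decorator, and the route function are names invented for this sketch.

```python
from fnmatch import fnmatchcase

# Hypothetical registry of (pattern, function) pairs.
ANALYZERS = []


def analyzer(pattern):
    """Register a function for every namespace matching `pattern`."""
    def register(func):
        ANALYZERS.append((pattern, func))
        return func
    return register


def route(namespace):
    """Return the analyzers whose pattern matches a generator namespace."""
    return [func for pattern, func in ANALYZERS
            if fnmatchcase(namespace, pattern)]


@analyzer('aws.*')
def any_aws_change(diff):
    pass


@analyzer('aws.iam.*')
def any_iam_change(diff):
    pass


@analyzer('slack.channels')
def slack_channel_change(diff):
    pass


iam_matches = [f.__name__ for f in route('aws.iam.users')]
ec2_matches = [f.__name__ for f in route('aws.ec2.instances')]
```

With fnmatchcase, the `*` wildcard also matches across dots, so 'aws.*' catches 'aws.iam.users' as well as 'aws.ec2.instances'. Whether the real framework treats dots this way is an assumption here.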
Emitters work similarly—they're registered to namespaces and receive whatever analyzers emit to those namespaces. This indirection means your analyzer code never directly couples to Slack, PagerDuty, or whatever alerting system you're using. Want to add email alerts? Just register an emitter for the appropriate namespace.
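The indirection can be sketched with a small registry keyed by namespace. Everything here is hypothetical (the EMITTERS dict, the emitter decorator, the 'alerts.security' namespace, and the stub delivery functions); it only illustrates how an analyzer can emit to a name without knowing which channels are listening.

```python
from collections import defaultdict

# Hypothetical mapping of namespace -> list of delivery functions.
EMITTERS = defaultdict(list)


def emitter(namespace):
    """Register a delivery function for a namespace."""
    def register(func):
        EMITTERS[namespace].append(func)
        return func
    return register


def emit(namespace, payload):
    """Fan a payload out to every emitter registered for the namespace."""
    for func in EMITTERS.get(namespace, ()):
        func(payload)


sent = []  # Stand-in for real network calls, so the sketch stays runnable.


@emitter('alerts.security')
def to_slack(payload):
    sent.append(('slack', payload['message']))


@emitter('alerts.security')
def to_email(payload):
    sent.append(('email', payload['message']))


emit('alerts.security', {
    'severity': 'high',
    'message': 'bob gained AdministratorAccess',
})
```

Adding the email channel required no change to the code that calls emit, which is the decoupling the paragraph above describes.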
The persistence layer abstracts storage behind a simple interface with get() and set() methods. The default implementation uses local JSON files, but you could swap in Redis, S3, or a database without changing any generator or analyzer code. This abstraction also makes testing trivial—inject a fake persistence layer that returns controlled historical data, run your analyzer against known diffs, and verify the expected alerts fire.
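As a rough illustration of that testing pattern, the sketch below uses an in-memory fake. The article only states that the interface has get() and set() methods, so the exact signatures, the InMemoryPersistence class, and the new_usernames helper are assumptions.

```python
class InMemoryPersistence:
    """Test double with an assumed get(namespace) / set(namespace, value) API."""

    def __init__(self, initial=None):
        self._store = dict(initial or {})

    def get(self, namespace):
        # Return an empty list when there is no previous run to compare with.
        return self._store.get(namespace, [])

    def set(self, namespace, value):
        self._store[namespace] = value


def new_usernames(persistence, namespace, current):
    """Return usernames absent from the previous run, then save the new state."""
    previous = {user['username'] for user in persistence.get(namespace)}
    added = [user['username'] for user in current
             if user['username'] not in previous]
    persistence.set(namespace, current)
    return added


# Seed the fake with controlled "historical" data, then run twice.
store = InMemoryPersistence({'aws.iam.users': [{'username': 'alice'}]})
snapshot = [{'username': 'alice'}, {'username': 'mallory'}]
first_run = new_usernames(store, 'aws.iam.users', snapshot)
second_run = new_usernames(store, 'aws.iam.users', snapshot)
```

The first run reports mallory as new; the second run, with identical input, reports nothing, which is the alert-storm avoidance the diff-based design is after.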
For execution, Tachikoma supports both async generators (using coroutines) and synchronous generators (executed in a thread pool). This hybrid approach lets you use modern async AWS/HTTP clients where available while still supporting legacy synchronous libraries. The framework orchestrates concurrent execution internally, collecting all results before moving to the diff phase.
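A hybrid runner along these lines can be written with asyncio alone. This is a sketch of the general technique, not Tachikoma's orchestrator: run_generators and the two demo sources are hypothetical, and the default thread pool executor is used for the synchronous case.

```python
import asyncio
import inspect
import time


async def async_source():
    """Stand-in for a generator built on an async HTTP client."""
    await asyncio.sleep(0.01)
    return ['async-result']


def sync_source():
    """Stand-in for a generator built on a blocking library such as boto3."""
    time.sleep(0.01)
    return ['sync-result']


async def run_generators(generators):
    """Run coroutine and plain-function generators concurrently."""
    loop = asyncio.get_running_loop()
    pending = []
    for func in generators.values():
        if inspect.iscoroutinefunction(func):
            pending.append(func())
        else:
            # Blocking functions go to the default thread pool executor.
            pending.append(loop.run_in_executor(None, func))
    # gather preserves input order, so results line up with the namespaces.
    results = await asyncio.gather(*pending)
    return dict(zip(generators.keys(), results))


results = asyncio.run(run_generators({
    'demo.async': async_source,
    'demo.sync': sync_source,
}))
```

Because gather waits for every task, all results are available before anything downstream (such as a diff phase) would start.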
Gotcha
Tachikoma's biggest limitation is that it only supports batch processing: it runs periodically (via cron or similar) rather than processing events in real time. If a critical security event happens, you won't know about it until the next scheduled run. The GitHub repository mentions streaming pipeline support as future work, but given the project's apparent abandonment since 2017 (it's still marked ALPHA, with only 22 stars and minimal recent activity), don't hold your breath for that feature.
The other significant issue is limited documentation around what generators, analyzers, and emitters exist out of the box. The README provides an architectural overview but doesn't enumerate available integrations. You'll likely need to write most components from scratch, which might be fine for a framework but feels incomplete for a security tool, where pre-built checks add substantial value. The diff-based architecture is genuinely clever and could save development time, but you're essentially getting architectural patterns rather than a batteries-included security monitoring solution. For production use, the lack of active maintenance is a serious red flag: security tools need ongoing updates to address new attack vectors and integrate with evolving cloud platforms.
Verdict
Use if: You're building custom security monitoring for a specific environment, you value clean architecture over pre-built integrations, and you're comfortable maintaining (or forking) the codebase yourself. The diff-based approach is legitimately innovative and could serve as excellent architectural inspiration for a homegrown security framework. It's also worth considering if you're in a Python-heavy shop that wants lightweight, testable security monitoring without the operational overhead of tools like Security Monkey.
Skip if: You need production-ready tooling with active development, extensive pre-built security checks, real-time alerting capabilities, or vendor support. The project's alpha status and apparent abandonment make it risky for production use unless you're willing to become the maintainer. For most teams, mature alternatives like Cloud Custodian (active development, real-time support, extensive AWS coverage) or Prowler (comprehensive security checks) will be safer bets despite lacking Tachikoma's elegant diffing architecture.