Developer Tools

388 articles

PenGym: Training Reinforcement Learning Agents on Real Penetration Testing Infrastructure

By Rob Ragan ★ 56 Python Apr 3, 2026

SEC-bench: Automated Benchmarking for LLM Agents on Real-World Security Vulnerabilities

By Rob Ragan ★ 63 Python Apr 3, 2026

HackBench: The Security Benchmark Where LLMs Learn to Exploit Real Vulnerabilities

By Rob Ragan ★ 69 Rich Text Format Apr 3, 2026

InterCode-CTF: How Simple Prompts Cracked 95% of Security Challenges (And What That Means for LLM Benchmarking)

By Rob Ragan ★ 70 Python Apr 3, 2026

BandSox: Running Firecracker MicroVMs from Docker Images in Milliseconds

By Rob Ragan ★ 70 Python Apr 3, 2026

RedCode: The First Large-Scale Safety Benchmark That Actually Tests Code Agents in the Wild

By Rob Ragan ★ 71 Python Apr 3, 2026

TTI: Teaching Web Agents to Learn from Their Own Mistakes at Test Time

By Rob Ragan ★ 73 Python Apr 3, 2026

Agenspy: Making DSPy Protocol-Native for the Agent-to-Agent Future

By Rob Ragan ★ 74 Python Apr 3, 2026

Building an Autonomous Reverse Engineering Agent with Dual-LLM Verification Loops

By Rob Ragan ★ 79 Python Apr 3, 2026

Auto-Exploits: When AI Writes Security Exploits For You

By Rob Ragan ★ 83 Python Apr 3, 2026

Edgar: A Minimalist Python Gateway to SEC Financial Filings

By Rob Ragan ★ 104 Python Apr 3, 2026

GhostLine: Building a Voice-Cloning Vishing Framework with FastAPI, Twilio, and LLMs

By Rob Ragan ★ 112 Python Apr 3, 2026

Natural Language Playwright Tests with Anthropic's Computer Use API

By Rob Ragan ★ 115 TypeScript Apr 3, 2026

Azul: Building a Hundred-Million-Sample Malware Knowledge Base on Kubernetes

By Rob Ragan ★ 136 Unknown Apr 3, 2026

CewlAI: Teaching Language Models to Think Like a Pentester

By Rob Ragan ★ 137 Python Apr 3, 2026

Building Explainer Videos as Code: Inside Video Explainer's AI-Powered Pipeline

By Rob Ragan ★ 138 Python Apr 3, 2026

Building a Browser Extension Threat Intelligence Feed with Flat Files and Local-First Scanning

By Rob Ragan ★ 142 HTML Apr 3, 2026

CyberStrike: Turning Your Claude or GPT Subscription Into an Autonomous Penetration Testing Agent

By Rob Ragan ★ 151 TypeScript Apr 3, 2026

Nightwire: Ship Production Code from Your Phone via Signal

By Rob Ragan ★ 155 Python Apr 3, 2026

VulnBot: When AI Agents Learn to Think Like Penetration Testers

By Rob Ragan ★ 157 Python Apr 3, 2026

ARTKIT: Building Multi-Turn Adversarial Testing Pipelines for Gen AI Systems

By Rob Ragan ★ 165 Jupyter Notebook Apr 3, 2026

CVE-Bench: Testing Whether AI Agents Can Exploit Real-World Vulnerabilities

By Rob Ragan ★ 188 Python Apr 3, 2026

How Engine Simulator Synthesizes Authentic V8 Rumble from Physics, Not Samples

By Rob Ragan ★ 9.3k C++ Apr 3, 2026

MFASweep: Finding the Weak Links in Microsoft's MFA Enforcement Chain

By Rob Ragan ★ 1.6k PowerShell Apr 3, 2026