Offsec & AI Agent Tool Intelligence

// LATEST

AI Agents

Building a Unified AI Gateway: How IBM's ContextForge Federates MCP, REST, and Agent Protocols

★ 3.8k May 8, 2026

AI Agents

AgentAuditor: The Invisible Research Project That Might Transform AI Agent Verification

★ 4 May 8, 2026

Cybersecurity

RAPTOR: Building an Autonomous Security Agent from Claude Code and Adversarial Thinking

★ 2.8k May 8, 2026

AI Dev Tools

Happy: Monitoring AI Coding Agents From Your Phone Without Leaking Your Code

★ 21.4k May 8, 2026

LLM Engineering

TheAgentCompany: The First Real-World Benchmark That Makes AI Agents Look Bad

★ 715 May 8, 2026

Developer Tools

Training Web Agents Through Test-Time Interaction: Inside TTI's Filtered BC Approach

★ 75 May 8, 2026

Developer Tools

AGI SDK: Building Browser Agents Against Production-Quality Web Replicas

★ 409 May 8, 2026

AI Agents

SPORT: Teaching Multimodal Agents to Self-Improve Without Human Labels

★ 20 May 8, 2026

AI Agents

RF-Agent: Teaching Language Models to Design Reward Functions Through Tree Search

★ 11 May 8, 2026

LLM Engineering

SEC-bench: A NeurIPS Framework for Benchmarking LLM Agents Against Real Security Vulnerabilities

★ 77 May 8, 2026

AI Agents

ARTEMIS: Stanford's Multi-Agent Red Teaming System That Orchestrates LLMs to Hunt Vulnerabilities

★ 516 May 8, 2026

AI Agents

LatentMAS: How Multi-Agent Systems Learned to Think Without Speaking

★ 966 May 8, 2026

AI Dev Tools

HumanLayer: The Context Engineering Framework That's Mostly Vapor

★ 10.9k May 8, 2026

AI Agents

Maestro: Orchestrating Multiple AI Coding Agents with Git Worktrees and Batch Automation

★ 3.0k May 8, 2026

Cybersecurity

Teaching Machines to Hack: Inside AutoPentest-DRL's Reinforcement Learning Approach

★ 431 May 8, 2026

AI Agents

AG-UI: The Missing Protocol Between AI Agents and Real-Time User Interfaces

★ 13.4k May 8, 2026

AI Agents

Membrane: Building a Transparent Sandbox for AI Agents with eBPF and Nested Containers

★ 53 May 8, 2026

LLM Engineering

Heretic: Automatic Abliteration for Uncensoring Language Models

★ 20.6k May 8, 2026

AI Agents

Leash: Runtime Guardrails for AI Coding Agents Using eBPF and Cedar Policies

★ 555 May 8, 2026

AI Dev Tools

Serena: The MCP Toolkit That Turns AI Agents into IDE-Native Developers

★ 24.0k May 8, 2026

Starlog — Page 62

Superpowers: Teaching AI Agents to Stop Coding Like Caffeinated Interns

Stagehand: The Browser Automation SDK That Caches AI Actions Like Code

Steel Browser: The Open-Source Browser API That Lets AI Agents See the Web

HackingBuddyGPT: Teaching LLMs to Think Like Penetration Testers
