Offsec & AI Agent Tool Intelligence

// LATEST

LLM Engineering

BitNet: Running 100B Parameter Models on Your Laptop at Human Reading Speed

★ 38.9k May 9, 2026

Developer Tools

Gitingest: Turn Any GitHub Repository Into LLM-Ready Text With a URL Trick

★ 14.6k May 9, 2026

AI Dev Tools

ARTKIT: Why Enterprise Gen AI Testing Requires Adversarial Multi-Turn Conversations

★ 168 May 9, 2026

LLM Engineering

LLM-Check: Detecting Hallucinations by Reading Your Model's Mind

★ 40 May 9, 2026

LLM Engineering

Mapping LLM Safety as a Landscape: How Weight Perturbations Reveal the Fragility of Alignment

★ 40 May 9, 2026

Developer Tools

Extracting Neural Network Weights Through Black-Box Queries: A Cryptanalytic Attack Framework

★ 2 May 9, 2026

Developer Tools

Privacy Backdoors: When Pre-Trained Models Betray Your Training Data

★ 6 May 9, 2026

LLM Engineering

VERL: The Hybrid-Controller Framework Reshaping How We Train LLMs with Reinforcement Learning

★ 21.2k May 9, 2026 AI 35

AI Agents

AgentBoard: Why LLM Agent Benchmarks Need Multi-Turn Analysis, Not Just Success Rates

★ 413 May 9, 2026

Data & Knowledge

WrenAI: The Semantic Context Layer That Keeps LLMs From Wrecking Your Data Governance

★ 15.1k May 9, 2026

Developer Tools

Marker: How a Multi-Stage CV Pipeline Achieves 25 Pages/Second PDF Parsing

★ 34.8k May 9, 2026

AI Dev Tools

STRIDE GPT: How AI-Powered Threat Modeling Adapts to Agentic Systems

★ 1.0k May 9, 2026 AI 15

LLM Engineering

IB4LLMs: Using Information Bottleneck Theory to Build Jailbreak-Resistant Language Models

★ 27 May 9, 2026

Developer Tools

Teaching AI to Read API Docs: Inside OpenAPI MCP Server's Progressive Disclosure Pattern

★ 892 May 9, 2026

Starlog — Page 54

URET: Adversarial Testing for ML Models Beyond Images

RedCode: The First Real Safety Benchmark for Autonomous Code Agents

LangGround: Teaching AI Agents to Coordinate Like Humans, Not Vectors

Coderoller: Flattening Repositories Into LLM-Ready Markdown

SRMT: Teaching Robots to Share Their Thoughts Through Memory

Sponge Poisoning: The Stealth Attack That Makes Neural Networks Energy Vampires

Teaching LLMs to Predict the Future: World Models for Web Agents

NegMerge: Fixing Machine Unlearning's Hyperparameter Lottery Problem

Best-of-N Jailbreaking: How Sampling Beats Sophistication in LLM Attacks

Building Resumable LLM Evaluations: A Template for Rate-Limited API Testing