Sysdig Inspect: Forensic-Grade System Call Analysis Without the Command Line Chaos

Hook

Your container crashed at 3 AM. Your monitoring dashboard shows nothing. But somewhere in 4GB of binary system call traces lies the exact sequence of file writes, network connections, and process spawns that killed it. Finding that needle used to mean piping strace output through ten different grep commands.

Context

Traditional Linux troubleshooting has a fundamental gap between high-level monitoring and low-level tracing. Tools like Prometheus show you CPU spiked at 03:42:17, but not why. Tools like strace show you every system call, but dump thousands of lines per second to stdout with no context or correlation. For years, experts bridged this gap with elaborate shell scripts, parsing tools, and institutional knowledge about which syscalls matter.

Containers made this worse. When you have dozens of processes across multiple containers sharing a kernel, traditional process-focused tools lose context. Which container made that network connection? Which process wrote to that file descriptor? The sysdig project (originally created by Loris Degioanni, who also co-created WinPcap and was an early Wireshark contributor) solved capture with a kernel module that records everything with container context. But analyzing those captures meant learning sysdig’s CLI filter language or writing custom scripts. Sysdig Inspect emerged as the graphical microscope for these traces: a way to navigate gigabytes of syscall data the way Wireshark lets you navigate packet captures.

Technical Insight

[Figure: system architecture (auto-generated). Collection on the Linux system: Sysdig kernel module (syscalls, network, I/O) -> Sysdig CLI (binary trace) -> .scap capture file. Analysis in Sysdig Inspect: load capture -> SCAP parser (decoded events) -> Redux state store (filtered data) -> React UI components, hosted by the Electron runtime. Drill-down interface: overview tiles (click/filter) -> timeline view (select spike) -> system call details.]

Sysdig Inspect operates on a two-stage architecture that separates collection from analysis. On your Linux system, the sysdig CLI (with its kernel module) captures system activity into .scap files—binary traces containing every syscall, with configurable payload snaplen. Then Sysdig Inspect, which runs anywhere (including machines without the kernel module), loads these captures into an Electron-based interface built on React and Redux.

The killer feature is the tile-based drill-down workflow. When you open a capture, you see overview tiles: top processes by CPU, network connections, file I/O, container activity. Click a tile and you filter into a timeline view showing sub-second metrics. Click a spike in that timeline and you see the actual system calls. Click a syscall and you see the payload bytes. This turns trace archaeology from writing filter chains into visual navigation.

Here’s what a typical workflow looks like. First, capture on your Linux system:

# Capture 60 seconds with full container context
# -M 60 stops the capture after 60 seconds
# -s 8192 captures the first 8KB of each I/O buffer
sudo sysdig -M 60 -s 8192 -w capture.scap

# Or capture only specific containers
sudo sysdig -s 8192 -w capture.scap container.name=nginx

That .scap file contains everything: process lifecycle events, file descriptor operations, network activity, all timestamped with nanosecond precision and tagged with container metadata. The snaplen parameter (-s 8192) determines how many bytes of each read/write buffer get captured; sysdig’s default is only 80 bytes. Higher values give more complete payload visibility at the cost of larger files.
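To get a feel for the snaplen trade-off, here is a rough back-of-the-envelope model. It assumes the only thing snaplen changes is how many bytes of each I/O buffer are kept, and it ignores per-event headers and non-I/O events, so real .scap files will be larger. The workload numbers and the helper name captured_bytes are made up for illustration.

```python
def captured_bytes(buffer_sizes, snaplen):
    """Payload bytes kept when each I/O buffer is truncated to snaplen bytes."""
    return sum(min(size, snaplen) for size in buffer_sizes)


# Hypothetical workload: 10,000 small log writes and 500 large 64 KiB responses.
buffer_sizes = [200] * 10_000 + [65_536] * 500
total = sum(buffer_sizes)

# 80 is sysdig's default snaplen; 8192 matches the -s 8192 used in this article.
for snaplen in (80, 8192, 65_536):
    kept = captured_bytes(buffer_sizes, snaplen)
    print(f"snaplen={snaplen:>6}: {kept:>10,} payload bytes kept ({kept / total:.0%})")
```

Even in this simplified model, the large buffers dominate: raising snaplen barely changes what you keep from small writes but multiplies what you keep from big ones.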

In Sysdig Inspect, the visual query model replaces sysdig’s CLI filter syntax. Instead of remembering that the filter “evt.type=write and fd.name contains /var/log” selects log writes, you click the Files tile, sort by write bytes, and select /var/log entries. The interface constructs these filters visually as you navigate. Power users can still type filters directly, but the UI makes them discoverable.
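One way to picture “constructing filters as you navigate” is as a list of selections joined with “and”. The sketch below is not Sysdig Inspect’s code; build_filter and render are hypothetical helpers, and only the filter fields themselves (evt.type, fd.name, container.name) come from sysdig’s filter language.

```python
def render(field, op, value):
    """Render one selection; '=' is written compactly, other operators spaced."""
    if op == "=":
        return f"{field}={value}"
    return f"{field} {op} {value}"


def build_filter(selections):
    """Join (field, operator, value) selections into one sysdig filter string."""
    return " and ".join(render(*s) for s in selections)


# Each click in a drill-down adds one more condition to the path.
drill_down = [
    ("evt.type", "=", "write"),
    ("fd.name", "contains", "/var/log"),
]
print(build_filter(drill_down))

# Narrowing further to one container appends another condition.
print(build_filter(drill_down + [("container.name", "=", "nginx")]))
```

The resulting string is the same kind of filter you could pass on the command line when reading a capture back, for example sysdig -r capture.scap followed by the quoted filter.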

The timeline view reveals patterns invisible to traditional metrics. Standard monitoring samples every 10-60 seconds, averaging away burst behavior. Sysdig Inspect shows you what happened in each 100-millisecond window. That database query that “takes 2 seconds on average”? The timeline might show it blocking for 50ms every 200ms, a pattern that often points to lock contention. That network connection that shows steady throughput? It may actually be bursty writes followed by 500ms pauses, which suggests a flow-control problem worth investigating.
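The effect of averaging is easy to reproduce with synthetic data. The sketch below buckets made-up (timestamp, bytes) events into 100 ms windows and compares the per-window average with the peak; the trace, the function name bytes_per_window, and the event shape are all assumptions for illustration.

```python
from collections import Counter

WINDOW_NS = 100_000_000  # 100 ms expressed in nanoseconds


def bytes_per_window(events, window_ns=WINDOW_NS):
    """Sum bytes per fixed-width window; events are (timestamp_ns, nbytes)."""
    totals = Counter()
    for ts_ns, nbytes in events:
        totals[ts_ns // window_ns] += nbytes
    return totals


# Synthetic trace: every second, 100 writes of 10,000 bytes land within the
# first 100 ms, then nothing happens for the remaining 900 ms. Ten seconds total.
events = [
    (sec * 1_000_000_000 + i * 1_000_000, 10_000)
    for sec in range(10)
    for i in range(100)
]

windows = bytes_per_window(events)
duration_windows = 100  # 10 seconds split into 100 ms windows
average = sum(windows.values()) / duration_windows
peak = max(windows.values())

print(f"active windows: {len(windows)} of {duration_windows}")
print(f"average per window: {average:,.0f} bytes, peak window: {peak:,} bytes")
```

A metric sampled once per 10 seconds would report only the average; the per-window view shows that all the traffic is concentrated in a tenth of the windows.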

The payload inspection capability turns Sysdig Inspect into a security investigation tool. Because the .scap file captured actual I/O buffers (up to your snaplen), you can reconstruct what a process actually did. Select a write() syscall to /etc/cron.d/ and see the exact crontab entry an attacker added. Follow a connect() syscall to the writes that come after it on the same socket and see the first bytes of the TLS handshake. This is why security teams use it for incident response: it’s a black box recorder for your entire system.
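Conceptually, reconstructing what was written to a file means collecting the captured write payloads for that path and stitching them together in time order. The sketch below works on a hypothetical WriteEvent record, not on sysdig’s on-disk format, and it assumes sequential writes (no seeks or pwrite offsets) whose buffers fit within the snaplen.

```python
from dataclasses import dataclass


@dataclass
class WriteEvent:
    """Hypothetical decoded event; not sysdig's actual record layout."""
    ts_ns: int
    fd_name: str
    data: bytes  # payload as captured, already truncated to snaplen


def reconstruct(events, path_prefix):
    """Concatenate captured write payloads per file, in timestamp order."""
    out = {}
    for ev in sorted(events, key=lambda e: e.ts_ns):
        if ev.fd_name.startswith(path_prefix):
            out[ev.fd_name] = out.get(ev.fd_name, b"") + ev.data
    return out


# Made-up events, deliberately out of order to show the sort matters.
events = [
    WriteEvent(2_000, "/etc/cron.d/backup", b"root /tmp/x.sh\n"),
    WriteEvent(1_000, "/etc/cron.d/backup", b"* * * * * "),
    WriteEvent(1_500, "/var/log/app.log", b"started\n"),
]

for path, content in reconstruct(events, "/etc/cron.d/").items():
    print(path, content)
```

If a buffer was larger than the snaplen, the reconstruction has gaps, which is one reason investigators tend to capture with a generous snaplen.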

Under the hood, Sysdig Inspect uses a columnar storage approach for the in-memory representation of trace data. When you load a multi-gigabyte .scap file, it parses the binary format into typed columns: timestamps, process IDs, syscall types, buffer contents. React components then query these structures with filter predicates that compile to efficient scans. The Redux store maintains your drill-down path, so you can navigate back through your investigation. This architecture enables responsive UI even on captures containing millions of events.
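To make the columnar idea concrete, here is a toy column store with predicate scans. It is a generic illustration of the technique described above, not code from Sysdig Inspect; the class EventColumns, its field names, and the scan API are invented for this sketch.

```python
from array import array


class EventColumns:
    """Toy column store: one typed array (or list) per event field."""

    def __init__(self):
        self.ts_ns = array("Q")   # unsigned 64-bit timestamps
        self.pid = array("I")     # unsigned 32-bit process IDs
        self.evt_type = []        # syscall names as strings
        self.nbytes = array("Q")  # bytes transferred by the call

    def append(self, ts_ns, pid, evt_type, nbytes):
        self.ts_ns.append(ts_ns)
        self.pid.append(pid)
        self.evt_type.append(evt_type)
        self.nbytes.append(nbytes)

    def scan(self, predicates):
        """Return row indices where every per-column predicate holds."""
        rows = range(len(self.ts_ns))
        for name, pred in predicates.items():
            column = getattr(self, name)
            # Each pass touches one column and only the surviving rows.
            rows = [i for i in rows if pred(column[i])]
        return list(rows)


cols = EventColumns()
cols.append(1, 100, "write", 512)
cols.append(2, 100, "read", 64)
cols.append(3, 200, "write", 4096)
cols.append(4, 100, "write", 128)

hits = cols.scan({"pid": lambda p: p == 100, "evt_type": lambda t: t == "write"})
print(hits, sum(cols.nbytes[i] for i in hits))
```

The appeal of the layout is that a filter on pid never has to read buffer contents: each predicate walks a single compact column and narrows the candidate rows for the next one.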

Gotcha

The biggest gotcha is the two-tool workflow itself. You must collect captures on Linux with the sysdig kernel module installed, which requires root access and kernel headers. Kubernetes environments often have locked-down node access, and some organizations prohibit kernel module installation entirely. You cannot point Sysdig Inspect at a live system—it only analyzes pre-recorded .scap files. This makes it fantastic for planned troubleshooting (“let’s capture during the deployment and analyze after”) but frustrating for unexpected incidents (“I wish we’d been capturing when that happened”).

File size becomes problematic quickly with high snaplen values. Capturing 60 seconds of a busy web server with full payload snaplen can generate multi-gigabyte files. These take time to parse, consume significant RAM when loaded, and make the UI sluggish during timeline rendering. You’ll often make multiple captures: first with low snaplen to identify the timeframe, then targeted captures with high snaplen. The tool doesn’t stream or partially load large files—it’s all or nothing. Also, while Sysdig Inspect parses many common protocols in the payload view (HTTP, DNS, etc.), it’s not as comprehensive as Wireshark’s dissector library. You’ll sometimes stare at hex dumps when protocol-level parsing would help.

Verdict

Use if: You’re troubleshooting complex issues in containerized Linux environments where you need to correlate metrics across process, network, and filesystem activity at sub-second granularity. It’s invaluable for security incident response when you need to reconstruct exactly what a compromised container did, for performance archaeology when symptoms appear and disappear faster than your metrics sample rate, or when you need actual payload inspection to debug application-level protocol issues. The visual drill-down interface dramatically reduces time-to-insight compared to command-line trace analysis.

Skip if: You need real-time monitoring and alerting (use Falco or commercial Sysdig instead), the systems you need to troubleshoot run Windows or macOS (captures can only be taken on Linux, even though the analysis UI runs elsewhere), you cannot install kernel modules in your environment, or your troubleshooting needs are met by existing metrics and logs. Also skip if your security requirements prohibit capturing potentially sensitive data in syscall payloads. Remember that high snaplen values capture everything, including passwords and API keys that happen to flow through system calls.
