Stratosphere: Using Password-Cracking Algorithms to Find Exposed Cloud Storage Buckets

Hook

Password crackers and cloud security researchers face the same problem: predicting what humans will name things. Stanford's Stratosphere applies John the Ripper's playbook to find exposed S3 buckets.

Context

Cloud storage misconfigurations are among the most persistent security vulnerabilities in modern infrastructure. Every few months, headlines announce another incident: Capital One's roughly 100 million customer records taken from S3 after an attacker exploited a misconfigured firewall, Tesla's unsecured Kubernetes console leaking cloud credentials, or government contractors leaving classified data publicly accessible. The problem is rarely a sophisticated attack. It is usually simple human error combined with access controls that are easy to misconfigure: S3 buckets are private by default, but a single permissive policy or ACL can expose them.

Traditional bucket discovery tools rely on brute-force dictionary attacks or simple permutations of company names. If you want to find Acme Corp's buckets, you try "acme", "acme-prod", "acme-backup", and so on. This approach misses most buckets because humans create names with identifiable patterns that simple dictionaries can't capture. Researchers at Stanford's Empirical Security Research Group recognized that this problem mirrors password cracking: given a set of known passwords (or bucket names), how do you generate candidates that match human naming patterns? Stratosphere applies decades of password research (probabilistic context-free grammars, neural networks, and Markov models) to cloud storage enumeration, creating a research framework that discovered thousands of publicly accessible buckets across AWS, Google Cloud, and Alibaba Cloud.

Technical Insight

Stratosphere's architecture consists of three distinct phases that mirror offensive security reconnaissance: extraction, validation, and generation. The extraction phase gathers seed bucket names from passive sources, the validation phase confirms which candidates are real and accessible, and the generation phase uses machine learning to create new candidates based on discovered patterns.

The extraction pipeline pulls bucket names from multiple intelligence sources. It queries Bing's API for strings matching cloud storage URL patterns, parses Farsight Security's passive DNS data for S3 endpoints, scrapes GrayHat Warfare's bucket index, and pulls VirusTotal's domain intelligence. This multi-source approach builds a foundation of real-world bucket names that exhibit actual human naming patterns rather than theoretical combinations.
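
As a rough illustration of the extraction step, the sketch below pulls bucket names out of free text such as search results or passive DNS records. The regular expressions and the `extract_bucket_names` helper are assumptions made for this example, not Stratosphere's actual extractor, and they cover only a few common URL styles.

```python
import re

# Hostname characters that must NOT precede a match, so that a pattern only
# fires at the start of a hostname (hypothetical rule for this sketch).
_HOST_START = r"(?<![a-z0-9.-])"

# Illustrative patterns only; real-world URL styles are more varied.
BUCKET_PATTERNS = {
    "aws": [
        # Virtual-hosted style: <bucket>.s3.amazonaws.com or <bucket>.s3.<region>.amazonaws.com
        re.compile(_HOST_START + r"([a-z0-9][a-z0-9.-]{1,61}[a-z0-9])\.s3(?:[.-][a-z0-9-]+)?\.amazonaws\.com"),
        # Path style: s3.amazonaws.com/<bucket>
        re.compile(_HOST_START + r"s3(?:[.-][a-z0-9-]+)?\.amazonaws\.com/([a-z0-9][a-z0-9.-]{1,61}[a-z0-9])"),
    ],
    "gcp": [
        # Path style: storage.googleapis.com/<bucket>
        re.compile(_HOST_START + r"storage\.googleapis\.com/([a-z0-9][a-z0-9._-]{1,61}[a-z0-9])"),
    ],
    "alibaba": [
        # Virtual-hosted style: <bucket>.oss-<region>.aliyuncs.com
        re.compile(_HOST_START + r"([a-z0-9][a-z0-9-]{1,61}[a-z0-9])\.oss-[a-z0-9-]+\.aliyuncs\.com"),
    ],
}


def extract_bucket_names(text):
    """Return a set of (provider, bucket_name) pairs found in free text."""
    lowered = text.lower()  # bucket names are case-insensitive in hostnames
    found = set()
    for provider, patterns in BUCKET_PATTERNS.items():
        for pattern in patterns:
            for match in pattern.finditer(lowered):
                found.add((provider, match.group(1)))
    return found
```

Calling `extract_bucket_names` on a page of scraped URLs yields seed names tagged by provider, which can then be queued for validation. The negative lookbehind keeps the path-style patterns from mistaking an object key for a bucket name when the URL is actually virtual-hosted.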

Validation happens through a distributed queueing system built on Beanstalk. When candidate bucket names arrive, they're pushed to a queue processed by a Go-based listener that coordinates ZMap and ZGrab2 scans. This architecture exists because cloud providers aggressively rate-limit bucket enumeration attempts from single IPs. The following simplified Python sketch illustrates the logic of such a validation worker (the listener itself is written in Go):

import greenstalk
import subprocess
import json

class BucketValidator:
    def __init__(self, beanstalk_host='localhost', port=11300):
        self.queue = greenstalk.Client((beanstalk_host, port))
        self.queue.watch('bucket_candidates')
        
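    def validate_s3_bucket(self, bucket_name):
        """Probe an S3 bucket with ZGrab2's http module and read the status code"""
        host = f"{bucket_name}.s3.amazonaws.com"

        # ZGrab2 reads targets from stdin and writes one JSON object per
        # target to stdout, so the hostname is passed as input, not as a flag.
        result = subprocess.run(
            ['zgrab2', 'http', '--input-file=-', '--output-file=-'],
            input=f"{host}\n".encode(),
            capture_output=True
        )

        lines = result.stdout.splitlines()
        if not lines:
            return {'exists': False}
        try:
            response = json.loads(lines[0])
        except json.JSONDecodeError:
            return {'exists': False}

        # Assumed ZGrab2 layout: data.http.result.response.status_code
        node = response
        for key in ('data', 'http', 'result', 'response'):
            node = node.get(key) or {}
        status = node.get('status_code')

        # 200 = public listing, 403 = exists but forbidden, 404 = no such bucket
        if status in (200, 403):
            return {
                'exists': True,
                'public': status == 200,
                'bucket': bucket_name
            }
        return {'exists': False}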
    
    def process_queue(self):
        while True:
            job = self.queue.reserve()
            candidate = job.body
            result = self.validate_s3_bucket(candidate)
            
            if result['exists']:
                self.queue.use('validated_buckets')
                self.queue.put(json.dumps(result))
            
            self.queue.delete(job)

The generation phase is where Stratosphere differentiates itself from conventional bucket enumeration tools. Rather than simple string manipulation, it implements three algorithmic approaches borrowed from password cracking research. The LSTM (Long Short-Term Memory) neural network learns character-level patterns from validated bucket names, treating bucket naming as a sequence prediction problem. Feed it "acme-prod-2023" and similar corporate naming patterns, and it generates candidates like "acme-staging-2024" or "acme-dev-2023" based on learned positional character probabilities.

The PCFG (Probabilistic Context-Free Grammar) approach segments bucket names into token types—digits, lowercase letters, special characters—and learns the grammar of how these segments combine. A bucket name like "company-data-01" becomes a structure pattern L+S+L+S+D+ (lowercase string, separator, lowercase string, separator, digits). The system then generates new names following high-probability structural patterns with different content.
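
To make the segmentation concrete, here is a minimal sketch of the idea, assuming a type-only grammar in which "company-data-01" maps to the structure L, S, L, S, D. The `segment` function and `PCFGSketch` class are hypothetical names for this example. Password-cracking PCFGs typically also condition on segment length, which this sketch omits for brevity.

```python
import random
import re
from collections import Counter, defaultdict

# L = lowercase letters, D = digits, S = anything else (separators)
SEGMENT_RE = re.compile(r"(?P<L>[a-z]+)|(?P<D>[0-9]+)|(?P<S>[^a-z0-9]+)")


def segment(name):
    """Split a name into (type, text) runs, e.g. ("L", "company")."""
    return [(m.lastgroup, m.group()) for m in SEGMENT_RE.finditer(name.lower())]


class PCFGSketch:
    def __init__(self):
        self.structures = Counter()            # ("L","S","L","S","D") -> count
        self.terminals = defaultdict(Counter)  # "L" -> {"company": 3, ...}

    def train(self, names):
        """Count structures and the strings seen for each segment type."""
        for name in names:
            segments = segment(name)
            if not segments:
                continue
            self.structures[tuple(kind for kind, _ in segments)] += 1
            for kind, text in segments:
                self.terminals[kind][text] += 1

    def generate(self, count, rng=random):
        """Sample a structure by frequency, then fill each slot by frequency."""
        if not self.structures:
            return []
        structures = list(self.structures)
        weights = list(self.structures.values())
        names = []
        for _ in range(count):
            structure = rng.choices(structures, weights=weights)[0]
            parts = []
            for kind in structure:
                options = self.terminals[kind]
                parts.append(rng.choices(list(options), weights=list(options.values()))[0])
            names.append("".join(parts))
        return names
```

Trained on "acme-prod-01" and "acme-dev-02", this sketch can emit recombinations such as "acme-dev-01" or "prod-acme-02": the structure is preserved while the content of each slot varies.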

The n-gram implementation uses Markov models to predict character sequences. Here's a simplified version of the trigram generator:

from collections import defaultdict, Counter
import random

class TrigramGenerator:
    def __init__(self):
        self.trigrams = defaultdict(Counter)
    
    def train(self, bucket_names):
        """Build trigram frequency model from seed buckets"""
        for name in bucket_names:
            # Add start/end markers
            padded = f"^^{name}$$"
            
            for i in range(len(padded) - 2):
                context = padded[i:i+2]
                next_char = padded[i+2]
                self.trigrams[context][next_char] += 1
    
    def generate(self, max_length=63, count=1000):
        """Generate new bucket name candidates"""
        candidates = []
        
        for _ in range(count):
            name = "^^"
            
            while len(name) < max_length + 2:
                context = name[-2:]
                
                if context not in self.trigrams:
                    break
                
                # Weight by frequency
                chars = list(self.trigrams[context].keys())
                weights = list(self.trigrams[context].values())
                next_char = random.choices(chars, weights=weights)[0]
                
                if next_char == '$':
                    break
                    
                name += next_char
            
            # Strip markers and validate
            candidate = name.replace('^', '').replace('$', '')
            if 3 <= len(candidate) <= 63:  # S3 naming constraints
                candidates.append(candidate)
        
        return candidates

This trigram approach captures patterns like corporate naming conventions (company-environment-service), date patterns (backup-2023-01), and common separators. When trained on thousands of real bucket names from the extraction phase, it generates candidates that feel "human" rather than random.
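
The length check in the generator covers only part of S3's naming rules, so a further filter can discard candidates that could never be valid buckets before they reach the scanner. The sketch below is an assumption about how such a filter might look. It encodes a subset of the documented S3 rules, and `is_plausible_s3_name` is a hypothetical helper name.

```python
import re

# 3-63 characters; lowercase letters, digits, dots, hyphens;
# must start and end with a letter or digit.
_NAME_RE = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")
# Names formatted like an IPv4 address are not allowed.
_IP_RE = re.compile(r"\d{1,3}(?:\.\d{1,3}){3}")


def is_plausible_s3_name(name):
    """Check a candidate against a subset of S3 bucket naming rules."""
    if not _NAME_RE.fullmatch(name):
        return False
    if ".." in name:                       # no adjacent periods
        return False
    if _IP_RE.fullmatch(name):
        return False
    if name.startswith(("xn--", "sthree-")):   # reserved prefixes
        return False
    if name.endswith(("-s3alias", "--ol-s3")):  # reserved suffixes
        return False
    return True
```

Filtering generated candidates with `[c for c in candidates if is_plausible_s3_name(c)]` avoids spending scan budget on names that the provider would reject outright.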

The research methodology includes comparing hit rates across generation algorithms. In Stanford's published results, the LSTM approach achieved the highest discovery rate for new buckets, finding valid targets in approximately 0.3% of generated candidates—dramatically better than pure dictionary attacks. The PCFG approach excelled at generating variants of known organizational patterns, while n-grams provided the fastest generation for high-volume scanning.
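
A hit rate in this sense is simply the fraction of distinct generated candidates that turn out to exist. The sketch below shows one way to compute it per generator. The function names and the `is_valid` callback are assumptions for illustration, and the numbers in the usage note are made up, not results from the paper.

```python
def hit_rate(candidates, is_valid):
    """Fraction of distinct candidates for which is_valid(name) is true."""
    unique = set(candidates)  # duplicates would otherwise skew the rate
    if not unique:
        return 0.0
    hits = sum(1 for name in unique if is_valid(name))
    return hits / len(unique)


def compare_generators(candidates_by_generator, is_valid):
    """Map each generator label to its hit rate."""
    return {
        label: hit_rate(candidates, is_valid)
        for label, candidates in candidates_by_generator.items()
    }
```

For example, if a generator emits 10,000 distinct names and 30 of them validate, its hit rate is 30 / 10,000 = 0.3%. In practice `is_valid` would wrap the validation worker; for offline experiments it can be a lookup against a held-out set of known buckets.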

Gotcha

Stratosphere is fundamentally a research artifact, not a production security tool, and this creates significant deployment friction. The setup requires configuring ZMap, ZGrab2, Beanstalk message queues, Go compilation environments, and multiple Python dependencies. You need to provision source IPs for distributed scanning to avoid rate limiting, configure API keys for multiple intelligence sources (some requiring paid access), and understand the legal implications of large-scale network scanning. The repository documentation assumes academic research context and provides minimal guidance on ethical boundaries or compliance with computer fraud laws.

The generation algorithms require substantial seed data to produce useful results. Training an LSTM on a few hundred bucket names won't capture meaningful patterns; you need thousands of examples to learn organizational naming conventions. This creates a bootstrapping problem: you need to find many buckets before you can effectively find more buckets. The published research had access to Farsight's complete passive DNS database and commercial threat intelligence feeds that individual researchers can't easily replicate. Additionally, cloud providers have evolved their security defaults and detection capabilities since Stratosphere's publication. AWS now warns about public buckets, enables Block Public Access by default on newly created buckets, and likely flags enumeration patterns that Stratosphere's traffic generates. The tool's effectiveness on modern cloud infrastructure is uncertain without active maintenance.

Verdict

Use Stratosphere if you're conducting academic security research on cloud misconfiguration patterns, need a reproducible implementation of published methodology for comparative studies, or want to understand how algorithmic generation outperforms dictionary attacks in security contexts. It's valuable for learning how password-cracking techniques translate to other domains and for researchers building on Stanford's work. Skip it if you need a production security tool for auditing your own infrastructure (use cloud-native security tools instead), lack the legal authorization and infrastructure for large-scale Internet scanning, want a low-friction bucket discovery tool (lighter alternatives exist), or need actively maintained software compatible with current cloud provider APIs. This tool requires significant technical expertise, ethical clearance, and resource investment to deploy meaningfully. Security practitioners should view it as academic literature with code rather than operational tooling.
