Creepy: The OSINT Tool That Exposed Social Media's Geolocation Nightmare


Hook

Before GDPR and privacy controls, a single Python script could map anyone's daily movements by simply aggregating their public social media posts. That script was called Creepy, and it terrified an entire generation into understanding metadata.

Context

In the early 2010s, social media platforms treated location privacy as an afterthought. Twitter attached exact coordinates to tweets once a user switched geotagging on, and many users never switched it off. Photo-sharing sites such as Flickr preserved the GPS coordinates that cameras wrote into EXIF data, and Instagram attached location tags to photos. Facebook check-ins broadcast precise locations to anyone who cared to look. Users happily shared their morning coffee spots, gym check-ins, and vacation destinations without understanding that they were creating a digital breadcrumb trail of their entire lives.

Creepy emerged in this landscape as both a wake-up call and a practical OSINT (Open Source Intelligence) tool. Developed by Ioannis Kakavas, it automated what security researchers had been doing manually: correlating publicly available location data from multiple social platforms to build comprehensive movement profiles. The tool's deliberately unsettling name wasn't hyperbole—watching someone's life unfold on a map through aggregated social media posts was genuinely creepy. It became a demonstration tool for privacy advocates, a research instrument for security professionals, and a catalyst for the privacy reforms that followed.

Technical Insight

[Figure: System architecture (auto-generated). Plugins: Twitter, Flickr, and Foursquare plugins supplying geotagged posts, EXIF data, and check-ins. Core engine: plugin manager (fed by the username/target from user input), location aggregator (coordinates + metadata), and data processor (normalized locations). Presentation layer: PyQt GUI, map visualization, and export engine (CSV/KML). External tools: Google Maps.]

Creepy's architecture follows a plugin-based scraper design that remains instructive for OSINT tool developers. At its core, the application separates data acquisition (platform-specific plugins), data processing (coordinate extraction and normalization), and visualization (PyQt GUI with mapping integration). Each social network plugin implements a common interface, making the system extensible without modifying the core engine.

The plugin architecture looked something like this:

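class SocialNetworkPlugin:
    """Common interface that each platform-specific plugin subclasses."""

    def __init__(self, config):
        self.name = "PluginName"  # overridden by each subclass
        self.config = config
        self.api_client = None    # created by the subclass from config

    def search_targets(self, username):
        """Return list of target profiles matching username"""
        raise NotImplementedError

    def fetch_posts(self, target):
        """Return the target's posts (platform-specific)"""
        raise NotImplementedError

    def get_locations(self, target):
        """Extract location data from target's posts"""
        locations = []
        posts = self.fetch_posts(target)

        for post in posts:
            # Posts without coordinates are skipped
            if post.has_geolocation():
                locations.append({
                    'latitude': post.lat,
                    'longitude': post.lng,
                    'timestamp': post.created_at,
                    'context': post.text,
                    'source': self.name
                })

        return locations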
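
A core engine built on this interface could fan a target out to every registered plugin and merge the results into one timeline. The sketch below is an assumption about how that might look, not Creepy's actual code: `PluginManager`, `StaticPlugin`, and the sample data are hypothetical names used only for illustration, and the stand-in plugin returns canned records instead of calling any network API.

```python
class PluginManager:
    """Hypothetical registry that sends one target to every plugin."""

    def __init__(self):
        self._plugins = {}

    def register(self, plugin):
        # Keyed by name, so registering the same name twice replaces the old plugin.
        self._plugins[plugin.name] = plugin

    def collect(self, target):
        """Merge locations from all plugins, oldest first."""
        merged = []
        for plugin in self._plugins.values():
            try:
                merged.extend(plugin.get_locations(target))
            except Exception as exc:  # one failing plugin should not stop the rest
                print(f"{plugin.name} failed: {exc}")
        # ISO 8601 strings in the same timezone sort chronologically as text.
        return sorted(merged, key=lambda loc: loc["timestamp"])


class StaticPlugin:
    """Stand-in plugin that returns canned data (no network access)."""

    def __init__(self, name, locations):
        self.name = name
        self._locations = locations

    def get_locations(self, target):
        return [dict(loc, source=self.name) for loc in self._locations]


manager = PluginManager()
manager.register(StaticPlugin("twitter", [
    {"latitude": 37.77, "longitude": -122.42, "timestamp": "2013-05-02T09:00:00"},
]))
manager.register(StaticPlugin("flickr", [
    {"latitude": 37.80, "longitude": -122.41, "timestamp": "2013-05-01T18:30:00"},
]))
timeline = manager.collect("example_user")
```

Keeping the manager ignorant of platform details is what makes the design extensible: adding a new source means registering one more object that exposes `name` and `get_locations`.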

The real intelligence lay in how Creepy handled metadata extraction. Social platforms embedded location data in multiple formats: explicit geotags in API responses, EXIF data in images, location names in text that required geocoding, and IP-based approximations. The tool normalized these disparate formats into a unified location schema with coordinates, timestamps, and context.
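
To make the normalization step concrete, here is a minimal sketch of a unified schema with two converters. Everything in it is illustrative rather than taken from Creepy: the `Location` dataclass, the input dictionary shapes, and the function names are assumptions. The one fixed convention it relies on is that EXIF stores GPS positions as degrees, minutes, and seconds with a hemisphere reference, and capture times as `YYYY:MM:DD HH:MM:SS`.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Location:
    latitude: float
    longitude: float
    timestamp: datetime
    context: str
    source: str
    precision: str  # "precise" or "approximate"


def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert degrees/minutes/seconds plus hemisphere to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    return -value if ref in ("S", "W") else value


def from_geotag(record, source):
    """Normalize a hypothetical API record that already carries decimal coordinates."""
    return Location(
        latitude=float(record["lat"]),
        longitude=float(record["lng"]),
        timestamp=datetime.fromtimestamp(record["epoch"], tz=timezone.utc),
        context=record.get("text", ""),
        source=source,
        precision="precise",
    )


def from_exif(gps, taken_at, source):
    """Normalize hypothetical, already-parsed EXIF GPS fields."""
    # EXIF capture times carry no timezone; treating them as UTC is an assumption.
    when = datetime.strptime(taken_at, "%Y:%m:%d %H:%M:%S").replace(tzinfo=timezone.utc)
    return Location(
        latitude=dms_to_decimal(*gps["lat"], gps["lat_ref"]),
        longitude=dms_to_decimal(*gps["lng"], gps["lng_ref"]),
        timestamp=when,
        context="",
        source=source,
        precision="precise",
    )


tweet = from_geotag(
    {"lat": "37.7749", "lng": "-122.4194", "epoch": 1367485200, "text": "coffee"},
    "twitter",
)
photo = from_exif(
    {"lat": (37, 48, 0), "lat_ref": "N", "lng": (122, 24, 36), "lng_ref": "W"},
    "2013:05:01 18:30:00",
    "flickr",
)
```

Once every source produces the same record type, sorting, mapping, and exporting no longer need to know which platform a point came from.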

For Twitter's API (in its v1.1 incarnation), Creepy would extract both explicit coordinates and place objects. The distinction mattered because a tweet geotagged at exact coordinates revealed "I am here now," while a place object like "San Francisco, CA" provided lower precision. The tool preserved both for analyst interpretation:

def parse_tweet_location(tweet):
    """Return a location dict for a v1.1 tweet; no lat/lng keys if it has no geodata."""
    location = {}
    
    # Precise coordinates (if user enabled precise location)
    if tweet.get('coordinates'):
        # GeoJSON point: order is [longitude, latitude]
        coords = tweet['coordinates']['coordinates']
        location['type'] = 'precise'
        location['lng'] = coords[0]
        location['lat'] = coords[1]
    
    # Place object (neighborhood, city level)
    elif tweet.get('place'):
        place = tweet['place']
        # Use bounding box centroid for approximate location
        # (average of the polygon's corner points, each [longitude, latitude])
        bbox = place['bounding_box']['coordinates'][0]
        location['type'] = 'approximate'
        location['lng'] = sum(point[0] for point in bbox) / len(bbox)
        location['lat'] = sum(point[1] for point in bbox) / len(bbox)
        location['place_name'] = place['full_name']
    
    location['timestamp'] = tweet['created_at']
    location['content'] = tweet['text']
    
    return location

The visualization layer used PyQt to render an interactive map with clustered location markers. Clicking a marker revealed the associated post content, timestamp, and source platform. This contextualization transformed raw coordinates into a narrative—not just "this person was at these coordinates," but "they posted about their workout at the downtown gym every Tuesday morning."

Creepy's export functionality demonstrated thoughtful design for investigative workflows. CSV exports included all metadata fields for statistical analysis. KML exports created geographic markup compatible with Google Earth and professional GIS tools, complete with proper timestamp formatting for temporal animation. This allowed analysts to visualize movement patterns over time, revealing routines, frequently visited locations, and travel patterns.
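
A KML export of this kind can be sketched with the standard library alone. The function below is an illustration, not Creepy's exporter: `locations_to_kml` and the input dictionary keys are assumptions. It relies on two KML conventions: a Placemark's `<TimeStamp><when>` element holds an ISO 8601 time, which viewers such as Google Earth use for the time slider, and `<coordinates>` lists longitude before latitude.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def locations_to_kml(locations):
    """Build a KML document string with one timestamped Placemark per location."""
    # Register KML as the default namespace so tags serialize without a prefix.
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for loc in locations:
        placemark = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = loc["source"]
        ET.SubElement(placemark, f"{{{KML_NS}}}description").text = loc["context"]
        # The timestamp drives temporal animation in KML viewers.
        stamp = ET.SubElement(placemark, f"{{{KML_NS}}}TimeStamp")
        ET.SubElement(stamp, f"{{{KML_NS}}}when").text = loc["timestamp"]
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML coordinate order is longitude,latitude.
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = (
            f"{loc['longitude']},{loc['latitude']}"
        )
    return ET.tostring(kml, encoding="unicode")


kml_text = locations_to_kml([
    {
        "latitude": 37.77,
        "longitude": -122.42,
        "timestamp": "2013-05-02T09:00:00Z",
        "context": "morning coffee",
        "source": "twitter",
    },
])
```

Building the tree with ElementTree rather than string formatting means post text containing `<` or `&` is escaped automatically.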

The tool also implemented rudimentary pattern analysis, identifying clusters of locations (home, work, frequent venues) and flagging unusual locations that deviated from established patterns. This was basic compared to modern machine learning approaches, but effective for highlighting anomalies in location histories.
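
The article does not say which algorithm Creepy used, so the sketch below substitutes a simple greedy, radius-based clustering to show the idea: points within a fixed distance of a cluster's first point join that cluster, and clusters with few visits are treated as unusual. The function names, the 0.5 km radius, and the three-visit threshold are all illustrative assumptions.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lng) pairs."""
    lat1, lng1, lat2, lng2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lng2 - lng1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def cluster_points(points, radius_km=0.5):
    """Greedy clustering: a point joins the first cluster whose seed is within radius_km."""
    clusters = []  # each cluster is a list of points; its first element is the seed
    for point in points:
        for cluster in clusters:
            if haversine_km(cluster[0], point) <= radius_km:
                cluster.append(point)
                break
        else:
            # No existing cluster was close enough, so this point seeds a new one.
            clusters.append([point])
    return clusters


def split_routine_and_unusual(clusters, min_visits=3):
    """Clusters with at least min_visits points are routine; the rest are unusual."""
    routine = [c for c in clusters if len(c) >= min_visits]
    unusual = [c for c in clusters if len(c) < min_visits]
    return routine, unusual


points = [
    (37.7749, -122.4194), (37.7751, -122.4190), (37.7747, -122.4197),  # one spot
    (37.8044, -122.2712), (37.8046, -122.2710), (37.8043, -122.2715),  # second spot
    (40.7128, -74.0060),                                               # one-off trip
]
routine, unusual = split_routine_and_unusual(cluster_points(points))
```

Greedy seeding depends on input order and can split a venue that straddles the radius, which is acceptable for a demonstration but is one reason later tools moved to density-based methods.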

Gotcha

Creepy is functionally dead for production use, and that's critical to understand. Every social platform it targeted has since locked down API access or deprecated the endpoints that exposed location data. Twitter's v1.1 API was sunset. Instagram restricted third-party access to location data. Facebook completely overhauled permissions after Cambridge Analytica. The scraping techniques that worked in 2013 now trigger rate limits, CAPTCHA challenges, and account suspensions.

Beyond technical obsolescence, there are serious ethical and legal concerns. Using tools like Creepy without explicit authorization crosses into surveillance territory. Many jurisdictions now have laws specifically prohibiting unauthorized tracking or data aggregation. Even for legitimate security research, the barrier for informed consent and ethical approval is considerably higher than when Creepy was released. The tool was built in an era with different norms around data collection—norms that have rightfully evolved. Anyone considering building similar functionality today needs legal review, ethical frameworks, and explicit consent mechanisms that Creepy never contemplated.

Verdict

Use if: You're researching the history of OSINT techniques, teaching a course on privacy implications of social media, or studying how API access restrictions evolved in response to surveillance concerns. Creepy is a time capsule demonstrating why platforms implement strict data access controls today. It's valuable as a case study, not as operational software. Security researchers might also examine its plugin architecture as a reference for building modular data collection systems (for legitimate purposes with proper authorization).

Skip if: You need functional geolocation intelligence gathering for actual investigations, expect it to work against modern social platforms, or don't have a clear ethical framework and legal authorization for your use case. For current OSINT work, invest in actively maintained commercial tools like Maltego or stay current with Bellingcat's evolving techniques. Creepy belongs in a museum of digital privacy awakening: admire it for what it revealed about our data exhaust, but don't expect it to run on today's locked-down platforms.
