Cloudlist: Multi-Cloud Asset Discovery for Attack Surface Management

Hook

Your attack surface spans AWS, GCP, Azure, and Kubernetes—but your asset inventory tool only speaks one cloud dialect. When a security team at a Fortune 500 company discovered 400+ orphaned EC2 instances during a routine audit, they realized their multi-cloud sprawl had become invisible.

Context

Cloud asset management is the unglamorous foundation of security operations. Before you can secure something, you need to know it exists. In single-cloud environments, this is tractable: AWS Config lists your EC2 instances, GCP Asset Inventory catalogs your compute resources, Azure Resource Graph queries your subscriptions. But modern infrastructure rarely respects provider boundaries.

The reality is messier: development teams spin up resources in AWS, data scientists provision GCP instances for machine learning workloads, legacy applications run in Azure, and edge services deploy to DigitalOcean. Meanwhile, Kubernetes clusters orchestrate containers across multiple clouds, Terraform state files document infrastructure-as-code deployments, and Nomad schedules jobs on hybrid infrastructure. Security teams need a unified view, but native cloud tools are walled gardens. Cloudlist emerged from ProjectDiscovery's offensive security toolkit as a solution: a single binary that speaks every cloud provider's API and normalizes the chaos into a consistent stream of assets.

Technical Insight

Cloudlist implements a provider-plugin architecture in which each cloud platform sits behind a common interface. The core engine reads a unified configuration file, instantiates provider clients with the appropriate credentials, runs each platform's enumeration logic, and normalizes the results into a consistent output format. This design separates authentication concerns from discovery logic, so a new provider can be added without modifying the core engine.
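To make the pattern concrete, here is a minimal sketch of a provider-plugin design in Python. It is not Cloudlist's actual code (Cloudlist is written in Go); the `Resource` fields, the `Provider` base class, the `StaticProvider` stand-in, and the `REGISTRY` mapping are all hypothetical names chosen for illustration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, asdict
from typing import Iterator, Optional


@dataclass
class Resource:
    """Normalized asset record shared by every provider (hypothetical schema)."""
    provider: str
    service: str
    public_ipv4: Optional[str] = None
    private_ipv4: Optional[str] = None
    dns_name: Optional[str] = None


class Provider(ABC):
    """Common interface each cloud platform implements."""

    def __init__(self, options: dict):
        # Credentials and filters come from this provider's config entry.
        self.options = options

    @abstractmethod
    def resources(self) -> Iterator[Resource]:
        """Enumerate assets and yield them in normalized form."""


class StaticProvider(Provider):
    """Stand-in provider that returns canned data instead of calling an API."""

    def resources(self) -> Iterator[Resource]:
        for ip in self.options.get("ips", []):
            yield Resource(provider="static", service="vm", public_ipv4=ip)


# Registry mapping config keys to provider classes; the core loop never
# needs to know which platforms exist.
REGISTRY = {"static": StaticProvider}


def enumerate_assets(config: dict) -> list[dict]:
    """Instantiate one provider per config entry and collect normalized assets."""
    assets = []
    for name, entries in config.items():
        provider_cls = REGISTRY[name]
        for entry in entries:
            assets.extend(asdict(r) for r in provider_cls(entry).resources())
    return assets


inventory = enumerate_assets(
    {"static": [{"ips": ["203.0.113.10"]}, {"ips": ["203.0.113.11"]}]}
)
```

In this sketch, supporting another platform means writing one more `Provider` subclass and registering it; `enumerate_assets` stays unchanged.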

Configuration lives in provider-config.yaml, where each provider gets a top-level key with credential details and optional filtering parameters. Here's a realistic multi-cloud setup:

aws:
  - profile: production
    services:
      - ec2
      - route53
      - s3
  - profile: staging
    services:
      - ec2

gcp:
  - project: my-project-123
  - organization: 123456789
    use_asset_inventory: true

azure:
  - tenant_id: xxx
    client_id: yyy
    client_secret: zzz
    subscription_id: abc

kubernetes:
  - kubeconfig: /home/user/.kube/config

do:
  - token: dop_v1_xxxxx

The magic happens in how each provider implements the enumeration interface. AWS uses the official SDK to paginate through service APIs—listing EC2 instances, S3 buckets, Route53 zones, and more. GCP offers two modes: fast project-level enumeration where Cloudlist directly calls Compute Engine and Cloud DNS APIs, or comprehensive organization-level discovery using the Cloud Asset Inventory API for massive environments. Azure queries the Resource Graph, Kubernetes reads pod and service definitions from clusters, and DigitalOcean, Linode, and other providers connect through their respective REST APIs.

The stdout-first design philosophy is crucial for security automation. Rather than maintaining a database or UI, Cloudlist streams results as newline-delimited JSON or plain text, making it a perfect building block in reconnaissance pipelines:

# Enumerate all assets, filter for IPs, feed to port scanner
cloudlist -config provider-config.yaml -ip | nmap -iL - -oA scan-results

# Find all hostnames across clouds, resolve DNS, check for subdomain takeovers
cloudlist -config provider-config.yaml -host | dnsx -silent | nuclei -t takeovers/

# Emit JSON (one object per line) and print the public IPv4 address of each AWS asset
cloudlist -config provider-config.yaml -json | jq -r 'select(.provider=="aws") | .public_ipv4 // empty'

Output filtering happens through command-line flags: -provider limits results to specific clouds, -service targets particular resource types (ec2, gke, droplet), -ip extracts only IP addresses, and -host returns hostnames. The JSON output includes rich metadata—provider name, service type, public/private IPs, hostnames, and resource IDs—enabling sophisticated downstream analysis.
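When `jq` is not enough, the JSON stream can be consumed from a script. The sketch below groups public IPv4 addresses by provider. The sample records are invented, and any field name beyond `provider` and `public_ipv4` (for example `dns_name`) is an assumption about the output schema.

```python
import json
from collections import defaultdict

# Sample output in the newline-delimited JSON shape described above.
# The records are invented for illustration.
SAMPLE = """\
{"provider": "aws", "service": "ec2", "public_ipv4": "203.0.113.5", "private_ipv4": "10.0.0.5"}
{"provider": "gcp", "service": "compute", "public_ipv4": "198.51.100.7"}
{"provider": "aws", "service": "route53", "dns_name": "api.example.com"}
"""


def public_ips_by_provider(lines):
    """Group public IPv4 addresses by provider, skipping records without one."""
    grouped = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the stream
        record = json.loads(line)
        ip = record.get("public_ipv4")
        if ip:
            grouped[record.get("provider", "unknown")].append(ip)
    return dict(grouped)


result = public_ips_by_provider(SAMPLE.splitlines())
print(result)
```

In a real pipeline, `SAMPLE.splitlines()` would be replaced by `sys.stdin`, so the script can sit at the end of a `cloudlist ... -json |` pipe.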

Under the hood, the provider implementations reveal thoughtful engineering choices. The GCP provider, for instance, checks whether Asset Inventory API is enabled and falls back to project-level enumeration if unavailable. The Kubernetes provider supports multiple authentication methods (kubeconfig, in-cluster service accounts, GKE/EKS integration) to handle diverse deployment scenarios. The Terraform provider parses state files to extract managed resources, giving visibility into infrastructure-as-code deployments that might not be discoverable through cloud APIs.
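As an illustration of the state-file approach, the following sketch pulls IP addresses out of a Terraform state document (format version 4). It shows the general technique rather than Cloudlist's implementation; the sample state, the `IP_ATTRIBUTES` list, and the function name are assumptions made for this example.

```python
import json

# Minimal state document in the Terraform state format (version 4);
# the resources and their addresses are invented for illustration.
STATE = json.loads("""
{
  "version": 4,
  "resources": [
    {
      "mode": "managed",
      "type": "aws_instance",
      "name": "web",
      "instances": [
        {"attributes": {"public_ip": "203.0.113.20", "private_ip": "10.0.1.20"}}
      ]
    },
    {
      "mode": "data",
      "type": "aws_ami",
      "name": "ubuntu",
      "instances": [{"attributes": {"id": "ami-00000000"}}]
    }
  ]
}
""")

# Attribute names that hold addresses vary by resource type; this short
# list is an assumption, not an exhaustive mapping.
IP_ATTRIBUTES = ("public_ip", "private_ip")


def addresses_from_state(state):
    """Return (resource address, attribute, value) for managed resources."""
    found = []
    for resource in state.get("resources", []):
        if resource.get("mode") != "managed":
            continue  # data sources are lookups, not deployed assets
        address = f'{resource["type"]}.{resource["name"]}'
        for instance in resource.get("instances", []):
            attributes = instance.get("attributes") or {}
            for key in IP_ATTRIBUTES:
                if attributes.get(key):
                    found.append((address, key, attributes[key]))
    return found


addresses = addresses_from_state(STATE)
```

Because the state file records what Terraform believes it deployed, results from this kind of parsing can lag behind reality if resources were changed outside Terraform.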

Extensibility is baked into the architecture. Adding a new provider means implementing a simple interface: an Init() method for authentication, a Resources() method that returns a slice of discovered assets, and marshaling logic to convert provider-specific responses into Cloudlist's normalized format. The repository's contribution guidelines walk through adding providers, and the codebase includes examples for reference implementations.
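The marshaling step is usually the bulk of a new provider. As a rough sketch, the function below flattens a droplet record, shaped like the `networks.v4` structure returned by the DigitalOcean droplets API, into normalized asset records. The output field names and the choice of one record per address are assumptions for illustration, not Cloudlist's confirmed behavior.

```python
def normalize_droplet(droplet):
    """Convert one DigitalOcean-style droplet record into flat asset records."""
    assets = []
    for network in droplet.get("networks", {}).get("v4", []):
        # Every record carries the same provider, service, and resource ID.
        record = {"provider": "do", "service": "droplet", "id": str(droplet["id"])}
        if network.get("type") == "public":
            record["public_ipv4"] = network["ip_address"]
        else:
            record["private_ipv4"] = network["ip_address"]
        assets.append(record)
    return assets


# Invented sample record for illustration.
sample_droplet = {
    "id": 3164444,
    "name": "web-1",
    "networks": {
        "v4": [
            {"ip_address": "10.128.0.2", "type": "private"},
            {"ip_address": "203.0.113.30", "type": "public"},
        ]
    },
}

droplet_assets = normalize_droplet(sample_droplet)
```

Keeping this conversion in a small pure function makes it easy to unit test against recorded API responses without touching the network.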

Gotcha

Cloudlist's power creates operational challenges around credential management. A production configuration might require AWS profiles with ReadOnlyAccess across multiple accounts, GCP service accounts with roles/cloudasset.viewer at the organization level, Azure service principals with Reader permissions on subscriptions, Kubernetes RBAC granting list/get on pods and services across namespaces, and API tokens for half a dozen SaaS providers. Distributing these credentials securely—especially in CI/CD pipelines—demands thoughtful secrets management. Teams often solve this with credential vending systems, ephemeral tokens, or running Cloudlist from privileged bastion hosts, but the complexity is real.

The tool is explicitly designed for enumeration, not analysis. You get a flat list of assets with basic metadata, but no relationship mapping (which security groups allow traffic between resources?), no configuration analysis (are S3 buckets publicly readable?), no drift detection (what changed since yesterday?), and no compliance checks (does this meet CIS benchmarks?). Cloudlist answers "what exists" brilliantly but stays silent on "what's risky" or "what changed." For attack surface management workflows focused on discovery and reconnaissance, this is perfect—you pipe results into specialized tools like Nuclei for vulnerability scanning or custom scripts for policy enforcement. But if you need a comprehensive Cloud Security Posture Management platform with alerting, remediation workflows, and historical tracking, Cloudlist is just the data ingestion layer.
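Some of these gaps can be narrowed with small custom scripts. For example, a crude "what changed" report can be built by saving each run's JSON output and diffing two snapshots. The sketch below assumes newline-delimited JSON with `provider`, `service`, `public_ipv4`, and `dns_name` fields; the snapshots are invented, and the choice of identity key is a simplification.

```python
import json

YESTERDAY = """\
{"provider": "aws", "service": "ec2", "public_ipv4": "203.0.113.5"}
{"provider": "gcp", "service": "compute", "public_ipv4": "198.51.100.7"}
"""

TODAY = """\
{"provider": "aws", "service": "ec2", "public_ipv4": "203.0.113.5"}
{"provider": "aws", "service": "ec2", "public_ipv4": "203.0.113.99"}
"""


def load_snapshot(text):
    """Parse newline-delimited JSON into a set of hashable asset keys."""
    keys = set()
    for line in text.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        # Identify an asset by provider, service, and its address or hostname.
        # Missing values become empty strings so the keys stay sortable.
        keys.add((
            str(record.get("provider") or ""),
            str(record.get("service") or ""),
            str(record.get("public_ipv4") or record.get("dns_name") or ""),
        ))
    return keys


def diff_snapshots(old_text, new_text):
    """Report assets that appeared or disappeared between two snapshots."""
    old, new = load_snapshot(old_text), load_snapshot(new_text)
    return {"added": sorted(new - old), "removed": sorted(old - new)}


changes = diff_snapshots(YESTERDAY, TODAY)
```

This only detects assets appearing or disappearing; it says nothing about configuration changes on assets that persist, which is where a dedicated posture management tool earns its keep.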

Verdict

Use Cloudlist if you're building security automation pipelines that need lightweight, repeatable asset discovery across multiple cloud providers—especially if you're already in the ProjectDiscovery ecosystem with tools like Nuclei, httpx, or subfinder. It's ideal for red teams maintaining target inventories, blue teams tracking attack surface, and DevSecOps teams needing quick multi-cloud reconnaissance without heavy infrastructure. Skip it if you need a full-featured CSPM with compliance dashboards, change tracking, and remediation workflows, or if you're a single-cloud shop where native tools like AWS Config or GCP Asset Inventory provide deeper integration. Also skip if your credential management practices can't support distributing broad read permissions across cloud providers—Cloudlist requires significant access to be useful.
