Rack::Attack: Building Application-Layer Rate Limiting That Doesn't Suck

Hook

Most production Rails apps spend memory and database connections on abusive requests before those requests ever reach the authentication layer. Rack::Attack turns them away at the door, before your application code runs.

Context

Web applications are expensive to run. Every request that hits your Rails router consumes memory, opens database connections, and executes application code—even requests from bots scraping your site, attackers probing for vulnerabilities, or users accidentally hammering your API. By the time a malicious request reaches your authentication layer or controller logic, you've already paid the computational cost.

Traditional solutions live at the wrong layer. You could configure nginx rate limiting, but it lacks application context: it doesn't know which endpoints are expensive or which users are authenticated. You could build rate limiting into each controller, but that's repetitive and still happens too late in the request lifecycle. Cloud WAFs like Cloudflare work well until you need custom business logic ("allow 10 requests per minute for free users, 1000 for premium"). Rack::Attack emerged from Kickstarter's production infrastructure as a solution to this layering problem: a Rack middleware that intercepts requests early, has full access to request context, and returns HTTP 429 responses before your application framework does any real work.

Technical Insight


Rack::Attack's architecture is simple because it builds on Rack's middleware stack. In a typical Rails application, middleware wraps your app like Russian nesting dolls: each layer can inspect the request, decide whether to pass it along, or return early. Rack::Attack runs inside this stack, ahead of routing, controllers, and any database queries your application code would issue. Exactly where it sits relative to other middleware (logging, static files, sessions) depends on how it is inserted, so check the output of bin/rails middleware if the ordering matters to you.
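
To make the "return early" idea concrete, here is a toy middleware in plain Ruby. It is not Rack::Attack's code: the class name EarlyReject and the blocked path are made up for illustration, and the inner app is a lambda standing in for Rails.

```ruby
# A toy Rack middleware (hypothetical, not Rack::Attack's implementation)
# that short-circuits a request before the wrapped app is called.
class EarlyReject
  def initialize(app, blocked_paths:)
    @app = app
    @blocked_paths = blocked_paths
  end

  def call(env)
    if @blocked_paths.include?(env['PATH_INFO'])
      # Return a Rack response triple without touching the inner app.
      return [403, { 'content-type' => 'text/plain' }, ["Forbidden\n"]]
    end

    @app.call(env)
  end
end

# Stand-in for the Rails app: any object that responds to #call(env).
inner_app = ->(_env) { [200, { 'content-type' => 'text/plain' }, ["OK\n"]] }
stack = EarlyReject.new(inner_app, blocked_paths: ['/wp-login.php'])

blocked_status = stack.call('PATH_INFO' => '/wp-login.php').first # inner_app never runs
allowed_status = stack.call('PATH_INFO' => '/').first
```

Everything Rack::Attack does is a more capable version of that early return: the decision is made from the request environment alone, so nothing downstream has to run for a rejected request.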

The core abstraction is a rule DSL with three primitives: safelists, blocklists, and throttles. Rules are evaluated in precedence order—safelist matches allow requests immediately, blocklist matches reject them, and throttles track request counts in a cache. Here's how you might protect a login endpoint:

Rack::Attack.throttle('login/email', limit: 5, period: 60) do |req|
  if req.path == '/login' && req.post?
    # Return a discriminator - requests with the same discriminator share a limit
    req.params['email'].to_s.downcase.presence
  end
end

Rack::Attack.blocklist('block-suspicious-ips') do |req|
  # Count every login POST from this IP (the middleware cannot tell failed from
  # successful logins); after 10 within 5 minutes, ban the IP for 1 hour
  Rack::Attack::Allow2Ban.filter(req.ip, maxretry: 10, findtime: 5.minutes, bantime: 1.hour) do
    req.path == '/login' && req.post?
  end
end

Rack::Attack.safelist('allow-internal-traffic') do |req|
  # req.ip is a plain string, so == would never match a CIDR range
  IPAddr.new('10.0.0.0/8').include?(req.ip)
end

The throttle block returns a discriminator: the value that identifies what you're rate limiting. Here it's the email address, so each email gets 5 attempts per 60 seconds. Return nil and the request isn't throttled. The discriminator becomes part of a cache key like "rack::attack:28333333:login/email:user@example.com", where the numeric segment is the Unix time divided by the period. All requests in the same 60-second window therefore share a counter, and the count starts over when the next window begins.
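
The window arithmetic is easier to see in code. The helper below is a sketch, not the library's API: the name throttle_key is invented, and the key layout simply follows the example key in the text.

```ruby
# Sketch of a fixed-window cache key. The helper name is hypothetical; the
# "prefix:window:rule:discriminator" layout follows the example in the text.
def throttle_key(rule_name, discriminator, period, now: Time.now.to_i, prefix: 'rack::attack')
  window = now / period # integer division: constant for the whole window
  "#{prefix}:#{window}:#{rule_name}:#{discriminator}"
end

key_a = throttle_key('login/email', 'user@example.com', 60, now: 1_700_000_000)
key_b = throttle_key('login/email', 'user@example.com', 60, now: 1_700_000_039)
key_c = throttle_key('login/email', 'user@example.com', 60, now: 1_700_000_040)
# key_a and key_b fall in the same window; key_c starts a new one.
```

One consequence of fixed windows is worth knowing: a client can send a full quota at the end of one window and another full quota at the start of the next, so the short-term burst can be up to twice the nominal limit.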

Under the hood, throttling relies on a cache backend: Rails.cache by default in a Rails app, typically backed by Redis or Memcached in production. When a request matches a throttle, Rack::Attack increments a counter in the cache with an expiry tied to the period. If the counter exceeds the limit, it returns a 429 Too Many Requests response. This cache dependency is both a strength and a weakness: state is shared across application servers (essential for distributed systems), but you need infrastructure beyond your app.
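
The counting logic itself is small. The sketch below uses an in-memory Hash as a stand-in for the shared cache, so it illustrates the idea rather than the library's internals; the class name WindowCounter is hypothetical, and unlike a real cache it never expires old keys.

```ruby
# In-memory stand-in for a shared cache (hypothetical). A real deployment
# would use Redis or Memcached so every app server sees the same counts,
# and the store would expire keys once their window has passed.
class WindowCounter
  def initialize
    @counts = Hash.new(0)
  end

  # Increment the counter for the discriminator's current window and
  # report whether this request is over the limit.
  def throttled?(discriminator, limit:, period:, now: Time.now.to_i)
    key = "#{now / period}:#{discriminator}"
    @counts[key] += 1
    @counts[key] > limit
  end
end

counter = WindowCounter.new
same_window = Array.new(6) { counter.throttled?('1.2.3.4', limit: 5, period: 60, now: 120) }
# The first five requests pass; the sixth is throttled.
next_window = counter.throttled?('1.2.3.4', limit: 5, period: 60, now: 180)
```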

The Allow2Ban filter is a variant of Rack::Attack's Fail2Ban filter. It counts requests for which the block returns truthy, and once the count reaches maxretry within findtime, it bans the discriminator (typically an IP) for bantime. In the example above every login POST counts, not only failed ones, because the middleware runs before the application decides whether the password was correct. The effect is that users who mistype a password a few times aren't blocked, while a client hammering the login form with credential stuffing is locked out for the ban duration.
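
The allow-then-ban behavior can be sketched in plain Ruby. This is a simplification under stated assumptions, not the library's implementation: the class AllowThenBan is invented, it keeps state in memory rather than in the cache, and it uses a sliding window of timestamps where the real filter counts per cache window.

```ruby
# Hypothetical, in-memory model of allow-then-ban semantics.
class AllowThenBan
  def initialize(maxretry:, findtime:, bantime:)
    @maxretry = maxretry
    @findtime = findtime
    @bantime = bantime
    @hits = Hash.new { |hash, key| hash[key] = [] } # discriminator => timestamps
    @banned_until = {}
  end

  # Returns true when the request should be blocked.
  def blocked?(discriminator, now:, matched:)
    return true if @banned_until.fetch(discriminator, 0) > now
    return false unless matched

    hits = @hits[discriminator]
    hits << now
    hits.reject! { |time| time <= now - @findtime } # forget hits outside findtime
    @banned_until[discriminator] = now + @bantime if hits.size >= @maxretry
    false # this request still passes; the ban applies from the next one
  end
end

guard = AllowThenBan.new(maxretry: 3, findtime: 300, bantime: 3600)
first_three = (1..3).map { |t| guard.blocked?('1.2.3.4', now: t, matched: true) }
fourth = guard.blocked?('1.2.3.4', now: 4, matched: true)
after_ban = guard.blocked?('1.2.3.4', now: 3 + 3600, matched: false)
```

Note that a banned client is blocked on every request, not only on the ones that match the block, which is why the first line of blocked? checks the ban before anything else.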

Rack::Attack has no built-in exponential backoff, and a throttle block cannot return a custom period alongside its discriminator. You can approximate backoff by stacking several throttles whose periods grow exponentially, a pattern the project's documentation also suggests:

# Allows 20 requests in 8 seconds, 40 in 64 seconds, and so on,
# up to 100 requests in 8**5 seconds (roughly 9 hours).
(1..5).each do |level|
  Rack::Attack.throttle("api/ip/level_#{level}", limit: 20 * level, period: 8**level) do |req|
    # A short burst is tolerated; sustained traffic trips the longer windows
    req.ip if req.path.start_with?('/api')
  end
end

One of Rack::Attack's most underrated features is its instrumentation. It publishes ActiveSupport notifications that you can subscribe to for logging, metrics, and alerting:

ActiveSupport::Notifications.subscribe('throttle.rack_attack') do |name, start, finish, id, payload|
  req = payload[:request]
  # The standard Rails logger takes a single message, not keyword arguments
  Rails.logger.warn(
    "Rate limit exceeded " \
    "ip=#{req.ip} path=#{req.path} " \
    "matched=#{req.env['rack.attack.matched']} " \
    "match_type=#{req.env['rack.attack.match_type']}"
  )

  # Send to your metrics system (Metrics is a placeholder for your own client)
  Metrics.increment('rack_attack.throttle', tags: ["rule:#{req.env['rack.attack.matched']}"])
end

You can also customize the throttled response to include RateLimit headers, helping well-behaved clients implement their own backoff:

Rack::Attack.throttled_responder = lambda do |req|
  match_data = req.env['rack.attack.match_data']
  now = match_data[:epoch_time]
  # Seconds left in the current fixed window, not the full period
  retry_after = match_data[:period] - (now % match_data[:period])

  headers = {
    'RateLimit-Limit' => match_data[:limit].to_s,
    'RateLimit-Remaining' => '0',
    'RateLimit-Reset' => (now + retry_after).to_s
  }

  [429, headers, ["Rate limit exceeded. Try again in #{retry_after} seconds.\n"]]
end

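The RateLimit headers come from an IETF draft, which is not yet a finalized standard, so field names and semantics may still change. Because this responder only sets them on 429 responses, they tell a client when to retry after it has hit the limit, which lets API clients back off for the right amount of time instead of retrying blindly.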
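
On the client side, honoring the header takes only a few lines. The helper below is a hypothetical sketch: it assumes RateLimit-Reset carries a Unix timestamp, as in the responder above, whereas a server following the IETF draft more closely may send a number of seconds instead.

```ruby
# Hypothetical client-side helper: how long to wait after a 429.
# Assumes RateLimit-Reset is a Unix timestamp (an assumption, see above).
def seconds_until_reset(headers, now: Time.now.to_i)
  reset = headers['RateLimit-Reset']
  return nil if reset.nil?

  # Never return a negative wait if the reset time has already passed.
  [reset.to_i - now, 0].max
end

wait_future = seconds_until_reset({ 'RateLimit-Reset' => '1700000040' }, now: 1_700_000_000)
wait_past = seconds_until_reset({ 'RateLimit-Reset' => '1699999000' }, now: 1_700_000_000)
wait_missing = seconds_until_reset({}, now: 1_700_000_000)
```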

Gotcha

Rack::Attack's Achilles heel is its dependence on a shared cache for distributed deployments. If you're running multiple application servers—and you should be—you need Redis or Memcached. The in-memory cache adapter works for development but will give you inconsistent rate limiting in production where each server maintains its own counters. This means you're paying for another piece of infrastructure and accepting another potential failure mode. If Redis goes down, your rate limiting disappears (though your application keeps running, which is both good and bad).
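
Pointing Rack::Attack at a shared store is a one-line setting in a Rails initializer. The file path and the REDIS_URL environment variable below are assumptions; adapt them to however your app already reaches Redis.

```ruby
# config/initializers/rack_attack.rb (path and REDIS_URL are assumptions)
# Use a store that every application server shares, so counters are global.
Rack::Attack.cache.store = ActiveSupport::Cache::RedisCacheStore.new(
  url: ENV.fetch('REDIS_URL')
)
```

Giving rate limiting its own Redis database or instance, separate from your main cache, keeps a cache flush or eviction storm from silently resetting every counter.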

IP-based discrimination is inherently flawed for modern internet architecture. Legitimate users behind corporate NATs or CGNAT (common with mobile carriers) share IP addresses, so one abusive user can exhaust the rate limit for thousands of innocent people. Conversely, sophisticated attackers use residential proxy networks or cloud infrastructure to rotate through thousands of IPs, making IP-based blocking cat-and-mouse at best. You need to think carefully about your discriminators—user IDs work great for authenticated endpoints, but unauthenticated ones are genuinely hard to protect without collateral damage. Rack::Attack gives you the tools but doesn't solve the fundamental attribution problem that makes rate limiting hard.
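
One way to reduce collateral damage is to prefer a per-client credential and fall back to the IP only when none is present. The sketch below is an illustration under assumptions: the helper name, the Bearer-token convention, and the key prefixes are all hypothetical.

```ruby
require 'digest'

# Hypothetical helper: prefer a per-client credential over the IP address.
def throttle_discriminator(env)
  token = env['HTTP_AUTHORIZATION'].to_s[/\ABearer (.+)\z/, 1]
  if token
    # Hash the token so raw credentials never end up in cache keys.
    "token:#{Digest::SHA256.hexdigest(token)[0, 16]}"
  else
    "ip:#{env['REMOTE_ADDR']}"
  end
end

# Wire it into a throttle when Rack::Attack is loaded.
if defined?(Rack::Attack)
  Rack::Attack.throttle('api/client', limit: 100, period: 60) do |req|
    throttle_discriminator(req.env) if req.path.start_with?('/api')
  end
end

from_office = throttle_discriminator('HTTP_AUTHORIZATION' => 'Bearer abc123', 'REMOTE_ADDR' => '10.1.1.1')
from_phone = throttle_discriminator('HTTP_AUTHORIZATION' => 'Bearer abc123', 'REMOTE_ADDR' => '172.16.0.9')
anonymous = throttle_discriminator('REMOTE_ADDR' => '203.0.113.7')
```

The token has not been verified at this point in the request, so an attacker could rotate bogus tokens to get fresh buckets. Pairing a rule like this with a looser IP-based throttle limits that escape hatch.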

Verdict

Use Rack::Attack if you're running a Ruby web application with Redis or Memcached already in your stack and need fine-grained, application-aware rate limiting that you can version control alongside your code. It's perfect for protecting expensive API endpoints, preventing credential stuffing on authentication, or implementing tiered rate limits based on user subscription levels—scenarios where infrastructure-layer solutions lack necessary context. The declarative DSL makes rules readable and the instrumentation hooks enable proper observability. Skip it if you're not on Ruby/Rack (obviously), if you don't have shared cache infrastructure and can't justify adding it, or if your rate limiting needs are simple enough to handle with nginx or a CDN. Also skip it if you need rate limiting to work even when your cache fails—tools like API gateways or specialized rate limiting proxies have more sophisticated fallback behavior.
