Handling Throttling Responses from the Hyperproof API

Hyperproof does not impose any specific limits for customers using our API, but you may occasionally encounter a throttling response during periods of high activity. These responses are rare, but when they do happen, your application should handle them gracefully rather than failing outright.

This guide walks through the throttling responses you might see, why they happen, and — most importantly — how to build robust retry logic so that a momentary slowdown doesn't turn into a failed operation.

NOTE: Some of the retry strategies in this document rely on features, such as the Retry-After header, that are not present in every Hyperproof throttling response. Treat this document as a general guide to handling API throttling in all of your integration projects.

What Does a Throttling Response Look Like?

There are two HTTP response codes your application may encounter when the API is asking you to back off:

HTTP 429 — Too Many Requests

This is the standard rate-limit response. It means your application has sent too many requests in a short window of time. The response body will look something like this:

{
  "statusCode": 429,
  "message": "Rate limit is exceeded. Try again in 5 seconds."
}

HTTP 403 — Forbidden (Bandwidth Quota Exceeded)

This is a quota-based response. It means the cumulative volume of API activity has temporarily exceeded a threshold. You might see this during particularly busy periods. The response body will typically include a message like:

{
  "statusCode": 403,
  "message": "Bandwidth quota exceeded. Try again later."
}

Either of these responses may include a Retry-After header. When it is present, it tells you how many seconds to wait before trying again. That makes it your best friend when it comes to handling throttling: always check for it and respect it.

NOTE: These responses are transient. They don't mean your credentials are wrong, your request is malformed, or anything is permanently broken. They simply mean "try again in a moment." Your application should treat them differently from a true 403 authorization error.

Why Does Throttling Happen?

The Hyperproof API, like most cloud-hosted APIs, uses throttling to maintain stability and performance across the platform. During peak activity periods the API may temporarily limit the rate of incoming requests to protect service quality.

The thresholds are generous and most integrations will never hit them. But if your application is processing large datasets, making many concurrent requests, or calling the API in rapid succession, you may occasionally see a 429 or 403 throttle response. This is expected behavior and nothing to worry about — as long as your application knows how to wait and retry.

The Golden Rule: Check for the `Retry-After` Header

Before we get into retry strategies, let's cover the most important thing first. When you receive a 429 or a quota-related 403, always look for the Retry-After response header. This header contains the number of seconds the API is recommending you wait before retrying.

Here's how to read it from a typical HTTP response in TypeScript:

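const retryAfterHeader = response.headers.get('Retry-After');
// parseInt returns NaN for a non-numeric value, so fall back to 5 seconds in that case too.
const parsedSeconds = retryAfterHeader ? parseInt(retryAfterHeader, 10) : NaN;
const retryAfterSeconds = Number.isNaN(parsedSeconds) ? 5 : parsedSeconds;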

Or in Python:

raw = response.headers.get('Retry-After', '')
retry_after = int(raw) if raw.strip().isdecimal() else 5

Or in C#:

var retryAfter = response.Headers.RetryAfter?.Delta?.TotalSeconds ?? 5;

If the header is present, use it. If it's absent, or its value isn't a whole number of seconds (the HTTP specification also allows an HTTP-date there), fall back to a reasonable default like 5 seconds. When the API tells you how long to wait, there's no need to guess.
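To handle both forms the HTTP specification allows for Retry-After, a small parsing helper can help. The following is a minimal Python sketch, not Hyperproof's own code: the name parse_retry_after, the default parameter, and the now parameter (there only to make the date branch testable) are all assumptions made for illustration. This guide does not say whether Hyperproof ever sends the HTTP-date form, so the date branch is purely defensive.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_retry_after(value, default=5.0, now=None):
    """Return the number of seconds to wait for a Retry-After header value.

    `value` is the raw header string, or None when the header is absent.
    `default` is the fallback wait (5 seconds, as used throughout this guide).
    `now` is a hypothetical hook so the date branch can be tested.
    """
    if value is None:
        return default

    text = value.strip()

    # Form 1: a delay in whole seconds, e.g. "5".
    if text.isdecimal():
        return float(text)

    # Form 2: an HTTP-date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
    try:
        retry_at = parsedate_to_datetime(text)
    except (TypeError, ValueError):
        return default  # Unparseable value: treat it like a missing header.

    if retry_at.tzinfo is None:
        retry_at = retry_at.replace(tzinfo=timezone.utc)

    now = now or datetime.now(timezone.utc)
    # A date in the past means "retry now", so never return a negative wait.
    return max(0.0, (retry_at - now).total_seconds())
```

A caller would pass it the raw header, for example parse_retry_after(response.headers.get('Retry-After')), and sleep for the returned number of seconds.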

Distinguishing a Throttle 403 from an Authorization 403

This is a critical point that trips people up. HTTP 403 can mean two very different things:

Authorization failure — your credentials are invalid, your token has expired, or you don't have permission to access the resource. Retrying will never fix this.
Quota exceeded — the API is temporarily throttling you. Retrying after a short wait will almost certainly succeed.

You need to tell these apart. The simplest way is to check the response body for quota-related keywords:

async function isQuotaExceeded(response: Response): Promise<boolean> {
  try {
    const body = await response.clone().json();
    const message = (body.message || '').toLowerCase();
    return message.includes('quota') || message.includes('bandwidth');
  } catch {
    return false;
  }
}

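The Retry-After header is another useful signal: a true authorization 403 won't include one, while a quota 403 may. Because the header isn't guaranteed on every throttling response, treat its presence as confirmation of a throttle, but don't treat its absence as proof of an authorization failure.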

Only retry quota-related 403 responses. If the 403 is an authorization error, retrying will just waste time.

Building a Retry Helper

Let's build a reusable function that handles both 429 and quota 403 responses. We'll start with a simple version that relies on Retry-After:

Simple Retry (TypeScript)

async function fetchWithRetry(
  url: string,
  options: RequestInit,
  maxRetries: number = 3
): Promise<Response> {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const response = await fetch(url, options);

    // Success — return immediately.
    if (response.ok) {
      return response;
    }

    // Check if this is a throttling response.
    const isThrottled =
      response.status === 429 ||
      (response.status === 403 && await isQuotaExceeded(response));

    if (isThrottled && attempt < maxRetries) {
      // Respect the Retry-After header if present.
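      const retryAfter = response.headers.get('Retry-After');
      // Fall back to 5s when the header is missing or is not a number of seconds.
      const parsedSeconds = retryAfter ? parseInt(retryAfter, 10) : NaN;
      const waitSeconds = Number.isNaN(parsedSeconds) ? 5 : parsedSeconds;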

      console.log(
        `Throttled (HTTP ${response.status}). ` +
        `Waiting ${waitSeconds}s before retry ${attempt + 1}/${maxRetries}.`
      );

      await new Promise(resolve => setTimeout(resolve, waitSeconds * 1000));
      continue;
    }

    // Not a throttle response, or we've exhausted retries.
    return response;
  }

  throw new Error(`Request failed after ${maxRetries} retries.`);
}

Simple Retry (Python)

import time
import requests

def fetch_with_retry(method, url, max_retries=3, **kwargs):
    for attempt in range(max_retries + 1):
        response = requests.request(method, url, **kwargs)

        if response.ok:
            return response

        is_throttled = (
            response.status_code == 429 or
            (response.status_code == 403 and is_quota_exceeded(response))
        )

        if is_throttled and attempt < max_retries:
            raw = response.headers.get('Retry-After', '')
            wait_seconds = int(raw) if raw.strip().isdecimal() else 5
            print(
                f"Throttled (HTTP {response.status_code}). "
                f"Waiting {wait_seconds}s before retry "
                f"{attempt + 1}/{max_retries}."
            )
            time.sleep(wait_seconds)
            continue

        return response

    raise Exception(f"Request failed after {max_retries} retries.")


def is_quota_exceeded(response):
    try:
        message = response.json().get('message', '').lower()
        return 'quota' in message or 'bandwidth' in message
    except Exception:
        return False

Exponential Backoff with Jitter

The simple retry approach above works well when the Retry-After header is present. But for a more robust fallback — or if you want a general-purpose retry strategy — exponential backoff is the industry-standard approach.

The idea is simple: wait a little, then wait longer, then wait even longer. This gives the API time to recover instead of hammering it with retries at a fixed interval.

Adding jitter (a small random offset) on top of the backoff prevents a "thundering herd" problem — where many clients that got throttled at the same time all retry at the exact same moment, causing another spike.

Exponential Backoff (TypeScript)

async function fetchWithBackoff(
  url: string,
  options: RequestInit,
  maxRetries: number = 5,
  baseDelayMs: number = 1000,
  maxDelayMs: number = 30000
): Promise<Response> {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const response = await fetch(url, options);

    if (response.ok) {
      return response;
    }

    const isThrottled =
      response.status === 429 ||
      (response.status === 403 && await isQuotaExceeded(response));

    if (isThrottled && attempt < maxRetries) {
      // Prefer Retry-After if available.
      const retryAfter = response.headers.get('Retry-After');
      const retryAfterSeconds = retryAfter ? parseInt(retryAfter, 10) : NaN;
      let waitMs: number;

      if (!Number.isNaN(retryAfterSeconds)) {
        waitMs = retryAfterSeconds * 1000;
      } else {
        // Exponential backoff: 1s, 2s, 4s, 8s, 16s...
        const exponentialDelay = baseDelayMs * Math.pow(2, attempt);
        // Add jitter: random value between 0 and 1000ms.
        const jitter = Math.random() * 1000;
        // Cap at the maximum delay.
        waitMs = Math.min(exponentialDelay + jitter, maxDelayMs);
      }

      console.log(
        `Throttled (HTTP ${response.status}). ` +
        `Retrying in ${Math.round(waitMs)}ms ` +
        `(attempt ${attempt + 1}/${maxRetries}).`
      );

      await new Promise(resolve => setTimeout(resolve, waitMs));
      continue;
    }

    return response;
  }

  throw new Error(`Request failed after ${maxRetries} retries.`);
}

Exponential Backoff (Python)

import time
import random
import requests

def fetch_with_backoff(method, url, max_retries=5, base_delay=1.0,
                       max_delay=30.0, **kwargs):
    for attempt in range(max_retries + 1):
        response = requests.request(method, url, **kwargs)

        if response.ok:
            return response

        is_throttled = (
            response.status_code == 429 or
            (response.status_code == 403 and is_quota_exceeded(response))
        )

        if is_throttled and attempt < max_retries:
            retry_after = response.headers.get('Retry-After')

            if retry_after and retry_after.strip().isdecimal():
                wait = float(retry_after)
            else:
                # Exponential backoff with jitter.
                exponential_delay = base_delay * (2 ** attempt)
                jitter = random.uniform(0, 1)
                wait = min(exponential_delay + jitter, max_delay)

            print(
                f"Throttled (HTTP {response.status_code}). "
                f"Retrying in {wait:.1f}s "
                f"(attempt {attempt + 1}/{max_retries})."
            )
            time.sleep(wait)
            continue

        return response

    raise Exception(f"Request failed after {max_retries} retries.")

How the Timing Works

With the defaults above (baseDelay = 1s, maxDelay = 30s), the retry delays would look roughly like this:

Attempt	Base Delay	+ Jitter (up to)	Total Wait
1	1,000ms	~1,000ms	~1–2s
2	2,000ms	~1,000ms	~2–3s
3	4,000ms	~1,000ms	~4–5s
4	8,000ms	~1,000ms	~8–9s
5	16,000ms	~1,000ms	~16–17s

That gives you a cumulative wait of roughly 31–36 seconds across all five retries (31 seconds of base delay plus up to 5 seconds of jitter), which is plenty of time for a transient throttle to clear without making the user wait forever.
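The arithmetic behind the table can be checked with a few lines of Python. This is only an illustrative sketch: the function name backoff_bounds and its max_jitter parameter are assumptions, chosen to mirror the defaults of the helpers above (1 second base, 30 second cap, up to 1 second of jitter).

```python
def backoff_bounds(max_retries=5, base_delay=1.0, max_delay=30.0, max_jitter=1.0):
    """Return (low, high) wait bounds in seconds for each retry, in order."""
    bounds = []
    for attempt in range(max_retries):
        exponential = base_delay * (2 ** attempt)
        low = min(exponential, max_delay)                # jitter drew 0
        high = min(exponential + max_jitter, max_delay)  # jitter drew its maximum
        bounds.append((low, high))
    return bounds


bounds = backoff_bounds()
total_low = sum(low for low, _ in bounds)
total_high = sum(high for _, high in bounds)

print(bounds)      # [(1.0, 2.0), (2.0, 3.0), (4.0, 5.0), (8.0, 9.0), (16.0, 17.0)]
print(total_low)   # 31.0
print(total_high)  # 36.0
```

Note that the 30-second cap never applies with five retries. It would first take effect on a sixth retry, where the uncapped base delay would be 32 seconds.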

NOTE: If the Retry-After header IS present, the helper uses that value instead of calculating its own backoff. The header always wins — it's the API telling you exactly what it needs.

Handling Throttling in Polling Scenarios

A common pattern when working with the Hyperproof API is polling — making repeated calls to check the status of a long-running operation. Polling loops are especially susceptible to throttling because they're making many requests in a row by design.

Here's how to build throttle-aware polling:

async function pollForResult(
  url: string,
  options: RequestInit,
  pollIntervalMs: number = 5000,
  maxAttempts: number = 60,
  maxThrottles: number = 10
): Promise<any> {
  let throttles = 0;

  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    await new Promise(resolve => setTimeout(resolve, pollIntervalMs));

    const response = await fetch(url, options);

    // Check for throttling first.
    const isThrottled =
      response.status === 429 ||
      (response.status === 403 && await isQuotaExceeded(response));

    if (isThrottled) {
      // Cap consecutive and non-consecutive throttles so the loop cannot run forever.
      if (++throttles > maxThrottles) {
        throw new Error('Polling abandoned: throttled too many times.');
      }

      const retryAfter = response.headers.get('Retry-After');
      const parsedSeconds = retryAfter ? parseInt(retryAfter, 10) : NaN;
      const extraWait = Number.isNaN(parsedSeconds) ? 10000 : parsedSeconds * 1000;
      console.log(`Throttled during poll. Waiting an extra ${extraWait / 1000}s.`);
      await new Promise(resolve => setTimeout(resolve, extraWait));

      // `continue` in a for loop still runs attempt++, so undo it here.
      attempt--;
      continue; // Don't count this as a failed poll attempt.
    }

    if (!response.ok) {
      throw new Error(`Poll request failed with HTTP ${response.status}.`);
    }

    const data = await response.json();

    if (data.status === 'complete' || data.status === 'ready') {
      return data;
    }

    // Not ready yet, so the loop continues.
  }

  throw new Error('Polling timed out after maximum attempts.');
}

The key detail here is the attempt-- just before the continue statement. In a for loop, continue still runs attempt++, so without the decrement every throttle would silently use up a poll attempt. We add an extra wait, but we don't count it as a poll attempt: the throttle doesn't mean the operation isn't progressing, it just means we need to ease up on asking about it. The separate maxThrottles cap keeps a long run of throttles from looping forever.
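For Python integrations, the same idea might look like the sketch below. It is a hedged illustration, not an official client: the names is_throttle and poll_for_result, the max_throttles cap, and the request and sleep parameters (injected so the loop can be exercised without real HTTP calls or real waiting) are all assumptions of this example. The 'complete' and 'ready' status values are taken from the TypeScript version above.

```python
import time


def is_throttle(response):
    """True for a 429, or for a 403 whose message mentions quota or bandwidth."""
    if response.status_code == 429:
        return True
    if response.status_code != 403:
        return False
    try:
        message = str(response.json().get("message", "")).lower()
    except Exception:
        return False
    return "quota" in message or "bandwidth" in message


def poll_for_result(url, poll_interval=5.0, max_attempts=60, max_throttles=10,
                    request=None, sleep=time.sleep, **kwargs):
    if request is None:
        import requests  # Imported lazily so the sketch loads without it.
        request = requests.get

    attempts = 0
    throttles = 0
    while attempts < max_attempts:
        sleep(poll_interval)
        response = request(url, **kwargs)

        if is_throttle(response):
            throttles += 1
            if throttles > max_throttles:
                raise RuntimeError("Polling abandoned: throttled too many times.")
            raw = response.headers.get("Retry-After", "")
            extra_wait = float(raw) if raw.strip().isdecimal() else 10.0
            sleep(extra_wait)
            continue  # `attempts` is untouched, so the throttle is not counted.

        attempts += 1
        if not response.ok:
            raise RuntimeError(f"Poll request failed with HTTP {response.status_code}.")

        data = response.json()
        if data.get("status") in ("complete", "ready"):
            return data

    raise TimeoutError("Polling timed out after maximum attempts.")
```

Using a while loop with an explicit attempts counter makes the "throttles don't count" rule visible: the counter only advances when a poll actually reached the API.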

Best Practices at a Glance

Always respect Retry-After. When the header is present, use its value. It's the most reliable indicator of when you can safely retry.

Distinguish throttles from real errors. HTTP 429 is always a throttle. HTTP 403 can be a quota throttle OR an authorization error. Check the response body to tell them apart — only retry quota-related responses.

Use exponential backoff as a fallback. When Retry-After isn't available, exponential backoff with jitter is the standard approach. Start with a 1-second base delay and cap at 30 seconds.

Set a maximum retry count. Don't retry forever. Three to five retries is reasonable for most scenarios. If the API is still throttling after that, something unusual is going on, and it's better to surface a clear error.

Don't retry non-transient errors. Among client errors, retry only 429 and quota-related 403 responses. Do not retry on 400 (bad request), 401 (unauthorized), 404 (not found), or other client errors. Repeating the same request will never make these succeed.

Log your retries. When your application backs off and retries, log it. This makes debugging much easier and helps you understand your application's behavior during high-volume periods.

Be mindful of concurrent requests. If your application makes many API calls in parallel, consider adding a small stagger between them. This is much cheaper than handling throttle responses after the fact.
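As one possible way to stagger parallel calls, the Python sketch below spaces out the submission of each task to a thread pool. The helper name staggered_map, the 0.2 second default stagger, and the injectable sleep parameter are assumptions made for this illustration. Suitable values depend on your workload.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def staggered_map(func, items, stagger_seconds=0.2, max_workers=4, sleep=time.sleep):
    """Run func(item) for each item on a thread pool, spacing out submissions.

    Results come back in the same order as `items`.
    """
    futures = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for index, item in enumerate(items):
            if index > 0:
                sleep(stagger_seconds)  # No pause is needed before the first call.
            futures.append(pool.submit(func, item))
        # Collect results in submission order; exceptions re-raise here.
        return [future.result() for future in futures]
```

In practice, func would be whatever wraps your API call, for example a small function that invokes the fetch_with_backoff helper shown earlier for a single URL.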

What NOT to Do

A few antipatterns to avoid:

Don't retry immediately. If you get throttled and immediately fire the same request again, you're making the problem worse. Always wait.

Don't retry in a tight loop without a cap. An infinite retry loop with no maximum will tie up your application indefinitely and generate unnecessary load.

Don't ignore the response code. Some developers treat any non-200 response as a generic failure and surface it to the user. A 429 is not a failure — it's a "try again in a moment." Handle it accordingly.

Don't build your own rate tracking. You might be tempted to count your own API calls and preemptively throttle yourself. This adds complexity for very little benefit. Instead, just handle the throttle responses when they come — they'll tell you everything you need to know.

Don't confuse a quota 403 with an auth 403. This is the most common mistake. If you blindly retry all 403s, you'll waste time retrying authorization errors that will never succeed. Always check the response body.

Quick Reference: Retry Decision Tree

When your API call returns an error, here's the logic to follow:

Is the status code 429? → Yes, this is a throttle. Check Retry-After, wait, and retry.
Is the status code 403? → Check the response body.
- Message contains "quota" or "bandwidth"? → Throttle. Check Retry-After, wait, and retry.
- Message indicates authorization or permission error? → Do not retry. Fix your credentials or permissions.
Is the status code 5xx (500, 502, 503, 504)? → Server-side issue. These can also benefit from a retry with backoff, though they aren't throttling per se.
Is the status code any other 4xx (400, 401, 404, etc.)? → Client error. Do not retry. The request itself needs to be fixed.
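The decision tree above can be sketched as a single Python function. This is a minimal illustration under stated assumptions: the name should_retry is hypothetical, and the caller is assumed to have already extracted the message field from the response body.

```python
def should_retry(status_code, message=""):
    """Return True when the decision tree above says a retry is worthwhile."""
    if status_code == 429:
        return True  # Always a throttle.
    if status_code == 403:
        # Only quota-related 403s are throttles; authorization 403s are not.
        text = (message or "").lower()
        return "quota" in text or "bandwidth" in text
    if 500 <= status_code <= 599:
        return True  # Not throttling, but a retry with backoff may still help.
    return False  # Other 4xx: fix the request instead of retrying.
```

The function only answers whether to retry. How long to wait is still decided by Retry-After or by backoff, as described earlier.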

Build this once as a utility function, use it everywhere you're making API calls, and your application will handle the occasional throttle like a champ. Happy integrating!