Rate Limits

All /v1/* endpoints are rate-limited per project based on the project's plan tier. Rate limiting is enforced by the RateLimitMiddleware using a sliding window algorithm.
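
The internals of RateLimitMiddleware are not shown here. To illustrate the general technique only, a sliding window limiter can be sketched as a log of request timestamps that drops entries once they are older than the window. The class name `SlidingWindowLimiter`, its methods, and the in-memory queue are assumptions for this sketch, not the actual middleware.

```csharp
using System;
using System.Collections.Generic;

// Illustrative sliding-window-log limiter (hypothetical; not the real RateLimitMiddleware).
public sealed class SlidingWindowLimiter
{
    private readonly int _limit;
    private readonly TimeSpan _window;
    private readonly Queue<DateTimeOffset> _timestamps = new Queue<DateTimeOffset>();
    private readonly object _gate = new object();

    public SlidingWindowLimiter(int limit, TimeSpan window)
    {
        if (limit <= 0) throw new ArgumentOutOfRangeException(nameof(limit));
        if (window <= TimeSpan.Zero) throw new ArgumentOutOfRangeException(nameof(window));
        _limit = limit;
        _window = window;
    }

    // Returns true and records the request if it fits in the window ending at `now`.
    public bool TryAcquire(DateTimeOffset now)
    {
        lock (_gate)
        {
            // Drop timestamps that have slid out of the window.
            while (_timestamps.Count > 0 && now - _timestamps.Peek() >= _window)
                _timestamps.Dequeue();

            if (_timestamps.Count >= _limit)
                return false; // over the limit: the caller would respond with 429

            _timestamps.Enqueue(now);
            return true;
        }
    }
}
```

Passing `now` in explicitly keeps the sketch deterministic and easy to test. A production limiter would more likely use a shared store than a per-process queue.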

Rate Limit Tiers

Plan	Requests / hour	Price
Free	100	Free
Starter	5,000	$29/mo
Pro	50,000	$99/mo
Business	500,000	$299/mo

The requests-per-hour limit applies to all API calls (job creation, polling, listing, metrics, etc.). The limit is enforced per project based on the project's RateLimitPerHour setting.

Note: Jobs-per-day quotas and per-tier pricing enforcement are planned for Week 6 (Stripe billing integration). Currently, only the requests-per-hour limit is enforced.

Response Headers

Every successful response to a /v1/* endpoint, and every 429 response, includes rate limit headers:

Header	Type	Description
`X-RateLimit-Limit`	integer	Maximum requests allowed per hour for this project
`X-RateLimit-Remaining`	integer	Requests remaining in the current window
`X-RateLimit-Reset`	integer	Unix timestamp (seconds) when the current window resets

Example response headers

HTTP/1.1 200 OK
X-RateLimit-Limit: 5000
X-RateLimit-Remaining: 4832
X-RateLimit-Reset: 1710779400
Content-Type: application/json
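
As a sketch of reading these headers on the client, the helper below parses all three values from a response's header collection. The type and method names (`RateLimitInfo`, `RateLimitHeaders.TryRead`) are hypothetical and not part of any SDK.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Net.Http.Headers;

// Hypothetical container for the three X-RateLimit-* values.
public readonly struct RateLimitInfo
{
    public RateLimitInfo(long limit, long remaining, DateTimeOffset resetAt)
    {
        Limit = limit;
        Remaining = remaining;
        ResetAt = resetAt;
    }

    public long Limit { get; }
    public long Remaining { get; }
    public DateTimeOffset ResetAt { get; }
}

public static class RateLimitHeaders
{
    // Returns false if any header is missing or is not a valid integer.
    public static bool TryRead(HttpHeaders headers, out RateLimitInfo info)
    {
        info = default;
        if (!TryGetLong(headers, "X-RateLimit-Limit", out var limit)
            || !TryGetLong(headers, "X-RateLimit-Remaining", out var remaining)
            || !TryGetLong(headers, "X-RateLimit-Reset", out var resetEpoch))
            return false;

        // X-RateLimit-Reset is a Unix timestamp in seconds.
        info = new RateLimitInfo(limit, remaining, DateTimeOffset.FromUnixTimeSeconds(resetEpoch));
        return true;
    }

    private static bool TryGetLong(HttpHeaders headers, string name, out long value)
    {
        value = 0;
        return headers.TryGetValues(name, out var values)
            && long.TryParse(values.FirstOrDefault(), NumberStyles.Integer,
                             CultureInfo.InvariantCulture, out value);
    }
}
```

Call it as `RateLimitHeaders.TryRead(response.Headers, out var info)` on any `HttpResponseMessage`.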

429 Too Many Requests

When the rate limit is exhausted, the API returns a 429 status code with the standard error envelope:

{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "Rate limit of 5000 requests per hour exceeded.",
    "request_id": "01JAXBKM3N4P5Q6R7S8T9UVWXY"
  }
}

The X-RateLimit-* headers are still included on 429 responses, so you can read X-RateLimit-Reset to know when to retry.
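
A client can pull the fields out of this envelope with System.Text.Json, for example to log the request_id. The following is a minimal sketch. `ApiError` and `ErrorEnvelope` are hypothetical names, and only the three fields shown above are assumed to exist.

```csharp
using System;
using System.Text.Json;

// Hypothetical model of the "error" object shown above.
public sealed class ApiError
{
    public string Code { get; set; } = "";
    public string Message { get; set; } = "";
    public string RequestId { get; set; } = "";
}

public static class ErrorEnvelope
{
    // Returns false if the body is not JSON or has no "error" object.
    public static bool TryParse(string json, out ApiError error)
    {
        error = new ApiError();
        try
        {
            using (var doc = JsonDocument.Parse(json))
            {
                var root = doc.RootElement;
                if (root.ValueKind != JsonValueKind.Object
                    || !root.TryGetProperty("error", out var err)
                    || err.ValueKind != JsonValueKind.Object)
                    return false;

                error.Code = GetString(err, "code");
                error.Message = GetString(err, "message");
                error.RequestId = GetString(err, "request_id");
                return true;
            }
        }
        catch (JsonException)
        {
            return false; // body was not valid JSON
        }
    }

    // Missing or non-string fields are returned as empty strings.
    private static string GetString(JsonElement obj, string name) =>
        obj.TryGetProperty(name, out var p) && p.ValueKind == JsonValueKind.String
            ? p.GetString() ?? ""
            : "";
}
```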

Handling Rate Limits

Recommended backoff strategy

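1. Read the X-RateLimit-Reset header from the 429 response.
2. Compute the wait time in seconds: reset_timestamp - current_timestamp. If the result is zero or negative, the window has already reset and you can retry immediately.
3. Wait that duration before retrying.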

C# example

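using System;
using System.Linq;            // for First()
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

// Assumes `httpClient` has its BaseAddress set and `content` holds the request body.
var response = await httpClient.PostAsync("/v1/jobs", content);

if (response.StatusCode == HttpStatusCode.TooManyRequests)
{
    // X-RateLimit-Reset is a Unix timestamp in seconds.
    if (response.Headers.TryGetValues("X-RateLimit-Reset", out var values)
        && long.TryParse(values.First(), out var resetEpoch))
    {
        var resetTime = DateTimeOffset.FromUnixTimeSeconds(resetEpoch);
        var delay = resetTime - DateTimeOffset.UtcNow;

        // Skip the wait if the window has already reset.
        if (delay > TimeSpan.Zero)
            await Task.Delay(delay);
    }

    // Retry the request once; the retry can itself return 429.
    // On .NET Framework, HttpClient disposes request content after sending,
    // so build a new HttpContent for the retry there.
    response.Dispose();
    response = await httpClient.PostAsync("/v1/jobs", content);
}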
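
The example above retries once. If you want a bounded number of retries, one possible generalization is sketched below. `RateLimitRetry`, `ComputeDelay`, and `SendAsync` are hypothetical names, and the 1-second fallback used when the header is missing is an arbitrary assumption.

```csharp
using System;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

public static class RateLimitRetry
{
    // Time to wait until the reset timestamp, clamped to the range [0, maxDelay].
    public static TimeSpan ComputeDelay(long resetEpochSeconds, DateTimeOffset now, TimeSpan maxDelay)
    {
        var delay = DateTimeOffset.FromUnixTimeSeconds(resetEpochSeconds) - now;
        if (delay < TimeSpan.Zero) return TimeSpan.Zero;
        return delay > maxDelay ? maxDelay : delay;
    }

    // Calls `send` until it returns a non-429 response or maxAttempts is reached.
    // `send` must build a fresh request (and content) on every call.
    public static async Task<HttpResponseMessage> SendAsync(
        Func<Task<HttpResponseMessage>> send,
        int maxAttempts,
        TimeSpan maxDelay,
        CancellationToken ct = default)
    {
        for (var attempt = 1; ; attempt++)
        {
            var response = await send();
            if (response.StatusCode != HttpStatusCode.TooManyRequests || attempt >= maxAttempts)
                return response;

            // Fallback delay if the header is missing or malformed (assumption).
            var delay = TimeSpan.FromSeconds(1);
            if (response.Headers.TryGetValues("X-RateLimit-Reset", out var values)
                && long.TryParse(values.FirstOrDefault(), out var resetEpoch))
                delay = ComputeDelay(resetEpoch, DateTimeOffset.UtcNow, maxDelay);

            response.Dispose();
            await Task.Delay(delay, ct);
        }
    }
}
```

Capping the wait with `maxDelay` protects the caller if the client clock is skewed relative to the server's reset timestamp.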

Proactive throttling

To avoid hitting the limit in the first place, monitor the X-RateLimit-Remaining header on every response. When it drops below a threshold (e.g., 10% of X-RateLimit-Limit), reduce your request rate.
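
One way to turn this into code is to compute a pause before each request once the remaining budget falls below the threshold. The sketch below spreads the remaining requests evenly over the time left until the reset. `ProactiveThrottle.SuggestedPause` is a hypothetical helper, and even spacing is only one of several reasonable policies.

```csharp
using System;

public static class ProactiveThrottle
{
    // Returns how long to pause before the next request.
    // No pause while `remaining` is at or above `threshold` (a fraction of `limit`).
    public static TimeSpan SuggestedPause(
        long limit, long remaining, DateTimeOffset resetAt, DateTimeOffset now,
        double threshold = 0.10)
    {
        if (limit <= 0 || remaining >= limit * threshold)
            return TimeSpan.Zero;

        var untilReset = resetAt - now;
        if (untilReset <= TimeSpan.Zero)
            return TimeSpan.Zero; // window has already reset

        // Spread the remaining budget evenly over the time left in the window.
        // With zero remaining, this waits for the full time until reset.
        return TimeSpan.FromTicks(untilReset.Ticks / Math.Max(remaining, 1));
    }
}
```

The inputs map directly to X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset from the most recent response.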

SDK Behavior

The Zeridion.Flare SDK surfaces rate limit information but does not perform automatic 429-aware backoff:

When a 429 response is received, the SDK throws a FlareRateLimitException with Limit, Remaining, and ResetAt properties derived from the X-RateLimit-* response headers.
The SDK's worker poll loop catches all exceptions (including FlareRateLimitException) and waits for the configured PollInterval before retrying. It does not use ResetAt to compute a longer backoff. For enqueue calls in your application code, implement your own retry/backoff logic using the ResetAt property (see the C# example above).
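
For enqueue calls, a retry wrapper built on ResetAt might look like the sketch below. It is written against a stand-in exception class so that it compiles on its own. The stand-in's name, its constructor, and the `DateTimeOffset` type of ResetAt are assumptions; in real code you would catch the SDK's FlareRateLimitException instead. `EnqueueRetry.WithBackoffAsync` is also a hypothetical helper.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Stand-in for the SDK's FlareRateLimitException (property types are assumptions).
public sealed class FlareRateLimitExceptionStandIn : Exception
{
    public FlareRateLimitExceptionStandIn(long limit, long remaining, DateTimeOffset resetAt)
        : base("Rate limit exceeded.")
    {
        Limit = limit;
        Remaining = remaining;
        ResetAt = resetAt;
    }

    public long Limit { get; }
    public long Remaining { get; }
    public DateTimeOffset ResetAt { get; }
}

public static class EnqueueRetry
{
    // Runs `enqueue`, waiting until ResetAt between attempts when rate-limited.
    // After maxAttempts the rate limit exception propagates to the caller.
    public static async Task<T> WithBackoffAsync<T>(
        Func<Task<T>> enqueue, int maxAttempts = 3, CancellationToken ct = default)
    {
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                return await enqueue();
            }
            catch (FlareRateLimitExceptionStandIn ex) when (attempt < maxAttempts)
            {
                var delay = ex.ResetAt - DateTimeOffset.UtcNow;
                if (delay > TimeSpan.Zero)
                    await Task.Delay(delay, ct);
            }
        }
    }
}
```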

See the SDK Exceptions reference for details on FlareRateLimitException.
