# Queues and Concurrency
Queues let you isolate different types of work so they don't compete for the same worker slots. Concurrency controls limit how many jobs each worker processes in parallel. Together, they give you fine-grained control over throughput and resource usage.
## Named queues

Every job is assigned to a queue. The default queue is `"default"`. Use named queues to separate workloads — fast email sends should not be blocked by slow report generation:

```csharp
[JobConfig(Queue = "email")]
public class SendWelcomeEmail : IJob<NewUserPayload> { ... }

[JobConfig(Queue = "reports")]
public class GenerateMonthlyReport : IJob<ReportPayload> { ... }

[JobConfig(Queue = "critical")]
public class ProcessPayment : IJob<PaymentPayload> { ... }
```
## Assigning a queue
You can set the queue at two levels. More specific settings override less specific ones.
### Per-class (attribute)

```csharp
[JobConfig(Queue = "email")]
public class SendWelcomeEmail : IJob<NewUserPayload>
{
    public async Task ExecuteAsync(NewUserPayload payload, JobContext ctx)
    {
        // Always enqueued to the "email" queue by default
    }
}
```
### Per-call (options)

```csharp
await jobs.EnqueueAsync<SendWelcomeEmail>(payload, new JobOptions
{
    Queue = "critical" // Overrides the class-level "email" queue
});
```
### Resolution order

| Level | How to set | Default |
|---|---|---|
| Per-call | `new JobOptions { Queue = "..." }` | — |
| Per-class | `[JobConfig(Queue = "...")]` | `"default"` |

Resolution: `JobOptions` (per-call) > `[JobConfig]` (per-class) > `"default"`.

Queue names are trimmed and normalized by the server. Max length is 100 characters.
## Worker queue binding

The SDK worker automatically polls all queues used by its registered job types. When the worker starts, it discovers which queues to listen on from the `JobTypeRegistry`.
If your application registers three job types with queues `email`, `reports`, and `critical`, the worker sends all three queue names in its `POST /v1/workers/poll` request.
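To make that concrete, such a poll request might look like the sketch below. The endpoint path and the `capacity` parameter appear elsewhere in this page; the remaining field names and the worker id are illustrative assumptions, not a documented contract:

```http
POST /v1/workers/poll
Content-Type: application/json

{
  "workerId": "worker-6f2a",
  "queues": ["email", "reports", "critical"],
  "capacity": 10
}
```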
### Queue isolation with separate workers
For strict workload isolation, deploy separate worker instances that only register specific job types:
```csharp
// Worker A: handles email jobs only
builder.Services.AddZeridionFlare(o =>
{
    o.ApiKey = "...";
    o.JobAssemblies = [typeof(SendWelcomeEmail).Assembly];
});
```

```csharp
// Worker B: handles report jobs only
builder.Services.AddZeridionFlare(o =>
{
    o.ApiKey = "...";
    o.JobAssemblies = [typeof(GenerateMonthlyReport).Assembly];
});
```

Worker A only polls the `email` queue; Worker B only polls the `reports` queue. Slow reports cannot starve email delivery.
## How polling works

The SDK worker uses long-polling to claim jobs from the API. Key behaviors:

- `SELECT ... FOR UPDATE SKIP LOCKED` — atomically claims jobs without blocking other workers. Two workers polling simultaneously never claim the same job.
- `WHERE "Queue" = ANY(@queues)` — only jobs in the worker's registered queues are considered.
- `LIMIT @capacity` — the worker only requests as many jobs as it has available concurrency slots.
- Long-polling — if no jobs are available, the server holds the connection for up to 30 seconds, checking every 2 seconds, before returning an empty response.
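Putting those clauses together, the claim query is conceptually similar to this sketch. The `"Queue"` filter, the `@queues`/`@capacity` parameters, and the `FOR UPDATE SKIP LOCKED` locking come from the behaviors above; the table name, status column, and ordering are illustrative assumptions:

```sql
-- Atomically claim up to @capacity runnable jobs from this worker's queues.
-- SKIP LOCKED means rows already locked by another worker's transaction are
-- skipped rather than waited on, so two workers never claim the same job.
SELECT *
FROM "Jobs"                        -- table name is an assumption
WHERE "Queue" = ANY(@queues)       -- only the worker's registered queues
  AND "Status" = 'Pending'         -- status column and value are assumptions
ORDER BY "CreatedAt"
LIMIT @capacity                    -- never claim more than free slots
FOR UPDATE SKIP LOCKED;
```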
## Concurrency control

`ConcurrencyLimit` controls how many jobs a single worker instance processes in parallel. It defaults to 10:

```csharp
builder.Services.AddZeridionFlare(o =>
{
    o.ApiKey = "...";
    o.ConcurrencyLimit = 5;
});
```
Under the hood, the worker uses a `SemaphoreSlim` initialized to `ConcurrencyLimit`. Before starting each job, the worker acquires a semaphore slot. When the job completes, the slot is released. The poll request's `capacity` parameter equals the number of available semaphore slots, so the worker never claims more jobs than it can handle.
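That mechanism can be sketched in a few lines; `PollAsync` and `ExecuteAsync` here are placeholders for the SDK's internals, not public API:

```csharp
// Sketch of the worker's dispatch loop, assuming ConcurrencyLimit = 5.
var gate = new SemaphoreSlim(initialCount: 5, maxCount: 5);

while (!stopping.IsCancellationRequested)
{
    // Ask the server for at most as many jobs as there are free slots.
    int capacity = gate.CurrentCount;
    IReadOnlyList<Job> jobs = await PollAsync(capacity);   // placeholder

    foreach (var job in jobs)
    {
        await gate.WaitAsync(stopping);       // take a slot before starting
        _ = Task.Run(async () =>
        {
            try { await ExecuteAsync(job); }  // placeholder
            finally { gate.Release(); }       // free the slot when done
        });
    }
}
```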
### Choosing the right limit
| Job type | Recommended ConcurrencyLimit | Rationale |
|---|---|---|
| I/O-bound (HTTP calls, email) | 10–20 | Jobs spend most time waiting; higher parallelism is safe |
| CPU-bound (image processing) | 2–4 | Jobs consume CPU; too many in parallel causes contention |
| Memory-intensive (large reports) | 2–5 | Each job uses significant memory; limit prevents OOM |
| Mixed workload | 10 (default) | Good general-purpose starting point |
## Scaling workers
Scale horizontally by deploying multiple worker instances. Each instance polls independently, and `SKIP LOCKED` ensures no double-claiming:
| Workers | ConcurrencyLimit | Max parallel jobs |
|---|---|---|
| 1 | 10 | 10 |
| 2 | 10 | 20 |
| 3 | 10 | 30 |
| 5 | 20 | 100 |
### Scaling strategies
**Uniform scaling** — all workers process all job types. Simple to deploy, good for balanced workloads:

```text
Worker Instance 1: all queues, ConcurrencyLimit = 10
Worker Instance 2: all queues, ConcurrencyLimit = 10
Worker Instance 3: all queues, ConcurrencyLimit = 10
```
**Queue-isolated scaling** — dedicated workers per queue. Scale each workload independently:

```text
Email Workers (3 instances):   queue = "email",    ConcurrencyLimit = 20
Report Workers (1 instance):   queue = "reports",  ConcurrencyLimit = 2
Payment Workers (2 instances): queue = "critical", ConcurrencyLimit = 5
```
Use Azure Container Apps or Kubernetes to auto-scale worker replicas based on queue depth metrics.
## Queue depth monitoring

`GET /v1/metrics/queues` returns the current depth of each queue:
```json
{
  "queues": [
    {
      "name": "default",
      "pending": 45,
      "processing": 10,
      "scheduled": 3
    },
    {
      "name": "email",
      "pending": 120,
      "processing": 15,
      "scheduled": 0
    }
  ]
}
```
| Field | Description |
|---|---|
| `pending` | Jobs waiting to be claimed by a worker |
| `processing` | Jobs currently being executed by a worker |
| `scheduled` | Jobs with a future `RunAt` or waiting for a parent to complete |
### Backlog detection
A growing pending count means jobs are arriving faster than workers can process them. Possible responses:
- Increase `ConcurrencyLimit` — if workers have idle CPU/memory
- Add worker instances — horizontal scaling
- Investigate slow jobs — a single slow job type may be consuming all worker slots
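A minimal backlog probe can poll the metrics endpoint and flag queues whose `pending` count keeps rising. The DTO shapes mirror the response documented above; the base address, sampling interval, and alert logic are illustrative:

```csharp
using System.Net.Http.Json;

// Mirrors the GET /v1/metrics/queues response shape shown above.
record QueueMetrics(string Name, int Pending, int Processing, int Scheduled);
record QueueMetricsResponse(List<QueueMetrics> Queues);

var http = new HttpClient { BaseAddress = new Uri("https://flare.example.com") }; // assumed host
var previous = new Dictionary<string, int>();

while (true)
{
    var snapshot = await http.GetFromJsonAsync<QueueMetricsResponse>("/v1/metrics/queues");
    foreach (var q in snapshot!.Queues)
    {
        // A queue whose pending count rises between samples is falling behind.
        if (previous.TryGetValue(q.Name, out var last) && q.Pending > last)
            Console.WriteLine($"Backlog growing on '{q.Name}': {last} -> {q.Pending}");
        previous[q.Name] = q.Pending;
    }
    await Task.Delay(TimeSpan.FromMinutes(1));
}
```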
### Autoscaling with KEDA
On Azure Container Apps with KEDA, you can scale worker replicas based on queue depth by polling the metrics endpoint and feeding it into a custom KEDA scaler.
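As one possible setup, KEDA's generic `metrics-api` scaler can read a queue's `pending` count straight from the metrics endpoint. Everything below (resource names, URL, GJSON path, target value) is an illustrative sketch, not a shipped integration:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: email-worker-scaler
spec:
  scaleTargetRef:
    name: email-worker            # the worker Deployment to scale
  minReplicaCount: 1
  maxReplicaCount: 10
  triggers:
    - type: metrics-api
      metadata:
        url: "https://flare.example.com/v1/metrics/queues"
        valueLocation: 'queues.#(name=="email").pending'  # GJSON path to the pending count
        targetValue: "50"         # roughly one replica per 50 pending jobs
```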
## Poll interval

`PollInterval` controls how long the worker waits between poll cycles when the previous poll returned no jobs:

```csharp
builder.Services.AddZeridionFlare(o =>
{
    o.PollInterval = TimeSpan.FromSeconds(5); // Default: 2s
});
```
Lower values increase responsiveness (faster job pickup) but increase API call volume. The default of 2 seconds is a good balance for most workloads.
The poll interval only applies when the long-poll returns empty. When jobs are available, the worker polls again immediately after processing the claimed batch.
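That behavior amounts to a loop like this sketch, where `PollAsync` and `Dispatch` stand in for the SDK's internals:

```csharp
while (!stopping.IsCancellationRequested)
{
    var jobs = await PollAsync();   // long-poll; server holds up to 30s

    if (jobs.Count > 0)
    {
        Dispatch(jobs);             // hand off to the concurrency gate
        continue;                   // jobs were available: poll again immediately
    }

    // Empty response: wait PollInterval before the next cycle.
    await Task.Delay(options.PollInterval, stopping);
}
```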
## Graceful shutdown
When the host shuts down (e.g., SIGTERM from a container orchestrator), the worker:
- Stops accepting new poll requests
- Waits for all in-flight jobs to complete
- Acks each completed job (success or failure)
- Exits cleanly
This prevents jobs from being orphaned mid-execution. The server's stuck job reaper (heartbeat timeout) provides a safety net for cases where the worker crashes without completing the shutdown sequence.
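In a .NET host, that sequence maps naturally onto a hosted service's `StopAsync`; this outline is a hedged sketch of how such a shutdown could be wired, not the SDK's actual implementation:

```csharp
public async Task StopAsync(CancellationToken hostShutdown)
{
    _pollLoopCts.Cancel();              // 1. stop issuing new poll requests

    // 2. wait for all in-flight job tasks to complete
    await Task.WhenAll(_inFlightJobs);

    // 3. each job's execution pipeline acks success/failure in its
    //    finally block, so by this point every outcome is reported

    // 4. return so the host can exit cleanly
}
```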
## Best practices
- **Use descriptive queue names** — `email`, `reports`, `billing`, `imports` are immediately meaningful. Avoid generic names like `queue1`.
- **Isolate long-running jobs** — put slow jobs (report generation, data imports) in their own queue so they don't block fast jobs (email sends, webhook deliveries).
- **Match concurrency to resource requirements** — CPU-bound jobs need lower concurrency than I/O-bound jobs. Start with the default (10) and adjust based on monitoring.
- **Monitor queue depth** — track `pending` counts via `GET /v1/metrics/queues`. Rising backlogs mean you need more workers or faster jobs.
- **Scale horizontally, not just vertically** — adding worker instances is generally more effective than increasing `ConcurrencyLimit` beyond 20, because each instance gets its own process memory and CPU scheduling.
## See also

- ZeridionFlareOptions — `ConcurrencyLimit`, `PollInterval`, `DefaultQueue`
- JobOptions — per-call `Queue` override
- JobConfigAttribute — per-class `Queue` default
- Workers API — poll, ack, register, heartbeat endpoints
- Monitoring — metrics API and health endpoints