Workers API

The Workers API is used by the SDK's background worker to claim jobs, report results, send heartbeats, and register on startup. These endpoints form the execution side of the job lifecycle.

Base URL: https://api.zeridion.com

All endpoints require Bearer token authentication and are subject to rate limits.

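Worker Lifecycle

A worker registers once on startup, then polls for jobs in a loop. While a job runs, the worker sends heartbeats, and when the job finishes it acks the result as succeeded or failed.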

POST /v1/workers/poll

Claim available jobs from the specified queues. The worker calls this in a loop to pick up new work.

Request

POST /v1/workers/poll
Authorization: Bearer <api_key>
Content-Type: application/json

Body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| worker_id | string | Yes | Unique identifier for this worker instance |
| queues | string[] | Yes | List of queue names to poll from |
| capacity | integer | No | Maximum number of jobs to claim in one call (1–50). Default: 1 |
| job_types | string[] | No | Only claim jobs matching these types |

Example

{
  "worker_id": "wrk_local_dev_1",
  "queues": ["default", "email"],
  "capacity": 5,
  "job_types": ["email.send", "report.generate"]
}

Response

200 OK

{
  "jobs": [
    {
      "id": "job_01HYX3K7M8N9P2Q4R5S6T7U8V9",
      "job_type": "email.send",
      "payload": { "to": "user@example.com", "subject": "Welcome!" },
      "attempt": 1,
      "max_attempts": 5,
      "timeout_seconds": 60,
      "enqueued_at": "2026-03-18T15:30:00Z"
    }
  ]
}

Job item fields

| Field | Type | Description |
| --- | --- | --- |
| id | string | Job identifier; pass to ack and heartbeat |
| job_type | string | Job type for routing to the correct handler |
| payload | object | JSON payload to pass to the job |
| attempt | integer | Which attempt this is (1-based) |
| max_attempts | integer | Maximum attempts before dead-lettering |
| timeout_seconds | integer | Per-attempt timeout in seconds |
| enqueued_at | string (ISO 8601) | When the job was originally enqueued |

Behavior

  • Uses SELECT ... FOR UPDATE SKIP LOCKED for atomic, race-free claiming across multiple worker instances.
  • The server holds the request for up to 30 seconds (long polling), checking for available jobs every 2 seconds. If no work is found before the deadline, the response contains an empty jobs array. This avoids excessive polling traffic while keeping pickup latency to roughly the 2-second check interval.
  • The SDK manages the client-side poll loop via ZeridionFlareOptions.PollInterval (default 2 seconds between poll requests).
  • Claimed jobs transition from pending to processing and have their worker_id set.
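
The request and response shapes above can be sketched in Python as follows. This is a minimal illustration, not the SDK's implementation: the helper names build_poll_request and job_ids are hypothetical, and the client-side capacity check simply mirrors the documented 1–50 range (the server may validate differently).

```python
from typing import List, Optional, Sequence


def build_poll_request(
    worker_id: str,
    queues: Sequence[str],
    capacity: int = 1,
    job_types: Optional[Sequence[str]] = None,
) -> dict:
    """Build the JSON body for POST /v1/workers/poll (hypothetical helper)."""
    # Mirror the documented range so obvious mistakes fail before the request is sent.
    if not 1 <= capacity <= 50:
        raise ValueError("capacity must be between 1 and 50")
    body = {
        "worker_id": worker_id,
        "queues": list(queues),
        "capacity": capacity,
    }
    # job_types is optional; leave it out entirely when no filter is wanted.
    if job_types is not None:
        body["job_types"] = list(job_types)
    return body


def job_ids(poll_response: dict) -> List[str]:
    """Return the IDs of claimed jobs; an empty list means no work was found."""
    return [job["id"] for job in poll_response.get("jobs", [])]
```

An empty result from job_ids corresponds to a long poll that reached its deadline without finding work, so the caller would simply issue the next poll.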

POST /v1/workers/ack

Acknowledge job completion or failure. The worker calls this after executing a job.

Request

POST /v1/workers/ack
Authorization: Bearer <api_key>
Content-Type: application/json

Body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| job_id | string | Yes | ID of the job being acknowledged |
| worker_id | string | Yes | ID of the worker that processed the job |
| status | string | Yes | "succeeded" or "failed" |
| duration_ms | integer | No | How long execution took in milliseconds |
| error | object | No | Error details when status is "failed" (see error detail) |

Error detail object

| Field | Type | Description |
| --- | --- | --- |
| type | string | Exception type name |
| message | string | Error message |
| stack_trace | string | Stack trace of the failure |

Success example

{
  "job_id": "job_01HYX3K7M8N9P2Q4R5S6T7U8V9",
  "worker_id": "wrk_local_dev_1",
  "status": "succeeded",
  "duration_ms": 1042
}

Failure example

{
  "job_id": "job_01HYX3K7M8N9P2Q4R5S6T7U8V9",
  "worker_id": "wrk_local_dev_1",
  "status": "failed",
  "duration_ms": 503,
  "error": {
    "type": "System.Net.Http.HttpRequestException",
    "message": "Connection refused",
    "stack_trace": " at System.Net.Http..."
  }
}
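
As an illustration of how a non-.NET worker might produce the two bodies above, the sketch below builds an ack payload from an optional exception. The build_ack helper is hypothetical, and formatting the type as module plus class name is an assumption about what a reasonable "exception type name" looks like in Python.

```python
import traceback
from typing import Optional


def build_ack(
    job_id: str,
    worker_id: str,
    duration_ms: int,
    exc: Optional[BaseException] = None,
) -> dict:
    """Build the JSON body for POST /v1/workers/ack (hypothetical helper)."""
    body = {
        "job_id": job_id,
        "worker_id": worker_id,
        "status": "succeeded" if exc is None else "failed",
        "duration_ms": int(duration_ms),
    }
    if exc is not None:
        # The error object is only attached for failed jobs.
        body["error"] = {
            # Assumption: module-qualified class name as the type string.
            "type": f"{type(exc).__module__}.{type(exc).__qualname__}",
            "message": str(exc),
            "stack_trace": "".join(
                traceback.format_exception(type(exc), exc, exc.__traceback__)
            ),
        }
    return body
```

A handler wrapper would typically time the job, catch any exception, and pass it to build_ack so that success and failure go through the same code path.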

Response

200 OK

{
  "action": "retry",
  "retry_at": "2026-03-18T15:31:00Z"
}

| Field | Type | Description |
| --- | --- | --- |
| action | string | "done" (job is complete or dead-lettered) or "retry" (will be retried) |
| retry_at | string (ISO 8601) | When the next attempt is scheduled. Only present when action is "retry". |
| children_activated | integer | Number of continuation jobs activated on success. Omitted from the response when no children exist (due to WhenWritingNull JSON serialization). |

Behavior

On success (status: "succeeded"):

  • Job transitions to succeeded, completed_at is set.
  • Any child continuation jobs in scheduled state are activated (moved to pending).
  • Response: action: "done". The children_activated field is included with the count; if no children exist, the field is omitted entirely.

On failure with retries remaining (status: "failed", attempt < max_attempts):

  • Job transitions back to pending with a future run_at using exponential backoff.
  • Backoff formula: 15 * 2^(attempt - 1) seconds (15s, 30s, 60s, 120s...) plus 0–3 seconds of random jitter.
  • Worker assignment is cleared so any worker can pick it up.
  • Response: action: "retry", retry_at: "...".
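
The backoff formula can be checked with a few lines of Python. This is a sketch of the arithmetic only; the function names are hypothetical, and drawing the jitter from a continuous uniform distribution is an assumption, since the text only states a 0–3 second range.

```python
import random

BASE_DELAY_SECONDS = 15
MAX_JITTER_SECONDS = 3


def base_backoff_seconds(attempt: int) -> int:
    """Deterministic part of the delay: 15 * 2^(attempt - 1)."""
    if attempt < 1:
        raise ValueError("attempt is 1-based")
    return BASE_DELAY_SECONDS * 2 ** (attempt - 1)


def backoff_seconds(attempt: int) -> float:
    """Base delay plus random jitter (uniform jitter is an assumption)."""
    return base_backoff_seconds(attempt) + random.uniform(0, MAX_JITTER_SECONDS)


# Base delays for the first four failed attempts: 15, 30, 60, 120 seconds.
first_four = [base_backoff_seconds(a) for a in range(1, 5)]
```

Because the delay doubles on every attempt, a job with max_attempts of 5 waits at least 15 + 30 + 60 + 120 = 225 seconds in total across its four retries.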

On failure with no retries remaining (status: "failed", attempt >= max_attempts):

  • Job transitions to dead_letter, completed_at is set.
  • Any child continuation jobs are cancelled.
  • Response: action: "done".
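
The three cases above reduce to a small decision table. The sketch below restates it as a pure function that a client could use to predict the response; expected_outcome is a hypothetical name, and the actual decision is made by the server.

```python
from typing import Tuple


def expected_outcome(status: str, attempt: int, max_attempts: int) -> Tuple[str, str]:
    """Return (new job state, response action) for an ack, per the rules above."""
    if status == "succeeded":
        return ("succeeded", "done")
    if status == "failed":
        if attempt < max_attempts:
            # Retries remain: the job goes back to pending with a future run_at.
            return ("pending", "retry")
        # No retries remain: the job is dead-lettered and its children are cancelled.
        return ("dead_letter", "done")
    raise ValueError('status must be "succeeded" or "failed"')
```

Note that action "done" alone does not distinguish success from dead-lettering; the caller already knows which status it sent.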

Errors

| Status | Code | Condition |
| --- | --- | --- |
| 400 | invalid_request | Validation failed (see ack validation) |
| 404 | job_not_found | Job does not exist |
| 409 | worker_mismatch | worker_id does not match the worker that claimed the job |
| 409 | invalid_state | Job is not in processing state (e.g., already acked) |

POST /v1/workers/heartbeat

Send a keep-alive signal for a job that is currently being processed. Heartbeats prevent the stuck-job reaper from reclaiming long-running jobs, and also deliver cancellation signals to the worker.

Request

POST /v1/workers/heartbeat
Authorization: Bearer <api_key>
Content-Type: application/json

Body

FieldTypeRequiredDescription
job_idstringYesID of the job being processed
worker_idstringYesID of the worker processing the job
{
"job_id": "job_01HYX3K7M8N9P2Q4R5S6T7U8V9",
"worker_id": "wrk_local_dev_1"
}

Response

200 OK

{ "status": "ok" }

| status value | Meaning |
| --- | --- |
| "ok" | Continue processing normally |
| "cancel" | The job has been cancelled; abort execution and ack as failed |

Behavior

  • The SDK sends heartbeats every timeout_seconds / 3 seconds (minimum 10 seconds).
  • A StuckJobReaperService scans for jobs whose LastHeartbeatAt is stale beyond a grace period. Stuck jobs are retried or dead-lettered.
  • The "cancel" status is reserved for future use. Currently, POST /v1/jobs/{job_id}/cancel only accepts jobs in pending or scheduled state and returns 409 for processing jobs.
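
The heartbeat cadence and the handling of the response status can be sketched as below. Both function names are hypothetical, and whether the SDK rounds the interval or uses fractional seconds is not stated, so the plain division here is an assumption.

```python
MIN_HEARTBEAT_INTERVAL_SECONDS = 10.0


def heartbeat_interval_seconds(timeout_seconds: int) -> float:
    """timeout_seconds / 3, but never more often than every 10 seconds."""
    return max(MIN_HEARTBEAT_INTERVAL_SECONDS, timeout_seconds / 3)


def should_abort(heartbeat_response: dict) -> bool:
    """True when the server asks the worker to stop processing the job."""
    # "cancel" is documented as reserved for future use; handling it now
    # keeps the worker forward-compatible.
    return heartbeat_response.get("status") == "cancel"
```

For the 60-second timeout in the poll example, this gives a heartbeat every 20 seconds. For timeouts under 30 seconds the 10-second minimum applies.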

Errors

| Status | Code | Condition |
| --- | --- | --- |
| 404 | job_not_found | Job does not exist |
| 409 | invalid_state | Job is not in processing state |
| 409 | worker_mismatch | worker_id does not match the worker that claimed the job |

POST /v1/workers/register

Register a worker on startup. The SDK calls this once when FlareWorkerService starts. Registration logs worker metadata and optionally upserts recurring job schedules.

Request

POST /v1/workers/register
Authorization: Bearer <api_key>
Content-Type: application/json

Body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| worker_id | string | Yes | Unique identifier for this worker instance |
| queues | string[] | Yes | Queues this worker will poll |
| job_types | string[] | Yes | Job types this worker can handle |
| hostname | string | No | Machine hostname for diagnostics |
| sdk_version | string | No | SDK version string |
| recurring_schedules | array | No | Recurring job definitions to auto-register |

recurring_schedules items

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| job_type | string | Yes | Job type for the recurring job |
| cron_expression | string | Yes | Cron schedule expression |
| queue | string | No | Target queue |
| timezone | string | No | IANA timezone |
| max_attempts | integer | No | Per-execution retry limit |
| timeout_seconds | integer | No | Per-execution timeout |

Example

{
  "worker_id": "wrk_myhost_12345_abc",
  "queues": ["default", "critical"],
  "job_types": ["email.send", "report.generate"],
  "hostname": "myhost",
  "sdk_version": "0.1.0-beta.1",
  "recurring_schedules": [
    {
      "job_type": "MyApp.Jobs.CleanupExpiredSessions",
      "cron_expression": "0 3 * * *",
      "queue": "maintenance"
    }
  ]
}

Response

200 OK

{
  "status": "registered"
}

Behavior

  • Worker metadata is logged for diagnostics.
  • If recurring_schedules is provided, each schedule is upserted as a recurring job with an ID derived from the job type (e.g., rjob_CleanupExpiredSessions). This enables auto-registration of recurring jobs from SDK [JobConfig(CronSchedule = "...")] attributes.
  • If an individual recurring schedule upsert fails (e.g., invalid cron expression), the failure is logged but the registration still returns 200 with "status": "registered". Check server logs if schedules are not appearing as expected.
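
A registration body with recurring schedules could be assembled as in the sketch below. The helper names recurring_schedule and build_register_request are hypothetical; the only behavior taken from this page is which fields are required and that optional fields may be left out.

```python
from typing import List, Optional, Sequence


def _without_none(fields: dict) -> dict:
    """Drop optional fields that were not supplied."""
    return {key: value for key, value in fields.items() if value is not None}


def recurring_schedule(
    job_type: str,
    cron_expression: str,
    queue: Optional[str] = None,
    timezone: Optional[str] = None,
    max_attempts: Optional[int] = None,
    timeout_seconds: Optional[int] = None,
) -> dict:
    """Build one recurring_schedules item (hypothetical helper)."""
    return _without_none({
        "job_type": job_type,
        "cron_expression": cron_expression,
        "queue": queue,
        "timezone": timezone,
        "max_attempts": max_attempts,
        "timeout_seconds": timeout_seconds,
    })


def build_register_request(
    worker_id: str,
    queues: Sequence[str],
    job_types: Sequence[str],
    hostname: Optional[str] = None,
    sdk_version: Optional[str] = None,
    recurring_schedules: Optional[List[dict]] = None,
) -> dict:
    """Build the JSON body for POST /v1/workers/register (hypothetical helper)."""
    return _without_none({
        "worker_id": worker_id,
        "queues": list(queues),
        "job_types": list(job_types),
        "hostname": hostname,
        "sdk_version": sdk_version,
        "recurring_schedules": recurring_schedules,
    })
```

Because a failed schedule upsert still returns 200, a client built this way cannot detect an invalid cron expression from the response alone, so validating expressions before sending them may be worthwhile.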