2xx status, times out, or is unreachable — Virtuous retries the delivery. Retries are the safety net that keeps your integration consistent with the source organization’s data when transient failures occur, but they come with two costs: duplicate deliveries to your endpoint (covered on Idempotency and Safe Reprocessing), and a finite number of attempts before the event is permanently dropped.
This page covers what triggers a retry, the retry schedule, what counts as a final failure, and the patterns that minimize retry pressure on your endpoint.
What triggers a retry
Whether Virtuous retries a webhook delivery depends on the outcome of the attempt:

| Condition | Retry triggered? |
|---|---|
| Endpoint returns 2xx (200, 201, 202, 204) | No. Delivery succeeded. |
| Endpoint returns 4xx (400, 401, 403, 404, etc.) | Yes, though 4xx typically indicates a client-side problem that won’t resolve on retry. |
| Endpoint returns 5xx (500, 502, 503, 504) | Yes. Server-side errors are typically transient. |
| Endpoint connection refused or unreachable | Yes. Connection-level failures are typically transient. |
| Endpoint times out before responding | Yes. Timeouts are treated as failed deliveries. |
| Endpoint returns invalid HTTPS certificate | Yes (in some configurations) or no (delivery rejected outright). |
Returning a 4xx to deliberately reject an event (for example, returning 401 for a signature verification failure) does not prevent retries. Virtuous will retry the same payload on the same schedule as for 5xx failures. If your endpoint is rejecting events for reasons that won’t resolve on retry (a permanently invalid signature, an unhandled event type), the retries are wasted on both sides. Consider returning 2xx for events you intentionally drop, and tracking those drops in your own logs rather than relying on the 4xx semantics.
The retry schedule
Virtuous retries follow a delayed-attempts pattern: each retry is separated from the previous one by a longer interval, giving transient failures time to recover. The typical industry pattern is exponential backoff with a cap, with retries continuing over a window of hours to days before giving up. While the exact schedule is being confirmed, the safe assumptions for partner integrations are:
- Retries continue for at least several hours. A transient outage of an hour or two on your endpoint should not cause permanent event loss.
- Maximum attempts are bounded. Eventually retries stop and the event is no longer redelivered.
- The window between attempts grows. Early retries are minutes apart; later retries may be hours apart.
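To make the general shape concrete, here is a minimal sketch of exponential backoff with a cap. It illustrates the industry pattern only. The base delay, growth factor, cap, and retry count are assumed values chosen for the example, not Virtuous's confirmed schedule.

```javascript
// Illustrative only: baseMinutes, factor, capMinutes, and maxRetries are
// assumed values, not Virtuous's confirmed retry schedule.
function backoffSchedule({ baseMinutes = 1, factor = 2, capMinutes = 240, maxRetries = 10 } = {}) {
  const delays = [];
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    // The delay grows geometrically until it reaches the cap.
    delays.push(Math.min(baseMinutes * factor ** attempt, capMinutes));
  }
  return delays;
}

// Delay (in minutes) before each retry, and the total window they span.
const delays = backoffSchedule();
const totalMinutes = delays.reduce((sum, d) => sum + d, 0);
console.log(delays, `total window: ${totalMinutes} minutes`);
```

Under these assumed parameters the early retries are minutes apart and the later ones are hours apart, which matches the qualitative behavior described above.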
Final failure — when retries stop
After the maximum number of retries, Virtuous stops attempting delivery. The event is no longer queued and will not be delivered later, even if your endpoint comes back online.
What to do after a final-failure event
If your endpoint missed events during an extended outage, the reconciliation path depends on whether Virtuous surfaces failed deliveries:
- If failed deliveries are accessible: poll the failed-delivery endpoint or UI after recovering from the outage, then replay the missed events through your normal handler.
- If failed deliveries are silently dropped: fall back to a polled reconciliation query against the source resource. For each resource type your webhook subscribes to, run a Query with modifiedDateTimeUtc > outage_start_time and process the results as if they had arrived via webhook. See Reconcile Failed Syncs.
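The polled fallback can be sketched as a loop over resource types. This is a sketch under stated assumptions: runQuery and handleRecord are hypothetical functions you supply, not Virtuous API calls. runQuery is assumed to wrap whatever Query request returns every record with modifiedDateTimeUtc later than the given timestamp, including any paging.

```javascript
// Hypothetical helpers: runQuery and handleRecord are caller-supplied
// placeholders, not functions provided by Virtuous.
async function reconcileSince(outageStart, resourceTypes, { runQuery, handleRecord }) {
  const since = outageStart.toISOString();
  let processed = 0;
  for (const resourceType of resourceTypes) {
    // Assumed contract: returns all records with modifiedDateTimeUtc > since.
    const records = await runQuery(resourceType, since);
    for (const record of records) {
      // Reuse the same idempotent handler that the webhook path uses.
      await handleRecord(resourceType, record);
      processed++;
    }
  }
  return processed;
}
```

Because some of the queried records may also have arrived by webhook before or after the outage, the handler needs to be idempotent for this to be safe.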
Inactive subscriptions are not queued
A separate failure mode worth highlighting: when a webhook subscription is deactivated (via PUT /api/Webhook/{webhookId}/Active?active=false) or deleted entirely, events occurring while the subscription is inactive are dropped, not queued for delivery when the subscription is reactivated.
The right pattern during a planned outage: leave the subscription active, let Virtuous retry deliveries to your unavailable endpoint, and recover automatically when your endpoint comes back online. Deactivating the subscription should only be done when you intend to permanently stop receiving the event.
Designing a receiver that minimizes retries
The most effective way to reduce retry pressure is to make your endpoint reliably acknowledge deliveries within the timeout window. Three patterns help.
1. Acknowledge immediately, process asynchronously
The webhook handler’s only synchronous responsibility is to verify the signature and enqueue the event for processing. Everything else (database writes, downstream API calls, side effects) runs in a background worker.
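A minimal, framework-agnostic sketch of that handler follows. The names verifySignature and enqueue are hypothetical functions you would supply, and the status codes are one reasonable choice rather than a requirement. The sketch acknowledges only after the enqueue has succeeded; if the enqueue fails it returns a 5xx so the delivery is retried.

```javascript
// Sketch only: verifySignature and enqueue are hypothetical, caller-supplied
// functions. Nothing here is part of a Virtuous SDK.
async function handleDelivery(rawBody, headers, { verifySignature, enqueue }) {
  if (!verifySignature(rawBody, headers)) {
    // Note: Virtuous still retries deliveries that receive a 4xx.
    return { status: 401 };
  }
  try {
    // The only synchronous work: hand the raw payload to the queue.
    await enqueue(rawBody);
  } catch (err) {
    // The event was not recorded, so do not acknowledge it.
    console.error('enqueue failed', err);
    return { status: 503 };
  }
  // Acknowledge; a background worker does the real processing later.
  return { status: 202 };
}
```

In an HTTP framework, the route would pass the raw request body and headers to handleDelivery and send back the returned status without waiting on any further work.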
2. Use a durable queue
The enqueue step needs to be reliable enough that you can acknowledge confidently. If the enqueue fails, your acknowledgement is a lie: you told Virtuous you received the event, but you have no record of it. Use a managed queue with strong delivery guarantees: AWS SQS, Google Cloud Tasks, GCP Pub/Sub, Redis Streams with persistence, or your platform’s equivalent. In-memory queues (e.g., setImmediate(processEvent), an in-process worker pool) are not durable. If your process crashes between acknowledging and processing, the event is lost.
3. Return 2xx for events you intentionally skip
If your handler decides not to act on a particular event (an unsubscribed event type, an event for a customer you no longer service, a formSubmission for a form you don’t track), return 2xx and log the decision rather than returning a 4xx. The 4xx triggers retries that will keep failing — wasted load on both sides.
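One way to express the skip decision is a small function that maps an event to a response. This is a sketch: the eventType field and the event type names in HANDLED_EVENT_TYPES are assumptions made for illustration, so check them against the payloads your subscription actually receives.

```javascript
// Assumed for illustration: the payload exposes an eventType field, and these
// are the types this integration acts on. Both are hypothetical.
const HANDLED_EVENT_TYPES = new Set(['contactUpdate', 'giftCreate']);

function decideResponse(event, log = console.info) {
  if (!HANDLED_EVENT_TYPES.has(event.eventType)) {
    // Intentional skip: acknowledge so the delivery is not retried,
    // and keep a record of the decision in your own logs.
    log(`skipping unhandled event type: ${event.eventType}`);
    return { status: 200, action: 'skip' };
  }
  return { status: 202, action: 'enqueue' };
}
```

Keeping the decision in one place makes the skip rate easy to log and review, which replaces the signal you would otherwise have gotten from 4xx responses.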
Monitoring retry health
Two metrics are worth tracking on a webhook receiver:

| Metric | Why |
|---|---|
| Endpoint response time | Sustained increases predict timeout-driven retries. Alert when the 95th percentile approaches the delivery timeout. |
| Non-2xx response rate | Increases predict failed deliveries that will be retried. Alert when the rate exceeds a small baseline. |
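Both metrics can be computed from a window of recent request samples. The sketch below assumes each sample is an object with durationMs and status fields, which is a hypothetical shape, and uses the nearest-rank method for the 95th percentile. In practice your metrics platform may compute these for you.

```javascript
// Assumed sample shape (hypothetical): { durationMs: number, status: number }
function summarizeSamples(samples) {
  if (samples.length === 0) {
    return { p95Ms: 0, non2xxRate: 0 };
  }
  const durations = samples.map((s) => s.durationMs).sort((a, b) => a - b);
  // Nearest-rank percentile: the value at position ceil(0.95 * n), 1-indexed.
  const rank = Math.ceil(0.95 * durations.length);
  const p95Ms = durations[rank - 1];
  const non2xx = samples.filter((s) => s.status < 200 || s.status >= 300).length;
  return { p95Ms, non2xxRate: non2xx / samples.length };
}
```

An alert would then compare p95Ms against the delivery timeout and non2xxRate against your chosen baseline; both thresholds are values you need to set yourself.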
Where to go next
Idempotency and Safe Reprocessing
Retries mean duplicate deliveries. Your handler must produce the same result on a second delivery as on the first.
Reconcile Failed Syncs
The polled-reconciliation pattern that catches events lost during an extended outage.
Local Testing
Test your retry-handling code locally by simulating slow and failing responses.
Signature Verification
Verification failures trigger retries — make sure your verifier is correct before going live.