Retry Behavior - Virtuous API Docs

When a webhook delivery fails — your endpoint returns a non-2xx status, times out, or is unreachable — Virtuous retries the delivery. Retries are the safety net that keeps your integration consistent with the source organization’s data when transient failures occur, but they come with two costs: duplicate deliveries to your endpoint (covered on Idempotency and Safe Reprocessing), and a finite number of attempts before the event is permanently dropped. This page covers what triggers a retry, the retry schedule, what counts as a final failure, and the patterns that minimize retry pressure on your endpoint.

The exact retry schedule, maximum attempt count, and final-failure handling are not documented in the CRM+ OpenAPI spec. The patterns described on this page reflect industry-standard webhook retry behavior and conservative recommendations for receiver design. Before relying on any specific retry guarantee in production, confirm the live behavior with Virtuous engineering.

What triggers a retry

The table below shows which endpoint responses cause Virtuous to retry a webhook delivery:

Condition	Retry triggered?
Endpoint returns `2xx` (`200`, `201`, `202`, `204`)	No — delivery succeeded.
Endpoint returns `4xx` (`400`, `401`, `403`, `404`, etc.)	Yes — though `4xx` typically indicates a client-side problem that won’t resolve on retry.
Endpoint returns `5xx` (`500`, `502`, `503`, `504`)	Yes — server-side errors are typically transient.
Endpoint connection refused or unreachable	Yes — connection-level failures are typically transient.
Endpoint times out before responding	Yes — timeouts are treated as failed deliveries.
Endpoint presents an invalid HTTPS certificate	Depends on configuration: the delivery may be retried, or it may be rejected outright with no retry.

Returning a 4xx to deliberately reject an event — for example, returning 401 for a signature verification failure — does not prevent retries. Virtuous will retry the same payload on the same schedule as for 5xx failures. If your endpoint is rejecting events for reasons that won’t resolve on retry (a permanently invalid signature, an unhandled event type), the retries are wasted on both sides. Consider returning 2xx for events you intentionally drop, and tracking those drops in your own logs rather than relying on the 4xx semantics.

The retry schedule

Virtuous retries follow a delayed-attempts pattern — each retry is separated from the previous by a longer interval, giving transient failures time to recover. The typical industry pattern is exponential backoff with a cap, with retries continuing over a window of hours to days before giving up.

The specific retry intervals, total duration, and maximum attempt count used by CRM+ are not published. Until they are documented, design your integration to be resilient against either an aggressive schedule (many retries in a short window) or a sparse schedule (few retries spread over a longer window).

Until the exact schedule is confirmed, partner integrations can safely assume the following:

Retries continue for at least several hours. A transient outage of an hour or two on your endpoint should not cause permanent event loss.
Maximum attempts are bounded. Eventually retries stop and the event is no longer redelivered.
The window between attempts grows. Early retries are minutes apart; later retries may be hours apart.
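
To make these assumptions concrete, the following is a minimal sketch of a capped exponential backoff schedule. The base delay, growth factor, cap, and attempt count are illustrative values only, not the CRM+ schedule (which is unpublished), and `backoffDelays` is a hypothetical helper for estimating how long an outage your receiver might need to tolerate.

```javascript
// Illustrative only: these parameters are assumptions, not Virtuous's actual schedule.
function backoffDelays({ baseMinutes, factor, capMinutes, attempts }) {
  const delays = [];
  for (let i = 0; i < attempts; i++) {
    // Each delay grows by `factor`, but never exceeds the cap.
    delays.push(Math.min(baseMinutes * Math.pow(factor, i), capMinutes));
  }
  return delays;
}

const delays = backoffDelays({ baseMinutes: 1, factor: 4, capMinutes: 240, attempts: 6 });
const totalMinutes = delays.reduce((sum, d) => sum + d, 0);
console.log(delays);       // [ 1, 4, 16, 64, 240, 240 ] minutes between attempts
console.log(totalMinutes); // 565 minutes, a bit over nine hours in total
```

Under these made-up parameters, an outage of an hour or two would be covered by the later retries, while an outage longer than the total window would lose events. Substitute the real values once Virtuous confirms them.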

Final failure — when retries stop

After the maximum number of retries, Virtuous stops attempting delivery. The event is no longer queued and will not be delivered later, even if your endpoint comes back online.

What happens to events that fail all retries is not documented. Industry-standard options include: silently dropping the event, surfacing the failed delivery in a UI for manual inspection, or exposing a “failed deliveries” endpoint that the partner can poll.

What to do after a final-failure event

If your endpoint missed events during an extended outage, the reconciliation path depends on whether Virtuous surfaces failed deliveries:

If failed deliveries are accessible: after recovering from the outage, retrieve the missed events from the failed-delivery endpoint or UI and replay them through your normal handler.
If failed deliveries are silently dropped: fall back to a polled reconciliation query against the source resource. For each resource type your webhook subscribes to, run a Query with modifiedDateTimeUtc > outage_start_time and process the results as if they had arrived via webhook. See Reconcile Failed Syncs.

The polling-fallback pattern is the safer default — it works regardless of whether the platform exposes failed deliveries. Most partner integrations should implement it as a periodic safety net even when webhooks are functioning normally.
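
The polling fallback can be sketched roughly as follows. This is a hedged outline, not the Virtuous API: `queryModifiedSince` is a hypothetical function you would implement on top of the Query call for each resource type, and its argument names (`modifiedAfterUtc`, `skip`, `take`) and array return shape are assumptions made for illustration.

```javascript
// Hypothetical sketch: `queryModifiedSince` stands in for whatever Query call
// your integration uses; its name, arguments, and page shape are assumptions.
async function reconcileSince(outageStartUtc, queryModifiedSince, handleRecord, pageSize = 100) {
  let skip = 0;
  let processed = 0;
  while (true) {
    // Ask for records with modifiedDateTimeUtc > outageStartUtc, one page at a time.
    const page = await queryModifiedSince({ modifiedAfterUtc: outageStartUtc, skip, take: pageSize });
    for (const record of page) {
      // Route each record through the same handler used for webhook deliveries.
      await handleRecord(record);
      processed += 1;
    }
    if (page.length < pageSize) return processed; // a short page means there is nothing left
    skip += pageSize;
  }
}
```

Because reconciled records may overlap with events that were eventually delivered by retry, the handler passed as `handleRecord` should be idempotent.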

Inactive subscriptions are not queued

A separate failure mode worth highlighting: when a webhook subscription is deactivated (via PUT /api/Webhook/{webhookId}/Active?active=false) or deleted entirely, events that occur while the subscription is inactive are dropped. They are not queued for delivery when the subscription is reactivated.

This is not a retry scenario. The subscription is not failing — it is intentionally not subscribed. Events that occur during the inactive window are lost. If your endpoint is experiencing a temporary outage but the subscription remains active, the retry mechanism will catch up. If you deactivate the subscription during the outage, the retry mechanism is bypassed and events are lost permanently.

The right pattern during a planned outage: leave the subscription active, let Virtuous retry deliveries to your unavailable endpoint, and recover automatically when your endpoint comes back online. Deactivating the subscription should only be done when you intend to permanently stop receiving the event.

Designing a receiver that minimizes retries

The most effective way to reduce retry pressure is to make your endpoint reliably acknowledge deliveries within the timeout window. Three patterns help.

1. Acknowledge immediately, process asynchronously

The webhook handler’s only synchronous responsibility is to verify the signature and enqueue the event for processing. Everything else — database writes, downstream API calls, side effects — runs in a background worker.

JavaScript

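const express = require('express');
const app = express();

// verifyVirtuousSignature and queue are defined elsewhere in your application.
app.post('/virtuous/webhook', express.raw({ type: 'application/json' }), async (req, res) => {
  // 1. Verify the signature against the raw request body
  if (!verifyVirtuousSignature(req.body, req.headers, process.env.VIRTUOUS_WEBHOOK_SECRET)) {
    return res.status(401).send('Invalid signature');
  }

  try {
    // 2. Enqueue for async processing; this should be a fast, durable operation
    const event = JSON.parse(req.body.toString('utf8'));
    await queue.send({ eventId: event.eventId, payload: event });
  } catch (err) {
    // Enqueue failed: do not acknowledge, so the delivery is retried
    console.error('Failed to enqueue webhook event', err);
    return res.status(500).send('Enqueue failed');
  }

  // 3. Acknowledge, well inside the timeout window
  res.status(200).send('OK');
});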

Why this matters: the slowest step in most webhook handlers is the downstream business logic — looking up records, calling third-party APIs, updating multiple database rows. If you do that work synchronously inside the request handler, transient slowness in your downstream dependencies translates directly into webhook timeouts and triggers retries even when your code is correct.

2. Use a durable queue

The enqueue step needs to be reliable enough that you can acknowledge confidently. If the enqueue fails, your acknowledgement is a lie: you told Virtuous you received the event, but you have no record of it. Use a managed queue with strong delivery guarantees: Amazon SQS, Google Cloud Tasks, Google Cloud Pub/Sub, Redis Streams with persistence, or your platform’s equivalent. In-memory mechanisms (e.g., setImmediate(processEvent), an in-process worker pool) are not durable. If your process crashes between acknowledging and processing, the event is lost.
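
A minimal sketch of the "acknowledge only after a durable enqueue" rule follows. `acknowledgeAfterEnqueue` and its `enqueue` parameter are hypothetical names; `enqueue` stands for any function that resolves only once your chosen queue has durably stored the message.

```javascript
// Hypothetical sketch: `enqueue` is any function that resolves only after the
// queue has durably stored the message (the queue client itself is assumed).
async function acknowledgeAfterEnqueue(event, enqueue) {
  try {
    await enqueue({ eventId: event.eventId, payload: event });
    return 200; // safe to acknowledge: the event is durably recorded
  } catch (err) {
    console.error('Enqueue failed; not acknowledging', { eventId: event.eventId });
    return 500; // a non-2xx status lets the sender retry instead of losing the event
  }
}
```

The returned status code is what the HTTP handler would send back. Keeping the decision in one small function makes it easy to test both the success and failure paths without a running queue.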

3. Return `2xx` for events you intentionally skip

If your handler decides not to act on a particular event (an unsubscribed event type, an event for a customer you no longer service, a formSubmission for a form you don’t track), return 2xx and log the decision rather than returning a 4xx. The 4xx triggers retries that will keep failing — wasted load on both sides.

JavaScript

const handlers = {
  'contact.created': handleContactCreated,
  'contact.updated': handleContactUpdated,
  'gift.created': handleGiftCreated,
  // ... etc.
};

async function processEvent(event) {
  const handler = handlers[event.eventType];
  if (!handler) {
    console.info('Skipping unhandled event type', { eventType: event.eventType, eventId: event.eventId });
    return; // No retry needed — we explicitly chose not to handle this.
  }
  await handler(event);
}

Monitoring retry health

Two metrics are worth tracking on a webhook receiver:

Metric	Why
Endpoint response time	Sustained increases predict timeout-driven retries. Alert when the 95th percentile approaches the delivery timeout.
Non-2xx response rate	Increases predict failed deliveries that will be retried. Alert when the rate exceeds a small baseline.

If your platform supports it, also track the gap between event timestamps and processing timestamps — a growing gap usually indicates retry-driven delivery clumping (Virtuous retrying old events alongside fresh ones).

Set up alerting that distinguishes “your endpoint is healthy but slow” from “your endpoint is failing.” The former drives retries through timeouts; the latter drives retries through error responses. The remediation is different — capacity scaling for the first, debugging and rollback for the second.
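
As a rough sketch of that distinction, the functions below compute the two metrics over a window of recent requests and classify the endpoint as healthy, slow, or failing. The delivery timeout is not published, so `assumedTimeoutMs`, `maxErrorRate`, and the 80% threshold are placeholder assumptions to tune for your own receiver.

```javascript
// Nearest-rank percentile over a window of response times (milliseconds).
function percentile(values, p) {
  if (values.length === 0) return null;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Fraction of responses in the window that were not 2xx.
function nonSuccessRate(statusCodes) {
  if (statusCodes.length === 0) return 0;
  const failures = statusCodes.filter((s) => s < 200 || s >= 300).length;
  return failures / statusCodes.length;
}

// Hypothetical thresholds: `assumedTimeoutMs` and `maxErrorRate` are placeholders.
function classifyHealth(window, { assumedTimeoutMs, maxErrorRate }) {
  const p95 = percentile(window.responseTimesMs, 95);
  const errorRate = nonSuccessRate(window.statusCodes);
  if (errorRate > maxErrorRate) return 'failing'; // retries driven by error responses
  if (p95 !== null && p95 >= 0.8 * assumedTimeoutMs) return 'slow'; // retries driven by timeouts
  return 'healthy';
}
```

A "slow" result points toward capacity scaling, while a "failing" result points toward debugging or rollback.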

Where to go next

Idempotency and Safe Reprocessing

Retries mean duplicate deliveries. Your handler must produce the same result on a second delivery as on the first.

Reconcile Failed Syncs

The polled-reconciliation pattern that catches events lost during an extended outage.

Local Testing

Test your retry-handling code locally by simulating slow and failing responses.

Signature Verification

Verification failures trigger retries — make sure your verifier is correct before going live.
