The Volunteer API’s rate-limit story is entirely undocumented in the OpenAPI spec. No endpoint documents a 429 Too Many Requests response, no header on any response describes a budget or remaining allowance, and no overall policy is published. Yet rate limits almost certainly exist: most production APIs enforce them, and Volunteer is no exception. This page covers what’s known (very little, formally), what to expect in practice (limits exist), and the defensive patterns that keep partner integrations inside whatever budget the platform applies.

⚠️ Spec gap: The Volunteer OpenAPI spec is entirely silent on rate limits. No 429 response is documented on any endpoint. No headers describing budget or remaining allowance are documented in any response. The thresholds, retry policies, and tier-based limits (if any) are not published. The patterns on this page assume standard rate-limit conventions (token bucket per API token, 429 response with Retry-After header). Confirm against actual API behavior for production-critical workloads.

What the spec confirms

Nothing — literally nothing. There’s no rate-limit information anywhere in the OpenAPI spec. For comparison: CRM+ and Raise specs are also silent on rate limits, but at least one of them (CRM+) has rate-limit information available elsewhere through partner channels. The Volunteer case is similar but with less external documentation.

What to expect in practice

Based on common API conventions and the platform’s likely Laravel architecture, partner integrations should reasonably expect:
| Concern | Likely behavior |
| --- | --- |
| Per-token rate limit | Yes: typical for any production API |
| 429 Too Many Requests response when exceeded | Yes: standard HTTP convention |
| Retry-After header on 429 responses | Likely: Laravel’s throttle middleware emits this by default |
| Per-endpoint limits | Possibly: different endpoints may have different budgets |
| Burst allowance | Possibly: short bursts are often allowed even within sustained limits |
| Daily or hourly caps | Possibly: some APIs enforce these in addition to per-second rates |
None of this is in the spec. The patterns on this page work even if the specific values differ.
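Because the Retry-After header is only an expectation here, a client should tolerate every form the header can take. Under the HTTP specification the value is either a number of seconds or an HTTP-date. The sketch below is a minimal parser under that assumption; the name `parseRetryAfterMs` and the 30-second fallback are illustrative choices, not anything the Volunteer API documents.

```javascript
// Hypothetical helper: convert a Retry-After header value to milliseconds.
// Accepts delta-seconds ("120") or an HTTP-date; anything else yields the fallback.
function parseRetryAfterMs(headerValue, { fallbackMs = 30_000, now = Date.now() } = {}) {
  if (headerValue == null || headerValue.trim() === '') return fallbackMs;

  const value = headerValue.trim();
  if (/^\d+$/.test(value)) return Number(value) * 1000;

  const dateMs = Date.parse(value);
  if (Number.isNaN(dateMs)) return fallbackMs;
  return Math.max(0, dateMs - now); // A date in the past means "retry now"
}
```

Passing `now` explicitly keeps the function deterministic in tests; production callers can omit it.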

The defensive baseline

Build the integration to behave well within any reasonable rate budget. Three principles:
| Principle | Description |
| --- | --- |
| Pace requests appropriately for the workload | Bulk reads space out requests; real-time reads can be tighter |
| Honor 429 responses with backoff | When the API says “slow down,” slow down. Use Retry-After if present. |
| Monitor request volume per customer | A customer’s usage shouldn’t grow uncontrolled without alerting |
These principles produce well-behaved integrations regardless of the specific rate limit. The integration won’t be the partner that causes problems for other partners by hammering the API.
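The second principle can be sketched as a bounded retry wrapper. This is a minimal illustration, not a prescribed implementation: `withBackoff`, `maxAttempts`, and `baseDelayMs` are hypothetical names, and the exponential schedule with jitter is a common convention assumed here, not documented platform behavior. The `sleep` and `random` parameters are injectable so the wrapper can be exercised without real waiting.

```javascript
// Hypothetical wrapper: retry a request function on 429 with a bounded number of attempts.
// `doRequest` must resolve to an object with `status` and `headers.get(name)`, like fetch's Response.
async function withBackoff(doRequest, {
  maxAttempts = 4,
  baseDelayMs = 1000,
  maxDelayMs = 60_000,
  sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  random = Math.random,
} = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    const response = await doRequest();
    // Return on success, or hand the final 429 back to the caller once attempts run out.
    if (response.status !== 429 || attempt === maxAttempts) return response;

    // Prefer the server's hint (seconds form only in this sketch); otherwise
    // wait a random time up to an exponentially growing ceiling.
    const hintSec = Number(response.headers.get('Retry-After'));
    const ceilingMs = Math.min(maxDelayMs, baseDelayMs * 2 ** (attempt - 1));
    const delayMs = hintSec > 0 ? hintSec * 1000 : random() * ceilingMs;
    await sleep(delayMs);
  }
}
```

A caller would wrap a single request, for example `withBackoff(() => fetch(url, options))`, and still check the returned status, since the last 429 is passed through when attempts are exhausted.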

A throttled HTTP client

For partner integrations operating at scale, build a throttled client that paces requests automatically:
class ThrottledVomoClient {
  constructor({ token, requestsPerSecond = 5, maxRetries = 3 }) {
    this.token = token;
    this.requestsPerSecond = requestsPerSecond;
    this.maxRetries = maxRetries; // Bounded: never retry forever
    this.nextSlotAt = 0; // Earliest time the next request may start
  }

  async request(url, options = {}, attempt = 0) {
    await this._throttle();

    const response = await fetch(url, {
      ...options,
      headers: {
        Authorization: `Bearer ${this.token}`,
        Accept: 'application/json',
        ...options.headers,
      },
    });

    if (response.status === 429) {
      if (attempt >= this.maxRetries) {
        throw new Error(`Rate limited after ${attempt + 1} attempts: ${url}`);
      }
      // Retry-After may be absent or an HTTP-date; fall back to 30s in either case
      const retryAfterSec = Number(response.headers.get('Retry-After'));
      const delayMs = retryAfterSec > 0 ? retryAfterSec * 1000 : 30 * 1000;
      await sleep(delayMs);
      return this.request(url, options, attempt + 1); // Bounded retry after backoff
    }

    return response;
  }

  // Reserve the next time slot synchronously, so concurrent callers queue up
  // one interval apart instead of all waking at the same moment.
  async _throttle() {
    const minIntervalMs = 1000 / this.requestsPerSecond;
    const now = Date.now();
    const slot = Math.max(now, this.nextSlotAt);
    this.nextSlotAt = slot + minIntervalMs;
    if (slot > now) {
      await sleep(slot - now);
    }
  }
}

function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}
A few design decisions worth understanding:
| Decision | Why |
| --- | --- |
| Default to 5 requests/second | Conservative for any plausible limit; rarely too slow for real workloads |
| Sequential throttle (one request at a time) | Simpler than token-bucket; sufficient for most integrations |
| Automatic retry on 429 | Handles the rare hits without caller intervention |
| Honor Retry-After when present | Respects platform guidance even when undocumented |
| Default to 30s backoff if no header | Reasonable conservative fallback |
For higher-throughput integrations, replace the sequential throttle with a token-bucket implementation that allows controlled bursts.

A token-bucket for higher throughput

When you need concurrent requests but still want to stay inside a budget:
class TokenBucket {
  constructor({ capacity, refillPerSecond }) {
    this.capacity = capacity;
    this.tokens = capacity;
    this.refillPerSecond = refillPerSecond;
    this.lastRefillAt = Date.now();
  }
 
  async acquire() {
    while (true) {
      this._refill();
      if (this.tokens >= 1) {
        this.tokens -= 1;
        return;
      }
      const waitMs = ((1 - this.tokens) / this.refillPerSecond) * 1000;
      await sleep(waitMs);
    }
  }
 
  _refill() {
    const now = Date.now();
    const elapsedSec = (now - this.lastRefillAt) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSecond);
    this.lastRefillAt = now;
  }
}
 
const bucket = new TokenBucket({ capacity: 20, refillPerSecond: 10 });
 
async function rateLimitedRequest(url, options) {
  await bucket.acquire();
  return fetch(url, options);
}
This allows short bursts up to 20 requests but sustains an average of 10 requests per second. Tune capacity (burst size) and refillPerSecond (sustained rate) based on the workload. For most partner integrations, the simpler sequential throttle is sufficient. Use token bucket when you need parallelism (e.g., concurrent calls across multiple customers’ workers).
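To get a feel for how capacity and refillPerSecond interact, it helps to estimate how long a batch takes to clear the bucket. The sketch below is back-of-the-envelope arithmetic, assuming the bucket starts full and requests are issued as fast as tokens allow; `estimateBatchSeconds` is a hypothetical helper, not part of the TokenBucket class above.

```javascript
// Hypothetical helper: seconds until the last of `requestCount` requests can start,
// assuming a full bucket at t = 0 and ignoring request latency.
function estimateBatchSeconds(requestCount, { capacity, refillPerSecond }) {
  if (requestCount <= capacity) return 0; // The whole batch fits in the initial burst
  return (requestCount - capacity) / refillPerSecond;
}

// With the settings used above (capacity 20, refill 10 per second):
console.log(estimateBatchSeconds(20, { capacity: 20, refillPerSecond: 10 }));  // 0
console.log(estimateBatchSeconds(620, { capacity: 20, refillPerSecond: 10 })); // 60
```

The burst only helps the first 20 requests; beyond that, throughput is governed entirely by the refill rate.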

Per-workload rate-limit budgets

Different workloads have different needs. A reasonable starting allocation per customer:
| Workload | Suggested rate | Rationale |
| --- | --- | --- |
| Interactive UI lookups | 10–20 req/sec burst | User is waiting for a response; speed matters |
| Steady-state sync (polling) | 1–5 req/sec | Lots of background work; latency tolerant |
| Backfill (one-time bulk read) | 2–5 req/sec | Throttle harder during backfills to avoid sustained pressure |
| Daily reconciliation | 1–2 req/sec | Off-hours work; no need to rush |
These are starting points. Tune based on what works against the actual API. For partner integrations serving many customers, the per-customer rate matters less than the total across all customers. If 100 customers’ integrations each run at 5 req/sec, the partner’s total rate is 500 req/sec — which may itself trigger any platform-side per-source limits.
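The aggregate concern can be made concrete by deriving each customer’s pace from a partner-wide ceiling instead of setting it independently. The sketch below assumes a hypothetical `totalBudgetPerSec` that you choose yourself (the platform publishes no such number) and splits it evenly across active customers, capped at a per-customer maximum.

```javascript
// Hypothetical helper: split a self-imposed partner-wide budget across active customers.
function perCustomerRate({ totalBudgetPerSec, activeCustomers, maxPerCustomer = 5 }) {
  if (activeCustomers <= 0) return 0;
  const evenShare = totalBudgetPerSec / activeCustomers;
  return Math.min(maxPerCustomer, evenShare);
}

// 100 customers at an uncapped 5 req/sec would total 500 req/sec.
// With a 50 req/sec partner-wide ceiling, each customer gets 0.5 req/sec instead.
console.log(perCustomerRate({ totalBudgetPerSec: 50, activeCustomers: 100 })); // 0.5
console.log(perCustomerRate({ totalBudgetPerSec: 50, activeCustomers: 4 }));   // 5
```

The result can be passed as `requestsPerSecond` when constructing a per-customer throttled client.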

Monitoring rate-limit pressure

Track metrics that surface rate-limit issues before they become customer-facing:
| Metric | Healthy baseline | Alert threshold |
| --- | --- | --- |
| 429 rate per customer | 0 | Any sustained non-zero rate |
| Average request latency per customer | Stable | Sudden spike (often a precursor to 429) |
| Time between requests per customer | At configured pacing | Requests arriving closer together than the configured pacing (indicates a throttle bug) |
| Total requests per customer per day | Steady | Sudden growth without a matching change in traffic |
Instrument the request path so these metrics are emitted per customer:
// `metrics` and `simplifyUrl` stand in for your own telemetry client and URL normalizer.
async function request(url, options, customerId) {
  const start = Date.now();
  const response = await fetch(url, options);
  const latencyMs = Date.now() - start;

  metrics.timing('vomo.request.latency', latencyMs, { endpoint: simplifyUrl(url), customerId });

  if (response.status === 429) {
    metrics.increment('vomo.request.rate_limited', { customerId });
    // Alert if this is sustained
  }

  return response;
}
Per-customer 429 rate is the canary. If one customer’s integration starts seeing rate limits, investigate before it affects others.
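“Sustained” needs a concrete definition before it can drive an alert. One possible definition, sketched below, is “at least N rate-limited responses for the same customer within a sliding window.” The class name, the threshold of 3, and the 5-minute window are all assumptions to tune; the clock is injectable so the logic can be tested without waiting.

```javascript
// Hypothetical detector: flags a customer once `threshold` 429s land inside `windowMs`.
class RateLimitAlarm {
  constructor({ threshold = 3, windowMs = 5 * 60 * 1000, now = () => Date.now() } = {}) {
    this.threshold = threshold;
    this.windowMs = windowMs;
    this.now = now;
    this.hits = new Map(); // customerId -> timestamps of recent 429s
  }

  // Record one 429 and return true if the customer is now at or over the threshold.
  record429(customerId) {
    const t = this.now();
    const recent = (this.hits.get(customerId) ?? []).filter((ts) => t - ts < this.windowMs);
    recent.push(t);
    this.hits.set(customerId, recent);
    return recent.length >= this.threshold;
  }
}
```

Calling `record429(customerId)` wherever a 429 is observed, and paging only when it returns true, separates a one-off 429 from a pattern worth investigating.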

When the integration is the culprit

A 429 response means your integration’s traffic exceeded what the platform allows. Common causes:
| Cause | Solution |
| --- | --- |
| Workload changed scale (new customer, new feature) | Spread the new load over more time |
| Inefficient pagination (re-reading pages, not using links.next) | Use links.next and stop at null |
| Polling too frequently | Increase the poll interval; use longer windows per poll |
| Lookups in a tight loop without batching | Batch or cache the lookups |
| Reference data re-fetched on every request | Cache reference data with appropriate TTL |
The fixes are often more substantial than “just slow down.” Address the underlying inefficiency rather than papering over it with throttling. See API Performance Tips for the broader performance patterns.
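Caching reference data is often the cheapest of these fixes. A minimal sketch of a TTL cache follows; `TtlCache` and the `loader` callback are hypothetical names, the 10-minute TTL is an arbitrary starting point, and the clock is injectable for testing. The loader would wrap whatever lookup call the integration currently repeats.

```javascript
// Hypothetical TTL cache: call `loader(key)` only when the entry is missing or expired.
class TtlCache {
  constructor({ ttlMs = 10 * 60 * 1000, now = () => Date.now() } = {}) {
    this.ttlMs = ttlMs;
    this.now = now;
    this.entries = new Map(); // key -> { value, expiresAt }
  }

  async get(key, loader) {
    const entry = this.entries.get(key);
    if (entry && entry.expiresAt > this.now()) return entry.value;

    const value = await loader(key); // One upstream request instead of one per lookup
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    return value;
  }
}
```

This sketch does not de-duplicate concurrent misses for the same key; if many lookups can race, storing the in-flight promise instead of the resolved value is one way to close that gap.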

When the platform is the culprit

Sometimes the rate limit feels too strict for the workload — typically because the customer has growth needs that exceed what the default budget supports. The path forward:
| Step | Action |
| --- | --- |
| Document the workload requirements | How many records, how often, by which integration path |
| Confirm the limit is being hit consistently | Not just occasional; show the data |
| Coordinate with the customer’s VOMO concierge | Request a higher rate tier for this customer’s account |
| Confirm the new limit works | After the upgrade, monitor to confirm |
Rate-limit tiers (if they exist) are administered per-customer, not per-integration. The customer’s relationship with VOMO is the path to a higher tier.

Polling-specific rate considerations

Volunteer has no webhooks — partner integrations that need change detection must poll. This creates a specific rate-limit consideration: don’t poll faster than your business need actually requires.
| Workload type | Reasonable poll frequency |
| --- | --- |
| Daily reporting refresh | Once daily |
| Same-day data freshness | Once every 1–4 hours |
| Near-real-time sync | Once every 5–15 minutes |
| Real-time required | Reconsider the design; polling at intervals under 1 minute is rarely sustainable |
For partner integrations that need real-time updates, the architecture, not the rate limit, is usually the issue. Polling every 30 seconds wastes rate-limit budget and still has a 30-second lag. Consider whether your customer’s workflow genuinely needs sub-minute freshness or whether the perceived need is a constraint that could be relaxed. See Polling and Sync for the broader polling architecture.
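The cost of a poll interval is easy to quantify. The sketch below counts requests per day for a given interval, assuming a hypothetical average of `pagesPerPoll` requests per poll; the figures are arithmetic only and say nothing about what the platform allows.

```javascript
// Hypothetical helper: requests per day generated by polling at a fixed interval.
function requestsPerDay(intervalSeconds, pagesPerPoll = 1) {
  const pollsPerDay = Math.floor(86_400 / intervalSeconds);
  return pollsPerDay * pagesPerPoll;
}

console.log(requestsPerDay(30));      // 2880 (every 30 seconds)
console.log(requestsPerDay(15 * 60)); // 96   (every 15 minutes)
console.log(requestsPerDay(3600, 3)); // 72   (hourly, 3 pages per poll)
```

Moving from a 30-second to a 15-minute interval cuts the request count by a factor of 30 per customer, before any multiplication across customers.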

A common-cases reference

Quick examples of what reasonable rate-limit handling looks like for typical workloads:

Interactive lookup (user-waiting workflow)

// Direct request without throttling: fast for users.
// `token` is assumed to be in scope (for example, loaded from configuration).
async function lookupUser(userId, retriesLeft = 1) {
  const response = await fetch(`https://api.vomo.org/v1/users/${userId}`, {
    headers: { Authorization: `Bearer ${token}` },
  });

  if (response.status === 429 && retriesLeft > 0) {
    // Retry-After may be missing or an HTTP-date; fall back to 5 seconds
    const retryAfterSec = Number(response.headers.get('Retry-After'));
    await sleep((retryAfterSec > 0 ? retryAfterSec : 5) * 1000);
    return lookupUser(userId, retriesLeft - 1); // Single retry, then fail
  }

  if (!response.ok) throw new Error(`Lookup failed: ${response.status}`);
  return response.json();
}

Steady-state polling (every 15 minutes)

async function pollUsersSync(customerId) {
  const client = new ThrottledVomoClient({ token, requestsPerSecond: 3 });
  const lastSync = await getCheckpoint(customerId);

  let url = `https://api.vomo.org/v1/users?updated_after=${encodeURIComponent(lastSync)}`;

  while (url) {
    const response = await client.request(url);
    if (!response.ok) throw new Error(`Poll failed: ${response.status}`);
    const page = await response.json();
    await processUsers(customerId, page.data);
    url = page.links?.next ?? null; // Stop when there is no next page
  }

  // Advance only after every page succeeded, so a failed run resumes from the same checkpoint
  await advanceCheckpoint(customerId);
}
 
// Run on 15-minute interval
setInterval(() => pollUsersSync(customerId).catch(console.error), 15 * 60 * 1000);

Backfill (one-time bulk read)

async function backfillUsers(customerId) {
  const client = new ThrottledVomoClient({ token, requestsPerSecond: 2 });
  let url = 'https://api.vomo.org/v1/users';
 
  while (url) {
    const response = await client.request(url);
    const page = await response.json();
    await processUsers(customerId, page.data);
 
    console.log(`Backfill: ${page.meta.to}/${page.meta.total}`);
    url = page.links.next;
  }
}
Backfills should be throttled harder than steady-state polling: they are a sustained bulk read, so slower pacing avoids prolonged pressure on the API.

A rate-limit checklist

Walk through this when designing or auditing the integration:
  • All API requests go through a throttled client (sequential or token-bucket)
  • 429 responses honor Retry-After when present
  • Default backoff (when no Retry-After) is at least 15 seconds
  • Per-customer request volume is monitored
  • 429 rate is alerted on (any sustained non-zero)
  • Polling intervals match the actual business need (not “as fast as possible”)
  • Reference data is cached to avoid repeated lookups
  • Bulk operations use throttling sized for sustained work
  • Interactive operations have a separate (higher) throttle
  • No retry-forever loops; bounded attempts only
These practices keep the integration well-behaved even though the specific rate limit isn’t documented.

Where to go next

  • The pagination pattern that’s tightly coupled with rate-limit-aware design.
  • The classification framework for 429 and other error responses.
  • The broader patterns for keeping request volume manageable.
  • The polling-specific architecture that depends on rate-limit-aware design.
Last modified on May 22, 2026