The Volunteer API’s rate limits are entirely undocumented in the OpenAPI spec. No endpoint documents a 429 Too Many Requests response, no response header describes a budget or remaining allowance, and no overall policy is published. Yet rate limits almost certainly exist: most production APIs enforce them, and there is no reason to assume Volunteer is an exception.
This page covers what’s known (very little, formally), what to expect in practice (limits exist), and the defensive patterns that keep partner integrations inside whatever budget the platform applies.
⚠️ Spec gap: The Volunteer OpenAPI spec is entirely silent on rate limits. No 429 response is documented on any endpoint. No headers describing budget or remaining allowance are documented in any response. The thresholds, retry policies, and tier-based limits (if any) are not published. The patterns on this page assume standard rate-limit conventions (token bucket per API token, 429 response with Retry-After header). Confirm against actual API behavior for production-critical workloads.
What the spec confirms
Nothing — literally nothing. There’s no rate-limit information anywhere in the OpenAPI spec.
For comparison: CRM+ and Raise specs are also silent on rate limits, but at least one of them (CRM+) has rate-limit information available elsewhere through partner channels. The Volunteer case is similar but with less external documentation.
What to expect in practice
Based on common API conventions and the platform’s likely Laravel architecture, partner integrations should reasonably expect:
| Concern | Likely behavior |
|---|---|
| Per-token rate limit | Yes: typical for any production API |
| 429 Too Many Requests response when exceeded | Yes: standard HTTP convention |
| Retry-After header on 429 responses | Likely: Laravel’s throttle middleware emits this by default |
| Per-endpoint limits | Possibly: different endpoints may have different budgets |
| Burst allowance | Possibly: short bursts are often allowed even within sustained limits |
| Daily or hourly caps | Possibly: some APIs enforce these in addition to per-second rates |
None of this is in the spec. The patterns on this page work even if the specific values differ.
The defensive baseline
Build the integration to behave well within any reasonable rate budget. Three principles:
| Principle | Description |
|---|---|
| Pace requests appropriately for the workload | Bulk reads space out requests; real-time reads can be tighter |
| Honor 429 responses with backoff | When the API says “slow down,” slow down. Use Retry-After if present. |
| Monitor request volume per customer | A customer’s usage shouldn’t grow uncontrolled without alerting |
These principles produce well-behaved integrations regardless of the specific rate limit. The integration won’t be the partner that causes problems for other partners by hammering the API.
A throttled HTTP client
For partner integrations operating at scale, build a throttled client that paces requests automatically:
class ThrottledVomoClient {
  constructor({ token, requestsPerSecond = 5, maxRetries = 3 }) {
    this.token = token;
    this.requestsPerSecond = requestsPerSecond;
    this.maxRetries = maxRetries; // Bound retries so a persistent 429 cannot loop forever
    this.nextSlotAt = 0; // Earliest time (ms) at which the next request may start
  }

  async request(url, options = {}, attempt = 0) {
    await this._throttle();
    const response = await fetch(url, {
      ...options,
      headers: {
        Authorization: `Bearer ${this.token}`,
        Accept: 'application/json',
        ...options.headers,
      },
    });
    if (response.status === 429 && attempt < this.maxRetries) {
      // Retry-After may be absent or non-numeric; fall back to 30 seconds
      const retryAfterSec = parseInt(response.headers.get('Retry-After') ?? '', 10);
      const delayMs = Number.isFinite(retryAfterSec) ? retryAfterSec * 1000 : 30 * 1000;
      await sleep(delayMs);
      return this.request(url, options, attempt + 1); // Bounded retry after backoff
    }
    return response; // May still be a 429 once retries are exhausted
  }

  async _throttle() {
    // Reserve the slot before sleeping so concurrent callers queue up in order
    const minIntervalMs = 1000 / this.requestsPerSecond;
    const now = Date.now();
    const slotAt = Math.max(now, this.nextSlotAt);
    this.nextSlotAt = slotAt + minIntervalMs;
    if (slotAt > now) {
      await sleep(slotAt - now);
    }
  }
}

function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}
A few design decisions worth understanding:
| Decision | Why |
|---|---|
| Default to 5 requests/second | Conservative for any plausible limit; rarely too slow for real workloads |
| Sequential throttle (one request at a time) | Simpler than token-bucket; sufficient for most integrations |
| Automatic retry on 429 | Handles the rare hits without caller intervention |
| Honor Retry-After when present | Respects platform guidance even when undocumented |
| Default to 30s backoff if no header | Reasonable conservative fallback |
For higher-throughput integrations, replace the sequential throttle with a token-bucket implementation that allows controlled bursts.
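One detail worth handling in either client: HTTP allows Retry-After to carry a number of seconds or an HTTP date. The sketch below parses both forms. The helper name `parseRetryAfterMs`, the 30-second fallback, and the 5-minute cap are illustrative assumptions, not documented Volunteer behavior.

```javascript
// Hypothetical helper: convert a Retry-After header value into a delay in ms.
// Accepts delta-seconds ("120") or an HTTP date; anything else uses the fallback.
function parseRetryAfterMs(headerValue, { now = Date.now(), fallbackMs = 30000, maxMs = 300000 } = {}) {
  if (headerValue == null || headerValue.trim() === '') return fallbackMs;

  const trimmed = headerValue.trim();
  let delayMs;
  if (/^\d+$/.test(trimmed)) {
    delayMs = Number(trimmed) * 1000; // delta-seconds form
  } else {
    const dateMs = Date.parse(trimmed); // HTTP-date form
    if (Number.isNaN(dateMs)) return fallbackMs;
    delayMs = Math.max(0, dateMs - now); // a date in the past means "retry now"
  }
  return Math.min(delayMs, maxMs); // assumed cap so a bad header cannot stall a worker
}
```

The cap is a judgment call: it keeps one unexpected header value from parking a worker for hours, at the cost of occasionally retrying sooner than the server asked.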
A token-bucket for higher throughput
When you need concurrent requests but still want to stay inside a budget:
class TokenBucket {
  constructor({ capacity, refillPerSecond }) {
    this.capacity = capacity;
    this.tokens = capacity;
    this.refillPerSecond = refillPerSecond;
    this.lastRefillAt = Date.now();
  }

  async acquire() {
    while (true) {
      this._refill();
      if (this.tokens >= 1) {
        this.tokens -= 1;
        return;
      }
      // Wait just long enough for the missing fraction of a token to refill
      const waitMs = ((1 - this.tokens) / this.refillPerSecond) * 1000;
      await sleep(waitMs);
    }
  }

  _refill() {
    const now = Date.now();
    const elapsedSec = (now - this.lastRefillAt) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSecond);
    this.lastRefillAt = now;
  }
}

const bucket = new TokenBucket({ capacity: 20, refillPerSecond: 10 });

async function rateLimitedRequest(url, options) {
  await bucket.acquire();
  return fetch(url, options);
}
This allows short bursts up to 20 requests but sustains an average of 10 requests per second. Tune capacity (burst size) and refillPerSecond (sustained rate) based on the workload.
For most partner integrations, the simpler sequential throttle is sufficient. Use token bucket when you need parallelism (e.g., concurrent calls across multiple customers’ workers).
Per-workload rate-limit budgets
Different workloads have different needs. A reasonable starting allocation per customer:
| Workload | Suggested rate | Rationale |
|---|---|---|
| Interactive UI lookups | 10–20 req/sec burst | User waiting for response — fast matters |
| Steady-state sync (polling) | 1–5 req/sec | Lots of background work; latency tolerant |
| Backfill (one-time bulk read) | 2–5 req/sec | Throttle harder during backfills to avoid sustained pressure |
| Daily reconciliation | 1–2 req/sec | Off-hours work; no need to rush |
These are starting points. Tune based on what works against the actual API.
For partner integrations serving many customers, the per-customer rate matters less than the total across all customers. If 100 customers’ integrations each run at 5 req/sec, the partner’s total rate is 500 req/sec — which may itself trigger any platform-side per-source limits.
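The arithmetic above can be turned into a simple guard. This is a sketch only: `globalBudgetPerSec` stands for a partner-wide budget that the spec does not publish, so its value is a placeholder to tune, and `perCustomerCeiling` is likewise a hypothetical setting.

```javascript
// Split an assumed partner-wide budget evenly across active customers,
// never exceeding the per-customer ceiling.
function perCustomerRate({ globalBudgetPerSec, activeCustomers, perCustomerCeiling = 5 }) {
  if (activeCustomers <= 0) return 0;
  const fairShare = globalBudgetPerSec / activeCustomers;
  return Math.min(perCustomerCeiling, fairShare);
}

// 100 customers at the 5 req/sec ceiling would total 500 req/sec.
// With an assumed partner-wide budget of 100 req/sec, each customer gets 1 req/sec.
const rate = perCustomerRate({ globalBudgetPerSec: 100, activeCustomers: 100 });
```

An even split is the simplest policy; a weighted split (for example, by customer record count) follows the same shape.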
Monitoring rate-limit pressure
Track metrics that surface rate-limit issues before they become customer-facing:
| Metric | Healthy baseline | Alert threshold |
|---|---|---|
| 429 rate per customer | 0 | Any sustained non-zero rate |
| Average request latency per customer | Stable | Sudden spike (often a precursor to 429) |
| Time between requests per customer | At configured pacing | Interval shorter than the configured pacing (indicates a throttling bug) |
| Total requests per customer per day | Steady | Sudden growth without traffic change |
A minimal instrumentation wrapper that records latency and counts 429s per customer:
async function request(url, options, customerId) {
  const start = Date.now();
  const response = await fetch(url, options);
  const latencyMs = Date.now() - start;
  // `metrics` and `simplifyUrl` stand in for your own metrics client and URL normalizer
  metrics.timing('vomo.request.latency', latencyMs, { endpoint: simplifyUrl(url), customerId });
  if (response.status === 429) {
    metrics.increment('vomo.request.rate_limited', { customerId });
    // Alert if this is sustained
  }
  return response;
}
Per-customer 429 rate is the canary. If one customer’s integration starts seeing rate limits, investigate before it affects others.
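One way to decide when 429s count as "sustained" is a sliding window per customer. The sketch below is an assumption about how you might wire this up: the class name `RateLimitAlarm`, the 5-minute window, and the threshold of 3 are all illustrative.

```javascript
// Hypothetical sliding-window detector: flags a customer when 429s keep
// arriving within a window rather than as a one-off.
class RateLimitAlarm {
  constructor({ windowMs = 5 * 60 * 1000, threshold = 3 } = {}) {
    this.windowMs = windowMs;
    this.threshold = threshold;
    this.hits = new Map(); // customerId -> timestamps (ms) of recent 429s
  }

  // Record one 429; returns true if the customer is at or over the threshold.
  record(customerId, now = Date.now()) {
    const recent = (this.hits.get(customerId) ?? []).filter((t) => now - t < this.windowMs);
    recent.push(now);
    this.hits.set(customerId, recent);
    return recent.length >= this.threshold;
  }
}
```

Call `record(customerId)` wherever a 429 is observed and page someone (or pause that customer's workers) when it returns true.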
When the integration is the culprit
A 429 response means your integration’s traffic exceeded what the platform allows. Common causes:
| Cause | Solution |
|---|---|
| Workload changed scale (new customer, new feature) | Spread the new load over more time |
| Inefficient pagination (re-reading pages, not using links.next) | Use links.next and stop at null |
| Polling too frequently | Increase the poll interval; use longer windows per poll |
| Lookups in a tight loop without batching | Batch or cache the lookups |
| Reference data re-fetched on every request | Cache reference data with appropriate TTL |
The fixes are often more substantial than “just slow down.” Address the underlying inefficiency rather than papering over it with throttling.
See API Performance Tips for the broader performance patterns.
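For the reference-data row in the table above, a small TTL cache is often enough. This is a minimal sketch under assumptions: `TtlCache` is a hypothetical helper, the caller supplies a `loader` function that performs the real API call, and the right TTL depends on how often the reference data changes.

```javascript
// Hypothetical TTL cache. The loader may return a value or a promise;
// concurrent callers for the same key share the cached result.
class TtlCache {
  constructor({ ttlMs, now = () => Date.now() }) {
    this.ttlMs = ttlMs;
    this.now = now; // injectable clock, useful in tests
    this.entries = new Map(); // key -> { value, expiresAt }
  }

  get(key, loader) {
    const hit = this.entries.get(key);
    if (hit && hit.expiresAt > this.now()) return hit.value;

    const value = loader(key);
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    // Do not keep a failed load cached; the next caller should retry.
    Promise.resolve(value).catch(() => {
      if (this.entries.get(key)?.value === value) this.entries.delete(key);
    });
    return value;
  }
}
```

Usage would look like `await cache.get('reference-key', (key) => fetchReferenceData(key))`, where `fetchReferenceData` is your own function.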
Sometimes the rate limit feels too strict for the workload — typically because the customer has growth needs that exceed what the default budget supports. The path forward:
| Step | Action |
|---|---|
| Document the workload requirements | How many records, how often, by which integration path |
| Confirm the limit is being hit consistently | Not just occasional; show the data |
| Coordinate with the customer’s VOMO concierge | Request a higher rate tier for this customer’s account |
| Confirm the new limit works | After the upgrade, monitor to confirm |
Rate-limit tiers (if they exist) are administered per-customer, not per-integration. The customer’s relationship with VOMO is the path to a higher tier.
Polling-specific rate considerations
Volunteer has no webhooks — partner integrations that need change detection must poll. This creates a specific rate-limit consideration: don’t poll faster than your business need actually requires.
| Workload type | Reasonable poll frequency |
|---|---|
| Daily reporting refresh | Once daily |
| Same-day data freshness | Once every 1–4 hours |
| Near-real-time sync | Once every 5–15 minutes |
| Real-time required | Reconsider the design — polling at <1 minute intervals is rarely sustainable |
For partner integrations that need real-time updates, the architecture, not the rate limit, is usually the issue. Polling every 30 seconds wastes rate-limit budget and still has a 30-second lag. Consider whether your customer’s workflow genuinely needs sub-minute freshness or whether the perceived need is a constraint that could be relaxed.
See Polling and Sync for the broader polling architecture.
A common-cases reference
Quick examples of what reasonable rate-limit handling looks like for typical workloads:
Interactive lookup (user-waiting workflow)
// Direct request without throttling, so the user is not kept waiting
async function lookupUser(userId, retried = false) {
  const response = await fetch(`https://api.vomo.org/v1/users/${userId}`, {
    headers: { Authorization: `Bearer ${token}` },
  });
  if (response.status === 429 && !retried) {
    // Fall back to 5 seconds if Retry-After is missing or not a number
    const parsed = parseInt(response.headers.get('Retry-After') ?? '5', 10);
    const retryAfter = Number.isNaN(parsed) ? 5 : parsed;
    await sleep(retryAfter * 1000);
    return lookupUser(userId, true); // Single retry
  }
  if (!response.ok) throw new Error(`Lookup failed: ${response.status}`);
  return response.json();
}
Steady-state polling (every 15 minutes)
async function pollUsersSync(customerId) {
  const client = new ThrottledVomoClient({ token, requestsPerSecond: 3 });
  const lastSync = await getCheckpoint(customerId);
  let url = `https://api.vomo.org/v1/users?updated_after=${encodeURIComponent(lastSync)}`;
  while (url) {
    const response = await client.request(url);
    // Fail before parsing so the checkpoint is not advanced past unread data
    if (!response.ok) throw new Error(`Poll failed: ${response.status}`);
    const page = await response.json();
    await processUsers(customerId, page.data);
    url = page.links.next;
  }
  await advanceCheckpoint(customerId);
}

// Run on 15-minute interval
setInterval(() => pollUsersSync(customerId).catch(console.error), 15 * 60 * 1000);
Backfill (one-time bulk read)
async function backfillUsers(customerId) {
  const client = new ThrottledVomoClient({ token, requestsPerSecond: 2 });
  let url = 'https://api.vomo.org/v1/users';
  while (url) {
    const response = await client.request(url);
    if (!response.ok) throw new Error(`Backfill failed: ${response.status}`);
    const page = await response.json();
    await processUsers(customerId, page.data);
    console.log(`Backfill: ${page.meta.to}/${page.meta.total}`);
    url = page.links.next;
  }
}
Backfills should pace more conservatively than steady-state polling: they are a sustained bulk read, so slower pacing avoids prolonged pressure on the API.
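When choosing a backfill rate, it helps to estimate how long the run will take. A rough sketch, assuming a hypothetical page size (`perPage`, which the examples above do not specify) and a record count such as the `meta.total` value logged by the backfill loop:

```javascript
// Estimate backfill duration from record count, page size, and pacing.
// This is a lower bound: it ignores response latency and any 429 backoff.
function estimateBackfillSeconds({ totalRecords, perPage, requestsPerSecond }) {
  const pages = Math.ceil(totalRecords / perPage);
  return pages / requestsPerSecond;
}

// 50,000 records at an assumed 100 per page is 500 requests;
// at 2 req/sec that is about 250 seconds.
const seconds = estimateBackfillSeconds({ totalRecords: 50000, perPage: 100, requestsPerSecond: 2 });
```

If the estimate is comfortably short, there is little reason to raise the rate; if it runs to many hours, scheduling the backfill off-hours may be a better lever than faster pacing.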
A rate-limit checklist
Walk through this when designing or auditing the integration:
- All API requests go through a throttled client (sequential or token-bucket)
- 429 responses honor Retry-After when present
- Default backoff (when no Retry-After) is at least 15 seconds
- Per-customer request volume is monitored
- 429 rate is alerted on (any sustained non-zero)
- Polling intervals match the actual business need (not “as fast as possible”)
- Reference data is cached to avoid repeated lookups
- Bulk operations use throttling sized for sustained work
- Interactive operations have a separate (higher) throttle
- No retry-forever loops: bounded attempts only
These practices keep the integration well-behaved even though the specific rate limit isn’t documented.
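The bounded-retry and default-backoff items can be combined in one small helper. The sketch below is one possible shape, not a prescribed implementation: `backoffDelayMs` and `withBoundedRetry` are hypothetical names, the 15-second base mirrors the checklist, and the cap, jitter, and attempt count are assumptions. It covers the case where no Retry-After header is available; when the header is present, prefer its value.

```javascript
// Hypothetical backoff schedule: exponential growth from a 15s base,
// capped, plus a little random jitter so workers do not retry in lockstep.
function backoffDelayMs(attempt, { baseMs = 15000, maxMs = 120000, jitterMs = 1000, random = Math.random } = {}) {
  const exponential = Math.min(maxMs, baseMs * 2 ** attempt);
  return exponential + Math.floor(random() * jitterMs);
}

// Run `fn` up to maxAttempts times, retrying only on 429 responses.
async function withBoundedRetry(fn, { maxAttempts = 4, sleepFn = (ms) => new Promise((r) => setTimeout(r, ms)) } = {}) {
  for (let attempt = 0; attempt < maxAttempts; attempt += 1) {
    const response = await fn();
    if (response.status !== 429) return response;
    // Sleep between attempts, but not after the final one
    if (attempt < maxAttempts - 1) await sleepFn(backoffDelayMs(attempt));
  }
  throw new Error(`Still rate limited after ${maxAttempts} attempts`);
}
```

Throwing after the final attempt surfaces the failure to the caller, which can then decide whether to reschedule the work or alert.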
Where to go next
- The pagination pattern that’s tightly coupled with rate-limit-aware design.
- The classification framework for 429 and other error responses.
- The broader patterns for keeping request volume manageable.
- The polling-specific architecture that depends on rate-limit-aware design.
Last modified on May 22, 2026