API Performance Tips - Virtuous API Docs

The Volunteer API is read-heavy and pagination-bound: list endpoints return 15 records per page, and partners cannot change that. For partner integrations of any meaningful scale, performance is determined by how well you avoid unnecessary requests, not by how fast each request is. A well-cached integration can serve 100,000 users on the same rate budget that a naive integration uses for 5,000. This page covers the patterns that make Volunteer integrations performant in production: caching, request consolidation, parallelization, and the specific Volunteer quirks (small page size, no conditional requests, embedded-data endpoints) that shape the right approach.

The fundamental constraint

Page size on GET /users (and most list endpoints) is 15 records, not partner-configurable. Concretely:

Customer size	List requests for full read
1,000 users	~67 requests
10,000 users	~667 requests
100,000 users	~6,667 requests

The rate budget is not unlimited. Naive integrations that re-read all users every poll cycle quickly exhaust their budget for nothing — most of those records didn’t change. The single most valuable performance practice: use updated_after filters religiously. A poll that returns 50 actually-changed users beats a poll that returns 10,000 unchanged ones.
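The figures in the table follow from dividing the record count by the page size and rounding up. A minimal sketch of that arithmetic (the function name is illustrative, not part of the API):

```javascript
// Assumption: 15 records per page, as stated above.
const PAGE_SIZE = 15;

// Number of list requests needed to read `recordCount` records once.
function listRequestsForFullRead(recordCount, pageSize = PAGE_SIZE) {
  return Math.ceil(recordCount / pageSize);
}

for (const users of [1000, 10000, 100000]) {
  console.log(`${users} users -> ${listRequestsForFullRead(users)} requests`);
}
```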

Pattern 1: cache what doesn’t change often

A few resource types in Volunteer change infrequently and benefit massively from caching:

Resource	Change frequency	Cache TTL
Forms (structure)	Days to weeks	1 hour during active sync; 24 hours otherwise
Form Fields	Same	Same
Certificates	Weeks to months	24 hours
Organization metadata	Months	24 hours
Project metadata (for known Projects)	Hours to days	5-15 minutes

A reference cached fetcher:

JavaScript

class TtlCache {
  constructor({ ttlSeconds }) {
    this.ttlMs = ttlSeconds * 1000;
    this.entries = new Map();
  }

  async getOrFetch(key, fetchFn) {
    const cached = this.entries.get(key);
    if (cached && Date.now() - cached.fetchedAt < this.ttlMs) {
      return cached.value;
    }
    const value = await fetchFn();
    this.entries.set(key, { value, fetchedAt: Date.now() });
    return value;
  }

  invalidate(key) {
    this.entries.delete(key);
  }
}

// Usage
const formCache = new TtlCache({ ttlSeconds: 3600 }); // 1 hour

async function getFormDefinition(customerId, formId) {
  return formCache.getOrFetch(
    `${customerId}:form:${formId}`,
    () => fetchFormFromVomo(customerId, formId)
  );
}

Per-customer keys

Cache keys must include the customer ID. Two customers may have Forms with the same ID (different VOMO orgs); cross-customer cache hits would produce wrong data.

When to invalidate

A cache entry is invalidated in one of two ways:

On a TTL boundary (most common — simple and safe)
On observed change (e.g., when polling detects the resource was modified)

For Forms, observed-change invalidation looks like:

JavaScript

async function pollForms(customerId) {
  const lastSync = await getCheckpoint(customerId, 'forms');
  const updated = await fetchUpdatedForms(customerId, lastSync);

  for (const form of updated) {
    // Invalidate the cache for this Form
    formCache.invalidate(`${customerId}:form:${form.id}`);
  }
}

The next access to that Form refetches and re-populates the cache.

Pattern 2: use embedded data instead of separate fetches

The Volunteer API embeds related data in several endpoints. Use the embeddings rather than separate fetches.

Project Detail embeds schedule

GET /projects/{id} returns the Project with all_dates[] and next_date already populated. Don’t separately fetch Project Dates for schedule data — they’re already there.

JavaScript

// ❌ Anti-pattern: separate Project Date fetches
async function getScheduleWithExtraFetches(projectId) {
  const project = await getProject(projectId);
  return Promise.all(
    project.all_date_ids.map((id) => getProjectDate(id)) // wasteful: already embedded
  );
}

// ✅ Use the embedded data
async function getSchedule(projectId) {
  const project = await getProject(projectId);
  return project.all_dates; // already there
}

This eliminates N+1 patterns for common schedule reads.

User Detail embeds participations

GET /users/{id} returns the User with participations[] already populated. Don’t separately fetch participations — they’re embedded.

JavaScript

const user = await getUser(userId);
const participations = user.participations ?? []; // already there

Project Date Detail embeds participants

GET /projects/date/{id} returns the Project Date with participants[] already populated.

JavaScript

const date = await getProjectDate(dateId);
const participants = date.participants ?? []; // already there

When NOT to use embedded data

The embedded data is summary-shape, not full-resource-shape. For example:

participations on User Detail include the basics but not the Project name
participants on Project Date Detail include the User basics but not the User’s full profile

For workflows that need the other resource’s full detail, you still need separate fetches. But for the common cases (showing schedule on a Project, showing participants on a Date), embedded data is enough.
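When a separate fetch is unavoidable, it helps to fetch each related resource only once. The sketch below attaches Project names to a User's embedded participations. It assumes each participation carries a `project_id` field and that an injected `getProject(id)` resolves to an object with a `name`; both names are illustrative and not confirmed by this page.

```javascript
// Assumption: each embedded participation has a `project_id` field and
// `getProject(id)` resolves to an object with a `name`. Both are hypothetical.
async function attachProjectNames(participations, getProject) {
  // Deduplicate first so each distinct Project is fetched once
  const uniqueIds = [...new Set(participations.map((p) => p.project_id))];
  const namesById = new Map();

  for (const id of uniqueIds) {
    const project = await getProject(id);
    namesById.set(id, project.name);
  }

  return participations.map((p) => ({
    ...p,
    project_name: namesById.get(p.project_id),
  }));
}
```

A volunteer with 40 participations across 3 Projects then costs 3 extra requests rather than 40.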

Pattern 3: batch with parallelism (carefully)

Sequential requests are slow. Parallel requests are faster but can spike rate limits. The middle ground: bounded parallelism.

JavaScript

async function bulkFetchProjectDetails(projectIds, customerId, concurrency = 5) {
  const results = [];
  const queue = [...projectIds];

  async function worker() {
    while (queue.length > 0) {
      const id = queue.shift();
      if (id === undefined) return;

      try {
        const project = await getProjectDetail(customerId, id);
        results.push({ id, project });
      } catch (err) {
        results.push({ id, error: err });
      }
    }
  }

  // Start N workers concurrently
  await Promise.all(Array.from({ length: concurrency }, worker));

  return results;
}

The concurrency parameter caps simultaneous in-flight requests. For Volunteer, 3-5 is a reasonable upper bound; going higher raises the risk of triggering rate limits.

Combining parallelism with throttling

Combine bounded parallelism with the throttled client pattern:

JavaScript

class ParallelThrottledFetcher {
  constructor({ token, requestsPerSecond = 5, concurrency = 5 }) {
    this.client = new ThrottledVomoClient({ token, requestsPerSecond });
    this.concurrency = concurrency;
  }

  async fetchMany(urls) {
    const queue = [...urls];
    const results = [];

    // Arrow function so `this` still refers to the fetcher inside the worker
    const worker = async () => {
      while (queue.length > 0) {
        const url = queue.shift();
        if (!url) return;
        const response = await this.client.fetch(url); // throttled internally
        results.push({ url, response });
      }
    };

    await Promise.all(
      Array.from({ length: this.concurrency }, () => worker())
    );

    return results;
  }
}

The throttled client paces requests at the global rate; the parallelism just keeps the pipeline full.
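`ThrottledVomoClient` is not defined on this page. As a rough illustration of the pacing it is assumed to perform, the sketch below hands out evenly spaced send slots. `createPacer` and its `now`/`sleep` options are hypothetical names, injectable so the behavior can be checked without real waiting.

```javascript
// Hypothetical stand-in for the pacing inside a throttled client.
function defaultSleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

function createPacer(requestsPerSecond, { now = Date.now, sleep = defaultSleep } = {}) {
  const intervalMs = 1000 / requestsPerSecond;
  let nextSlot = 0; // earliest time the next request may be sent

  // Resolves when the caller may send its request.
  return async function waitForSlot() {
    const current = now();
    const slot = Math.max(current, nextSlot);
    // Reserve the slot synchronously so concurrent workers queue behind each other
    nextSlot = slot + intervalMs;
    if (slot > current) await sleep(slot - current);
  };
}
```

Each worker would call `await waitForSlot()` before issuing a request, so five workers sharing one pacer still send at most `requestsPerSecond` requests per second.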

Pattern 4: shorten polling cycles with `updated_after`

Every list endpoint that supports updated_after should use it. Without:

GET /users  →  667 pages for a 10k-user customer

With:

GET /users?updated_after=2025-04-19T03:00:00Z  →  maybe 3-5 pages of actual changes

The rate budget saved is the difference between “full re-read” (~667 requests) and “what changed” (~3-5 requests). For partner integrations serving many customers, this is the difference between a sustainable cost model and rate-limit collisions.
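A filtered read still has to follow pagination until the last page. The sketch below collects every record from a starting URL. It assumes the page shape used by the reference integration at the end of this page (`data` plus `links.next`), and `fetchJson` is a hypothetical helper that returns the parsed JSON body for a URL.

```javascript
// Assumptions: `fetchJson(url)` resolves to a parsed page shaped like
// { data: [...], links: { next: <url or null> } }. `fetchJson` is hypothetical.
async function paginateAll(startUrl, fetchJson) {
  const records = [];
  let url = startUrl;

  while (url) {
    const page = await fetchJson(url);
    records.push(...page.data);
    url = page.links?.next ?? null; // stop when there is no next page
  }

  return records;
}
```

Called with a URL that carries `updated_after`, the loop ends after the few pages of changed records instead of walking the full list.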

Initial sync as the exception

The first poll for a new customer doesn’t have a checkpoint — it must read everything. Schedule the initial backfill explicitly (during onboarding) and avoid blocking the polling worker on it:

JavaScript

async function pollIfBackfilled(customerId) {
  const config = await getSyncConfig(customerId);
  if (!config.backfillCompletedAt) {
    // Backfill hasn't run yet — don't poll
    return { skipped: true, reason: 'awaiting_initial_backfill' };
  }
  return pollUserChanges(customerId);
}

This prevents the polling worker from accidentally doing a full read when no checkpoint exists yet (for example, one that defaults to the Unix epoch).

Pattern 5: pre-compute aggregations

For reporting workloads, raw API queries are too slow at scale. Pre-compute the answers:

-- Refreshed nightly: per-user lifetime stats
CREATE MATERIALIZED VIEW user_lifetime_stats AS
SELECT
  customer_id,
  vomo_user_id,
  SUM(hours) AS lifetime_hours,
  COUNT(*) AS lifetime_participations,
  MIN(signed_up_at) AS first_participated_at,
  MAX(signed_up_at) AS last_participated_at
FROM participations
WHERE verified = true
GROUP BY customer_id, vomo_user_id;

CREATE INDEX user_lifetime_stats_by_hours
  ON user_lifetime_stats (customer_id, lifetime_hours DESC);

A dashboard query that would otherwise need to walk thousands of participation records hits this view in milliseconds.

Refresh cadence

View	Cadence
Per-user lifetime stats	Daily
Per-project performance	Hourly during active hours
Org-level summaries	Daily
Monthly trend rollups	Daily after midnight
Real-time dashboards	Use the canonical tables directly with proper indexes

Don’t refresh too frequently — refreshing a 100k-row materialized view every minute is more expensive than the underlying queries it’s accelerating. See Report on Volunteer Hours for the full pattern.

Pattern 6: avoid N+1 patterns

The classic API anti-pattern: list N items, then fetch each one’s detail. Volunteer’s small page size makes this especially expensive.

N+1 in practice

JavaScript

// ❌ N+1 anti-pattern
const users = await listUsers(); // 667 page requests for 10k users
for (const user of users) {
  const detail = await getUserDetail(user.id); // 10,000 detail requests!
  // ... process ...
}
// Total: ~10,667 requests

Total request count: list pagination + N detail fetches. For 10k users, that’s ~10,000 detail requests on top of the list pagination.

Alternative 1: use list shape if it’s enough

JavaScript

const users = await listUsers(); // 667 requests
for (const user of users) {
  // Use only list-shape fields; skip detail fetches
  await processUserSummary(user);
}
// Total: 667 requests

If you only need the list-shape fields (basic profile, IDs, timestamps), skip the detail.

Alternative 2: detail only for changed records

JavaScript

const params = new URLSearchParams({ updated_after: lastSync.toISOString() });
const changedUsers = await paginate(`/users?${params}`); // typically << 10k

for (const user of changedUsers) {
  // Now the detail fetches are bounded by actual changes
  const detail = await getUserDetail(user.id);
  await processUser(detail);
}

The polling pattern is itself an answer to N+1 — process only what changed.

Alternative 3: parallel detail fetches with bounded concurrency

JavaScript

async function processUsersWithDetail(users) {
  const queue = [...users];
  const concurrency = 5;

  await Promise.all(
    Array.from({ length: concurrency }, async () => {
      while (queue.length > 0) {
        const user = queue.shift();
        if (!user) return;
        const detail = await getUserDetail(user.id);
        await processUser(detail);
      }
    })
  );
}

For workloads where detail is unavoidable, parallelism cuts wall-clock time substantially (up to roughly 5x with concurrency=5, provided the rate limit is not the bottleneck).

Pattern 7: warm caches on startup

For partner integrations with high-frequency reads (portals, dashboards), cold caches at startup cause spike-in-traffic patterns. Warm them:

JavaScript

async function warmCachesForCustomer(customerId) {
  // Pre-fetch slowly-changing data
  await Promise.all([
    cacheOrganizations(customerId),
    cacheCertificates(customerId),
    cacheActiveForms(customerId),
    // Don't try to pre-warm Users or Projects — too many
  ]);
}

// On worker startup
await warmCachesForAllActiveCustomers();

The warming happens once per startup; subsequent requests hit the cache.
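`warmCachesForAllActiveCustomers` is not shown above. One possible shape, sketched under the assumption that the caller supplies the list of active customer IDs and a per-customer warming function such as `warmCachesForCustomer`, is to warm customers one at a time so startup does not itself create a traffic spike, and to keep one customer's failure from blocking the rest:

```javascript
// Assumption: `warmOne(customerId)` is something like warmCachesForCustomer above.
async function warmCachesForCustomers(customerIds, warmOne) {
  const failed = [];

  // Sequential on purpose: keeps startup load flat across customers
  for (const customerId of customerIds) {
    try {
      await warmOne(customerId);
    } catch (error) {
      // A cold cache is recoverable; record the failure and continue
      failed.push({ customerId, error });
    }
  }

  return { warmed: customerIds.length - failed.length, failed };
}
```

Customers whose warming failed simply start with a cold cache and populate it on first access.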

Selective warming

Don’t warm everything — only data that:

Is small enough to cache in memory
Changes infrequently
Is accessed frequently

Forms, Certificates, and Organizations are the typical candidates. Don’t warm Users (too many) or Project Dates (too many for active customers).

Pattern 8: estimate before bulk operations

Before kicking off a large operation, estimate its cost:

JavaScript

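async function estimateBackfillCost(customerId) {
  // Hypothetical helper: look up the API token for this customer
  const token = await getTokenForCustomer(customerId);

  // 1. Get total count from first page's meta
  const response = await fetch(
    'https://api.vomo.org/v1/users?page=1',
    { headers: { Authorization: `Bearer ${token}` } }
  );
  if (!response.ok) throw new Error(`Estimate failed: ${response.status}`);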
  const firstPage = await response.json();
  const totalUsers = firstPage.meta.total;

  // 2. Compute pages needed
  const pageSize = 15;
  const totalPages = Math.ceil(totalUsers / pageSize);

  // 3. Compute time at given rate
  const requestsPerSecond = 3;
  const seconds = totalPages / requestsPerSecond;

  return {
    totalUsers,
    totalPages,
    estimatedMinutes: Math.round(seconds / 60),
    estimatedHours: Math.round(seconds / 3600 * 10) / 10,
  };
}

// Usage during onboarding
const estimate = await estimateBackfillCost(customerId);
console.log(`Backfill estimate: ${estimate.totalUsers} users, ~${estimate.estimatedMinutes} min`);

Surface the estimate to the customer during onboarding (“Initial sync will take approximately 30 minutes”). Avoid scheduling backfills that exceed the API’s rate limit or your worker’s runtime.

Pattern 9: monitor cost over time

What gets measured gets managed. Track:

Metric	What it tells you
Requests per customer per day	Cost trends
Requests per resource per day	Which resource type dominates
Cache hit rate	Whether caching is effective
Average pages per polling cycle	Whether `updated_after` is being used effectively
429 rate	Whether you’re hitting rate limits
Detail vs list request ratio	N+1 detection signal

A monthly report that shows “Customer X consumed 850K requests last month, of which 95% were User polling, 60% returned zero changes” tells you exactly where to optimize.
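Several of the metrics in the table are simple ratios over counters. The sketch below keeps the counters in memory purely for illustration; the class and event names are assumptions, and a production integration would more likely emit these to its existing metrics system.

```javascript
// Illustrative in-memory counters; names are hypothetical.
class SyncMetrics {
  constructor() {
    this.counts = {
      cacheHits: 0,
      cacheMisses: 0,
      listRequests: 0,
      detailRequests: 0,
      rateLimited: 0, // responses with status 429
    };
  }

  record(event) {
    if (!(event in this.counts)) throw new Error(`Unknown event: ${event}`);
    this.counts[event] += 1;
  }

  summary() {
    const { cacheHits, cacheMisses, listRequests, detailRequests, rateLimited } = this.counts;
    const lookups = cacheHits + cacheMisses;
    const requests = listRequests + detailRequests;
    return {
      cacheHitRate: lookups === 0 ? null : cacheHits / lookups,
      // A ratio well above 1 suggests an N+1 pattern
      detailToListRatio: listRequests === 0 ? null : detailRequests / listRequests,
      rateLimitedShare: requests === 0 ? null : rateLimited / requests,
    };
  }
}
```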

Per-customer budgets

For multi-tenant partner integrations:

JavaScript

class CustomerBudget {
  async checkBudget(customerId) {
    const today = await getTodaysRequestCount(customerId);
    const monthly = await getMonthlyRequestCount(customerId);
    const customerTier = await getCustomerTier(customerId);

    const dailyLimit = TIER_LIMITS[customerTier].daily;
    const monthlyLimit = TIER_LIMITS[customerTier].monthly;

    if (today > dailyLimit) {
      await alertOps({ severity: 'medium', type: 'daily_budget_exceeded', customerId });
      return { allowed: false, reason: 'daily_budget' };
    }
    if (monthly > monthlyLimit) {
      await alertOps({ severity: 'medium', type: 'monthly_budget_exceeded', customerId });
      return { allowed: false, reason: 'monthly_budget' };
    }

    return { allowed: true };
  }
}

Per-customer budgets prevent one customer’s runaway integration from exhausting the shared rate budget.
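`TIER_LIMITS` is referenced in `checkBudget` but not defined on this page. The sketch below shows one possible shape, with made-up tier names and numbers, alongside a pure version of the same decision so it can be tested without the counter lookups or alerting.

```javascript
// Hypothetical tiers and limits, for illustration only.
const TIER_LIMITS = {
  standard: { daily: 20000, monthly: 400000 },
  premium: { daily: 100000, monthly: 2000000 },
};

// Same comparisons as checkBudget above, without I/O or alerting.
function evaluateBudget({ tier, today, monthly }, limits = TIER_LIMITS) {
  const tierLimits = limits[tier];
  if (!tierLimits) return { allowed: false, reason: 'unknown_tier' };
  if (today > tierLimits.daily) return { allowed: false, reason: 'daily_budget' };
  if (monthly > tierLimits.monthly) return { allowed: false, reason: 'monthly_budget' };
  return { allowed: true };
}
```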

Pattern 10: avoid the “real-time” tax

Customers often request “real-time” sync. Most of the time:

“I want to see new volunteers as soon as they sign up” → 15-minute polling is fine
“I need participation data immediately after a shift” → reconciliation within an hour is fine
“Dashboards should reflect current state” → 5-minute cache TTL is fine

The cost of “real-time” sync (in API requests, infrastructure, complexity) is rarely worth the actual freshness improvement. Push back on this requirement:

“We can poll every 15 minutes, which means new volunteers appear in your dashboard within 15 minutes of signing up. We can poll every 5 minutes, which makes that ~5 minutes — but uses 3x the API requests. Is the latency difference worth the cost?”

Most customers, when forced to articulate, accept 15-30 minute polling. The few who genuinely need sub-minute reactivity often have other architectural needs (real-time UI, websockets, etc.) that polling can’t satisfy regardless.
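The cost side of that conversation is easy to quantify. A minimal sketch, assuming every poll costs at least one list request even when nothing changed (the function name is illustrative):

```javascript
// Requests per day for one customer and one resource type.
// Assumption: each poll costs `pagesPerPoll` list requests (1 when idle).
function pollRequestsPerDay(intervalMinutes, pagesPerPoll = 1) {
  const pollsPerDay = (24 * 60) / intervalMinutes;
  return pollsPerDay * pagesPerPoll;
}

for (const minutes of [30, 15, 5, 1]) {
  console.log(`every ${minutes} min -> ${pollRequestsPerDay(minutes)} requests/day`);
}
```

Moving from 15-minute to 5-minute polling triples the baseline from 96 to 288 requests per day, per customer, per polled resource.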

A reference performance-aware integration

A summary of the patterns in action:

JavaScript

class PerformanceAwareIntegration {
  constructor({ customerId, token }) {
    this.customerId = customerId;
    this.client = new ThrottledVomoClient({ token, requestsPerSecond: 3 });
    this.formCache = new TtlCache({ ttlSeconds: 3600 });
    this.certificateCache = new TtlCache({ ttlSeconds: 86400 });
  }

  async warmCaches() {
    // One-time at startup
    await Promise.all([
      this._warmForms(),
      this._warmCertificates(),
    ]);
  }

  async pollIncremental() {
    const lastSync = await getCheckpoint(this.customerId, 'user_sync');
    const params = new URLSearchParams({ updated_after: lastSync.toISOString() });

    let url = `https://api.vomo.org/v1/users?${params}`;
    let latestSeen = lastSync;

    while (url) {
      const response = await this.client.fetch(url);
      if (!response.ok) throw new Error(`Poll failed: ${response.status}`);

      const page = await response.json();
      for (const user of page.data) {
        await this._processUser(user); // uses list shape only
        const u = new Date(user.updated_at);
        if (u > latestSeen) latestSeen = u;
      }
      url = page.links.next;
    }

    await setCheckpoint(this.customerId, 'user_sync', latestSeen);
  }

  async _processUser(user) {
    // List shape is enough — no detail fetch
    await externalSystem.upsertUser({
      external_id: `vomo-${user.id}`,
      email: user.email,
      // ... list-shape fields only ...
    });
  }

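  async _warmForms() {
    // Reads only the first page of Forms; follow links.next if there are more
    const response = await this.client.fetch('https://api.vomo.org/v1/forms');
    const forms = (await response.json()).data;
    for (const form of forms) {
      // TtlCache exposes no set(), so write the entry in the shape getOrFetch reads
      this.formCache.entries.set(`${this.customerId}:form:${form.id}`, {
        value: form,
        fetchedAt: Date.now(),
      });
    }
  }

  // _warmCertificates() follows the same shape against certificateCache (omitted here)
}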

The patterns combined:

Pattern	Where
`updated_after`	`pollIncremental` queries only changes
List shape over detail	`_processUser` uses list fields only
Cache warming	`warmCaches` for slow-changing resources
Throttled client	Shared across operations
Per-customer scope	All operations take `customerId`

The result: a polling cycle that consumes maybe 5-10 requests per cycle for a customer with little activity, vs. 667+ for a naive integration.

Where to go next

Error Recovery Patterns

The resilience patterns that pair with performance optimization.

Rate Limits

The rate-limiting patterns these performance practices coexist with.

Data Modeling

The data model that enables fast queries.

Sync Architecture Patterns

The broader architectural patterns these practices fit into.
