API Performance Tips - Virtuous API Docs

A partner integration’s performance is mostly determined by a handful of structural choices: which endpoints you call, how you paginate, what response shape you request, and where you cache. This page consolidates the performance guidance scattered across the rest of the docs into a single, scannable reference. The audience is the engineering team scaling an integration past its initial proof-of-concept — when “it works for ten records” is becoming “it needs to work for fifty thousand.”

The two constraints that matter

Two constraints shape almost every performance decision:

Constraint	What it implies
The 5,000-requests-per-hour rate limit per Virtuous organization.	Every API call counts. Sustained throughput above this rate is impossible without engineering escalation. Design with it as a hard ceiling.
The single-threaded write path for Transactions.	Contact Transactions and Gift Transactions are processed by the nightly batch — submission is fast, but resolution is asynchronous. Don’t wait for resolution synchronously.

Most performance issues partner integrations hit are about one of these two. The patterns below all map back to handling them. See Rate Limits for the detailed rate-limit reference and Transactions for the holding-state model.

Prefer webhooks over polling

The single highest-leverage performance decision: use webhooks for change detection wherever Virtuous publishes events for the change.

Polling cost	Webhook cost
Every poll consumes a request from your rate-limit budget — even when nothing has changed.	Webhook deliveries do not consume your rate-limit budget.
Polling frequency caps how fresh your data can be. Polling every minute = 60 requests/hour just for one resource type.	Webhooks arrive within seconds of the source event.
Catches only changes that happened before your last poll timestamp.	Fires for every event regardless of source (your integration, other integrations, manual UI changes).

A polling-only sync that runs every 15 minutes against POST /api/Contact/Query and POST /api/Gift/Query makes 96 polls/day per resource. If each poll reads around four pages, that is ~400 requests/day per resource, almost 8% of an hour’s budget spent on basic change detection. The webhook equivalent costs zero from your budget. Use polling as a reconciliation backstop, not as the primary signal. See Webhooks Overview and Reconcile Failed Syncs.
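The polling cost is simple arithmetic, and it can help to put numbers on it before choosing an interval. The following is a minimal sketch; the function name and the `pagesPerPoll` parameter are illustrative assumptions, not part of the API.

```javascript
// Estimate how many requests a polling loop spends per day.
// pagesPerPoll is an assumption about how many pages a typical poll reads.
function pollingRequestsPerDay(intervalMinutes, resourceTypes, pagesPerPoll = 1) {
  const pollsPerDay = (24 * 60) / intervalMinutes;
  return pollsPerDay * resourceTypes * pagesPerPoll;
}

console.log(pollingRequestsPerDay(15, 1));    // 96  (one resource, one page per poll)
console.log(pollingRequestsPerDay(15, 1, 4)); // 384 (one resource, four pages per poll)
console.log(pollingRequestsPerDay(1, 1));     // 1440 (polling every minute)
```

All of these requests are spent whether or not anything changed, which is the cost a webhook avoids.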

Choose the right response shape

Most read endpoints have multiple response shapes — abbreviated and full. Choose deliberately.

For Contact Queries

Endpoint	Returns	When to use
`POST /api/Contact/Query`	Abbreviated Contact: id, name, contactType, email, phone, address summary	List views, sync deltas, segment exports — most uses.
`POST /api/Contact/Query/FullContact`	Full Contact: all ContactIndividuals, all addresses, all custom fields, all contactReferences	Only when you genuinely need the full record for every result.

FullContact is meaningfully slower per request. For a 10,000-Contact result set, the difference can be the cost of an hour of clock time and several hundred extra requests on the rate-limit budget.

Pattern: abbreviated query + targeted full fetch

For workflows where you process the abbreviated result and only need full detail for a small subset:

JavaScript

// pageThrough, contactNeedsTierUpdate, and processWithFullDetail are
// integration-specific helpers assumed to be defined elsewhere.
async function syncDonorTier(token, lastSync) {
  // 1. Get abbreviated results (cheap)
  const allContacts = await pageThrough('https://api.virtuoussoftware.com/api/Contact/Query', {
    groups: [{ conditions: [{ parameter: 'Last Modified Date', operator: 'Is After', value: lastSync }] }],
    take: 1000,
  }, token);

  // 2. Filter down to the subset that needs full detail
  const needsFullDetail = allContacts.filter((c) => contactNeedsTierUpdate(c));

  // 3. Fetch full detail only for the subset
  for (const contact of needsFullDetail) {
    const res = await fetch(`https://api.virtuoussoftware.com/api/Contact/${contact.id}`, {
      headers: { Authorization: `Bearer ${token}` },
    });
    // Surface HTTP errors instead of parsing an error body as a Contact
    if (!res.ok) throw new Error(`Contact ${contact.id} fetch failed: HTTP ${res.status}`);
    await processWithFullDetail(await res.json());
  }
}

The abbreviated query handles the broad scan; targeted full fetches handle the work that needs full data. Both together cost less than a FullContact query over the entire set.
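The example above relies on a `pageThrough` helper that this page does not define. One possible shape is sketched below. The signature, the injectable `fetchImpl` parameter, and the `{ list, total }` response shape (borrowed from the pagination examples on this page) are assumptions for illustration.

```javascript
// Hypothetical helper: reads every page of a Query endpoint with skip/take.
// Assumes skip/take travel in the request body and the response looks like
// { list, total }. fetchImpl is injectable so the helper can be tested offline.
async function pageThrough(url, body, token, fetchImpl = fetch) {
  const take = body.take ?? 1000;
  const all = [];
  let skip = 0;
  let total = Infinity; // unknown until the first response arrives
  while (skip < total) {
    const res = await fetchImpl(url, {
      method: 'POST',
      headers: { Authorization: `Bearer ${token}`, 'Content-Type': 'application/json' },
      body: JSON.stringify({ ...body, skip, take }),
    });
    if (!res.ok) throw new Error(`Query failed with HTTP ${res.status}`);
    const page = await res.json();
    if (page.list.length === 0) break; // result set shrank while paginating
    all.push(...page.list);
    total = page.total;
    skip += take;
  }
  return all;
}
```

The helper accumulates every record in memory, which is fine for the filter-then-fetch pattern here; for very large exports, handling each page as it arrives is usually a better fit.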

Paginate with the right strategy

The skip/take pattern works for small to medium result sets. For very large or unstable result sets, ID-cursor pagination is better.

Skip/take

JavaScript

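let skip = 0;
const take = 1000;
let page; // declared outside the loop so the while condition can read it
do {
  page = await query({ skip, take });
  handlePage(page.list); // handlePage is your own per-page callback
  skip += take;
} while (skip < page.total);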

Strengths: simple, predictable, the API’s default. Works fine for result sets up to tens of thousands. Weaknesses:

At high skip values, server-side query performance degrades. A request with skip=50000 is slower than skip=0.
If records are inserted while paginating, the offsets shift — you can re-process or miss records.

ID-cursor pagination

JavaScript

let cursorId = 0;
const take = 1000;
while (true) {
  const page = await query({
    groups: [{ conditions: [{ parameter: 'Contact Id', operator: 'Greater Than', value: cursorId.toString() }] }],
    sortBy: 'id',
    descending: false,
    skip: 0,
    take,
  });
  if (page.list.length === 0) break;
  handlePage(page.list); // your own per-page callback (avoids shadowing Node's global `process`)
  cursorId = page.list[page.list.length - 1].id;
}

Strengths:

Each query is bounded — no skip overhead at high offsets.
Stable under concurrent inserts: a record inserted with a new ID lands in a later page, not in a page you’ve already processed.
Naturally resumable: persist the cursor between batches and resume from where you stopped.

Weaknesses: more verbose; requires that Contact Id (or equivalent ID parameter) be an indexed filter on the Query endpoint. See Query Contacts by Filters — resumable exports for the full pattern.
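To illustrate the resumable property, the sketch below persists the cursor after each handled page. `queryAfter`, `handlePage`, and the `cursorStore` interface are hypothetical names introduced for this example; the store is in-memory here only to keep the sketch self-contained.

```javascript
// Sketch of a resumable ID-cursor export.
// queryAfter(cursorId) is a hypothetical function returning the next page:
// an array of records with an `id`, sorted ascending by id.
// cursorStore is any object with load() and save(id), e.g. a file or DB row.
async function resumableExport(queryAfter, handlePage, cursorStore) {
  let cursorId = (await cursorStore.load()) ?? 0; // start from 0 on the first run
  while (true) {
    const list = await queryAfter(cursorId);
    if (list.length === 0) break;
    await handlePage(list);
    cursorId = list[list.length - 1].id;
    // Save only after the page is handled: a crash repeats at most one page
    await cursorStore.save(cursorId);
  }
  return cursorId;
}

// In-memory store for illustration; a real job would persist this durably
function memoryCursorStore() {
  let value = null;
  return {
    load: async () => value,
    save: async (id) => { value = id; },
  };
}
```

Because the cursor is saved after the page is handled, a restart may re-deliver the last page, so `handlePage` should tolerate seeing a record twice.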

Maximize `take` for bulk operations

Every paginated read endpoint accepts a take parameter capped at 1,000. Use the cap for bulk operations.

Pattern	Requests for 100K records
`take: 25` (the default for some endpoints)	4,000 requests (~48 minutes at the rate limit)
`take: 100`	1,000 requests (~12 minutes)
`take: 1000` (the cap)	100 requests (~1.2 minutes)

For interactive UIs showing a few records at a time, smaller take values are fine. For sync, exports, and reconciliation, always use 1,000.
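The effect of page size can be worked out directly. The sketch below assumes the whole 5,000 requests/hour budget is available to one job; the function name is illustrative, and a lower `ratePerHour` can be passed to model a paced job that shares the budget.

```javascript
// Requests and minimum wall-clock time to read a result set at a given page size.
// ratePerHour defaults to the 5,000/hour organization limit described on this page.
function bulkReadCost(totalRecords, take, ratePerHour = 5000) {
  const requests = Math.ceil(totalRecords / take);
  const minutes = (requests / ratePerHour) * 60;
  return { requests, minutes };
}

for (const take of [25, 100, 1000]) {
  const { requests, minutes } = bulkReadCost(100000, take);
  console.log(`take=${take}: ${requests} requests, ~${minutes.toFixed(1)} min`);
}
// take=25: 4000 requests, ~48.0 min
// take=100: 1000 requests, ~12.0 min
// take=1000: 100 requests, ~1.2 min
```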

Filter aggressively in the request, not client-side

A common partner integration anti-pattern: pulling broad result sets and filtering on the client side. Two costs:

More requests. Each unnecessary record paginated is rate-limit budget consumed.
More data transferred. The full response payload for records you discard is wasted bandwidth.

Push filters into the request body wherever the QueryOptions endpoint exposes the right parameter. A query for “Contacts in California modified in the last 24 hours” should look like this:

{
  "groups": [
    {
      "conditions": [
        { "parameter": "Last Modified Date", "operator": "Is After", "value": "2024-12-14T00:00:00Z" },
        { "parameter": "State", "operator": "Is", "value": "CA" }
      ]
    }
  ],
  "take": 1000
}

Not:

JavaScript

// ❌ Bad — over-fetches and filters client-side
const all = await pageThrough(modifiedSinceQuery);
const california = all.filter((c) => c.state === 'CA');

If a filter you need isn’t available in QueryOptions, that’s the constraint — but exhaust the server-side filter options first.
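A small builder can keep server-side filters readable and lets the "last 24 hours" window be computed at run time instead of hard-coded. This is a sketch: `buildQuery` is a hypothetical helper, and the parameter and operator strings are copied from the JSON example above rather than verified against QueryOptions.

```javascript
// Hypothetical helper: builds a Query request body from
// [parameter, operator, value] triples placed in a single group,
// mirroring the JSON example above.
function buildQuery(conditions, take = 1000) {
  return {
    groups: [
      {
        conditions: conditions.map(([parameter, operator, value]) => ({ parameter, operator, value })),
      },
    ],
    take,
  };
}

// "Contacts in California modified in the last 24 hours"
const since = new Date(Date.now() - 24 * 60 * 60 * 1000).toISOString();
const californiaQuery = buildQuery([
  ['Last Modified Date', 'Is After', since],
  ['State', 'Is', 'CA'],
]);
console.log(JSON.stringify(californiaQuery, null, 2));
```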

Cache reference data aggressively

Some data changes rarely and is referenced often. Cache it.

What’s cacheable

Data	TTL	Why
QueryOptions for a resource type	1 day	Filter parameters and operators change rarely.
Project list and codes	1 day	New Projects are added occasionally but the existing set is stable.
Campaign list	1 day	Same as Projects.
Premium list	1 day	Configured at setup; rarely changes during normal operation.
RelationshipTypes	1 week	Almost never changes.
GiftCustomFields, ContactCustomFields metadata	1 day	Adding a new custom field is rare.

What’s not cacheable

Data	Why
Specific Contacts or Gifts	Change frequently; cached records go stale fast.
Query results	Result sets are filter-dependent; caching them produces stale data and complex invalidation.
Webhook subscription details	Could change; query when needed.

Implementation pattern

JavaScript

class VirtuousReferenceCache {
  constructor(token) {
    this.token = token;
    this.cache = new Map();
    this.ttls = new Map();
  }

  async getProjects() {
    return this.getCached('projects', () =>
      fetch('https://api.virtuoussoftware.com/api/Project/Query', {
        method: 'POST',
        headers: { Authorization: `Bearer ${this.token}`, 'Content-Type': 'application/json' },
        body: JSON.stringify({ groups: [], take: 1000 }),
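      }).then((r) => {
        // Reject on HTTP errors so a failed response is never cached
        if (!r.ok) throw new Error(`Project query failed: HTTP ${r.status}`);
        return r.json();
      })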
    );
  }

  async getCached(key, fetchFn, ttlSeconds = 86400) {
    const now = Date.now();
    if (this.cache.has(key) && this.ttls.get(key) > now) {
      return this.cache.get(key);
    }
    const value = await fetchFn();
    this.cache.set(key, value);
    this.ttls.set(key, now + ttlSeconds * 1000);
    return value;
  }
}

For multi-tenant integrations, scope the cache per customer — different organizations have different Projects, Campaigns, etc.
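One way to scope the cache per customer is a registry keyed by organization. The sketch below is illustrative: `PerTenantCaches` and its method names are hypothetical, and `createCache` can be any factory, for example `(token) => new VirtuousReferenceCache(token)`.

```javascript
// Keeps one cache instance per organization so reference data never
// leaks between tenants. createCache is a factory supplied by the caller.
class PerTenantCaches {
  constructor(createCache) {
    this.createCache = createCache;
    this.byOrg = new Map();
  }

  // organizationId is whatever stable identifier your integration stores per customer
  forOrganization(organizationId, token) {
    if (!this.byOrg.has(organizationId)) {
      this.byOrg.set(organizationId, this.createCache(token));
    }
    return this.byOrg.get(organizationId);
  }

  // Drop a tenant's cache, e.g. when its token is rotated or the customer disconnects
  evict(organizationId) {
    this.byOrg.delete(organizationId);
  }
}
```

Usage would look like `caches.forOrganization(orgId, token).getProjects()`, so each organization's Projects and Campaigns are fetched and expired independently.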

Run requests concurrently, but bounded

If you have a hundred records to update, you don’t need to do them sequentially. But you also can’t fire all hundred concurrently — that bursts the rate limit and your error rate spikes. The pattern: a bounded concurrency limit, typically 4–8 concurrent in-flight requests.

JavaScript

async function runWithConcurrencyLimit(tasks, limit = 4) {
  const results = [];
  const inFlight = new Set();

  for (const task of tasks) {
    const promise = task().then((result) => {
      inFlight.delete(promise);
      return result;
    });
    inFlight.add(promise);
    results.push(promise);

    if (inFlight.size >= limit) {
      await Promise.race(inFlight);
    }
  }

  return Promise.all(results);
}

// Usage
const updates = pendingUpdates.map((u) => () => updateContact(u));
await runWithConcurrencyLimit(updates, 4);

At a 4-concurrent limit with a typical 200ms request time, throughput is ~20 requests/second, enough to complete 1,000 updates in under a minute. A one-off batch of that size fits inside the hourly budget, but the rate itself is not sustainable.

Bounded concurrency interacts with the rate-limit ceiling. At 4 concurrent in-flight requests with 200ms per request, you’re at ~72,000 requests/hour — far above the 5,000/hour limit. The concurrency limit only helps with burst control; you still need to throttle the overall rate. Pair concurrent execution with a rate limiter that paces dispatch.
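A sketch of that pairing follows. It uses a worker-pool variant of bounded concurrency and a deliberately simple fixed-interval pacer instead of a token bucket; `createPacer` and `runPaced` are hypothetical names, and the interval in the usage note is an example value, not a recommendation from the API.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Minimal pacer: spaces dispatches at least minIntervalMs apart.
// Unlike a token bucket, it allows no bursts.
function createPacer(minIntervalMs) {
  let nextSlot = 0;
  return async function acquire() {
    const now = Date.now();
    const slot = Math.max(now, nextSlot);
    nextSlot = slot + minIntervalMs; // reserve the slot synchronously
    if (slot > now) await sleep(slot - now);
  };
}

// `limit` workers pull tasks from a shared index; each waits for the
// pacer before dispatching, so both concurrency and rate are capped.
async function runPaced(tasks, limit, acquire) {
  const results = new Array(tasks.length);
  let next = 0;
  async function worker() {
    while (next < tasks.length) {
      const i = next++;
      await acquire();
      results[i] = await tasks[i]();
    }
  }
  await Promise.all(Array.from({ length: Math.min(limit, tasks.length) }, () => worker()));
  return results;
}
```

For example, `runPaced(updates, 4, createPacer(3000))` would keep at most 4 requests in flight while dispatching no more than one every 3 seconds (1,200/hour). If any task rejects, `runPaced` rejects as well, so callers that need partial results should catch errors inside each task.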

Use a token bucket for steady-state pacing

For continuous workloads (incremental sync, ongoing reconciliation), a token bucket smooths request dispatch to fit the rate limit while still allowing modest bursts:

JavaScript

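// Promise-based delay used by acquire()
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

class TokenBucket {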
  constructor(capacity, refillRatePerSecond) {
    this.capacity = capacity;
    this.tokens = capacity;
    this.refillRate = refillRatePerSecond;
    this.lastRefill = Date.now();
  }

  async acquire() {
    while (true) {
      const now = Date.now();
      const elapsed = (now - this.lastRefill) / 1000;
      this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillRate);
      this.lastRefill = now;

      if (this.tokens >= 1) {
        this.tokens -= 1;
        return;
      }

      const waitMs = ((1 - this.tokens) / this.refillRate) * 1000;
      await sleep(waitMs);
    }
  }
}

// 1,200 requests/hour = 0.333/second, capacity 20 to allow modest bursts
const bucket = new TokenBucket(20, 1200 / 3600);

async function makeRateLimitedRequest(...args) {
  await bucket.acquire();
  return fetch(...args);
}

Sized this way (1,200/hour with 20-token capacity), the bucket allows momentary bursts up to 20 requests but settles to a sustainable rate over time. That pace uses about a quarter of the 5,000/hour cap, and the remaining headroom absorbs the occasional refresh of cached reference data without spilling into rate-limit errors.

Reuse HTTP connections

Every TCP and TLS handshake adds latency. Reusing connections via HTTP keep-alive eliminates that cost. In Node.js with the global fetch, connection pooling is automatic for the same origin. In other runtimes:

Java/Apache HttpClient: configure a connection pool with appropriate maxConnPerRoute.
Python/requests: use a Session() object rather than module-level requests.get().
Go: use a long-lived http.Client with a configured Transport.
Ruby/Net::HTTP: use a long-lived Net::HTTP::Persistent instance.

The performance impact: for a typical 10K-request sync, eliminating per-request handshakes saves on the order of 5–10 seconds total. Modest but free.

Defer heavy work in webhook handlers

A webhook handler should acknowledge the delivery as quickly as possible and defer processing. Heavy work inside the handler increases the risk of timeout-triggered retries — which produces duplicate deliveries you have to defend against.

JavaScript

// ❌ Bad — heavy work inline
app.post('/webhook', async (req, res) => {
  if (!verifySignature(req)) return res.status(401).send('Invalid');
  const event = JSON.parse(req.body);
  await processEvent(event);                    // could take seconds
  res.status(200).send('OK');
});

// ✅ Good — acknowledge fast, queue for async
app.post('/webhook', async (req, res) => {
  if (!verifySignature(req)) return res.status(401).send('Invalid');
  const event = JSON.parse(req.body);
  await queue.send(event);                      // milliseconds
  res.status(200).send('OK');
});

See Webhooks Overview — The receiver pattern.
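Because retries can deliver the same event more than once, the queue consumer is the natural place to defend against duplicates. The sketch below assumes each event carries a unique identifier; the `id` field name and the `createDedupingConsumer` helper are hypothetical, and the in-memory set stands in for whatever durable store a production consumer would use.

```javascript
// Sketch of an idempotent queue consumer.
// Assumes events have a unique `id` (hypothetical field name).
function createDedupingConsumer(processEvent, maxRemembered = 10000) {
  const seen = new Set(); // Sets iterate in insertion order, so the first entry is the oldest

  return async function consume(event) {
    if (seen.has(event.id)) return false; // duplicate delivery: skip
    seen.add(event.id); // claim the ID before awaiting, so concurrent duplicates are skipped
    if (seen.size > maxRemembered) seen.delete(seen.values().next().value);
    try {
      await processEvent(event);
    } catch (err) {
      seen.delete(event.id); // forget the ID so a retry can process it
      throw err;
    }
    return true;
  };
}
```

The return value reports whether the event was processed (`true`) or skipped as a duplicate (`false`), which is convenient for logging duplicate rates.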

Batch where the API supports it

Most Virtuous endpoints accept one record per request. A few accept batches:

Endpoint	Batch
`POST /api/Tag/Bulk`	Apply a tag to many Contacts in one request
`POST /api/ContactNote/Bulk`	Create many notes in one request

For workloads that involve many small writes — applying a “Year-End-2024” tag to ten thousand donors, for instance — these batch endpoints reduce request count by 100x or more. Always check whether a batch endpoint exists before looping over a single-record endpoint.
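The request-count reduction comes from chunking the work list and sending one request per chunk. The sketch below shows only the chunking step; the batch size of 100 is an assumption, and the request body each batch endpoint expects is not described on this page, so it is left out.

```javascript
// Split a list into fixed-size chunks, one chunk per batch request.
function chunk(items, size) {
  if (size < 1) throw new RangeError('size must be at least 1');
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Ten thousand donors to tag; IDs are generated here purely for illustration
const contactIds = Array.from({ length: 10000 }, (_, i) => i + 1);
const batches = chunk(contactIds, 100); // batch size is an assumption
console.log(batches.length); // 100 requests instead of 10,000
```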

The CRM+ spec exposes only a small set of batch endpoints. If a workflow requires bulk operations that don’t have a batch endpoint, you may need to escalate to Virtuous engineering for either a per-organization rate-limit exception (see Rate Limits) or a feature request for a new batch endpoint.

Monitor the right metrics

Performance regressions are usually visible in one of these metrics before they become user-visible:

Metric	Investigate when
Average request latency	Sustained increases (server-side slowness or networking issue)
95th percentile request latency	Outliers growing (some specific query type is slow)
Rate-limit headers on responses	`X-Rate-Limit-Remaining` trending toward zero
429 response count	Any non-zero count is a sign the throttle is misconfigured
Queue depth (for async architectures)	Growing depth indicates a downstream bottleneck

Most well-functioning integrations sit far below the rate limit. If you’re routinely brushing against it, the issue is usually elsewhere — too-frequent polling, overly-broad queries, or missing caching — not “the limit is too low.”
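Two of these metrics are easy to compute in the client itself. The sketch below shows a nearest-rank percentile for latency samples and a reader for the remaining-budget header; the function names are hypothetical, and the header name is taken from the table above rather than independently verified.

```javascript
// Nearest-rank percentile over a window of latency samples (milliseconds).
function percentile(samples, p) {
  if (samples.length === 0) return NaN;
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(0, rank - 1)];
}

// Reads the remaining budget from a fetch Response.
// Returns null when the header is absent.
function remainingBudget(response) {
  const raw = response.headers.get('X-Rate-Limit-Remaining');
  return raw == null ? null : Number(raw);
}

const latencies = [120, 95, 210, 180, 150, 900, 130, 110, 140, 160];
console.log(percentile(latencies, 50)); // 140
console.log(percentile(latencies, 95)); // 900
```

Tracking the median alongside the 95th percentile makes it easier to tell a general slowdown (both rise) from a few slow query types (only the tail rises).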

Where to go next

Error Recovery Patterns

The companion practices for handling the inevitable failures.

Rate Limits

The reference for the rate-limit budget all these patterns optimize for.

Pagination and Filtering

The mechanics of paginated reads that several patterns on this page depend on.

Build a Nightly Data Sync

A recipe that puts the throttling and pacing patterns from this page into practice.