A partner integration’s performance is mostly determined by a handful of structural choices: which endpoints you call, how you paginate, what response shape you request, and where you cache. This page consolidates the performance guidance scattered across the rest of the docs into a single, scannable reference. The audience is the engineering team scaling an integration past its initial proof-of-concept — when “it works for ten records” is becoming “it needs to work for fifty thousand.”

The two constraints that matter

Two constraints shape almost every performance decision:
| Constraint | What it implies |
| --- | --- |
| The 5,000-requests-per-hour rate limit per Virtuous organization. | Every API call counts. Sustained throughput above this rate is impossible without engineering escalation. Design with it as a hard ceiling. |
| The single-threaded write path for Transactions. | Contact Transactions and Gift Transactions are processed by the nightly batch: submission is fast, but resolution is asynchronous. Don’t wait for resolution synchronously. |
Most performance issues partner integrations hit are about one of these two. The patterns below all map back to handling them. See Rate Limits for the detailed rate-limit reference and Transactions for the holding-state model.

Prefer webhooks over polling

The single highest-leverage performance decision: use webhooks for change detection wherever Virtuous publishes events for the change.
| Polling cost | Webhook cost |
| --- | --- |
| Every poll consumes a request from your rate-limit budget, even when nothing has changed. | Webhook deliveries do not consume your rate-limit budget. |
| Polling frequency caps how fresh your data can be. Polling every minute = 60 requests/hour just for one resource type. | Webhooks arrive within seconds of the source event. |
| Catches only changes that happened before your last poll timestamp. | Fires for every event regardless of source (your integration, other integrations, manual UI changes). |
A polling-only sync that runs every 15 minutes against POST /api/Contact/Query and POST /api/Gift/Query makes 96 polls per day per resource. That is at least ~190 requests/day across the two endpoints, and more whenever a poll spans several pages, all spent on basic change detection whether or not anything changed. The webhook equivalent costs zero from your budget. Use polling as a reconciliation backstop, not as the primary signal. See Webhooks Overview and Reconcile Failed Syncs.
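To see how polling cost scales with interval and resource count, a rough budget calculation helps. The sketch below is illustrative only: `pollingCost` and its `pagesPerPoll` parameter are hypothetical names, and it assumes every poll costs the same number of requests.

```javascript
// Rough estimate of the rate-limit budget consumed by polling.
// Assumption: each poll costs `pagesPerPoll` requests (1 when little has changed).
function pollingCost({ intervalMinutes, resourceTypes, pagesPerPoll = 1, hourlyLimit = 5000 }) {
  const pollsPerHour = 60 / intervalMinutes;
  const requestsPerHour = pollsPerHour * resourceTypes * pagesPerPoll;
  return {
    requestsPerHour,
    requestsPerDay: requestsPerHour * 24,
    shareOfHourlyBudget: requestsPerHour / hourlyLimit,
  };
}

// Two resource types polled every 15 minutes.
const cost = pollingCost({ intervalMinutes: 15, resourceTypes: 2 });
console.log(cost.requestsPerHour, cost.requestsPerDay);
```

Shortening the interval or adding resource types multiplies the cost linearly, which is why polling works better as a low-frequency backstop.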

Choose the right response shape

Most read endpoints have multiple response shapes — abbreviated and full. Choose deliberately.

For Contact Queries

| Endpoint | Returns | When to use |
| --- | --- | --- |
| POST /api/Contact/Query | Abbreviated Contact: id, name, contactType, email, phone, address summary | List views, sync deltas, segment exports; most uses. |
| POST /api/Contact/Query/FullContact | Full Contact: all ContactIndividuals, all addresses, all custom fields, all contactReferences | Only when you genuinely need the full record for every result. |
FullContact is meaningfully slower per request. For a 10,000-Contact result set, the difference can be the cost of an hour of clock time and several hundred extra requests on the rate-limit budget.

Pattern: abbreviated query + targeted full fetch

For workflows where you process the abbreviated result and only need full detail for a small subset:
JavaScript
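// lastSync: ISO-8601 timestamp of the previous successful sync.
// pageThrough is a helper (not shown) that collects every page of a Query endpoint.
async function syncDonorTier(token, lastSync) {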
  // 1. Get abbreviated results — cheap
  const allContacts = await pageThrough('https://api.virtuoussoftware.com/api/Contact/Query', {
    groups: [{ conditions: [{ parameter: 'Last Modified Date', operator: 'Is After', value: lastSync }] }],
    take: 1000,
  });

  // 2. Filter down to the subset that needs full detail
  const needsFullDetail = allContacts.filter((c) => contactNeedsTierUpdate(c));

  // 3. Fetch full detail only for the subset
  for (const contact of needsFullDetail) {
    const full = await fetch(`https://api.virtuoussoftware.com/api/Contact/${contact.id}`, {
      headers: { Authorization: `Bearer ${token}` },
    }).then((r) => r.json());
    await processWithFullDetail(full);
  }
}
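The abbreviated query handles the broad scan; targeted full fetches handle the work that needs full data. Each targeted fetch is one request, so the combination costs less than a FullContact query over the entire set only while the subset stays small. If most results need full detail, query FullContact directly.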
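The `pageThrough` helper used above is not defined on this page. One possible shape is sketched below, assuming responses look like `{ list, total }` and that `skip` and `take` travel in the request body, as in the other examples here. The options argument (`token`, `fetchImpl`) is a hypothetical addition that makes the helper testable.

```javascript
// Hypothetical helper: collects every page of a Query endpoint via skip/take.
// Assumes each response is shaped like { list: [...], total: N }.
async function pageThrough(url, body, { token, fetchImpl = fetch } = {}) {
  const take = body.take ?? 1000;
  const all = [];
  let skip = 0;
  let total = Infinity;

  while (skip < total) {
    const response = await fetchImpl(url, {
      method: 'POST',
      headers: { Authorization: `Bearer ${token}`, 'Content-Type': 'application/json' },
      body: JSON.stringify({ ...body, skip, take }),
    });
    if (!response.ok) throw new Error(`Query failed with HTTP ${response.status}`);

    const page = await response.json();
    all.push(...page.list);
    total = page.total;
    if (page.list.length === 0) break; // defensive: never loop on an empty page
    skip += take;
  }
  return all;
}
```

Holding the whole result set in memory is fine for tens of thousands of abbreviated records; for larger sets, processing each page as it arrives is the safer design.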

Paginate with the right strategy

The skip/take pattern works for small to medium result sets. For very large or unstable result sets, ID-cursor pagination is better.

Skip/take

JavaScript
let skip = 0;
const take = 1000;
let page; // declared outside the loop so the while condition can read it
do {
  page = await query({ skip, take });
  process(page.list);
  skip += take;
} while (skip < page.total);
Strengths: simple, predictable, the API’s default. Works fine for result sets up to tens of thousands. Weaknesses:
  • At high skip values, server-side query performance degrades. A request with skip=50000 is slower than skip=0.
  • If records are inserted while paginating, the offsets shift — you can re-process or miss records.

ID-cursor pagination

JavaScript
let cursorId = 0;
const take = 1000;
while (true) {
  const page = await query({
    groups: [{ conditions: [{ parameter: 'Contact Id', operator: 'Greater Than', value: cursorId.toString() }] }],
    sortBy: 'id',
    descending: false,
    skip: 0,
    take,
  });
  if (page.list.length === 0) break;
  process(page.list);
  cursorId = page.list[page.list.length - 1].id;
}
Strengths:
  • Each query is bounded — no skip overhead at high offsets.
  • Stable under concurrent inserts: a record inserted with a new ID lands in a later page, not in a page you’ve already processed.
  • Naturally resumable: persist the cursor between batches and resume from where you stopped.
Weaknesses: more verbose; requires that Contact Id (or equivalent ID parameter) be an indexed filter on the Query endpoint. See Query Contacts by Filters — resumable exports for the full pattern.
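The resumability point can be sketched as a loop that persists the cursor after each page. This is a minimal illustration, not the documented pattern: `queryPage`, `loadCursor`, `saveCursor`, and `handlePage` are hypothetical callbacks you would supply (for example, a wrapper around the ID-cursor query above and a small key-value store).

```javascript
// Sketch of a resumable ID-cursor export. All four callbacks are hypothetical
// and supplied by the caller; none of them is a Virtuous API.
async function exportWithCursor({ queryPage, loadCursor, saveCursor, handlePage, take = 1000 }) {
  let cursorId = (await loadCursor()) ?? 0;
  while (true) {
    // Expected to return records with id > cursorId, sorted ascending by id.
    const list = await queryPage(cursorId, take);
    if (list.length === 0) break;
    await handlePage(list);
    cursorId = list[list.length - 1].id;
    await saveCursor(cursorId); // persist only after the page is fully processed
  }
  return cursorId;
}
```

Because the cursor is saved only after `handlePage` succeeds, a crash mid-page replays that page on restart, so page handling should be idempotent.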

Maximize take for bulk operations

Every paginated read endpoint accepts a take parameter capped at 1,000. Use the cap for bulk operations.
| Pattern | Requests for 100K records |
| --- | --- |
| take: 25 (the default for some endpoints) | 4,000 requests, at least ~48 minutes at the 5,000/hour rate limit |
| take: 100 | 1,000 requests, at least ~12 minutes |
| take: 1000 (the cap) | 100 requests, just over 1 minute |
For interactive UIs showing a few records at a time, smaller take values are fine. For sync, exports, and reconciliation, always use 1,000.
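The arithmetic behind the choice of take is simple enough to script when you plan a bulk job. The helper below is a hypothetical sketch; it assumes requests are dispatched at exactly the 5,000/hour ceiling, so the minutes it reports are a lower bound rather than a prediction.

```javascript
// Request count and minimum duration for a bulk read.
// Assumption: dispatch runs at exactly `hourlyLimit` requests per hour.
function bulkReadCost(totalRecords, take, hourlyLimit = 5000) {
  const requests = Math.ceil(totalRecords / take);
  const minMinutes = (requests * 60) / hourlyLimit;
  return { requests, minMinutes };
}

for (const take of [25, 100, 1000]) {
  console.log(`take=${take}`, bulkReadCost(100000, take));
}
```

Real runs take longer, since request latency and any pacing below the ceiling both add time.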

Filter aggressively in the request, not client-side

A common partner integration anti-pattern: pulling broad result sets and filtering on the client side. Two costs:
  • More requests. Each unnecessary record paginated is rate-limit budget consumed.
  • More data transferred. The full response payload for records you discard is wasted bandwidth.
Push filters into the request body wherever the QueryOptions endpoint exposes the right parameter. A query for “Contacts in California modified in the last 24 hours” should look like this:
{
  "groups": [
    {
      "conditions": [
        { "parameter": "Last Modified Date", "operator": "Is After", "value": "2024-12-14T00:00:00Z" },
        { "parameter": "State", "operator": "Is", "value": "CA" }
      ]
    }
  ],
  "take": 1000
}
Not:
JavaScript
// ❌ Bad — over-fetches and filters client-side
const all = await pageThrough(modifiedSinceQuery);
const california = all.filter((c) => c.state === 'CA');
If a filter you need isn’t available in QueryOptions, that’s the constraint — but exhaust the server-side filter options first.

Cache reference data aggressively

Some data changes rarely and is referenced often. Cache it.

What’s cacheable

| Data | TTL | Why |
| --- | --- | --- |
| QueryOptions for a resource type | 1 day | Filter parameters and operators change rarely. |
| Project list and codes | 1 day | New Projects are added occasionally but the existing set is stable. |
| Campaign list | 1 day | Same as Projects. |
| Premium list | 1 day | Configured at setup; rarely changes during normal operation. |
| RelationshipTypes | 1 week | Almost never changes. |
| GiftCustomFields, ContactCustomFields metadata | 1 day | Adding a new custom field is rare. |

What’s not cacheable

| Data | Why |
| --- | --- |
| Specific Contacts or Gifts | Change frequently; cached records go stale fast. |
| Query results | Result sets are filter-dependent; caching them produces stale data and complex invalidation. |
| Webhook subscription details | Could change; query when needed. |

Implementation pattern

JavaScript
class VirtuousReferenceCache {
  constructor(token) {
    this.token = token;
    this.cache = new Map();
    this.ttls = new Map();
  }

  async getProjects() {
    return this.getCached('projects', () =>
      fetch('https://api.virtuoussoftware.com/api/Project/Query', {
        method: 'POST',
        headers: { Authorization: `Bearer ${this.token}`, 'Content-Type': 'application/json' },
        body: JSON.stringify({ groups: [], take: 1000 }),
      }).then((r) => r.json())
    );
  }

  async getCached(key, fetchFn, ttlSeconds = 86400) {
    const now = Date.now();
    if (this.cache.has(key) && this.ttls.get(key) > now) {
      return this.cache.get(key);
    }
    const value = await fetchFn();
    this.cache.set(key, value);
    this.ttls.set(key, now + ttlSeconds * 1000);
    return value;
  }
}
For multi-tenant integrations, scope the cache per customer — different organizations have different Projects, Campaigns, etc.
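One way to scope caches per customer is a small registry that lazily creates one cache per organization. This is a sketch under assumptions: `TenantCacheRegistry`, `orgId`, and the factory callback are hypothetical names, and the factory would typically construct a `VirtuousReferenceCache` with that organization's token.

```javascript
// Hypothetical registry: one reference cache per Virtuous organization.
// createCache is a caller-supplied factory, e.g.
//   (orgId) => new VirtuousReferenceCache(tokenFor(orgId))
// where tokenFor is your own credential lookup (not shown).
class TenantCacheRegistry {
  constructor(createCache) {
    this.createCache = createCache;
    this.caches = new Map(); // orgId -> cache instance
  }

  forOrg(orgId) {
    if (!this.caches.has(orgId)) {
      this.caches.set(orgId, this.createCache(orgId));
    }
    return this.caches.get(orgId);
  }
}
```

Keeping a separate instance per organization means one customer's Projects or Campaigns can never be served to another, at the cost of one set of reference fetches per customer.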

Run requests concurrently, but bounded

If you have a hundred records to update, you don’t need to do them sequentially. But you also can’t fire all hundred concurrently — that bursts the rate limit and your error rate spikes. The pattern: a bounded concurrency limit, typically 4–8 concurrent in-flight requests.
JavaScript
async function runWithConcurrencyLimit(tasks, limit = 4) {
  const results = [];
  const inFlight = new Set();

  for (const task of tasks) {
    const promise = task().then((result) => {
      inFlight.delete(promise);
      return result;
    });
    inFlight.add(promise);
    results.push(promise);

    if (inFlight.size >= limit) {
      await Promise.race(inFlight);
    }
  }

  return Promise.all(results);
}

// Usage
const updates = pendingUpdates.map((u) => () => updateContact(u));
await runWithConcurrencyLimit(updates, 4);
At a 4-concurrent limit with a typical 200ms request time, throughput peaks at ~20 requests/second, enough to complete 1,000 updates in under a minute.
Bounded concurrency interacts with the rate-limit ceiling. Sustained for an hour, 4 concurrent in-flight requests at 200ms per request works out to ~72,000 requests, far above the 5,000/hour limit. The concurrency limit only helps with burst control; you still need to throttle the overall rate. Pair concurrent execution with a rate limiter that paces dispatch.
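A minimal way to combine the two controls is a worker pool in which each worker reserves a dispatch slot before starting a task. The sketch below is one possible approach, not a prescribed one; `runPaced` and its options are hypothetical, and the 900 ms default (about 4,000 requests/hour) is an illustrative tuning value.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Runs tasks with at most `limit` in flight and at least `minIntervalMs`
// between dispatches. Both values are tuning assumptions, not API constants.
async function runPaced(tasks, { limit = 4, minIntervalMs = 900 } = {}) {
  const results = new Array(tasks.length);
  let nextIndex = 0;
  let nextSlot = Date.now();

  async function worker() {
    while (nextIndex < tasks.length) {
      const index = nextIndex++;
      // Reserve a dispatch slot before waiting so two workers never share one.
      const slot = Math.max(nextSlot, Date.now());
      nextSlot = slot + minIntervalMs;
      const wait = slot - Date.now();
      if (wait > 0) await sleep(wait);
      results[index] = await tasks[index]();
    }
  }

  const workerCount = Math.min(limit, tasks.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results; // same order as the input tasks
}
```

It accepts the same array of task functions as `runWithConcurrencyLimit`. A rejected task rejects the whole run, so wrap tasks that should fail independently.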

Use a token bucket for steady-state pacing

For continuous workloads (incremental sync, ongoing reconciliation), a token bucket smooths request dispatch to fit the rate limit while still allowing modest bursts:
JavaScript
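// Small promise-based delay helper used by acquire().
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

class TokenBucket {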
  constructor(capacity, refillRatePerSecond) {
    this.capacity = capacity;
    this.tokens = capacity;
    this.refillRate = refillRatePerSecond;
    this.lastRefill = Date.now();
  }

  async acquire() {
    while (true) {
      const now = Date.now();
      const elapsed = (now - this.lastRefill) / 1000;
      this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillRate);
      this.lastRefill = now;

      if (this.tokens >= 1) {
        this.tokens -= 1;
        return;
      }

      const waitMs = ((1 - this.tokens) / this.refillRate) * 1000;
      await sleep(waitMs);
    }
  }
}

// 1,200 requests/hour = 0.333/second, capacity 20 to allow modest bursts
const bucket = new TokenBucket(20, 1200 / 3600);

async function makeRateLimitedRequest(...args) {
  await bucket.acquire();
  return fetch(...args);
}
Sized this way (1,200/hour with 20-token capacity), the bucket allows momentary bursts up to 20 requests but settles to a sustainable rate over time. At 1,200/hour the steady-state rate is roughly a quarter of the 5,000/hour cap, so there is ample headroom to absorb the occasional refresh of cached reference data without spilling into rate-limit errors.
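If you would rather derive the bucket parameters from a headroom target than pick a rate by hand, the conversion is a one-liner. The helper below is a hypothetical sketch; `bucketSettings` and its parameter names are illustrative, and the 20% figure is only an example.

```javascript
// Derives token-bucket parameters from an hourly limit and a headroom target.
// headroomPercent is the share of the limit deliberately left unused.
function bucketSettings({ hourlyLimit, headroomPercent, burstCapacity }) {
  const targetPerHour = (hourlyLimit * (100 - headroomPercent)) / 100;
  return {
    capacity: burstCapacity,
    refillRatePerSecond: targetPerHour / 3600,
    targetPerHour,
  };
}

// Example: leave 20% of a 5,000/hour limit unused, allow bursts of 20.
const settings = bucketSettings({ hourlyLimit: 5000, headroomPercent: 20, burstCapacity: 20 });
console.log(settings.targetPerHour, settings.refillRatePerSecond.toFixed(3));
```

The resulting `capacity` and `refillRatePerSecond` can be passed straight to the `TokenBucket` constructor.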

Reuse HTTP connections

Every TCP and TLS handshake adds latency. Reusing connections via HTTP keep-alive eliminates that cost. In Node.js with the global fetch, connection pooling is automatic for the same origin. In other runtimes:
  • Java/Apache HttpClient: configure a connection pool with appropriate maxConnPerRoute.
  • Python/requests: use a Session() object rather than module-level requests.get().
  • Go: use a long-lived http.Client with a configured Transport.
  • Ruby/Net::HTTP: use a long-lived Net::HTTP::Persistent instance (provided by the net-http-persistent gem).
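The performance impact: a TCP plus TLS handshake typically costs tens of milliseconds, so for a typical 10K-request sync, eliminating per-request handshakes can save minutes of cumulative request latency (at 50 ms per handshake, roughly 8 minutes). The saving is free.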

Defer heavy work in webhook handlers

A webhook handler should acknowledge the delivery as quickly as possible and defer processing. Heavy work inside the handler increases the risk of timeout-triggered retries — which produces duplicate deliveries you have to defend against.
JavaScript
// ❌ Bad — heavy work inline
app.post('/webhook', async (req, res) => {
  if (!verifySignature(req)) return res.status(401).send('Invalid');
  const event = JSON.parse(req.body);
  await processEvent(event);                    // could take seconds
  res.status(200).send('OK');
});

// ✅ Good — acknowledge fast, queue for async
app.post('/webhook', async (req, res) => {
  if (!verifySignature(req)) return res.status(401).send('Invalid');
  const event = JSON.parse(req.body);
  await queue.send(event);                      // milliseconds
  res.status(200).send('OK');
});
See Webhooks Overview — The receiver pattern.
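Because retries can deliver the same event more than once, the queue consumer needs a duplicate check. The sketch below keeps recently seen keys in memory with a TTL. It is an illustration only: `DeliveryDeduper` is a hypothetical class, and which payload field uniquely identifies a delivery is an assumption you need to confirm against the webhook payload.

```javascript
// In-memory duplicate-delivery guard. The key should be a stable identifier
// for the delivery; which payload field provides that is an assumption here.
class DeliveryDeduper {
  constructor(ttlMs = 24 * 60 * 60 * 1000) {
    this.ttlMs = ttlMs;
    this.seen = new Map(); // key -> expiry timestamp (ms)
  }

  // Returns true the first time a key is seen within the TTL window.
  markIfNew(key, now = Date.now()) {
    for (const [k, expiry] of this.seen) {
      if (expiry <= now) this.seen.delete(k); // drop expired entries
    }
    if (this.seen.has(key)) return false;
    this.seen.set(key, now + this.ttlMs);
    return true;
  }
}

const deduper = new DeliveryDeduper();
if (deduper.markIfNew('example-delivery-key')) {
  // process the event; skip it when markIfNew returns false
}
```

An in-memory map only protects a single process. With several consumers, the same check belongs in a shared store such as a database unique constraint.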

Batch where the API supports it

Most Virtuous endpoints accept one record per request. A few accept batches:
| Endpoint | Batch |
| --- | --- |
| POST /api/Tag/Bulk | Apply a tag to many Contacts in one request |
| POST /api/ContactNote/Bulk | Create many notes in one request |
For workloads that involve many small writes — applying a “Year-End-2024” tag to ten thousand donors, for instance — these batch endpoints reduce request count by 100x or more. Always check whether a batch endpoint exists before looping over a single-record endpoint.
The CRM+ spec exposes only a small set of batch endpoints. If a workflow requires bulk operations that don’t have a batch endpoint, you may need to escalate to Virtuous engineering for either a per-organization rate-limit exception (see Rate Limits) or a feature request for a new batch endpoint.
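When a batch endpoint exists, the client-side work is mostly splitting the input into chunks. The sketch below shows only that step; the batch size of 100 is an assumption, since this page does not state the per-request limit for either endpoint, and the request payload shape is deliberately left out.

```javascript
// Splits an array into consecutive chunks of at most `size` items.
function chunk(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Ten thousand Contact IDs in batches of 100 (assumed batch size).
const contactIds = Array.from({ length: 10000 }, (_, i) => i + 1);
const batches = chunk(contactIds, 100);
console.log(batches.length); // 100 requests instead of 10,000
```

Each chunk then becomes the body of one batch request, sent through the same pacing you use for single-record calls.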

Monitor the right metrics

Performance regressions are usually visible in one of these metrics before they become user-visible:
| Metric | Investigate when |
| --- | --- |
| Average request latency | Sustained increases (server-side slowness or networking issue) |
| 95th percentile request latency | Outliers growing (some specific query type is slow) |
| Rate-limit headers on responses | X-Rate-Limit-Remaining trending toward zero |
| 429 response count | Any non-zero count is a sign the throttle is misconfigured |
| Queue depth (for async architectures) | Growing depth indicates a downstream bottleneck |
Most well-functioning integrations sit far below the rate limit. If you’re routinely brushing against it, the issue is usually elsewhere — too-frequent polling, overly-broad queries, or missing caching — not “the limit is too low.”
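Two of these metrics are easy to compute in the client itself. The sketch below shows a nearest-rank percentile over latency samples and a reader for the remaining-budget header. Both function names are hypothetical, and returning null when the header is absent is an assumption.

```javascript
// Nearest-rank percentile over a window of latency samples (milliseconds).
function percentile(samples, p) {
  if (samples.length === 0) return NaN;
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(0, rank - 1)];
}

// Reads the remaining budget from a fetch Response-like object.
// Header name taken from the metrics table; null means the header was absent.
function remainingBudget(response) {
  const raw = response.headers.get('X-Rate-Limit-Remaining');
  return raw === null ? null : Number(raw);
}

const latencies = [120, 180, 150, 900, 140, 160, 130, 170, 155, 145];
console.log('p95 latency (ms):', percentile(latencies, 95));
```

Comparing the 95th percentile against the average is what separates a general slowdown from a few slow query types.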

Where to go next

Error Recovery Patterns

The companion practices for handling the inevitable failures.

Rate Limits

The reference for the rate-limit budget all these patterns optimize for.

Pagination and Filtering

The mechanics of paginated reads that several patterns on this page depend on.

Build a Nightly Data Sync

A recipe that puts the throttling and pacing patterns from this page into practice.
Last modified on May 27, 2026