Scaling the Medplum Search Index: Strategies for Mitigating API Search Latency in Large Databases

2026-03-26 16:00:00+00:00

As healthcare systems move to cloud-native architectures, open-source FHIR platforms like Medplum have become popular for their security, flexibility, and adherence to healthcare interoperability standards. However, in large databases, where millions of Patient, Observation, Communication, and DocumentReference resources accumulate, scaling search operations introduces its own database challenges.

Under the hood, Medplum stores FHIR resources in PostgreSQL; this article assumes the resource JSON is held in a JSONB content column. Search queries are served from separate index structures, such as index mapping tables or an external index like Elasticsearch, depending on the deployment. Wherever those structures are populated asynchronously, the search index is eventually consistent with the primary store.

During traffic spikes or bulk clinical ingestions, search indexing can fall behind writes, so a search executed immediately after a resource is created can miss that resource or return stale data. This article details three architectural strategies to mitigate API search latency, scale Postgres-backed FHIR search tables, and keep user interfaces feeling responsive.

🏗️ The Medplum Dual-Path Query Pipeline

To scale search reads while preserving transactional write integrity, Medplum operates a dual-path routing engine. Writes go directly to the primary database, which then pushes change event notifications to an asynchronous background worker. The worker then updates the search index tables, so a search only sees a write once the worker has processed it.
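
The consequence of that split can be shown with a toy in-memory model. This is only a sketch of the behavior described above, not Medplum's internal code: `DualPathStore`, `runIndexWorker`, and the other names are hypothetical, and the "worker" is simply a method call made by hand.

```typescript
// Toy model of a synchronous write path and an asynchronous index path.
// All names are hypothetical and exist only for illustration.
interface PatientRecord {
  id: string;
  email: string;
}

class DualPathStore {
  // Primary store: updated synchronously on every write.
  private primary = new Map<string, PatientRecord>();
  // Search index (email -> id): updated only when the worker runs.
  private emailIndex = new Map<string, string>();
  // Change events waiting for the background worker.
  private pending: PatientRecord[] = [];

  write(record: PatientRecord): void {
    this.primary.set(record.id, record);
    this.pending.push(record);
  }

  // Strongly consistent: reads the primary store directly.
  readById(id: string): PatientRecord | undefined {
    return this.primary.get(id);
  }

  // Eventually consistent: only sees what the worker has indexed so far.
  searchByEmail(email: string): PatientRecord[] {
    const id = this.emailIndex.get(email.toLowerCase());
    const record = id === undefined ? undefined : this.primary.get(id);
    return record ? [record] : [];
  }

  // Stands in for the background worker draining its queue.
  runIndexWorker(): void {
    for (const record of this.pending) {
      this.emailIndex.set(record.email.toLowerCase(), record.id);
    }
    this.pending = [];
  }
}

const store = new DualPathStore();
store.write({ id: 'p1', email: 'patient@example.com' });

const beforeIndexing = store.searchByEmail('patient@example.com'); // index not updated yet
const byId = store.readById('p1'); // primary store is already current
store.runIndexWorker();
const afterIndexing = store.searchByEmail('patient@example.com'); // index has caught up
```

In this model the search issued before `runIndexWorker()` returns an empty array while the read by ID already succeeds, which is the gap the strategies below work around.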

🚀 Strategy 1: Bypassing Search with Direct ID Lookups

Standard patient portal flows often search for resources using specific attributes (e.g. searching for a patient profile by email: GET /fhir/R4/Patient?email=patient@axiobionics.com). If you execute this query immediately after user registration, it can come back empty because the background index worker has not yet indexed the email parameter. Note that a FHIR search with no matches returns 200 OK with an empty Bundle rather than 404 Not Found, so the caller can wrongly conclude that the patient does not exist.

For all critical write-to-read sequences (such as magic link authorization gates or background relational synchronizers), do not search. Instead, pass the newly created resource's UUID directly and perform a primary key lookup (readResource). Primary key queries read from Postgres directly, completely bypassing eventual consistency delays:

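// resource-fetcher.ts (Direct Primary Key Fallback Pattern)
import { MedplumClient } from '@medplum/core';
// FHIR resource type definitions are published in @medplum/fhirtypes
import { Patient } from '@medplum/fhirtypes';
// Application-level structured logger (local module, not part of Medplum)
import { logger } from './logger.js';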

export class ResilientIntakeClient {
  private medplum: MedplumClient;

  constructor(medplum: MedplumClient) {
    this.medplum = medplum;
  }

  /**
   * Safe fetch: reads by primary key first, and falls back to a retried search only if that read fails
   */
  public async getPatientData(patientId: string, email: string): Promise<Patient> {
    try {
      // 1. Primary path: Fast primary key read (Strongly consistent)
      logger.info({ patientId }, 'Attempting strong-consistency primary key read...');
      return await this.medplum.readResource('Patient', patientId);
    } catch (err) {
      logger.warn({ err, patientId }, 'Primary key read failed. Falling back to search index with retry...');
      
      // 2. Fallback path: Retried search parameters
      return await this.searchPatientWithRetry(email);
    }
  }

  private async searchPatientWithRetry(email: string, maxAttempts = 3): Promise<Patient> {
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
      const results = await this.medplum.searchResources('Patient', { email });

      if (results.length > 0) {
        return results[0] as Patient;
      }

      logger.warn(
        { attempt, email },
        'Search returned no match; the index may not have caught up yet. Retrying query...'
      );
      // Linear backoff delay to allow background indexing to complete.
      // Skip the wait after the final attempt, since no retry follows it.
      if (attempt < maxAttempts) {
        await new Promise((resolve) => setTimeout(resolve, 1500 * attempt));
      }
    }

    throw new Error(`Patient record with email ${email} could not be resolved from index.`);
  }
}
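
The class above takes a patientId, and the simplest place to get one is the resource returned by the create call itself. The sketch below shows that create-then-read flow under stated assumptions: `FhirStore` is a hypothetical minimal interface with only the two calls the flow needs, and `InMemoryStore` is a stand-in so the example runs without a server.

```typescript
// Hypothetical minimal shapes; only what this flow needs.
interface StoredResource {
  resourceType: string;
  id?: string;
}

interface FhirStore {
  createResource<T extends StoredResource>(resource: T): Promise<T>;
  readResource(resourceType: string, id: string): Promise<StoredResource>;
}

// Create the resource, then re-read it by the ID assigned at creation.
// No search request is involved, so index lag cannot affect the result.
async function createThenReadById<T extends StoredResource>(
  store: FhirStore,
  draft: T
): Promise<StoredResource> {
  const created = await store.createResource(draft);
  if (!created.id) {
    throw new Error('Created resource came back without an id');
  }
  return store.readResource(created.resourceType, created.id);
}

// In-memory stand-in used only to exercise the flow.
class InMemoryStore implements FhirStore {
  private rows = new Map<string, StoredResource>();
  private nextId = 1;

  async createResource<T extends StoredResource>(resource: T): Promise<T> {
    const saved = { ...resource, id: `id-${this.nextId++}` };
    this.rows.set(`${saved.resourceType}/${saved.id}`, saved);
    return saved;
  }

  async readResource(resourceType: string, id: string): Promise<StoredResource> {
    const row = this.rows.get(`${resourceType}/${id}`);
    if (!row) {
      throw new Error(`${resourceType}/${id} not found`);
    }
    return row;
  }
}
```

With a real MedplumClient the same flow is a createResource call followed by readResource('Patient', created.id); the interface here only exists to keep the example self-contained.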

🎨 Strategy 2: Frontend Optimistic UI Updates

When a patient posts a new chat message or uploads a document in a portal's communication feed, triggering an immediate API search refresh can make the message bubble flicker or fail to render, because the search index has not yet caught up with the write.

To resolve this, the frontend must implement an Optimistic UI Pattern. When a message is sent, we append a mock/temporary Communication resource directly to the local React state. This renders the bubble instantly. The portal only re-fetches from the server search endpoint during manual refreshes or long-polling cycles:

// MessageThreadComponent.tsx (Optimistic UI Messaging Component)
import React, { useState } from 'react';
import { Box, TextInput, Button, Stack, Group, Card, Text } from '@mantine/core';
import { notifications } from '@mantine/notifications';
import { useMedplum } from '@medplum/react';
import { Communication } from '@medplum/fhirtypes';

export function MedplumMessageThread() {
  const medplum = useMedplum();
  const [messages, setMessages] = useState<Communication[]>([]);
  const [text, setText] = useState('');
  const [isSending, setIsSending] = useState(false);

  const handleSendMessage = async () => {
    if (!text.trim()) return;

    // Capture the input and timestamp once, before the field is cleared
    const body = text;
    const sent = new Date().toISOString();
    const tempId = 'optimistic-' + Math.random().toString(36).substring(2);
    
    // 1. Construct temporary optimistic resource
    const optimisticMessage: Communication = {
      resourceType: 'Communication',
      id: tempId,
      status: 'completed',
      sent,
      payload: [{ contentString: body }]
    };

    // 2. Render instantly in local state
    setMessages(prev => [...prev, optimisticMessage]);
    setText('');
    setIsSending(true);

    try {
      // 3. Post to Medplum server (same content, without the temporary id)
      const savedMessage = await medplum.createResource<Communication>({
        resourceType: 'Communication',
        status: 'completed',
        sent,
        payload: [{ contentString: body }]
      });

      // 4. Swap the optimistic resource with the verified server resource
      setMessages(prev => 
        prev.map(m => m.id === tempId ? savedMessage : m)
      );
    } catch (err) {
      // 5. Rollback on API failure
      setMessages(prev => prev.filter(m => m.id !== tempId));
      notifications.show({ color: 'red', message: 'Failed to deliver message. Please retry.' });
    } finally {
      setIsSending(false);
    }
  };

  return (
    <Stack gap="md" style={{ maxWidth: 500, margin: '20px auto' }}>
      <Box style={{ minHeight: 300, border: '1px solid #eee', padding: 15, borderRadius: 8 }}>
        {messages.map(msg => (
          <Card 
            key={msg.id} 
            mt="xs" 
            shadow="sm"
            // Set overflow visible to prevent truncating nested attachment structures
            style={{ overflow: 'visible', height: 'auto' }}
          >
            <Text size="sm">{msg.payload?.[0]?.contentString}</Text>
          </Card>
        ))}
      </Box>
      <Group>
        <TextInput 
          value={text} 
          onChange={(e) => setText(e.target.value)} 
          placeholder="Type your clinical update..." 
          style={{ flex: 1 }}
        />
        <Button onClick={handleSendMessage} loading={isSending}>Send</Button>
      </Group>
    </Stack>
  );
}
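
When the thread is eventually re-fetched from the search endpoint, the result can still be missing messages that were written moments earlier. One way to handle this is to merge the server result into local state instead of replacing it. The following is a minimal sketch, assuming a simplified `ThreadMessage` shape (hypothetical, standing in for Communication) and a hypothetical `mergeThread` helper:

```typescript
// Simplified stand-in for a FHIR Communication; hypothetical shape for illustration.
interface ThreadMessage {
  id: string;
  sent: string; // ISO 8601 UTC timestamp
  text: string;
}

// Merge a fresh server result with messages already held in local state.
// Server copies win when ids match. Local-only messages (optimistic entries,
// or saved messages the index has not caught up with) are kept, not dropped.
function mergeThread(local: ThreadMessage[], fromServer: ThreadMessage[]): ThreadMessage[] {
  const byId = new Map<string, ThreadMessage>();
  for (const msg of local) {
    byId.set(msg.id, msg);
  }
  for (const msg of fromServer) {
    byId.set(msg.id, msg);
  }
  // ISO 8601 UTC timestamps in one format order correctly as plain strings.
  return Array.from(byId.values()).sort((a, b) =>
    a.sent < b.sent ? -1 : a.sent > b.sent ? 1 : 0
  );
}

const localMessages: ThreadMessage[] = [
  { id: 'a1', sent: '2026-03-26T16:00:00.000Z', text: 'Hello' },
  { id: 'optimistic-x7', sent: '2026-03-26T16:00:05.000Z', text: 'Just sent' },
];
const serverMessages: ThreadMessage[] = [
  { id: 'a1', sent: '2026-03-26T16:00:00.000Z', text: 'Hello (server copy)' },
];
const mergedThread = mergeThread(localMessages, serverMessages);
```

Here the just-sent message stays on screen even though the server result does not include it yet. A production version would also need a rule for expiring local-only entries, for example when a message has been deleted on the server; that is left out of this sketch.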

🗄️ Strategy 3: Postgres JSONB Search Index Optimization

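When a Medplum database grows to millions of records, queries that fall back to sequential scans slow the API down. To keep FHIR search queries fast, consider creating explicit indexes on high-frequency JSONB search paths inside your Postgres database instance. The statements below assume the resource JSON lives in a jsonb column named content; if your deployment stores it as text, cast it (content::jsonb) in the index definition and in the queries that should use it: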

1. PostgreSQL JSONB GIN Indexing

A GIN (Generalized Inverted Index) index is designed for composite values such as JSON documents. With a GIN index on the resource payload, Postgres can answer containment (@>) and key-existence (?) queries over nested attributes and arrays from the index instead of scanning every row:

CREATE INDEX idx_patient_data_gin ON "Patient" USING gin (content);
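
A GIN index with the default jsonb operator class only helps queries written with the operators it supports, such as containment (@>). The sketch below builds the right-hand side of such a containment filter for a telecom entry. The helper name is hypothetical, and the table and column names are taken from the index statement above.

```typescript
// Build the JSON document used on the right-hand side of the jsonb
// containment operator (@>). Hypothetical helper for illustration.
function telecomContainment(system: string, value: string): string {
  return JSON.stringify({ telecom: [{ system, value }] });
}

// Parameterized query text; the filter is passed as $1 and cast to jsonb.
const containmentQuery = 'SELECT id FROM "Patient" WHERE content @> $1::jsonb';
const containmentParam = telecomContainment('email', 'patient@example.com');
// containmentParam is {"telecom":[{"system":"email","value":"patient@example.com"}]}
```

Containment matches any Patient whose telecom array has an element with that system and value. The comparison is exact and case-sensitive, which is one reason a tailored expression index can be a better fit for email lookups.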

2. Tailored Expression Indexes (B-Tree)

If your clinical dashboard constantly queries specific attributes (like search by email or MRN), a B-Tree index on the exact expression is typically smaller and faster for equality lookups than a general GIN index over the whole document:

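-- Fast index lookup for Patient email search.
-- #>> '{}' extracts the matched value as plain text; a ::text cast would keep the JSON quotes.
-- Queries must repeat this same expression in their WHERE clause for the index to be used.
CREATE INDEX idx_patient_email ON "Patient" (
  lower(jsonb_path_query_first(content, '$.telecom[*] ? (@.system == "email").value') #>> '{}')
);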

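By combining GIN indexes with targeted expression indexes, lookups on the indexed paths can use index scans instead of sequential scans, which keeps search latency low even on tables with millions of rows. The actual gain depends on data volume, hardware, and query shape, so confirm with EXPLAIN ANALYZE that your queries use the new indexes.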

📈 Summary of Benefits

Designing your clinical identity and transaction endpoints around search consistency models provides key operational benefits:

Low Perceived Latency: Implementing frontend Optimistic UI updates hides background index processing lag from patients and providers in the common case.
Immunity to Stale Reads: Shifting transaction pipelines to direct primary key lookups (readResource) means critical write-to-read sequences read from the primary store rather than from an index that may lag behind it.
Fast Indexed Queries: Tuning Postgres instances with generalized GIN indexes and targeted expression indexes helps search dashboards remain fast as databases grow.
Seamless System Integration: Decoupling write completions from eventually consistent search indexes eliminates race conditions between clinical portals and internal relational EHR synchronizers.

By combining direct primary key retrievals on the backend, Optimistic UI structures in the frontend, and tuned JSONB expression indexes in the database layer, you build a resilient, high-integrity medical application capable of scaling to millions of patient events.