Real-Time Patient Notifications with FHIR SSE

Insights from Alvin Henrick, AVP of Engineering at b.well Connected Health

Why do you need FHIR SSE? Once or twice in your life, you’ve hit this wall. You move from Dallas, Texas, to San Francisco. New city, new provider, new health system, and the first question at your new doctor’s office is, “Can you get your records transferred?” You call your old clinic, and they fax something, maybe. Three weeks later, half your history is missing, and you’re repeating labs you did six months ago.

Now flip the script. What if your health data followed you in real time? What if the moment your old provider updated a record, your new provider’s system knew about it? What if an AI health assistant on your phone could tell you, “Your lab results just came in, here’s what changed since last time,” seconds after your doctor signed off?

That’s what FHIR Subscriptions with Server-Sent Events enable. But building it for real patients, at scale, in a multi-tenant healthcare platform is where the engineering gets interesting.

What We’re Solving at b.well

At b.well Connected Health, this is the kind of engineering challenge we wake up to every day. Our platform aggregates health data from hundreds of sources (payers, providers, labs, and pharmacies) into a unified, FHIR-native data layer. The patient doesn’t care which system their data lives in. They care that it’s there when they need it, and that they know when something changes.

We’re solving these hard distributed systems problems (cross-pod event delivery, multi-tenant security isolation, real-time credential lifecycle) not as academic exercises, but because a patient in San Francisco shouldn’t have to wait three weeks for their Dallas records, and because when a lab result is ready, the patient’s AI health assistant should know before the patient remembers to check the portal.

Healthcare engineering is uniquely unforgiving. You can’t “move fast and break things” when the things are patient records. You can’t have eventual consistency in tenant isolation, since a single cross-tenant leak is a HIPAA incident. You can’t have a notification gap where a critical lab result silently disappears because a pod restarted during a deploy.

The engineering challenges below are how we improve healthcare and patients’ lives. Every architectural decision here traces back to a patient experience: faster notifications, safer data delivery, zero dropped events, and a foundation that AI agents can build on to give patients a real-time understanding of their own health.

The Gap Between Spec and Reality

The FHIR R5 Subscription Backport IG defines channel types for delivering notifications: rest-hook, websocket, email, sms, message. In the age of AI agents and LLM-powered patient experiences, polling a FHIR server every N minutes for changes is not going to cut it. Patients expect to know immediately when their lab result drops, when an appointment changes, and when a care plan is updated, not on the next poll cycle.

FHIR SSE is not a first-class channel in the spec. But §2.4.0.0.5 of the Backport IG explicitly leaves room for custom channel types. That’s the conformance hook we use.

The real challenge isn’t the protocol; SSE is well-understood. The challenge is everything the spec doesn’t address:

How do you deliver events to the right patient without sticky sessions?
How do you revoke credentials on an already-open stream?
What happens when a JWT expires mid-stream on a 24-hour connection?
How do you ensure no PHI leaks through the notification channel itself?
How do you scale horizontally without duplicating every event to every pod?

The FHIR SSE Architecture

Here’s what we built:

The key insight: one pod processes each Kafka message (shared consumer group), then Redis Pub/Sub fans it out to every pod that has a connected client for that subscription. No sticky sessions. No duplicate consumption. Any pod can accept any client connection.
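To make the fan-out concrete, here is a minimal in-memory sketch of the pattern. It is an illustration, not the production code: InMemoryPubSub and SequenceCounter are hypothetical stand-ins for Redis Pub/Sub and Redis INCR, the round-robin loop stands in for a shared Kafka consumer group, and the channel name format sse:<subscription_id> is an assumption.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class InMemoryPubSub:
    """Stand-in for Redis Pub/Sub: every subscriber on a channel sees every message."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, channel: str, handler: Callable[[dict], None]) -> None:
        self._handlers[channel].append(handler)

    def publish(self, channel: str, message: dict) -> None:
        for handler in list(self._handlers.get(channel, [])):
            handler(message)


class SequenceCounter:
    """Stand-in for Redis INCR: one monotonically increasing counter per key."""

    def __init__(self) -> None:
        self._counters: Dict[str, int] = defaultdict(int)

    def incr(self, key: str) -> int:
        self._counters[key] += 1
        return self._counters[key]


class Pod:
    def __init__(self, name: str, bus: InMemoryPubSub, sequences: SequenceCounter) -> None:
        self.name = name
        self.bus = bus
        self.sequences = sequences
        self.delivered: Dict[str, List[dict]] = defaultdict(list)  # client_id -> events

    def connect_client(self, client_id: str, subscription_id: str) -> None:
        # Any pod can accept any client: it just listens on the subscription's channel.
        self.bus.subscribe(
            f"sse:{subscription_id}",
            lambda event: self.delivered[client_id].append(event),
        )

    def consume(self, message: dict) -> None:
        # Exactly one pod runs this per message, so each event gets one sequence number.
        sub = message["subscription_id"]
        event = {
            "subscription_id": sub,
            "sequence_number": self.sequences.incr(sub),
            "focus": message["focus"],
            "processed_by": self.name,
        }
        self.bus.publish(f"sse:{sub}", event)


def run_demo() -> List[Pod]:
    bus, sequences = InMemoryPubSub(), SequenceCounter()
    pods = [Pod(f"pod-{i}", bus, sequences) for i in range(1, 4)]
    pods[1].connect_client("patient-app", "sub-1")
    pods[2].connect_client("ai-agent", "sub-1")
    messages = [{"subscription_id": "sub-1", "focus": f"Observation/lab-{n}"} for n in range(3)]
    # Shared consumer group: each message is handed to exactly one pod.
    for i, message in enumerate(messages):
        pods[i % len(pods)].consume(message)
    return pods
```

In this sketch, pod-1 never holds a client connection, yet the events it consumes still reach the clients attached to pod-2 and pod-3, each exactly once and in sequence order.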

Why Not Sticky Sessions?

Sticky sessions break during rolling deployments. They break during autoscaling. They make your load balancer a single point of failure for specific connections. In healthcare, “Sorry, your notification stream died because we deployed a patch,” is not acceptable.

Why Not One Consumer Group Per Pod?

With 10 pods, you’d process every Kafka message 10 times. That’s wasteful, and it breaks exactly-once semantics for sequence number assignment.

The Custom FHIR SSE Channel

Our FHIR SSE channel is a superset of the Backport IG, and the notification Bundle shape is IG-compatible.

The payload is id-only (the claim-check pattern). No PHI on the wire. The client gets a reference (Observation/lab-123), then fetches the actual resource from the FHIR server with its own JWT. The FHIR server re-authorizes that fetch, so even if something goes wrong in the notification layer, the patient only sees what their token allows.

This is deliberate. In an LLM-agent world where AI assistants subscribe to health events on behalf of patients, the claim-check pattern means the notification channel is a signal layer, not a data layer. The AI agent gets “something changed,” fetches the resource through proper authorization, then summarizes it for the patient. Two authorization checkpoints instead of one.
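As a rough illustration of what an id-only notification can look like, the sketch below builds an abridged Bundle and pulls the focus reference back out of it. The shape is modeled loosely on the R4 flavor of the Backport IG (a history Bundle whose first entry is a Parameters-based subscription status); it is not the exact payload from this platform, real bundles carry more (profiles, request/response entries), and the parameter names should be checked against the IG before relying on them. The function names and the example topic URL are hypothetical.

```python
from typing import List


def build_id_only_notification(
    subscription_ref: str,
    topic_url: str,
    event_number: int,
    focus_ref: str,
    timestamp: str,
) -> dict:
    """Build an abridged id-only notification: a reference, never the resource itself."""
    status = {
        "resourceType": "Parameters",
        "parameter": [
            {"name": "subscription", "valueReference": {"reference": subscription_ref}},
            {"name": "topic", "valueCanonical": topic_url},
            {"name": "status", "valueCode": "active"},
            {"name": "type", "valueCode": "event-notification"},
            {"name": "events-since-subscription-start", "valueString": str(event_number)},
            {
                "name": "notification-event",
                "part": [
                    {"name": "event-number", "valueString": str(event_number)},
                    {"name": "timestamp", "valueInstant": timestamp},
                    # The only pointer to clinical data: an id, not content.
                    {"name": "focus", "valueReference": {"reference": focus_ref}},
                ],
            },
        ],
    }
    return {
        "resourceType": "Bundle",
        "type": "history",
        "timestamp": timestamp,
        "entry": [{"resource": status}],
    }


def focus_references(bundle: dict) -> List[str]:
    """Return the references a client should fetch (and be re-authorized for)."""
    refs: List[str] = []
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        if resource.get("resourceType") != "Parameters":
            continue
        for param in resource.get("parameter", []):
            if param.get("name") != "notification-event":
                continue
            for part in param.get("part", []):
                if part.get("name") == "focus":
                    refs.append(part["valueReference"]["reference"])
    return refs
```

A client would call focus_references on each notification and then issue an ordinary FHIR read for each reference using its own token.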

The Hard Parts

1. Cross-Pod Event Delivery

Pod 1 processes the Kafka message. Pod 2 has the patient’s connection. Redis Pub/Sub bridges the gap. If Redis is down, clients recover missed events from ClickHouse on reconnect using Last-Event-Id.

2. Dual-Dimension Security Filtering

Every event carries security codes extracted from the FHIR resource’s meta.security:

Access codes — which organization can see this resource
Owner codes — which partner owns this resource

The patient’s JWT carries matching scopes. At delivery time, we intersect the event’s codes with the token’s scopes.

This is multi-tenancy at the event level, not the connection level. Two subscribers on the same pod, even on the same subscription ID, will see different events when their tenant security tags differ. Cross-tenant leakage is structurally impossible because filtering happens per-event, per-delivery.
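The per-event check can be sketched as a pair of set intersections. This is a minimal illustration under two assumptions that the text above does not spell out: an event is delivered only when both dimensions overlap, and an event with no codes in a dimension is never delivered (fail closed). The Event and TokenScopes types and the code values in the usage note are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Event:
    focus: str
    access_codes: FrozenSet[str]  # from meta.security: who may see the resource
    owner_codes: FrozenSet[str]   # from meta.security: who owns the resource


@dataclass(frozen=True)
class TokenScopes:
    access: FrozenSet[str]  # access codes granted by the JWT
    owner: FrozenSet[str]   # owner codes granted by the JWT


def may_deliver(event: Event, scopes: TokenScopes) -> bool:
    """Deliver only if the token overlaps the event on BOTH dimensions.

    Assumption: an empty intersection on either dimension blocks delivery,
    so an untagged event is dropped rather than broadcast.
    """
    access_ok = bool(event.access_codes & scopes.access)
    owner_ok = bool(event.owner_codes & scopes.owner)
    return access_ok and owner_ok
```

Because the check runs for every event on every delivery, two streams attached to the same subscription can be evaluated against different tokens and receive different events, for example one holding frozenset({"org-a"}) and the other frozenset({"org-b"}) as access scopes.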

3. Token Expiry Mid-Stream

SSE connections live up to 24 hours. JWTs live ~1 hour. The spec says nothing about this, so we implemented our own token lifecycle.

The token-expiring warning gives clients 5 minutes to prepare. On reconnect, Last-Event-Id ensures zero data loss. The gap between disconnect and reconnect is covered by ClickHouse persistence.
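The timing can be sketched as a small planning function. This is a hedged illustration rather than the deployed logic: it assumes the server closes the stream at token expiry (or at the 24-hour cap, whichever comes first) and sends the token-expiring warning five minutes before expiry. The names StreamPlan, plan_stream, and next_action are hypothetical, and times are plain epoch seconds.

```python
from dataclasses import dataclass

WARNING_LEAD_SECONDS = 5 * 60            # warning window described above
MAX_CONNECTION_SECONDS = 24 * 60 * 60    # upper bound on one SSE connection


@dataclass(frozen=True)
class StreamPlan:
    warn_at: float   # when to emit the token-expiring event
    close_at: float  # when to end the stream so the client reconnects


def plan_stream(connected_at: float, token_exp: float) -> StreamPlan:
    """Derive warning and close times from the JWT's exp claim."""
    close_at = min(token_exp, connected_at + MAX_CONNECTION_SECONDS)
    # Never schedule the warning before the connection even exists.
    warn_at = max(connected_at, token_exp - WARNING_LEAD_SECONDS)
    return StreamPlan(warn_at=warn_at, close_at=close_at)


def next_action(plan: StreamPlan, now: float, warned: bool) -> str:
    """Decide what the stream should do at time `now`."""
    if now >= plan.close_at:
        return "close"
    if now >= plan.warn_at and not warned:
        return "send-token-expiring"
    return "stream"
```

For a token issued with a one-hour lifetime at connect time, this plan warns at minute 55 and closes at minute 60; the client is then expected to reconnect with a fresh token and its Last-Event-Id.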

4. Real-Time Credential Revocation

When an admin revokes a client credential, active SSE connections for that client must terminate immediately, not on the next token refresh, not on the next heartbeat, now.

Revocation propagates across all pods via a dedicated Redis Pub/Sub control channel. The revocation is also persisted with a TTL equal to max JWT lifetime, so if a pod restarts, it loads active revocations before accepting connections. No zombie streams.
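A minimal sketch of that revocation flow follows. RevocationStore is a hypothetical stand-in for the persisted records with a TTL, the loop in revoke_everywhere stands in for the Redis Pub/Sub control channel, and the pod's in-memory revoked set is a simplification that is only refreshed at startup.

```python
import time
from typing import Callable, Dict, List, Set


class RevocationStore:
    """Stand-in for persisted revocations whose TTL equals the max JWT lifetime."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.time) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._expires_at: Dict[str, float] = {}

    def add(self, client_id: str) -> None:
        self._expires_at[client_id] = self._clock() + self._ttl

    def active(self) -> Set[str]:
        now = self._clock()
        return {cid for cid, exp in self._expires_at.items() if exp > now}


class Pod:
    def __init__(self, store: RevocationStore) -> None:
        # Load active revocations BEFORE accepting any connection.
        self.revoked: Set[str] = store.active()
        self.open_streams: Dict[str, Set[str]] = {}

    def accept(self, client_id: str, stream_id: str) -> bool:
        if client_id in self.revoked:
            return False
        self.open_streams.setdefault(client_id, set()).add(stream_id)
        return True

    def on_revocation(self, client_id: str) -> Set[str]:
        """Handle a control-channel message: block the client, close its streams."""
        self.revoked.add(client_id)
        return self.open_streams.pop(client_id, set())


def revoke_everywhere(client_id: str, store: RevocationStore, pods: List[Pod]) -> Set[str]:
    store.add(client_id)  # persist first so restarted pods also see it
    closed: Set[str] = set()
    for pod in pods:      # stands in for the control-channel broadcast
        closed |= pod.on_revocation(client_id)
    return closed
```

Persisting before broadcasting means a pod that restarts in the middle of the broadcast still refuses the revoked client when it comes back up.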

5. The Replay Safety Net

ClickHouse stores every event for 7 days, partitioned by date, ordered by (subscription_id, sequence_number). This is the safety net that makes the entire fire-and-forget architecture safe:

Redis Pub/Sub down? → Client reconnects, replays from ClickHouse.
Pod crashed? → Client reconnects, replays from ClickHouse.
Network blip? → Client reconnects with Last-Event-Id, picks up where it left off.

The key design decision: ClickHouse is the durability layer, not Redis. Redis Pub/Sub is ephemeral by design, and that’s fine; it’s the fast path. ClickHouse is the reliable path.
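The replay lookup itself is simple once events are ordered by (subscription_id, sequence_number). The sketch below uses an in-memory list as a hypothetical stand-in for the ClickHouse table and assumes the Last-Event-Id value is the integer sequence number of the last event the client saw; the 7-day retention is applied as a filter at read time.

```python
import bisect
from collections import defaultdict
from typing import Dict, List, Tuple

RETENTION_SECONDS = 7 * 24 * 60 * 60  # events are kept for 7 days


class EventLog:
    """In-memory stand-in for the durable event table."""

    def __init__(self) -> None:
        # subscription_id -> sorted list of (sequence_number, stored_at, focus)
        self._events: Dict[str, List[Tuple[int, float, str]]] = defaultdict(list)

    def append(self, subscription_id: str, seq: int, stored_at: float, focus: str) -> None:
        bisect.insort(self._events[subscription_id], (seq, stored_at, focus))

    def replay(self, subscription_id: str, last_event_id: int, now: float) -> List[Tuple[int, str]]:
        """Return (sequence_number, focus) for events after last_event_id, in order."""
        events = self._events.get(subscription_id, [])
        # First index whose sequence number is strictly greater than last_event_id.
        start = bisect.bisect_right(events, (last_event_id, float("inf"), ""))
        cutoff = now - RETENTION_SECONDS
        return [(seq, focus) for seq, stored_at, focus in events[start:] if stored_at >= cutoff]
```

On reconnect, a server following this sketch would send the replayed events first and only then attach the client to the live fast path.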

The Patient Perspective in the AI Era

Here’s where it gets interesting. As engineers building healthcare infrastructure, we’re also patients. When you build a real-time notification system, you’re building the substrate that AI patient agents will consume.

Consider the flow for an LLM-powered health assistant:

Patient grants access to their health data (SMART on FHIR scopes)
AI agent creates a FHIR Subscription for relevant resource types
AI agent opens SSE connection, receives id-only notifications
On each notification: fetches resource, interprets it, notifies patient in plain language
Patient asks follow-up questions; agent has full context

The id-only pattern is perfect for this. The AI agent doesn’t need the raw FHIR Bundle streaming over SSE; it needs a signal that something changed, then it fetches, interprets, and acts. Two authorization checkpoints (SSE access filter + FHIR server re-auth) mean the patient’s data stays protected even when an AI intermediary is involved.
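On the client side, the agent loop amounts to parsing the event stream, remembering the last event id, and fetching each referenced resource. The sketch below is a simplified text/event-stream parser plus a handler; the event name notification, the JSON data shape with a focus key, and the fetch_resource callback are assumptions for illustration, while token-expiring is the warning event described earlier.

```python
import json
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List, Optional, Tuple


@dataclass
class SseEvent:
    id: Optional[str]
    event: str
    data: str


def parse_sse(lines: Iterable[str]) -> Iterator[SseEvent]:
    """Parse text/event-stream lines; a blank line dispatches the pending event."""
    event_id: Optional[str] = None  # the last id persists across events
    event_type = "message"
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if data:
                yield SseEvent(event_id, event_type, "\n".join(data))
            event_type, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, commonly used as a heartbeat
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]
        if field == "id":
            event_id = value
        elif field == "event":
            event_type = value
        elif field == "data":
            data.append(value)


def handle_stream(
    lines: Iterable[str],
    fetch_resource: Callable[[str], object],
) -> Tuple[Optional[str], List[object]]:
    """Fetch each referenced resource; return the id to send as Last-Event-Id."""
    last_id: Optional[str] = None
    fetched: List[object] = []
    for event in parse_sse(lines):
        last_id = event.id
        if event.event == "notification":
            focus = json.loads(event.data)["focus"]
            # The fetch goes through the FHIR server, which re-authorizes it.
            fetched.append(fetch_resource(focus))
    return last_id, fetched
```

The returned id is what the agent would present as Last-Event-Id on its next connection, so the interpretation step never has to reason about gaps.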

The FHIR SSE channel becomes the nervous system. The LLM becomes the interpreter. The patient gets real-time, understandable health updates instead of a portal they check once a week.

What the FHIR Spec Gives You vs. What You Need

Concern	FHIR Spec Provides	What You Actually Need
Event format	✓︎ Notification Bundle shape	Same
Payload content levels	✓︎ empty / id-only / full-resource	Same
Last-Event-Id replay	✓︎ Protocol-level	ClickHouse backing store + sequence numbers
Multi-tenancy	✗︎ Not addressed	Per-event security filtering
Cross-pod delivery	✗︎ Not addressed	Redis Pub/Sub fan-out
Token lifecycle	✗︎ Not addressed	Proactive expiry warnings + graceful reconnect
Credential revocation	✗︎ Not addressed	Cross-pod revocation propagation
Horizontal scaling	✗︎ Not addressed	Shared consumer group + fan-out pattern
PHI protection	✗︎ Not addressed	id-only + re-authorization on fetch

The spec gives you the protocol. The hard part — and the interesting engineering — is everything around it.

Key Technology Choices

Component	Technology	Why
Event ingestion	Kafka (shared consumer group)	Already in CDC pipeline, ordering guarantees via partition key
Cross-pod delivery	Redis Pub/Sub	Lightweight, no persistence needed (ClickHouse is safety net)
Sequence numbers	Redis INCR	Distributed atomic counter, fast
Event persistence	ClickHouse	Time-series optimized, columnar, built-in TTL, cheap storage
SSE streaming	Spring WebFlux (Reactor)	Non-blocking, handles thousands of long-lived connections
Auth	Multi-issuer JWT (JWKS)	Enterprise reality: multiple IdPs
Resilience	Circuit breakers (Resilience4j)	ClickHouse down shouldn’t kill live delivery
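The resilience row can be illustrated with a small circuit breaker around the persistence call. This is a generic sketch of the pattern, not Resilience4j's API: CircuitBreaker, deliver, and the threshold and reset values are hypothetical, and it assumes live delivery should proceed even when the write to the event store is skipped or fails.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,
        reset_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._threshold = failure_threshold
        self._reset = reset_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def allow(self) -> bool:
        if self._opened_at is None:
            return True  # closed: calls pass through
        # Open: block calls until the reset window passes, then allow a probe.
        return self._clock() - self._opened_at >= self._reset

    def record_success(self) -> None:
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self._threshold:
            self._opened_at = self._clock()


def deliver(
    event: dict,
    publish_live: Callable[[dict], None],
    persist: Callable[[dict], None],
    breaker: CircuitBreaker,
) -> bool:
    """Publish on the fast path regardless; return True if the event was persisted."""
    persisted = False
    if breaker.allow():
        try:
            persist(event)
            breaker.record_success()
            persisted = True
        except Exception:
            breaker.record_failure()
    publish_live(event)  # a storage outage must not block live delivery
    return persisted
```

The trade-off in this sketch is explicit: while the breaker is open, connected clients keep receiving events, but those events are not available for replay unless something else backfills them.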

Wrapping Up

The FHIR Subscription Backport IG gives you a solid foundation — the custom channel extension point is well-designed, and the notification Bundle shape is interoperable. But deploying real-time notifications in a multi-tenant healthcare platform requires solving problems the spec intentionally leaves to implementers.

If you’re building something similar, especially if you’re thinking about how LLM agents will consume health events on behalf of patients, the claim-check pattern (id-only) with per-event security filtering is the architecture I’d recommend. Keep the notification channel thin (signals, not data), enforce authorization at every boundary, and design for the reality that connections will break, tokens will expire, and credentials will be revoked.

The patients — and their AI agents — shouldn’t have to worry about any of that. They should just get the update.

Source: Built and deployed in production at b.well Connected Health. Architecture patterns described here are generalizable to any FHIR-native platform implementing real-time subscriptions.
