Part 1: The Missing Metric in Healthcare AI and Why Data Quality Scores Matter

Insights from Arun Raghavan, VP of Data Engineering

The Question Nobody’s Asking

Your organization just invested millions in AI infrastructure. Your team spent months building clinical decision support systems. Your models are state-of-the-art.

But here’s the question that should come first: What’s your data quality score? Not “is your data good” or “have you cleaned your data.” What’s the actual, measured, quantifiable score? If you don’t know the answer, you’re building on an unknown foundation. And in healthcare, that’s a significant risk.

The Measurement Problem

Most healthcare organizations don’t systematically measure data quality. When they do, they often use informal assessments:

  • “Our data team spot-checks records.”
  • “We have FHIR Profiles or data validation rules.”
  • “Our vendor says the data is clean.”

These approaches don’t answer the fundamental question: How good is good enough?

Our Approach: Rigorous, Transparent, Reproducible

We conducted a systematic analysis of healthcare data quality at population scale using the PIQI (Patient Information Quality Improvement) framework – an industry-standard assessment methodology.

Our Methodology:

  • Millions of patient records across multiple health systems
  • Mix of chronic disease patients, healthy patients, and high utilizers
  • Data from all major sources: EMR, payer, pharmacy, labs, imaging
  • Multiple data networks: Patient Access API, TEFCA, CMS

Framework:

  • PIQI (Patient Information Quality Improvement) – industry-standard
  • Extended with “Completeness” dimension for longitudinal, multi-source data
  • 13 specific dimensions across 5 categories
  • Weighted scoring (0-100 scale) with letter grades (A-F)

The Baseline: What We Found

Average raw data quality score: 36/100 (F grade – Critical). This wasn’t surprising. Healthcare data is inherently messy because of multiple sources with different standards, inconsistent coding practices, historical data from legacy systems, and disconnected networks with incomplete views.

Understanding the PIQI Framework

Our extended framework evaluates data across 5 categories: the 4 standard PIQI categories described below, plus the Completeness dimension we added.

1. Availability (20% of quality score) – Is usable information present?

Dimensions:

  • Missing: Expected elements are absent (e.g., no Condition resources)
  • Unpopulated: Attributes exist but are empty (e.g., blank patient name)
  • Incomplete: Inadequate information (e.g., code without code system)
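To make the three Availability dimensions concrete, here is a minimal Python sketch of how such checks might look on FHIR-like resources represented as plain dictionaries. The expected resource types, the required fields, and the function name are assumptions made for illustration; they are not part of the PIQI specification or of our production pipeline.

```python
from typing import Any

# Hypothetical expectations, chosen only for illustration.
EXPECTED_RESOURCE_TYPES = {"Patient", "Condition", "Observation"}
REQUIRED_FIELDS = {"Patient": ["name"], "Condition": ["code"]}


def availability_issues(resources: list[dict[str, Any]]) -> list[tuple[str, str]]:
    """Return (dimension, detail) pairs for a list of FHIR-like resources."""
    issues: list[tuple[str, str]] = []

    # Missing: an expected resource type never appears at all.
    present = {r.get("resourceType") for r in resources}
    for rtype in sorted(EXPECTED_RESOURCE_TYPES - present):
        issues.append(("missing", f"no {rtype} resources"))

    for r in resources:
        rtype = r.get("resourceType", "")
        # Unpopulated: a required attribute is absent or empty.
        for field in REQUIRED_FIELDS.get(rtype, []):
            if r.get(field) in (None, "", [], {}):
                issues.append(("unpopulated", f"{rtype}.{field} is empty"))
        # Incomplete: a coding carries a code but no code system.
        code = r.get("code")
        codings = code.get("coding", []) if isinstance(code, dict) else []
        for coding in codings:
            if coding.get("code") and not coding.get("system"):
                issues.append(
                    ("incomplete", f"{rtype} code {coding['code']} has no system")
                )
    return issues


records = [
    {"resourceType": "Patient", "name": ""},
    {"resourceType": "Observation", "code": {"coding": [{"code": "2345-7"}]}},
]
for dimension, detail in availability_issues(records):
    print(dimension, "-", detail)
```

On this toy input the sketch reports one issue per dimension: no Condition resources, a blank patient name, and a code without a code system.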

Our baseline finding: 45/100 (F grade)

  • 40% of expected resources missing
  • 35% of fields unpopulated
  • 25% of elements incomplete

Real example: A patient’s documented Crohn’s disease and lupus diagnoses appeared in the Patient Access API network but were completely absent from TEFCA data. If your system only connects to TEFCA, these critical diagnoses are invisible.

Consumer impact: AI makes recommendations without critical context. Clinical decision support misses important contraindications.

2. Accuracy (20% of quality score) – Is the data inherently valid?

Dimensions:

  • Invalid Format: Improperly formatted data (e.g., date “2024-13-45”)
  • Invalid Value: Values outside expected ranges (e.g., heart rate 500 bpm)
  • Invalid Grouping: Incompatible attribute combinations
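The Accuracy dimensions lend themselves to simple rule checks. The sketch below is one possible shape for them in Python, using the two examples from the list. The heart-rate bounds and the unit table are hypothetical values picked for the illustration, not clinical guidance or PIQI-defined thresholds.

```python
from datetime import date

# Hypothetical bounds and unit table, for illustration only.
HEART_RATE_RANGE_BPM = (20, 300)
ALLOWED_UNITS = {"heart_rate": {"bpm", "/min"}, "body_weight": {"kg", "lb"}}


def has_valid_date_format(value: str) -> bool:
    """Invalid Format: True only if value parses as an ISO 8601 calendar date."""
    try:
        date.fromisoformat(value)
    except ValueError:
        return False
    return True


def has_valid_heart_rate(bpm: float) -> bool:
    """Invalid Value: True if bpm falls inside the assumed range."""
    low, high = HEART_RATE_RANGE_BPM
    return low <= bpm <= high


def has_valid_grouping(measurement: str, unit: str) -> bool:
    """Invalid Grouping: True if the unit is allowed for the measurement.

    Measurements missing from the table are treated as invalid here.
    """
    return unit in ALLOWED_UNITS.get(measurement, set())


print(has_valid_date_format("2024-13-45"))        # False: month 13 does not exist
print(has_valid_heart_rate(500))                  # False: outside the assumed range
print(has_valid_grouping("heart_rate", "kg"))     # False: unit does not fit
```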

Our baseline finding: 52/100 (F grade)

Consumer impact: Invalid data causes AI to process nonsense as if it were real, leading to unreliable outputs.

3. Conformity (15% of quality score) – Does coded information conform to standards?

Dimensions:

  • Invalid Member: Code doesn’t exist in the specified system
  • Incompatible: Wrong code system used (e.g., ICD-10 where SNOMED expected)
  • Obsolete: Deprecated/inactive codes

Our baseline finding: 38/100 (F grade)

  • 40% of codes don’t exist in the specified standard terminologies
  • 35% wrong code systems used (custom Epic codes instead of LOINC)
  • 25% deprecated/obsolete codes still in use

Real example: Same lab test appears as:

  • Custom Epic code “LAB_GLUCOSE_2024”
  • LOINC code 2345-7 (correct)
  • No code at all (just text “glucose”)
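The three representations above map neatly onto the Conformity dimensions. Here is a minimal Python sketch of a classifier, assuming a tiny in-memory lookup table in place of a real terminology service. The local system URI, the retired placeholder code, and the function name are all hypothetical; only `http://loinc.org` and the code 2345-7 come from real-world usage.

```python
from typing import Optional

LOINC = "http://loinc.org"
LOCAL_SYSTEM = "urn:example:local-lab-codes"  # hypothetical URI for custom codes

# Stand-in for a terminology service. The second entry is a placeholder
# used only to exercise the "obsolete" branch.
KNOWN_CODES = {
    (LOINC, "2345-7"): "active",
    (LOINC, "EXAMPLE-RETIRED"): "deprecated",
}


def classify_coding(
    system: Optional[str], code: Optional[str], expected_system: str = LOINC
) -> str:
    """Classify one coding against the Conformity dimensions."""
    if not code:
        return "uncoded"          # free text only, nothing to validate
    if system != expected_system:
        return "incompatible"     # wrong code system for this element
    status = KNOWN_CODES.get((system, code))
    if status is None:
        return "invalid member"   # code does not exist in the system
    return "conformant" if status == "active" else "obsolete"


glucose_variants = [
    (LOCAL_SYSTEM, "LAB_GLUCOSE_2024"),
    (LOINC, "2345-7"),
    (None, None),
]
for system, code in glucose_variants:
    print(code, "->", classify_coding(system, code))
```

Only the second variant comes back as conformant, which is why the same glucose test cannot be matched across sources without a mapping step.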

Consumer impact: Impossible to deduplicate across sources. Analytics can’t aggregate properly. Interoperability breaks down.

4. Plausibility (15% of quality score) – Does the data make sense?

Dimensions:

  • Temporally Implausible: Timeline doesn’t make sense
  • Clinically Implausible: Values outside reasonable ranges
  • Situationally Implausible: Conflicting information
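Plausibility checks compare a record against its own context. The Python sketch below shows two temporal rules and one situational rule. The rules and the function name are assumptions made for illustration; a real rule set would need clinical review, since some events (for example, results finalized after death) can legitimately postdate a patient's death.

```python
from datetime import date
from typing import Optional


def plausibility_issues(
    birth_date: date,
    event_date: date,
    today: date,
    deceased_date: Optional[date] = None,
) -> list[str]:
    """Return plausibility flags for a single dated clinical event."""
    issues: list[str] = []
    # Temporally implausible: the timeline cannot be right.
    if event_date < birth_date:
        issues.append("temporal: event precedes birth")
    if event_date > today:
        issues.append("temporal: event is dated in the future")
    # Situationally implausible: conflicts with other recorded facts.
    if deceased_date is not None and event_date > deceased_date:
        issues.append("situational: event follows recorded death")
    return issues


print(plausibility_issues(date(1980, 5, 1), date(1975, 1, 1), date(2025, 1, 1)))
```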

Our baseline finding: 48/100 (F grade)

Consumer impact: Implausible data indicates systemic issues that AI will amplify rather than correct.
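For readers who want to see how weighted scoring works, the Python sketch below combines the four category scores and weights listed above. It is only an illustration of the arithmetic: the four weights sum to 70%, and this part does not state the weight or score of the Completeness extension, so the result is a partial average rather than the overall score.

```python
# (weight in percent, baseline score) for the four standard categories,
# as stated in the sections above.
CATEGORIES = {
    "availability": (20, 45),
    "accuracy": (20, 52),
    "conformity": (15, 38),
    "plausibility": (15, 48),
}


def weighted_score(categories: dict[str, tuple[float, float]]) -> float:
    """Weighted average of category scores, normalized by the total weight."""
    total_weight = sum(weight for weight, _ in categories.values())
    weighted_sum = sum(weight * score for weight, score in categories.values())
    return weighted_sum / total_weight


print(round(weighted_score(CATEGORIES), 1))  # 46.1
```

The four standard categories alone average about 46, which is higher than the overall 36/100 baseline. That gap would be consistent with the Completeness extension pulling the total down, although its exact contribution is not broken out here.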

The Cost: What Low Quality Data Actually Means

Poor data quality creates significant operational, clinical, and regulatory challenges:

Operational Costs

AI Token Costs:

  • Processing 4,462 lab results per patient vs. 24 key findings
  • ~$50 per AI query vs. ~$5 (10x difference)
  • Poor user experience from slow response times (30+ seconds vs. 3 seconds)
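One way to picture the gap between 4,462 results and 24 key findings is a reduction step that runs before any data reaches the model. The Python sketch below keeps only the most recent result per test code. This is an assumed heuristic for illustration, not a description of how the 24 key findings were actually selected, and the record layout and second code are hypothetical.

```python
from datetime import date


def latest_per_code(results: list[dict]) -> list[dict]:
    """Keep only the most recent result for each test code."""
    latest: dict[str, dict] = {}
    for result in results:
        current = latest.get(result["code"])
        if current is None or result["date"] > current["date"]:
            latest[result["code"]] = result
    return list(latest.values())


results = [
    {"code": "2345-7", "date": date(2023, 1, 10), "value": 110},
    {"code": "2345-7", "date": date(2024, 6, 2), "value": 98},
    {"code": "hypothetical-other", "date": date(2024, 3, 15), "value": 6.1},
]
reduced = latest_per_code(results)
print(len(results), "->", len(reduced))

# Size of the reduction quoted above: roughly 186 times fewer records.
print(round(4462 / 24))  # 186
```

Note that the record count shrinks by roughly 186x while the quoted cost difference is about 10x, so per-query cost clearly depends on more than the number of lab results sent.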

Data Operations:

  • Constant manual cleanup and data quality tickets
  • Integration failures from non-standard codes
  • Excessive storage costs from duplicate records

Clinical Risk

Missing Diagnosis Scenario:

  • AI recommends NSAID for abdominal pain
  • Patient has Crohn’s disease (in disconnected network)
  • NSAID causes serious complications
  • Significant liability exposure

Medication Error Scenario:

  • 21 medications marked “active” (only 3 actually are)
  • AI checks interactions across all 21
  • False positive alerts = alert fatigue
  • Missed real interaction = patient harm
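A scenario like this can be narrowed with a simple staleness rule before interaction checking. The Python sketch below treats a medication as likely active only if its status is "active" and its last fill plus days of supply, with a grace period, reaches today. The field names, the 30-day grace period, and the rule itself are hypothetical; real medication reconciliation needs clinical input.

```python
from datetime import date, timedelta


def likely_active(med: dict, today: date, grace_days: int = 30) -> bool:
    """Heuristic: is this 'active' medication plausibly still being taken?"""
    if med["status"] != "active":
        return False
    supply_ends = med["last_fill"] + timedelta(days=med["days_supply"])
    return supply_ends + timedelta(days=grace_days) >= today


meds = [
    {"name": "med-a", "status": "active", "last_fill": date(2024, 1, 1), "days_supply": 30},
    {"name": "med-b", "status": "active", "last_fill": date(2024, 5, 15), "days_supply": 30},
    {"name": "med-c", "status": "stopped", "last_fill": date(2024, 5, 20), "days_supply": 90},
]
today = date(2024, 6, 1)
current = [m["name"] for m in meds if likely_active(m, today)]
print(current)  # ['med-b']
```

Running interaction checks against the filtered list rather than every record marked "active" is one way to cut false positive alerts, at the cost of needing reliable fill data.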

Regulatory Exposure

HIPAA Violations:

  • Processing 4,462 lab results when only 24 are needed
  • Conflicts with HIPAA’s “minimum necessary” standard

Audit Failures:

  • Can’t explain AI decisions based on poor-quality data
  • Compliance failures in CMS quality reporting

This is the measurable cost of poor data quality.


In Part 2, we’ll reveal how we transformed F-grade data into B-grade data, a 122% improvement, and what that means for AI performance, costs, and patient safety.

Join us on our mission to simplify healthcare, one person at a time.