Why Most Pricing Data Gives You Half the Picture—And How We Fixed That

Price transparency regulations brought access. They didn't bring answers. Payer files, hospital disclosures, and claims data each capture a piece of the picture—but not the whole thing.

Serif Health was built to change that. We integrate all three sources into a single, pre-validated platform so the answer is waiting when you ask the question. There’s no switching between datasets, no manual reconciliation, and no guesswork about which source to trust.

The Problem With Separate Data Sources

Many price transparency companies describe themselves as multi-source. But having access to multiple datasets is not the same as unifying them. When data requires separate queries and manual reconciliation, the risk of error goes up and the time to insight gets longer.

That is a meaningful distinction, because each data source has a real benefit and a real limitation on its own.

Here’s how the datasets vary:

Payer transparency data shows what was negotiated at the contract level, but MRF files are required to list every code in a contract, including codes a provider has never actually billed. These phantom rates inflate reported medians and distort any analysis built on them.
Hospital MRF data provides billing context, including negotiated rates, chargemaster rates, and cash pay rates. However, hospitals do not report payer and plan names uniformly, so a classification system is needed before rates can be compared across facilities and plans.
Claims data shows what was actually paid in the real world. It reflects genuine utilization and allowed amounts. But on its own, it is backward-looking. It tells you what happened rather than what is currently contracted or what the market is paying today.

Because of these limitations, relying on a single source means operating on partial truth. The problem is that incomplete data rarely announces itself. It just quietly shapes the wrong decisions.

How Serif Brings All Three Together

Serif cross-validates all three data sources at the code, provider, and market level before any result reaches you, so what you see is already an integrated view of the data.

Claims data filters out phantom rates from payer files. If a provider has never billed a code, that rate is removed. This step alone can eliminate up to 90% of raw MRF records, leaving a far more reliable dataset.
Payer data anchors the fee schedule, providing negotiated rate context that claims data alone cannot give you.
Hospital data validates and fills gaps in what payers reported, particularly for complex contract structures like line-level carve-outs that payer files do not capture cleanly.
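
To make the first step above concrete, here is a minimal sketch of claims-based filtering. It is not Serif's actual pipeline: the provider IDs, rates, and the helper names drop_phantom_rates and median_rate are all hypothetical, chosen only to show how a rate with no matching claims can be dropped and how that changes a median.

```python
from statistics import median

# Hypothetical payer MRF rows: (provider_id, billing_code, negotiated_rate).
payer_rates = [
    ("prov_a", "99283", 210.0),
    ("prov_a", "27447", 18500.0),  # in the contract, never billed by prov_a
    ("prov_b", "99283", 190.0),
    ("prov_b", "27447", 17200.0),
    ("prov_c", "99283", 950.0),    # in the contract, never billed by prov_c
]

# Hypothetical (provider_id, billing_code) pairs that appear in claims data.
billed_pairs = {
    ("prov_a", "99283"),
    ("prov_b", "99283"),
    ("prov_b", "27447"),
}


def drop_phantom_rates(rates, billed):
    """Keep only rates whose (provider, code) pair shows up in claims."""
    return [row for row in rates if (row[0], row[1]) in billed]


def median_rate(rates, code):
    """Median negotiated rate for a single billing code."""
    return median(rate for _, row_code, rate in rates if row_code == code)


validated = drop_phantom_rates(payer_rates, billed_pairs)
raw_median = median_rate(payer_rates, "99283")
clean_median = median_rate(validated, "99283")

print(f"raw median: {raw_median}, validated median: {clean_median}")
```

In this toy data, two of the five rows have no claims behind them, and removing them moves the median for code 99283 from 210.0 to 200.0. Real files are far larger and messier, but the principle is the same: utilization decides which rates count.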

The result is a dataset that has been stress-tested against itself because all three sources are already reconciled. Your team does not have to do that work. You start with the question, not the data preparation.

One More Layer Most Vendors Miss

There is a persistent data quality problem that makes this integration even more critical: custom and local billing codes. Major payers, including UnitedHealthcare, Cigna, and Aetna, publish rates under proprietary code systems rather than standard CPT or DRG codes. If your query looks for CPT 99283, it may return nothing, even when the rate exists in the file under a different code.

Serif’s normalization layer maps these custom codes to standard equivalents. Data that would otherwise disappear into gaps is surfaced and made usable. This is the kind of work that happens before you see a result, not work you must do yourself. Read more about how Serif solves custom codes here.
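
As a rough illustration of what a normalization layer does, the sketch below maps a payer-specific code to its standard equivalent before searching. Everything here is an assumption made for the example: the crosswalk entry, the payer names, the custom code "ER-LVL3", and the functions normalize and find_rates are invented, and real crosswalks are much larger and harder to build.

```python
# Hypothetical crosswalk: (payer, custom_code) -> (standard_code_type, standard_code).
CUSTOM_TO_STANDARD = {
    ("payer_x", "ER-LVL3"): ("CPT", "99283"),
}

# Hypothetical published rate rows.
rows = [
    {"payer": "payer_x", "code_type": "LOCAL", "code": "ER-LVL3", "rate": 205.0},
    {"payer": "payer_y", "code_type": "CPT", "code": "99283", "rate": 198.0},
    {"payer": "payer_x", "code_type": "LOCAL", "code": "ZZ-UNKNOWN", "rate": 50.0},
]


def normalize(payer, code_type, code):
    """Return the standard (type, code) if a mapping exists, else the original pair."""
    return CUSTOM_TO_STANDARD.get((payer, code), (code_type, code))


def find_rates(rate_rows, code_type, code):
    """Return every rate whose normalized code matches the requested standard code."""
    return [
        row["rate"]
        for row in rate_rows
        if normalize(row["payer"], row["code_type"], row["code"]) == (code_type, code)
    ]


# A naive lookup only sees rows already published under the standard code.
naive = [r["rate"] for r in rows if r["code_type"] == "CPT" and r["code"] == "99283"]
normalized = find_rates(rows, "CPT", "99283")

print(f"naive: {naive}, normalized: {normalized}")
```

The naive lookup finds one rate. The normalized lookup finds two, because the rate published under the custom code is no longer hidden. Codes with no known mapping pass through unchanged rather than being guessed at.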

The Scale Behind It

Serif covers more than 230 commercial payers, over 5,000 hospitals, and 250 million covered lives in claims data. That breadth matters. But breadth without integration is just more data to sort through.

For contracting teams, benefits analysts, finance leaders, and strategy teams, the value is not the volume of data. It is that the data is already connected, already validated, and ready to answer the question in front of you.

The Real Difference Between Data and a Decision

Claims show what happened. Payer data shows what was negotiated. Hospital data provides billing context. You need all three, working together, to make accurate decisions.

The Serif difference is not just three datasets. It is three datasets that are already integrated, already validated, and already working together.

