LESSON 03
Diagnostics & Medical Devices
Clinical Validation vs. Clinical Evidence: What Payers and Hospitals Actually Require
Regulatory clearance proves a device is safe and effective enough to sell. Clinical evidence proves it is worth buying — and those are different questions with different answers.
13 min read
There is a critical distinction that most diagnostics founders discover too late: the evidence required to clear a product with FDA and the evidence required to convince a hospital to use it or a payer to reimburse it are not the same evidence. FDA clearance establishes that a device is safe and performs as labeled. It says nothing about whether the device changes clinical outcomes, reduces costs, or improves care delivery. Hospitals and payers care about those latter questions, and a cleared product without answers to them will sit in a regulatory filing cabinet while the sales team loses every deal.
Clinical validation, in the narrow regulatory sense, refers to the analytical and clinical performance studies that establish a test's accuracy relative to a reference standard. For a diagnostic test, this means measuring sensitivity — the probability that the test is positive when the condition is present — and specificity — the probability that the test is negative when the condition is absent — in a well-characterized patient population. These studies demonstrate that the test works. They do not demonstrate that using the test changes what clinicians do or improves what happens to patients.
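The two performance measures defined above can be made concrete with a small sketch. All counts here are hypothetical, invented purely for illustration — they do not come from any real validation study:

```python
def sensitivity(tp: int, fn: int) -> float:
    """Probability the test is positive when the condition is present."""
    return tp / (tp + fn)

def specificity(tn: int, fp: int) -> float:
    """Probability the test is negative when the condition is absent."""
    return tn / (tn + fp)

# Hypothetical validation cohort: 200 patients with the condition and
# 800 without, cross-tabulated against the reference standard.
tp, fn = 188, 12   # diseased patients: test positive / test negative
tn, fp = 760, 40   # healthy patients: test negative / test positive

print(f"sensitivity: {sensitivity(tp, fn):.1%}")  # 94.0%
print(f"specificity: {specificity(tn, fp):.1%}")  # 95.0%
```

Note that both numbers depend only on how the test behaves within each true-status group — neither says anything about what clinicians do with the result, which is exactly the gap clinical utility evidence has to fill.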
Clinical utility is the concept that bridges test performance and clinical value. A test has clinical utility when its result changes clinical management and that change leads to better patient outcomes. Demonstrating clinical utility requires a different study design than analytical validation — typically a prospective interventional trial where one group of patients is managed using the test and another group is managed without it, and outcomes are compared. Clinical utility studies are expensive, take years, and are frequently not required for regulatory clearance, which is why so many cleared diagnostics lack them — and why so many cleared diagnostics struggle to achieve commercial adoption.
Hospital value analysis committees — often called VACs — are the internal review bodies that evaluate new products before a health system approves them for procurement. VACs typically include clinicians, pharmacists, supply chain officers, quality staff, and financial analysts. They evaluate clinical evidence, total cost of ownership, workflow impact, and health system strategic priorities. A product presented to a VAC with only regulatory clearance and analytical validation data will almost always be asked for more — specifically, outcome data showing that using the product leads to better or cheaper care compared to current practice.
Health technology assessment, or HTA, is the formal process used in many healthcare systems — and increasingly by large payers in the United States — to evaluate whether a new technology provides sufficient clinical and economic benefit to justify coverage or inclusion on a formulary. HTA bodies like the National Institute for Health and Care Excellence in the UK, or NICE, and commercial payer medical policy teams in the US evaluate systematic reviews of the evidence, cost-effectiveness models, and comparative effectiveness data. A diagnostics company targeting covered reimbursement in any major market must understand which HTA frameworks apply and design its evidence program to satisfy them.
Comparative effectiveness research compares a new diagnostic or device against the current standard of care rather than against a placebo or no-testing baseline. This is the type of evidence that HTA bodies and sophisticated hospital systems find most persuasive. The question is not whether your test is accurate in an idealized study population — it is whether your test is more accurate, faster, safer, or cheaper than what clinicians currently use and whether those differences translate into better outcomes at a cost that health systems can justify. Companies that design studies only to demonstrate absolute performance, rather than performance relative to alternatives, build evidence packages that win regulatory arguments but lose commercial ones.
Real-world evidence — data collected from electronic health records, claims databases, registries, and post-market surveillance programs rather than controlled trials — is playing an increasingly important role in coverage and adoption decisions. Real-world evidence can supplement clinical trial data to demonstrate that a test performs as expected in routine care, that its clinical utility persists across diverse patient populations, and that downstream cost offsets are measurable in actual health system data. For diagnostics, building a data strategy that generates real-world evidence from early commercial deployments is no longer optional — it is the mechanism by which the second and third commercial contracts are won.
Cleared is not the same as proven. Proven is not the same as adopted. Adoption requires evidence that speaks to the buyer's actual decision, not the regulator's.
TERMS
Term of focus
Clinical Sensitivity and Specificity
Clinical sensitivity measures the proportion of patients with a condition who test positive — how reliably the test detects true disease. Clinical specificity measures the proportion of patients without a condition who test negative — how reliably the test avoids false positives. Both are reported as percentages and are typically presented together because a test optimized for sensitivity often sacrifices specificity, and the acceptable tradeoff depends entirely on the clinical consequence of each error type.
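The tradeoff described above falls directly out of where the positivity cutoff is set on a continuous readout. The following sketch uses invented biomarker values (not real data) to show how moving the cutoff trades one error type for the other:

```python
# Hypothetical biomarker readings; higher values suggest disease.
diseased = [7.1, 8.4, 6.2, 9.0, 5.8, 7.7, 6.9, 8.1]
healthy  = [4.2, 5.1, 3.8, 6.1, 4.9, 5.5, 4.4, 6.0]

def performance(cutoff: float) -> tuple[float, float]:
    """Sensitivity and specificity when 'positive' means value >= cutoff."""
    sens = sum(v >= cutoff for v in diseased) / len(diseased)
    spec = sum(v < cutoff for v in healthy) / len(healthy)
    return sens, spec

for cutoff in (5.0, 6.5):
    sens, spec = performance(cutoff)
    print(f"cutoff {cutoff}: sensitivity {sens:.0%}, specificity {spec:.0%}")
# cutoff 5.0: sensitivity 100%, specificity 50%
# cutoff 6.5: sensitivity 75%, specificity 100%
```

In this toy data, a low cutoff catches every case but flags half the healthy patients; a high cutoff eliminates false positives but misses a quarter of true disease. Which end of that tradeoff is acceptable depends on the clinical cost of each error.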
Clinical Utility
Clinical utility describes whether using a diagnostic test changes clinical management in a way that improves patient outcomes or reduces costs compared to not using it. Demonstrating clinical utility requires evidence beyond analytical accuracy — typically a prospective study showing that test-guided management leads to measurably better results. Clinical utility is the primary evidence payers and hospital procurement bodies require, and it is frequently absent from the evidence packages of cleared diagnostics.
Value Analysis Committee (VAC)
A VAC is the internal multidisciplinary review body within a health system responsible for evaluating new products before approving them for procurement and clinical use. Members typically include clinicians, supply chain, quality, and finance stakeholders who collectively assess clinical evidence, cost, workflow impact, and strategic alignment. A product that cannot pass VAC review will not be adopted regardless of its regulatory status.
Health Technology Assessment (HTA)
HTA is the systematic evaluation of the clinical, economic, social, and ethical implications of a health technology to inform coverage, reimbursement, and clinical adoption decisions. HTA bodies use evidence frameworks that weigh clinical effectiveness, comparative performance, and cost-effectiveness against existing standards of care. In the United States, large commercial payers conduct informal HTA through medical policy reviews; in most other markets, formal HTA is a mandatory step before national coverage decisions.
Comparative Effectiveness Research
Comparative effectiveness research evaluates a new diagnostic or intervention against the current standard of care rather than against a no-treatment or no-testing baseline. The question is whether the new technology performs better, not merely whether it performs at all. Payers and hospital procurement bodies strongly prefer comparative effectiveness data because it directly answers the decision they face: should we switch from what we currently do?
Real-World Evidence (RWE)
RWE is clinical evidence derived from data collected outside traditional randomized controlled trials — including electronic health records, insurance claims, patient registries, and post-market surveillance programs. Payers and regulators increasingly accept RWE to supplement controlled trial data for coverage and label expansion decisions. For diagnostics, RWE generated from early commercial deployments is the mechanism for demonstrating that trial performance translates to routine clinical practice.
Reference Standard
The reference standard — sometimes called the gold standard — is the best available method for determining whether a patient truly has the condition a test is designed to detect, used as the comparator against which a new test's accuracy is measured. Choosing the right reference standard is one of the most consequential methodological decisions in a diagnostic validation study, because imperfect reference standards introduce bias that can make a test appear more or less accurate than it truly is.
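The bias from an imperfect reference standard can be sized with simple expected-value arithmetic. Every number in this sketch is assumed for illustration, and it further assumes the new test's errors are independent of the reference standard's errors:

```python
# Expected-value sketch of reference-standard bias (all numbers assumed).
n_diseased, n_healthy = 200, 800      # true disease status in the cohort
ref_sens, ref_spec = 0.90, 0.98       # the imperfect reference standard
test_sens, test_spec = 0.95, 0.95     # the new test's TRUE accuracy
# Assumption: test errors and reference errors are independent.

# Patients the reference standard labels "diseased":
ref_pos_diseased = n_diseased * ref_sens      # 180 correctly labeled
ref_pos_healthy = n_healthy * (1 - ref_spec)  # 16 healthy, mislabeled
# Expected new-test positives within that reference-positive group:
test_pos = ref_pos_diseased * test_sens + ref_pos_healthy * (1 - test_spec)

apparent_sens = test_pos / (ref_pos_diseased + ref_pos_healthy)
print(f"true sensitivity: {test_sens:.1%}")                          # 95.0%
print(f"apparent sensitivity vs imperfect reference: {apparent_sens:.1%}")  # 87.7%
```

Even though the new test is truly 95% sensitive, measuring it against a reference standard that misses 10% of disease makes it look roughly 87% sensitive — a study-design artifact, not a product flaw.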
BEFORE YOUR NEXT MEETING
— Do we have clinical utility data — not just analytical accuracy data — and if not, have we designed a study that would generate it before we enter our first major health system sales conversation?
— Which payers are we most likely to ask to cover our test, and have we reviewed their current medical policies on similar diagnostics to understand the evidence standard they apply?
— If a value analysis committee asks us to compare our test against the current standard of care rather than against no testing, do we have that data — and if not, what is our answer?
— What real-world evidence are we generating from our early commercial deployments, and does our contract structure with early customers allow us to access and publish that data?
— Have we engaged an HTA body or reviewed a coverage decision framework in any of our target markets, and do we know what type of economic model they require for a coverage recommendation?
LESSON 03 OF 04