TriNetX Data Sets

Explore real-world data from TriNetX in your own environment

TriNetX makes it easy for clinical researchers to query, license, and download real-world clinical data that is aggregated and harmonized directly from healthcare organizations on a continuous basis.

About TriNetX Data Sets

TriNetX offers researchers data that is current, HIPAA and GDPR compliant, and clinically broader than traditional licensed research data, opening up a new range of research possibilities.

  • Quickly explore available data to assess research feasibility

  • Purchase and access desired research data sets quickly via web download

  • Elevate research ambitions with more extensive clinical data

  • Increase confidence in research findings through improved data diversity and traceability





Longitudinal Clinical & Genomic Data

TriNetX has partnered with healthcare organizations spanning 30 countries and thousands of sites, and with data providers covering 99% of US health plans, to deploy a linked and continually updated global health research network. Unlike traditional health research data sources, TriNetX aggregates information directly from electronic medical record (EMR) systems on a continuous basis, resulting in data that is current and more representative of the true patient population. TriNetX partner healthcare organizations include a mix of hospital, primary care, and specialty treatment providers spanning a wide range of geographies, age groups, and income levels.

Define your cohort.

If you’re a TriNetX subscriber, you already know how to quickly explore billions of health facts and define new research cohorts through our flexible web-based cohort builder.

If you’re new to TriNetX, rest assured that no subscription is required. You and a researcher from our Clinical Sciences team will use this same tool together to build the exact cohort you need. We believe in transparency at every step.

Review the data.

We’ll carefully review the summary statistics and fill rates with you to make sure your cohort comes with all the data you need to power your analysis. Refine the size and criteria of your cohort until it’s perfect.

Request and license.

The order process is clear and quick. Our terms grant you rights to the data for one year, with the option to receive refreshed data on your cohort quarterly during that time.


Within days of finalizing your cohort, we’ll notify you that your files are ready. You can download a compressed folder containing CSV files or import the data directly into your in-house application.
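As a rough illustration of working with such a download, the sketch below reads every CSV file in a compressed folder into memory using only the Python standard library. The file name `patient.csv` and the columns `patient_id` and `sex` are made up for the demo and are not the actual delivery schema; the sketch also assumes a zip archive with UTF-8 encoded files.

```python
import csv
import io
import zipfile


def load_tables(archive):
    """Return {file name: list of row dicts} for every CSV in a zip archive.

    `archive` can be a file path or a binary file-like object.
    """
    tables = {}
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if not name.lower().endswith(".csv"):
                continue  # skip documentation or other non-table files
            with zf.open(name) as raw:
                # Assumption: files are UTF-8 encoded with a header row.
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                tables[name] = list(csv.DictReader(text))
    return tables


def make_demo_archive():
    """Build a tiny in-memory archive with hypothetical file and column names."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("patient.csv", "patient_id,sex\nP1,F\nP2,M\n")
        zf.writestr("README.txt", "not a table")
    buf.seek(0)
    return buf


tables = load_tables(make_demo_archive())
for name, rows in tables.items():
    print(name, len(rows), "rows")
```

For a real download, passing the path of the archive to `load_tables` would work the same way; very large tables are better streamed row by row than loaded into a list.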

Get right to work with analysis-ready tables on LUCID!

Skip the downloads and uploads, and take possession of your data in our trusted research environment. LUCID is a powerful notebook environment that supports your code without leaving TriNetX. Explore, analyze, and model with complete control.

Billions of Clinical Facts.
New Research Possibilities.

See how one of our power users defined his criteria, requested his dataset, and arranged his files to answer a critical question about the relationship between eGFR and heart failure.

TriNetX Licensed Data Sets

To understand the patient journey, you may need to dive deeper than aggregate counts and means. Spanning domains from diagnoses to genomic variants, our data sets attribute every observation to a pseudonymized patient ID, encounter, and date, making it straightforward to build robust longitudinal pictures of today’s patients. We deliver the data in a universal, ready-to-use format—linked CSV tables—to give you the freedom to analyze on any application you choose.
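Because every observation carries a patient ID and a date, rows from different tables can be merged into one timeline per patient. The sketch below shows one way this might look; the column names (`patient_id`, `date`, `kind`, `code`), the ISO date format, and the sample rows are assumptions for illustration, not the delivered schema.

```python
from collections import defaultdict
from datetime import date


def build_timelines(*tables):
    """Merge rows from several tables into one date-ordered list per patient."""
    timelines = defaultdict(list)
    for rows in tables:
        for row in rows:
            timelines[row["patient_id"]].append(row)
    for events in timelines.values():
        # Assumption: dates are ISO strings such as "2021-03-02".
        events.sort(key=lambda r: date.fromisoformat(r["date"]))
    return dict(timelines)


# Made-up rows standing in for two linked CSV tables.
diagnoses = [
    {"patient_id": "P1", "date": "2021-03-02", "kind": "diagnosis", "code": "I50.9"},
]
labs = [
    {"patient_id": "P1", "date": "2020-11-15", "kind": "lab", "code": "eGFR"},
    {"patient_id": "P1", "date": "2021-06-20", "kind": "lab", "code": "eGFR"},
    {"patient_id": "P2", "date": "2021-01-05", "kind": "lab", "code": "eGFR"},
]

timelines = build_timelines(diagnoses, labs)
for event in timelines["P1"]:
    print(event["date"], event["kind"], event["code"])
```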

Use Cases

  • Conduct precise time-to-event analysis
  • Track in-clinic medications and procedures
  • Compare multiple cohorts at once, along any number of characteristics
  • Reconstruct individual patient histories
  • Follow changes in lab values
  • Train predictive models using thousands of well-represented patient covariates
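To make the first use case concrete, here is a minimal sketch of deriving a time-to-event value from dated observations. It assumes you have already chosen an index date, an end of follow-up, and a list of outcome dates for each patient; the function name and the censoring rule are illustrative choices, not a prescribed method.

```python
from datetime import date


def time_to_event(index_date, event_dates, end_of_followup):
    """Return (days, observed) for one patient.

    `observed` is True when an outcome falls between the index date and the
    end of follow-up (inclusive); otherwise the patient is treated as
    censored at the end of follow-up.
    """
    in_window = [d for d in event_dates if index_date <= d <= end_of_followup]
    if in_window:
        return (min(in_window) - index_date).days, True
    return (end_of_followup - index_date).days, False


# Made-up dates for two patients.
with_event = time_to_event(date(2021, 1, 1), [date(2021, 1, 31)], date(2021, 12, 31))
censored = time_to_event(date(2021, 1, 1), [], date(2021, 12, 31))
print(with_event, censored)
```

The resulting pairs are the usual input to survival methods such as Kaplan-Meier estimation, which you can run in the analysis tool of your choice.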

Key Data Elements

  • Demographics
  • Diagnoses
  • Procedures
  • Medications
  • Labs
  • Encounters

Get an Inside Look at Curated Data Sets

Download the Data Set Snapshot for high-level information on the clinical facts and data coverage within four top therapeutic areas.

Curated data sets are designed to be immediately usable. They’re built on normalized, standards-based terminology and are available in an OMOP-inspired format, the native TriNetX format, or the Sentinel Common Data Model, ensuring compatibility and ease of use. These research-ready data sets include calculated, derived variables such as BMI, along with new therapeutic-specific tables and facts purpose-built for your analysis.
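For readers unfamiliar with derived variables, BMI is a simple example: weight in kilograms divided by the square of height in meters. The sketch below applies that standard formula; the input units (kilograms and centimeters) and the rounding to one decimal place are assumptions, and how the curated data sets actually compute the value is not described here.

```python
def bmi(weight_kg, height_cm):
    """Body mass index: weight (kg) divided by height (m) squared."""
    if weight_kg <= 0 or height_cm <= 0:
        raise ValueError("weight and height must be positive")
    height_m = height_cm / 100
    # Rounding to one decimal is an illustrative choice.
    return round(weight_kg / height_m ** 2, 1)


print(bmi(70, 175))
```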


Linked Data Sets

Our linked data sets combine data sourced from EHR and insurance claims into a single, longitudinal record for each of the 11.2 million patients represented in both sources. Secure and rapid tokenization allows us to match EHR and claims records on a per-patient basis without ever accessing or exposing personally identifying information. The result is a robust record that follows a patient across time and between providers, bringing demographics, clinical observations, treatment details, and costs under one view. By further linking with federal death registries and private obituaries, we support analyses of long-term survival in addition to the full array of HEOR, efficacy, and safety analyses.
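The general idea behind token-based linkage can be sketched as follows. This is a generic illustration using a keyed hash (HMAC-SHA256), not a description of the tokenization TriNetX uses: each data source converts identifying fields into an opaque token before sharing, and records are then matched on the token alone. The field choices, the normalization, and the shared secret key are all assumptions.

```python
import hashlib
import hmac


def make_token(secret_key, first_name, last_name, birth_date):
    """Derive an opaque token from identifying fields (illustrative only)."""
    normalized = "|".join(
        part.strip().lower() for part in (first_name, last_name, birth_date)
    )
    return hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def link_records(ehr_rows, claim_rows):
    """Pair each EHR row with the claims rows that share its token."""
    claims_by_token = {}
    for row in claim_rows:
        claims_by_token.setdefault(row["token"], []).append(row)
    return {
        row["token"]: (row, claims_by_token[row["token"]])
        for row in ehr_rows
        if row["token"] in claims_by_token
    }


# Each source tokenizes locally, so only tokens leave the source.
key = b"demo-key-not-for-real-use"  # hypothetical shared secret
ehr = [{"token": make_token(key, "Ada", "Lovelace", "1815-12-10"), "dx": "I50.9"}]
claims = [
    {"token": make_token(key, " ada ", "LOVELACE", "1815-12-10"), "cost": 120.0},
    {"token": make_token(key, "Alan", "Turing", "1912-06-23"), "cost": 75.0},
]
linked = link_records(ehr, claims)
print(len(linked), "patient(s) found in both sources")
```

Production linkage schemes typically add stricter normalization, multiple token types, and key management, all of which are out of scope for this sketch.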

Use Cases

  • Incidence and prevalence
  • Long-term safety and efficacy
  • Treatment patterns
  • Drug adherence and persistence
  • Burden of Illness
  • Cost of care
  • Disease Progression
  • Overall survival

Key Data Elements

  • Demographics
  • Diagnoses
  • Procedures
  • Medications
  • Labs
  • Encounters
  • Enrollment
  • Claim headers & lines
  • Costs
  • Rx fills

Data Set Curation

Finding the data you need for your research can be a complex exercise. Large data set purchases often contain data you don’t need; identifying appropriate inclusion/exclusion criteria can be time-consuming and costly; and there may be data sources well suited to your needs that you haven’t identified. Allow TriNetX data and platform experts to take the burden of data curation off your plate. Our team will work to understand your data needs, build and iterate on a query across our global data network, and deliver a single data set suited to your needs.

Use Cases

  • Reduce the need for large data set purchases and further curation of data
  • Eliminate the need for repeated, iterative data set downloads through precision curation
  • Find symptomatic patients without an ICD-10 code
  • Identify rare disease patients