TriNetX Datasets
Explore real-world data from TriNetX in your own environment
TriNetX makes it easy for clinical researchers to query, license, and download real-world clinical data that is aggregated and harmonized directly from healthcare organizations on a continuous basis.
About TriNetX Datasets
TriNetX offers researchers data that is current, HIPAA and GDPR compliant, and clinically broader than traditional licensed research data, opening up a new range of research possibilities.
Quickly explore available data to assess research feasibility
Purchase and access the research datasets you need quickly via web download
Elevate research ambitions with more extensive clinical data
Increase confidence in research findings through improved data diversity and traceability









Longitudinal Clinical & Genomic Data
TriNetX has partnered with healthcare organizations spanning 30 countries and thousands of sites, and with data providers covering 99% of US health plans, to deploy a linked and continually updated global health research network. Unlike traditional health research data sources, TriNetX aggregates information directly from electronic medical record (EMR) systems on a continuous basis, resulting in data that is current and more representative of the true patient population. TriNetX partner healthcare organizations include a mix of hospital, primary care, and specialty treatment providers spanning a wide range of geographies, age groups, and income levels.
Define your cohort.
If you’re a TriNetX subscriber, you already know how to quickly explore billions of health facts and define new research cohorts through our flexible web-based cohort builder.
If you’re new to TriNetX, rest assured that no subscription is required. You and a researcher from our Clinical Sciences team will use this same tool together to build the exact cohort you need. We believe in transparency at every step.
Review the data.
We’ll carefully review the summary statistics and fill rates with you to make sure your cohort comes with all the data you need to power your analysis. Refine the size and criteria of your cohort until it’s perfect.
Request and license.
The order process is clear and quick. Our terms grant you rights to the data for one year, with the option to receive refreshed data on your cohort quarterly during that time.
Within days of finalizing your cohort, we’ll notify you that your files are ready. You can download a compressed folder containing CSV files or import the data directly into your in-house application.
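As a rough sketch of the import step, the snippet below reads every CSV file in a compressed folder into memory using only the Python standard library. The file names (`patient.csv`, `diagnosis.csv`) and column names are hypothetical stand-ins, not the actual TriNetX layout, and the demo archive is built in memory so the example runs without a real download.

```python
import csv
import io
import zipfile


def read_tables(archive):
    """Return {table_name: list of row dicts} for every CSV file in a ZIP archive."""
    tables = {}
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if not name.lower().endswith(".csv"):
                continue  # skip anything that is not a CSV file
            with zf.open(name) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                # Use the file name without folder or extension as the table name.
                table_name = name.rsplit("/", 1)[-1][:-4]
                tables[table_name] = list(csv.DictReader(text))
    return tables


def build_demo_archive():
    """Build a tiny in-memory ZIP (hypothetical files) standing in for a download."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        zf.writestr("patient.csv", "patient_id,sex\nP1,F\nP2,M\n")
        zf.writestr("diagnosis.csv", "patient_id,code,date\nP1,I50.9,2021-03-04\n")
    buffer.seek(0)
    return buffer


tables = read_tables(build_demo_archive())
for table_name, rows in tables.items():
    print(table_name, len(rows))
```

With a real download, you would pass the path of the compressed folder to `read_tables` instead of the in-memory demo archive.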
Get right to work with analysis-ready tables on LUCID!
Skip the downloads and uploads, and take possession of your data in our trusted research environment. LUCID is a powerful notebook environment that lets you run your own code without leaving TriNetX. Explore, analyze, and model with complete control.
Billions of Clinical Facts.
New Research Possibilities.
See how one of our power users defined his criteria, requested his dataset, and arranged his files to answer a critical question about the relationship between eGFR and heart failure.
TriNetX Licensed Datasets
To understand the patient journey, you may need to dive deeper than aggregate counts and means. Spanning domains from diagnoses to genomic variants, our datasets attribute every observation to a pseudonymized patient ID, encounter, and date, making it straightforward to build robust longitudinal pictures of today’s patients. We deliver the data in a universal, ready-to-use format (linked CSV tables) to give you the freedom to analyze it in any application you choose.
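To illustrate how linked tables keyed by patient ID and date can be combined into a longitudinal view, here is a minimal sketch using only the Python standard library. The table contents, column names, and codes are hypothetical examples rather than the actual TriNetX schema. It merges diagnoses and labs into one date-ordered timeline per patient, then measures the days between two events.

```python
import csv
import io
from collections import defaultdict
from datetime import date

# Hypothetical extracts; real table and column names may differ.
DIAGNOSIS_CSV = """patient_id,encounter_id,code,date
P1,E1,N18.3,2020-01-10
P1,E3,I50.9,2021-03-04
P2,E4,N18.3,2020-06-01
"""
LAB_CSV = """patient_id,encounter_id,code,value,date
P1,E2,eGFR,52,2020-07-15
P1,E3,eGFR,44,2021-03-04
P2,E4,eGFR,58,2020-06-01
"""


def load_events(text, domain):
    """Parse one CSV table into event dicts tagged with their domain."""
    events = []
    for row in csv.DictReader(io.StringIO(text)):
        events.append({
            "patient_id": row["patient_id"],
            "date": date.fromisoformat(row["date"]),
            "domain": domain,
            "code": row["code"],
            "value": row.get("value"),  # None for tables without a value column
        })
    return events


def build_timelines(events):
    """Group events by patient and sort each patient's events by date."""
    timelines = defaultdict(list)
    for event in events:
        timelines[event["patient_id"]].append(event)
    for patient_events in timelines.values():
        patient_events.sort(key=lambda e: e["date"])
    return dict(timelines)


def days_to_event(timeline, start_code, end_code):
    """Days from the first start_code to the first later end_code, else None."""
    start = next((e["date"] for e in timeline if e["code"] == start_code), None)
    if start is None:
        return None
    end = next((e["date"] for e in timeline
                if e["code"] == end_code and e["date"] >= start), None)
    return None if end is None else (end - start).days


events = load_events(DIAGNOSIS_CSV, "diagnosis") + load_events(LAB_CSV, "lab")
timelines = build_timelines(events)
days_p1 = days_to_event(timelines["P1"], "N18.3", "I50.9")
days_p2 = days_to_event(timelines["P2"], "N18.3", "I50.9")
print(days_p1, days_p2)
```

A patient with no end event (here `P2`) returns `None`, which a time-to-event analysis would typically treat as censored.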
Use Cases
- Conduct precise time-to-event analysis
- Track in-clinic medications and procedures
- Compare multiple cohorts at once, along any number of characteristics
- Reconstruct individual patient histories
- Follow changes in lab values
- Train predictive models using thousands of well-represented patient covariates

Key Data Elements
- Demographics
- Diagnoses
- Procedures
- Medications
- Labs
- Encounters
Get an Inside Look at Curated Datasets
Download the Dataset Snapshot for high-level information on the clinical facts and data coverage within four top therapeutic areas.
Curated datasets are designed to be immediately usable. They’re built on normalized, standards-based terminology and are available in an OMOP-inspired format, the native TriNetX format, or the Sentinel Common Data Model, ensuring compatibility and ease of use. These research-ready datasets include calculated, derived variables like BMI, as well as new therapeutic-specific tables and facts purpose-built for your analysis.
Linked Datasets
Our linked datasets combine data sourced from EHR and insurance claims into a single longitudinal record for each of the 11.2 million patients represented in both sources. Secure and rapid tokenization allows us to match EHR and claim records on a per-patient basis without ever accessing or exposing personally identifiable information. The result is a robust record that follows a patient across time and between providers, bringing demographics, clinical observations, treatment details, and costs under one view. By further linking with federal death registries and private obituaries, we support analyses of long-term survival in addition to the full array of HEOR, efficacy, and safety analyses.
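This page does not describe how TriNetX tokenization works internally. As a generic illustration of the idea, the sketch below uses a keyed hash (HMAC-SHA256) as the token, which is one common approach to privacy-preserving record linkage. Each source replaces identifying fields with a token before sharing, and the linking step sees only tokens. The key, field names, and sample records are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical shared secret; a real deployment would manage keys securely.
SECRET_KEY = b"demo-key-not-for-production"
IDENTIFYING_FIELDS = ("first", "last", "dob")


def make_token(first, last, dob):
    """Derive a stable token from normalized identifying fields."""
    normalized = "|".join(part.strip().lower() for part in (first, last, dob))
    return hmac.new(SECRET_KEY, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def tokenize(records):
    """Replace identifying fields with a token; keep only the other fields."""
    tokenized = []
    for record in records:
        token = make_token(record["first"], record["last"], record["dob"])
        payload = {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}
        tokenized.append({"token": token, **payload})
    return tokenized


def link(ehr_rows, claim_rows):
    """Join tokenized EHR and claims rows that share the same token."""
    claims_by_token = {row["token"]: row for row in claim_rows}
    return [
        {**row, **claims_by_token[row["token"]]}
        for row in ehr_rows
        if row["token"] in claims_by_token
    ]


# Fictional source records; tokenization happens before any data is shared.
ehr_source = [
    {"first": "Ana", "last": "Diaz", "dob": "1970-02-01", "diagnosis": "I50.9"},
    {"first": "Bo", "last": "Chen", "dob": "1985-11-30", "diagnosis": "E11.9"},
]
claims_source = [
    {"first": "ana ", "last": "DIAZ", "dob": "1970-02-01", "paid_amount": "120.50"},
    {"first": "Cy", "last": "Okoro", "dob": "1990-05-05", "paid_amount": "75.00"},
]

linked = link(tokenize(ehr_source), tokenize(claims_source))
print(len(linked), sorted(linked[0]))
```

Because the inputs are normalized before hashing, minor formatting differences between sources (case, stray whitespace) still produce the same token, while the linked record carries no name or date of birth.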
Use Cases
- Incidence and prevalence
- Long-term safety and efficacy
- Treatment patterns
- Drug adherence and persistence
- Burden of illness
- Cost of care
- Disease progression
- Overall survival
Key Data Elements
- Demographics
- Diagnoses
- Procedures
- Medications
- Labs
- Encounters
- Enrollment
- Claim headers & lines
- Costs
- Rx fills
Dataset Curation
Finding the data you need for your research can be a complex exercise. Large dataset purchases often contain data you don’t need, identifying appropriate inclusion/exclusion criteria can be time-consuming and costly, and there may be data sources well suited to your needs that you haven’t identified. Let our real-world data and TriNetX LIVE™ experts take the burden of data curation off your plate. Our team will work to understand your data needs, build and iterate on a query across our global data network, and deliver a single dataset suited to your needs.
Use Cases
- Reduce the need for large dataset purchases and further data curation
- Eliminate repeated, iterative dataset downloads through precision curation
- Find symptomatic patients without an ICD-10 code
- Identify rare disease patients

Complete the form below to download the Curated Dataset Snapshot!