MIDRC Data Commons

The MIDRC Data Commons is an AI-ready, curated medical imaging dataset. It currently provides over 135,000 public imaging studies, drawn from a total collection of more than 300,000. The collection began with chest X-rays and chest CT scans and later expanded to MRI, ultrasound, PET, and other anatomical regions across modalities. All images are stored in standard DICOM format, fully de-identified, and paired with rich clinical metadata, including patient demographics, COVID-19 status, imaging protocol tags, and harmonized descriptions based on LOINC standards. The dataset adheres to FAIR principles via the Gen3 Data Ecosystem, allowing registered users to build cohorts, query across metadata, and download images and annotations under a controlled data use agreement. It also features a sequestered (private) subset, separate from the open public dataset, that is reserved specifically for AI validation/testing and regulatory benchmark purposes. The effort includes curation pipelines (covering de-identification, abstraction, quality assessment, and ontology mapping) as well as semi-automated annotation tools (e.g., DICOM SR/SEG, JSON) to support downstream AI development.

Additional Info

Field Value
Source N/A
Author N/A
Maintainer N/A
Version N/A
Last Updated June 22, 2025, 20:55 (UTC)
Created June 22, 2025, 20:55 (UTC)
Publications 5343
access_type Open
aliases ["MIDRC", "MIDRC Data Commons", "MIDRC Data", "MIDRC Imaging Data Commons", "Medical Imaging Data Resource Center"]
creationMethod N/A
creatorEmail N/A
creatorName N/A
dataType N/A
endDateTime 2025-06-11T12:00:00
flag_terms ["NIBIB", "National Institute of Biomedical Imaging and Bioengineering", "NIH", "National Institutes of Health", "ARPA-H", "Advanced Research Projects Agency for Health", "University of Chicago", "American College of Radiology", "Radiological Society of North America", "American Association of Physicists in Medicine"]
issueDate 2020-08-01
lastUpdateDate 2025-06-11
pocEmail midrc-support@datacommons.io
pocName Medical Imaging and Data Resource Center (MIDRC)
publisherEmail midrc-support@datacommons.io
publisherName University of Chicago
purpose To foster machine learning innovation through data sharing for rapid and flexible collection, analysis, and dissemination of imaging and associated clinical data, providing researchers with unparalleled resources.
startDateTime 2020-08-01T00:00:00
status draft
theme []
uploadType dataset
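
Several of the fields above are stored as strings that encode structured values: JSON-style lists (aliases, theme) and ISO 8601 dates (issueDate, startDateTime, endDateTime). The following is a minimal sketch of how such a record might be converted to native types using only the Python standard library. The `record` dictionary and the `parse_record` helper are hypothetical names introduced for illustration; they are not part of any MIDRC or Gen3 API, and the values are copied by hand from the listing above.

```python
import json
from datetime import date, datetime

# Hypothetical in-memory copy of selected fields from the listing above.
# Every value is kept as a string, as it appears on the catalog page.
record = {
    "aliases": '["MIDRC", "MIDRC Data Commons", "MIDRC Data", '
               '"MIDRC Imaging Data Commons", '
               '"Medical Imaging Data Resource Center"]',
    "theme": "[]",
    "issueDate": "2020-08-01",
    "lastUpdateDate": "2025-06-11",
    "startDateTime": "2020-08-01T00:00:00",
    "endDateTime": "2025-06-11T12:00:00",
}


def parse_record(raw):
    """Convert string-valued catalog fields into native Python types."""
    return {
        # List-valued fields are JSON arrays encoded as text.
        "aliases": json.loads(raw["aliases"]),
        "theme": json.loads(raw["theme"]),
        # Date-only fields become datetime.date objects.
        "issueDate": date.fromisoformat(raw["issueDate"]),
        "lastUpdateDate": date.fromisoformat(raw["lastUpdateDate"]),
        # Timestamp fields become naive datetime objects (no time zone given).
        "startDateTime": datetime.fromisoformat(raw["startDateTime"]),
        "endDateTime": datetime.fromisoformat(raw["endDateTime"]),
    }


parsed = parse_record(record)

# Whole days between the start and end of the stated coverage window.
coverage_days = (parsed["endDateTime"] - parsed["startDateTime"]).days
```

Parsing the dates up front makes it straightforward to compare the coverage window against other catalog entries. The timestamps carry no time zone, so the sketch leaves them naive rather than assuming UTC.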