NCI Imaging Data Commons (IDC) v24 is live!
This is the largest data release by file count since 2022, adding 15 new collections and approximately 5.7 terabytes (TB) of new data. With this release we cross the milestone of 1 million publicly available DICOM series, bringing IDC’s total to nearly 100 TB (99.3 TB) across 176 collections. Here is a summary of what’s new.
Digital Pathology
v24 brings a major expansion of slide microscopy (SM) content — the dominant addition by data volume (~4 TB of new pathology slides). All of the slide microscopy data available in IDC is harmonized into DICOM from vendor-specific formats.
New collections sourced from Genomics Data Commons (GDC), with the matching genomic data available from GDC:
- CGCI-BLGSP: Burkitt Lymphoma Genome Sequencing Project; 388 patients, 1,933 series, 1.6 TB (DOI)
- CGCI-HTMCP-DLBCL: Human immunodeficiency virus (HIV)+ Tumor Molecular Characterization, Diffuse Large B-Cell Lymphoma; 43 patients, 496 series, ~545 GB (DOI)
- HCMI-CMDC: Human Cancer Models Initiative; 382 patients, 810 series, ~372 GB (DOI)
- CGCI-HTMCP-CC: HIV+ Tumor, Cervical Squamous Cell Carcinoma; 211 patients, 525 series, ~148 GB (DOI)
- CGCI-HTMCP-LC: HIV+ Tumor, Lung Cancer; 27 patients, 84 series (DOI)
- CDDP-EAGLE-1: Lung adenocarcinoma; 49 patients, ~23 GB (DOI)
Digital pathology collections from other contributors:
- CATCH: Canine skin cancers (melanoma, squamous cell carcinoma (SCC), malignant peripheral nerve sheath tumor (MPNST)); 282 cases, 350 series, ~571 GB (converted from data sourced from TCIA) (DOI)
- PDXNet: Patient-Derived Xenograft Network (human + mouse models); 919 subjects, 919 series, ~126 GB (submitted by the PDXNet consortium) (DOI)
- HTAN-TNP-SARDANA: Human Tumor Atlas Network, multiplexed fluorescence SM; colon mucinous adenocarcinoma (sourced from HTAN) (DOI)
Radiology
New collections curated by The Cancer Imaging Archive (TCIA). These are ingested “as is” and are now available via the resources and tools maintained by IDC:
- EAY131 (NCI-MATCH trial): The largest new radiology addition in v24. This collection brings imaging from the National Cancer Institute (NCI) Molecular Analysis for Therapy Choice (MATCH) precision oncology trial: 2,813 patients, 30,293 series (~797 GB), spanning CT, MR, and PET across abdomen, chest, pelvis, and more (DOI). It is accompanied by a new analysis results collection, EAY131-Tumor-Annotations, with tumor annotations for 15,799 series (DOI).
- LDCT-and-Projection-data: 200 patients, 698 series, ~863 GB of low-dose CT data including raw projection data. Valuable for algorithm development at the sinogram level (DOI).
- PSMA-PET-CT-Lesions: 378 patients with paired prostate-specific membrane antigen (PSMA) PET/CT imaging and lesion segmentations (CT + PT + SEG, ~117 GB) (DOI).
- Spinal-Multiple-Myeloma-SEG: 67 patients with CT and expert segmentations of spinal multiple myeloma (~304 GB). Paired CT and SEG series are available for all patients (DOI).
- CPTAC-STAD: Stomach adenocarcinoma imaging from the Clinical Proteomic Tumor Analysis Consortium (CPTAC); 20 patients, CT + ultrasound (DOI).
A new mouse imaging collection was contributed by the University of Washington team led by Paul Kinahan and McGarry Houghton, as part of the NCI Co-Clinical Imaging Research Resources Program (CIRP) consortium activities:
- UW-CIRP-Mouse-PET-CT-NSCLC: A preclinical mouse model collection with PET/CT and radiotherapy (RT) structure sets for non-small cell lung cancer (NSCLC) research (14 subjects). This collection continues IDC’s growing support for non-human imaging (DOI).
Updated Collections
- BoneMarrowWSI-PediatricLeukemia: 1,027 cell-level expert annotations (DICOM ANN series) were updated to fix issues identified in the previous release of this collection (DOI).
Get started
- Official IDC skill for LLMs: ImagingDataCommons/idc-claude-skill on GitHub (natural language interface to NCI Imaging Data Commons), the easiest interface for interacting with IDC
- Explore in the portal: https://portal.imaging.datacommons.cancer.gov
- Programmatic access: pip install --upgrade idc-index
- Getting started notebooks: ImagingDataCommons/IDC-Tutorials on GitHub (self-guided notebook tutorials to help get started with IDC)
Full data release notes: see “Data release notes” in the IDC User Guide.
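For those going the programmatic route, here is a minimal sketch of how one might check the patient and series counts of a few of the new collections with idc-index. Several things here are assumptions rather than confirmed details of this release: that `collection_id` values in the index are the collection names lowercased with hyphens replaced by underscores, that the index exposes the `PatientID`, `SeriesInstanceUID`, and `series_size_MB` columns, and that the installed idc-index version already points at v24. The helper names (`normalize_collection_id`, `build_summary_query`, `summarize`) are hypothetical and exist only for this illustration; please check the idc-index documentation before relying on any of it.

```python
"""Sketch: summarize selected IDC collections via idc-index (assumptions noted inline)."""


def normalize_collection_id(name: str) -> str:
    # Assumption: idc-index stores collection_id in lowercase,
    # with "-" replaced by "_" (e.g. "CGCI-BLGSP" -> "cgci_blgsp").
    return name.strip().lower().replace("-", "_")


def build_summary_query(collection_names: list[str]) -> str:
    """Build a SQL query counting patients and series per collection."""
    ids = ", ".join(f"'{normalize_collection_id(n)}'" for n in collection_names)
    # Assumption: the main index table is called "index" and has the
    # columns PatientID, SeriesInstanceUID, and series_size_MB.
    return (
        "SELECT collection_id, "
        "COUNT(DISTINCT PatientID) AS patients, "
        "COUNT(DISTINCT SeriesInstanceUID) AS series, "
        "ROUND(SUM(series_size_MB) / 1000, 1) AS size_GB "
        "FROM index "
        f"WHERE collection_id IN ({ids}) "
        "GROUP BY collection_id ORDER BY collection_id"
    )


def summarize(collection_names: list[str]):
    """Run the summary query; requires `pip install --upgrade idc-index`."""
    # Imported lazily so the query builder works without idc-index installed.
    from idc_index import IDCClient

    client = IDCClient()
    print("IDC data version:", client.get_idc_version())
    # sql_query returns the query result as a pandas DataFrame.
    return client.sql_query(build_summary_query(collection_names))
```

Calling `summarize(["EAY131", "CGCI-BLGSP"])` should return one row per collection found in the index. If a collection is missing from the result, the most likely causes are an older idc-index release or a different `collection_id` spelling than assumed above.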
