IDC July 2023 release

The NCI Imaging Data Commons team is happy to announce the July 2023 data release, v15!

This release introduces 7 new collections and the pathology component for the ICDC-Glioma collection, and adds new capabilities to the IDC Portal to simplify access to the data.

Data

The first highlight of this release is the inclusion of the National Library of Medicine Visible Human Project dataset (VHD). Since its initial release in 1994, VHD has enabled a number of studies and supported the development of the Insight Toolkit (ITK), which was first released in 2002 (check out this 2003 publication by Michael Ackerman and Terry Yoo for the historical perspective). For a long time, access to VHD required obtaining a license, but NLM recently lifted that requirement. Although access to the data was opened, downloading it from NLM was somewhat suboptimal and, more importantly, the radiology data was shared in a proprietary format that predated DICOM, complicating its use.

The IDC release of the VHD dataset includes both the radiology (CT and MR) and digitized cryosection images for both the Visible Male and the Visible Female. As part of data harmonization, images were converted into the standard DICOM representation (from the original proprietary GE Signa format for the radiology images and from raw color for the cryosection images), while maintaining the acquisition metadata in standard DICOM attributes. Cryosections are available as External Camera Photography (XC) modality DICOM series. Individual files (instances) of the XC series can be loaded using commonly used tools, such as 3D Slicer or ITK. We prepared a Colab demo notebook that shows how you can visualize those images. Since the XC modality contains spatial information, you can also load the cryosection stack as a volume in 3D Slicer (a couple of issues identified in ITK need to be resolved before this will work directly; you can monitor this pull request for status: BUG: support color DICOM data series by pieper · Pull Request #7089 · Slicer/Slicer · GitHub).
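To illustrate why the spatial information matters, here is a minimal sketch of how a stack of slices can be put in order before being assembled into a volume. It assumes each XC instance carries the standard DICOM attributes ImagePositionPatient and ImageOrientationPatient (we have not shown this for every file here), and it uses plain dictionaries with made-up file names and positions in place of real headers. This is not how 3D Slicer or ITK do it internally, just the general idea.

```python
def slice_normal(orientation):
    """Cross product of the row and column direction cosines."""
    rx, ry, rz, cx, cy, cz = orientation
    return (ry * cz - rz * cy, rz * cx - rx * cz, rx * cy - ry * cx)


def sort_slices(instances):
    """Order instances by their position along the slice normal."""
    normal = slice_normal(instances[0]["ImageOrientationPatient"])

    def distance(instance):
        # Projection of the slice origin onto the normal direction.
        return sum(n * p for n, p in zip(normal, instance["ImagePositionPatient"]))

    return sorted(instances, key=distance)


# Hypothetical metadata standing in for values read from three DICOM files.
axial = [1.0, 0.0, 0.0, 0.0, 1.0, 0.0]
instances = [
    {"file": "a.dcm", "ImageOrientationPatient": axial, "ImagePositionPatient": [0.0, 0.0, 2.0]},
    {"file": "b.dcm", "ImageOrientationPatient": axial, "ImagePositionPatient": [0.0, 0.0, 0.0]},
    {"file": "c.dcm", "ImageOrientationPatient": axial, "ImagePositionPatient": [0.0, 0.0, 1.0]},
]
ordered_files = [instance["file"] for instance in sort_slices(instances)]
print(ordered_files)
```

With real data you would read the two attributes from each file header (for example with a DICOM library of your choice) instead of typing them in by hand.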

A preview of what will be possible “out of the box” with 3D Slicer and VHD (assuming you have sufficient RAM to fit it in!) can be seen from the screenshot and video prepared by @pieper!

We expanded the availability of searchable clinical data accompanying IDC collections by including the publicly available subset of clinical data for the NLST collection. As a result, over 47K of the nearly 64K cases available in IDC now have at least some clinical data in addition to images and annotations. If you are interested in exploring how to search IDC clinical data and combine it with the imaging attributes, check out our clinical data intro tutorial.

Portal

We are continuously looking for ways to simplify the experience of downloading data from IDC. In this release we introduce a feature that lets you define and download a manifest from the portal without having to log in. Once you have the s5cmd tool installed, copy the command line from the portal and run it to download the files included in the manifest!
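If you prefer to script the download, the following is a minimal sketch of wrapping the s5cmd call in Python. It assumes the manifest is a text file of s5cmd commands, that the files are read anonymously through the S3-compatible Google Cloud Storage endpoint, and that the file is called manifest.txt (a hypothetical name). The command line shown in the portal is the authoritative one, so use that if it differs.

```python
import shutil
import subprocess

# Assumption: public IDC files are fetched via the S3-compatible GCS endpoint.
GCS_ENDPOINT = "https://storage.googleapis.com"


def build_s5cmd_command(manifest_path, endpoint_url=GCS_ENDPOINT):
    """Return the argument list that runs an s5cmd manifest without credentials."""
    return [
        "s5cmd",
        "--no-sign-request",  # anonymous access, no cloud account needed
        "--endpoint-url", endpoint_url,
        "run", str(manifest_path),
    ]


def download_manifest(manifest_path):
    """Run s5cmd on the manifest; raise if s5cmd is missing or the run fails."""
    if shutil.which("s5cmd") is None:
        raise RuntimeError("s5cmd is not installed or not on PATH")
    subprocess.run(build_s5cmd_command(manifest_path), check=True)
```

Calling `download_manifest("manifest.txt")` from the directory where you want the files would then start the download.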

You must be excited to download the Visible Human Dataset after reading the earlier section. The quick demo below shows how you can do it using the updated portal! :wink:

[Image: 2023-07-17_13-38-55]

We are continuously exploring ideas to further simplify access to the data. You are welcome to contribute your thoughts in the discussion here: Simplify the ease of downloading individual items from the portal · Issue #1186 · ImagingDataCommons/IDC-WebApp · GitHub.

Tutorials

A few releases ago we introduced the nnU-Net-BPR-Annotations analysis results collection, with segmentations and slice-level annotations of anatomy. We now have a repository and a preprint describing that collection, to help others both use it and perform similar analyses.

[Image: 2023-07-17_14-11-24_bpr]

Reminders

  • If you have any questions about IDC, you can email them to support@canceridc.dev or start a new thread in IDC forum.
  • Please drop by IDC Office Hours to ask any questions about IDC: every Tuesday 16:30-17:30 (New York) and Wednesday 10:30-11:30 (New York) via Google Meet at https://meet.google.com/xyt-vody-tvb .
  • Free cloud credits are available for those who want to explore features of Google Cloud not included in the free tier (e.g., Cloud Compute Engine, Vertex AI, using Healthcare API for your data): apply here

Summary

(as always, the live dashboard for the screenshot above is available here)
