IDC March 2023 release

It’s been a while since we had a release in IDC, but the March 2023 release is now live with what we think are some exciting updates! :fireworks:

Data

This release marks an important milestone for IDC: for the first time, we are including a new analysis results collection ingested directly from a DICOM dataset shared in a general-purpose data repository (Zenodo, in this case)!

The newly added analysis results collection is nnU-Net-BPR-annotations, which contains the files shared in this Zenodo entry:

Krishnaswamy, D., Bontempi, D., Clunie, D., Aerts, H. & Fedorov, A. AI-derived annotations for the NLST and NSCLC-Radiomics computed tomography imaging collections. (2022). doi:10.5281/zenodo.7539035

This dataset contains AI-generated annotations produced by two popular algorithms (both volumetric segmentations and slice-level annotations of body regions and landmarks), which can now be searched and visualized in IDC. We are working on a report describing the process of generating this data and its content, and various resources to demonstrate how it can be used - stay tuned! For now, check out this Looker Studio dashboard that is focused on the exploration of the produced annotations.

[Screenshot: 2023-03-16_18-07-08]

(and here is the IDC viewer link to the dataset in the screen capture above if you would rather explore it yourself: https://viewer.imaging.datacommons.cancer.gov/viewer/1.3.6.1.4.1.32722.99.99.33434883007817985455420606067524076428)

Viewers

We are very happy to announce that we added the ability to visualize images stored in IDC Google Storage buckets using the open-source, zero-footprint, in-browser VolView viewer from Kitware!

For now (we plan to streamline this process in the future), you first need to create a manifest file that follows a defined format and lists the individual instances of a series in the IDC bucket. You can then either drop this manifest into the https://volview.netlify.app webapp, as shown below, or host the file somewhere and pass it to the webapp in the URL, as in this example: https://volview.netlify.app/?urls=https://gist.githubusercontent.com/fedorov/099347b752ca4eb5562ffca861debde9/raw/756f8008dd4731c5ba4da2d55169dd4aee4426a5/idc_nc_series_sample.json, which fetches the manifest from this gist. For comparison, here is the same series visualized in the IDC OHIF viewer.

[Screenshot: 2023-03-16_17-50-35]
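
If you go the "host the manifest and pass it in the URL" route, the link can be assembled with a few lines of Python. This is only a minimal sketch: the `volview_link` helper is a hypothetical name, the `?urls=` parameter is taken from the example link above, and the percent-encoding of unusual characters is an assumption about how the webapp parses its query string.

```python
from urllib.parse import quote

# Base address of the VolView webapp mentioned in this post.
VOLVIEW_APP = "https://volview.netlify.app/"


def volview_link(manifest_url: str) -> str:
    """Return a VolView link that loads the manifest hosted at manifest_url.

    ':' and '/' are left unescaped so that a plain https URL is passed
    through unchanged, as in the example link in this post. Escaping other
    characters (such as '&' or '#') is an assumption, not documented behavior.
    """
    return f"{VOLVIEW_APP}?urls={quote(manifest_url, safe=':/')}"


if __name__ == "__main__":
    gist = (
        "https://gist.githubusercontent.com/fedorov/"
        "099347b752ca4eb5562ffca861debde9/raw/"
        "756f8008dd4731c5ba4da2d55169dd4aee4426a5/idc_nc_series_sample.json"
    )
    print(volview_link(gist))
```

For the gist used above, this reproduces the example link given earlier in this post.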

To create a manifest JSON for a DICOM series in IDC defined by SeriesInstanceUID, you can use the following BigQuery SQL query! It will also give you the URL of the same series visualized in IDC OHIF Viewer.

Not sure what to do with the query below? Read on for the pointer to our “Getting started” tutorial series! :wink:

SELECT
  STRING_AGG(DISTINCT(CONCAT("https://viewer.imaging.datacommons.cancer.gov/viewer/", StudyInstanceUID, "?seriesInstanceUID=", SeriesInstanceUID))) AS idc_viewer,
  CONCAT("{\"resources\":[", STRING_AGG(CONCAT("{\"url\":\"", gcs_url, "\"}"), ","), "]}") AS volview_manifest
FROM
  `bigquery-public-data.idc_current.dicom_all`
WHERE
  # set SeriesInstanceUID for the series you want to visualize
  SeriesInstanceUID = "1.3.6.1.4.1.14519.5.2.1.1600.1202.283877825196805338641083041279"
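
If you already have the `gcs_url` values for a series (for example, from your own query against `dicom_all`), the same two strings can be built outside of BigQuery. The sketch below mirrors the string concatenation in the query above; the function names and the sample `gs://` paths and UIDs are hypothetical placeholders, not real IDC content.

```python
import json

# Viewer prefix copied from the query above.
IDC_VIEWER = "https://viewer.imaging.datacommons.cancer.gov/viewer/"


def build_volview_manifest(gcs_urls):
    """Build the manifest JSON: {"resources":[{"url":...},...]}.

    Compact separators are used so that the output has the same shape as
    the string assembled by CONCAT/STRING_AGG in the query.
    """
    manifest = {"resources": [{"url": url} for url in gcs_urls]}
    return json.dumps(manifest, separators=(",", ":"))


def idc_viewer_url(study_uid, series_uid):
    """Build the IDC viewer link for one series of a study."""
    return f"{IDC_VIEWER}{study_uid}?seriesInstanceUID={series_uid}"


if __name__ == "__main__":
    # Hypothetical instance URLs, for illustration only.
    urls = [
        "gs://example-bucket/series-folder/instance-1.dcm",
        "gs://example-bucket/series-folder/instance-2.dcm",
    ]
    print(build_volview_manifest(urls))
    print(idc_viewer_url("1.2.3", "1.2.3.4"))
```

Going through `json.dumps` rather than manual concatenation means any special characters in a URL are escaped properly.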

Why is the VolView integration important? Two main reasons:

  1. you can now volume render IDC images - something that cannot be done in the IDC radiology viewer that uses OHIF
  2. you can combine visualization of IDC images and your own data within an existing viewer that you do not need to maintain or deploy (we do have a tutorial on how you can deploy your own instance of OHIF, and we know this works great for some folks, but it does require initial effort to set up)

This integration became possible thanks to the new partnership with Stephen Aylward and Forrest Li from Kitware, and continuing support from Adler Santos and Antonio Lobato from the Google Public Datasets Program!

Tutorials

The “Getting started” tutorial series we presented at RSNA’22 is a great resource to learn about IDC basics (including BigQuery). Please check it out and let us know your thoughts, or what other topics you would like to see covered!

Download

If you have not heard about s5cmd yet, you are in for a treat! It’s a blazingly fast and highly flexible open source tool that can be used to access files via the S3 interface (which covers Google Storage buckets), and it is a lot faster than Google’s gsutil. We updated our documentation to describe how you can use s5cmd to download IDC content, now without the need to create any keys!
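
As a rough illustration of what such a download can look like, the sketch below turns a list of `gs://` URLs into a file of s5cmd `cp` commands. The `s5cmd_manifest` helper and the sample URLs are hypothetical, and rewriting `gs://` as `s3://` is our assumption about how the buckets are addressed through the S3 interface; please treat the IDC documentation as the authoritative reference.

```python
def s5cmd_manifest(gcs_urls, dest="."):
    """Return the text of an s5cmd command file, one `cp` line per file.

    Each gs:// URL is rewritten as s3:// (an assumption about how the
    bucket is addressed via the S3-compatible endpoint).
    """
    lines = []
    for url in gcs_urls:
        if not url.startswith("gs://"):
            raise ValueError(f"expected a gs:// URL, got {url!r}")
        lines.append(f"cp s3://{url[len('gs://'):]} {dest}\n")
    return "".join(lines)


if __name__ == "__main__":
    # Hypothetical instance URLs, for illustration only.
    urls = [
        "gs://example-bucket/series-folder/instance-1.dcm",
        "gs://example-bucket/series-folder/instance-2.dcm",
    ]
    with open("manifest.txt", "w") as f:
        f.write(s5cmd_manifest(urls))
```

To our understanding of the s5cmd command line, the resulting file can then be run with `s5cmd --no-sign-request --endpoint-url https://storage.googleapis.com run manifest.txt`.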

Alert

:information_source: We have quite a few exciting updates coming up soon - stay tuned for our next release!

Reminders

  • If you have any questions about IDC, you can email them to support@canceridc.dev or start a new thread in the IDC forum.
  • Please drop by IDC Office Hours to ask any questions about IDC: every Tuesday 16:30 – 17:30 (New York) and Wednesday 10:30 – 11:30 (New York) via Google Meet at https://meet.google.com/xyt-vody-tvb .
  • Free cloud credits are available for those who want to explore features of Google Cloud not included in the free tier (e.g., Cloud Compute Engine, Vertex AI, using Healthcare API for your data): apply here

Summary

(as always, the live dashboard for the screenshot above is available here)
