New in IDC v18: TotalSegmentator segmentations and radiomics features for NLST CTs

IDC data release v18 went live quietly a couple of weeks ago. We have a lot in store to announce, but this time we will do it bit by bit - there are too many exciting updates!

The first highlight of IDC v18 is the availability of AI-generated segmentations for the National Lung Screening Trial (NLST) collection! NLST is one of the largest collections we have, but until now it had very few image annotations.

Harnessing the power of the cloud (we will discuss separately how this was done!), we applied TotalSegmentator to 126,088 CT series from NLST to segment a total of 9,565,554 anatomic structures. We then used pyradiomics to extract first-order (e.g., mean and standard deviation of intensity) and shape (e.g., volume and sphericity) features for each segmented structure. Both the segmentations and the radiomics features are available in IDC for download under a CC-BY license.
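For readers new to radiomics, here is a rough idea of what "first-order" and "shape" features mean in practice. This is only a minimal sketch using the Python standard library, not the pyradiomics implementation and not the pipeline we ran: the function name `first_order_and_volume`, the flat-list representation of the image and mask, and the toy numbers are all assumptions made for illustration. Sphericity is left out because it needs a surface mesh.

```python
from statistics import fmean, pstdev


def first_order_and_volume(intensities, mask, spacing_mm):
    """Summarize the voxels selected by a binary mask.

    intensities: flat sequence of voxel values (e.g., Hounsfield units)
    mask:        flat sequence of 0/1 labels, same length as intensities
    spacing_mm:  (x, y, z) voxel spacing in millimetres
    """
    if len(intensities) != len(mask):
        raise ValueError("intensities and mask must have the same length")
    selected = [value for value, label in zip(intensities, mask) if label]
    if not selected:
        raise ValueError("mask selects no voxels")
    # Volume of one voxel, assuming a regular grid.
    voxel_volume_mm3 = spacing_mm[0] * spacing_mm[1] * spacing_mm[2]
    return {
        "mean": fmean(selected),
        # Population standard deviation (divides by N); a choice made here.
        "std": pstdev(selected),
        # Voxel-counting volume: number of labelled voxels times voxel size.
        "voxel_volume_mm3": len(selected) * voxel_volume_mm3,
    }


# Toy data (made up): six voxels, three of them inside the structure.
toy_intensities = [-1000, 40, 60, 50, -1000, 30]
toy_mask = [0, 1, 1, 1, 0, 0]
features = first_order_and_volume(toy_intensities, toy_mask, (0.7, 0.7, 2.5))
print(features)
```

In the real data, each segmented structure in each CT series gets its own set of such values.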

Manifests for these analysis results (if you just want to download the entire 14 TB of files in bulk) and a few more details about this content are shared on Zenodo here.

You can access the segmentations in IDC Portal here, or access them directly from 3D Slicer using the SlicerIDCBrowser extension. The video below demonstrates how to use IDC Portal and Slicer to explore this collection.

More details, demonstrations and usage instructions will be shared soon. We are working on several ideas on how to simplify access to the radiomics features. In the meantime, if you have any questions or wishes about this collection or accompanying documentation, please PM me or reply to this thread!

Subscribe to the forum to stay tuned for updates about this collection and other upcoming highlights about IDC v18! :wink:

Kudos to the IDC team for this tremendous accomplishment. :muscle:

If you are interested in learning how we used Google Cloud via the Broad Terra platform to perform this analysis for just a bit over $1K, check out this preprint!

Thiriveedhi, V. K., Krishnaswamy, D., Clunie, D., Pieper, S., Kikinis, R. & Fedorov, A. Cloud-based large-scale curation of medical imaging data using AI segmentation. Research Square (2024). doi:10.21203/rs.3.rs-4351526/v1

[…] Utilizing >21,000 Virtual Machines (VMs) over the course of the computation we completed analysis in under 9 hours, as compared to the estimated 522 days that would be needed on a single workstation. The total cost of utilizing the cloud for this analysis was $1,011.05. Our contributions include: 1) an evaluation of the numerous tradeoffs towards optimizing the use of cloud resources for large-scale image analysis; 2) CloudSegmentator, an open source reproducible implementation of the developed workflows, which can be reused and extended; 3) practical recommendations for utilizing the cloud for large-scale medical image computing tasks. We also share the results of the analysis: the total of 9,565,554 segmentations of the anatomic structures and the accompanying radiomics features in IDC as of release v18.
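To put the numbers quoted above in perspective, here is some quick back-of-the-envelope arithmetic. It uses only figures stated in this thread (126,088 series, 9,565,554 segmentations, $1,011.05, under 9 hours, an estimated 522 days). The variable names are mine, and since the cloud run took "under 9 hours", the speedup computed here is a lower bound rather than an exact value.

```python
# Figures quoted in this thread.
n_series = 126_088
n_segmentations = 9_565_554
total_cost_usd = 1_011.05
cloud_hours_upper_bound = 9   # "under 9 hours"
workstation_days = 522        # estimated single-workstation time

# Derived quantities.
cost_per_series = total_cost_usd / n_series
cost_per_segmentation = total_cost_usd / n_segmentations
structures_per_series = n_segmentations / n_series
# Lower bound on the wall-clock speedup versus a single workstation.
min_speedup = workstation_days * 24 / cloud_hours_upper_bound

print(f"cost per CT series:       ${cost_per_series:.4f}")
print(f"cost per segmentation:    ${cost_per_segmentation:.6f}")
print(f"structures per CT series: {structures_per_series:.1f}")
print(f"speedup (at least):       {min_speedup:.0f}x")
```

In other words, the analysis cost under one cent per CT series, with roughly 76 structures segmented per series on average.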

Congratulations! That is :muscle:

2 posts were split to a new topic: Extracting radiomics features from existing segmentations

A post was split to a new topic: How to access measurements accompanying TotalSegmentator-CT-Segmentations analysis results collection?

To simplify access to this dataset, we have now added zip files with the per-structure radiomics features in CSV and Parquet format to the Zenodo record. You can access them here: https://doi.org/10.5281/zenodo.12004521.
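Once you have unzipped one of the CSV files, a few lines of standard-library Python are enough for a first look. The sketch below is only an illustration: the column names `structure` and `volume_mm3` and the values in the in-memory sample are hypothetical placeholders, not the actual schema of the files on Zenodo, so check the real header row and pass the matching column names. The helper `mean_by_group` is also just a name chosen here.

```python
import csv
import io
from collections import defaultdict
from statistics import fmean


def mean_by_group(csv_file, group_col, value_col):
    """Average value_col for each distinct value of group_col.

    csv_file is any open text file (or file-like object) with a header row.
    """
    values = defaultdict(list)
    for row in csv.DictReader(csv_file):
        values[row[group_col]].append(float(row[value_col]))
    return {group: fmean(vals) for group, vals in values.items()}


# Hypothetical stand-in for a downloaded CSV; columns and values are made up.
sample = io.StringIO(
    "structure,volume_mm3\n"
    "liver,1500000\n"
    "liver,1700000\n"
    "spleen,200000\n"
)
result = mean_by_group(sample, "structure", "volume_mm3")
print(result)
```

For a real file, replace the `io.StringIO` sample with `open(path, newline="")`. The Parquet files need a third-party reader such as pandas or pyarrow.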