Issues with NSCLC-Radiomics collection

Hi,

I am working with the NSCLC-Radiomics collection in IDC and wanted to point out a few issues with the dataset. I noticed that there are (at least) a few series with missing slices and multiple pixel spacing values.

Here are some examples:

https://viewer.imaging.datacommons.cancer.gov/viewer/1.3.6.1.4.1.32722.99.99.62174692101933124338373944934297606346
https://viewer.imaging.datacommons.cancer.gov/viewer/1.3.6.1.4.1.32722.99.99.20513433714193890964781227188470278863
https://viewer.imaging.datacommons.cancer.gov/viewer/1.3.6.1.4.1.32722.99.99.108388375584764933411258715618206379998

I can certainly write a query to filter out such cases before my analysis, but I was curious whether others have come across similar problems with this dataset, or have any insight into how to resolve these issues.

Thanks!

Deepa

Thank you for reporting this @deepa! I was not aware of this issue.

I went back to TCIA and confirmed that, for all 3 of the CT series you identified, TCIA has the same number of slices as IDC. So at least we know that those slices were not lost in the process of ingesting the data into IDC!

I submitted a ticket to TCIA asking about this issue (here, for completeness). But I am not optimistic those slices can be recovered. I think it would be good if we could figure out a query that would allow checking whether a series has missing slices, so this can be known before the user downloads the files.
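
As a starting point, here is a rough sketch of what such a check could look like once the per-slice metadata for a series is in hand. It is only an illustration, not a tested IDC query: it assumes each slice is a dict holding the standard DICOM attributes `ImagePositionPatient`, `ImageOrientationPatient` and `PixelSpacing` as lists of floats, and the function names and the `rel_tol` threshold are made up for this example. The idea is to project each slice position onto the slice normal, sort, and flag any gap between neighbours that differs from the typical (median) gap, which is what a dropped slice would produce. It also collects the distinct pixel spacing values in the series.

```python
from statistics import median


def slice_positions(slices):
    """Return sorted slice positions measured along the slice normal."""
    positions = []
    for s in slices:
        rx, ry, rz, cx, cy, cz = s["ImageOrientationPatient"]
        # Slice normal = cross product of the row and column direction cosines.
        normal = (ry * cz - rz * cy, rz * cx - rx * cz, rx * cy - ry * cx)
        x, y, z = s["ImagePositionPatient"]
        positions.append(x * normal[0] + y * normal[1] + z * normal[2])
    return sorted(positions)


def check_series(slices, rel_tol=0.05):
    """Flag irregular slice gaps and mixed pixel spacing within one series.

    rel_tol is a hypothetical threshold: a gap is reported when it differs
    from the median gap by more than this fraction of the median.
    """
    positions = slice_positions(slices)
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    # Round so that tiny floating point differences do not count as distinct.
    pixel_spacings = {tuple(round(v, 4) for v in s["PixelSpacing"]) for s in slices}

    typical = median(gaps) if gaps else None
    irregular = []
    if gaps:
        irregular = [g for g in gaps if abs(g - typical) > rel_tol * abs(typical)]

    return {
        "typical_gap": typical,
        "irregular_gaps": irregular,
        "pixel_spacings": pixel_spacings,
        "suspect": bool(irregular) or len(pixel_spacings) > 1,
    }
```

A series would be worth a closer look whenever `suspect` is true. The same logic (sort positions, compare neighbouring differences, count distinct pixel spacings) should translate to a SQL query over per-instance metadata, but I have not written or verified that query here.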