Issues with NSCLC-Radiomics collection


I am working with the NSCLC-Radiomics collection in IDC and wanted to point out a few issues with the dataset. I noticed that there are at least a few series with missing slices or with multiple pixel spacing values.

Here are some examples:

I can certainly write a query to filter out such cases before my analysis, but was curious if others had come across similar problems with this dataset, or had any insight as to how to resolve these issues.



Thank you for reporting this @deepa! I was not aware of this issue.

I went back to TCIA and confirmed that, for all three CT series you identified, TCIA has the same number of slices as IDC. At least we know that those slices were not lost in the process of ingesting data into IDC!

I submitted a ticket to TCIA asking about this issue (here, for completeness). But I am not optimistic those slices can be recovered. I think it would be good if we could figure out a query that would allow checking whether a series has missing slices, so this can be known before the user downloads the files.
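
As a starting point for such a check, here is a minimal sketch in Python. It is not an IDC query. It assumes you have already pulled, for each slice in a series, the position along the slice normal (for an axial series, the z component of the DICOM ImagePositionPatient attribute) and the PixelSpacing values. The function name `check_series`, the input dictionary layout, and the `tol` parameter are all hypothetical, chosen only for illustration.

```python
from collections import Counter


def check_series(slices, tol=0.01):
    """Flag likely missing slices and inconsistent pixel spacing in one series.

    `slices` is a hypothetical input format: a list of dicts, each with
      'z'             -> float, slice position along the slice normal (mm)
      'pixel_spacing' -> pair of floats (row spacing, column spacing) in mm
    """
    # Sort positions and compute the distance between neighbouring slices.
    zs = sorted(s["z"] for s in slices)
    gaps = [round(b - a, 3) for a, b in zip(zs, zs[1:])]

    report = {
        "missing_slices": 0,
        # More than one distinct value means the series mixes pixel spacings.
        "distinct_pixel_spacings": len({tuple(s["pixel_spacing"]) for s in slices}),
    }

    if gaps:
        # Assumption: the most common gap is the nominal slice spacing.
        nominal = Counter(gaps).most_common(1)[0][0]
        for g in gaps:
            # A gap clearly larger than nominal suggests skipped slices;
            # estimate how many by rounding to a multiple of the nominal gap.
            if nominal > 0 and g > nominal * (1 + tol):
                report["missing_slices"] += round(g / nominal) - 1

    return report


# Example: 3 mm spacing with one slice absent between z=6 and z=12,
# and one slice carrying a different pixel spacing.
example = [
    {"z": 0.0, "pixel_spacing": (0.98, 0.98)},
    {"z": 3.0, "pixel_spacing": (0.98, 0.98)},
    {"z": 6.0, "pixel_spacing": (0.98, 0.98)},
    {"z": 12.0, "pixel_spacing": (0.97, 0.97)},
    {"z": 15.0, "pixel_spacing": (0.98, 0.98)},
]
result = check_series(example)
```

The same idea (compare each gap between sorted slice positions against the most common gap) could probably be expressed as a query over the series metadata, but the exact table and column names would need to be checked against the IDC schema. Note that this sketch only looks at one coordinate, so it would not work as written for oblique acquisitions.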