We’re working on a tutorial notebook that will focus on a deep learning task using the TCGA-GBM (and probably LGG) cohorts, and we’re planning on having it ready soon.
In the meantime, I’ve created a temporary one that can be found here.
The rationale behind this notebook is to select a subset of studies from a cohort using the study date as the criterion. Specifically, I wanted to select the first study for each subject, since I’m interested only in the pre-surgery ones.
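To make the idea concrete, here is a minimal sketch of the selection step in plain Python. It is not the notebook's actual code: the records are made-up toy data, the helper name `earliest_study_per_subject` is hypothetical, and I'm assuming each study is described by its `PatientID`, `StudyInstanceUID` and `StudyDate` (named after the corresponding DICOM attributes).

```python
from datetime import date

# Hypothetical toy records, one per study (not real TCGA-GBM data).
studies = [
    {"PatientID": "subject-A", "StudyInstanceUID": "1.2.3.1", "StudyDate": date(1997, 5, 12)},
    {"PatientID": "subject-A", "StudyInstanceUID": "1.2.3.2", "StudyDate": date(1997, 1, 3)},
    {"PatientID": "subject-B", "StudyInstanceUID": "1.2.3.3", "StudyDate": date(2001, 8, 20)},
    {"PatientID": "subject-B", "StudyInstanceUID": "1.2.3.4", "StudyDate": date(2001, 9, 1)},
]


def earliest_study_per_subject(records):
    """Return a dict mapping each PatientID to its earliest study record."""
    earliest = {}
    for rec in records:
        pid = rec["PatientID"]
        # Keep the record with the smallest StudyDate seen so far for this subject.
        if pid not in earliest or rec["StudyDate"] < earliest[pid]["StudyDate"]:
            earliest[pid] = rec
    return earliest


first_studies = earliest_study_per_subject(studies)
for pid, rec in sorted(first_studies.items()):
    print(pid, rec["StudyInstanceUID"], rec["StudyDate"].isoformat())
```

If a subject has two studies on the same date, this sketch keeps whichever one it encounters first, so a tie-breaking rule would be needed for real data.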
Some questions and comments:
- Is there a better way to do this, for example with BigQuery? I’m a SQL novice; I looked around but couldn’t find a way to select only the earliest study for each subject using BigQuery. If not, should we keep this notebook as a quick reference for selecting studies by study date? (I guess it could be extended to other criteria.)
- Is there a way to sort the studies in the IDC portal by study date? If not, I think it could be a useful feature for the portal.
- After selecting the example studies in the notebook, I found some extra DICOM files that are visible neither in the IDC portal list nor in the OHIF viewer (see attachment). I tried opening those files with ITK-SNAP but got an error message (see attachment). What are these files?