API to get patients of a collection

Hello, ICDC developer here!

I’m trying to discover ICDC-related data in IDC and display a link from ICDC to IDC. We know that one collection, “icdc_glioma”, is currently related to the ICDC study “GLIOMA01”. I can find this collection by calling the IDC API at https://api.imaging.datacommons.cancer.gov/v1/collections and searching for “icdc” in the collection_id field. Now I just need to map icdc_glioma in IDC to GLIOMA01 in ICDC. I can see that the “Case IDs” in the IDC portal are the same as the ICDC case IDs, so maybe I can use case IDs to map a collection in IDC to a study in ICDC. The problem is that I couldn’t find an API to retrieve all case IDs for a collection. Please help!


@mingying you can get all of the PatientIDs for a given collection by running the following query either in the BigQuery console, or in Google Colab using %%bigquery magic:

SELECT
  DISTINCT(PatientID)
FROM
  `bigquery-public-data.idc_current.dicom_all`
WHERE
  collection_id = "icdc_glioma"
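
If you would rather run this from a Python script than from the console, a minimal sketch could look like the one below. It assumes the `google-cloud-bigquery` package is installed and that default Google Cloud credentials and a project are already configured. The function names are hypothetical, made up for this example, and the collection is passed as a named query parameter instead of being pasted into the SQL string.

```python
DICOM_ALL = "bigquery-public-data.idc_current.dicom_all"


def build_patient_query(table: str = DICOM_ALL) -> str:
    """Return the PatientID query, with the collection as a named parameter."""
    return (
        "SELECT DISTINCT PatientID\n"
        f"FROM `{table}`\n"
        "WHERE collection_id = @collection_id\n"
        "ORDER BY PatientID"
    )


def fetch_patient_ids(collection_id: str) -> list[str]:
    """Run the query and return the PatientIDs as a list of strings."""
    # Imported here so the module still loads where the package is missing.
    from google.cloud import bigquery

    client = bigquery.Client()  # assumption: default credentials and project
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("collection_id", "STRING", collection_id)
        ]
    )
    rows = client.query(build_patient_query(), job_config=job_config).result()
    return [row["PatientID"] for row in rows]
```

Calling `fetch_patient_ids("icdc_glioma")` should then give the same case IDs as the console query, ready to compare against the case IDs on the ICDC side.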

Once you have the list of cases, how would you like to link individual studies?

One option is you could link specific studies to open them in the viewer, such as this: https://viewer.imaging.datacommons.cancer.gov/viewer/1.3.6.1.4.1.14519.5.2.1.138967438947378494828155132414366986280. If this is what you wanted, I can help you generate those links - it’s just another BQ query.

First we can check how many studies are available for each case. In the DICOM data model, each study (imaging session) is uniquely identified by its StudyInstanceUID. Apparently, only one case had two studies:

SELECT
  PatientID,
  COUNT(DISTINCT(StudyInstanceUID)) AS num_studies
FROM
  `bigquery-public-data.idc_current.dicom_all`
WHERE
  collection_id = "icdc_glioma"
GROUP BY
  PatientID
ORDER BY
  num_studies DESC

And this is how you can get URLs for the individual studies:

-- dicom_all has one row per DICOM instance, so DISTINCT keeps one row per study
SELECT
  DISTINCT PatientID,
  StudyInstanceUID,
  CONCAT("https://viewer.imaging.datacommons.cancer.gov/viewer/", StudyInstanceUID) AS study_url
FROM
  `bigquery-public-data.idc_current.dicom_all`
WHERE
  collection_id = "icdc_glioma"
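
If you already have (PatientID, StudyInstanceUID) pairs on the client side, the same links can be assembled without another query. This is only a sketch: the helper names are hypothetical, and the IDs in the usage example are made-up placeholders, not real cases from the collection.

```python
from collections import defaultdict

VIEWER_BASE = "https://viewer.imaging.datacommons.cancer.gov/viewer/"


def study_url(study_instance_uid: str) -> str:
    """Build the viewer link for one study, same pattern as the CONCAT above."""
    return VIEWER_BASE + study_instance_uid


def urls_by_patient(rows):
    """Group viewer links by PatientID.

    rows is any iterable of (PatientID, StudyInstanceUID) pairs. Repeated
    pairs are tolerated and collapsed, so per-instance rows work too.
    """
    grouped = defaultdict(list)
    for patient_id, study_uid in rows:
        url = study_url(study_uid)
        if url not in grouped[patient_id]:
            grouped[patient_id].append(url)
    return dict(grouped)


# Placeholder IDs for illustration only.
example = urls_by_patient([("case-1", "1.2.3"), ("case-1", "1.2.4")])
```

The resulting dictionary maps each case ID to its list of study links, which is probably the shape you want when rendering a link per ICDC case.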

Thanks @fedorov! Is patient information also available in the REST API? I assume I need an IDC account to use BigQuery; how do I apply for one? Thanks!

No, you actually don’t! All you need is a Google ID with Google Cloud activated for that account. The BQ tables I am referring to are public. This notebook has prerequisites/samples/instructions.

Please give this a try and let me know if you run into any issues.

Thanks @fedorov! I’ll try it out and reach out again if I need more help.


You can also pass the following query to the /cohorts/query/preview API endpoint:
{
  "cohort_def": {
    "name": "mycohort",
    "description": "Example description",
    "filters": {
      "collection_id": ["ICDC-Glioma"]
    }
  },
  "queryFields": {
    "fields": [
      "PatientID"
    ]
  }
}
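
As a rough sketch of sending that body from Python with only the standard library: the code below combines the v1 base URL from the first post with the endpoint path above. The POST method, the `Content-Type` header, and the function names are assumptions for this example, and the shape of the response is not described in this thread, so it is returned as parsed JSON without further handling.

```python
import json
import urllib.request

API_BASE = "https://api.imaging.datacommons.cancer.gov/v1"


def build_preview_request(collection_ids, fields=("PatientID",)):
    """Build (but do not send) the request for /cohorts/query/preview."""
    body = {
        "cohort_def": {
            "name": "mycohort",
            "description": "Example description",
            "filters": {"collection_id": list(collection_ids)},
        },
        "queryFields": {"fields": list(fields)},
    }
    return urllib.request.Request(
        API_BASE + "/cohorts/query/preview",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",  # assumption: the thread does not state the HTTP method
    )


def send(request):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Something like `send(build_preview_request(["ICDC-Glioma"]))` would then perform the call. Building the request separately makes it easy to inspect the payload before anything goes over the network.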


Thanks @fedorov! Yes I can retrieve the data we need!
Thanks so much @bill.clifford! That works too!
