Creating cohort in portal from a list of PatientIDs

Graglia_Luca_BSD · September 17, 2025, 8:55pm

Hi,

I have a list of PatientID and I would like to load them in the portal (https://portal.imaging.datacommons.cancer.gov/explore/ ) to create a cohort. How can I do that?

Best,

Luca

Luca Graglia

Director of Software and Infrastructure Services

Data for the Common Good

Biological Sciences Division

University of Chicago

lgraglia@bsd.uchicago.edu

fedorov · September 18, 2025, 2:57pm

Luca, thank you for reaching out with this question.

At this time, it is not possible to use a list of PatientIDs to build a cohort in IDC Portal.

Can you tell us more about what you would like to do with the cohort? What is your ultimate goal?

If you want to be able to download the images for the specific list of patients, it is possible with a little bit of coding, and I am happy to help with that.

Graglia_Luca_BSD · September 18, 2025, 7:49pm

Hi Andrey,

Thank you for your response.
My ultimate goal is to be able to see how much data there is in the IDC given a set of PatientIDs, and download those images / request to have access to them if there is any governance layer before the download can happen.
In an ideal scenario I would also like some metadata about those images.

Thank you for your offer, I would love some guidance on downloading the images for a specific list of patients.

Best,

Luca

fedorov · September 18, 2025, 8:18pm

Luca, thank you for the clarification! I will follow up.

Also, note that I moved this discussion into a public section of the IDC forum, since I believe your question is of general interest, and does not contain any sensitive information. I see you are communicating via email, and I wanted to make sure you recognize this.

You can join the forum and continue the conversation there, in the topic "Creating cohort in portal from a list of PatientIDs", or continue communicating by email!

fedorov · September 24, 2025, 2:18pm

Sorry for the delay in replying - travels/deadlines!

Here’s the basic recipe - happy to expand based on your feedback/questions:

Install the prerequisite idc-index Python package. This will give you an interface to navigate the basic metadata accompanying IDC content:
```
$ pip install --upgrade idc-index
```

Instantiate IDCClient that provides API/metadata tables:

```
from idc_index import IDCClient
client = IDCClient()
```

IDCClient provides access to a pandas DataFrame, client.index (its columns are described in the idc-index documentation), corresponding to the current release of IDC data, which you can use to select the items for the given patient identifiers:
```
patient_ids = ["TCGA-3L-AA1B","PANLMU"]
selection = client.index[client.index["PatientID"].isin(patient_ids)]
```
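Since the goal is to see how much data IDC holds for a set of PatientIDs, the selection above can be summarized per patient. Here is a minimal sketch, not an official recipe: `summarize_by_patient` is a hypothetical helper, the toy table stands in for `client.index` so the example runs offline, and the column names `collection_id`, `StudyInstanceUID`, `SeriesInstanceUID` and `series_size_MB` are assumptions about the index schema that you should check against `client.index.columns`.

```python
import pandas as pd


def summarize_by_patient(index, patient_ids):
    """Return (per-patient summary, list of requested IDs with no match)."""
    selection = index[index["PatientID"].isin(patient_ids)]
    summary = (
        selection.groupby(["collection_id", "PatientID"])
        .agg(
            studies=("StudyInstanceUID", "nunique"),
            series=("SeriesInstanceUID", "nunique"),
            size_MB=("series_size_MB", "sum"),  # assumed column name
        )
        .reset_index()
    )
    found = set(selection["PatientID"])
    missing = [pid for pid in patient_ids if pid not in found]
    return summary, missing


# Made-up rows standing in for client.index (hypothetical values).
toy_index = pd.DataFrame(
    {
        "collection_id": ["c1", "c1", "c2"],
        "PatientID": ["P1", "P1", "P2"],
        "StudyInstanceUID": ["s1", "s1", "s2"],
        "SeriesInstanceUID": ["a", "b", "c"],
        "series_size_MB": [10.0, 5.0, 2.5],
    }
)

summary, missing = summarize_by_patient(toy_index, ["P1", "P2", "P9"])
print(summary)
print("Not found in index:", missing)
```

With the real client you would pass `client.index` and your own list instead of the toy table. The `missing` list is useful for spotting identifiers that are not in the current IDC release.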
If you just want to download everything for the specific patients, you can:
```
client.download_from_selection(patientId = patient_ids, downloadDir=".")
```
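If your PatientIDs live in a text file rather than in a hard-coded list, something like the following could build `patient_ids` for you. This is only a sketch: `load_patient_ids` is a hypothetical helper, and the one-ID-per-line format with `#` comments is an assumption about how your list is stored.

```python
import tempfile
from pathlib import Path


def load_patient_ids(path):
    """Read one PatientID per line, skipping blanks and '#' comments.

    Duplicates are dropped while keeping the original order.
    """
    ids = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        pid = line.strip()
        if pid and not pid.startswith("#"):
            ids.append(pid)
    return list(dict.fromkeys(ids))


# Demo with a throwaway file so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    ids_file = Path(tmp) / "patient_ids.txt"
    ids_file.write_text(
        "# my cohort\nTCGA-3L-AA1B\n\nPANLMU\nTCGA-3L-AA1B \n",
        encoding="utf-8",
    )
    patient_ids = load_patient_ids(ids_file)

print(patient_ids)  # ['TCGA-3L-AA1B', 'PANLMU']
```

The resulting list can be passed as `patientId` to `client.download_from_selection` exactly as in the call above.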

The downloaded content by default will be organized in a hierarchy collection/patient/study/series:

```
$ tree -d
.
├── ccdi_mci
│   └── PANLMU
│       └── 2.25.60737598245052570577932078803929433012
│           └── SM_1.3.6.1.4.1.5962.99.1.856942911.401081431.1727433828671.4.0
└── tcga_coad
    └── TCGA-3L-AA1B
        └── 2.25.173524747743997252212346304558885903341
            ├── SM_1.3.6.1.4.1.5962.99.1.3192501271.1499461926.1639575073815.2.0
            ├── SM_1.3.6.1.4.1.5962.99.1.3215887122.1825455320.1639598459666.2.0
            └── SM_1.3.6.1.4.1.5962.99.1.3233347454.2096386808.1639615919998.2.0
```
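To double-check what was downloaded, you could walk that collection/patient/study/series hierarchy and count series folders per patient. The sketch below assumes the default layout shown above. `count_series_per_patient` is a hypothetical helper, and the demo builds a fake directory tree with shortened, made-up folder names instead of touching a real download.

```python
import tempfile
from pathlib import Path


def count_series_per_patient(download_dir):
    """Map (collection, patient) -> number of series directories.

    Assumes the collection/patient/study/series layout, so series
    folders sit exactly four levels below download_dir.
    """
    root = Path(download_dir)
    counts = {}
    for series_dir in root.glob("*/*/*/*"):
        if series_dir.is_dir():
            collection, patient = series_dir.relative_to(root).parts[:2]
            key = (collection, patient)
            counts[key] = counts.get(key, 0) + 1
    return counts


# Demo on a fake tree (folder names are placeholders, not real UIDs).
with tempfile.TemporaryDirectory() as tmp:
    for parts in [
        ("ccdi_mci", "PANLMU", "study1", "SM_1"),
        ("tcga_coad", "TCGA-3L-AA1B", "study2", "SM_2"),
        ("tcga_coad", "TCGA-3L-AA1B", "study2", "SM_3"),
    ]:
        Path(tmp, *parts).mkdir(parents=True)
    counts = count_series_per_patient(tmp)

print(counts)
```

On a real download you would call `count_series_per_patient(".")`, or pass whatever directory you gave as `downloadDir`.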

Please let me know if this addresses your use case!

Luca_Graglia · October 14, 2025, 4:29pm

Thank you @fedorov this is great!

fedorov · October 14, 2025, 4:57pm

Great to hear that @Luca_Graglia! I will then mark the earlier reply as the solution, but please let me know if you have any further comments or questions related to this.
