Pathology Report Accessibility / matching data with KidsFirst portal

Hello,

I am working on a machine learning digital pathology project with medulloblastoma WSIs and would like to use IDC data as an external test set. I’ve already downloaded the images from the IDC SM portal; however, I was wondering whether it is possible to gain access to any associated pathology reports that might be on file. I am primarily interested in the pathologist’s ground-truth assessment of each slide/case, whether in the form of a report or image annotations, if available. Thank you!

Kindly,

Therry Malone | Research Assistant

Mark D. Krieger | Surgeon-in-Chief; Senior VP; Billy and Audrey L. Wilder Chair in Neurosurgery

Jennifer A. Cotter | Director, Neuropathology; Director, Center for Pathology Research Services

Children’s Hospital Los Angeles
4650 Sunset Blvd., Mailstop #43 | Los Angeles, CA 90027
Ph: 562.400.5141 | thmalone@chla.usc.edu
www.CHLA.org

Hello Therry,

Thank you for your question.

Unfortunately, we do not have access to the pathology reports for the images. They may be available from GDC for a subset of TCGA slides, but we have not investigated this, and we do not have them in IDC at this time. I will make a note to investigate this, but unfortunately I cannot give you any estimate of when this may be done.

These are not specific to your area of interest, but we do have expert annotations for the RMS collection (see https://portal.imaging.datacommons.cancer.gov/explore/filters/?analysis_results_id=RMS-Mutation-Prediction-Expert-Annotations) and bone marrow cell annotations in https://portal.imaging.datacommons.cancer.gov/explore/filters/?collection_id=bonemarrowwsi_pediatricleukemia

Andrey Fedorov

Hi Andrey,

Understood, thank you!

Best,

Therry

Hi Andrey,

I have a follow-up question. My understanding is that there is overlapping data between the IDC and KidsFirst portals, specifically from CCDI-MCI and CBTN, respectively. Is it possible to know which CBTN data were included in the CCDI-MCI dataset that is accessible through the IDC SM portal? I’ve looked through both repositories extensively, as well as the DICOM metadata of slides downloaded from IDC, but haven’t found any linking information.

Therry

Therry, I will reach out to our contacts at CCDI to help you with this.

Do you mind if I move this conversation to the public space of the IDC user support forum https://discourse.canceridc.dev/?

Since you reached out via email, by default your message is accessible only to the forum staff. It would be easier to coordinate the response, and it would be helpful for the rest of the users of IDC, if we could make this thread public. Please let me know.

Andrey,

I don’t mind at all. Thanks for your help!

Thank you for bringing this to our attention. We identified 68 participants that overlap between the MCI and CBTN studies. The participant_id mappings between the two datasets are included in the table below. Downloading the IDC data associated with these MCI participant IDs should provide the dataset you are looking for.

MCI participant_id CBTN participant_id
PBBHZD C4034646
PBBIMT C4318161
PBBIUN C4245345
PBBIVA C5623929
PBBIXX C4317792
PBBIZK C4317915
PBBIZT C4103649
PBBJJS C4344360
PBBKSJ C4344483
PBBMCX C2698374
PBBMHS C5491581
PBBMKI C4653705
PBBMRT C5491458
PBBMYD C4830825
PBBNFJ C5260218
PBBNWE C4745217
PBBPPA C4633533
PBBRDE C5623806
PBBTBU C4948413
PBBTFD C4948536
PBBUTU C7617267
PBBVUU C5254560
PBBWUK C5254437
PBBXMP C5254314
PBBYTY C5254191
PBBZAR C5253945
PBBZNE C5254068
PBBZNV C5253822
PBCAEF C5492319
PBCBCD C5492565
PBCBJR C5492196
PBCBKE C6970410
PBCBSA C5623437
PBCCLD C5623683
PBCDLB C5492442
PBCDVZ C5623560
PBCFES C6353688
PBCFPT C6083826
PBCGFJ C6083703
PBCHET C6083580
PBCKHN C6958725
PBCKLL C6353811
PBCLUB C6083457
PBCMJC C6349014
PBCSYM C6969918
PBCUIN C7333137
PBCUNY C6958971
PBCUWA C6970533
PBCWDZ C7302510
PBCWFT C7093779
PBCWIV C6970041
PBCXFJ C7093902
PBCXHU C7093656
PBCYMG C7302387
PBCYUR C7093533
PBDAEM C7457490
PBDAKJ C7302018
PBDALL C7302633
PBDAUZ C7302756
PBDBFH C7302264
PBDDFI C7457613
PBDEMM C7424526
PBDENP C7457367
PBDFBU C7617021
PBDFKM C7611978
PBDGFD C7617144
PBDHFB C7612101
PBDJPW C7617390
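
For anyone starting from a list of CBTN IDs rather than MCI IDs, the table above can be inverted with a few lines of Python. This is only a minimal sketch: it assumes the table has been pasted as whitespace-separated text (only the first three rows are reproduced here), and the names `parse_mapping` and `lookup` are illustrative helpers, not part of any IDC or CCDI tool.

```python
# First three rows copied from the table above; paste the full table in practice.
TABLE_TEXT = """\
MCI participant_id CBTN participant_id
PBBHZD C4034646
PBBIMT C4318161
PBBIUN C4245345
"""


def parse_mapping(text):
    """Return {CBTN participant_id: MCI participant_id} from pasted table rows."""
    cbtn_to_mci = {}
    for line in text.splitlines():
        parts = line.split()
        # Data rows have exactly two tokens; header and blank lines do not.
        if len(parts) != 2:
            continue
        mci_id, cbtn_id = parts
        cbtn_to_mci[cbtn_id] = mci_id
    return cbtn_to_mci


def lookup(cbtn_ids, cbtn_to_mci):
    """Split a list of CBTN IDs into those with a known MCI ID and those without."""
    found = {c: cbtn_to_mci[c] for c in cbtn_ids if c in cbtn_to_mci}
    missing = [c for c in cbtn_ids if c not in cbtn_to_mci]
    return found, missing


cbtn_to_mci = parse_mapping(TABLE_TEXT)
# "C0000000" is a made-up ID used only to show the "not found" case.
found, missing = lookup(["C4318161", "C0000000"], cbtn_to_mci)
```

The MCI IDs in `found` are the ones to use when selecting data in IDC; anything in `missing` simply has no entry in this particular table.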

In addition, we are currently developing a CPI bulk query and export feature for the CCDI Ecosystem. We can notify you when this feature becomes available. In the meantime, here is the current method of pulling study synonyms for the MCI (or other) cohort:

  1. Make selections in the CCDI Hub (https://ccdi.cancer.gov/explore) from the left-hand facet filters.

  2. Navigate to the Participant metadata table for the selected cases.

  3. Download the JSON output file for the selected cases.

  4. Within the JSON file, for applicable cases with alternate study synonyms, there will be a data element for “Available CPI Mapping”, consisting of an array of synonyms for associated studies or domains. CBTN is listed as the domain description “Children’s Brain Tumor Network”, and the associated ID for that entry will be the associated CBTN ID.
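
Step 4 can be scripted once the JSON file is downloaded. The sketch below is a rough illustration only: the element name “Available CPI Mapping” and the domain description come from the steps above, but the record layout and the key names `"Participant ID"`, `"domain_description"`, and `"associated_id"` are assumptions, so check them against the actual export and adjust the keyword arguments as needed. The sample record is made up in that assumed shape, reusing one ID pair from the table.

```python
import json

CPI_FIELD = "Available CPI Mapping"  # element name quoted in step 4
CBTN_DOMAIN = "Children's Brain Tumor Network"

# Illustrative sample in an ASSUMED layout; the real export may differ.
SAMPLE_JSON = """
[
  {"Participant ID": "PBBHZD",
   "Available CPI Mapping": [
     {"domain_description": "Some Other Study", "associated_id": "X1"},
     {"domain_description": "Children's Brain Tumor Network",
      "associated_id": "C4034646"}
   ]},
  {"Participant ID": "NO-MAPPING-EXAMPLE"}
]
"""


def extract_cbtn_ids(records,
                     participant_key="Participant ID",     # assumed key name
                     domain_key="domain_description",      # assumed key name
                     id_key="associated_id"):              # assumed key name
    """Return {participant ID: CBTN ID} for records that carry a CBTN synonym."""
    mapping = {}
    for rec in records:
        # Records without a CPI mapping are skipped.
        for entry in rec.get(CPI_FIELD) or []:
            # Treat a curly apostrophe the same as a straight one.
            domain = str(entry.get(domain_key, "")).replace("\u2019", "'")
            if domain == CBTN_DOMAIN:
                mapping[rec.get(participant_key)] = entry.get(id_key)
    return mapping


mci_to_cbtn = extract_cbtn_ids(json.loads(SAMPLE_JSON))
```

For a real download, replace `json.loads(SAMPLE_JSON)` with `json.load(open(path))` on the exported file, and confirm that the top level of that file is a list of participant records before relying on the result.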

Hi Bahar,

Thank you for providing this list and instructions for extracting CBTN IDs from participant metadata. I was able to replicate your mapping with your instructions.

Does this approach yield the ground truth mapping for all associated sources (i.e., are the 68 overlapping MCI/CBTN participants you identified the only participants taken from the CBTN cohort), or are there additional mappings/methods for determining which participants came from CBTN and have associated CBTN IDs?

Essentially, I have a list of CBTN Collection / Participant IDs sourced from my institution and KidsFirst, and I need to know their IDC participant ID, if it exists.

Thanks!

Hi Therry,

If you have a list of participants and would like to identify alternate identifiers used across pediatric cancer studies or repositories, the CCDI Participant Index (CPI) can help map related participant IDs across multiple CCDI-supported datasets and systems. The CPI is designed to support data integration and cross-study research by maintaining mappings between de-identified participant identifiers rather than storing clinical or genomic data itself. The CPI API allows authorized users to retrieve associated participant identifiers, validate IDs, discover related domains or studies, and access metadata and statistics about the index while maintaining strong privacy protections. Additional technical documentation is available in the CPI API Documentation.