Spatially aware graph neural networks and cross-level molecular profile prediction in colon cancer histopathology: a retrospective multi-cohort study

Published in: Lancet Digit Health

Kexin Ding*, Mu Zhou*, He Wang, Shaoting Zhang†, Dimitri N Metaxas†


Summary

Background

Digital whole-slide images are a unique way to assess the spatial context of the cancer microenvironment. Exploring these spatial characteristics will enable us to better identify cross-level molecular markers that could deepen our understanding of cancer biology and related patient outcomes.

Methods

We proposed a graph neural network approach that emphasises spatialisation of tumour tiles towards a comprehensive evaluation of predicting cross-level molecular profiles of genetic mutations, copy number alterations, and functional protein expressions from whole-slide images. We introduced a transformation strategy that converts whole-slide image scans into graph-structured data to address the spatial heterogeneity of colon cancer. We developed and assessed the performance of the model on The Cancer Genome Atlas colon adenocarcinoma (TCGA-COAD) and validated it on two external datasets (ie, The Cancer Genome Atlas rectum adenocarcinoma [TCGA-READ] and Clinical Proteomic Tumor Analysis Consortium colon adenocarcinoma [CPTAC-COAD]). We also predicted microsatellite instability and assessed the interpretability of the results.
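The abstract does not say how tiles are connected when a slide is converted into a graph. As a minimal sketch of the general idea only, the following assumes each tumour tile is identified by its (row, column) position on the tiling grid and is linked to any tile in its 8-neighbourhood. The function name `build_tile_graph` and the neighbourhood rule are illustrative assumptions, not the authors' published procedure.

```python
from itertools import product


def build_tile_graph(tile_coords):
    """Return undirected edges between tiles that touch on the grid.

    tile_coords: list of (row, col) grid positions of tumour tiles.
    The 8-neighbour rule is an assumption made for illustration.
    """
    index = {coord: i for i, coord in enumerate(tile_coords)}
    edges = set()
    for (row, col), i in index.items():
        for d_row, d_col in product((-1, 0, 1), repeat=2):
            if d_row == 0 and d_col == 0:
                continue  # skip the tile itself
            j = index.get((row + d_row, col + d_col))
            if j is not None:
                # store each undirected edge once, smaller index first
                edges.add((min(i, j), max(i, j)))
    return sorted(edges)


# Three adjacent tiles and one isolated tile elsewhere on the slide.
tiles = [(0, 0), (0, 1), (1, 1), (5, 5)]
edges = build_tile_graph(tiles)
print(edges)  # [(0, 1), (0, 2), (1, 2)]
```

In a graph neural network pipeline, each node would additionally carry a feature vector extracted from its tile, and the edge list would define which tiles exchange information.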

Findings

The model was developed on 459 colon tumour whole-slide images from TCGA-COAD, and externally validated on 165 rectum tumour whole-slide images from TCGA-READ and 161 colon tumour whole-slide images from CPTAC-COAD. For TCGA cohorts, our method accurately predicted the molecular classes of the gene mutations (areas under the curve [AUCs] from 82·54 [95% CI 77·41–87·14] to 87·08 [83·28–90·82] on TCGA-COAD, and AUCs from 70·46 [61·37–79·61] to 81·80 [72·20–89·70] on TCGA-READ), along with genes with copy number alterations (AUCs from 81·98 [73·34–89·68] to 90·55 [86·02–94·89] on TCGA-COAD, and AUCs from 62·05 [48·94–73·46] to 76·48 [64·78–86·71] on TCGA-READ), microsatellite instability (MSI) status classification (AUC 83·92 [77·41–87·59] on TCGA-COAD, and AUC 61·28 [53·28–67·93] on TCGA-READ), and protein expressions (AUCs from 85·57 [81·16–89·44] to 89·64 [86·29–93·19] on TCGA-COAD, and AUCs from 51·77 [42·53–61·83] to 59·79 [50·79–68·57] on TCGA-READ). For the CPTAC-COAD cohort, our model predicted a panel of gene mutations with AUC values from 63·74 (95% CI 52·92–75·37) to 82·90 (73·69–90·71), genes with copy number alterations with AUC values from 62·39 (51·37–73·76) to 86·08 (79·67–91·74), and MSI status prediction with AUC value of 73·15 (63·21–83·13).
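The abstract reports each AUC with a 95% CI but does not state how the intervals were obtained. One common approach, shown here purely as an assumed illustration, is a percentile bootstrap over cases. The functions `auc` and `bootstrap_auc_ci`, the toy labels and scores, and the resampling settings are all hypothetical. Values are on a 0 to 1 scale, whereas the figures above appear to be expressed as percentages.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # ties between a positive and a negative score count as half a win
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (an assumed method, see above)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # a resample needs both classes for the AUC to exist
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data: 1 = mutated, 0 = wild type; scores are model outputs.
labels = [0, 0, 1, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]
point = auc(labels, scores)
lower, upper = bootstrap_auc_ci(labels, scores)
print(f"AUC {point:.4f} (95% CI {lower:.4f}-{upper:.4f})")
```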

Interpretation

We showed that spatially connected graph models enable molecular profile predictions in colon cancer and generalise to rectum cancer. After further validation, our method could be used to infer the prognostic value of multiscale molecular biomarkers and identify targeted therapies for patients with colon cancer.

Funding

This research has been partially funded by ARO MURI 805491, NSF IIS-1793883, NSF CNS-1747778, NSF IIS 1763523, DOD-ARO ACC-W911NF, and NSF OIA-2040638 to Dimitri N Metaxas.


