Associate Professor University of Virginia, United States
Introduction: Phosphoproteomics is a powerful tool for understanding kinase activity and characterizing aberrant signaling pathways in cancer. Current phosphoproteomics pipelines require substantial amounts of material, so measurements are typically made across whole tissue sections. Because the tumor microenvironment is complex, the heterogeneity of cancer tissue can influence cellular conditions and obscure signal detection. Isolating cancer cells may dampen their active signals, so signaling events specifically associated with cancer cells can only be discerned while tissue integrity is maintained. Here, to understand to what extent non-epithelial cells affect kinase signal detection in cancer cells, we present an analysis that identifies common phosphopeptides in eight lung adenocarcinoma (LUAD) cell lines, all carrying KRAS mutations, and simulates phosphoproteomics of whole tumor tissue by computationally spiking in phosphopeptides from endothelial cells, macrophages, and T-cells at varying ratios. We hope this provides future directions for determining how tissue complexity contributes to overall signal detection and for deconvolving kinase signals in the tumor microenvironment.
Materials and Methods: Data preparation. We collected five phosphoproteomics datasets consisting of phosphopeptides from lung adenocarcinoma (LUAD) cell lines, endothelial cells, macrophages, and T-cells. We mapped the detected peptides using ProteomeScout and PhosphoSitePlus, and normalized each dataset individually. Spike-in ratio estimation. To determine spike-in ratios, we applied CIBERSORT to estimate cell type proportions in 33 lung adenocarcinoma tissue samples with KRAS mutations, published by the Clinical Proteomic Tumor Analysis Consortium (CPTAC). We then tested 8 different ratios in each spike-in simulation. For each simulation, we kept the total number of peptides constant and generated a new ‘hybridized’ phosphoproteomics dataset containing peptides from the different cell types at the specified ratio. Kinase activity comparison. We employed a published kinase activity prediction method, KSTAR, to generate binarized evidence for each hybridized dataset. To determine the binarization threshold, we computed the differential of the number of phosphosites retained with respect to the threshold.
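The fixed-total spike-in step described above can be sketched in Python as follows. This is a minimal illustration and not the authors' pipeline: the function name `hybridize`, the placeholder peptide identifiers, the total of 1000 peptides, and the choice of uniform random sampling without replacement are all assumptions made for the example. An epithelial-to-spike-in ratio r is converted to an epithelial fraction r / (1 + r).

```python
import random


def hybridize(epithelial, spike_in, ratio, total, seed=0):
    """Build a fixed-size peptide list with a given epithelial:spike-in ratio.

    Assumption: peptides are drawn uniformly at random without replacement;
    the source does not state how peptides were selected.
    """
    rng = random.Random(seed)
    # A ratio r of epithelial to spike-in corresponds to a fraction r / (1 + r).
    n_epi = round(total * ratio / (1.0 + ratio))
    n_spike = total - n_epi
    if n_epi > len(epithelial) or n_spike > len(spike_in):
        raise ValueError("peptide pool too small for the requested total")
    return rng.sample(epithelial, n_epi) + rng.sample(spike_in, n_spike)


# Hypothetical placeholder identifiers standing in for mapped phosphopeptides.
luad = [f"LUAD_pep_{i}" for i in range(5000)]
huvec = [f"HUVEC_pep_{i}" for i in range(5000)]

# Epithelial-to-endothelial ratios taken from the Results section.
ratios = [1.88, 3.22, 4.90, 6.37, 9.35, 26.26]
hybrids = {r: hybridize(luad, huvec, r, total=1000) for r in ratios}
```

Holding `total` constant means that any change in downstream kinase evidence reflects the mixing ratio and not the dataset size.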
Results, Conclusions, and Discussions: Results. LUAD cell lines share similar peptides across datasets. We compared two LUAD cell lines (H1792, H23) from two different sources. The first dataset, from Liu et al., 2021, contained around 9,000 mapped phospho-serine (pS)/phospho-threonine (pT) peptides, while the second, from Solanki et al., 2021, contained around 32,000 mapped pS/pT peptides. More than 75% of the mapped peptides in the first dataset were also present in the second, suggesting that a reliable set of peptides may exist for identifying LUAD epithelial cell lines. Phospho-signals of cancer cells were masked by endothelial spike-ins. After applying CIBERSORT to KRAS-mutant LUAD samples, we used the following six ratios of epithelial to endothelial cells: 1.88, 3.22, 4.90, 6.37, 9.35, and 26.26. With a similar number of peptides kept in each hybridized phosphoproteomics dataset, our results indicated that, in the presence of non-epithelial cells, phospho-signals of cancer cells were increasingly masked as the proportion of spike-in peptides rose. Since around 80% of HUVEC peptides were also shared with LUAD cells, the effects were not exclusively caused by the addition of HUVEC-specific peptides.
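The shared-peptide percentages reported above are directional: they give the fraction of one dataset's peptides that also appear in another. A minimal Python sketch of that calculation follows, assuming peptides are compared as exact identifier matches. The function name `shared_fraction` and the toy peptide sets are hypothetical and are not data from the study.

```python
def shared_fraction(reference, other):
    """Fraction of peptides in `reference` that are also found in `other`."""
    ref = set(reference)
    if not ref:
        raise ValueError("reference peptide set is empty")
    return len(ref & set(other)) / len(ref)


# Toy example: a smaller dataset compared against a larger one.
small = {"pepA", "pepB", "pepC", "pepD"}
large = {"pepA", "pepB", "pepC", "pepX", "pepY", "pepZ"}
frac = shared_fraction(small, large)
```

Because the denominator is the reference set, the result depends on which dataset is treated as the reference, which matters when the two datasets differ in size as much as the roughly 9,000 and 32,000 peptide sets do.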
Discussions. This analysis not only reveals the challenges of using heterogeneous tissue samples to identify cancer cell signals, but also highlights the complexity of signaling in the tumor microenvironment. Our results lay the groundwork for further research into tissue heterogeneity using phosphoproteomics.
Acknowledgements: 1. Croucher Foundation. The content is solely the responsibility of the authors and does not necessarily represent the official views of the Croucher Foundation. 2. Interdisciplinary Training in Systems & Biomolecular Data Science Statement: “Research reported in this publication was supported by the National Institute of General Medical Sciences of the National Institutes of Health under Award Number T32GM145443. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.” 3. Research reported in this publication was supported by the National Cancer Institute of the National Institutes of Health under Award Number U01CA284193. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.