68. Integrative analysis of genomic and transcriptomic data using RegTools to identify splice-altering mutations within bulk

Aly Abdelkareem

Kelsy Cotto

Abstract

Kelsy Cotto (a), Yang-Yang Feng (a), Avinash Ramu (a), Megan Richters (a), Sharon Freshour (a), Zachary Skidmore (a), Joshua McMichael (a), Jason Kunisaki (a), Yiing Lin (a), William Chapman (a), Christopher Maher (a), Vivek Arora (a), Gavin Dunn (b), Ravindra Uppaluri (c), Ramaswamy Govindan (a), Obi L. Griffith (a), Malachi Griffith (a)

(a) Washington University, St. Louis, MO, USA; (b) Massachusetts General Hospital, Boston, MA, USA; (c) Brigham and Women’s Hospital, Boston, MA, USA

The interpretation of variants in cancer often focuses on genomic alterations that have a known coding consequence. This analysis strategy excludes somatic mutations in non-coding regions of the genome, as well as exonic mutations that may have unidentified regulatory consequences. To address this issue, we created RegTools, a software suite that integrates variant calls from genomic data with evidence of expressed splice junctions from transcriptomic data to efficiently identify variants that may cause aberrant splicing in tumors. To date, we have applied RegTools to over 9,000 bulk RNA-Seq samples from TCGA and clinical cohorts from Washington University to identify somatic variants that are associated with alternative splicing patterns within these tumors. We discovered 235,778 splice-altering events across 158,200 unique variants and 131,212 unique junctions. Only 1.4 percent of these mutations had been reported by previous similar efforts, while 98.6 percent are novel findings. We also applied RegTools to single-cell RNA-Seq (scRNA) data from MC6BC, a transplantable organoid model of urothelial carcinoma with features of human basal-squamous urothelial carcinoma, and identified tumor-specific somatic variants leading to novel splice junctions at single-cell resolution. To our knowledge, this is the first analysis to identify splice-altering variants within scRNA data. Future work will include using RegTools to predict the novel proteins resulting from splice-altering variants and to assess their neoantigenicity. RegTools is freely available and open-source (www.regtools.org), and all analysis scripts are provided within the project’s GitHub repo (https://github.com/griffithlab/regtools).
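The core idea of pairing variant calls with expressed splice junctions can be illustrated with a minimal Python sketch. This is not the RegTools implementation: the `Variant` and `Junction` classes, the `candidate_pairs` function, the 100 bp window, and the 5-read threshold are all hypothetical choices made for illustration. The sketch only shows how a variant might be linked to a nearby unannotated junction that has enough read support.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Variant:
    chrom: str
    pos: int  # genomic position of the somatic variant


@dataclass(frozen=True)
class Junction:
    chrom: str
    start: int       # donor-side coordinate of the junction
    end: int         # acceptor-side coordinate of the junction
    reads: int       # number of RNA-Seq reads supporting the junction
    annotated: bool  # True if the junction is in the reference annotation


def candidate_pairs(
    variants: List[Variant],
    junctions: List[Junction],
    window: int = 100,   # assumed search window in bp (illustrative)
    min_reads: int = 5,  # assumed minimum read support (illustrative)
) -> List[Tuple[Variant, Junction]]:
    """Pair each variant with unannotated, well-supported junctions
    whose donor or acceptor end lies within `window` bp of it."""
    pairs = []
    for v in variants:
        for j in junctions:
            if j.chrom != v.chrom or j.annotated or j.reads < min_reads:
                continue
            distance = min(abs(v.pos - j.start), abs(v.pos - j.end))
            if distance <= window:
                pairs.append((v, j))
    return pairs


# Toy data: only the first junction is novel, well supported,
# and close to a variant on the same chromosome.
variants = [Variant("chr1", 1000), Variant("chr2", 500)]
junctions = [
    Junction("chr1", 1050, 2000, reads=12, annotated=False),
    Junction("chr1", 1050, 3000, reads=2, annotated=False),
    Junction("chr1", 990, 1500, reads=20, annotated=True),
    Junction("chr2", 5000, 6000, reads=30, annotated=False),
]
pairs = candidate_pairs(variants, junctions)
```

A real analysis would also need a statistical comparison against samples that lack the variant. That step is omitted here.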