
Yao Yao
Dr. Yao is a Staff Scientist at the Scripps Research Institute, specializing in computational biology and data science. Before joining Scripps, he completed his Ph.D. in Computer Science at Oregon State University. His dissertation research focused on using machine learning techniques to classify the regulatory non-coding human variome.
Abstract
Yao Yao, Everaldo Rodolpho, Sebastien Lelong, Xinghua Zhou, Cyrus Afrasiabi, Zhongchao Qian, Marco Alvarado Cano, Ginger Tsueng, Andrew I. Su, Chunlei Wu
Scripps Research, La Jolla, CA, United States
The depth and breadth of knowledge on genetic variation are growing exponentially. At the same time, modern web API technology is changing the way we collect, manage, and publish fast-growing variant annotations. First, users can fetch annotation data from web APIs with much lower latency than downloading and analyzing each data source separately. Second, web APIs return data in standard formats (e.g., JSON or XML), which enables easy integration with other applications such as knowledge discovery. These benefits reduce the burden of data wrangling on researchers and therefore accelerate their research.
Here we present MyVariant.info, a high-performance, scalable web API for variant annotations. Currently, it integrates variant annotations from 21 data sources (including ClinGen, ClinVar, dbSNP, dbNSFP, CIViC, Geno2MP, GRASP, gnomAD, and Wellderly). All data are parsed into JSON format and merged by shared HGVS IDs. To provide fast query performance and a rich query syntax, MyVariant.info uses Elasticsearch to index the JSON data. With its standard JSON format and highly usable query interface, MyVariant.info can readily support downstream studies such as drug repurposing, gene regulatory network analysis, and pathway analysis.
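The merge-by-HGVS-ID step described above can be illustrated with a minimal Python sketch. This is not the MyVariant.info pipeline itself: the function name merge_by_hgvs, the use of "_id" as the key holding the HGVS ID, the nesting of each source's fields under the source name, and all sample values are assumptions made purely for illustration.

```python
def merge_by_hgvs(sources):
    """Merge per-source annotation documents into one document per HGVS ID.

    `sources` maps a source name to a list of JSON-like dicts. Each dict is
    assumed (hypothetically) to carry its HGVS ID under the "_id" key.
    """
    merged = {}
    for source_name, docs in sources.items():
        for doc in docs:
            hgvs_id = doc["_id"]
            # Create the merged document on first sight of this HGVS ID.
            target = merged.setdefault(hgvs_id, {"_id": hgvs_id})
            # Nest this source's fields under the source name so that
            # annotations from different sources never overwrite each other.
            target[source_name] = {k: v for k, v in doc.items() if k != "_id"}
    return merged


# Made-up placeholder records, not real annotation data.
example_sources = {
    "source_a": [{"_id": "chr1:g.100A>G", "score": 0.5}],
    "source_b": [
        {"_id": "chr1:g.100A>G", "label": "example"},
        {"_id": "chr2:g.200C>T", "label": "other"},
    ],
}

merged_docs = merge_by_hgvs(example_sources)
```

In this sketch, a variant seen in two sources ends up as a single document with one sub-object per source, which is the shape a document index can then store and query.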
With the latest release in January 2023, MyVariant.info holds 1.42 billion annotation documents for variants in hg19 and 1.45 billion for hg38. It receives 2.6 million requests per month and serves more than 2,000 unique IP addresses with 99.99% uptime. MyVariant.info is accessible at https://myvariant.info; its source code is hosted on GitHub at https://github.com/biothings/myvariant.info under the Apache License, Version 2.0.
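As a rough illustration of fetching annotations over HTTP, the following Python sketch builds a request URL for a single variant and decodes the JSON response using only the standard library. The /v1/variant/ path, the fields and assembly parameters, the field names clinvar and dbsnp, and the example HGVS ID are assumptions here and should be checked against the documentation at https://myvariant.info; the helper names are hypothetical.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

BASE_URL = "https://myvariant.info/v1"  # assumed API root; verify in the docs


def build_variant_url(hgvs_id, fields=None, assembly="hg19"):
    """Build a URL for one variant (hypothetical helper, makes no request)."""
    # Percent-encode characters such as ">" that are unsafe in a URL path.
    path = quote(hgvs_id, safe=":")
    params = {}
    if fields:
        params["fields"] = ",".join(fields)
    params["assembly"] = assembly
    return f"{BASE_URL}/variant/{path}?{urlencode(params)}"


def fetch_variant(hgvs_id, fields=None, assembly="hg19", timeout=10):
    """Fetch and decode the JSON annotation document (needs network access)."""
    url = build_variant_url(hgvs_id, fields, assembly)
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Building the URL is side-effect free; the HGVS ID is only an example.
example_url = build_variant_url("chr7:g.140453134T>C", fields=["clinvar", "dbsnp"])
```

Calling fetch_variant with the same arguments would issue the request and return the response as a Python dict, ready to pass to downstream analysis code.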