Exon expression QTL (eeQTL) analysis highlights distant genomic variations associated with splicing regulation
Leying Guan1,4, Qian Yang1, Mengting Gu2, Liang Chen3, Xuegong Zhang1,2
1. MOE Key Laboratory of Bioinformatics and Bioinformatics Division, TNLIST, Department of Automation, Tsinghua University, Beijing 100084, China
2. School of Life Sciences, Tsinghua University, Beijing 100084, China
3. Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA
4. Department of Physics, Tsinghua University, Beijing 100084, China
Abstract Alternative splicing is a ubiquitous mechanism of post-transcriptional regulation of gene expression and produces multiple isoforms from the same gene. Expression quantitative trait loci (eQTL) analysis has been a major method for finding associations between gene expression and genomic variations. Differences in alternative splicing isoforms result from differences in the expression of exons. We propose to use exon expression QTL (eeQTL) analysis to study the genomic variations that are associated with splicing regulation. A stringent criterion was adopted to study gene-level eQTLs and exon-level eeQTLs for both cis- and trans-factors. From experiments on an RNA-sequencing (RNA-Seq) data set of HapMap samples, we observed that, compared with eQTLs, more trans-factors than cis-factors can be found among eeQTLs, and that many of the eeQTLs cannot be found at the gene level. This work highlights that the regulation of exons adds another layer of regulation to gene expression, and that eeQTL analysis is a new approach for investigating genome-wide genomic variations that are involved in the regulation of alternative splicing.
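As a rough illustration of the idea summarized above, the sketch below scores the association between a variant's genotype dosage and one exon's expression across samples, and labels the variant-exon pair as cis or trans. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function names, the use of Pearson correlation as the association score, and the 1 Mb cis window are all hypothetical choices made for illustration, and the abstract does not specify the actual statistic or thresholds.

```python
from math import sqrt

# Assumed cis window in base pairs (hypothetical; the abstract states no threshold).
CIS_WINDOW = 1_000_000


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A constant genotype or expression vector carries no association signal.
        return 0.0
    return cov / sqrt(var_x * var_y)


def classify(snp_chrom, snp_pos, exon_chrom, exon_start, exon_end, window=CIS_WINDOW):
    """Return 'cis' if the SNP is on the exon's chromosome and within `window`
    base pairs of the exon, otherwise 'trans'."""
    if snp_chrom != exon_chrom:
        return "trans"
    if exon_start - window <= snp_pos <= exon_end + window:
        return "cis"
    return "trans"


# Toy data (made up): genotype dosages (0/1/2 minor alleles) and
# expression values of one exon for six samples.
genotypes = [0, 0, 1, 1, 2, 2]
exon_expression = [1.0, 1.2, 2.1, 1.9, 3.0, 3.2]

score = pearson_r(genotypes, exon_expression)
label = classify("chr1", 5_000_000, "chr1", 1_000, 2_000)
print(label, round(score, 3))
```

In a real analysis the score would be converted to a p-value and corrected for the very large number of variant-exon pairs tested; running the same test on gene-level expression would give the eQTL counterpart for comparison.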
Keywords
eeQTL
eQTL
alternative splicing
trans-factor
association
regulation
Corresponding Author(s):
Xuegong Zhang
About author: Tongcan Cui and Yizhe Hou contributed equally to this work.
Online First Date: 02 November 2014
Issue Date: 04 December 2014