A nonparametric regression method for multiple longitudinal phenotypes using multivariate adaptive splines
Wensheng ZHU1,2, Heping ZHANG2
1. Key Laboratory for Applied Statistics of MOE, School of Mathematics and Statistics, Northeast Normal University, Changchun 130024, China; 2. Department of Epidemiology and Public Health, Yale University School of Medicine, New Haven, CT 06520-8034, USA

Abstract In genetic studies of complex diseases, particularly mental illnesses and behavioral disorders, two distinct characteristics have emerged in some data sets. First, genetic data sets are collected with a large number of phenotypes that are potentially related to the complex disease under study. Second, each phenotype is measured repeatedly over time on the same subject. In this study, we present a nonparametric regression approach that analyzes multivariate and time-repeated phenotypes together by using the technique of the multivariate adaptive regression splines for analysis of longitudinal data (MASAL). This makes it possible to identify genes, as well as gene-gene and gene-environment interactions (with time treated as one of the environmental factors), that are associated with the phenotypes of interest. Furthermore, we propose a permutation test to assess the associations between the phenotypes and selected markers. Through simulation, we demonstrate that our proposed approach has advantages over existing methods that either examine each longitudinal phenotype separately or analyze summarized values of the phenotypes by compressing them into one-time-point phenotypes. Application of the proposed method to the Framingham Heart Study illustrates that the use of multivariate longitudinal phenotypes enhanced the significance of the association test.
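To make the idea of a permutation test for marker-phenotype association concrete, the following is a minimal sketch in Python. It is not the MASAL-based test proposed in the paper: the statistic (`association_stat`, a sum of squared genotype-phenotype covariances over all phenotypes and visits) is a simple stand-in, and the function names, the nested-list data layout, and the default of 999 permutations are all assumptions made for illustration. The one feature it is meant to show is that genotypes are shuffled across subjects while each subject's full set of repeated measurements stays intact, so within-subject correlation is preserved under the null.

```python
import random


def association_stat(genotypes, phenotypes):
    """Stand-in statistic (hypothetical, not the MASAL statistic).

    phenotypes[i][k][t] is the value of phenotype k at visit t for subject i.
    Returns the sum, over every phenotype and visit, of the squared
    covariance between genotype and phenotype value.
    """
    n = len(genotypes)
    g_mean = sum(genotypes) / n
    n_pheno = len(phenotypes[0])
    n_time = len(phenotypes[0][0])
    total = 0.0
    for k in range(n_pheno):
        for t in range(n_time):
            y = [phenotypes[i][k][t] for i in range(n)]
            y_mean = sum(y) / n
            cov = sum((g - g_mean) * (v - y_mean)
                      for g, v in zip(genotypes, y)) / n
            total += cov ** 2
    return total


def permutation_pvalue(genotypes, phenotypes, n_perm=999, seed=0):
    """Permutation p-value for the stand-in statistic.

    Genotype labels are shuffled across subjects; each subject's
    phenotype trajectories are left untouched.
    """
    rng = random.Random(seed)
    observed = association_stat(genotypes, phenotypes)
    shuffled = list(genotypes)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if association_stat(shuffled, phenotypes) >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return (1 + hits) / (1 + n_perm)
```

In practice the stand-in statistic would be replaced by whatever measure of fit or association the chosen model provides; the permutation loop itself does not depend on that choice.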

Keywords
multivariate phenotypes
longitudinal data analysis
genetic association test
multivariate adaptive regression splines

Corresponding Author(s):
ZHANG Heping, Email: heping.zhang@yale.edu

Issue Date: 01 June 2013