Wonil Chung and Hyunji Hwang contributed equally to this work.

Various methodologies for the genetic analysis of longitudinal data have been proposed and applied to data from large-scale genome-wide association studies (GWAS) to identify single nucleotide polymorphisms (SNPs) associated with traits of interest and to detect SNP-time interactions. We recently proposed a grid-based Bayesian mixed model for longitudinal genetic data and showed that our Bayesian method increased the statistical power compared to the corresponding univariate method and well detected SNP-time interactions. In this paper, we further analyze longitudinal obesity-related traits such as body mass index, hip circumference, waist circumference, and waist-hip ratio from Korea Association Resource data to evaluate the proposed Bayesian method. We first conducted GWAS analyses of cross-sectional traits and combined the results of GWAS analyses through a meta-analysis based on a trajectory model and a random-effects model. We then applied our Bayesian method to a subset of SNPs selected by meta-analysis to further discover SNPs associated with traits of interest and SNP-time interactions. The proposed Bayesian method identified several novel SNPs associated with longitudinal obesity-related traits, and almost 25% of the identified SNPs had significant p-values for SNP-time interactions.

Genome-wide association studies (GWAS) have been extensively performed to identify single nucleotide polymorphisms (SNPs) associated with cross-sectional traits, but GWAS on longitudinal traits have been underexplored [

Recently, for the analysis of longitudinal genetic data, we developed a Bayesian mixed model with a built-in variable selection feature based on a grid-based covariance estimation approach [

In this paper, we analyze longitudinal obesity-related traits such as body mass index (BMI), hip circumference (HIP), waist circumference (WST), and waist-hip ratio (WHR) from Korea Association Resource (KARE) data to further evaluate the proposed Bayesian method. These traits are well-known markers of obesity across all ages and continuously change over time. We first conduct cross-sectional GWAS analysis at each time point and combine the results of the GWAS analysis of all-time points using a meta-analysis based on a trajectory model [

The Korean Genome and Epidemiology Study (KoGES) was a large prospective cohort study that was initiated to solve public health issues and prepare for personalized and preventive health care in Korea [

The 10,038 participants in the KARE were genotyped at ~500,000 SNPs using the Affymetrix Genomewide Human SNP array 5.0. For SNP quality control (QC), we removed SNPs with minor allele frequency (MAF)<0.05, genotype calling rates <95%, and Hardy-Weinberg equilibrium p-values <10^{−6}, resulting in 4,518,929 SNPs. For sample QC, we only preserved participants with consistent sex and calling rates >90%, resulting in 9,331 individuals. The SNP QC and sample QC were performed using PLINK (v1.90) [^{2} >0.80, which consisted of 3,848,960 SNPs.

In the GWAS for cross-sectional obesity-related traits such as BMI, HIP, WST, and WHR, we performed linear mixed model (LMM) association analysis implemented in the BOLT-LMM (v2.3) software [^{−8}) that were independent from each other and formed clumps of other SNPs that were located within 1,000 kb from each index SNP and in LD with the index SNP based on r^{2} > 0.1. Clumps annotated to the same gene were further combined to the clump with the most significant index SNP.

To discover SNPs associated with the overall trend and trajectory of phenotypes, we prepared two different longitudinal outcomes (the average and trajectory), which were obtained from longitudinal BMI, HIP, WST, and WHR measurements (baseline to sixth follow-up). To assess the average outcome, we calculated the average BMI, HIP, WST, and WHR measurements over the six time points and then used the average values as the phenotype for GWAS analysis. To assess the trajectory outcome, we utilized the methodology in Gouveia et al. [_{ij}_{0}_{i}_{0};; _{1}_{i}_{1}; _{hij}_{h}_{ij}_{0}_{i}_{1}_{i}_{1}_{i}

We utilized two different meta-analysis methods based on a fixed-effects model (Meta-Fixed) and a random-effects model (Meta-Random) to combine GWAS results for cross-sectional traits across all-time points (baseline to sixth follow-up). The Meta-Fixed method, as exemplified by the inverse-variance weighted method [

To further discover SNPs associated with traits of interest and SNPs interacting with time/environmental factors, we considered the following Bayesian mixed model:

where _{i}_{i}_{1}, …, _{in}_{i})^{T}_{i}_{i}_{n}_{i} is an _{i}^{tT}^{gT}^{ggT}^{gtT}^{T}_{i}_{i}_{i}^{2}_{n}_{i}). For a given trait, we had _{i}_{1}, …, _{k}^{T}_{i}_{i}_{i}_{i}_{r}_{r}_{+1} (_{r}_{r}_{+1}), we set _{r}

For Bayesian estimation of the mixed-effects model (^{T}

where _{i}_{i}_{1}, … _{ik}^{T}_{ij}

As priors for _{l}^{T}_{ml}^{T}_{0}_{0}^{2} distribution. The prior for the overall mean ^{2} was chosen as a scaled inverse ^{2} distribution,

The posterior samples can be used to approximate the posterior distribution of the parameters. The posterior inclusion probability of each SNP was calculated using its inclusion proportion in the Markov chain Monte Carlo samples. The Bayes factor (BF) can be calculated to quantify the evidence for the inclusion of a specific SNP against the exclusion of that SNP. Jeffreys [_{2}(BF) > 6.8). A critical issue with the proposed Bayesian model was how to choose an optimal number of grid points,

We conducted a GWAS analysis of KARE data on longitudinal obesity-related phenotypes such as BMI, HIP, WST, and WHR. Seven covariates (age, age squared, sex, area, drinking, smoking status and 10 genotype PCs) were included in the analysis. Among 9,331 individuals, we selected 4,621 individuals who had no missing values for either phenotypes or covariates. The number of measurements for each individual was six, and thus the total number of observations was 27,726. Hapmap3 SNPs with MAF > 0.05 and imputation ^{2} > 0.8 were retained for the GWAS analyses, resulting in 3,848,960 SNPs.

GWAS analyses for cross-sectional obesity-related phenotypes were performed using LMM association analysis implemented in the BOLT-LMM. To assess the cross-sectional outcomes, we analyzed all measurements (baseline to sixth follow-up) separately for each phenotype. ^{−8}), and for HIP, five SNPs on chromosome 11 (baseline), one SNP on chromosome 12 (third follow-up), two SNPs on chromosome 12 (fourth follow-up), and one SNP on chromosome 6 (sixth follow-up) exceeded the threshold for GWAS significance in

In order to identify SNPs associated with the overall trend and trajectory of the phenotype, we utilized two types of longitudinal outcomes (average and trajectory), obtained for BMI, HIP, WST, and WHR measurements across consecutive examinations of each individual. For the average outcomes, we calculated the average of all measurements from baseline to sixth follow-up. For the trajectory outcomes, we estimated the per-individual trajectory for all obesity-related traits by fitting a mixed model in the R package lme4 [

To effectively combine all GWAS results for cross-sectional traits, we performed two different meta-analyses (Meta-Fixed and Meta-Random; see the Methods section) using RE2C software [_{10}(p-value) in GWAS for four longitudinal traits between the Meta-Random method and Meta-Fixed method. As expected, the p-values with the Meta-Random method were smaller than those obtained with the Meta-Fixed method, but there was reasonable concordance between them. In

For our Bayesian analysis, a subset of SNPs was selected using LD clumping analysis based on the results of the Meta-Random method to only include SNPs that are not highly correlated with each other (with correlation < 0.5 to avoid multicollinearity). For our Bayesian analysis, we picked a list of the 4,124 SNPs from the LD clumping analyses (option: --clump -kb 250 −p1 0.001 −p2 0.01 −r2 0.5) using PLINK (v1.90) based on summary statistics from the Meta-Random analysis for the four obesity-related traits. Again, our Bayesian analysis included age, age squared, sex, area, drinking, smoking status, and 10 genotype PCs as covariates. The number of grid time points was set to 3 for the analysis based on the DIC and simplified BPIC scores with a different number of grid points (results not shown). _{2}(BF) > 6.8) for BMI, 20 SNPs for HIP, 10 SNPs for WST, and 6 SNPs for WHR in

Also, our Bayesian method can discover SNPs interacting with environmental covariates including age, we needed to investigate whether the identified SNPs interacted with age covariate. For SNP-age interaction, we fitted the LMM with each of the four traits (BMI, HIP, WST, and WHR) as a response variable and eight covariates (SNP, age, age squared, sex, area, drinking and smoking status, SNP-age interaction) as predictors: _{ij}_{0}_{i}_{0}, _{h}_{hij}_{ij}

In this paper, we performed a meta-analysis based on GWAS results for cross-sectional obesity-related traits such as BMI, HIP, WST, and WHR from KARE data and then applied our Bayesian method to a subset of SNPs selected by meta-analysis to further detect causal SNPs and SNPs with time-varying effects. To obtain the meta-analysis results, we first conducted GWAS analyses on obesity-related traits separately for each time point and then combined all GWAS results based on the average, trajectory, Meta-Fixed, and Meta-Random methods. For our Bayesian analysis, we first selected a subset of SNPs based on Meta-Random results and the LD clumping analyses and then applied our Bayesian method to those selected SNPs. We found our Bayesian method newly detected various novel loci and there was a reasonable concordance between our Bayesian analysis and the average, trajectory, Meta-Fixed, and Meta-Random methods. We also confirmed that almost 25% of the identified SNPs using our Bayesian method had significant p-values for SNP-age interaction, meaning that our Bayesian method can well detect SNPs interacting with age as well as SNPs associated with traits of interest.

With the limited sample size, we restricted our Bayesian analysis to a subset of SNPs based on the BOLT-LMM analysis and LD clumping analysis. With a sufficient sample size, our method can be applied to all available SNPs. We are currently developing a parallel computing algorithm based on a message passing interface to execute multiple groups of SNPs simultaneously. This will make it feasible to apply our method to large-sample GWAS data.

One limitation of the proposed Bayesian model is its interpretability. It cannot readily conclude whether the identified SNPs had main effects and/or interactions with other covariates including time. Novel findings need to be further evaluated to elucidate their mode of action, as we performed the analysis for SNP-age interaction. If the number of identified SNPs is small, we can estimate the effect of each genotypic combination, which we can use to interpret the genetic mode. Our Bayesian model for GWAS data relied on a set of preselected SNPs. How to select a good set of SNPs, especially those with low marginal effects but high interactions with other SNPs or environmental factors, is challenging and deserves further investigation.

Conceptualization: WC. Data curation: WC, HH, TP. Formal analysis: WC, HH. Funding acquisition: WC, TP. Methodology: WC. Writing - original draft: WC, HH. Writing - review & editing: WC, HH, TP.

Taesung Park serves as an editor of the Genomics and Informatics, but has no role in the decision to publish this article. All remaining authors have declared no conflicts of interest.

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (2020R1C1C1A01012657) and Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2021R1A6A1A10044154). This work was supported by the Soongsil University Research Fund. GWAS dataset and epidemiological data for KoGES are third-party data and are available under the approval of the data access committee of the National Biobank of Korea (

Supplementary data can be found with this article online at

Genomewide Manhattan plots of −log_{10}(p-value) for association with crosssectional second body mass index (BMI), hip circumference (HIP), waist circumference (WST), waisthip ratio (WHR) measurements using BOLT-LMM.

Genomewide Manhattan plots of −log_{10}(P-value) for association with crosssectional third body mass index (BMI), hip circumference (HIP), waist circumference (WST), and waisthip ratio (WHR) measurements using BOLT-LMM.

Genomewide Manhattan plots of −log_{10}(p-value) for association with cross-sectional fourth body mass index (BMI), hip circumference (HIP), waist circumference (WST), and waist-hip ratio (WHR) measurements using BOLT-LMM.

Genomewide Manhattan plots of −log_{10}(p-value) for association with crosssectional fifth body mass index (BMI), hip circumference (HIP), waist circumference (WST), and waist-hip ratio (WHR) measurements using BOLT-LMM.

Genomewide Manhattan plots of −log_{10}(p-value) for association with crosssectional sixth body mass index (BMI), hip circumference (HIP), waist circumference (WST), and waist-hip ratio (WHR) measurements using BOLT-LMM.

Quantile-Quantile plots of −log_{10}(p-value) for association with baseline body mass index (BMI), hip circumference (HIP), waist circumference (WST), and waist-hip ratio (WHR) measurements using BOLT-LMM.

Quantile-Quantile plots of −log_{10}(p-value) for association with cross-sectional second body mass index (BMI), hip circumference (HIP), waist circumference (WST), and waist-hip ratio (WHR) measurements using BOLT-LMM.

Quantile-Quantile plots of −log_{10}(p-value) for association with crosssectional third body mass index (BMI), hip circumference (HIP), waist circumference (WST), and waist-hip ratio (WHR) measurements using BOLT-LMM.

Quantile-Quantile plots of −log_{10}(p-value) for association with crosssectional fourth body mass index (BMI), hip circumference (HIP), waist circumference (WST), and waist-hip ratio (WHR) measurements using BOLT-LMM.

Quantile-Quantile plots of −log_{10}(p-value) for association with crosssectional fifth body mass index (BMI), hip circumference (HIP), waist circumference (WST), and waist-hip ratio (WHR) measurements using BOLT-LMM.

Quantile-Quantile plots of −log_{10}(p-value) for association with crosssectional sixth body mass index (BMI), hip circumference (HIP), waist circumference (WST), and waist-hip ratio (WHR) measurements using BOLT-LMM.

Genomewide Manhattan plots of −log_{10}(p-value) for association with average body mass index (BMI), hip circumference (HIP), waist circumference (WST), and waist-hip ratio (WHR) measurements using BOLT-LMM.

Genomewide Manhattan plots of −log_{10}(P-value) for association with trajectory body mass index (BMI), hip circumference (HIP), waist circumference (WST), and waist-hip ratio (WHR) measurements using BOLT-LMM.

Quantile-Quantile plots of −log_{10}(p-value) for association with average body mass index (BMI), hip circumference (HIP), waist circumference (WST), and waist-hip ratio (WHR) measurements using BOLT-LMM.

Quantile-Quantile plots of −log_{10}(p-value) for association with trajectory body mass index (BMI), hip circumference (HIP), waist circumference (WST), and waist-hip ratio (WHR) measurements using BOLT-LMM.

Genomewide Manhattan plots of −log_{10}(p-value) for association with longitudinal body mass index (BMI), hip circumference (HIP), waist circumference (WST), and waist-hip ratio (WHR) measurements using Meta-Fixed method.

Quantile-Quantile plots of −log_{10}(p-value) for association with longitudinal body mass index (BMI), hip circumference (HIP), waist circumference (WST), and waist-hip ratio (WHR) measurements using Meta-Fixed method.

Quantile-Quantile plots of −log_{10}(p-value) for association with longitudinal body mass index (BMI), hip circumference (HIP), waist circumference (WST), and waist-hip ratio (WHR) measurements using Meta-Random method.

Scatter plots of −log_{10}(P-value) for association for longitudinal body mass index (BMI), hip circumference (HIP), waist circumference (WST), and waist-hip ratio (WHR) measurements between Meta-Random method vs. Meta-Fixed Method.

Genomewide association results for BMI, HIP, WST, and WHRassociated SNPs with p-value < 5 ⅹ10^{-8} using BOLT-LMM

Genomewide association results for BMI, HIP, WST, and WHR-associated SNPs with p-value < 5 ⅹ10^{-8} using Meta-Fixed

Genomewide Association results for BMI, HIP, WST, and WHR-associated SNPs with p-value < 5 ⅹ10^{-8} using using Meta-Random

Genomewide association results for BMI, HIP, WST, and WHR-associated SNPs with 2log(BF) > 6.8 using gridbayes

Genome-wide Manhattan plots of −log_{10}(p-value) for association with baseline body mass index (BMI), hip circumference (HIP), waist circumference (WST), and waist-hip ratio (WHR) measurements using BOLT-LMM. For BMI, one single nucleotide polymorphism (SNP) on chromosome 6 reached genome-wide association studies (GWAS) significance, for HIP, five SNPs on chromosome 11 exceeded the threshold for GWAS significance and for WST, eleven SNPs on chromosome 11 reached GWAS significance.

Genome-wide Manhattan plots of −log_{10}(p-value) for associations with longitudinal body mass index (BMI), hip circumference (HIP), waist circumference (WST), and waist-hip ratio (WHR) measurements using the Meta-Random method. Two single nucleotide polymorphisms (SNPs) on chromosome 5 and seven SNPs on chromosome 16 reached genome-wide association studies (GWAS) significance and for WHR, six SNPs on chromosome 12 reached GWAS significance.

Scatter plots of −log_{10}(p-value) for association for longitudinal body mass index (BMI), hip circumference (HIP), waist circumference (WST), and waist-hip ratio (WHR) measurements between the Meta-Random method and baseline analysis. The p-values of the Meta-Random method were more significant than those of the baseline analysis in general.

Genome-wide profiles of 2log(BF) for all combined effects with body mass index (BMI), hip circumference (HIP), waist circumference (WST), and waist-hip ratio (WHR) measurements from Gridbayes. The two dashed horizontal lines represent the genome-wide thresholds for moderate (Bayes factor [BF]=10) strong (BF=30) genome-wide associations. We found 112 single nucleotide polymorphisms (SNPs) with strong signals for BMI, 20 SNPs for HIP, 10 SNPs for WST, and six SNPs for WHR.

Time-specific changes in body mass index (BMI) for three different genotypes (0, 1, and 2) at four single nucleotide polymorphisms (SNPs) (Chr5:149489242, Chr6:96377858, Chr10:18571215, and Chr20:40534573), age-specific changes in hip circumference (HIP) at one SNP (Chr8:98187086), and age-specific changes in waist circumference (WST) at one SNP (Chr2:133806432). The figures clearly show different slopes in phenotypic values over six time points for three genotypes, meaning that there are SNP-age interactions at identified the genetic loci.

Genome-wide association results for BMI, HIP, WST, and WHR-associated SNPs with a p-value < 0.05 using the LMM method

Trait | Chr | Position | Minor | Major | MAF | Beta | SE | p(LMM) |
---|---|---|---|---|---|---|---|---|

BMI | 13 | 55546160 | C | T | 0.3864 | −0.1773 | 0.0467 | 3.53 × 10^{−4} |

13 | 55326244 | A | G | 0.3059 | −0.1164 | 0.0503 | 3.53 × 10^{−4} | |

8 | 108287752 | C | T | 0.1474 | 0.1386 | 0.0638 | 5.71 × 10^{−4} | |

9 | 109243327 | C | A | 0.2209 | 0.0088 | 0.0580 | 1.24 × 10^{−3} | |

12 | 97721824 | G | A | 0.3989 | −0.0519 | 0.0484 | 1.71 × 10^{−3} | |

4 | 17340809 | A | G | 0.1842 | 0.1939 | 0.0596 | 1.72 × 10^{−3} | |

5 | 136606085 | C | T | 0.4770 | −0.1044 | 0.0442 | 2.06 × 10^{−3} | |

5 | 136608677 | G | A | 0.4774 | −0.1075 | 0.0442 | 2.32 × 10^{−3} | |

11 | 126243505 | T | C | 0.4071 | 0.1202 | 0.0480 | 2.89 × 10^{−3} | |

3 | 88585081 | A | C | 0.0772 | −0.1467 | 0.0860 | 3.94 × 10^{−3} | |

20 | 37949055 | G | A | 0.2599 | 0.0571 | 0.0503 | 5.30 × 10^{−3} | |

3 | 37832172 | A | G | 0.4460 | −0.1209 | 0.0475 | 6.85 × 10^{−3} | |

4 | 17378993 | T | C | 0.2553 | 0.1735 | 0.0520 | 8.00 × 10^{−3} | |

9 | 119401114 | T | C | 0.4033 | 0.0356 | 0.0452 | 1.02 × 10^{−2} | |

20 | 15637524 | T | C | 0.4354 | 0.1340 | 0.0519 | 1.18 × 10^{−2} | |

13 | 105882996 | C | T | 0.3231 | −0.0168 | 0.0550 | 1.45 × 10^{−2} | |

20 | 43260694 | C | T | 0.1624 | −0.1104 | 0.0613 | 1.82 × 10^{−2} | |

6 | 96377858 | T | C | 0.2070 | −0.2001 | 0.0601 | 1.96 × 10^{−2} | |

3 | 186711098 | T | G | 0.4910 | −0.1182 | 0.0454 | 2.13 × 10^{−2} | |

5 | 149489242 | T | C | 0.1006 | 0.1105 | 0.0785 | 2.25 × 10^{−2} | |

10 | 18571215 | G | A | 0.3777 | −0.1505 | 0.0495 | 2.37 × 10^{−2} | |

20 | 40534573 | T | C | 0.3207 | −0.1388 | 0.0488 | 2.56 × 10^{−2} | |

3 | 5500434 | T | C | 0.2601 | 0.0894 | 0.0521 | 2.70 × 10^{−2} | |

5 | 136484839 | G | T | 0.2642 | 0.1446 | 0.0519 | 2.76 × 10^{−2} | |

20 | 16840158 | A | G | 0.3718 | −0.1600 | 0.0521 | 3.02 × 10^{−2} | |

13 | 104745772 | A | G | 0.0685 | 0.2781 | 0.0892 | 3.19 × 10^{−2} | |

9 | 119383320 | A | C | 0.3169 | −0.0684 | 0.0479 | 3.42 × 10^{−2} | |

1 | 237458209 | G | A | 0.4012 | −0.1363 | 0.0493 | 3.75 × 10^{−2} | |

12 | 78510165 | G | T | 0.4001 | −0.1467 | 0.0455 | 4.02 × 10^{−2} | |

11 | 20359771 | C | A | 0.3368 | −0.1069 | 0.0523 | 4.65 × 10^{−2} | |

HIP | 5 | 28989968 | G | T | 0.1186 | −0.1733 | 0.0700 | 6.01 × 10^{−3} |

8 | 98187086 | A | G | 0.1649 | 0.1739 | 0.0612 | 8.60 × 10^{−3} | |

1 | 64914487 | G | A | 0.0839 | 0.2533 | 0.0816 | 4.83 × 10^{−2} | |

WST | 2 | 133232051 | A | G | 0.0532 | 0.1073 | 0.1009 | 3.05 × 10^{−4} |

5 | 41264019 | C | T | 0.1530 | −0.1240 | 0.0645 | 5.06 × 10^{−3} | |

5 | 42569642 | C | A | 0.4095 | −0.1104 | 0.0457 | 4.61 × 10^{−2} | |

2 | 133806432 | G | T | 0.3763 | 0.0108 | 0.0506 | 4.96 × 10^{−2} | |

WHR | 12 | 114085922 | C | T | 0.2865 | 0.1432 | 0.0508 | 3.23 × 10^{−2} |

BMI, body mass index; HIP, hip circumference; WST, waist circumference; WHR, waist-hip ratio; SNP, single nucleotide polymorphism; LMM, linear mixed model.