Impact of type 2 diabetes variants identified through genome-wide association studies in early-onset type 2 diabetes from South Indian population
Abstract
The prevalence of early-onset type 2 diabetes (EOT2D) is increasing in Asian countries. Genome-wide association studies performed in European and various other populations have identified associations of numerous variants with type 2 diabetes in adults. However, the genetic component of EOT2D, which is still unexplored, could have similarities with that of late-onset type 2 diabetes. In the present study, we aimed to identify the association of these variants with EOT2D in a South Indian population. Twenty-five variants from 18 gene loci were genotyped in 1,188 EOT2D and 1,183 normal glucose tolerant subjects using the MassARRAY technology. We confirm the association of the HHEX variant rs1111875 with EOT2D in this South Indian population and also the association of CDKN2A/2B (rs7020996) and TCF7L2 (rs4506565) with EOT2D. Logistic regression analyses of the TCF7L2 variant rs4506565 (A/T) showed that the heterozygous and homozygous carriers of allele ‘T’ have odds ratios of 1.47 (95% confidence interval [CI], 1.17 to 1.83; p = 0.001) and 1.65 (95% CI, 1.18 to 2.28; p = 0.006), respectively, relative to the AA homozygote. For the HHEX variant rs1111875 (T/C), heterozygous and homozygous carriers of allele ‘C’ have odds ratios of 1.13 (95% CI, 0.91 to 1.42; p = 0.27) and 1.58 (95% CI, 1.17 to 2.12; p = 0.003), respectively, relative to the TT homozygote. For the CDKN2A/2B variant rs7020996, the heterozygous and homozygous carriers of allele ‘T’ have odds ratios of 0.65 (95% CI, 0.51 to 0.83; p = 0.0004) and 0.62 (95% CI, 0.27 to 1.39; p = 0.24), respectively, relative to the CC homozygote, suggesting a protective effect that reached significance only in heterozygotes. This is the first study to report the association of the HHEX variant rs1111875 with EOT2D in this population.
Introduction
Early-onset type 2 diabetes (EOT2D) is a relatively new phenomenon, recognized in the past few decades, and is caused by the complex interplay between genetic and environmental factors [1,2]. Currently, 425 million people are living with diabetes worldwide, and the number is expected to reach 629 million by 2045, with nearly 60% of the affected people living in Asian countries [3,4]. Asians have an earlier age of diagnosis and a higher prevalence of diabetes for the same body mass index (BMI) than Europeans. India alone is presently home to 72 million people with diabetes [4,5]. Epidemiological studies performed in Indians showed a 25.3% increase in individuals developing type 2 diabetes (T2D) at <40 years [6]. A recent nationwide population-based study estimating the national prevalence of diabetes and prediabetes in India also indicates 25–34 years as the take-off point for diabetes in both urban and rural areas [7].
Genome-wide association studies (GWAS) and subsequent meta-analyses of these studies have increased the list of T2D-associated genetic variants to more than a hundred [8]. However, the variants identified by these large-scale GWAS were mostly associated with the late-onset type 2 diabetes (LOT2D) subtype, which develops after 40 years of age. Although the T2D subtype that develops at earlier ages (EOT2D) has a considerably larger heritable component, very few studies have looked at the genetic component of EOT2D. T2D develops a decade or two earlier in Asian Indians and often coincides with the monogenic form of diabetes, namely maturity-onset diabetes of the young (MODY) [9,10]. Indeed, previous studies have demonstrated association of some MODY variants with EOT2D as well [11-13]. Genetic variants in TCF7L2 [14-16], HNF1A [17], ABCA1 [18], DIO2 [19], PCLO [20], TRIB3 [21], ADIPOQ, and LEPR [22], identified with LOT2D in various populations, also showed association with EOT2D. Our group also replicated, in Asian Indians with EOT2D, the association of variants in TCF7L2, CDKN2A/2B, and an intergenic single nucleotide polymorphism (SNP) on chromosome 1p31 that had been identified with LOT2D in various populations [23]. With this background, the present study was designed to examine the association of 25 variants within 18 distinct gene loci, previously identified with the LOT2D subtype in various GWAS, with EOT2D in South Indians.
Methods
Study subjects
The study group comprised 1,188 unrelated EOT2D subjects and 1,183 normal glucose tolerant (NGT) subjects recruited from the Chennai Urban Rural Epidemiology Study (CURES) and from Dr. Mohan’s Diabetes Specialties Centre (DMDSC), a tertiary diabetes center in Chennai, South India. Subjects for the study were selected based on the World Health Organization (WHO) criteria. NGT was defined as fasting plasma glucose < 100 mg/dL and 2-h post glucose value ≤ 140 mg/dL. Diabetes was diagnosed if the fasting plasma glucose was ≥ 126 mg/dL or the 2-h post glucose value was ≥ 200 mg/dL, or if the participant was on drug therapy for diabetes after diagnosis by a physician. The following criteria were used to select EOT2D subjects: diagnosis of diabetes before the age of 35 years, response to oral hypoglycemic agents, fasting C-peptide > 1.0 and stimulated C-peptide > 2.0 pmol/mL, and negative glutamic acid decarboxylase antibodies. Only unrelated individuals were included in this study. Subjects with ketoacidosis at diagnosis, exocrine pancreatic disease (fibrocalculous pancreatic diabetes), pregnant women, and subjects known to have confirmed maturity-onset diabetes of the young were excluded from the study. Written consent was obtained from all the individuals participating in the study, and the study was approved by the Institutional Ethics Committee of the Madras Diabetes Research Foundation (RHN/Adhoc/19/2011-2012).
Anthropometric and biochemical measurements
Anthropometric measurements including weight, height, and waist measurements were obtained using standardized techniques. The BMI was calculated using the formula weight (kg)/height² (m²). Blood pressure (BP) was measured with a mercury sphygmomanometer (Diamond Deluxe BP apparatus, Pune, India) from the left arm in a sitting position. Fasting plasma glucose (glucose oxidase-peroxidase method), serum cholesterol (cholesterol oxidase-peroxidase-amidopyrine method), serum triglycerides (glycerol phosphate oxidase-peroxidase-amidopyrine method), and high-density lipoprotein cholesterol (direct method polyethylene glycol–pretreated enzymes) were measured using a Hitachi-912 Autoanalyzer (Hitachi, Mannheim, Germany). Low-density lipoprotein cholesterol was calculated using the Friedewald formula. Glycated hemoglobin was estimated by high-pressure liquid chromatography using the Variant machine (Bio-Rad, Hercules, CA, USA), and the intra- and inter-assay coefficients of variation of glycated hemoglobin were less than 10%.
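The two derived quantities in this paragraph can be sketched in a few lines of Python. This is an illustration only, not the authors' code: the function names are hypothetical, the Friedewald estimate is written in its usual mg/dL form (total cholesterol minus HDL cholesterol minus triglycerides/5), and the rejection of triglycerides at or above 400 mg/dL reflects the conventional validity limit of the formula rather than anything stated in the text.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m^2: weight divided by height squared."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / (height_m ** 2)


def friedewald_ldl(total_chol: float, hdl: float, triglycerides: float) -> float:
    """Estimate LDL cholesterol (mg/dL) as TC - HDL - TG/5.

    Assumption: all inputs are in mg/dL. The estimate is conventionally
    regarded as unreliable at triglycerides >= 400 mg/dL, so that case
    is rejected here.
    """
    if triglycerides >= 400:
        raise ValueError("Friedewald estimate not valid for TG >= 400 mg/dL")
    return total_chol - hdl - triglycerides / 5.0
```

For example, `bmi(70, 1.75)` gives roughly 22.9 kg/m², and `friedewald_ldl(200, 50, 150)` gives 120 mg/dL.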
SNP selection
Twenty-five variants representing 18 different gene loci identified in various GWAS, including ADAMTS9 [24], CDC123 [25], CDKAL1 [26-28], CDKN2A/2B [24,29,30], COBLL1 [31], GRB14 [32], HNF1A [33], HNF4A [34], IGF2BP2 [26,28], JAZF1 [24,33], HHEX [35], PPARG [35], RBMS1 [36], SLC30A8 [27], TCF7L2 [37], THADA [24], TP53INP1 [34], and TSPAN8 [34], were selected for the study.
Genotyping
Genomic DNA was extracted from peripheral blood leucocytes by proteinase K digestion followed by the phenol-chloroform method. Genotyping was done using the MassARRAY system (Sequenom, San Diego, CA, USA) following the manufacturer's instructions, as published elsewhere [38]. SpectroTYPER software (Sequenom) automatically called the genotypes, and only conservative and moderate calls were accepted for the study. Ten percent of the genotyped samples were replicated, and the observed discordance rate for the replicated samples was less than 0.4%. All the genotyped variants had call rates ranging from 90% to 99%.
Statistical analysis
Hardy-Weinberg equilibrium (HWE) was tested using Pearson χ² statistics in controls for each variant separately. Logistic regression analysis was performed assuming an additive model to determine the association between variants and the risk for EOT2D, with and without adjustment for confounders such as age, sex, and BMI, using SPSS version 20.0 (IBM Corp., Armonk, NY, USA). The power of the study was estimated using the PS Power and Sample Size program (Vanderbilt University, Nashville, TN, USA) with a type I error probability α = 0.05. Linkage disequilibrium (LD) and haplotype frequencies were estimated using Haploview software (http://www.broad.mit.edu/mpg/haploview/) [39].
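The HWE check described above can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function name and genotype counts are hypothetical, and it assumes a biallelic SNP tested with a 1-degree-of-freedom Pearson χ² statistic, for which the p-value can be obtained from the complementary error function.

```python
import math


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> tuple[float, float]:
    """Pearson chi-square test of Hardy-Weinberg equilibrium for one SNP.

    Takes observed genotype counts (AA, AB, BB) and returns
    (chi-square statistic, p-value with 1 degree of freedom).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # For 1 df, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

Applied to the control genotype counts of each variant, a large p-value indicates no detectable departure from HWE.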
Ethical approval
All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional ethics committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.
Results
Clinical and biochemical parameters of the study subjects
Table 1 summarizes the clinical and biochemical parameters of the subjects studied. The mean ages of the EOT2D and NGT subjects were 32 ± 6 and 31 ± 8 years (mean ± SD), respectively. The fasting plasma glucose, 2-h post plasma glucose, and glycated hemoglobin were significantly (p < 0.001) higher among the EOT2D subjects than among the NGT subjects.
Comparison of minor allele frequencies of the studied polymorphisms in South Indian population with frequencies from the 1000 Genomes Project populations
Minor allele frequencies (MAF) of the SNPs examined in the present study were compared with the frequencies reported by the 1000 Genomes Project (global, European, and South Asian populations), which represents the genetic diversity across populations worldwide; the comparison is shown in Supplementary Table 1. The MAF of most of the studied SNPs was similar to that of the 1000 Genomes Project South Asian population, consistent with a high South Asian component in this study population. All the SNPs included in the present study were common in the South Asian population (MAF ≥ 0.05).
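As a reminder of how a MAF is obtained from genotype data, the following Python sketch computes it from genotype counts for a biallelic SNP. The function name and the example counts are hypothetical and are not taken from the study data.

```python
def minor_allele_frequency(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Minor allele frequency of a biallelic SNP from genotype counts.

    Each individual carries two alleles, so allele A is counted twice
    in AA homozygotes and once in AB heterozygotes.
    """
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    if total_alleles == 0:
        raise ValueError("no genotypes supplied")
    freq_a = (2 * n_aa + n_ab) / total_alleles
    return min(freq_a, 1.0 - freq_a)


# Hypothetical counts: 640 AA, 320 AB, 40 BB gives a MAF close to 0.2,
# which would pass the MAF >= 0.05 threshold used for "common" SNPs.
example_maf = minor_allele_frequency(640, 320, 40)
is_common = example_maf >= 0.05
```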
LD estimation
LD analysis was performed for SNPs in IGF2BP2 (rs4402960, rs1470579, and rs6769511), CDKAL1 (rs4712523, rs4712524, and rs7754840), JAZF1 (rs868745 and rs849134), and CDKN2A/2B (rs564398, rs7020996, and rs2383208). Fig. 1 shows the r² values for the studied SNPs. The r² values were found to be at least 0.83 between the SNPs in IGF2BP2, CDKAL1, and JAZF1.

Linkage disequilibrium plot for the single nucleotide polymorphisms (SNPs) of the IGF2BP2, CDKAL1, JAZF1, and CDKN2A/2B genes. The r² values are shown in the linkage disequilibrium (LD) plot. LD is seen only between SNPs of the same gene, not across genes.
Since the r² values between SNPs in CDKN2A/2B were less than 0.35, haplotypes were constructed and the differences in haplotype frequencies between cases and controls were analyzed. For the SNPs within CDKN2A/2B, although the frequency of the TCA haplotype was higher in EOT2D subjects than in NGT subjects (p = 0.027), the significance was lost after Bonferroni correction (threshold p < 0.05/7 = 0.007). Table 2 shows the haplotype frequencies of the SNPs within the CDKN2A/2B locus.
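For readers unfamiliar with the r² statistic, a minimal Python sketch of its definition for two biallelic SNPs follows. It assumes that the haplotype and allele frequencies are already known; Haploview estimates them from unphased genotypes, a step this sketch does not reproduce. The function name and the example frequencies are hypothetical.

```python
def ld_r_squared(p_ab: float, p_a: float, p_b: float) -> float:
    """Pairwise LD r^2 between two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1
          and allele B at SNP 2
    p_a:  frequency of allele A at SNP 1
    p_b:  frequency of allele B at SNP 2
    """
    for freq in (p_a, p_b):
        if not 0.0 < freq < 1.0:
            raise ValueError("allele frequencies must lie strictly between 0 and 1")
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


# Two alleles that always travel together are in complete LD (r^2 = 1);
# alleles combined at random give r^2 = 0.
complete_ld = ld_r_squared(p_ab=0.3, p_a=0.3, p_b=0.3)
no_ld = ld_r_squared(p_ab=0.12, p_a=0.3, p_b=0.4)
```

High r² values, such as the values of at least 0.83 reported for IGF2BP2, CDKAL1, and JAZF1, mean the SNPs within a locus carry largely redundant information, whereas low values, as for CDKN2A/2B, motivate a separate haplotype analysis.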
Association of studied SNPs with EOT2D
Genotypic distributions of all the variants studied were in HWE, and none of the variants was monoallelic. As shown in Table 3, eight SNPs within six distinct loci, namely rs7020996 (CDKN2A/2B), rs7607980 (COBLL1), rs6769511, rs1470579, and rs4402960 (IGF2BP2), rs4812829 (HNF4A), rs1111875 (HHEX), and rs4506565 (TCF7L2), were found to be significantly associated (p < 0.05) with EOT2D in our South Indian population. However, after Bonferroni correction (threshold p = 0.002), the association with EOT2D remained significant for only three SNPs within three distinct gene loci: rs1111875 (HHEX: p = 2.0 × 10⁻⁴), rs4506565 (TCF7L2: p = 1.0 × 10⁻⁵), and rs7020996 (CDKN2A/2B: p = 6.0 × 10⁻⁴). The remaining variants in COBLL1 (rs7607980), IGF2BP2 (rs6769511, rs1470579, and rs4402960), and HNF4A (rs4812829) therefore showed only a tendency toward association (p < 0.05) with EOT2D in the present study. The risk allele frequencies for all the variants in the EOT2D and NGT subjects are shown in Table 3.
Logistic regression analyses were performed under the additive model for the variants significantly associated with EOT2D in the present study, rs1111875 (HHEX), rs4506565 (TCF7L2), and rs7020996 (CDKN2A/2B), after adjusting for potential confounders such as age, sex, and BMI (Table 4). The heterozygous and homozygous carriers of allele ‘T’ of the TCF7L2 variant rs4506565 (A/T) had odds ratios of 1.47 (95% confidence interval [CI], 1.17 to 1.83; p = 0.001) and 1.65 (95% CI, 1.18 to 2.28; p = 0.006), respectively, relative to the AA homozygote. In the case of the HHEX variant rs1111875 (T/C), heterozygous and homozygous carriers of allele ‘C’ had odds ratios of 1.13 (95% CI, 0.91 to 1.42; p = 0.27) and 1.58 (95% CI, 1.17 to 2.12; p = 0.003), respectively, relative to the TT homozygote. However, for the CDKN2A/2B variant rs7020996, heterozygous carriers of the ‘T’ allele showed a protective association with EOT2D, with an odds ratio of 0.65 (95% CI, 0.51 to 0.83; p = 0.0004), while homozygous carriers showed no significant association (OR, 0.62; 95% CI, 0.27 to 1.39; p = 0.24), relative to the CC homozygote.
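The Bonferroni step can be checked with a few lines of Python. The sketch below uses only the numbers reported in the text (α = 0.05, 25 variants, and the three p-values that survived correction); the function and variable names are illustrative.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# p-values reported in the text for the three surviving variants.
reported_p = {
    "rs1111875 (HHEX)": 2.0e-4,
    "rs4506565 (TCF7L2)": 1.0e-5,
    "rs7020996 (CDKN2A/2B)": 6.0e-4,
}

threshold = bonferroni_threshold(0.05, 25)  # 25 variants were tested
surviving = [snp for snp, p in reported_p.items() if p < threshold]
```

With 25 tests the threshold is 0.05/25 = 0.002, and all three listed p-values fall below it. The same function with 7 tests gives approximately 0.007, the threshold used for the CDKN2A/2B haplotype comparison.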
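As a consistency check on the reported estimates, the standard error, z statistic, and two-sided p-value can be approximately recovered from an odds ratio and its 95% CI. The Python sketch below assumes a symmetric Wald interval on the log-odds scale, which is what standard logistic regression software typically reports; it is not the authors' analysis, and because the published values are rounded, the recovered p-values are only approximate.

```python
import math


def wald_from_ci(odds_ratio: float, ci_low: float, ci_high: float,
                 z_crit: float = 1.959964) -> tuple[float, float, float]:
    """Recover (SE of log OR, z statistic, two-sided p) from an OR and its CI.

    Assumption: the interval is exp(log OR +/- z_crit * SE), i.e. a Wald
    interval; z_crit = 1.959964 corresponds to a 95% interval.
    """
    log_or = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = log_or / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p_value


# Values reported in the text for heterozygous carriers.
tcf7l2_het = wald_from_ci(1.47, 1.17, 1.83)  # TCF7L2 rs4506565
cdkn2a_het = wald_from_ci(0.65, 0.51, 0.83)  # CDKN2A/2B rs7020996
```

Both recovered p-values are of the same order as the published ones (0.001 and 0.0004), and the sign of z separates the risk-increasing from the protective association.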

Association of variants with early-onset type 2 diabetes, with odds ratios (OR) and confidence intervals (CI) adjusted for age, sex, and BMI
Table 5 compares the clinical and biochemical characteristics of NGT subjects by genotype for the SNPs associated with EOT2D. For rs4506565 of the TCF7L2 gene, NGT subjects with the ‘TT’ genotype had increased glycated hemoglobin levels (mean ± SD, 5.5 ± 0.4) when compared with carriers of the ‘AA’ genotype (5.4 ± 0.4, p = 0.01). In the case of the CDKN2A/2B variant rs7020996, carriers of the ‘CC’ genotype had significantly higher fasting plasma glucose levels (mean ± SD, 87 ± 9 mg/dL) than carriers of the ‘TC’ genotype (85 ± 9 mg/dL, p = 0.03). None of the other biochemical parameters showed any significant differences among the genotypes in either the NGT or the diabetic subjects.
Discussion
There is a rapid increase in the number of subjects diagnosed with T2D below the age of 40 years. However, only a few studies have investigated the association of genetic determinants of LOT2D with EOT2D. The present study, which investigated the association of 25 variants from 18 distinct gene loci with EOT2D in this South Indian population, has shown association of variants in TCF7L2, CDKN2A/2B, and HHEX with EOT2D, with p-values of 1.00 × 10⁻⁵, 6.00 × 10⁻⁴, and 2.00 × 10⁻⁴, respectively, and power ranging from 67% to 83%.
Transcription factor-7-like 2 (TCF7L2) spans a 217-kb region on chromosome 10q25.3. TCF7L2 is a transcription factor involved in the Wnt signaling pathway and is expressed not only in β-cells but also in other cell lineages and glucose-metabolizing tissues, including the liver [40]. TCF7L2, identified by Grant et al. [41], is the gene conferring the greatest susceptibility to LOT2D in various populations [42-48]. Association of the TCF7L2 variant rs4506565 (A/T) with LOT2D was initially reported by the Wellcome Trust Case Control Consortium with an odds ratio of 1.88 (1.56–2.27, p = 5.1 × 10⁻¹²) [37]. The association of rs4506565 (TCF7L2) with T2D was later replicated in Middle Eastern [49,50], Tunisian Arab [51], Lebanese [52], and Indian populations [48,53,54]. In Europeans, the TCF7L2 variant rs4506565 showed evidence for association with EOT2D exceeding genome-wide significance, thus clearly establishing TCF7L2 as a T2D susceptibility gene of substantial importance [42]. Table 6 compares the p-values and odds ratios of the SNPs associated with EOT2D in the present study with those reported for T2D in other populations [24,35,37,49,52,56,56]. The risk allele frequency of TCF7L2 (rs4506565) in South Indian EOT2D subjects was observed to be 36.4%, compared with 37% in North Indian subjects [48], 49% in Saudi Arabian subjects [49], 44% in Tunisian Arab subjects [51], 46% in Lebanese subjects [52], and 39% in European subjects with LOT2D [37]. In the present study, we have shown a strong association of the TCF7L2 variant with EOT2D in the South Indian population. A previous study by Chidambaram et al. [23] showed only marginal association of rs4506565 (TCF7L2) with EOT2D in Asian Indians. The limitation of that study was its small sample size; in the present study, we used a much larger sample, thus increasing the power of the study.
The TCF7L2 variant rs4506565 also showed a significant association with fasting glucose in non-diabetic subjects in a European population [57]. A comprehensive pathway analysis by Zhao et al. [58] of 529 of the 548 genes within 5 kb of a TCF7L2 binding site showed enrichment of metabolism-related pathway categories among genes bound by TCF7L2. Lyssenko et al. [59], using an adenovirus system, showed that a 2-fold increase in TCF7L2 expression in human islets was associated with increased insulin gene expression and reduced glucose-stimulated insulin secretion compared with control islets. These studies thus provide evidence for increased expression of TCF7L2 in human islets with altered insulin but not glucagon secretion. TCF7L2 also plays a crucial role in coordinating the expression of proinsulin and its subsequent processing to form mature insulin [60]. In mouse models, however, removal of TCF4 from β cells in newborn Tcf7l2–/– mice and in adult β cell–specific Tcf7l2 mutants challenged by fasting or by a high-fat diet did not show any effect on β-cell function [61].
The present study has also confirmed the association of the CDKN2A/2B variant rs7020996 with EOT2D in the South Indian population, which was earlier suggested by Chidambaram et al. [23] in an Asian Indian population. The CDKN2A/2B locus at chromosome 9p21 was tagged as a hot spot for association with LOT2D in a series of GWAS [24,26,30,35]. Zeggini et al. [24] initially reported, in a meta-analysis, the association of rs7020996 of the CDKN2A/2B locus with T2D in European populations with an OR of 1.26 (1.15–1.38; p = 1.8 × 10⁻⁷). Although CDKN2A/2B was reported to influence diabetes risk across varied ethnicities, not many studies have replicated the association of the CDKN2A/2B variant rs7020996 with T2D. Replication studies by our own group have shown significant association of rs7020996 with both EOT2D and LOT2D [23,54]. The risk allele frequency of CDKN2A/2B (rs7020996) in South Indian EOT2D subjects was observed to be 88.9%, compared with 88% in Asian Indian EOT2D subjects [23] and 91% in South Indian LOT2D subjects [53]. The CDKN2A/2B genes are expressed in adipocytes and pancreatic islets. CDKN2A and CDKN2B encode p16INK4a and p15INK4b, respectively, which inhibit the activity of CDK4 and CDK6. p16INK4a, encoded by CDKN2A, is a tumor suppressor that inhibits cyclin-dependent kinase 4 (CDK4) and thereby influences pancreatic β-cell proliferation, leading to decreased β-cell mass and, subsequently, decreased insulin release. The resulting increase in insulin demand possibly increases the susceptibility to T2D [43]. In murine models, overexpression of Cdkn2a leads to decreased islet proliferation in aging mice, and that of Cdkn2b leads to islet hypoplasia and diabetes [62]. A study by Kong et al. [63] investigating the mechanism through which the GWAS-identified CDKN2A/2B variants increase T2D risk showed that the impact of CDKN2A/2B SNPs is mediated through β-cell mass but not β-cell function.
HHEX, located on chromosome 10q23.33, encodes a 270-amino-acid protein and was identified as strongly associated with LOT2D in European populations by Scott et al. [35], with an odds ratio (OR) of 1.13 (1.09–1.17; p = 5.7 × 10⁻¹⁰). The association was later replicated in Danish [43], Japanese [55,64,65], Korean [56], Han Chinese [66], and Tunisian populations [67]. However, studies performed in Indian [53,54,68-70] and African American populations [71] failed to replicate the association. The lack of association of the HHEX variant in various Indian populations (Khatri Sikhs [68], a Hyderabadi population [69], an endogamous North Indian population [53], and a South Indian population [54]) could possibly be due to insufficient sample size, population stratification/admixture, or confounders. A meta-analysis of 26 studies with 45,792 cases and 65,083 controls also revealed a stronger association between rs1111875 and risk for T2D in East Asian populations (OR, 1.19) than in white (OR, 1.15) and Indian populations (OR, 1.13) [72]. Intriguingly, a meta-analysis by Chauhan et al. [73] performed in Indian populations successfully replicated the association of the HHEX variant with LOT2D in a North Indian population. The risk allele frequency of HHEX (rs1111875) in South Indian EOT2D subjects was observed to be 42.3%, compared with 32% in Japanese [55], 36% in Korean [56], 32% in Han Chinese [66], and 52% in European [35] subjects with T2D. Giannini et al. [74] showed association of the HHEX variant rs1111875 with prediabetes among obese youth. In European and Finnish populations, rs1111875 (HHEX) showed association with lower birth weight, providing evidence for the ‘fetal programming hypothesis’, which proposes that decreased insulin secretion or action leads to reduced intrauterine growth and thereby to lower birth weight as well as susceptibility to LOT2D [75,76].
The HHEX gene encodes a transcription factor involved in the Wnt signaling pathway and also plays an important role in many biological processes, including cell cycle regulation, organ development, and cell differentiation, via both transcriptional activation and repression [77]. Recent functional studies have identified HHEX as the first transcription factor required for δ-cell maintenance, which is mediated through paracrine regulation of β-cell activity. The same study also showed that misregulated HHEX expression disrupts paracrine control of insulin secretion, leading to accelerated β-cell exhaustion and failure [78]. Studies in HHEX-null mice have implicated HHEX in β-cell differentiation and function and have shown failure of ventral pancreas development [79]. However, the exact mechanisms through which TCF7L2, CDKN2A/2B, and HHEX exert their effects on T2D are still unclear. Additionally, a recent study by Mohan et al. [80] has suggested a predominant role of β-cell dysfunction, rather than insulin resistance, in the pathogenesis of T2D among Asian Indian youth.
The other variants genotyped in the present study, rs6769511, rs1470579, and rs4402960 of IGF2BP2, rs7607980 of COBLL1, and rs4812829 of HNF4A, showed only a nominal association (p < 0.05) with EOT2D in this South Indian population, with power ranging from 34% to 46%. The merely nominal association of these variants may be due to the relatively small sample size, which is one of the major limitations of this study. Moreover, we have examined only 25 gene variants for association with EOT2D in the present study, out of the several hundred gene variants identified with LOT2D.
The significance of EOT2D is that, owing to the earlier onset of diabetes, these individuals are at increased susceptibility to complications of diabetes, including neuropathy, retinopathy, and cardiovascular disease, compared with individuals with LOT2D. Our results highlight the need for larger prospective studies to identify the effect of genetic variants implicated in the development of EOT2D. To conclude, the present study is the first to confirm the association of these gene variants with EOT2D in the South Indian population, and it shows the importance of the HHEX variant in EOT2D.
Notes
Authors’ Contribution
Conceptualization: VR. Data curation: VR. Formal analysis: SL. Funding acquisition: VM, VR. Methodology: MC, SL. Writing – original draft: VR, SL. Writing – review & editing: VR, VM.
Conflicts of Interest
No potential conflict of interest relevant to this article was reported.
Acknowledgements
This study was supported by funding from the Indian Council of Medical Research and the University of Minnesota through the project ‘Diabetes: Genetic Susceptibility in the Asian Indian Population’ (RHN/Adhoc/19/2011-2012 dated 13.7.2011). We thank Dr. Myron Gross (University of Minnesota) for all the support. LS was supported by a CSIR senior research fellowship.
Supplementary Materials
Supplementary data can be found with this article online at http://www.genominfo.org.
Comparison of the MAF of the genotyped SNPs with the 1000 Genomes Project database