### Introduction

### Methods

### Korea Association Resource Project

^{-6} and genotype call rates less than 95%, and with the exclusion of SNPs with a minor allele frequency < 0.05, a total of 305,799 autosomal SNPs were used in this analysis. After excluding samples with low call rates (less than 96%), contamination, gender inconsistency, serious concomitant illness, or cryptic relatedness, 8,842 samples (4,183 males and 4,659 females) remained. Since our study focused on T2D, we retained only T2D patients and controls, excluding 3,863 samples according to the T2D diagnostic criteria summarized in Table 1 [24]. Table 2 presents the demographic information of the participants and the differences in demographic variables between cases and controls.
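The SNP-level thresholds stated above (genotype call rate of at least 95% and minor allele frequency of at least 0.05) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the toy genotypes, and the use of `None` for a missing call are all assumptions made for illustration, and the additional filter whose threshold is truncated in the text is deliberately left out.

```python
from typing import Dict, List, Optional

# Genotypes are coded as minor-allele counts (0, 1, 2); None marks a missing call.
Genotypes = List[Optional[int]]


def call_rate(genotypes: Genotypes) -> float:
    """Fraction of samples with a non-missing genotype call."""
    return sum(g is not None for g in genotypes) / len(genotypes)


def minor_allele_frequency(genotypes: Genotypes) -> float:
    """Allele frequency among called genotypes, folded so it is at most 0.5."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))  # two alleles per sample
    return min(freq, 1.0 - freq)


def filter_snps(
    snps: Dict[str, Genotypes],
    min_call_rate: float = 0.95,  # threshold taken from the text
    min_maf: float = 0.05,        # threshold taken from the text
) -> List[str]:
    """Return the names of SNPs that pass both thresholds."""
    return [
        name
        for name, genotypes in snps.items()
        if call_rate(genotypes) >= min_call_rate
        and minor_allele_frequency(genotypes) >= min_maf
    ]


# Hypothetical toy data: 20 samples per SNP.
example_snps: Dict[str, Genotypes] = {
    "snp_ok": [0, 1, 2, 1, 0, 1, 0, 0, 1, 2] * 2,  # MAF 0.4, no missing calls
    "snp_rare": [0] * 19 + [1],                    # MAF 0.025, too rare
    "snp_missing": [1] * 18 + [None] * 2,          # call rate 0.90, too low
}

print(filter_snps(example_snps))  # ['snp_ok']
```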

### Statistical analysis

### Propensity score matching

*MatchIt*: largest, smallest, and random [26]. The ‘largest’ method establishes matches from the largest to the smallest value of a distance measure, the ‘smallest’ method generates matches from the smallest to the largest value, and the ‘random’ method yields matches in random order. PSM was applied to the KARE data to ensure homogeneity of demographic variables (covariates) between the control and T2D groups, using the R package *MatchIt*.
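To make the effect of the matching order concrete, the following Python sketch implements greedy 1:1 nearest-neighbor matching without replacement under the three orders described above. It is a simplified illustration, not the *MatchIt* implementation: the function name `greedy_match`, the use of a propensity score as the distance measure, and the toy scores are assumptions.

```python
import random
from typing import Dict, List, Tuple


def greedy_match(
    case_scores: Dict[str, float],
    control_scores: Dict[str, float],
    order: str = "largest",
    seed: int = 0,
) -> List[Tuple[str, str]]:
    """Greedy 1:1 nearest-neighbor matching without replacement.

    Cases are processed in the requested order; each case takes the
    still-unmatched control whose score is closest to its own.
    """
    case_ids = list(case_scores)
    if order == "largest":
        case_ids.sort(key=lambda i: case_scores[i], reverse=True)
    elif order == "smallest":
        case_ids.sort(key=lambda i: case_scores[i])
    elif order == "random":
        random.Random(seed).shuffle(case_ids)
    else:
        raise ValueError(f"unknown order: {order!r}")

    available = dict(control_scores)  # controls not yet matched
    pairs: List[Tuple[str, str]] = []
    for case_id in case_ids:
        if not available:
            break  # more cases than controls
        target = case_scores[case_id]
        nearest = min(available, key=lambda k: abs(available[k] - target))
        pairs.append((case_id, nearest))
        del available[nearest]
    return pairs


# Hypothetical propensity scores for two cases and two controls.
cases = {"c1": 0.80, "c2": 0.60}
controls = {"k1": 0.70, "k2": 0.30}

print(greedy_match(cases, controls, "largest"))   # [('c1', 'k1'), ('c2', 'k2')]
print(greedy_match(cases, controls, "smallest"))  # [('c2', 'k1'), ('c1', 'k2')]
```

In this toy example the processing order changes which case receives the well-matched control `k1`, which is why the choice of order can affect covariate balance after matching.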

### SNP sets

### Variable selection

$\pi_i$ is the probability of T2D for sample $i$ ($1 \le i \le n$), and $n$ denotes the number of samples. $x_{ij}$ represents the SNPs ($1 \le i \le n$, $1 \le j \le p$), coded as 0, 1, or 2 for the number of minor alleles, and $p$ denotes the number of SNPs used in the model. Stepwise selection was used to maximize the AUC by updating variables step by step. Since age, BMI, and sex are known demographic and prognostic variables of T2D, we fixed these three variables during the stepwise process. This procedure was performed using the R package *MASS* [27].
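The idea of selecting SNPs step by step while keeping age, BMI, and sex fixed can be sketched as a forward selection loop around a pluggable scoring function. This Python sketch is only an illustration under stated assumptions: it covers the forward direction only, it does not reproduce the *MASS* routine, and `forward_select`, `toy_auc`, and the gain values are hypothetical stand-ins for fitting a logistic model and computing its AUC.

```python
from typing import Callable, List, Sequence


def forward_select(
    fixed: Sequence[str],
    candidates: Sequence[str],
    score: Callable[[List[str]], float],
    min_gain: float = 0.0,
) -> List[str]:
    """Forward selection that never drops the fixed variables.

    At each step the candidate giving the highest score is added;
    the loop stops when no candidate improves the score by more
    than `min_gain`.
    """
    selected = list(fixed)
    remaining = list(candidates)
    best = score(selected)
    while remaining:
        trials = [(score(selected + [c]), c) for c in remaining]
        top_score, top_var = max(trials, key=lambda t: t[0])
        if top_score - best <= min_gain:
            break
        selected.append(top_var)
        remaining.remove(top_var)
        best = top_score
    return selected


# Hypothetical scorer: a baseline AUC plus a fixed gain per SNP.
# A real scorer would fit a model on these variables and return its AUC.
_toy_gains = {"snp_a": 0.05, "snp_b": 0.02, "snp_c": 0.0}


def toy_auc(variables: List[str]) -> float:
    return 0.60 + sum(_toy_gains.get(v, 0.0) for v in variables)


chosen = forward_select(
    fixed=["age", "bmi", "sex"],
    candidates=["snp_c", "snp_a", "snp_b"],
    score=toy_auc,
)
print(chosen)  # ['age', 'bmi', 'sex', 'snp_a', 'snp_b']
```

Passing the scorer as a function keeps the selection logic separate from the model being fitted, so the same loop could be driven by a cross-validated AUC instead of the toy lookup used here.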

### Prediction models

*lambda.min*, which is the value at which the mean cross-validated error is smallest [28]. For EN, we selected the *λ* value to be *lambda.1se* in the *glmnet* package. Each prediction model was evaluated in terms of the test-set AUC.
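The two *λ* choices named above follow a simple rule over the cross-validation curve: *lambda.min* is the value with the smallest mean cross-validated error, and *lambda.1se* is the largest value whose error lies within one standard error of that minimum. The Python sketch below illustrates the rule on made-up numbers; `pick_lambdas` and the toy curve are assumptions, and it is assumed that the error measure is one to be minimized.

```python
from typing import Sequence, Tuple


def pick_lambdas(
    lambdas: Sequence[float],
    cv_mean: Sequence[float],
    cv_se: Sequence[float],
) -> Tuple[float, float]:
    """Return (lambda_min, lambda_1se) from a cross-validation curve.

    cv_mean[i] and cv_se[i] are the mean cross-validated error and its
    standard error at lambdas[i].
    """
    i_min = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    threshold = cv_mean[i_min] + cv_se[i_min]
    # Largest (most regularized) lambda still within one SE of the minimum.
    lambda_1se = max(
        lam for lam, err in zip(lambdas, cv_mean) if err <= threshold
    )
    return lambdas[i_min], lambda_1se


# Hypothetical cross-validation curve.
toy_lambdas = [1.0, 0.5, 0.1, 0.01]
toy_cv_mean = [0.30, 0.22, 0.20, 0.21]
toy_cv_se = [0.02, 0.02, 0.03, 0.03]

print(pick_lambdas(toy_lambdas, toy_cv_mean, toy_cv_se))  # (0.1, 0.5)
```

Because *lambda.1se* is never smaller than *lambda.min*, it applies at least as much shrinkage and typically yields a sparser model at a small cost in cross-validated error.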

### Results

### Propensity score matching

### Model prediction

### Discussion

*JAZF1*, *KCNJ11*, and *KCNQ1* were previously shown to be related to insulin secretion [29]. In addition, *IGF2BP2* and *CDKAL1* were reported to be associated with reduced beta-cell function [20]. Both insulin secretion and beta-cell function play important roles in T2D.