### Introduction

### Methods

BIC *f (n, k, L)*, AIC *f (k, L)*, and AICc *f (AIC, k, n)* are as given in Eqs. (1), (2), and (3).

*L* = the maximized value of the likelihood function of model M,

*n* = number of data points,

*K* = number of parameters estimated by the model.

Frequencies and transition/transversion bias were also evaluated for 24 different nucleotide substitution models. The biological data were simulated to estimate the probability rate of substitution (*r*) using the ML method for the different nucleotide substitution models. Similarly, the database was used to assess the nucleotide base frequencies for each sequence, as well as an overall average, to assess the extent of relation.
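The dependence of the three criteria on *n*, *K*, and *L* can be illustrated with a minimal Python sketch. It assumes the standard textbook definitions (AIC = 2K − 2 ln L, BIC = K ln n − 2 ln L, AICc = AIC + 2K(K + 1)/(n − K − 1)), which match the stated functional forms but are not reproduced from Eqs. (1) to (3) here. The function names, and the choice to pass ln *L* rather than *L*, are assumptions made for illustration.

```python
import math


def aic(log_l: float, k: int) -> float:
    """Akaike information criterion from the log-likelihood ln L and K parameters."""
    return 2 * k - 2 * log_l


def bic(log_l: float, k: int, n: int) -> float:
    """Bayesian information criterion; n is the number of data points."""
    if n <= 0:
        raise ValueError("n must be positive")
    return k * math.log(n) - 2 * log_l


def aicc(aic_value: float, k: int, n: int) -> float:
    """Small-sample corrected AIC, written as a function of (AIC, K, n)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > K + 1")
    return aic_value + (2 * k * (k + 1)) / (n - k - 1)
```

In all three criteria a lower score indicates a better trade-off between fit and the number of parameters, which is why the model with the lowest score is treated as the most appropriate.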

### Results and Discussion

### ML of different nucleotide substitution models

*K* = 11 shown in Fig. 1. With the addition of rate variation across sites (+G), the GTR + G model shows slightly higher BIC and AIC scores than GTR. With the further addition of a proportion of invariable sites (+I) and/or rate variation across sites (+G), the GTR + G + I model (*K* = 13) shows a 0.0072% increase in BIC score and a 0.00144% increase in AIC. The HKY model (*K* = 7) has the lowest BIC (347473) and AICc (347405) values, though these are still higher than those of the most appropriate model, GTR. Similarly, the simulated result for the HKY + I + G model (*K* = 9) shows higher scores than the base model. The JC + G + I and K2 + I (*K* = 5) models have the highest BIC and AICc scores. The deviation between the GTR and K2 + I models is 1.49% for BIC and 1.50% for AICc.

Nucleotide frequencies (*f*) and rates of base substitution (*r*) are also key factors in justifying the best nucleotide substitution model using the ML technique. For the biological data of SARS-CoV-2, SARS-CoV, and MERS-CoV, the nucleotide frequencies predicted by the GTR model are A = 0.28, U = 0.317, C = 0.195, and G = 0.207. The nitrogenous base frequencies remain constant across the first 12 models: GTR, GTR + G, GTR + G + I, HKY, TN93, HKY + G, TN93 + G, TN93 + G + I, HKY + G + I, GTR + I, HKY + I, and TN93 + I. The nucleotide frequencies for the T92, T92 + G, T92 + G + I, and T92 + I models (A = 0.299, U = 0.299, C = 0.201, G = 0.201) also remain steady, but differ from those of the preceding models. The JC, JC + G, K2, K2 + G, K2 + G + I, JC + I, JC + G + I, and K2 + I models share the same frequency of 0.25 for all nitrogenous bases, as shown in Fig. 2.

Base substitution rates also depend on the nucleotide substitution model; in the GTR model, the r(AU), r(UA), r(CA), and r(GA) substitutions dominate. Fig. 3 shows the minimum and maximum rates of base substitution irrespective of model, which are as follows: r(AU 0.077, 0.122), r(AC 0.05, 0.084), r(AG 0.77, 0.107), r(UA 0.074, 0.115), r(UC 0.06, 0.101), r(UG 0, 0.086), r(CA 0.073, 0.126), r(CU 0.079, 0.119), r(CG 0.05, 0.124), r(GA 0.079, 0.132), r(GU 0, 0.099), and r(GC 0.05, 0.086).
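As a rough illustration of per-sequence base frequencies and their overall average, the sketch below counts bases directly. It gives empirical frequencies, not the model-based ML estimates reported above, and the function names are hypothetical. T is mapped to U to follow the A/U/C/G notation used here, and ambiguous or gap characters are ignored.

```python
from collections import Counter

BASES = "AUCG"


def base_frequencies(seq: str) -> dict[str, float]:
    """Empirical frequency of each base in one sequence (T is treated as U)."""
    normalized = seq.upper().replace("T", "U")
    counts = Counter(b for b in normalized if b in BASES)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/U/C/G bases")
    return {b: counts[b] / total for b in BASES}


def average_frequencies(seqs: list[str]) -> dict[str, float]:
    """Unweighted mean of the per-sequence frequencies."""
    if not seqs:
        raise ValueError("at least one sequence is required")
    per_seq = [base_frequencies(s) for s in seqs]
    return {b: sum(f[b] for f in per_seq) / len(per_seq) for b in BASES}
```

The average here weights each sequence equally regardless of length; pooling all bases before counting would be an equally reasonable alternative.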

### ML estimation of substitution matrix and transition/transversion bias

*R*) using ML depends upon the base frequency parameters and nucleotide substitution models. The base frequency parameters are Π_{A} = Π_{C} = Π_{G} = Π_{U} = 1/4 for the JC and K2 models, whereas for the GTR, HKY, TN93, and T3 models all Π_{i} are free parameters. Six different nucleotide substitution models were simulated for biological sequence data of SARS-CoV, MERS-CoV, and SARS-CoV-2.
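How the base frequency parameters enter a substitution rate matrix can be sketched with the standard HKY-style construction, in which the rate from base x to base y is proportional to Π_y, scaled by a factor for transitions. This is an illustrative sketch and not the estimation procedure used for the results here; `kappa` and the function names are assumptions, and the matrix is left unnormalized.

```python
BASES = "AUCG"
PURINES = {"A", "G"}


def is_transition(x: str, y: str) -> bool:
    """True for purine<->purine or pyrimidine<->pyrimidine changes."""
    return x != y and ((x in PURINES) == (y in PURINES))


def hky_rate_matrix(pi: dict[str, float], kappa: float) -> dict[str, dict[str, float]]:
    """Unnormalized HKY-style rate matrix; each row sums to zero."""
    q: dict[str, dict[str, float]] = {x: {} for x in BASES}
    for x in BASES:
        for y in BASES:
            if x != y:
                scale = kappa if is_transition(x, y) else 1.0
                q[x][y] = scale * pi[y]
        # Diagonal entry makes the row sum to zero.
        q[x][x] = -sum(q[x][y] for y in BASES if y != x)
    return q


# Equal frequencies (1/4 each) give a K2-style matrix; kappa = 1 then gives JC.
equal_pi = {b: 0.25 for b in BASES}
k2_like = hky_rate_matrix(equal_pi, kappa=2.0)
```

Fixing all frequencies at 1/4 removes the frequency parameters from the model, which is why JC and K2 have fewer parameters than models with free base frequencies.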

### Conclusions

*+I*). A 0.03% difference between the BIC and AIC scores for the GTR model, at a penalty parameter of 11, signifies that SARS-CoV-2 is closely related to both the SARS-CoV and MERS-CoV virus strains. The base frequencies of all 24 substitution models except JC and K2 are the same owing to free exchangeability; as a result, the parameter trends observed for JC and K2 differ from those of the other substitution models. The results also indicate the close proximity of SARS-CoV-2 to SARS-CoV and MERS-CoV. The probability rates of substitution confirm that transitional substitutions dominate in all genomic sequences (NC_045512.2, NC_019843.3, and FJ588686.1), because two out of three single nucleotide polymorphisms retained in SARS-CoV, MERS-CoV, and SARS-CoV-2 are transitions. The low nucleotide frequencies (0-0.35) and substitution rates (0-0.18) in all nucleotide substitution models support the closeness among the virus strains. The simulated transition/transversion bias for the 1st + 2nd + 3rd + noncoding positions reflects positive evolution, which points towards nonsynonymous substitutions. The A-T (62.14%) and G-C (37.86%) nucleobase frequencies for SARS-CoV-2 are evidence of variation in its genome with respect to SARS-CoV and MERS-CoV. The G-C frequencies are 5.86% higher in SARS-CoV and MERS-CoV, and the A-T frequencies are 5.86% higher in SARS-CoV-2. The closeness of the nucleobase frequencies also supports the conclusion that SARS-CoV-2 closely resembles SARS-CoV and MERS-CoV.