High-quality draft genome and characterization of commercially potent probiotic Lactobacillus strains

Lactobacillus acidophilus UBLA-34, L. paracasei UBLPC-35, L. plantarum UBLP-40, and L. reuteri UBLRU-87 were isolated from different varieties of fermented foods. To assess probiotic safety at the strain level, the whole genome of each strain was sequenced, assembled, and characterized. Both the core-genome and pan-genome phylogenies showed that L. reuteri was closer to L. plantarum than to L. acidophilus, which in turn was closest to L. paracasei. Genomic analysis of all four strains confirmed the absence of genes encoding putative virulence factors, antibiotic resistance genes, and plasmids.


Introduction
Lactobacilli are a group of Gram-positive, rod-shaped, microaerophilic, non-spore-forming, lactic acid-producing bacteria [1]. They are natural and significant inhabitants of the human gastrointestinal tract and constitute a major part of the oral and vaginal microbiomes [2][3][4][5]. Lactobacilli are the most common probiotics found in fermented food products, and awareness of probiotic benefits is growing rapidly. Commercially available Lactobacillus probiotic strains help to restore a gut microbiota that has been imbalanced by antibiotic treatment; however, the pathogenicity and efficacy of potential probiotics must be assessed to establish their safety. Here, we report the whole-genome sequences of four commercially potent probiotic Lactobacillus strains: Lactobacillus acidophilus UBLA-34, Lactobacillus paracasei UBLPC-35, Lactobacillus plantarum UBLP-40, and Lactobacillus reuteri UBLRU-87.
Lactobacillus strains were isolated from serially diluted fermented foods under anaerobic conditions at 37°C using MRS (deMan, Rogosa, and Sharpe) agar. Pure isolated colonies were cultured in MRS broth, and the cells were harvested for DNA isolation by the phenol-chloroform extraction method, followed by 16S rRNA gene amplification (using the primers 27F and 1429R) [6]. The identity of the strains was confirmed by sequencing the PCR amplicons and by phylogenetic analysis. High molecular weight genomic DNA of the identified strains was isolated by the method described above, and DNA fragments of 300 to 400 bp were generated by ultrasonication. The fragmented DNA was used to prepare a paired-end sequencing library with a Nextera DNA Flex Library preparation kit (Illumina, San Diego, CA, USA), and sequencing was performed on an Illumina NextSeq 500 System (Illumina).
Pan-genomic analysis of the Lactobacillus strains was performed to determine the conserved core and variable genes (Table 3) [14]. The estimated pan-genome size was 6,487, and the parameter 'b' was calculated to be 0.794494 (Fig. 1), indicating that the pan-genome is open. The highest number of new genes contributing to the pan-genome was observed for L. plantarum UBLP-40 (Table 3). The largest share of the core genome of the Lactobacillus strains consisted of genes related to metabolism, followed by genes related to information storage and processing, whereas the unique and accessory genes contained a larger proportion of poorly characterized genes than the core genome (Fig. 2). The core-genome and pan-genome phylogenies showed that L. reuteri is most closely related to L. plantarum, whereas L. paracasei is closest to L. acidophilus (Fig. 3).
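The link between the parameter 'b' and an open pan-genome can be illustrated with a short sketch. It assumes a power-law growth model of the form P(n) = a * n^b, a form commonly used for pan-genome curves, under which 0 < b < 1 means that the pan-genome keeps growing as genomes are added. The text above does not spell out the model or the fitting procedure, so the function names, the log-log least-squares fit, and the value of `hypothetical_a` are illustrative assumptions, and the curve below is synthetic rather than data from this study.

```python
import math


def fit_power_law(genome_counts, pan_sizes):
    """Fit P(n) = a * n**b by least squares on log(P) = log(a) + b*log(n)."""
    xs = [math.log(n) for n in genome_counts]
    ys = [math.log(p) for p in pan_sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                      # slope of the log-log line
    a = math.exp(mean_y - b * mean_x)  # intercept, transformed back
    return a, b


def is_open_pan_genome(b):
    """Under the assumed power-law model, 0 < b < 1 indicates an open pan-genome."""
    return 0.0 < b < 1.0


# Synthetic curve (NOT data from the study): built from the reported exponent
# and a hypothetical scale factor, only to show that the fit recovers b.
reported_b = 0.794494
hypothetical_a = 2000.0
counts = [1, 2, 3, 4]
sizes = [hypothetical_a * n ** reported_b for n in counts]

a_fit, b_fit = fit_power_law(counts, sizes)
print(f"fitted b = {b_fit:.6f}, open pan-genome: {is_open_pan_genome(b_fit)}")
```

With real data, `counts` and `sizes` would be replaced by the number of genomes sampled and the corresponding pan-genome sizes produced by the pan-genome analysis tool.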
All four Lactobacillus genomes were screened for genes encoding putative virulence factors such as hemolysin BL, non-hemolytic enterotoxin NHE, enterotoxin T, cytotoxin T, and cereulide [15], for antibiotic resistance genes [16], and for plasmids [17]. None of the genomes (UBLA-34, UBLPC-35, UBLP-40, and UBLRU-87) contained genes encoding putative virulence factors or antibiotic resistance, nor did any contain plasmids, including plasmids carrying antibiotic resistance genes. Detection of secondary metabolite-producing gene clusters was performed for all the Lactobacillus strains, based on hidden Markov model profiling of metabolite-producing genes [18].

Lactobacillus paracasei UBLPC-35
Two bacteriocin biosynthetic gene clusters were found in scaffold 1 (locations: 21,360-44,300 nt and 85,659-97,824 nt). Neither showed significant similarity to known gene clusters.

Lactobacillus plantarum UBLP-40
A bacteriocin biosynthetic gene cluster was found in scaffold 7 (location: 101,210-113,360 nt), and a terpene biosynthetic gene cluster was found in scaffold 12 (location: 77,136-92,747 nt). Neither showed significant similarity to known gene clusters.

Lactobacillus reuteri UBLRU-87
No secondary metabolite producing gene cluster was found.

Data Availability
The raw sequence reads have been submitted to the NCBI SRA, and the whole-genome shotgun projects have been deposited in DDBJ/EMBL/GenBank under the following accession numbers. Lactobacillus acidophilus UBLA-34: SRR7958229 and RBHY00000000 (the version described in this paper is RBHY01000000). Lactobacillus paracasei UBLPC-35: SRR8382560 and RCFI00000000 (the version described in this paper is RCFI01000000). Lactobacillus plantarum UBLP-40: SRR8382543 and RDEY00000000 (the version described in this paper is RDEY01000000). Lactobacillus reuteri UBLRU-87: SRR8382542 and RIAU00000000 (the version described in this paper is RIAU01000000).

Conflicts of Interest
No potential conflict of interest relevant to this article was reported.