Molecular genetic diversity and bioinformatic analysis of Leucocytozoon sabrazesi based on the mitochondrial genes cytb, coxI and coxIII and co-infection of Plasmodium spp.

Leucocytozoon sabrazesi is an intracellular haemoprotozoan parasite responsible for leucocytozoonosis, which is transmitted by insect vectors and affects chickens in tropical and subtropical areas in many countries. It causes huge economic losses due to decreased meat and egg production. In the present study, we used nested PCR to determine the genetic diversity of L. sabrazesi based on the cytb, coxI, coxIII and concatenated genes in chickens in Thailand. In addition, we found co-infections between L. sabrazesi and Plasmodium spp. (P. gallinaceum or P. juxtanucleare) in chickens that were not identified by microscopic examination of blood smears. The phylogenetic analysis indicated that L. sabrazesi cytb and coxIII genes were conserved with similarity ranging from 99.9 to 100% and 98 to 100%, respectively whereas the coxI gene was diverse, with similarities ranging from 97 to 100%. These findings ascertained the nucleotide analysis of the cytb, coxI, coxIII and concatenated sequences in which 4, 8, 10 and 9 haplotypes were found, respectively. In addition, it was found that the large number of synonymous substitutions and conservative amino acid replacements in these mitochondrial genes occurred by non-synonymous substitution. The evolutionary analysis of the Ka/Ks ratio supported purifying selection and the negative values of both Fu’s Fs and Tajima’s D indicate selective sweep especially for the coxI gene. The entropy and Simplot analysis showed that the genetic variation in populations of Plasmodium spp. was higher than in Leucocytozoon. Hence, the nucleotide sequences of three mitochondrial genes could reflect the evolutionary analysis and geographic distribution of this protozoan population that switches hosts during its life cycle.

A conventional diagnosis of Leucocytozoon infection is based on microscopic examination of the gametocytes in Giemsa-stained blood smears of the infected chickens. Currently, polymerase chain reaction (PCR) may be more reliable and widely used to diagnose the infection and be supplemented by the standard parasitological method, especially in the laboratory for high sensitivity and specificity even when blood smears are negative with low parasitemia. Although the diversity of hemosporidian parasites has been demonstrated based on mitochondrial genes, such as cytochrome b (cytb), in ecological and evolutionary studies [4,17], there is little information about the genetic diversity of L. sabrazesi isolates in Thailand with Plasmodium spp. co-infection when using mitochondrial genes (cytb, coxI and coxIII). Therefore, this study aimed to investigate the mitochondrial genetic diversity of L. sabrazesi and Plasmodium spp. coinfections in chickens in Thailand at these three loci, including phylogenetic and biogeographic relationships. In addition, the phylogenetic relationship, haplotype diversity, entropy, and geographic and evolutionary distribution among the isolates identified in this work and those from other countries are presented.

Ethics statement
Experimentation on animals was carried out under the following approval and permit from the Animal Care and Use Committee (IMBMU-ACUC), Institute of Molecular Biosciences, Mahidol University, Thailand. All suitable international, national and/or institutional guidelines for animal care and use were followed. Also, we received consent to collect chicken blood samples at the animal farm.

Ficoll density gradient centrifugation
For Giemsa-stained blood smears, elongated gametocytes of Leucocytozoon sabrazesi were detected in 30 chicken blood samples. The blood samples were diluted with 0.1 M phosphatebuffered saline (PBS), pH 7.4 and overlayed with Ficoll-Paque (Sigma-Aldrich, Burlington, MA, USA). They were centrifuged at 400 Âg for 30 min at 25°C. The gametocytes were gently harvested by inserting the pipette directly through the upper layer and later washed twice in the PBS solution.

Leucocytozoon sabrazesi DNA extraction
The genomic DNA of L. sabrazesi in blood samples was extracted by using an E.Z.N.A. Ò Tissue DNA Kit (OMEGA Bio-Tek, Norcross, GA, USA) following the protocol of Watthanadirek et al. [42,43] and Junsiri [22] with some modifications. Briefly, 250 lL of blood samples were mixed thoroughly with 25 lL of proteinase K solution, and incubated at 70°C for 10 min. Then, 250 lL of absolute ethanol were added and all lysates transferred to the HiBind Ò DNA Mini Column. The lysates were centrifuged at maximum speed for 1 min before adding 500 lL of HBC buffer. After adding 700 lL of DNA washing buffer, the genomic material was eluted with 50 lL of elution buffer. Finally, the extracted DNA solutions were stored at À20°C until further use.
Cloning and sequencing of the L. sabrazesi cytb, coxI and coxIII genes The PCR products were purified using a PureDireX PCR Clean-Up & Gel Extraction Kit. The 5 0 blunt end of purified PCR products was ligated into a pET100/D-TOPO Ò vector (Invitrogen Life Technologies, Carlsbad, CA, USA). The ligation products were heat-shocked and transformed into chemically competent Escherichia coli host strain TOP10 cells (Invitrogen Life Technologies). The transformed E. coli cultures were spread on Luria-Bertani (LB) agar plates containing 100 lg of ampicillin and incubated at 37°C for 16 h. The positive bacterial colonies were picked and cultured in LB media containing ampicillin with shaking at 37°C for 16 h. The plasmids were extracted from bacterial cultures using an AxyPrep Plasmid Miniprep Kit (Axygen Bioscience, Union City, CA, USA) before sequencing.

Sequence and in silico analysis
The presence of cytb, coxI and coxIII inserts was confirmed by Sanger sequencing. All sequences were submitted and deposited in the National Center for Biotechnology Information (NCBI) GenBank database. The sequences were also analyzed by BLAST (https://blast.ncbi.nlm.nih.gov). All nucleotide and amino acid sequences were analyzed by the computer programs MEGA 7.0.26 [24] and Jalview [10]. For nucleic acid substitution analysis, nucleotide diversity was determined using DnaSP software, V.6.0 [26]. All base substitutions were determined as synonymous and nonsynonymous substitutions in nucleotides and amino acid sequences were assessed using PRO-VEAN analysis [9] as compensation of physicochemical properties of amino acid replacement. In addition, the haplotype analysis was determined through DnaSP software, V.6.0 [26] before visualization of the mutational occurrence of haplotypes from different geographic distribution, and the relationships among haplotypes were visualized with a TCS network in the popART program [25].

Multiple sequence alignment and phylogenetic analysis
The cytb, coxI and coxIII sequences were employed for sequence alignment and phylogenetic analysis. Multiple sequence alignments were conducted with the MUSCLE algorithm [12]. All aligned DNA sequences were used to construct the molecular phylogenetic trees using neighbour-joining (NJ), maximum likelihood (ML), maximum parsimony (MP) and Bayesian analysis (BA) [19]. The reliability of the internal branching pattern of the phylogenetic tree was determined in each clade by statistical calculation of 1000 replicates using the bootstrapping method [13] and MrBayes program for posterior probability. The evolutionary distances were evaluated by the Kimura 2-parameter method [23]. Similarity (as a percentage) was also analyzed by using a sequence identity matrix in BioEdit software V.7.0.5.3 [16].

Determination of L. sabrazesi mitochondrial gene sequences
The DNA sequences of L. sabrazesi cytb, coxI and coxIII were partially amplified by nested PCR. The quality of PCR products was evaluated by the ratio of optical density (OD 260/280 ) of 1.8-2.0, which showed no contamination of the products. The lengths of cytb, coxI and coxIII sequences Thailand strain were 248, 588 and 294 bp, respectively. All DNA sequences of L. sabrazesi investigated in this study were submitted and deposited in the NCBI GenBank database (https://www.ncbi.nlm.nih.gov/genbank/) under accession numbers MZ634375 to MZ634390 for the cytb gene, MZ634391 to MZ634403 for the coxI gene, and MZ634404 to MZ634417 for the coxIII gene (Table 1).

Phylogenetic analysis
The L. sabrazesi cytb sequences obtained in this work were aligned with other sequences retrieved from GenBank including sequences from Thailand, Malaysia, Myanmar, China, USA, Uganda, Congo, Sri Lanka, Brazil, Philippines, UK and Japan. Our sequences detected in this study were positioned in the same clade as L. sabrazesi (Fig. 1). The Thailand coxI sequences were determined in the different clades in the phylogenetic tree together with other sequences of P. gallinaceum and P. juxtanucleare (Fig. 2), while the phylogenetic tree constructed from coxIII sequences was positioned in the same clade as L. sabrazesi (Fig. 3). Not only the phylogenetic tree constructed from each mitochondrial sequence, but also the concatenated genes from all mitochondrial sequences were used to construct the phylogenetic tree, which showed that our sequences were grouped and positioned in the same clade as L. sabrazesi (Fig. 4). Moreover, the reliability of bootstrap frequencies and Bayesian posterior probabilities of all phylogenies are displayed with the highest values on each branch.

Entropy analysis
The similarity analysis from Simplot showed higher nucleotide variation in Plasmodium spp. than in L. sabrazesi. The entropy analysis of the cytb, coxI and coxIII genes showed more variation of nucleic acid sequences than amino acid sequences. To analyze nucleic acid entropy, cytb, coxI and coxIII sequences showed 81 peaks with entropy values ranging from 0.11691 to 0.93764, 174 peaks with entropy values ranging from 0.13579 to 1.06709, and 125 peaks with entropy values ranging from 0.14614 to 0.94469, respectively. Entropy analysis of amino acid sequences exhibited that the charts showed 24 peaks with entropy values ranging from 0.11691 to 1.05331 for cytb, 62 peaks with entropy values ranging from 0.13579 to 1.61397 for coxI, and 46 peaks with entropy values ranging from 0.14614 to 1.18722 for coxIII (Fig. 5). The coxI gene was found to be more diverse than cytb and coxIII and this is consistent with multiple sequence alignment which showed more similarity among amino acid sequences than nucleic acid sequences ( Supplementary Figs. 1-3). The nucleic acid variation from multiple sequence alignment correlated to high nucleic acid diversity in the coxI gene caused by nucleic acid sequences of L. sabrazesi. Besides the coxI gene, both the cytb and coxIII genes exhibited higher genetic diversity in Plasmodium spp. than in L. sabrazesi (Tables 2  and 3).

Nucleic acid substitution analysis
Each nucleic acid substitution of cytb, coxI and coxIII was validated as transition from purine to purine and from pyrimidine to pyrimidine. In addition, the percentage of base composition of these genes indicated the number of A and T bases greater than G and C contents. However, most base substitutions were indicated as the synonymous substitutions (Fig. 6). Moreover, the synonymous frequency (K s ) of these genes was higher than non-synonymous frequency values (K a ). The K a /K s ratios of cytb, coxI, coxIII and concatenated genes were 0.13, 0.168, 0.227 and 0.181, respectively (Table 2). While all results of the evolutionary estimation of Tajima's D values exhibited minus values, only coxI showed statistical significance, which determined an excess of low frequency polymorphisms relative to expectations under the neutral model of evolution (p < 0.10) ( Table 2). In addition to Tajima D values, the Fu's Fs statistic based on the distribution of haplotypes displayed minus values, indicating an excess of rare haplotypes over what would be expected under neutrality; especially coxI exhibited significant negative values of both Tajima D and Fu's Fs statistic (p < 0.10) ( Table 2). Each base nonsynonymous substitution was analyzed in regards to the compensation of physicochemical properties of amino acid replacement. The cytb gene was found to have two positions of hydrophobic amino acid replacement from I15V and L66V. In the case of coxI, there were five amino acid replacements in the L. sabrazesi population, including R6I, Y32C, K56N, S99T and A113V, while Plasmodium spp. were found to have 23 amino acid substitutions, including R6K, Y32K, R50 K, N58T, N58K, N61K, K66I, L67H, I71M, S73F, L74F, F81L, C93W, P95S, K97E, P99A, K102R, I103L, Q111H, G116E, L117F, F119I, P123A, S123A, F126C and F126Y. The coxIII gene was found to have six amino acid replacements, including T4H, T4P, L5I, L32I, S55F, I69T and I79S. However, all amino acid replacements exhibiting the most conservative replacements occurred by non-synonymous substitution.

Haplotype diversity
The TCS Network tool was used to construct the haplotype network of the cytb, coxI and coxIII gene sequences of Leucocytozoon spp. and Plasmodium spp. The haplotype of each gene was estimated together with geographic distribution, consistently displaying high variation from multiple sequence alignment. The coxI gene showed a greater number of nucleotide variations and higher diversity than coxIII and cytb. However, L. sabrazesi harbored 4, 8 and 10 haplotypes of  cytb, coxI and coxIII, respectively. For L. sabrazesi cytb gene Thailand strain, our findings showed that most sequences are found in haplotype #1 and some sequences are found in haplotypes #3 and #4 obtained from Myanmar and Malaysia (Fig. 7, Tables 2 and 3). In the case of coxI, L. sabrazesi Thailand strain contained seven haplotypes, including haplotypes #1 to #5 and #10 to #11 formed the nearest clade with haplotype #14 of L. sabrazesi Malaysia strain. In addition to the coxI gene of L. sabrazesi, haplotype #9 of P. gallinaceum from Thailand formed the nearest branch to haplotype #30 of P. gallinaceum from the Philippines. Five haplotypes of P. juxtanucleare from Thailand, including haplotypes #6 to #8, #12 and #13 also formed the nearest branch to haplotype #28 of P. juxtanucleare from Japan (Fig. 8, Tables 2 and 3). Additionally, nine haplotypes of L. sabrazesi coxIII gene Thailand strain exhibited the nearest branch to haplotype #10 of L. sabrazesi Malaysia strain (Fig. 9, Tables 2 and 3). The concatenated gene comprising eight haplotypes in L. sabrazesi Thailand strain also grouped together with haplotype #9 in L. sabrazesi Malaysia strain (Fig. 10, Tables 2 and 3).

Discussion
Leucocytozoonosis caused by the hemoprotozoan L. sabrazesi is an important insect-borne disease of chickens  and causes high economic losses to chicken industries worldwide, including in Thailand. In general, genetic diversity is a survival strategy which is employed by parasites to evade the immune responses of avian hosts (chickens, ducks and birds) [5,37]. There have been studies of genetic diversity of Leucocytozoon sp. based on the mitochondrial gene sequences in several countries, and almost all of these studies focused on the cytb gene [8,18,29,44]. However, there has been no information available regarding the genetic diversity and phylogeny of L. sabrazesi mitochondrial genes in Thailand until now. In the present study, we used the cytb, coxI, coxIII and concatenated genes in the chicken population sampled in Thailand to ascertain the genetic diversity of L. sabrazesi and their co-infections in these regions.  The molecular detection and DNA sequencing displayed the highest similarity of both cytb and coxIII genes of L. sabrazesi. Interestingly, this is the first report of co-infection between L. sabrazesi and P. gallinaceum and that of L. sabrazesi and P. juxtanucleare in the leucocytes of chickens in Thailand. Notably, the coxI gene has the ability to cross-react and could be used to detect infection of L. sabrazesi and Plasmodium spp. Our findings are consistent with the report obtained by Pacheco et al. [32]. A phylogenetic analysis was carried out to display the relationship between individual and multilocus genes of mitochondria determining the detection of L. sabrazesi. Moreover, the coxI gene has been employed to detect te infection of P. gallinaceum and P. juxtanucleare in chickens from Bongti and Tha Sao districts in Kanchanaburi province located near the Chacheongsao province of Thailand which are reported about P. gallinaceum [34] and near at the border of Myanmar which are reported regarding P. juxtanucleare in chickens [44]. Regarding three mitochondrial nucleotide sequences, our results indicated the highest sequence similarity to L. sabrazesi and some co-infected with P. gallinaceum and P. juxtanucleare.
Genetic variation of three mitochondrial genes commonly occurred in Plasmodium spp., while coxI showed high genetic variation in Leucocytozoon spp. However, these genes were found to have higher transition than transversion rates, and caused mutational bias to high A-T content and were proned to express the evolutionary saturation for divergence of parasites, which are consistent with the analysis of hemosporidian mitochondrial genomes [33]. Moreover, the lack of mitochondrial sequences from Leucocytozoon spp. and Plasmodium spp. directly affected the evolutionary analysis. These genes displayed K a /K s ratios less than one and minus values, indicating purifying selection [30]. Tajima's D results indicated minus values, but only coxI indicated selective sweep, which was consistent with the negative value of Fu's Fs statistic which determined the population expansion under statistical significance [15]. In addition, the cytb and coxIII genes indicated minus values of K a /K s ratios that determined purifying selection, but both Tajima's D and Fu's Fs were negative and not significant, indicating neutrality or perhaps these values can result in indirect selection from balancing selection on a nearby locus (linked genes) [38]. All evolutionary analyses reflected that hemosporidian organisms passed through the important obstacle of evolution like genetic drift before performing population expansion later [7,14]. In addition, some variations affected haplotype distribution, which occurred from the polyphyletic relationship of genus Leucocytozoon spp. and likely displayed as an ancestor of avian parasites [27]. In addition, only partial nucleotide sequences exhibited the number of synonymous greater than non-synonymous substitution, and amino acid replacement caused by non-synonymous substitution did not show lethal effects to L. sabrazesi and mitochondrial genome variation caused by the host switching during their life cycle [33]. However, the number of non-synonymous substitutions affecting amino acid replacements exhibited a higher number of conservative than radical amino acid replacements, reflecting the purifying selection of mitochondrial genes [11]. In addition to nucleotide substitution, the non-synonymous substitutions which caused the amino acid substitutions were estimated concerning the compensation of amino acids by physicochemical properties through PROVEAN program. We found that all amino acid substitutions did not affect their function based on comparisons from the NCBI database. Moreover, the multiple amino acid sequence alignment also consistently displayed compensation of physicochemical properties of amino acid replacement through BLOSUM 62 score and exhibited a higher number of conservative than radical amino acid replacements with the dark blue color ( Supplementary  Figs. 1B, 2B, 3B).
In this study, the entropy and multiple sequence analysis showed the genetic variation in Plasmodium spp. to be greater than in Leucocytozoon spp. This indicated greater diversification of malaria parasites and the paraphyletic relationship among avian hemosporidians [31]. However, Leucocytozoon spp. displayed a higher number of haplotypes than Plasmodium spp., these values were affected in populations of Leucocytozoon sp. but not L. sabrazesi [37]. Similarly, haplotype diversity indicated the close genetic relationship among L. sabrazesi. detected in Thailand, Malaysia and Myanmar [44].

Conclusions
This study is the first report on the genetic diversity of L. sabrazesi based on the mitochondrial genes including cytb, coxI, coxIII and concatenated sequences in Thailand. The co-infection between L. sabrazesi either P. gallinaceum or P. juxtanucleare in chickens in Thailand was investigated. The advantage of cross-PCR amplification of the coxI gene is that it can discriminate co-infection, which is not verified by microscopic examination. Even though the phylogenetic relationship and evolutionary distribution showed high genetic variation and haplotype diversity in the coxI, coxIII and cytb genes, they still indicated purifying selection, which occurred together with population expansion after genetic drift events in switching-host hemosporidian populations. These findings could help to improve the understanding of molecular phylogenetics and diversity among these mitochondrial sequences of L. sabrazesi Thailand strain. Our findings could therefore be beneficial for the development of immunodiagnostic tools and vaccine strategies for chicken leucocytozoonosis.

Supplementary materials
The Supplementary materials of this article are available at https://www.parasite-journal.org/10.1051/parasite/2022022/olm Supplementary Figure 1. Alignment of nucleic acid sequences of the cytb gene among L. sabrazesi and Plasmodium spp. The highest similarity of nucleotide positions is represented with dark blue color, while white color represents the least similarity of each nucleic acid position (A). Multiple amino acid sequence alignment of CYTb protein among L. sabrazesi and Plasmodium spp. The highest similarity of physicochemical properties (BLOSUM score 62) of each amino acid position is represented with blue color, while white color represents the least similarity of each amino acid position (B).
Supplementary Figure 2: Alignment of nucleic acid sequences of coxI gene among L. sabrazesi and Plasmodium sp.p The highest similarity of nucleotide positions is represented with dark blue color, while white color represents the least similarity of each nucleic acid position (A). Multiple amino acid sequence alignment of COXI protein among L. sabrazesi and Plasmodium spp. The highest similarity of physicochemical properties (BLOSUM score 62) of each amino acid position is represented with blue color, while white color represents the least similarity of each amino acid position (B).
Supplementary Figure 3: Alignment of nucleic acid sequences of coxIII gene among L. sabrazesi and Plasmodium sp. The highest similarity of nucleotide positions is representd with dark blue color, while white color represents the least similarity of each nucleic acid position(A). Multiple amino acid sequence alignment of COXIII protein among L. sabrazesi and Plasmodium spp. The highest similarity of physicochemical properties (BLOSUM score 62) of each amino acid position is represented with blue color, while white color represents the least similarity of each amino acid position (B). Table S1: Similarity of the cytb gene sequences of Leucocytozoon spp. and Plasmodium spp. as detected in chicken samples in Thailand compared with other sequences taken from GenBank. Table S2: Similarity of the cox I gene sequences of Leucocytozoon spp. and Plasmodium spp. as detected in chicken samples in Thailand compared with other sequences obtained from GenBank. Table S3: Similarity of the cox III gene sequences of Leucocytozoon spp. and Plasmodium spp. as detected in chicken samples in Thailand compared with other sequences taken from GenBank. Table S4: Similarity of the cytb amino acid sequences of Leucocytozoon spp. and Plasmodium spp. as detected in chicken samples in Thailand compared with other sequences taken from GenBank. Table S5: Similarity of the cox I amino acid sequences of Leucocytozoon spp. and Plasmodium spp. as detected in chicken samples in Thailand compared with other sequences taken from GenBank. Table S6: Similarity of the cox III amino acid sequences of Leucocytozoon spp. and Plasmodium spp. as detected in chicken samples in Thailand compared with other sequences taken from GenBank.