Identification and expression profile of odorant-binding proteins in the parasitic wasp Microplitis pallidipes using PacBio long-read sequencing

Microplitis pallidipes Szépligeti (Hymenoptera: Braconidae) is an important parasitic wasp of second and third-instar noctuid larvae such as the insect pests Spodoptera exigua, Spodoptera litura, and Spodoptera frugiperda. As in other insects, M. pallidipes has a chemosensory recognition system that is critical to foraging, mating, oviposition, and other behaviors. Odorant-binding proteins (OBPs) are important to the system, but those of M. pallidipes have not been determined. This study used PacBio long-read sequencing to identify 170,980 M. pallidipes unigenes and predicted 129,381 proteins. Following retrieval of possible OBP sequences, we removed those that were redundant or non-full-length and eventually cloned five OBP sequences: MpOBP2, MpOBP3, MpOBP8, MpOBP10, and MpPBP, which were 429, 429, 459, 420, and 429 bp in size, respectively. Each M. pallidipes OBP had six conserved cysteine residues. Phylogenetic analysis revealed that the five OBPs were located on different branches of the phylogenetic tree. Additionally, tissue expression profiles indicated that MpOBP2 and MpPBP were mainly expressed in the antennae of male wasps, while MpOBP3, MpOBP8, and MpOBP10 were mainly expressed in the antennae of female wasps. MpOBP3 was also highly expressed in the legs of female wasps. Temporal profiles revealed that the expression of each M. pallidipes OBP peaked on a different day after emergence to adulthood. In conclusion, we identified five novel odorant-binding proteins of M. pallidipes and demonstrated biologically relevant differences in expression patterns.


Introduction
The chemosensory recognition systems of insects are critical to food-seeking, mating, parasitism, oviposition, and other behaviors [10,11]. These systems depend on several protein families, including odorant-binding proteins (OBPs), chemosensory proteins (CSPs), odorant receptors (ORs), ionotropic receptors (IRs), and gustatory receptors (GRs) [7,9]. Mechanistically, OBPs and CSPs transfer odorants to ORs, which recognize and convert chemical signals into electrical signals [21,33]. The water-soluble OBPs are highly concentrated in the sensillum lymph, functioning to recognize pheromones and chemical odorants in the environment [24,53]. First identified in antennae of male Antheraea polyphemus [46], OBPs have now been widely identified in Lepidoptera, Diptera, Hemiptera, and Coleoptera. OBPs are classified into two groups based on functional differences: general odorant-binding proteins (GOBPs, involved in the detection of "general" odors such as host plants or food) and pheromone-binding proteins (PBPs, involved in the perception of sex pheromones) [23,34]. Insect OBPs typically have a molecular weight of 15-20 kDa and are characterized by three disulfide bonds formed from six conserved cysteine residues [26,35].
The wasp Microplitis pallidipes Szépligeti (Hymenoptera: Braconidae), widespread in China, is an important parasite of second and third-instar noctuid larvae, including agricultural pests Spodoptera exigua, Spodoptera litura, and Mythimna separata. Field observations show that M. pallidipes parasitizes over 30% of S. exigua larvae [52]. Female wasps lay eggs in the body cavity of host caterpillars, and once hatched, M. pallidipes larvae obtain nutrients from their hosts, arresting host caterpillar development at the fourth instar stage. Through the action of OBPs, parasitoids recognize and solubilize volatile hydrophobic odor molecules in the environment to locate hosts for oviposition [37].
Recent developments in high-throughput transcriptome analysis have contributed greatly to entomology [6,58]. For example, single-molecule long-read sequencing technology from Pacific Biosciences (PacBio), which can sequence full-length cDNA molecules, has been applied to obtain whole transcriptomes of various species [3,8,18]. Methodologically, PacBio transcriptome sequencing can identify alternative isoforms and yields longer reads than Illumina and other second-generation sequencing techniques (SGS) [1]. Third-generation single-molecule sequencing such as the PacBio technology provides novel insights into transcriptome complexity, including complex alternative splicing, full-length splice variants, and alternative polyadenylation.
Here, we report the first transcriptome of whole-body adults of both sexes of M. pallidipes, generated with the specific goal of identifying OBPs and exploring their spatiotemporal expression profiles by qRT-PCR. We describe five OBPs and demonstrate differences in their expression patterns across tissues and across days after emergence to adulthood. Understanding M. pallidipes OBPs is potentially useful for developing effective attractants that can enhance the wasp's effectiveness as a biological control agent against insect pests.

Materials and methods
Insects at different days after emergence to adulthood
Microplitis pallidipes insects were obtained from the Institute of Eco-environmental Protection, Shanghai Academy of Agricultural Sciences, China. Parasitized Spodoptera exigua larvae were fed artificial food in addition to their natural food until they reached the second instar; adult parasitoids were then extracted. Wasps were fed with honey water (10%) and kept under the following environmental conditions: 27 ± 0.5°C, 85 ± 10% relative humidity, and a 12 h light/dark photoperiod [52].

RNA extraction and PacBio sequencing
Total RNA used for the PacBio RNA-seq was extracted from one-day-old adult female and male M. pallidipes (1:1 sex ratio, ~100 females and 100 males) using Trizol reagent. RNA quality and quantity were determined using gel electrophoresis and an Agilent 2100 Bioanalyzer (Agilent Technologies, Palo Alto, CA, USA), respectively. First-strand cDNA was synthesized from total RNA using PowerScript reverse transcriptase and Oligo-dT primer, following the manufacturer's protocol (TransGen Biotech, Beijing, China).

Analysis of PacBio sequencing reads
Raw PacBio polymerase reads with subreads ≥50 and a predicted consensus accuracy ≥0.8 were selected to produce reads of insert (ROIs). These included full-length (FL) and non-full-length (nFL) transcript sequences based on whether 5′/3′ cDNA primers and a poly(A) tail were simultaneously observed. To generate consensus sequences, isoform-level clustering was applied to FL transcripts via the Iso-Seq iterative clustering for error correction (ICE) algorithm. Finally, redundancy in consensus sequences was removed using the CD-HIT-EST suite (http://weizhong-lab.ucsd.edu/cdhit_suite/cgi-bin/index.cgi), yielding a full-length transcriptome of M. pallidipes.

Analysis and prediction of TFs, ORFs, and SSRs
Transcription factors (TFs) were identified using the Other Eukaryotes TFDB (http://bioinfo.life.hust.edu.cn/AnimalTFDB). For species not included in that database, TFs were identified using HMMSEARCH against Protein Family (Pfam) models. Open reading frames (ORFs) of unigenes were extracted after comparing annotations from NR and SwissProt; ORFs were then translated into protein sequences according to priority. For sequences without NR and SwissProt annotations, ORFs were predicted with TransDecoder. Next, MISA was used to search for simple sequence repeats (SSRs). The minimum numbers of repeats for one-base, two-base, and three-to-six-base units were 10, 6, and 5, respectively. Two SSRs were considered a single compound SSR if the distance between them was less than 100 bp.
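The SSR thresholds stated above can be illustrated with a short Python sketch. This is not the MISA implementation; it is a minimal scanner for perfect repeats written only to make the rules concrete. The function names (find_ssrs, merge_compound) and the 0-based, end-exclusive coordinates are assumptions of this sketch.

```python
# Minimal sketch of the SSR rules in the text (not MISA itself).
# Unit length (bp) -> minimum number of repeats, as stated in the text.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}
MAX_GAP = 100  # SSRs separated by fewer than 100 bp form one compound SSR
VALID = set("ACGT")


def _is_primitive(unit):
    """True if the unit is not itself a shorter motif repeated (e.g. 'AA', 'ATAT')."""
    n = len(unit)
    return not any(n % d == 0 and unit == unit[:d] * (n // d) for d in range(1, n))


def find_ssrs(seq):
    """Return sorted (start, end, unit, repeats) tuples; 0-based, end-exclusive."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        i = 0
        while i + k <= len(seq):
            unit = seq[i:i + k]
            if set(unit) <= VALID and _is_primitive(unit):
                reps = 1
                # Count how many times the unit repeats back to back.
                while seq[i + reps * k:i + (reps + 1) * k] == unit:
                    reps += 1
                if reps >= min_rep:
                    hits.append((i, i + reps * k, unit, reps))
                    i += reps * k
                    continue
            i += 1
    return sorted(hits)


def merge_compound(ssrs, max_gap=MAX_GAP):
    """Group sorted SSRs; an SSR joins the current group if the gap is below max_gap."""
    groups = []
    for ssr in ssrs:
        if groups and ssr[0] - max(s[1] for s in groups[-1]) < max_gap:
            groups[-1].append(ssr)
        else:
            groups.append([ssr])
    return groups
```

Each group returned by merge_compound with more than one member corresponds to a compound SSR under the 100 bp rule; single-member groups are simple SSRs.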

Screening, identification, and sequence analysis of OBPs
Unigene functions were annotated according to the NR database. After searching all annotation results, possible OBPs were classified based on annotation statistics. Sequences were spliced and aligned to obtain possible OBPs, and primers (Table 1) were designed to amplify the full-length sequences using PCR Mix (TransGen, Beijing, China). The thermocycling schedule was: 95°C for 5 min; followed by 30 cycles at 95°C for 30 s, 50°C for 30 s, 72°C for 45 s; and an additional extension step at 72°C for 10 min. Amplicons of the expected size were sub-cloned, and three clones were sequenced for each gene. After obtaining full-length OBP sequences, signal peptides were analyzed with SignalP 6.0 (http://www.cbs.dtu.dk/services/SignalP/), and protein domains were analyzed using the HMMER website (http://www.ebi.ac.uk/Tools/hmmer/) [12]. To conduct a phylogenetic analysis, a total of 61 OBP protein sequences were used from four different insects (21 from Apis mellifera, 20 from Microplitis mediator, 15 from Diachasma alloeum, and five from Microplitis pallidipes that we identified). A maximum-likelihood tree based on the Jones-Taylor-Thornton model was constructed in MEGA 7.0, and branch supports were assessed using 1000 bootstrap replicates [22]. The OBP protein sequences were retrieved from GenBank (Supplementary Note 1).

Real-time quantitative PCR to test stage and tissue expression profiles of OBPs
To determine the expression profiles of M. pallidipes OBPs in different tissues, total RNA used for qRT-PCR analysis was extracted from 20-50 mg of different tissues (antennae, heads from which antennae were removed, thoraxes from which legs and wings were removed, abdomens, wings, and legs) of freshly emerged female and male adult wasps. To determine the expression profile of M. pallidipes OBPs at different days after emergence to adulthood, total RNA used for qRT-PCR analysis was extracted from 20-50 mg of whole body of female and male adults at different days (1-day-old, 2-day-old, 3-day-old, 4-day-old, and 5-day-old adults). First-strand cDNA was synthesized using the First-Strand cDNA Synthesis Enzyme (TransGen Biotech, Beijing, China). The housekeeping genes were the 18S ribosomal RNA gene (MW466574) and β-actin (MZ570587). TransStart Green qPCR Mix (TransGen, Beijing, China) and appropriate primers (Table 1) were used for amplification. The optimized thermocycling program was 94°C for 30 s, followed by 45 cycles of 94°C for 5 s and 60°C for 30 s. Data were analyzed on the ABI StepOne instrument (Applied Biosystems, Foster City, CA, USA), and relative gene expression was quantified using the 2^(-ΔΔCt) method (Ct, cycle threshold) [30].

Table 1. Primers used for PCR amplification.
MpOBP2exf: ATGAAGTCAATTATTATCTTGGGAGTTTTGCT (usage of primer: amplify the full-length cDNA gene)
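
The relative quantification step can be written out as a small Python sketch. How the two housekeeping genes were combined is not stated in the text, so averaging their Ct values is an assumption of this example, as are the function name and the Ct values in the usage note.

```python
from statistics import mean


def relative_expression(ct_target, ct_refs, ct_target_cal, ct_refs_cal):
    """Fold change of a target gene by the 2^(-ddCt) method.

    ct_target / ct_refs: Ct of the target gene and of each reference gene
    in the sample of interest.
    ct_target_cal / ct_refs_cal: the same values in the calibrator sample.
    Averaging the reference-gene Ct values is an assumption of this sketch.
    """
    d_ct_sample = ct_target - mean(ct_refs)              # dCt in the sample
    d_ct_calibrator = ct_target_cal - mean(ct_refs_cal)  # dCt in the calibrator
    dd_ct = d_ct_sample - d_ct_calibrator                # ddCt
    return 2.0 ** (-dd_ct)
```

For example, with hypothetical values of Ct 22.0 for the target and 15.0 and 17.0 for the two reference genes in the sample, and 25.0, 15.5, and 16.5 in the calibrator, ΔΔCt is -3 and the function returns an 8-fold higher expression in the sample.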

Statistical analysis
The differences in relative expression of M. pallidipes OBPs among different tissues or different days after emergence to adulthood were determined by one-way ANOVA, using statistical package SPSS (Version 22.0, SPSS Inc., Chicago, IL, USA). The difference in relative expression of M. pallidipes OBPs between males and females was determined using the t-test. Significance was set at p < 0.05 in the analysis.
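The study ran these tests in SPSS. Purely to illustrate what the one-way ANOVA compares, the following Python sketch computes the F statistic and its degrees of freedom from scratch; the function name is hypothetical, and the p-value lookup against the F distribution is deliberately left out.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric samples.

    F is the ratio of the between-group mean square to the within-group
    mean square; a large F means group means differ more than expected
    from the variation inside the groups.
    """
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

With three made-up groups [1, 2, 3], [2, 3, 4], and [5, 6, 7], the between-group sum of squares is 26 and the within-group sum of squares is 6, giving F = 13 with 2 and 6 degrees of freedom.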

Results
Overview of the PacBio sequencing datasets
PacBio sequencing results showed that the FL transcriptome of M. pallidipes contained 520,217 ROIs, including 499,794 5′ primer reads, 506,784 3′ primer reads, 502,379 poly-A reads, 482,775 FL reads, and 37,051 nFL reads. Full-length reads also included 466,817 FL non-concatemer reads with an average length of 2347 bp. After ICE and CD-HIT clustering, we identified 170,980 unigenes with a mean length of 2847.43 bp. The longest sequence length was 19,936 bp, the N50 sequence length was 3184 bp, the N90 sequence length was 2101 bp, and GC content was 34.76%. Length distribution ranged from 500-6000 bp (Fig. 1).
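For readers unfamiliar with the N50 and N90 statistics reported above, the following Python sketch shows one common way to compute them from a list of sequence lengths. It is an illustration of the definition only; the function name is hypothetical and the lengths in the usage note are made up.

```python
def nx_length(lengths, fraction=0.5):
    """Return the Nx length for a collection of sequence lengths.

    Sequences are sorted from longest to shortest and summed; the result
    is the length of the sequence at which the running total first reaches
    `fraction` of all bases (0.5 for N50, 0.9 for N90).
    """
    ordered = sorted(lengths, reverse=True)
    threshold = fraction * sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if running >= threshold:
            return length
    raise ValueError("lengths must be non-empty")
```

For the toy lengths 80, 70, 50, 40, 30, and 20 (290 bases in total), the two longest sequences already cover half of the bases, so the N50 is 70, while the N90 is 30.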
In terms of functional analysis, GO results indicated that most unigenes were enriched in cellular process (biological process), membrane (cellular component), and binding (molecular function) (Fig. 2b). Additionally, KEGG results revealed that most unigenes were involved in pathways related to signal transduction, translation, endocrine system, transport and catabolism, and carbohydrate metabolism (Fig. 2c). Finally, eggNOG results indicated that most unigenes were predicted to participate in general function (Fig. 2d).
Phylogenetic analysis revealed that OBPs generally clustered into three large independent groups. MpOBP2 and MpOBP3 were located in a large branch along with their orthologous sequences, and MpOBP10 and MpPBP were located in another large branch. MpOBP8 was segregated into a separate clade with its orthologous sequences. Furthermore, MpOBP2, MpOBP3, MpOBP10, and MpPBP diverged into different small groups and clustered with their orthologs in other species. MmOBP2 of M. mediator had the closest evolutionary relationship with MpOBP2 on the phylogenetic tree. The same close relationship between the M. pallidipes and M. mediator orthologs was observed for MpOBP3, MpOBP8, MpOBP10, and MpPBP (Fig. 5).

Tissue and temporal expression profile of OBPs
The qPCR results showed that all five M. pallidipes OBPs were expressed at significantly higher levels in the antennae than in the heads, thoraxes, abdomens, wings, and legs, by 2.19- to 6197.14-fold (p < 0.05). There was no significant difference in MpOBP2 expression level among heads, thoraxes, abdomens, wings, and legs, and MpOBP8, MpOBP10, and MpPBP showed the same pattern. However, MpOBP3 expression was significantly higher in legs than in heads, thoraxes, abdomens, and wings, by 3.32- to 18.25-fold (p < 0.05). We also observed several sex differences. MpOBP2 and MpPBP were expressed at significantly higher levels in male antennae than in female antennae (ratios of 4.39:1 and 14.46:1, respectively) (p < 0.05), whereas MpOBP8 and MpOBP10 exhibited the reverse pattern. MpOBP3 was also highly expressed in the legs of females, in addition to female antennae. Both female antennae and legs typically expressed OBPs at higher levels than male antennae and legs (Fig. 6).
Temporal expression profiles also differed across the five OBPs. In female wasps, MpOBP2 expression in 5-day-old adults was significantly lower than that in 1-day-old, 2-day-old, 3-day-old, and 4-day-old adults (0.29-0.44 fold-change) (p < 0.05). MpOBP3 and MpPBP showed the highest expression level in 3-day-old adults, MpOBP8 showed the highest expression level in 2-day-old adults, and MpOBP10 expression displayed a stepwise decrease from the first to the fifth day. In male wasps, the expression of MpOBP2, MpOBP3, and MpOBP8 did not differ significantly from the first to the fifth day. Among the five time points, MpOBP10 showed the highest expression level in 4-day-old adults, while MpPBP showed the highest expression level in 2-day-old adults. MpOBP3 expression was significantly higher in females than in males, by 8.57- to 17.38-fold from the first to the fifth day, whereas MpPBP exhibited the reverse pattern. MpOBP2 expression was 3.92- to 9.08-fold higher in males than in females from the first to the fifth day. MpOBP8 expression was higher in females than in males in 1-day-old adults (ratio of 1:0.59), with no significant difference at the other ages. MpOBP10 expression was higher in females than in males in 1-day-old and 2-day-old adults (Fig. 7).

Discussion
This study successfully cloned five full-length OBP cDNAs from M. pallidipes using PacBio long-read sequencing. Our results indicated that M. pallidipes had considerably fewer OBPs than many other insects, e.g., Drosophila melanogaster with 51 OBPs [17], Bombyx mori with 44 [14], Agrotis ipsilon with 33 [15], Encarsia formosa with 39 [16], Aulacocentrum confusum with 11 [28], and M. mediator with 20 [36]. The main reason for the relatively low number of M. pallidipes OBPs found in this study might be that some OBPs are expressed at undetectably low levels or only in specific tissues or stages. In this study, RNA-Seq analysis was performed on whole bodies (mixed sexes) to obtain potential OBPs in all tissues. Consequently, specific tissues such as antennae made up only a small fraction of the RNA-Seq samples, so low-abundance OBPs expressed specifically in antennae may not have been detected. We intend to conduct RNA-Seq analysis on M. pallidipes antennae in future studies to find other OBPs. In addition, interspecific variation and alternative splicing might also play significant roles in the multiplicity of OBP sequences [19].
The sequence characteristics of M. pallidipes OBPs indicated that they belong to the classic OBP family. All five M. pallidipes OBPs were predicted to have a conserved PBP-GOBP superfamily domain. Each M. pallidipes OBP also contained six conserved cysteine residues, with spacing characteristic of classic OBPs: C1-X(22-27)-C2-X(3)-C3-X(36-45)-C4-X(8-12)-C5-X(8)-C6 (where X is any amino acid and the number in parentheses is the number of residues). This pattern differed from those of non-classical OBPs (Dimer OBPs, Plus-C OBPs, Minus-C OBPs, and Atypical OBPs) [14,19]. Each of the M. pallidipes OBP sequences contained a predicted signal peptide typical of secreted proteins, supporting their function in binding and transporting odor molecules [5,38]. Additionally, the phylogenetic tree suggested that MpOBP2 and MpOBP3 might have a close evolutionary relationship, as might MpOBP10 and MpPBP. The close evolutionary relationship between M. pallidipes OBPs and M. mediator OBPs implied that OBPs are evolutionarily relatively conserved in Microplitis spp.
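The six-cysteine spacing pattern described above can be checked with a regular expression, as in the Python sketch below. This is an illustration, not the procedure used in the study. It assumes that no additional cysteines occur between the six conserved ones, and the function name and test sequence are hypothetical.

```python
import re

# Spacing between the six conserved cysteines of classic OBPs, as given in
# the text: 22-27, 3, 36-45, 8-12, and 8 residues.
# Assumption of this sketch: the spacer residues contain no cysteine.
CLASSIC_OBP = re.compile(
    r"C([^C]{22,27})C([^C]{3})C([^C]{36,45})C([^C]{8,12})C([^C]{8})C"
)


def classic_obp_spacing(protein):
    """Return the five inter-cysteine spacings if the pattern is found, else None."""
    match = CLASSIC_OBP.search(protein.upper())
    if match is None:
        return None
    return [len(spacer) for spacer in match.groups()]
```

A synthetic sequence with spacers of 25, 3, 40, 10, and 8 alanines between six cysteines yields [25, 3, 40, 10, 8], whereas a sequence whose cysteines are spaced differently yields None.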
Analyzing the tissue-specific pattern of OBPs can provide insight into their biological function. Most insect OBPs are specifically expressed in the antennae, such as the 32 OBPs of Encarsia formosa [16], nine OBPs of Aulacocentrum confusum [28], and two OBPs of Macrocentrus cingulum [2]. Consistent with these prior findings, the five M. pallidipes OBPs were primarily expressed in the antennae. However, some OBPs are also expressed in other tissues, such as legs and wings [31,36]. Examples of high expression in the legs include OBP2 and OBP8 of Aphis glycines [47], OBP7 and OBP16 of Tropidothorax elegans [43], and OBP19 of M. mediator [36]. Examples of high expression in the wings include OBP4 of Ectropis obliqua [31], OBP13 of Oedaleus infernalis [54], and OBP8 of T. elegans [43]. Our study found that MpOBP3 was relatively highly expressed in the legs of female wasps, suggesting involvement in chemo-sensing at the leg level, where many odor receptors related to the olfactory system are distributed [31]. In addition to olfactory perception, OBPs also appear to be involved in other physiological functions, as they are widely expressed in other non-olfactory organs such as the midgut and glands [27,40,42]. For example, two Aedes albopictus OBPs contribute to transporting hydrophobic ligands in the hemolymph [4]. One Culex nigripalpus OBP is associated with nutrient and other small-molecule transport in the intestines [40]. Three OBPs of Streltzoviella insularis control semiochemical release in male genitalia [51]. Finally, OBP22 of Aedes aegypti is transferred as a pheromone carrier from the male reproductive apparatus to the spermatheca [39].
[Figure caption fragment] ... MpOBP2, MpOBP3, MpOBP8, MpOBP10, and MpPBP, respectively. The two charts on the left represent comparisons of OBP expression at different days after emergence to adulthood, and the five small charts on the right represent comparisons of female and male OBP expression. Day or sex is on the x-axis and relative quantification of M. pallidipes OBPs is on the y-axis. Bars represent standard deviations. 1d-5d: 1-day-old adults, 2-day-old adults, 3-day-old adults, 4-day-old adults, and 5-day-old adults; F: female, M: male. Different lower-case letters indicate a significant difference (one-way ANOVA, p < 0.05). ns means no significant difference and * means a significant difference (t-test, p < 0.05).
The expression of OBPs also exhibited sex-specific patterns. MpOBP2 and MpPBP were mainly expressed in male antennae, whereas MpOBP3, MpOBP8, and MpOBP10 were mainly expressed in female antennae. These sex-specific patterns are likely related to functional differences. The OBPs expressed predominantly in male antennae may be critical for detecting sex pheromones. Conversely, female-biased OBPs may be important in the detection of general odorants such as host volatiles [20]. Similarly, OBP14 of M. mediator is mainly expressed in the female antennae, while OBP18 of the same species is mostly expressed in male antennae [36].
We observed variable M. pallidipes OBP expression across different days after emergence to adulthood. MpOBP2 and MpOBP10 expression was highest in 1-day-old wasps, while MpOBP3 expression was highest in 3-day-old wasps. Furthermore, MpOBP8 and MpPBP expression was highest in 2-day-old wasps. Therefore, we speculate that different OBPs play important roles at distinct developmental stages. Stage-dependent OBP expression has also been reported in other insects. For example, OBP14 expression in Adelphocoris lineolatus fluctuated from its peak in third-instar larvae to its nadir in fourth-instar larvae [44]. Additionally, OBP11 expression increased at the late larval and adult stages of Tribolium castaneum [55]. OBP1 of Plutella xylostella had its highest expression level in first-instar larvae among the different larval stages [56]. These developmental differences might be associated with OBP binding to different odorant compounds, as well as their involvement in a wide range of insect behaviors, including host-searching and mating.
In conclusion, we report the first long-read transcriptome of whole-body adults of both sexes of M. pallidipes and identified five OBP genes. Furthermore, we characterized OBP gene expression patterns across different tissues and different days after emergence to adulthood using qPCR. These findings provide insights into OBP function in parasitic wasps and may also be relevant to other insects.