Development of MALDI-TOF mass spectrometry for the identification of lice isolated from farm animals.

Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) is now routinely used for the rapid identification of microorganisms isolated from clinical samples and has been recently successfully applied to the identification of arthropods. In the present study, this proteomics tool was used to identify lice collected from livestock and poultry in Algeria. The MALDI-TOF MS spectra of 408 adult specimens were measured for 14 species, including Bovicola bovis, B. ovis, B. caprae, Haematopinus eurysternus, Linognathus africanus, L. vituli, Solenopotes capillatus, Menacanthus stramineus, Menopon gallinae, Chelopistes meleagridis, Goniocotes gallinae, Goniodes gigas, Lipeurus caponis and laboratory reared Pediculus humanus corporis. Good quality spectra were obtained for 305 samples. Spectral analysis revealed intra-species reproducibility and inter-species specificity that were consistent with the morphological classification. A blind test of 248 specimens was performed against the in-lab database upgraded with new spectra and validated using molecular tools. With identification percentages ranging from 76% to 100% alongside high identification scores (mean = 2.115), this study proposes MALDI-TOF MS as an effective tool for discriminating lice species.


Introduction
Lice are highly host-specific insects [20] belonging to the order Phthiraptera. They are obligate parasites of birds and many species of mammals, including humans [40,41]. Lice parasitism may be responsible for pediculosis, causing mild to severe anemia and many types of skin damage, such as focal necrosis and scars on the skin of heavily infested animals [8,12]. These have economic consequences, especially for livestock farmers [8,45]. Some sucking lice, such as Pediculus humanus corporis, have the ability to transmit pathogens to humans [17].
The identification of arthropods including lice is an important step for surveillance and control of parasitism as well as transmitted diseases [25]. Currently, lice are mainly identified morphologically based on dichotomous keys that take high consideration of the host animal from which the louse has been collected [32,47].
Morphological identification requires entomological expertise and specific documentation [50]. For lice and other arthropods, it may be limited by the integrity of the specimen, which is fragile and can be damaged during collection or transport, or by the absence of distinctive morphological criteria at immature stages of the life cycle, as is the case for ticks [33].
Alternative methods such as molecular approaches have been developed to identify arthropods including lice [19,27]. These are based on comparative analyses of gene sequences such as the 18S rRNA or the cytochrome c oxidase subunit I (COI) genes widely used for the identification of lice [19,27]. However, the NCBI GenBank database is still far from comprehensive regarding animal lice gene sequences [24].
Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) is an ionization technique that generates specific spectra from protein extracts from organisms [48]. The acquisition of the spectra allows the creation of a database based on reference spectra of the formally identified organism [25]. In recent years, this proteomic approach has revolutionized clinical microbiology for the identification of bacteria and fungi [36,39].
The objective of the present study was to test the ability of MALDI-TOF MS to identify lice specimens collected from livestock and poultry in Algeria.

Ethical considerations
Informal verbal consent was obtained from the owners of the mammals and poultry that were selected for sampling lice directly. Lice were not sampled from protected animals nor from animals in private residences or national parks.
Human lice were reared at IHU Méditerranée Infection on adult female New Zealand white rabbits obtained from Charles River Laboratories. They were handled according to Decree No. 2013-118, 7 February 2013 and as described in the approved experimental protocols (references APAFIS #01077.02 & 2015050417122619). Protocols were approved by the Ethics Committee "C2EA-14" of Aix-Marseille University, France and the French Ministry of National Education, Higher Education and Research.
Mammals were examined by parting the wool of sheep and goats and the hair of cattle and visually inspecting the skin for lice. In poultry, the head and feathers on the neck, feet, skin, wing feathers, feathers of the belly, and feathers of the croup and of the tail were meticulously examined. In some cases, chickens were sprayed with insecticide and placed on a small spot on a sampling surface for 20 min [4]. Lice collected from the same animal were recovered and stored in the same tube, either dry at −20°C or in 70% ethanol, to be transported to Marseille, France for further analyses. For the present study, we used frozen lice only and kept the other lice for future studies. Each louse was rinsed with ethanol (70%) for 15 min, and later in distilled water for one minute. All body parts of the collected body lice were examined using a Zeiss Axio Zoom V16 (Zeiss, Marly-le-Roi, France) microscope. The morphological keys provided by Wall [47] and Pajot [32] were used for morphological identification (Fig. 1). The names of the species of lice and their abbreviations used in this study were chosen according to previously published identification keys [1,7,14,34,37].

Molecular identification of lice
Following morphological identification, between 8 and 10 specimens of each louse species were selected from at least two animal hosts at each study site. The abdomen of each louse was used for the extraction of DNA using an EZ1 DNA tissue extraction kit (Qiagen, Hilden, Germany), according to the manufacturer's instructions. Lice DNA was then eluted in 100 µL of Tris EDTA buffer using a DNA extracting EZ1 Advanced XL Robot (Qiagen), as previously described [5]. The DNA was either immediately used or stored at −20°C until molecular analysis. The DNA extracting EZ1 (Qiagen) was disinfected after each batch of extraction as per the manufacturer's recommendations in order to avoid cross-contamination.
SAIDG (5′-TCTGGTTGATCCTGCCAGTA-3′) and SBIDG (5′-ATTCCGATTGCAGAGCCTCG-3′) primers were used to amplify partial 539 base pair 18S rRNA gene sequences for species-level molecular identification of the lice, as previously described [21]. The DNA samples tested were successfully amplified using an automated DNA thermal cycler (Applied Biosystems, Foster City, CA, USA). The cycling program consisted of 15 min at 95°C followed by 39 cycles of denaturing at 95°C for 30 s, annealing at 58°C for 30 s, and extension at 72°C for 1 min, followed by a final extension of 5 min at 72°C and a hold at 4°C until sampling. A mix without DNA was used as a negative control. The amplification products were then subjected to electrophoresis through a 1.5% agarose gel stained with SYBR Safe™ and visualized with the ChemiDoc™ MP ultraviolet imager (Bio-Rad, Marnes-la-Coquette, France).
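As a quick check of the cycling program described above, the short Python sketch below encodes the steps as plain data and sums the programmed hold times. The variable and function names are illustrative assumptions, and the total excludes temperature ramping and the final 4°C hold, so the real run time on a thermal cycler would be longer.

```python
# Each step: (name, temperature in °C, hold time in seconds).
# Values are taken from the cycling program in the text; names are illustrative.
INITIAL_STEPS = (("initial denaturation", 95, 15 * 60),)
CYCLE_STEPS = (
    ("denaturation", 95, 30),
    ("annealing", 58, 30),
    ("extension", 72, 60),
)
FINAL_STEPS = (("final extension", 72, 5 * 60),)
N_CYCLES = 39


def total_hold_seconds(initial=INITIAL_STEPS, cycle=CYCLE_STEPS,
                       final=FINAL_STEPS, n_cycles=N_CYCLES):
    """Sum programmed hold times only (ramp times and the 4°C hold are ignored)."""
    once = sum(seconds for _, _, seconds in initial + final)
    per_cycle = sum(seconds for _, _, seconds in cycle)
    return once + n_cycles * per_cycle


if __name__ == "__main__":
    print(total_hold_seconds() / 60)  # programmed hold time in minutes: 98.0
```

Under these assumptions the programmed hold time comes to 98 min per run.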
The positive samples were purified and sequenced using a Big Dye Terminator kit and an ABI PRISM 3130 Genetic Analyzer (Applied BioSystems, Courtaboeuf, France). The obtained sequences were analyzed and assembled using ChromasPro, version 1.34 (Technelysium Pty, Ltd., Tewantin, QLD, Australia).

Sample preparation for MALDI-TOF MS analysis
Two protocols were tested to assess which body part was relevant for MALDI-TOF MS analyses. In protocol 1, the louse was cut longitudinally into two equal parts, one used for MALDI-TOF MS analysis and the other for molecular biology. In protocol 2, a transverse section was made to separate the cephalothorax and legs, which were used for MALDI-TOF MS, from the abdomen, which was used for molecular biology, and the spectra obtained were tested.
In both protocols, after homogenization of the sample, a quick spin centrifugation at 10,000 rpm for 1 min was performed to pellet debris, and 1 µL of supernatant from each sample was deposited on the MALDI-TOF MS target plate (Bruker Daltonics, Wissembourg, France) in quadruplicate and covered with 1 µL of CHCA matrix solution composed of saturated α-cyano-4-hydroxycinnamic acid (Sigma), 50% acetonitrile (v/v), 2.5% trifluoroacetic acid (v/v) (Aldrich, Dorset, UK) and high-performance liquid chromatography (HPLC)-grade water. After drying for several minutes at room temperature, the target was placed in the MALDI-TOF MS [30] (Fig. 2).
Following comparison of the spectra quality obtained with protocol 1 and protocol 2, protocol 2 (using cephalothorax-legs) was chosen for further analyses. The validity of the spectra obtained with protocol 2 was confirmed by testing 24 fresh P. humanus corporis lice from laboratory rearing by MALDI-TOF MS. Reproducibility and spectra quality were confirmed using FlexAnalysis v.3.3 software and the gel view tool of ClinProTools 2.2 software (Bruker Daltonics, Leipzig, Germany) (Fig. 3). Non-engorged fresh P. humanus corporis lice were later used as controls for each MALDI-TOF MS assay.

MALDI-TOF MS parameters
Protein mass profiles were obtained using a Microflex LT MALDI-TOF Mass Spectrometer (Bruker Daltonics), using Flex Control software (Bruker Daltonics), with the parameters described previously [49]. The profiles of the spectra obtained were viewed using FlexAnalysis v.3.3 software and exported to ClinProTools v.2.2 and MALDI-Biotyper v.3.0 software (Bruker Daltonics) for data processing (smoothing, baseline subtraction and peak selection) and cluster analysis.

Creation of a reference spectra database
In order to obtain reference spectra and upgrade our arthropod database, a subgroup of lice specimens identified both morphologically and using molecular tools were subjected to MALDI-TOF MS (Tables 2 and 3). Lice species from the same genus were run on the same MALDI-TOF MS target plate to rule out any plate bias. Intra-species reproducibility and interspecies specificity of MALDI-TOF MS spectra were visually evaluated using the gel view, dendrogram and principal component analysis tools of ClinProTools 2.2 and MALDI-Biotyper v.3.0 (Bruker Daltonics). Dendrograms are based on the results of Composite Correlation Index (CCI) matrices. CCIs are calculated by dividing spectra into intervals and comparing these intervals across a dataset. The composition of correlations of all intervals provides the CCI, which is used as a parameter that defines the distance between spectra. A CCI match value of 1 represents complete correlation, whereas a CCI match value of 0 represents an absence of correlation [25]. Spectral dendrograms were created to assess the profile diversity within each species, and high-quality spectra from separate clusters were selected using FlexAnalysis software v.3.3 (Bruker Daltonics) to update the reference spectra database.
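The interval-based idea behind the CCI can be illustrated with a simplified Python sketch. This is not the Bruker implementation: it assumes two spectra already binned onto the same m/z grid, uses a Pearson correlation per interval, clips negative correlations to 0 and averages the intervals, all of which are illustrative choices. The function names and the toy intensity vectors are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences; 0.0 if either is flat."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def composite_correlation(spec_a, spec_b, n_intervals=4):
    """Simplified composite correlation between two binned spectra.

    The spectra are split into n_intervals contiguous intervals, each pair of
    intervals is correlated, negative values are clipped to 0, and the
    interval scores are averaged (an assumption made for this sketch).
    """
    if len(spec_a) != len(spec_b):
        raise ValueError("spectra must be binned onto the same grid")
    if n_intervals < 1 or len(spec_a) < 2 * n_intervals:
        raise ValueError("need at least two bins per interval")
    size = len(spec_a) // n_intervals
    scores = []
    for i in range(n_intervals):
        start = i * size
        # The last interval absorbs any leftover bins.
        end = len(spec_a) if i == n_intervals - 1 else start + size
        r = pearson(spec_a[start:end], spec_b[start:end])
        scores.append(max(r, 0.0))
    return sum(scores) / len(scores)


if __name__ == "__main__":
    same = [0, 1, 0, 3, 0, 2, 0, 5]      # toy intensities, not real data
    shifted = [1, 0, 3, 0, 2, 0, 5, 0]   # peaks moved by one bin
    print(composite_correlation(same, same, n_intervals=2))
    print(composite_correlation(same, shifted, n_intervals=2))
```

With these toy vectors, identical spectra score 1 (up to floating-point rounding) and the shifted spectrum scores 0, matching the interpretation of the two extreme CCI values given above.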
Reference spectra were selected based on intensity, overall spectrum quality and intra-species reproducibility. For each reference sample, a main spectrum profile (MSP) was created using the automated function of MALDI-Biotyper software v.3.3. (Bruker Daltonics). Spectra from a spot of lower quality were sometimes removed to obtain a high-quality MSP. MSPs were created on the basis of an algorithm using peak position, intensity and frequency data. Between two and nine new reference spectra per species were added to the lice database in our laboratory [26].

Blind tests and cluster analysis
New specimens of lice collected at different study sites were tested. Each spectrum obtained by MALDI-TOF MS analysis as described above was subjected to a blind test analysis against the upgraded database. The significance of the identification was determined using the log score values (LSVs) given by MALDI-Biotyper software v.3.3, which correspond to the degree of matching of signal intensities between the query and reference mass spectra. The LSV range was from 0 to 3. LSVs allow for good evaluation of reproducibility between a queried spectrum and a reference spectrum, as they result from a thorough comparison of the position of peaks and the intensity between those two spectra (MALDI BioTyper Help, Bruker). In order to visualize MALDI-TOF MS profile similarities and distances, hierarchical clustering of the mass spectra of all tested species was performed using the dendrogram function of MALDI-Biotyper software v.3.3. Although no threshold has been definitively validated for arthropod identification using MALDI-TOF MS, LSVs ≥ 1.8 were considered adequate for relevant identification, as reported in pioneering papers [30,44]. Percentages of included spectra are reported in Table 4.
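The decision rule described above (LSV range 0 to 3, LSV ≥ 1.8 considered relevant) can be written as a small Python sketch. The function names, the tuple layout and the example values are hypothetical and are not part of the MALDI-Biotyper software. Counting a specimen as correctly identified only when the top match agrees with the expected species and the LSV reaches the threshold is an assumption made for illustration.

```python
LSV_MIN, LSV_MAX = 0.0, 3.0
RELEVANT_THRESHOLD = 1.8  # threshold used in the text for relevant identification


def is_relevant_identification(lsv, threshold=RELEVANT_THRESHOLD):
    """Return True when a log score value reaches the relevance threshold."""
    if not LSV_MIN <= lsv <= LSV_MAX:
        raise ValueError(f"LSV must lie between {LSV_MIN} and {LSV_MAX}")
    return lsv >= threshold


def percent_correct(results, threshold=RELEVANT_THRESHOLD):
    """Percentage of blind-test specimens counted as correctly identified.

    results: iterable of (expected_species, top_match_species, lsv) tuples.
    """
    results = list(results)
    if not results:
        raise ValueError("no blind-test results supplied")
    correct = sum(
        1
        for expected, matched, lsv in results
        if expected == matched and is_relevant_identification(lsv, threshold)
    )
    return 100.0 * correct / len(results)


if __name__ == "__main__":
    # Made-up example values, not data from the study.
    demo = [
        ("Bovicola bovis", "Bovicola bovis", 2.3),
        ("Bovicola bovis", "Bovicola ovis", 1.9),
        ("Menopon gallinae", "Menopon gallinae", 1.7),
        ("Menopon gallinae", "Menopon gallinae", 2.0),
    ]
    print(percent_correct(demo))  # 50.0
```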

Lice collection and morphologic identification
A total of 4112 lice were collected from several livestock farm animals and stored at −20°C: a total of 23 sheep, 20 cattle, 14 goats, and 13 poultry (Table 1). On the basis of morphological criteria, 13 species of lice were identified, including four species of sucking lice and nine species of chewing lice (Table 2). Seven species were collected on mammals including four from cattle with B. bovis (Table 2). Genera abbreviations were modified in this study to properly differentiate genera with the same initials. A list of abbreviations is provided.

Molecular identification of lice
Of the 4112 lice morphologically identified, 159 lice specimens preserved at −20°C and belonging to 13 species were randomly selected to be included in the study.
Randomly selected specimens of each species were subjected to molecular identification targeting the 18S rRNA gene. A GenBank request revealed that 18S rRNA gene reference sequences were available for 8 of the 13 lice species. Amongst these eight available sequences on GenBank, four had an average quality and the remaining four were of very poor quality. No sequences were available for five species including Li. caponis, C. meleagridis, G. gallinae, Go. gigas, and Me. stramineus (Table 3).
For the 13 species of lice, BLAST analysis of the 18S rRNA sequences of specimens of the same species demonstrated high identity ranging from 99% to 100%, supporting correct morphological identification (Table 3). The sequences obtained for each species of lice were corrected and blasted to reveal the intra-species similarity of the sequence of the 18S rRNA gene. Sequence alignment using BioEdit software revealed that all sequences from the same species were identical, and thirteen good quality 18S rRNA gene consensus sequences were deposited in the NCBI GenBank database (Table 3).

MALDI-TOF MS analyses
A total of 427 lice preserved at −20°C were tested by MALDI-TOF MS using two protocols.
An analysis of the spectral profiles using FlexAnalysis software showed that the spectra obtained using the second protocol provided MALDI-TOF MS profiles of higher intensity and superior quality to those obtained with the first protocol (Fig. 4). On the basis of spectral quality, the second protocol provided good intra-species reproducibility between specimens of the same species and inter-species specificity between different species. This protocol was therefore selected for further MALDI-TOF MS analyses (Fig. 5) to create a reference spectra database.
MALDI-TOF MS identification was considered correct when there was concordance between the morphological identification and molecular identification, when the latter was possible, that is when sequences were available in GenBank and were considered reliable.
In this study, we obtained 305 specimens with good quality spectra, of which 57 spectra were added as reference spectra and 248 specimens used for the blind test with an average LSV of 2.115 and correct identification percentages between 76% and 100% (Table 4). In all, 103 of the 408 samples (25.25%) tested had poor quality spectra and these were removed for this proof-of-concept (Supplementary Data 1). Nevertheless, specimens with low quality spectra were correctly identified with an average percentage of 61.11% and with low LSVs, highlighting the quality of the database created (Supplementary Data 1).
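The specimen counts reported above can be cross-checked with a few lines of Python. The function name and dictionary keys are illustrative; the numbers are those given in the text.

```python
def spectra_breakdown(total_tested=408, good_quality=305,
                      reference_spectra=57, blind_test=248):
    """Recompute the derived counts and percentage from the reported figures."""
    poor_quality = total_tested - good_quality
    return {
        "poor_quality": poor_quality,
        "poor_quality_pct": round(100 * poor_quality / total_tested, 2),
        # Good-quality spectra should split into reference and blind-test sets.
        "good_quality_accounted_for": reference_spectra + blind_test == good_quality,
    }


if __name__ == "__main__":
    print(spectra_breakdown())
```

The reported figures are internally consistent: 408 − 305 = 103 poor quality spectra (25.25%), and 57 + 248 = 305 good quality spectra.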
The controls of fresh and non-engorged fresh P. humanus corporis lice were well identified at each test. The intra-species reproducibility and inter-species specificity of the MALDI-TOF MS profiles were further objectified using MALDI-Biotyper software cluster analysis. Dendrogram analysis revealed specific clustering on distinct branches of lice according to species. Lice belonging to the same genus were grouped in the same part of the MSP dendrogram (Fig. 6).

Discussion and conclusion
The morphological identification of lice is very complex because the species are morphologically close to one another. For the first time, MALDI-TOF MS was used as an additional tool for lice identification.
In this study, we successfully identified 14 species of lice using MALDI-TOF MS. Morphological identification was molecularly confirmed by targeting a fragment of louse 18S rRNA gene sequences. The choice of the 18S rRNA gene is based on previous results that proved the relevance of this gene for louse identification and the presence of reference sequences in GenBank [21]. However, 5/13 of the lice species studied in this work had no sequence available in GenBank, highlighting the drawbacks of using molecular biology alone for louse identification. Only 4/13 species of lice presented correct identification using the 18S rRNA gene. The remaining 4/13 species of lice resulted in incorrect identification despite the fact that their reference sequences were present in GenBank. Further analysis of the GenBank reference sequences of each of these species revealed that they were all of poor quality. This study allowed us to add five new sequences that did not exist on GenBank, and eight additional complementary sequences for which a reference was already available (Table 3). Five of the 13 sequences of lice namely Go. gigas, B. bovis, B. ovis, B. caprae, and Chelopistes meleagridis were already published under new genera (Table 5) [14,34,38,42,47].
A preliminary MALDI-TOF MS database containing the spectra of 14 species was hereby created and the database will be regularly updated with the spectra of new specimens. The spectra files are available on request and transferable to any Bruker MALDI-TOF MS device. The MALDI-TOF MS arthropod database can be shared through scientific collaboration projects; it will be possible to freely query this database online in the future.
For use in entomology, the choice of arthropod body parts to be used for the MALDI-TOF MS test is a very important criterion. For ticks and mosquitoes, MALDI-TOF MS identification of the arthropod species is based on leg spectra. Other body parts had to be carefully selected for other arthropods when the legs did not provide satisfactory spectra [10,13,22]. Here, the spectral profiles generated from the cephalothorax-legs of the lice subjected to MALDI-TOF MS were reproducible. Spectral analysis highlighted intra-species reproducibility and inter-species specificity, which was consistent with the morphological classification. In addition, hierarchical clustering based on the MALDI-TOF MS spectra revealed that all of the specimens from the same species were grouped in the same branch. Our results demonstrated that the use of the body of a louse without the abdomen was the best sample for distinguishing lice species using the MALDI-TOF MS approach. There are many advantages to selecting this part of the body, for example avoiding the influence of the intestinal contents on the MALDI-TOF MS spectra [50]. Moreover, using a small body part for MALDI-TOF MS allows further analyses of the remaining parts of the arthropod, such as the detection of microorganisms [10] or the identification of blood meals of the arthropods [31]. Higher quality spectra resulted from the cephalothorax-legs part of the louse compared to when it was dissected longitudinally (Fig. 4). This can be explained by the fact that some parts of arthropods yield better spectral qualities than others, as has been demonstrated by several studies [10,13,22,43].
In this study, we included only good quality spectra. Indeed, at this stage, only high-quality spectra can be included to validate the results and create a reliable database.
The number of specimens with low quality spectra can be explained by the fact that all these samples had to be frozen and thawed several times for various analyses including long morphological identification, molecular biology, and MALDI-TOF MS assays. These repeated thawing steps could have caused protein alterations responsible for the poor quality of the spectra. This hypothesis is supported by the fact that the groups of samples that were manipulated first have a greater number of high-quality spectra. This should not be an issue when applied to entomological studies since molecular biology is not always required and when a comprehensive MALDI-TOF MS database is available, the quality of the spectra will be improved. Nevertheless, many specimens with low quality spectra were correctly identified, reaching 100% correct identification for some species such as Menopon gallinae (average: 61.11%) (Supplementary Data 1). The performance of the identification despite the many thawing steps is a validation of the quality of the database created, which will be continuously strengthened with new field specimens.
MALDI-TOF MS enables the identification of lice without any entomological knowledge [25], as long as the database is comprehensive. Furthermore, the MALDI-TOF MS sample preparation method is simple and the speed of data analysis makes it possible to obtain quick and reliable identification results [50].
This study points to new possibilities for improving the knowledge of animal lice in Algeria by using several identification tools. We have also illustrated the limitations of molecular biology with the lack of comprehensiveness of the NCBI GenBank database, which is a major setback to using this method. To circumvent these limitations, we have deposited our new sequences in the NCBI GenBank database (Table 3).
This fast and accurate low-cost tool identifies not only the different immature stages of the arthropod's life cycle [6,13], but also the origin of blood meal sources from arthropods [31].
The MALDI-TOF MS detection of louse-borne bacteria could provide new opportunities for vector surveillance, particularly in Algeria where all these louse species are present [16].
Previous studies reported the detection of Rickettsia slovaca in Haematopinus suis from Algeria [51]. It was later demonstrated that lice could acquire R. slovaca after feeding on a bacteremic boar; this does not prove that they are vectors, and epidemiological studies would be required to establish this [51]. It would be interesting to attempt to detect louse-associated bacteria such as Bartonella quintana or Borrelia recurrentis. Using the proposed protocol, the abdomen of the lice can be used for molecular screening of microorganisms.
This study confirmed that MALDI-TOF MS is a faster and cheaper method for identifying lice stored at −20°C. In the field, alcohol is a more widely used method of conserving the samples, especially in countries with limited resources [5]. It has been shown that MALDI-TOF MS is reliable for identifying arthropods preserved in alcohol, such as ticks [5], mosquitoes [29], and fleas [10]. Therefore, it would be interesting in the future to set up a MALDI-TOF MS protocol for identifying lice kept in alcohol [52]. It would also be interesting to assess whether MALDI-TOF MS can be used to differentiate lice which are infected or not infected by louse-borne microorganisms.