MALDI-TOF mass spectrometry: a new tool for rapid identification of cercariae (Trematoda, Digenea)

Identification of cercariae was long based on morphological and morphometric features, but these approaches remain difficult to implement and require skills that have now become rare. Molecular tools have become the reference even though they remain relatively time-consuming and expensive. We propose a new approach for the identification of cercariae using MALDI-TOF mass spectrometry. Snails of different genera (Radix, Lymnaea, Stagnicola, Planorbis, and Anisus) were collected in the field to perform emitting tests in the laboratory. The cercariae they emitted (Trichobilharzia anseri, Diplostomum pseudospathaceum, Alaria alata, Echinostoma revolutum, Petasiger phalacrocoracis, Tylodelphys sp., Australapatemon sp., Cotylurus sp., Posthodiplostomum sp., Parastrigea sp., Echinoparyphium sp. and Plagiorchis sp.) were characterized by sequencing the D2, ITS2 and ITS1 domains of rDNA, and by amplification using specific Alaria alata primers. A sample of each specimen, either fresh or stored in ethanol, was subjected to a simple preparation protocol for MALDI-TOF analysis. The main spectral profiles were analyzed by Hierarchical Clustering Analysis. Likewise, the haplotypes were analyzed using the maximum likelihood method. Analytical performance and the log-score value (LSV) cut-off for species identification were then assessed by blind testing. The clusters obtained by both techniques were congruent, allowing identification at a species level. MALDI-TOF enables identification at an LSV cut-off of 1.7 without false-positives; however, it requires more data on closely related species. The development of a “high throughput” identification system for all types of cercariae would be of considerable interest in epidemiological surveys of trematode infections.


Introduction
In the life cycle of Trematoda, the first intermediate host is a mollusc, usually an aquatic snail. This host releases cercariae into the environment. Cercariae are free-living, mobile larval stages that must locate a suitable next host, either a second intermediate host or the definitive host [17,33]. The study of cercariae is therefore essential to understand the epidemiology of Trematoda and their ecological relationships with their hosts. Trematodes are among the most important parasites in medical and veterinary parasitology. For example, the furcocercariae of Schistosoma are the causal agents of schistosomiasis, which affects more than 230 million people worldwide [6], and the fluke Fasciola hepatica is a parasite of high importance in veterinary medicine [26].
Traditional identification of cercariae is based on their natural environment (fresh or saline water), the species of the emitting mollusc, and their morphological features (presence of eye spots, type of tail, position of suckers, osmotic regulation system, and distribution of sensory papillae). However, this approach has several limitations. The morphology of different species within the same genus is very similar at the cercarial stage, which makes species identification particularly challenging. The required expertise takes a long time to acquire and is held by a small and steadily declining number of specialists. Particular technical skills are also required for several staining techniques such as silver impregnation and borax-carmine staining [7,11].
Molecular biology is becoming the gold standard for the identification of Trematodes at larval stages. Use of the D2 domain of the 28S subunit and the internal transcribed spacers (ITS2 and ITS1) of ribosomal DNA (rDNA), or the cytochrome C oxidase I (COI) gene of mitochondrial DNA, has made it possible to refine the taxonomy of Trematodes [2]. Molecular techniques enable researchers to differentiate cryptic species that are morphologically similar at the larval or adult stages [14,20]. These molecular tools have great discriminatory power, but (i) they are still sometimes technically challenging and remain time- and resource-consuming, and (ii) GenBank does not include sufficient sequences to allow for strong species identification, especially sequences obtained from adults (except for the most common parasites of human and veterinary importance).
Matrix-Assisted Laser Desorption/Ionization Time-Of-Flight Mass Spectrometry (MALDI-TOF MS) is now a widely used technique for easy, rapid, and reliable routine identification of bacteria and yeasts [5,31,34]. This technique is based on laser ionization of sample proteins after co-crystallization with MALDI-matrix and comparison of the obtained mass spectra with a database of reference spectra [5]. MALDI-TOF MS is currently under development for the study of protozoa with potential use for the identification of Leishmania [22], Plasmodium [23] and trypanosomatids [1]. Few applications, however, have been proposed in the field of helminthology. MALDI-TOF has recently proved its effectiveness for the rapid identification of Trichinella at the genus and species levels, with a high degree of confidence [27]. In the case of Trematodes, MALDI-TOF MS has been used to find biomarkers for schistosomiasis in mice sera, allowing for very early detection of the infection in this animal model [18].
We propose the use of MALDI-TOF MS as a rapid and inexpensive method for high-throughput identification of cercariae.
The goal of the present study was to design a simple protocol for acquiring MALDI-TOF spectra of cercariae freshly emitted from snails. We then investigated the discriminatory power of this technique and built a preliminary spectral database, especially targeting the furcocercariae of diplostomoids. The analytical performance of this technique was also evaluated by performing blind validation. Finally, we studied the effect of storage in ethanol on cercariae identification.

Materials and methods
Cercaria and snail collection
Snails from four different areas were collected: the first, regularly prospected during an epidemiological survey of the transmission of Alaria alata, is located in the center of France [National Domain of Chambord (DNC): 48°35′N 1°55′E]; the second is Der-Chantecoq lake (DR) in North-Eastern France (48°35′N 4°45′E); the third was investigated in the context of human cercarial dermatitis from a recreational pond used for swimming [Zebulle Park/Chevenon (ZE): 46°91′N 3°22′E]; and the fourth is in a landscape of meadows in the locality of […]. Snails were collected by hand from April 2017 to June 2018. They were collected once in all areas, except for the DNC area where the collection was performed monthly from spring to summer.
Collections were pooled in the laboratory and cercarial emergence was stimulated by lighting for 30 min to 2 h. Snails from positive batches were individualized for a second assay and preliminary screening of cercariae was performed using morphological features as proposed by Combes et al. [7] and Faltýnková et al. [10,11]. Identification of snails was performed at the genus level according to Glöer and Meier-Brook [15]. Taking into account the fact that snails usually emitted one kind of cercariae, and after checking under a stereomicroscope, some of the cercariae were processed for MALDI-TOF, whereas others were preserved in 95% ethanol for molecular analysis. Some samples from the foot of most positive snails were also collected. DNA extraction was performed using a QIAamp DNA mini kit (Qiagen, Germany), following the manufacturer's instructions.
Sequence homology was evaluated by nucleotide BLAST requests (https://blast.ncbi.nlm.nih.gov/Blast.cgi). Values lower than 97% were considered to indicate a lack of homology.
The evolutionary history was inferred by using the maximum likelihood method. The best evolution model (General Time Reversible model; GTR) with invariant sites was selected based on Akaike's Information Criterion (AIC) and Bayesian Information Criterion (BIC) using MEGA7 built-in function [21].
Initial tree(s) for the heuristic search were obtained automatically by applying the Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using the Maximum Composite Likelihood (MCL) approach, and then selecting the topology with superior log likelihood value. Internal node support was assessed by a bootstrap test over 500 replicates. All positions containing gaps and missing data were eliminated. All evolutionary analyses were conducted in MEGA7.

MALDI-TOF MS Spectral acquisition
To achieve MALDI-TOF spectral acquisition, 2-5 µL of water containing freshly emerged cercariae was directly spotted onto the MALDI target or centrifuged at 4000 rpm for 3 min. After centrifugation, the pellet was washed with distilled water and 5 µL was spotted onto the MALDI target plate (Bruker Daltonik GmbH, Bremen, Germany). Each sample was deposited in at least four replicates. After drying at room temperature, the samples were covered with 1 µL of 70% formic acid. After complete drying, 1 µL of matrix (α-cyano-hydroxy-cinnamic acid in solution with 2.5% trifluoroacetic acid and 50% acetonitrile in water, Bruker Daltonik) was added to each spot. The target was then air-dried at room temperature. MALDI-TOF spectrum acquisition was performed using a Microflex LT mass spectrometer controlled by FlexControl software (Bruker Daltonik) with detection of positive ions on a range of 2000-20,000 m/z (mass to charge ratio). Each spectrum was acquired from 240 laser shots on random regions of the spot using autoexecute mode. Instrument calibration was verified using the Bacterial Test Standard (Bruker Daltonik). Spectra were processed using the FlexAnalysis and MALDI-Biotyper v3.4 software suite (Bruker Daltonik). High quality spectra for each sample were selected to create reference spectra (Main Spectrum Profile: MSP) using the default Bruker Method, which were added to the in-house database. Hierarchical cluster analysis (MSP dendrogram) was performed on the newly created MSP using MALDI-Biotyper Compass Explorer v4.1 software, and a distance matrix was calculated using the correlation method and clustered with the Ward algorithm.
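The distance step of the cluster analysis can be illustrated with a minimal sketch. It assumes that each MSP has already been reduced to a vector of peak intensities over the same m/z bins, and it takes "correlation method" to mean a distance of 1 minus the Pearson correlation; the internal computation of the Bruker software is not described here, and the profile names and intensities below are hypothetical. Only the distance matrix is shown; the Ward agglomeration step is not implemented.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length intensity vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def correlation_distances(profiles):
    """Return {(name_a, name_b): 1 - r} for every pair of profiles."""
    names = list(profiles)
    return {
        (a, b): 1.0 - pearson(profiles[a], profiles[b])
        for a in names
        for b in names
    }

# Hypothetical binned intensities (same m/z bins for every profile).
profiles = {
    "species_A_msp1": [0.0, 5.0, 1.0, 0.0, 8.0, 0.0],
    "species_A_msp2": [0.0, 4.0, 1.5, 0.0, 9.0, 0.5],
    "species_B_msp1": [7.0, 0.0, 0.0, 6.0, 0.5, 3.0],
}
distances = correlation_distances(profiles)
```

In this toy example the two profiles of the same hypothetical species lie closer to each other than to the third profile, which is the property a clustering algorithm then exploits.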

Database validation and LSV cut-off determination
The newly created MSP database was evaluated by means of a blind test performed with new specimens from the DR lake. These new specimens were also deposited in four replicates and each spot was acquired 12 times. The log-score value (LSV) calculated by the Bruker MALDI-Biotyper was then used to evaluate the reliability of species identification based on the similarity between the reference MSP and newly acquired spectra. The cut-off for LSV was determined on the basis of molecular identification using a receiver operating characteristic curve (ROC curve) calculated by logistic regression (SAS 9.4, Grégy-sur-Yerres, France).
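As an illustration of how a candidate LSV cut-off can be assessed against molecular identification, the sketch below computes sensitivity and specificity at a given threshold. It assumes each spectrum is summarized as a pair (best-match LSV, whether the best match agrees with the molecular identification); this is a simplified empirical calculation, not the logistic regression fitted in SAS, and the records are hypothetical.

```python
def sensitivity_specificity(records, cutoff):
    """Sensitivity and specificity when spectra with LSV >= cutoff are accepted.

    records: iterable of (lsv, concordant) pairs, where `concordant` is True
    when the best MALDI-TOF match agrees with the molecular identification.
    """
    records = list(records)
    tp = sum(1 for lsv, ok in records if ok and lsv >= cutoff)
    fn = sum(1 for lsv, ok in records if ok and lsv < cutoff)
    fp = sum(1 for lsv, ok in records if not ok and lsv >= cutoff)
    tn = sum(1 for lsv, ok in records if not ok and lsv < cutoff)
    return tp / (tp + fn), tn / (tn + fp)

def roc_points(records, cutoffs):
    """One (cutoff, sensitivity, specificity) triple per candidate cut-off."""
    records = list(records)
    return [(c, *sensitivity_specificity(records, c)) for c in cutoffs]

# Hypothetical spectra: (LSV, concordant with molecular identification).
records = [
    (2.10, True), (1.90, True), (1.75, True), (1.60, True), (1.40, True),
    (1.65, False), (1.50, False), (1.20, False),
]
points = roc_points(records, [1.5, 1.7, 2.0])
```

Scanning several cut-offs in this way shows the usual trade-off: raising the threshold removes false identifications at the price of leaving more correct spectra unidentified.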

Evaluation of the effect of storage in ethanol
In a first step, the specimens of the validation set stored in 80% ethanol were re-analyzed by MALDI-TOF 3 months (91 days) later, using the same parameters.
In a second step, specimens preserved in ethanol over a period ranging from 1 to 14 months were analyzed by MALDI-TOF. In order to evaluate the effect of ethanol fixation, some specimens were fixed in ethanol immediately after emission and analyzed by MALDI-TOF MS the same day. Differences between true positive rates and LSVs were analyzed using Chi-Square and ANOVA tests (SAS 9.4, Grégy-sur-Yerres, France).
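For readers unfamiliar with the test, a comparison of two proportions (for example, spectra reaching the cut-off in fresh versus ethanol-preserved material) can be sketched as a Pearson chi-square test on a 2 x 2 table. This is a minimal sketch without continuity correction, not the SAS procedure used in the study, and the counts are hypothetical.

```python
from math import erfc, sqrt

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) for one degree of freedom.
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column must have a non-zero total")
    stat = n * (a * d - b * c) ** 2 / denom
    # With 1 degree of freedom, the upper tail of chi-square equals erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical counts: (spectra reaching the cut-off, spectra below it).
fresh = (20, 10)
ethanol = (10, 20)
stat, p_value = chi_square_2x2(*fresh, *ethanol)
```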

Results
A total of 2786 snails were tested for cercarial emission and only a few of them, belonging to the Lymnaeidae (Radix, Lymnaea and Stagnicola) and Planorbidae (Planorbis and Anisus), were positive. The number of snails tested for each site and the labels of samples used for analysis are reported in Table 1.
According to the morphological type of emitted cercariae and their origin, some snails were used to evaluate identification by the MALDI-TOF approach versus characterization by molecular biology: one Radix was positive with ocellated pigmented furcocercariae (FO), three with furcocercariae with or without eye spots (LF), and two with cercariae of Echinostomatidae (EC); five Lymnaea stagnalis, three positive with LF, two with xiphidiocercariae (XI); 10 Stagnicola sp., eight positive with LF and two with EC; three Anisus sp. with LF; 38 Planorbis sp., 36 with LF, and two with Echinostomatidae (Table 2).
Consistent and reproducible MALDI-TOF MS spectra, with peaks of high intensity between 2 and 20 kDa, were acquired from all the specimens except LFDC96 (Parastrigea sp.). All 12 taxa from which spectra were acquired displayed different peak patterns (Fig. 1). Spectra were tested against the Bruker Taxonomy MSP library, providing no bacterial or fungal identification with log-score values > 1.5.
The results of MSP cluster analysis are shown in Figure 2A. Specimens of the same species were grouped together in clusters clearly separated from other species, with low intra-species heterogeneity. Furcocercariae with forked tails without pigmented eye spots were particularly distant from the other cercariae. The classification is consistent when compared to that based on molecular data inferred using the maximum likelihood method (Fig. 2B).
The MSP database constructed with 10 species (20 MSP) was blind-tested against spectra acquired from 22 samples of freshly emitted cercariae representing five species. Among them, three species were present in the database (Alaria alata n = 15/22, Australapatemon sp. n = 1/22, and Echinoparyphium sp. n = 1/22). Among the 1056 acquisitions, 264 spectra (25%) were flat-line spectra and were therefore not included in the analysis. Of the 792 remaining spectra, 648 were acquired from species present in the database. Among them, only 147/648 (22.68%) reached the Bruker recommended cut-off LSV of 2.0 for species-level identification. However, an LSV of 1.7, sufficient for genus identification, was obtained in 443/648 spectra (68.36%). In order to evaluate the best LSV cut-off for cercariae identification, a logistic regression model based on concordance of MALDI-TOF and molecular data was then built. The ROC curve is shown in Figure 3. The area under the curve of the model was 0.9501 (95% Wald confidence limits: 0.9357-0.9644). Choosing an LSV threshold of 1.7 enabled us to obtain specificity of 100%, with sensitivity of 81.7% (Fig. 3). No false identification was reported using the 1.7 and 2.0 LSV cut-offs, even for the taxa that were not present in the database: 108 spectra of Cotylurus sp. (LFDC43, LFDC88, LFDC89, LFDC90, LFDC91) and 36 spectra of Posthodiplostomum sp. (LFDC83). The database was then updated to include Cotylurus sp. with MSPs generated from LFDC89 and LFDC90. This new version yielded good performance for species identification of Cotylurus sp. (78 correct identifications among the 78 of 108 spectra attaining the LSV cut-off of 1.7). The updated database had similar performance, with 521/792 spectra reaching the 1.7 cut-off (65.78%) and 100% correct identifications.
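The spectrum counts in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below uses only figures reported in the text (22 samples, four replicates, 12 acquisitions per spot); the variable names and the `pct` helper are illustrative.

```python
# Counts reported in the text for the blind test.
samples, replicates, acquisitions_per_spot = 22, 4, 12
total = samples * replicates * acquisitions_per_spot  # all acquisitions
flat_line = 264                                       # excluded spectra
analysed = total - flat_line                          # spectra kept for analysis
in_database = 648                                     # spectra from species in the database
cotylurus, posthodiplostomum = 108, 36                # spectra from species absent from it

def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100.0 * part / whole, 1)

print(total, analysed)
print(pct(flat_line, total))      # share of flat-line spectra
print(pct(147, in_database))      # spectra reaching LSV 2.0
print(pct(443, in_database))      # spectra reaching LSV 1.7
print(pct(443 + 78, analysed))    # updated database, LSV 1.7
```

The three groups of spectra (648, 108 and 36) sum to the 792 analysed spectra, and the 521 spectra reported for the updated database correspond to the 443 initial spectra plus the 78 Cotylurus sp. spectra.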
Retrospectively, we did not observe any differences in terms of spectral profile between cercariae of the same species from one year to the next. For example, spectra of Alaria alata isolated at Chambord (LFDC41/44/45/50/51/52/53/54) in 2017 did not differ from those isolated in 2018 (LFDC57-LFDC72). We also did not observe any difference between cercariae of the same species isolated from different sites: for example, Australapatemon sp. spectra from specimen LFJO1-LFJO2 isolated in the JO site were not different from those isolated from Chambord (LFDC42, LFDC48, LFDC84, LFDC86).
In one case, we observed re-emission of the same cercariae (LFDC88: Cotylurus sp.) by the same mollusc (DCLF88) 90 days later. There was no difference in the spectral profile between the first and the second emission.
In order to assess the potential use of this new tool, we evaluated the effect of ethanol preservation on analytical performance. We compared the LSV and the rate of true positives between freshly emitted cercariae in the blind validation specimens and cercariae of the same emission preserved for 3 months in 80% ethanol. Among the 1114 spectra acquired, 553 (49.64%) were flat-line spectra. Of the 561 spectra analyzed using the updated database, 183 reached the 1.7 LSV cut-off; there was no false identification among these 183 spectra from ethanol-preserved specimens. In a second round, we analyzed the effect of exposure time to ethanol on collected samples and freshly emitted samples. Among the 1074 spectra (including the 561 previously described), 287 (26.72%) reached the 1.7 LSV cut-off (all of them were concordant with molecular identification). There was no clear tendency of LSV to decrease as a function of preservation time. When comparing spectra obtained from fresh cercariae versus those obtained with cercariae fixed or preserved in ethanol, we mainly observed a degradation of peak intensity resulting in a lower signal-to-noise ratio. Representative spectra are shown in Figure 4.

Discussion
We propose MALDI-TOF MS as a rapid and reliable identification system for cercariae.
This approach is easier to implement than morphological identification. Indeed, according to Gaillot et al. [13], identification of morphological differences between species of Trematoda at the cercaria stage relies on structures that can only be found in fresh cercariae (after contact with urine for the excretory system or carmine-borax staining). For cercariae with forked tails and with colorless eye spots, the morphological feature used is the position of the penetration glands: preacetabular (Tylodelphys sp., Cotylurus sp.) or postacetabular (Diplostomum sp. and Australapatemon sp.). For the last two taxa, the size of the glands and the body spinose are the morphological features used for diagnosis when emitted by Lymnaea stagnalis [10]. With regard to the cercaria emitted by planorbid snails [11], as for example those of Alaria alata, Parastrigea sp., and Australapatemon sp., other morphological features (number of rows of spines, size of spines around suckers, flame-cell formula and body spinose or not) are used to distinguish these genera. To avoid the use of several identification keys, it would be beneficial to have only one approach to the cercariae, regardless of the snail and its living environment.
Identification by molecular biology remains an expensive technique that requires trained staff as well as expertise in processing and interpreting the results. In our study, we demonstrated the ability of MALDI-TOF MS to reliably identify cercariae using a simple protocol. This direct deposit protocol is particularly time-saving compared to morphological and molecular methods. It allows high-throughput identification with more than one hundred specimens processed per day.
MALDI-TOF MS technology is nowadays increasingly accessible to clinical and research laboratories. This approach is also cost-effective as only a small number of reagents are needed. The cost of identifying bacteria using a direct deposit protocol on reusable targets was evaluated at €0.12 per well [13].
We found good discriminatory power when differentiating between the studied groups. This encouraging analytical performance needs to be confirmed on a larger number of taxa, including closely related species such as Diplostomum pseudospathaceum, D. spathaceum and D. phoxini.
According to Bruker's recommendations, LSVs under 1.7 were considered invalid identification. LSVs between 1.7 and 2.0 were considered valid at the genus level, and LSVs higher than 2.0 were considered reliable identification at the species level. In our study, using an LSV cut-off of 2.0 for identification at the species level was highly specific, but resulted in a high proportion of unidentified spectra. Lowering the cut-off to 1.7 allowed for the identification of a higher number of specimens with similar specificity. This cut-off value has already been proposed for species-level identification of filamentous fungi [4,29]. Further studies are needed for the validation of this cut-off on upgraded spectral databases with a higher number of taxa.
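The interpretation rules discussed above can be written as a small decision function. This is a sketch, assuming that a score exactly equal to a cut-off counts as reaching it; the function names and labels are illustrative, and passing `species_cutoff=1.7` reproduces the lowered threshold evaluated in this study.

```python
def interpret_lsv(lsv, species_cutoff=2.0, genus_cutoff=1.7):
    """Classify a log-score value with the thresholds described in the text."""
    if lsv >= species_cutoff:
        return "species-level identification"
    if lsv >= genus_cutoff:
        return "genus-level identification"
    return "no reliable identification"

def sample_identified(well_lsvs, cutoff=1.7):
    """A sample counts as identified when at least one well reaches the cut-off."""
    return max(well_lsvs) >= cutoff

# Default Bruker-style reading versus the 1.7 species cut-off evaluated here.
default_call = interpret_lsv(1.85)
lowered_call = interpret_lsv(1.85, species_cutoff=1.7)
```

With four deposits per sample, `sample_identified([1.2, 1.5, 1.8, 0.9])` returns `True` because one well reaches 1.7.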
In our study, we observed a high proportion of "flat-line" and low-quality spectra. These can be explained by the heterogeneity of the cercariae deposited on the MALDI-TOF target. In our experience, four deposits per sample is a good compromise between deposit and acquisition time, and generally enables identification of the sample with an LSV > 1.7 on at least one well.
We did not note any influence of the species of emitting mollusc. This allowed us to confirm the circulation of Alaria alata in Planorbis as well as in Anisus. We also found the same species, Australapatemon sp., in two different locations in two different species of snails. There was also no spectral difference in the same species at different times of study, or between two emissions of the same cercarial species by the same mollusc. These results appear to show that the signal measured by MALDI-TOF mass spectrometry is specific to the studied cercariae and not an artefact of the mollusc or the living environment. MALDI-TOF MS therefore seems to be a reproducible method for cercariae identification. In this study, we observed emission of only a single type of cercariae by each positive snail. Co-infection with two trematodes in the same snail is rarely observed in natural conditions and usually concerns two morphotypes of associated cercariae (e.g., forked tail/Echinostomatidae; Echinostomatidae/xiphidiocercariae; furcocercariae with eye spots/xiphidiocercariae), as shown in experimental conditions on competitive antagonism [12,25]. Even though two cercariae can be emitted at the same time by a single snail, no cases of associations with the same morphotype of cercariae have been reported.
Fixation of cercariae and their storage in ethanol leads to degradation of spectral intensity, resulting in a high proportion of unidentified spectra. This raises a problem for the retrospective study of collections stored in ethanol. The study of other storage methods for the biological material, such as freezing at different temperatures and other fixatives, seems important for the development of this technique.
For this study, we built an MSP database with a limited number of Trematoda species. It must be improved by inclusion of new species to cover the broad range of Trematoda involved in veterinary or human medicine. The database would also be improved by increasing the number of strains for a given taxon [30].
Our study highlights the huge potential of MALDI-TOF for large epidemiological surveys of Trematoda.
This technique could thus be applied to the study of human schistosomiasis, including the detection of hybrids [3,8,24], allowing for rapid and precise identification of the cercariae obtained during large snail collection campaigns. It would be of particular interest in areas of mixed circulation. Another field of application is the environmental survey of flukes of interest in human and veterinary medicine.

Conclusion
MALDI-TOF MS is a promising technique for cercariae identification at the species level. It has great discriminatory power using a rapid and easy preparation protocol. The implementation of a spectral database, gathering a large number of species, is one of our objectives for use in routine identification.