Molecular characterization of Glaesserella parasuis strains circulating in North American swine production systems

Background Glaesserella parasuis is the causative agent of Glässer’s disease in pigs. Serotyping is the most common method used to type G. parasuis isolates. However, the high number of non-typable (NT) isolates and low discriminatory power make serotyping problematic. In this study, 218 field clinical isolates and 15 G. parasuis reference strains were whole-genome sequenced (WGS). Multilocus sequence types (MLST), serotypes, core-genome phylogeny, antimicrobial resistance (AMR) genes, and putative virulence gene information were extracted. Results In silico WGS serotyping identified 11 of 15 serotypes. The most frequently detected serotypes were 7, 13, 4, and 2. MLST identified 72 sequence types (STs), of which 66 were novel. The predominant ST was ST454. Core-genome phylogeny depicted 3 primary lineages (LI, LII, and LIII), defined by the structure of the phylogenetic tree and the number of virulence genes, with LIIIA sublineage isolates lacking all vtaA genes. At least one group 1 vtaA virulence gene was observed in most isolates (97.2%), except for isolates of serotype 8 (ST299 and ST406), serotype 15 (ST408 and ST552) and NT (ST448). A few group 1 vtaA genes were significantly associated with certain serotypes or STs. The putative virulence gene lsgB was detected in 8.3% of the isolates, which were predominantly of serotype 5/12. While most isolates carried the bcr, ksgA, and bacA genes, the following antimicrobial resistance genes were detected at lower frequency: blaZ (6.9%), tetM (3.7%), spc (3.7%), tetB (2.8%), bla-ROB-1 (1.8%), ermA (1.8%), strA (1.4%), qnrB (0.5%), and aph3''Ia (0.5%). Conclusion This study showed that WGS can be used to type G. parasuis isolates and can be considered an alternative to the more labor-intensive traditional serotyping and standard MLST. Core-genome phylogeny provided the best strain discrimination. These findings will lead to a better understanding of the molecular epidemiology and virulence of G. parasuis that can be applied to the future development of diagnostic tools, autogenous vaccines, evaluation of antibiotic use, prevention, and disease control.
Supplementary Information The online version contains supplementary material available at 10.1186/s12917-023-03698-x.


Background
Glaesserella parasuis (G. parasuis) is a Gram-negative bacterium that causes Glässer's disease in pigs. This disease is characterized by arthritis, meningitis, and polyserositis, and is commonly observed in 4- to 8-week-old pigs, but it can also occur sporadically in older pigs [1]. Clinical signs are abdominal breathing or coughing, lameness, paddling, and septicemia with acute death [2,3]. While the agent was first described in 1910 [1], it continues to challenge the health and productivity of swine production systems today. In fact, over the last ten years, disease due to G. parasuis increased substantially in swine cases received at the Iowa State University Veterinary Diagnostic Laboratory (ISU VDL) [4].
The bacterium is considered endemic in all swine populations, colonizing the upper respiratory tract of pigs. Since the pathogenicity of G. parasuis strains varies considerably, from highly pathogenic strains, such as the well-described Nagasaki strain (serotype 5 ST24), to non-pathogenic strains, such as strains of serotypes 3, 6, 9, and 11 [5,6], effective control hinges upon the correct identification and typing of the disease-causing strains. Serotyping has been the most commonly used typing method, with 15 serotypes based on capsular polysaccharides described to date [5,7]. The major pitfall of serotyping is its low discriminatory power and typeability. Multilocus sequence typing (MLST) [8,9] is an alternative typing technique. Although MLST provides better strain discrimination and portability compared to serotyping, it is labor-intensive and time-consuming, hindering adoption. A plethora of other G. parasuis genotyping methods have been described previously, for instance pulsed-field gel electrophoresis (PFGE), amplified fragment length polymorphism (AFLP), multilocus variable number of tandem repeat analysis (MLVA), random amplified polymorphic DNA (RAPD), and enterobacterial repetitive intergenic consensus PCR (ERIC-PCR) [10-15]. Some of these methods have been compared previously [16]. Although most of these methods have relatively good discriminatory power, the results are hard to compare between laboratories, interpretation of the banding pattern can be ambiguous, and the methods do not in general provide information on the virulence potential of a given strain.
Pathotyping is also available for G. parasuis. The most important virulence factors described for G. parasuis are the virulence-associated trimeric autotransporter (vtaA) genes [6]. There is increasing evidence that group 1 vtaA genes are associated with virulent isolates [17,18]. While other putative virulence-associated genes [19-22] have been described, these genes are not frequently targeted in diagnostic settings to assess the virulence potential of strains.
The reduction in sequencing costs and advances in bioinformatic pipelines have made whole genome sequencing (WGS) more feasible for diagnostic use. WGS offers higher discrimination between isolates by allowing the extraction of relevant information related to phylogeny, sequence type, putative virulence factors, and antimicrobial resistance (AMR) genes. Current knowledge about the molecular epidemiology of disease-associated G. parasuis strains circulating in North American pig populations is lacking. The objective of this study was to perform genetic characterization of G. parasuis strains from disease-associated cases using WGS. In-depth characterization using WGS could vastly improve on-farm control and prevention programs.

Phylogenetic analysis
At least 47,061,190 of the paired-end reads processed had a Phred score of 30, suggestive of high quality. The overall GC content was between 39.06% and 40.26%. A total of 71,368 core genome SNPs were identified from all strains by kSNP3. The phylogenetic tree based on the core-genome SNPs indicated high genetic diversity among these isolates, which clustered into three primary lineages (LI, LII, LIII) (threshold = 0.1) based on the structure of the phylogenetic tree and the number of virulence genes (Fig. 1). The largest primary lineage was LIII, containing 49.5% (n = 108) of the study isolates. LI and LII contained 13.8% (n = 30) and 36.2% (n = 79) of the isolates, respectively. Each lineage contained sublineages, with only the LIIIA sublineage highlighted due to its unique characteristics (Fig. 2).

Discussion
G. parasuis continues to challenge the productivity, health, and well-being of post-weaning pig populations, and represents a driver of antimicrobial use on farms [1]. Effective control of G. parasuis requires a multifaceted approach based on the judicious use of antimicrobials, vaccination, minimizing viral co-infections and environmental triggers, improving husbandry, and characterization of the disease-causing strains. Whole-genome sequencing analysis provides multiple layers of information that aid in determining the clinical relevance of the isolates and providing epidemiological insight.
In this study, WGS was performed on 218 G. parasuis isolates predominantly from North America. The genome size and GC content were within the range of other published G. parasuis genomes [23]. Phylogenetic analysis revealed three primary lineages, LI, LII and LIII. This is in contrast with a recently published study from China [23] that showed G. parasuis isolates clustering into two primary lineages comprised of STs not found in the present study, highlighting the high genetic variation found within the species.
Whole-genome in silico serotyping revealed 11 of the 15 known serotypes, with serotypes 7, 13, 4, 2, and 5/12 as the most common serotypes detected. The WGS in silico serotyping scheme was based on the DNA sequence fragment for each serotype published in Howell et al. [24]. The scheme is unable to discriminate between serotypes 5 and 12. However, future comparative genomic analysis studies using complete genomes could be used to determine whether these two serotypes can be discriminated. In this study, only 6.4% of the isolates were untypable via in silico serotyping. This percentage was lower than the 39% and 9.7% reported in studies that used traditional and PCR-based serotyping methods [10,25], respectively, emphasizing the potential advantage of sequence-based methods for bacterial typing. Furthermore, a recently developed real-time PCR serotyping scheme was able to assign a serotype to all 40 isolates that were previously untypable by conventional serotyping [26].
The most frequently detected serotypes observed in this study have been previously associated with disease and are commonly detected from clinical cases [10,25,27,28]. However, in most studies serotypes 4 and 5/12 were the most frequently detected [10,29-31]. This study further supports the increasing evidence for a role of serotype 7 in disease [25,31]. In a recent study, serotype 7 strains from clinical cases were frequently detected in the USA, Canada, Europe, China, and Vietnam, and some were associated with virulence [25]. Its widespread nature could be attributed to trade between these countries or companies moving pigs between them.
Multilocus sequence typing identified 72 sequence types within 218 isolates, of which 91.7% (66/72) were novel. Specifically, this study highlighted the emergence of ST454, a strain detected in outbreaks of high mortality, characterized by peracute disease and polyserositis. The eBURST analysis showed that ST454 belonged to CC157, with ST157 as its predicted founder, a sequence type first identified in Brazil from a pig with bronchopneumonia (https://pubmlst.org/organisms/glaesserella-parasuis). Surprisingly, ST157 was among the least prevalent STs in this study. In contrast, a clonal complex's ancestral ST (founder) is typically the most prevalent ST in a population due to a fitness advantage or random genetic drift [32]. It is likely that as the number of isolates in the database increases, the founder ST might change to ST454.
The eBURST analysis revealed that almost half of the isolates had singleton STs, suggesting high heterogeneity and instability in the population structure, as previously described [11,17,33]. The ST diversity observed in this study has also been reported in other studies: Olvera et al. identified 122 STs within 150 strains [17], and Turni et al. identified 54 STs (41 novel) within 75 isolates from Australia [11]. In this study, only 5 of the 72 STs identified had been previously reported, further highlighting the significant heterogeneity of the species on a global scale. While a high diversity of STs was observed, only a few STs, including ST454, ST478, and ST6, were frequently detected in the current study. The high detection frequency of these STs could reflect the particular production systems submitting samples to the diagnostic laboratory rather than the true overall prevalence. Still, the data show the clear role of these STs in disease cases in North America, as well as their distribution across different countries and flows. For example, ST454 was detected in 24 distinct flows and 5 countries. Targeted control and elimination procedures, such as inclusion of these STs in autogenous vaccine products or depopulation of herds carrying such STs, could potentially reduce the G. parasuis disease burden for a significant number of production systems. While the intensity of sampling varied across production systems, multiple flows had at least 2 STs (Additional file 2), supporting previous knowledge that multiple G. parasuis strains can be detected within a production system [34] and could be contributing to disease. Therefore, disease control should rely on the characterization of the clinically relevant strains within a flow to determine the appropriate vaccine candidates for autogenous vaccine production or to evaluate the potential efficacy of commercially available vaccines, as well as to improve the sourcing of replacement animals and pig flow management. In this study, while multiple serotypes and STs were detected within most flows, several flows were represented by a predominant ST or serotype (Additional file 1), particularly genotypes with apparently higher pathogenicity, such as ST454.
Unlike serotypes, isolates of the same ST were mostly lineage specific, as revealed in the WGS-based phylogenetic analysis (Fig. 2). Furthermore, isolates of the same serotype had multiple different STs, but the opposite was observed less frequently, highlighting the higher discriminatory power of MLST compared to serotyping. In addition, core-genome SNP phylogenetic analysis showed evidence of genetic variation in isolates within the same ST, e.g. ST6 and ST454 (Fig. 2), demonstrating the higher resolution offered by whole-genome sequencing compared to MLST and serotyping. Still, whether these genomic differences between strains of the same ST impact the clinical outcome is yet to be elucidated.
Among the virulence-associated genes, group 1 vtaA genes are good predictors of virulence, and some have been shown experimentally to play a role in disease [17,18,35,36]. Approximately 97% of isolates from this study carried at least 1 group 1 vtaA gene. Based on previous research [25,36], this suggests that these strains could have pathogenic potential. Phylogenetic SNP analysis revealed three primary lineages with varying average numbers of group 1 vtaA genes. LI and LII isolates carried a numerically higher number of group 1 vtaA genes on average than LIII, whereas isolates from the LIIIA sublineage (Fig. 2) lacked all the vtaA genes. It is not entirely known if there is a correlation between the number of group 1 vtaA genes and increased virulence or if certain patterns of vtaA gene carriage can be a predictor of virulence. However, we observed that some group 1 vtaA genes were significantly associated with serotypes known to be associated with disease, for example 5/12 or 7, suggesting that these could serve as predictors of virulence. Isolates of serotypes 8 and 15 in LIIIA lacked all vtaA genes, and these serotypes have been predominantly detected from the nasal cavity of pigs, confirming previous findings [25]. Variability in carriage of group 1 vtaA genes was noticed among isolates of the same ST or isolates that were genetically similar based on SNP phylogenetic analysis (Fig. 2). For instance, group 1 vtaA gene carriage for 39 serotype 7 ST454 isolates ranged from 4 to 9. Similarly, all CC157 isolates (n = 54) carried between 4 and 9 group 1 vtaA genes. This shows how variable vtaA carriage can be even between strains showing high similarity at the core-genome level. The lack of detection of group 3 vtaA genes 12 and 13 in 44% and 33% of isolates, respectively, was an unexpected finding, since previous studies identified these genes in all G. parasuis strains examined, although this was based on a PCR targeting the translocator domain [17]. Some vtaA12 and vtaA13 genes might be divergent and therefore not identified as vtaA12 or vtaA13 homologs. In this study, genes were only reported if their sequence identity was above 70% and the gene coverage was above 50%. Previous studies have shown divergence within these genes, which could have contributed to a lack of detection given our stringent definition for gene reporting (V. Aragon, unpublished data).
Monomeric autotransporters, porin proteins, and fimbria are involved in G. parasuis pathogenesis [21,37,38]. At least 51% of the isolates were positive for bmaA4, bmaA5, bmaA6, ompP2, ompP5, pilA, pilB, pilC and siaB in the current study. In fact, even isolates that were negative for all vtaA genes were positive for some of these, such as ompP5 and siaB. This may suggest that some of these putative virulence genes are conserved in both virulent and nonvirulent strains and may not be good predictors of virulence compared to group 1 vtaA genes. However, this notion should be interpreted with caution since the study isolates were predominantly from clinical cases. lsgB, which has been associated with virulent strains [20], was detected in only 8.3% of the isolates, which were predominantly of serotype 5/12, suggesting a potential marker for this serotype.
Control and prevention of Glässer's disease often rely on the strategic use of antimicrobials. Thus, detection of antimicrobial resistance genes could provide value in antimicrobial use decision-making. Among the genes, bcr (bicyclomycin resistance), ksgA (kasugamycin resistance), bacA (bacitracin resistance), sul2 (sulfonamide resistance), and aph(3'')-Ib (streptomycin resistance) were detected in almost 100% of the isolates. Consistent with this study, bcr, ksgA, and bacA genes were highly prevalent in G. parasuis isolates in a recent study [23]. Some of these highly prevalent genes encode resistance to antibiotics not commonly used in swine production. A recent study showed that tetracyclines, lincosamides, and beta-lactams are among the top three antibiotic classes used in the USA [39]. Furthermore, susceptibility data from Iowa State University show that most 2021 G. parasuis strains (n = 908) were susceptible to ampicillin, ceftiofur, enrofloxacin, florfenicol, tiamulin and tilmicosin (https://vetmed.iastate.edu/sites/default/files/VDL/pdf/Susceptibility-Summary-Porcine-2021.pdf). The remaining genes (aph3''Ia, strA, bla-ROB-1, blaZ, spc, ermA, tetB, tetM and qnrB) were detected at lower frequencies. Some studies showed varying frequencies of these genes in G. parasuis [40,41]. Due to the low prevalence of some of these AMR genes, a definitive conclusion on their distribution across phylogenetic lineages could not be reached. However, they were detected more frequently in LIII than in LI and LII isolates (Table 1). For instance, 73.3% (11/15) and 75% (6/8) of the isolates positive for blaZ and tetM, respectively, were in LIII, and were of serotypes 1 ST417 and ST420, 2 ST476, 4 ST418, 7 ST471 and ST473, 13 ST415 and ST421, and NT ST415. Further studies are needed to determine whether the genotypic antimicrobial resistance profile can predict phenotypic resistance in this species.
Although the findings of this study were consistent with the published literature to date, caution is needed when interpreting the results. This is because the G. parasuis isolates included in the current study were from disease diagnosis cases submitted to the ISU VDL, and thus only captured production systems that submit samples to this laboratory. However, the contributing production systems represented the top pork producers in North America.

Conclusion
While WGS is computationally demanding, requires high technical skill, and is currently more expensive than other typing tools, the value of the information obtained far outweighs these limitations. Still, it is expected that serotyping and virulence marker PCRs will continue to be used to monitor G. parasuis strain variation. Results herein will lead to a better understanding of the molecular epidemiology and virulence potential of G. parasuis that can be applied to the development of improved diagnostic tools, evaluation of antibiotic use, tracking of outbreak strains and identification of vaccine candidates for improved management of Glässer's disease in swine production systems.

Source of isolates
A total of 218 G. parasuis isolates were obtained from porcine cases submitted to the Iowa State University Veterinary Diagnostic Laboratory (ISU VDL) from 2015 through 2022 (Additional file 3). While strains mostly originated from the USA (n = 154), they were also obtained from Canada (n = 20), Chile (n = 9), Denmark (n = 4), Peru (n = 6), and Mexico (n = 25). For each isolate, metadata were obtained, including the year of isolation, tissue type, flow and farm name, and pig age. Strains from the USA originated from at least 16 different states. Histopathological data were obtained for 80/218 isolates (Additional file 3).
In addition, a total of 15 reference strains were provided by Dr. Marcelo Gottschalk, each representing one of the 15 G. parasuis serotypes [5]. All presumptive G. parasuis strains isolated at the ISU VDL or submitted as pure cultures from VDL clients were subjected to matrix-assisted laser desorption ionization-time-of-flight mass spectrometry (MALDI-TOF MS) for definitive species identification prior to sequencing.

Whole genome sequencing
Initial isolation of G. parasuis from samples was achieved by plating clinical samples onto 5% sheep blood agar (Hardy Diagnostics, Santa Maria, CA) with a Staphylococcus hyicus nurse streak and incubating overnight at 37 °C. Presumptive G. parasuis isolates were subcultured onto chocolate agar (Thermo Scientific™), confirmed using MALDI-TOF MS, and frozen at -80 °C in brain heart infusion broth with 30% glycerol. Two days before submission for sequencing, bacterial strains were retrieved from the freezer and streaked onto chocolate agar. An overnight pure growth culture lawn was inoculated into 2 mL phosphate-buffered saline (PBS). Genomic DNA was extracted using the ChargeSwitch gDNA mini bacterial kit (Life Technologies, Carlsbad, CA) according to the manufacturer's instructions [42].
The DNA quality (A260/A280) was assessed using a Nanodrop (Thermo Scientific™) and quantified using a Qubit fluorometer dsDNA HS kit (Life Technologies). Multiplex genome libraries were prepared using the Nextera XT DNA library preparation kit (Illumina, San Diego, CA). The genomic library was quantified using a Qubit fluorometer dsDNA HS kit (Life Technologies, Carlsbad, CA) and normalized to the recommended amplification concentrations. The pooled libraries were sequenced on an Illumina MiSeq sequencer using MiSeq Reagent V3 for 600 cycles (Illumina, San Diego, CA). Raw reads were demultiplexed automatically on the MiSeq.
Pre-processed reads were assembled using SPAdes Genome Assembler version 3.11.1-Linux [44] using assembly options for paired-end reads and Burrows-Wheeler Aligner mismatch correction. Small contigs (< 500 nt) and contigs with low average k-mer coverage (< 2) were removed from the SPAdes assembly results before further analysis, and custom scripts were used to determine the N50, longest contig, and total length of the contigs.
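The contig filtering and assembly summary described above can be sketched as follows. This is a minimal illustration and not the custom scripts used in the study: the function names, the dictionary fields (`length`, `coverage`), and the example contigs are all hypothetical, and only the two cut-offs (500 nt and an average k-mer coverage of 2) come from the text.

```python
def filter_contigs(contigs, min_length=500, min_coverage=2.0):
    """Drop contigs shorter than min_length nt or below min_coverage average k-mer coverage."""
    return [
        c for c in contigs
        if c["length"] >= min_length and c["coverage"] >= min_coverage
    ]


def n50(lengths):
    """Return the contig length at which the running sum of the
    length-sorted contigs first reaches half of the total assembly length."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0  # empty assembly


# Hypothetical contigs, for illustration only.
contigs = [
    {"name": "c1", "length": 1000, "coverage": 30.0},
    {"name": "c2", "length": 800, "coverage": 25.0},
    {"name": "c3", "length": 600, "coverage": 1.5},   # removed: low coverage
    {"name": "c4", "length": 400, "coverage": 40.0},  # removed: too short
    {"name": "c5", "length": 500, "coverage": 2.0},   # kept: exactly at both cut-offs
]

kept = filter_contigs(contigs)
lengths = [c["length"] for c in kept]
summary = {"n50": n50(lengths), "longest": max(lengths), "total": sum(lengths)}
print(summary)
```

In practice the length and coverage of each contig would be parsed from the assembler output rather than typed in by hand.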

Bioinformatic analysis
Virulence-associated genes in each genome were detected using SRST2 [45] against a custom virulence gene database using raw reads. Blastn was used to detect highly divergent virulence-associated genes. Antimicrobial resistance genes in each isolate were detected using SRST2 [45] with default parameters (--min_coverage 90, --min_depth 5) by mapping reads from each isolate against the acquired antimicrobial resistance gene database (/data/ARGannot_r3.fasta) provided by SRST2. In addition, the assembled contigs were searched with BLAST against the same antimicrobial resistance gene database and the virulence gene database, and candidate hits with identity above 70% and gene coverage above 50% were also reported and merged with the SRST2 results. Serotype was determined by detecting serotype-specific capsule loci [24]. Single nucleotide polymorphisms (SNPs) of G. parasuis isolates were identified by running kSNP3 in standard mode. The optimal k-mer size was calculated by the Kchooser program, and the whole-genome phylogeny was analyzed based on the identified core genome SNPs [46].
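The reporting rule for assembly-based hits (identity above 70% and gene coverage above 50%, merged with the read-based calls) can be illustrated with a short sketch. The data structures, field names, and example values below are hypothetical; the sketch assumes the identity and coverage percentages have already been parsed from the BLAST output and does not reproduce the study's actual pipeline.

```python
def passes_thresholds(hit, min_identity=70.0, min_coverage=50.0):
    """True if a contig-level hit exceeds both reporting thresholds (percentages)."""
    return hit["identity"] > min_identity and hit["coverage"] > min_coverage


def merge_calls(read_based_genes, blast_hits):
    """Union of read-based gene calls and contig hits that pass the thresholds."""
    merged = set(read_based_genes)
    merged.update(hit["gene"] for hit in blast_hits if passes_thresholds(hit))
    return sorted(merged)


# Hypothetical example for one isolate.
read_based_genes = {"bcr", "ksgA"}
blast_hits = [
    {"gene": "bacA", "identity": 98.2, "coverage": 100.0},  # reported
    {"gene": "tetB", "identity": 65.0, "coverage": 90.0},   # identity too low
    {"gene": "blaZ", "identity": 85.0, "coverage": 40.0},   # coverage too low
    {"gene": "ksgA", "identity": 99.0, "coverage": 100.0},  # already called from reads
]

calls = merge_calls(read_based_genes, blast_hits)
print(calls)
```

Taking the union means a gene is reported if either approach detects it, which is one plausible reading of "merged"; a stricter pipeline could instead require agreement between the two.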
Multilocus sequence typing was performed using SRST2 with an integrated MLST database and definitions. Sequence types (STs) were further confirmed by querying the assembled fasta files using the G. parasuis PubMLST typing platform [9,47]. Assembled fasta files of isolates with novel alleles or allelic profiles were submitted to the PubMLST database. Novel alleles and allelic profiles were extracted from the assembled files and assigned a number using the G. parasuis typing database. Novel STs were added to the PubMLST G. parasuis database with their corresponding metadata. Determination of clonal complexes was done using the goeBURST algorithm [48], a refinement of eBURST [32], implemented in PHYLOViZ (http://www.phyloviz.net), using the stringent criterion of six of seven alleles identical with the founder ST [49]. The global PubMLST G. parasuis dataset and the study isolates were subjected to the goeBURST analysis to determine the study isolates' evolutionary descent and clonal complexes. A clonal complex (CC) was defined if it contained at least three STs, including the founder ST. Sequence types that did not meet this criterion were considered singleton STs.
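The grouping rule described above can be approximated with a simplified sketch. This is not the goeBURST implementation in PHYLOViZ and omits its tie-break rules: it assumes STs are linked when they share at least six of seven alleles, takes connected groups of at least three STs as clonal complexes, and picks as a stand-in founder the ST with the most such links. The ST labels and allelic profiles are hypothetical.

```python
from itertools import combinations


def shared_alleles(profile_a, profile_b):
    """Number of loci at which two allelic profiles carry the same allele."""
    return sum(a == b for a, b in zip(profile_a, profile_b))


def group_sts(profiles, min_shared=6, min_group_size=3):
    """Group STs linked by >= min_shared identical alleles.

    Returns (complexes, singletons); groups smaller than min_group_size
    are treated as singleton STs.
    """
    neighbours = {st: set() for st in profiles}
    for a, b in combinations(profiles, 2):
        if shared_alleles(profiles[a], profiles[b]) >= min_shared:
            neighbours[a].add(b)
            neighbours[b].add(a)

    complexes, singletons, seen = [], [], set()
    for st in profiles:
        if st in seen:
            continue
        # Depth-first search to collect the connected group containing st.
        stack, component = [st], set()
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(neighbours[current] - component)
        seen |= component
        if len(component) >= min_group_size:
            # Stand-in founder: the ST with the most single-locus variants.
            founder = max(sorted(component), key=lambda s: len(neighbours[s]))
            complexes.append({"founder": founder, "members": sorted(component)})
        else:
            singletons.extend(component)
    return complexes, sorted(singletons)


# Hypothetical seven-locus profiles, for illustration only.
profiles = {
    "A": (1, 1, 1, 1, 1, 1, 1),
    "B": (1, 1, 1, 1, 1, 1, 2),  # differs from A at one locus
    "C": (1, 1, 1, 1, 1, 2, 1),  # differs from A at one locus
    "D": (5, 5, 5, 5, 5, 5, 5),
    "E": (5, 5, 5, 5, 5, 5, 6),  # linked to D, but the group has only two STs
}

complexes, singletons = group_sts(profiles)
print(complexes, singletons)
```

Here A, B and C form one complex with A as the predicted founder, while D and E are reported as singletons because their group contains fewer than three STs.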

Data analysis
Data management, descriptive, and inferential analyses of G. parasuis serotypes, STs, AMR, and virulence genes were performed in Microsoft Excel® and R (version 4.0.0, R Core Team 2020).
Simpson's index of diversity [52] was used to measure the diversity of genetic type distributions, which included serotype and ST. The calculations followed the formula $D = 1 - \frac{\sum n(n-1)}{N(N-1)}$, where N was the total number of isolates, and n was the total number of isolates per serotype or ST.
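As a worked illustration of the diversity calculation, the sketch below assumes the common finite-sample form of Simpson's index of diversity, D = 1 - Σ n(n-1) / (N(N-1)), with N the total number of isolates and n the number of isolates per type. The function name and the example labels are hypothetical and are not study data.

```python
from collections import Counter


def simpsons_diversity(labels):
    """Simpson's index of diversity, D = 1 - sum(n(n-1)) / (N(N-1)).

    labels: one type label (e.g. serotype or ST) per isolate.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    if total < 2:
        raise ValueError("at least two isolates are required")
    return 1 - sum(n * (n - 1) for n in counts.values()) / (total * (total - 1))


# Hypothetical labels: three isolates of one type, two of another, one of a third.
labels = ["7", "7", "7", "13", "13", "4"]
d = simpsons_diversity(labels)
print(round(d, 4))
```

For these six labels, Σ n(n-1) = 6 + 2 + 0 = 8 and N(N-1) = 30, so D = 1 - 8/30 ≈ 0.7333. Values near 1 indicate that two randomly chosen isolates are very likely to belong to different types.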
Tissues of isolation, when reported, were classified as lung, upper respiratory tract (URT), systemic tissues or unknown. Likewise, reported lesions were categorized by affected body system, including nervous, respiratory, systemic, and unknown. Associations between serotype/ST, lineages, vtaA genes (1-9) versus tissue of isolation, and ST/serotypes versus lineages and vtaA genes (1-9) were evaluated using logistic regression. To account for the statistical model variation of logistic regression, only serotypes and STs containing at least ten isolates, representing at least 10% of the isolates, were selected for association analyses, which included serotypes 7, 13, 4, 5/12, 2, and NT, and STs 6, 454 and 478. Using Poisson regression, the total number of group 1 vtaA genes (1-9) was compared across serotypes (or STs).
This study was funded in part by PIC North America.

Availability of data and materials
The genome assembly data and raw read sequences were deposited in the National Center for Biotechnology Information (NCBI) and Sequence Read Archive (SRA), respectively, under the BioProject number PRJNA749326. Other data are available in the additional files.

Fig. 3 Distribution of sequence types (STs) and serotypes of Glaesserella parasuis (G. parasuis) isolates. The colour codes represent the sequence type diversity within a given serotype. Frequently detected STs ST454, ST478, and ST6 are highlighted

Fig. 4 Number of Group 1 vtaA genes by ST. Black line represents the range of Group 1 vtaA genes, with the blue dot identifying the median
