Identifying associations between pig pathologies using a multi-dimensional machine learning methodology

Sanchez-Vazquez, Manuel J; Nielen, Mirjam; Edwards, Sandra A; Gunn, George J; Lewis, Fraser I

doi:10.1186/1746-6148-8-151

Published: 31 August 2012


BMC Veterinary Research volume 8, Article number: 151 (2012)


Abstract

Background

Abattoir-detected pathologies are of crucial importance to both pig production and food safety. Usually, more than one pathology coexists in a pig herd, although it often remains unknown how these different pathologies interrelate. Identification of the associations between different pathologies may facilitate an improved understanding of their underlying biological linkage, and support veterinarians in encouraging control strategies aimed at reducing the prevalence of not just one, but two or more conditions simultaneously.

Results

Multi-dimensional machine learning methodology was used to identify associations between ten typical pathologies in 6485 batches of slaughtered finishing pigs, assisting the comprehension of their biological association. Pathologies potentially associated with septicaemia (e.g. pericarditis, peritonitis) appear interrelated, suggesting on-going bacterial challenges by pathogens such as Haemophilus parasuis and Streptococcus suis. Furthermore, hepatic scarring appears interrelated with both milk spot livers (Ascaris suum) and bacteria-related pathologies, suggesting a potential multi-pathogen nature for this pathology.

Conclusions

The application of novel multi-dimensional machine learning methodology provided new insights into how typical pig pathologies are potentially interrelated at batch level. The methodology presented is a powerful exploratory tool to generate hypotheses, applicable to a wide range of studies in veterinary research.

Background

Abattoir post-mortem inspection offers good opportunities for pig health monitoring [1] and it has been widely used as a data source for epidemiology-based analyses. Most of these studies focus on the identification of risk factors influencing the presence of the major abattoir pathologies: pneumonia, pleurisy and milk spot liver [2–9]. Few reports investigate how the different pathologies are interrelated [1, 10, 11]. Identification of the associations between pathologies may assist in elucidating theories on their biological connection and could greatly contribute to facilitating their control – for example by encouraging veterinarians to establish intervention strategies aimed at reducing the prevalence of not just one, but two or more conditions simultaneously. Knowledge of associations between lesions could also be employed to inform official abattoir inspection systems, in which the presence of one pathology could trigger an inspection for others.

Official routine meat inspections are implemented world-wide with the main objective of ensuring food safety. This system, however, is imperfect and is particularly lacking in sensitivity [12, 13]. Pig health schemes were proposed to provide an integrated system to capture abattoir information based on more detailed post-mortem inspection [14] which is considered to improve classification characteristics, particularly sensitivity [12]. Good examples of these initiatives in Europe are the British pig health schemes. On a regular basis, swine specialists carry out detailed post-mortem examinations in parallel to the official food-safety routine meat inspections. These schemes monitor the presence of various pathologies detected by means of a detailed inspection of the pluck and the skin of the slaughtered pig. These pathologies are normally associated with a reduction in performance traits or are potential indicators of the presence of welfare problems in the herds [10, 15–17].

Graphical modelling has been increasingly used in veterinary epidemiology to investigate and express the relationships between factors influencing diseased/unproductive status in livestock [18–23]. Frequently, studies utilising graphical models are based on structure discovery approaches, which are data-driven multivariate methodologies resulting in graphical outputs such as networks or path/chain models. Structure discovery has been employed to explore how mastitis and fertility management influence production in dairy herds [18]; to identify changes in pig behaviour related to early piglet mortality [19]; to investigate the most likely pathogens involved in clinical mastitis in dairy cows [20]; and to identify those farm risk factors associated with bovine viral diarrhoea [23]. Besides these examples, other studies employed graphical models informed by existing/expert knowledge to describe risk factors influencing the prevalence of Mycoplasma hyopneumoniae [21] and to estimate the risk of leg disorders in finishing pigs [22]. A crucial distinction among the abovementioned papers is that these two latter studies [21, 22] did not use structure discovery to inform the structure of the network, but were instead based on published knowledge and expert opinion. The latter is highly subjective and if, as in this study, extensive data are available, then extracting the co-dependence network structure from observed data provides objective and robust empirical analyses.

Multi-dimensional machine learning methodology (also known as Bayesian graphical modelling) is a variety of graphical modelling structure discovery techniques used to identify the dependency structure that encodes the joint probability distribution between variables [24, 25], allowing for both visualization and estimation of associations. In short, this process consists of a series of model searches to identify the multi-dimensional model that best explains the data, using Bayes factors to compare between models [23]. This approach allows estimation of the associations between variables and distinguishes between direct and indirect dependence [25] (dependence being equivalent to biological association), helping to generate hypotheses about the nature of the interrelationships. Multi-dimensional machine learning methodology offers an intuitively appealing and technically elegant way to investigate multiple associations between variables compared to more conventional multivariate statistical approaches (e.g. principal component and factor analyses). This methodology is used extensively in fields such as bioinformatics and genetics [26–28] and has only recently been applied in the veterinary field [23].

This paper uses a multi-dimensional machine learning methodology to identify whether associations exist between the different pathologies reported by the British pig health schemes. The results of this study could assist veterinarians by supporting strategies that control several conditions at once. These results could also be utilised to review current pig abattoir inspection strategies, and inform more targeted risk-based inspections. Farmed pigs are normally considered as a grouped unit, where complex interactions take place between the environment, mainly determined by the housing system and the husbandry practices, and the pigs, characterised by their genetics, idiosyncratic behaviour and baseline health status [29]. For these reasons this study focuses on the interrelationship occurring between pathologies at batch level.

Methods

Data source

Abattoir data were accessed through the databases of the two pig abattoir lesion scoring health schemes which exist in Great Britain: Wholesome Pigs Scotland (WPS) (covering Scotland) and British Pig Health Scheme (BPHS) (covering England and Wales) [6]. The health schemes provide services in 17 pig abattoirs. Both schemes obtain a sample from each batch of pigs by assessing every second pig on the slaughter line. The scoring was carried out by swine veterinarians trained in this method of testing on the abattoir inspection line. The data were from a three year period (July 2005 to June 2008).

Dataset

For the purpose of this investigation, a batch is defined as a group of pigs from a single farm submitted to the abattoir on a particular date. A total of 6485 batches were included, submitted from 1138 farms, with a median of 4 batches assessed per farm (first quartile 2, third quartile 8). All the batches consisted of exactly 50 pigs assessed.

Scoring for the different pathologies

Ten pathologies reported by the health schemes are included in this study: enzootic pneumonia-like lesions, pleurisy (pleuritic lesions), milk spots, hepatic scarring, pericarditis, peritonitis, (lung) abscess, pyaemia (pyaemic lung lesions), tail damage and papular dermatitis. The gross pathology description, the most typical associated cause and the scoring system for each condition are presented in Table 1. In this study, a positive case for each pathology was defined as a pig affected with any degree of lesion, and a negative case as a pig in which lesions were absent.

Table 1 Summary of the gross pathology description of conditions studied with their most typical cause and the scoring system


Consistency in the scoring of the pathologies

Both health schemes carried out exercises to standardise the definition of each lesion across the inspectors. One WPS assessor was involved in the training of all the other inspectors that carried out WPS and BPHS assessments during the three year period included in this study. Once a year, all the inspectors underwent a refresher/training day where the same pigs and pathologies were assessed by all the assessors and feed-back was provided by the trainer. These assessment exercises aimed to maintain the consistency in the scoring criteria across assessors by identifying and correcting potential misclassifications. Furthermore, the schemes aimed to include at least two assessors per abattoir and to place each assessor in at least two different abattoirs, thereby minimising the potential of operator bias.

Definition of pathology batch-status variables

The machine learning approach utilised requires working with categorical variables. Batches were categorised into lesion present/absent using the frequency distribution of the batch prevalence for the different pathologies to determine data-derived cut-off points (further details are provided in Additional file 1: Figure S1 and Figure S2). In the context of this study, where all the batches have the same number of pigs inspected (i.e. 50), frequency and proportion are equivalent and the cut-offs are defined in terms of frequencies per batch. For enzootic pneumonia-like lesions, three categories were identified based on within-batch prevalence: EP high (more than 25 pigs affected with any degree of severity); EP moderate-low (between one and 25 pigs affected); and EP zero (no pigs affected). For pleurisy, three categories were also identified based on within-batch prevalence: PL high (more than seven pigs affected); PL moderate-low (between one and seven pigs affected); and PL zero (no pigs affected). The three prevalence level categories identified for enzootic pneumonia-like lesions and pleurisy were each separated into three binary variables (e.g. EP high [yes, no], and so on) to reflect the pathology batch-status. Splitting the prevalence level categories into three binary variables was chosen over creating a single multinomial variable to add flexibility in the modelling and facilitate the interpretation of the model outputs. For the other pathologies, which have a much lower prevalence (i.e. milk spots, hepatic scarring, pericarditis, peritonitis, papular dermatitis, tail damage, abscess and pyaemia), batches were considered positive if at least one pig was found affected, and negative otherwise. In summary, the ten different pathologies were studied through 14 binary variables reflecting the pathology batch-status. A data break-down of the frequencies for the pathology batch-status variables is presented by pairs in Table 2.
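The categorisation described above can be sketched in a few lines of Python. This is only an illustration of the stated cut-offs, not the code used in the study: the variable names, dictionary keys and the example counts are hypothetical.

```python
# Cut-offs quoted in the text (pigs affected out of 50 inspected per batch).
EP_HIGH_ABOVE = 25   # EP high: more than 25 pigs affected
PL_HIGH_ABOVE = 7    # PL high: more than seven pigs affected

# Hypothetical short names for the eight lower-prevalence pathologies.
LOW_PREVALENCE = ["milk_spots", "hepatic_scarring", "pericarditis", "peritonitis",
                  "papular_dermatitis", "tail_damage", "abscess", "pyaemia"]


def three_level_status(prefix, affected, high_above):
    """Split a within-batch count into three mutually exclusive binary variables."""
    return {
        f"{prefix}_high": affected > high_above,
        f"{prefix}_moderate_low": 1 <= affected <= high_above,
        f"{prefix}_zero": affected == 0,
    }


def batch_status(counts):
    """Map per-pathology counts for one batch to the 14 binary status variables."""
    status = {}
    status.update(three_level_status("EP", counts["enzootic_pneumonia"], EP_HIGH_ABOVE))
    status.update(three_level_status("PL", counts["pleurisy"], PL_HIGH_ABOVE))
    for name in LOW_PREVALENCE:
        status[name] = counts[name] >= 1   # positive if at least one pig affected
    return status


# Made-up counts for a single batch, for illustration only.
example = {"enzootic_pneumonia": 30, "pleurisy": 3, "milk_spots": 2,
           "hepatic_scarring": 0, "pericarditis": 1, "peritonitis": 0,
           "papular_dermatitis": 0, "tail_damage": 0, "abscess": 0, "pyaemia": 0}
status = batch_status(example)
print(len(status), status["EP_high"], status["PL_moderate_low"])
```

Exactly one of the three EP variables, and one of the three PL variables, is true for any batch, which is why arcs between them carry no information.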

Table 2 The break-down of the frequencies of the variables expressing batch-status for the different pathologies studied, by pairs, N = 6485 batches of slaughtered pigs


Multi-dimensional machine learning methodology

The process explained below aims to identify an optimal multi-dimensional model, i.e. a graphical model displayed as a network of connections, where each connection (arc) describes a statistically significant association between the different lesions in the data. Figure 1 schematically represents the machine learning structure discovery process utilised, which is initiated with numerous series of searches followed by steps to summarise the results of each search. This methodology consists of fitting models which are network structures technically referred to as directed acyclic graphs (directed graphs with no cycles), in which nodes correspond to the pathology batch-status variables and arcs between nodes (represented by arrows) indicate that a direct probabilistic dependency (i.e. an association) exists between nodes.

Direction of the arrows

The direction of the arcs connecting nodes is informed by the data, reflecting the dependency structure which generated the data [26]. The direction only implies association and says nothing of causality. Arc direction is a result of the underlying mathematics used to construct the models (technically the graph denotes a factorisation of the joint probability distribution of the data). Models with particular arc directions may fit the data better than models with the reversed directions, and are therefore preferred. However, it would be incorrect to infer from this information alone that the biological dependence between two nodes is supported more in one direction than in the other; all that can be inferred is that an association exists between the nodes.
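The factorisation referred to here can be written out explicitly. The notation below is introduced for illustration and is not taken from the paper: $G$ is a directed acyclic graph over the batch-status variables $X_1, \dots, X_n$, and $\mathrm{Pa}_G(X_i)$ is the set of parents of $X_i$ in $G$.

```latex
% Joint distribution encoded by a directed acyclic graph G
P(X_1, \dots, X_n) \;=\; \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{Pa}_G(X_i)\bigr)

% Two-node case: both arc directions factorise the same joint distribution
P(A, B) \;=\; P(A)\, P(B \mid A) \;=\; P(B)\, P(A \mid B)
```

The two-node identity shows why an arc from A to B and an arc from B to A describe the same association: each is simply a different way of writing the same joint probability.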

Searching for locally optimal structures

The machine learning structure discovery process was performed through a series of local heuristic searches using a standard approach proposed by Heckerman et al. [24]. Locally optimal models are identified by random-restart local hill climbing searches, also known as a “greedy search” [26], which seek to maximise the goodness of fit metric (network score) for each model. This network score is given by the (log) marginal likelihood of the data given the model, which is equivalent to the Bayes factor when using an equal prior on each model structure. This search process can be thought of as roughly analogous to stepwise regression in linear modelling, but conducted in multiple dimensions and with the initial model from which the search commences chosen at random. The different batch-status categories for the same pathology are mutually exclusive, i.e. when one batch-status is present the others are not. Therefore arcs connecting the different batch-status categories for the same pathology (e.g. EP zero with EP high, or EP high with EP moderate-low, and so on) were banned from the search.
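A minimal sketch of one such random-restart hill-climbing search is given below. It is not the library used in the study. The score assumes a uniform Dirichlet (K2) prior on binary variables, which is one common choice of log marginal likelihood and may differ from the prior actually used; all function names, the toy data and the banned pair are hypothetical.

```python
import itertools
import math
import random
from collections import Counter


def node_score(data, child, parents):
    """Log marginal likelihood of one binary node given its parents (K2 prior)."""
    n_config = Counter()        # rows per parent configuration
    n_config_value = Counter()  # rows per (parent configuration, child value)
    for row in data:
        config = tuple(row[p] for p in parents)
        n_config[config] += 1
        n_config_value[config, row[child]] += 1
    score = sum(math.lgamma(2) - math.lgamma(n + 2) for n in n_config.values())
    return score + sum(math.lgamma(n + 1) for n in n_config_value.values())


def network_score(data, graph):
    """Sum of node scores; graph maps each node to the set of its parents."""
    return sum(node_score(data, node, sorted(ps)) for node, ps in graph.items())


def is_acyclic(graph):
    """Repeatedly strip nodes with no remaining parents; a DAG strips fully."""
    remaining = {node: set(ps) for node, ps in graph.items()}
    while remaining:
        roots = [node for node, ps in remaining.items() if not ps]
        if not roots:
            return False
        for node in roots:
            del remaining[node]
        for ps in remaining.values():
            ps.difference_update(roots)
    return True


def neighbours(graph, banned, max_parents):
    """Yield each DAG that differs by one arc addition, removal or reversal."""
    for a, b in itertools.permutations(graph, 2):
        if frozenset((a, b)) in banned:
            continue
        if a in graph[b]:                       # arc a -> b exists
            removed = {n: set(ps) for n, ps in graph.items()}
            removed[b].discard(a)
            yield removed
            flipped = {n: set(ps) for n, ps in removed.items()}
            flipped[a].add(b)                   # reverse to b -> a
            if len(flipped[a]) <= max_parents and is_acyclic(flipped):
                yield flipped
        elif b not in graph[a]:                 # no arc between a and b yet
            added = {n: set(ps) for n, ps in graph.items()}
            added[b].add(a)
            if len(added[b]) <= max_parents and is_acyclic(added):
                yield added


def hill_climb(data, n_vars, banned, max_parents, rng, perturbations=3):
    """One local search: random start, then steepest ascent on the score."""
    graph = {n: set() for n in range(n_vars)}
    for _ in range(perturbations):              # perturb the empty network
        graph = rng.choice(list(neighbours(graph, banned, max_parents)))
    best = network_score(data, graph)
    while True:
        scored = [(network_score(data, g), i, g)
                  for i, g in enumerate(neighbours(graph, banned, max_parents))]
        top = max(scored, default=None)
        if top is None or top[0] <= best + 1e-9:
            return graph, best                  # no neighbour improves the score
        best, _, graph = top


# Made-up toy data: variables 0 and 1 always agree, variable 2 is unrelated.
data = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)] * 10
banned = {frozenset((1, 2))}    # illustrative ban on any arc between 1 and 2
rng = random.Random(2012)
runs = [hill_climb(data, 3, banned, max_parents=2, rng=rng) for _ in range(20)]
best_graph, best_score = max(runs, key=lambda run: run[1])
arcs = sorted((p, child) for child, ps in best_graph.items() for p in ps)
print(arcs, round(best_score, 2))
```

In this toy example the only association is between variables 0 and 1, so each search should end with a single arc joining them. Its direction is arbitrary, in line with the remarks on arc direction above.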

Summarising the results from the local searches

Alternative and competing explanations of the data are produced during the local search process; different local searches may lead to different structural features (e.g. arcs) that appear in some networks but not in others [25]. A great deal of commonality across the search results is expected and strong features should be extracted reliably [25]. The aim is to produce an optimal structure that robustly represents the main associations. Three main ways are proposed to summarise the results from the local searches:

(1) The “overall best network” is the single structure with the best score (according to the Bayes factor) across all the searches. This structure identifies the potential pathways (composed of sets of arcs) of associations between variables. Some of these pathways may be weak, however, i.e. only identified for this particular network and may incur over-fitting; a common problem within structure discovery approaches [26].
(2) The “majority consensus network” is the structure that represents those common features present in the majority of the best-scored networks identified across all the heuristic searches. By using this, those associations (arcs) that were present in the majority (over 50%) of all the locally best networks were kept. This approach is typically employed in phylogenetic studies [30] and it has been suggested for structure discovery [23].
(3) The “pruned network” is the structure that combines the two approaches mentioned above to produce a more robust output. Only those arcs that were part of the overall best network and also recruited by the majority consensus network were kept. Lewis et al. [23] proposed this approach mimicking pruning performed in decision tree inferences, which is essential to reduce over-fitting [31].
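The three summaries can be illustrated with a short Python sketch. It assumes each local search returns a score (higher is better) and a set of arcs written as (parent, child) pairs. Whether arcs are compared with or without their direction is not stated in the text, so directed arcs are an assumption here, and the four search results are made up.

```python
from collections import Counter

# Made-up results of four local searches: (network score, set of arcs).
searches = [
    (-100.0, {("A", "B"), ("B", "C")}),
    (-98.5, {("A", "B"), ("C", "D")}),
    (-101.2, {("A", "B"), ("B", "C")}),
    (-99.0, {("A", "B"), ("B", "C"), ("A", "D")}),
]


def majority_consensus(networks, threshold=0.5):
    """Arcs present in more than `threshold` of the networks, with their support."""
    counts = Counter(arc for net in networks for arc in net)
    support = {arc: n / len(networks) for arc, n in counts.items()}
    return {arc: share for arc, share in support.items() if share > threshold}


def pruned_network(best_arcs, consensus):
    """Arcs that are in the overall best network and in the majority consensus."""
    return set(best_arcs) & set(consensus)


best_score, best_arcs = max(searches, key=lambda s: s[0])   # overall best network
consensus = majority_consensus([arcs for _, arcs in searches])
pruned = pruned_network(best_arcs, consensus)

print(sorted(consensus.items()))  # [(('A', 'B'), 1.0), (('B', 'C'), 0.75)]
print(sorted(pruned))             # [('A', 'B')]
```

In this toy case the best-scoring network contains an arc (C to D) found in only one search, and pruning discards it. The consensus arc from B to C is also dropped because it is absent from the best network.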

Identifying the final network

Out of the three structures described above, the pruned network is the model that provides the most robust and conservative approach and is therefore considered in this paper as the principal result. The strength of the association between two nodes (pathology batch-status variables) present in the pruned network was estimated by calculating the relative risk (RR), also known as the risk ratio [32]. RR is calculated as the proportion of batches affected with condition A given that condition B is present in the batch, divided by the proportion of batches affected with condition A given that condition B is absent. The 95% confidence intervals (CI) for the RR were estimated using Monte-Carlo simulation.
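As an illustration, the RR definition above and one possible Monte-Carlo interval can be sketched as follows. The text does not specify the simulation scheme. Resampling batches with replacement and taking percentiles is an assumption made here, and the counts are invented.

```python
import random


def relative_risk(batches):
    """RR = P(A present | B present) / P(A present | B absent), over batches."""
    with_b = [a for a, b in batches if b]
    without_b = [a for a, b in batches if not b]
    return (sum(with_b) / len(with_b)) / (sum(without_b) / len(without_b))


def monte_carlo_ci(batches, n_sim=2000, level=0.95, seed=1):
    """Percentile interval from resampling batches with replacement (assumed scheme)."""
    rng = random.Random(seed)
    sims = []
    for _ in range(n_sim):
        sample = rng.choices(batches, k=len(batches))
        try:
            sims.append(relative_risk(sample))
        except ZeroDivisionError:
            continue                      # skip degenerate resamples
    sims.sort()
    tail = (1 - level) / 2
    return sims[int(tail * len(sims))], sims[int((1 - tail) * len(sims)) - 1]


# Made-up batches as (A present, B present) pairs: 60/100 with B, 30/100 without.
batches = ([(True, True)] * 60 + [(False, True)] * 40
           + [(True, False)] * 30 + [(False, False)] * 70)
rr = relative_risk(batches)
low, high = monte_carlo_ci(batches)
print(round(rr, 2), round(low, 2), round(high, 2))
```

With these invented counts the point estimate is 0.6 / 0.3 = 2. The interval bounds depend on the resampling and are not quoted here.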

Parameters in the search algorithm

Three major characteristics define the algorithm of the heuristic search:

set.seed: a single value that sets the starting point for the search.
i.permutations: a number that defines how many times an initial empty network is perturbed to construct a random network from which a stepwise search is performed.
max.parents: a number between 2 and the total number of variables minus 1. This number defines the maximum number of arcs reaching a particular node. In this study no restrictions were placed upon the number of parents and the maximum, 13, was allowed in all searches.

The optimal number of local searches required to identify a robust machine learning structure is problem specific. In this study, the number was determined empirically by running two parallel sets of searches, differing in the set.seed value. The number of local searches was increased until both sets reached the same majority consensus network, thereby suggesting that a sufficient number of searches had been run to provide robust outputs. The results from both sets of searches were pooled to identify the best overall single network which, combined with the majority consensus network, led to the pruned network.
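The stopping rule described above can be sketched as a loop that grows two independently seeded sets of searches until their majority consensus networks coincide. Everything below is hypothetical: `toy_search` is a stand-in that returns random arc sets rather than a real structure search, and the step size and cap are arbitrary.

```python
import random
from collections import Counter


def consensus(networks):
    """Arcs present in more than half of the networks."""
    counts = Counter(arc for net in networks for arc in net)
    return {arc for arc, n in counts.items() if n > len(networks) / 2}


def searches_until_agreement(run_search, seeds=(1, 2), step=100, max_searches=5000):
    """Grow two seeded sets of searches until their consensus networks match."""
    rngs = [random.Random(seed) for seed in seeds]
    results = [[] for _ in seeds]
    while len(results[0]) < max_searches:
        for rng, nets in zip(rngs, results):
            nets.extend(run_search(rng) for _ in range(step))
        first, second = (consensus(nets) for nets in results)
        if first == second:
            return len(results[0]), first
    return None, None                     # no agreement within the cap


# Stand-in for a local search: each arc appears with a fixed made-up probability.
ARC_PROBABILITY = {("A", "B"): 0.9, ("B", "C"): 0.7, ("C", "D"): 0.2}


def toy_search(rng):
    return {arc for arc, p in ARC_PROBABILITY.items() if rng.random() < p}


n_searches, agreed = searches_until_agreement(toy_search)
print(n_searches, sorted(agreed))
```

Agreement between two sets is a practical check rather than a guarantee. Both sets could in principle agree early by chance, so a cap and a minimum step size are sensible safeguards.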

The analyses were performed in R [33] using a library written by FIL (freely available upon request) to perform the structure search. Other broadly similar libraries are available for use within R from CRAN (Comprehensive R Archive Network) website, and similar toolboxes are available for use with MATLAB.

Results

Empirical investigation determined that 10000 local searches were sufficient to ensure robust modelling results.

Graphical outputs

The “majority consensus network” is presented in Figure 2 and provides complementary information to the main output from this investigation, the “pruned network”, which is presented in Figure 3 together with the estimated RRs. The arcs presented in the “pruned network” can be located in the “majority consensus network” to determine the percentage of local searches in which each arc appears, which indicates its robustness. Thus, the “majority consensus network” (Figure 2) shows that the connections leading to milk spots from hepatic scarring and papular dermatitis are the most robust, being present in more than 90% of searches. In the pruned network (Figure 3) these arcs are retained, and it is observed that batches with hepatic scarring had a moderate risk of milk spots compared with batches without hepatic scarring; likewise, batches with papular dermatitis had a milder risk of milk spots compared with those with papular dermatitis absent. Figure 3 also shows that batches with a moderate-low level of pleurisy were more likely to be enzootic pneumonia-like free than those with other category levels of pleurisy. The pneumonia-free batches were also more likely to be free of papular dermatitis than those with pneumonia. Batches with a high level of enzootic pneumonia-like lesions had a moderate risk (i.e. RRs between 1.5 and 2.5) of having a high level of pleurisy compared with batches with pneumonia absent or at a moderate-low level. Batches with abscesses were associated with a higher level of pleurisy than batches with no abscesses. Batches with a moderate-low level of pleurisy had a reduced risk of abscess and pericarditis compared with batches with other levels of pleurisy. There was a stronger risk (i.e. RRs over 2.5) of having peritonitis if pericarditis was present in the batch than if it was absent; conversely, batches with zero level of pleurisy were more likely to be peritonitis free than those with pleurisy present. Batches with peritonitis also had a milder risk (i.e. RRs between 1 and 1.5) of hepatic scarring compared with batches without peritonitis. Batches with pyaemia had a mild risk of hepatic scarring and a strong risk of having tail damage compared with batches with pyaemia absent.
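The RR bands used in this paragraph (mild between 1 and 1.5, moderate between 1.5 and 2.5, strong over 2.5) can be expressed as a small helper. The handling of the exact boundary values and the "reduced" label for RRs below 1 are assumptions, since the text does not define them.

```python
def rr_band(rr):
    """Label an RR using the bands quoted in the Results (boundaries assumed)."""
    if rr < 1:
        return "reduced"     # assumed label for a protective association
    if rr == 1:
        return "none"        # no association
    if rr <= 1.5:
        return "mild"
    if rr <= 2.5:
        return "moderate"
    return "strong"


# Illustrative values only, not estimates from the study.
labels = [rr_band(value) for value in (0.7, 1.2, 2.0, 3.1)]
print(labels)  # ['reduced', 'mild', 'moderate', 'strong']
```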

Discussion

This paper describes the application of a multi-dimensional machine learning methodology to multivariate epidemiological analyses. Applying this methodology to data comprising the typical pathologies present in slaughtered finishing pigs has led to an easy-to-interpret, highly visual, and statistically robust output: a network in which the main associations between the pathologies are easily identifiable.

The interrelationship between the pathologies

This study has provided information on the nature of hepatic scarring, which is thought to be a post-healing stage of milk spots but for which other aetiologies cannot be discarded. The results suggest that both Ascaris suum and systemic bacterial infections are independently interrelated with the presence of the liver capsule scarring. The former is reflected in the moderate association with milk spots, which seems to be a highly robust interrelationship as it was recruited in 92% of the searches. Different stages of A. suum parasitism may occur within the same batch, leading to coexistence of active milk spot lesions with those already healed, i.e. hepatic scarring. A potential bacterial aetiology of hepatic scarring is suggested by the increase in its risk when pyaemia is present, suggesting that both pathologies may be associated with Arcanobacterium pyogenes. In addition, hepatic scarring appears interrelated with peritonitis, which is typically present in systemic infections by Haemophilus parasuis (responsible for Glässer’s disease) or Streptococcus suis [34, 35]. These latter infectious agents would also explain the positive association between peritonitis and pericarditis. Likewise, it was observed that when pleurisy was absent the chance of being peritonitis free increased.

Severe pneumonic pasteurellosis is typically manifested by abscessation and thoracic wall adherences [36], which explains the associations detected between pleurisy and abscess. Conversely, absence of pneumonia is associated with lower levels of abscesses. High batch prevalence of enzootic pneumonia-like lesions is interrelated with high levels of pleurisy, which is an expected finding as both respiratory conditions share common husbandry risk factors [6] and Mycoplasma hyopneumoniae (main pathogen for enzootic pneumonia) contributes to the occurrence of pleurisy [4]. This latter association may reflect the presence of poor health levels, particularly in the control of respiratory diseases. Alternatively, batches with a moderate-low level of pleurisy appear more likely to be free of enzootic pneumonia, pericarditis and abscess, perhaps reflecting high-health batches. Papular dermatitis is associated with the presence of milk spots, both being parasitic conditions. This association could reflect poor parasitic control strategies for some producers and highlights the fact that, even with current systems of production, parasitism is still neglected by some sectors of the industry. These results could be used to optimise abattoir inspection strategies. For example, when papular dermatitis is detected in the pigs (e.g. during the ante mortem inspections) the meat inspectors should place more emphasis on the liver inspections of those batches. This would be a proxy for the implementation of risk-based surveillance abattoir policies that could optimise the use of industry and government resources [37].

Presence of pyaemia in the batch is associated with presence of tail damage. This latter pathology is known to be involved in early stages of the pathogenesis of pyaemia, by providing a point of entry for bacteria [10]. At pig level these two lesions might not coexist due to the time gap between the tail damage and the development of the pyaemia [10], but a batch-level investigation may have helped to find such an association. The approach used in this study, investigating batch-level prevalence, not only maintains coherence with the nature of pig production, but would also have assisted in the identification of any association when two pathologies may be part of the same causal pathway (e.g. milk spots and hepatic scarring). In this scenario it is likely that the pathologies do not coexist in the same pig, and therefore pig-level investigation would be an inefficient way of exploring their association. Furthermore, pathologies presented in a mild form or during the healing process can be missed in the abattoir inspections; whereas if they are present in more than one pig, the chance of being detected by the abattoir assessors increases, leading to a more adequate batch-level classification.

Clustering in the structure of data

In this study the impact on the analyses of the potential clustering structure in the data has been mitigated by modelling the data at batch level. Batch is typically the lowest and likely the strongest level of clustering present in abattoir data [38], particularly in health scheme pig abattoir data [6, 8]. It is also arguable that for the type of analyses presented (particularly the multi-dimensional aspect) clustering is of far less concern than in other types of traditional statistical analyses. For other potential levels of clustering to be an issue, e.g. on-farm (or abattoir or season), this would require that on different farms the proportion of batches which have, for example, {lesion A present given that lesion B is present and lesion C is not present and lesion D is present and so on…} are substantially different, and similarly for all the other conditional probabilities in the model. This form of "group-effect" is unlikely to be sizable after having already jointly adjusted for all the other conditions present in a batch. Hence, intuitively it could be argued that the machine learning methodology is robust to clustering, whether this is at farm/abattoir/season level. In practical terms, this assertion cannot be rigorously tested with this methodology and it should therefore be acknowledged as a potential limitation of this study.

Constraints of abattoir gross pathology data

The different pathologies were presented in this paper with their most typical cause (Table 1) and although some of them, i.e. EP-like lesion, milk spots and papular dermatitis, can be considered good proxies for specific pathogens [9, 15, 39, 40], none of them are strictly pathognomonic. The data obtained from abattoir monitoring carried out by the health schemes offered here a unique opportunity to explore the associations between these relevant pig pathologies. The presence of operator bias across the assessors, affecting the gross pathology classification, cannot be absolutely ruled out, but the definition of the lesions did not change during the period included in our study. Additionally, the health schemes organise training and refresher days for the veterinarians and conduct internal comparisons on the same pigs assessed by different veterinarians, aiming to maintain assessor consistency over time.

The results from this study are applicable to the whole study population, i.e. those farms participating in the pig health schemes, and particularly to those units that submitted several batches of pigs over the time period included in the study. Additionally, the results could be extrapolated to the population of British pig commercial units, as the assessments carried out by the health schemes are considered representative of the British commercial sector [41].

Further discussion on the structure discovery approach

The multi-dimensional machine learning methodology presented is well suited for investigating multiple associations between pathologies, generating hypotheses about potential interrelationships. Linear models and their generalizations, for example, would have required designating one variable as a response and modelling the rest as a set of independent predictors. Multivariate techniques like principal component and factor analyses utilise dimension reduction to facilitate the identification of uncorrelated subgroups of variables (i.e. principal components and factors). In contrast, machine learning structure discovery does not reduce the dimensions of the data and its graphical nature allows for ready interpretation of all associations present. In this study, a small variable domain – ten pathologies studied in 14 variables – is modelled with a substantial amount of data, providing the ideal scenario for structure discovery multivariate analyses [25].

Conclusions

The application of novel multi-dimensional machine learning methodology provided new insights into how typical pig pathologies are interrelated at batch level, assisting in elucidating theories on their biological associations. The results from this study could also be used to optimise abattoir inspection utilising risk-based surveillance strategies. The methodology presented is a powerful hypothesis-generating exploratory tool, applicable to a wide range of studies in veterinary research.

References

Elbers ARW, Tielen MJM, Snijders JMA, Cromwijk WAJ, Hunneman WA: Epidemiological studies on lesions in finishing pigs in the Netherlands. I. Prevalence, seasonality and interrelationship. Prev Vet Med. 1992, 14: 217-231. 10.1016/0167-5877(92)90018-B.
Hurnik D, Dohoo IR, Donald A, Robinson NP: Factor analysis of swine farm management practices on Prince Edward Island. Prev Vet Med. 1994, 20: 135-146. 10.1016/0167-5877(94)90112-0.
Cleveland-Nielsen A, Nielsen EO, Ersboll AK: Chronic pleuritis in Danish slaughter pig herds. Prev Vet Med. 2002, 55: 121-135. 10.1016/S0167-5877(02)00089-2.
Enoe C, Mousing J, Schirmer AL, Willeberg P: Infectious and rearing-system related risk factors for chronic pleuritis in slaughter pigs. Prev Vet Med. 2002, 54: 337-349. 10.1016/S0167-5877(02)00029-6.
Ostanello F, Dottori M, Gusmara C, Leotti G, Sala V: Pneumonia Disease Assessment using a Slaughterhouse Lung-Scoring Method. Journal of Veterinary Medicine, Series A. 2007, 54: 70-75. 10.1111/j.1439-0442.2007.00920.x.
Sanchez-Vazquez MJ, Smith R, Gunn GJ, Lewis F, Strachan DW, Edwards SA: The Identification of Risk Factors for the Presence of Enzootic Pneumonia-Like Lesions and Pleurisy in Slaughtered Finishing Pigs Utilizing Existing British Pig Industry data. The Pig Journal. 2010, 63: 25-33.
Fraile L, Alegre A, Lopez-Jimenez R, Nofrarias M, Segales J: Risk factors associated with pleuritis and cranio-ventral pulmonary consolidation in slaughter-aged pigs. Vet J. 2010, 184: 326-333. 10.1016/j.tvjl.2009.03.029.
Sanchez-Vazquez MJ, Smith RP, Kang S, Lewis F, Nielen M, Gunn GJ, Edwards SA: Identification of factors influencing the occurrence of milk spot livers in slaughtered pigs: A novel approach to understanding Ascaris suum epidemiology in British farmed pigs. Vet Parasitol. 2010, 173: 271-279. 10.1016/j.vetpar.2010.06.029.
Meyns T, Van Steelant J, Rolly E, Dewulf J, Haesebrouck F, Maes D: A cross-sectional study of risk factors associated with pulmonary lesions in pigs at slaughter. Vet J. 2011, 187: 388-392. 10.1016/j.tvjl.2009.12.027.
Huey RJ: Incidence, location and interrelationships between the sites of abscesses recorded in pigs at a bacon factory in Northern Ireland. Vet Rec. 1996, 138: 511-514. 10.1136/vr.138.21.511.
Kritas SK, Morrison RB: Relationships between tail biting in pigs and disease lesions and condemnations at slaughter. Vet Rec. 2007, 160: 149-152. 10.1136/vr.160.5.149.
Enoe C, Christensen G, Andersen S, Willeberg P: The need for built-in validation of surveillance data so that changes in diagnostic performance of post-mortem meat inspection can be detected. Prev Vet Med. 2003, 57: 117-125. 10.1016/S0167-5877(02)00229-5.
Bonde M, Toft N, Thomsen PT, Sorensen JT: Evaluation of sensitivity and specificity of routine meat inspection of Danish slaughter pigs using Latent Class Analysis. Prev Vet Med. 2010, 94: 165-169. 10.1016/j.prevetmed.2010.01.009.
Willeberg P, Gerbola M-A, Petersen BK, Andersen JB: The Danish pig health scheme: Nation-wide computer-based abattoir surveillance and follow-up at the herd level. Prev Vet Med. 1984, 3: 79-91. 10.1016/0167-5877(84)90026-6.
Sorensen V, Jorsal SE, Mousing J: Diseases of Respiratory System. In Diseases of Swine. 9th edition. Edited by: Straw BE, Zimmerman JJ, D'Allaire S, Taylor DJ. Blackwell Publishing; 2006:149-177.
Stewart TB, Hale OM: Losses to Internal Parasites in Swine Production. J Anim Sci. 1988, 66: 1548-1554.
Taylor NR, Main DCJ, Mendl M, Edwards SA: Tail-biting: A new perspective. Vet J. 2010, 186: 137-147. 10.1016/j.tvjl.2009.08.028.
Rougoor CW, Hanekamp WJA, Dijkhuizen AA, Nielen M, Wilmink JBM: Relationships between dairy cow mastitis and fertility management and farm performance. Prev Vet Med. 1999, 39: 247-264. 10.1016/S0167-5877(99)00007-0.
Pedersen LJ, Jorgensen E, Heiskanen T, Damm BI: Early piglet mortality in loose-housed sows related to sow and piglet behaviour and to the progress of parturition. Appl Anim Behav Sci. 2006, 96: 215-232. 10.1016/j.applanim.2005.06.016.
Steeneveld W, van der Gaag LC, Barkema HW, Hogeveen H: Providing probability distributions for the causal pathogen of clinical mastitis using naive Bayesian networks. J Dairy Sci. 2009, 92: 2598-2609. 10.3168/jds.2008-1694.
Otto L, Kristensen CS: A biological network describing infection with Mycoplasma hyopneumoniae in swine herds. Prev Vet Med. 2004, 66: 141-161. 10.1016/j.prevetmed.2004.09.005.
Jensen TB, Kristensen AR, Toft N, Baadsgaard NP, Ostergaard S, Houe H: An object-oriented Bayesian network modeling the causes of leg disorders in finisher herds. Prev Vet Med. 2009, 89: 237-248. 10.1016/j.prevetmed.2009.02.009.
Lewis FI, Brulisauer F, Gunn GJ: Structure discovery in Bayesian networks: An analytical tool for analysing complex animal health data. Prev Vet Med. 2011, 100: 109-115. 10.1016/j.prevetmed.2011.02.003.
Heckerman D, Geiger D, Chickering DM: Learning Bayesian networks: The combination of knowledge and statistical data. Mach Learn. 1995, 20: 197-243.
Friedman N, Koller D: Being Bayesian About Network Structure. A Bayesian Approach to Structure Discovery in Bayesian Networks. Machine Learning. 2003, 50: 95-125.
Needham CJ, Bradford JR, Bulpitt AJ, Westhead DR: A Primer on Learning in Bayesian Networks for Computational Biology. PLoS Comput Biol. 2007, 3: e129. 10.1371/journal.pcbi.0030129.
Poon AFY, Lewis FI, Pond SLK, Frost SDW: An Evolutionary-Network Model Reveals Stratified Interactions in the V3 Loop of the HIV-1 Envelope. PLoS Comput Biol. 2007, 3: e231. 10.1371/journal.pcbi.0030231.
Poon AFY, Lewis FI, Pond SLK, Frost SDW: Evolutionary Interactions between N-Linked Glycosylation Sites in the HIV-1 Envelope. PLoS Comput Biol. 2007, 3: e11. 10.1371/journal.pcbi.0030011.
Gonyou HW, Lemay SL, Zhang Y: Effects of the environment on productivity and disease. In Diseases of Swine. Edited by: Straw BE, Zimmerman JJ, D'Allaire S, Taylor DJ. Blackwell Publishing; 2006:1027-1036.
Huelsenbeck JP, Ronquist F: MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics. 2001, 17: 754-755. 10.1093/bioinformatics/17.8.754.
Helmbold DP, Schapire RE: Predicting Nearly As Well As the Best Pruning of a Decision Tree. Mach Learn. 1997, 27: 51-68. 10.1023/A:1007396710653.
Dohoo I, Martin W, Stryhn H: Veterinary Epidemiologic Research. Charlottetown, PE, Canada: Atlantic Veterinary College; 2003.
R Development Core Team: R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2009. ISBN 3-900051-07-0, URL http://www.R-project.org
Brown CC, Baker DC, Barker IK: Alimentary system. In Jubb, Kennedy and Palmer's Pathology of Domestic Animals. Volume 2. 5th edition. Edited by: Grant Maxie M. Elsevier Saunders; 2007:1-295.
Reams RY, Glickman LT, Harrington DD, Thacker HL, Bowersock TL: Streptococcus suis infection in swine: a retrospective study of 256 cases. Part II. Clinical signs, gross and microscopic lesions, and coexisting microorganisms. J Vet Diagn Invest. 1994, 6: 326-334. 10.1177/104063879400600308.
Pijoan C: Pneumonic pasteurellosis. In Diseases of Swine. 9th edition. Edited by: Straw BE, Zimmerman JJ, D'Allaire S, Taylor DJ. Blackwell Publishing; 2006:719-724.
Stark K, Regula G, Hernandez J, Knopf L, Fuchs K, Morris R, Davies P: Concepts for risk-based surveillance in the field of veterinary medicine and veterinary public health: Review of current approaches. BMC Health Serv Res. 2006, 6: 20. 10.1186/1472-6963-6-20.
Goodwin-Ray KA, Stevenson M, Heuer C, Pinchbeck G: Hierarchical and spatial analyses of pneumonia-lesion prevalence at slaughter in New Zealand lambs. Prev Vet Med. 2008, 83: 144-155. 10.1016/j.prevetmed.2007.07.001.
Stalker MJ, Hayes MAT: Liver and biliary system. In Jubb, Kennedy and Palmer's Pathology of Domestic Animals. 5th edition. Edited by: Grant Maxie M. Elsevier Saunders; 2007:298-358.
Cargill CF, Pointon AM, Davies PR, Garcia R: Using slaughter inspections to evaluate sarcoptic mange infestation of finishing swine. Vet Parasitol. 1997, 70: 191-200. 10.1016/S0304-4017(96)01137-5.
Sanchez-Vazquez MJ, Strachan WD, Armstrong D, Nielen M, Gunn GJ: The British pig health schemes: integrated systems for large-scale pig abattoir lesion monitoring. Vet Rec. 2011, 169: 413-413. 10.1136/vr.d4814.
Caswell JL, Williams KJ: Respiratory system. In Jubb, Kennedy and Palmer's Pathology of Domestic Animals. Volume 2. 5th edition. Edited by: Grant Maxie M. Elsevier Saunders; 2007:591-593.
Grant Maxie M, Robinson W: Cardiovascular system. In Jubb, Kennedy and Palmer's Pathology of Domestic Animals. 5th edition. Edited by: Grant Maxie M. Elsevier Saunders; 2007:22-24.


Acknowledgements

These investigations were partly funded by Defra as project OD0215. We are grateful to WPS and BPHS for providing the data for the analyses in this project. We would like to acknowledge and thank Jill Thompson, who reviewed the manuscript, for her contribution with technical expertise on pig pathology.

Author information

Authors and Affiliations

Scottish Agricultural College, Kings Buildings, West Mains Road, Edinburgh, EH9 3JG, UK
Manuel J Sanchez-Vazquez & George J Gunn
Department of Farm Animal Health, Faculty of Veterinary Medicine, Utrecht University, Utrecht, The Netherlands
Mirjam Nielen
Newcastle University, Agriculture Building, Newcastle upon Tyne, Newcastle, NE1 7RU, UK
Sandra A Edwards
Section of Epidemiology, Vetsuisse Faculty, University of Zurich, Zurich, Switzerland
Fraser I Lewis

Authors

Manuel J Sanchez-Vazquez
Mirjam Nielen
Sandra A Edwards
George J Gunn
Fraser I Lewis

Corresponding author

Correspondence to Fraser I Lewis.

Additional information

Authors' contributions

MJSV conceived the study, carried out the statistical analyses and drafted the manuscript. FIL contributed to the conception of the study, wrote the statistical software required and assisted in drafting the manuscript. MN participated in its design and helped to draft the manuscript. SAE participated in its design and coordination and helped to draft the manuscript. GG helped to review the manuscript. All authors read and approved the final manuscript.

and Fraser I Lewis contributed equally to this work.

Electronic supplementary material

Additional file 1: Data derived batch categorization for enzootic pneumonia and pleurisy. (DOC 224 KB)

Rights and permissions

Open Access This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Sanchez-Vazquez, M.J., Nielen, M., Edwards, S.A. et al. Identifying associations between pig pathologies using a multi-dimensional machine learning methodology. BMC Vet Res 8, 151 (2012). https://doi.org/10.1186/1746-6148-8-151

Received: 11 February 2011
Accepted: 22 August 2012
Published: 31 August 2012
DOI: https://doi.org/10.1186/1746-6148-8-151
