Topographic determinants of foot and mouth disease transmission in the UK 2001 epidemic
- Nicholas J Savill^{1, 2},
- Darren J Shaw^{3},
- Rob Deardon^{1, 7},
- Michael J Tildesley^{4},
- Matthew J Keeling^{4},
- Mark EJ Woolhouse^{2},
- Stephen P Brooks^{1} and
- Bryan T Grenfell^{5, 6}
DOI: 10.1186/1746-6148-2-3
© Savill et al; licensee BioMed Central Ltd. 2006
Received: 01 July 2005
Accepted: 16 January 2006
Published: 16 January 2006
Abstract
Background
A key challenge for modelling infectious disease dynamics is to understand the spatial spread of infection in real landscapes. This ideally requires a parallel record of spatial epidemic spread and a detailed map of susceptible host density along with relevant transport links and geographical features.
Results
Here we analyse the most detailed such data to date arising from the UK 2001 foot and mouth epidemic. We show that Euclidean distance between infectious and susceptible premises is a better predictor of transmission risk than shortest and quickest routes via road, except where major geographical features intervene.
Conclusion
Thus, a simple spatial transmission kernel based on Euclidean distance suffices in most regions, probably reflecting the multiplicity of transmission routes during the epidemic.
Background
The UK 2001 epidemic of foot and mouth disease highlighted the need for national governments to have well thought out and workable contingency plans to control the spread of highly infectious animal diseases. These plans must be based on quantitative predictions of epidemic size and extent under various conditions which, in turn, must be based on an understanding of how disease spreads between livestock premises. For a disease like foot and mouth that has a multitude of transmission routes between premises, predicting the course of an epidemic is complicated by demographic and topographic heterogeneity. The UK 2001 foot and mouth epidemic has furnished us with unique data on the temporal and spatial spread of this disease. This, coupled with demographic data of livestock holdings within the UK, gives us the opportunity to study in detail the various risk factors associated with disease spread. In this paper we focus on the effect caused by geographical features within the UK landscape.
Using Euclidean distance between farms in the transmission kernel was a first approximation for quantifying the effect of spatial separation on transmission of foot and mouth disease virus (FMDV). However, contact tracing of infection by the Department for Environment, Food and Rural Affairs (DEFRA) highlighted several important transmission routes that occurred via road after the movement ban. These included movement of vehicles, personnel, milk tankers, farm equipment and livestock [7]. Thus, road-based measures may be better predictors of transmission risk than simple Euclidean distance, especially in areas where roads go around large geographical features such as hills, rivers and estuaries. Showing this, however, is complicated by the non-unique, non-linear relationships between Euclidean distance and road-based measures, and by the fact that Euclidean distance is already a risk factor. Nevertheless, with detailed descriptions of the UK road network and the UK 2001 foot and mouth disease dynamics, we have developed a statistical test that can detect risk associated with road-based measures. In this paper we consider shortest and quickest routes.
Briefly, the test looks for a significant difference between the mean shortest or quickest route between farms where a possible transmission occurred and the mean shortest or quickest route between farms where no transmission occurred. The rationale is that, if shortest or quickest routes are risk factors, farms closer by road to an infected premises (IP) are more likely to have been infected than farms farther away. A detailed description of the test is given in Methods.
Results and Discussion
Table 1: List of regions and the counties used in the analysis.
Region | Counties |
---|---|
Devon | Devon |
Cumbria | Cumbria |
Dumfries and Galloway | Dumfries and Galloway |
Welsh Borders | Powys, Hereford and Worcester, Gloucestershire and Avon |
Settle | Lancashire and North Yorkshire |
Table 2: p-values for the test of whether shortest and quickest routes are better predictors of transmission risk than Euclidean distance, and vice versa, for the various regions and parameter values. p-values below 0.05 are taken to indicate a significantly better predictor. The default parameter values are a latent period (l) of 4 days and equal sheep and cattle transmissibility parameters (T_{ c } = T_{ s }; only relative values are required). n is the sum of p_{i,j} over all possible transmissions in Equation 1; effectively, the number of IPs that, on their day of infection, were within 10 km of an infectious IP.
Region | Parameter | Shortest route better than Eucl. distance (p) | Eucl. distance better than shortest route (p) | Quickest route better than Eucl. distance (p) | Eucl. distance better than quickest route (p) | n |
---|---|---|---|---|---|---|
Devon | l = 3 | 0.55 | 0.026 | 0.91 | < 10^{-3} | 107 |
Devon | l = 4 | 0.68 | 0.029 | 0.93 | < 10^{-3} | 104 |
Devon | l = 5 | 0.61 | 0.043 | 0.93 | < 10^{-3} | 103 |
Devon | T_{ s }/T_{ c } = 10 | 0.71 | 0.027 | 0.92 | < 10^{-3} | 104 |
Devon | T_{ s }/T_{ c } = 0.1 | 0.65 | 0.027 | 0.92 | < 10^{-3} | 104 |
Cumbria | l = 3 | > 0.999 | < 10^{-3} | > 0.999 | < 10^{-3} | 572 |
Cumbria | l = 4 | 0.999 | < 10^{-3} | 0.999 | < 10^{-3} | 562 |
Cumbria | l = 5 | 0.998 | < 10^{-3} | > 0.999 | < 10^{-3} | 553 |
Cumbria | T_{ s }/T_{ c } = 10 | 0.998 | < 10^{-3} | 0.999 | < 10^{-3} | 562 |
Cumbria | T_{ s }/T_{ c } = 0.1 | 0.998 | < 10^{-3} | 0.999 | < 10^{-3} | 565 |
Dumfries and Galloway | l = 3 | 0.87 | 0.005 | 0.93 | < 10^{-3} | 82 |
Dumfries and Galloway | l = 4 | 0.82 | 0.006 | 0.88 | < 10^{-3} | 78 |
Dumfries and Galloway | l = 5 | 0.82 | 0.006 | 0.91 | < 10^{-3} | 76 |
Dumfries and Galloway | T_{ s }/T_{ c } = 10 | 0.81 | 0.005 | 0.90 | < 10^{-3} | 78 |
Dumfries and Galloway | T_{ s }/T_{ c } = 0.1 | 0.85 | 0.011 | 0.87 | < 10^{-3} | 78 |
Welsh Borders | l = 3 | 0.87 | 0.003 | 0.87 | < 10^{-3} | 97 |
Welsh Borders | l = 4 | 0.88 | 0.011 | 0.87 | < 10^{-3} | 93 |
Welsh Borders | l = 5 | 0.92 | 0.011 | 0.92 | < 10^{-3} | 88 |
Welsh Borders | T_{ s }/T_{ c } = 10 | 0.88 | 0.012 | 0.85 | 0.001 | 92 |
Welsh Borders | T_{ s }/T_{ c } = 0.1 | 0.92 | 0.008 | 0.93 | < 10^{-3} | 95 |
Settle | l = 3 | 0.95 | 0.001 | 0.96 | < 10^{-3} | 87 |
Settle | l = 4 | 0.92 | < 10^{-3} | 0.96 | < 10^{-3} | 84 |
Settle | l = 5 | 0.90 | 0.001 | 0.89 | < 10^{-3} | 81 |
Settle | T_{ s }/T_{ c } = 10 | 0.92 | < 10^{-3} | 0.97 | < 10^{-3} | 84 |
Settle | T_{ s }/T_{ c } = 0.1 | 0.91 | < 10^{-3} | 0.94 | < 10^{-3} | 84 |
Table 3: As for Table 2, except that only IPs positively confirmed as infected are treated as IPs. IPs with a negative confirmation are treated as pre-emptive culls. Parameters are l = 4 and T_{ s } = T_{ c }.
Region | Shortest route better than Eucl. distance (p) | Eucl. distance better than shortest route (p) | Quickest route better than Eucl. distance (p) | Eucl. distance better than quickest route (p) | n |
---|---|---|---|---|---|
Devon | 0.55 | 0.19 | 0.79 | 0.008 | 41 |
Cumbria | 0.93 | < 10^{-3} | 0.998 | < 10^{-3} | 426 |
Dumfries and Galloway | 0.25 | 0.055 | 0.43 | 0.005 | 37 |
Welsh Borders | 0.13 | 0.53 | 0.18 | 0.33 | 14 |
Settle | 0.87 | < 10^{-3} | 0.90 | < 10^{-3} | 80 |
These tests show that shortest and quickest routes are no better predictors of transmission risk than Euclidean distance. However, they do not prove that shortest and quickest routes are any worse. To test this we turn the analysis around and ask whether Euclidean distance is a better predictor of transmission risk than shortest or quickest route. This requires the calculation of shortest and quickest route transmission kernels (see Methods and Fig. 1). The p-values for the null hypothesis that the difference between the mean Euclidean distance for possible transmissions and the mean Euclidean distance for non-transmissions could have arisen by chance are given in Table 2. For all regions and all parameter values the p-values are significant, strongly suggesting that Euclidean distance is a better predictor of risk than shortest and quickest routes. When the analysis is done with only those IPs that were positively confirmed as infected (Table 3), the results are not significant for shortest route in Devon, Dumfries and Galloway and the Welsh Borders, or for quickest route in the Welsh Borders. This is most likely due to small sample sizes reducing the power of the test (see n in Table 3).
The River Severn and its estuary are crossed by the M4/M48 Motorway in the southwest and by the A40 trunk road in the northeast; these crossings are about 40 km apart (Fig. 3, bottom inset). The p-value for the test is less than 0.001 (n = 672), a highly significant result suggesting that transmission between farms on opposite sides of the Severn is best modelled using a shortest route based transmission kernel.
We have also applied this test to other barriers. For example, during the epidemic it was suggested that the M6 Motorway, which runs north-south through Cumbria and therefore through the centre of the Cumbrian epidemic, may have acted as a barrier to FMDV transmission between farms adjacent to it (Fig. 3, middle inset). Indeed, it is illegal for people, livestock and vehicles to cross the M6 directly. We may therefore speculate that infection of farms across the M6 was exclusively via road. The network of minor roads that existed before the M6 was built still exists today; these roads cross the M6 via numerous tunnels and bridges. Thus, roads between farms on either side of the M6 do not show large excursions of the kind observed around the Severn or the Solway Firth. The p-value of the test for IPs within 3 km of the M6 is 0.84 (n = 188), a non-significant result suggesting that a Euclidean-distance based transmission kernel is a sufficient model of transmission between farms on opposite sides of the M6 Motorway. We have also tested medium to large inland rivers (Devon: p = 0.64, n = 296; Cumbria: p = 0.18, n = 11700; Dumfries and Galloway: p = 0.43, n = 1180) and railway lines (Devon: p > 0.999, n = 10; Cumbria: p = 0.056, n = 4310; Dumfries and Galloway: p = 0.56, n = 351). Thus transmission between farms on opposite sides of these barriers is also best modelled by a Euclidean-distance based kernel rather than a shortest route based kernel. The p-value for railway lines in Cumbria is close to significant. However, given that the results for the other regions are not significant, it is reasonable to assume the same for Cumbria.
Conclusion
Why does Euclidean distance work so well, given that some transmission was certainly caused by movement of livestock, people and vehicles between farms via the road network? We do not have a definitive answer, although possible explanations include: 1) farms with a common boundary have more potential routes of infection than just a main road, for example tracks and private roads that cross both farms that are not recorded in the Digimap Meridian™ 2 Database; 2) infection via social networks may be a significant confounding factor.
In conclusion, Euclidean distance between infectious and susceptible farms is a better predictor of transmission risk than shortest or quickest routes, except where major geographical features intervene, in which case shortest route is the preferable measure of distance. Thus, mathematical models of the UK 2001 epidemic were justified in using Euclidean distance as a risk factor. However, future models should take into account the many large estuaries around the UK coastline.
In this paper we have developed a statistical test that can detect risk associated with various measures of the spatial relationship between infectious agents over and above that of simple Euclidean distance. Its use on other economically important livestock diseases may help in understanding their spread in potential future outbreaks. This work stresses the importance of analysing parallel geographical and disease outbreak data in order to construct parsimonious models which capture the essence of disease dynamics and control.
Methods
Premises data
The data used in this paper were taken from the DEFRA FMD Data archive [9]. The relevant information for the 2,026 mainland IPs was farmhouse coordinates and infection and slaughter dates. Thirteen IPs in this database that were confirmed by serology tests for antibodies to the virus do not have estimated infection dates; we assume that these IPs were infected 10 days before reporting, which is the period suggested by DEFRA in the database. Data for all other livestock holdings in the UK are an amalgam of 2001 census data and DEFRA's list of premises, including all IPs and culled premises from the epidemic; in total, 185,791 premises. The relevant information for each premises was farmhouse coordinates.
Road network
The UK road network was taken from the Digimap Meridian™ 2 Database [10]. In this database, road centre-lines are represented as links, and road intersections as nodes. A road link, which connects two nodes, comprises one or more line segments fixed positionally by a series of connected coordinate points. The coordinate system is the National Grid with a resolution of 1 m. The database distinguishes between Motorways, A roads, B roads and minor roads; it does not include private roads, tracks and some minor roads and cul-de-sacs of less than 200 m. We extract from this database the coordinates of all line segments of all road links. We create our own network of nodes and links, where each line segment is a link connected to two nodes. A node contains a list of all other nodes linked to it, and the Euclidean distance to each of these nodes calculated from the line segment coordinates.
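The network construction described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `build_network` and the input layout (each road link given as a list of National Grid coordinate pairs in metres) are assumptions made for the example.

```python
import math
from collections import defaultdict


def build_network(links):
    """Build a node adjacency map from road links.

    links: iterable of polylines, each a list of (x, y) points in metres
    (hypothetical input layout). Every line segment becomes one link
    between two nodes, stored with its Euclidean length.
    """
    adjacency = defaultdict(dict)
    for polyline in links:
        for a, b in zip(polyline, polyline[1:]):
            if a == b:
                continue  # skip zero-length segments
            length = math.dist(a, b)
            # If the same segment appears twice, keep the shorter length.
            if length < adjacency[a].get(b, math.inf):
                adjacency[a][b] = length
                adjacency[b][a] = length
    return adjacency


# Two links that share the node (3, 4).
network = build_network([
    [(0, 0), (3, 4), (3, 10)],
    [(3, 4), (6, 8)],
])
print(network[(3, 4)])
```

Storing each node's neighbours together with the segment lengths mirrors the description of a node that holds a list of linked nodes and the distance to each.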
Calculating shortest and quickest routes
We calculate the shortest route between all pairs of livestock premises in the UK within 10 km of each other. This is done by analysing 40 × 40 km^{2} overlapping regions incremented by 10 km horizontally or vertically. This ensures that all farms within 10 km of an IP are linked to an IP by road. Larger regions are computationally infeasible.
The road network in a 40 × 40 km^{2} region is converted into an N × N matrix, where N is the number of nodes in the region. The matrix is initialised with the road distances between all linked nodes; elements for pairs of nodes that are not linked are given infinite values. The Floyd-Warshall algorithm [11, 12] is then applied to this matrix, resulting in an N × N matrix in which each element gives the shortest route between the corresponding pair of nodes. The running time of the Floyd-Warshall algorithm scales as N^{3}, where N varies from approximately 100 to 10,000 depending on the density of roads. When N exceeds 10,000 the algorithm's running time exceeds 1 day. The shortest route between any pair of farms is taken as the shortest route between the nodes assigned as nearest to each farm, plus the assumed road distance of each farm from the main road. In a very few cases, especially for neighbouring farms, the spatial configuration of a pair of farms and their connecting nodes causes the road distance to be less than the Euclidean distance. For these rare cases we assume that the road distance equals the Euclidean distance.
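As an illustration of the matrix procedure above, the following is a small pure-Python sketch of the Floyd-Warshall algorithm together with the farm-to-farm adjustment. The helper name `farm_route` and its arguments (access distances from each farm to its nearest node) are hypothetical; distances are in km.

```python
import math


def floyd_warshall(dist):
    """Return all-pairs shortest routes for a square matrix of link lengths.

    dist[i][j] is the length of a direct link, math.inf where two nodes
    are not linked, and 0 on the diagonal. The input is left unchanged.
    """
    n = len(dist)
    d = [row[:] for row in dist]
    for k in range(n):
        for i in range(n):
            via = d[i][k]
            if via == math.inf:
                continue  # node k is unreachable from node i
            for j in range(n):
                candidate = via + d[k][j]
                if candidate < d[i][j]:
                    d[i][j] = candidate
    return d


def farm_route(node_route, access_a, access_b, euclidean):
    """Road distance between two farms, never less than the Euclidean distance."""
    return max(node_route + access_a + access_b, euclidean)


INF = math.inf
links = [
    [0, 5, 10, INF],
    [5, 0, 3, INF],
    [10, 3, 0, 1],
    [INF, INF, 1, 0],
]
routes = floyd_warshall(links)
print(routes[0][3])  # shortest route from node 0 to node 3
```

The three nested loops over N nodes are the source of the N^{3} scaling noted in the text.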
To find the quickest route between two farms, distances between two nodes in the network are replaced with journey times. We assume that Motorway and trunk road speeds are 112 kph, A, B and minor road speeds are 72 kph, and farmhouse to road junction speed is 16 kph [13].
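The conversion from distances to journey times can be sketched as below. The speeds are those stated above; the road-class labels and function names are hypothetical, chosen only for this example.

```python
# Speeds in km/h as stated in the text; the class labels are assumed names.
SPEED_KPH = {
    "motorway": 112.0,
    "trunk": 112.0,
    "A": 72.0,
    "B": 72.0,
    "minor": 72.0,
    "farm_access": 16.0,
}


def journey_time_minutes(length_m, road_class):
    """Journey time along one link of the given length and road class."""
    return (length_m / 1000.0) / SPEED_KPH[road_class] * 60.0


def route_time_minutes(segments):
    """Total journey time for a route given as (length_m, road_class) pairs."""
    return sum(journey_time_minutes(length, cls) for length, cls in segments)


# A longer route along a motorway can be quicker than a shorter minor road.
via_minor = [(500, "farm_access"), (10_000, "minor"), (500, "farm_access")]
via_motorway = [(500, "farm_access"), (14_000, "motorway"), (500, "farm_access")]
print(route_time_minutes(via_minor), route_time_minutes(via_motorway))
```

Replacing link lengths with these times and rerunning the same shortest-path calculation yields quickest routes, which is why the two measures can disagree.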
Statistical analysis of distance-based risk
Owing to incomplete or equivocal tracing data, it is not possible to prove conclusively which farm infected which. Therefore we must consider all infectious IPs as possible sources of transmission on the particular day a farm gets infected. However, we can calculate the probabilities of possible transmission events based on known risk factors. We know that risk depends on proximity to an infectious IP (K(d)) and on the transmissibility of the infecting farm [5]. Thus, we assume that the probability of an infectious IP i infecting a susceptible farm j (on the day t when j was infected) is given by
where T_{ s } is the transmissibility of sheep, T_{ c } the transmissibility of cattle, N_{s,i} the number of sheep and N_{c,i} the number of cattle. Only the relationship between T_{ s } and T_{ c } is required because of the form of Equation 1. We assume that the infectious periods of all IPs begin 3, 4 or 5 days after they become infected and end on the day they are slaughtered [14–16]; the infection and slaughter dates of IPs are taken from the DEFRA FMD Data archive [9].
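One form consistent with this description is a kernel-weighted transmissibility normalised over all IPs that are infectious on day t. The sketch below takes that normalised form as an assumption rather than as a quotation of Equation 1; the kernel used in the example is a placeholder, not the fitted kernel K(d), and all function names are hypothetical.

```python
def transmissibility(n_sheep, n_cattle, t_sheep=1.0, t_cattle=1.0):
    """Farm transmissibility: T_s * N_s + T_c * N_c."""
    return t_sheep * n_sheep + t_cattle * n_cattle


def infection_probabilities(sources, kernel, t_sheep=1.0, t_cattle=1.0):
    """Probability that each candidate IP infected farm j (assumed normalised form).

    sources: list of (n_sheep, n_cattle, distance) for every IP that is
    infectious on the day farm j was infected.
    """
    weights = [
        transmissibility(ns, nc, t_sheep, t_cattle) * kernel(d)
        for ns, nc, d in sources
    ]
    total = sum(weights)
    return [w / total for w in weights]


def placeholder_kernel(d):
    # Illustrative decreasing function of distance (assumption only).
    return 1.0 / (1.0 + d) ** 2


sources = [(100, 0, 1.0), (0, 100, 3.0)]
print(infection_probabilities(sources, placeholder_kernel))
# Scaling T_s and T_c by a common factor leaves the probabilities unchanged.
print(infection_probabilities(sources, placeholder_kernel, 5.0, 5.0))
```

Under this normalisation the probabilities for a given infected farm sum to one, which agrees with n being interpretable as a count of infected farms, and only the ratio T_{ s }/T_{ c } affects the result.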
For a given region, defined in Table 1, only farms in those counties are used in the analysis. For example, for the Cumbria region we assume that only farms in Cumbria can infect Cumbrian farms. Farms in the neighbouring county of Dumfries and Galloway are assumed not to infect Cumbrian farms. Some pre-emptively culled farms may have been infected but never reported. Because it is not possible to say which farms these were or how many of them there were, we cannot include them as IPs in our analysis.
For each IP we find the Euclidean distances and the shortest and the quickest routes between it and all farms it could have infected after 23rd February 2001 and within 10 km (termed possible transmissions), and all farms it could not have infected after 23rd February 2001 and within 10 km (termed non-transmissions). A possible transmission can occur when an IP is infectious on the day another farm was infected (and hence became an IP). A non-transmission between an IP and a farm is defined for three cases: the IP was infectious before the other farm became infected, the IP was infectious before the other farm was pre-emptively culled, and the other farm was never infected or culled.
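The definition of a possible transmission can be sketched with a simple date check. Days are integers counted from an arbitrary origin; treating both ends of the infectious period as inclusive is an assumption, and the 10 km and 23rd February 2001 filters are omitted for brevity. The function names are hypothetical.

```python
def is_infectious(ip_infection_day, ip_slaughter_day, day, latent_period=4):
    """True if the IP is infectious on the given day.

    The infectious period is assumed to run from latent_period days after
    infection up to and including the day of slaughter.
    """
    return ip_infection_day + latent_period <= day <= ip_slaughter_day


def is_possible_transmission(ip_infection_day, ip_slaughter_day,
                             farm_infection_day, latent_period=4):
    """True if the IP was infectious on the day the other farm was infected.

    farm_infection_day is None for farms that never became IPs.
    """
    if farm_infection_day is None:
        return False
    return is_infectious(ip_infection_day, ip_slaughter_day,
                         farm_infection_day, latent_period)


# IP infected on day 0 and slaughtered on day 9, latent period of 4 days.
print(is_possible_transmission(0, 9, 5))     # farm infected while IP infectious
print(is_possible_transmission(0, 9, 3))     # farm infected before IP infectious
print(is_possible_transmission(0, 9, None))  # farm never infected
```

Every IP-farm pair within range that fails this check falls into one of the non-transmission cases listed above.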
The mean shortest or quickest route between infectious and susceptible premises in a region is found for possible transmissions (weighted by their probability of occurrence p, Equation 1, in which d_{i,j} represents Euclidean distance) and for non-transmissions. The difference between these means is recorded. The next step is to compare this difference to a null distribution. The null hypothesis states that the difference in the means could have arisen by chance. The null distribution is found as follows. One thousand weighted random samples of possible transmissions are taken, without replacement, from the population of all IP-farm pairs. The weighting takes into account the fact that the ratio of possible transmissions to IP-farm pairs varies with Euclidean distance. Therefore, the probability of sampling a possible transmission at a given Euclidean distance is conditioned on this ratio at that distance. If we did not do this, we would preferentially sample IP-farm pairs with longer Euclidean distances, because these are more numerous in the population. The unsampled IP-farm pairs make up a random sample of non-transmission pairs. The mean shortest or quickest routes of the randomly sampled possible transmissions and non-transmissions are found and their difference calculated. The observed difference in the means is then compared to the null distribution to obtain a p-value.
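The resampling procedure above might be sketched as follows. This is a simplified stand-in, not the authors' implementation: the weighted sampling without replacement is approximated by shuffling route values among pairs within each Euclidean-distance bin, which keeps the number of possible transmissions at each distance fixed. The bin width, the one-sided direction of the p-value and all names are assumptions.

```python
import random
from collections import defaultdict


def difference_in_means(routes, is_transmission, weights):
    """Weighted mean route of possible transmissions minus mean route of the rest."""
    num = sum(r * w for r, t, w in zip(routes, is_transmission, weights) if t)
    den = sum(w for t, w in zip(is_transmission, weights) if t)
    non = [r for r, t in zip(routes, is_transmission) if not t]
    return num / den - sum(non) / len(non)


def route_risk_p_value(euclid, routes, is_transmission, weights,
                       n_samples=1000, bin_km=1.0, seed=0):
    """One-sided p-value: are possible transmissions closer by road than chance?"""
    rng = random.Random(seed)
    observed = difference_in_means(routes, is_transmission, weights)
    # Group pair indices by Euclidean-distance bin (the conditioning step).
    bins = defaultdict(list)
    for idx, e in enumerate(euclid):
        bins[int(e // bin_km)].append(idx)
    as_extreme = 0
    for _ in range(n_samples):
        shuffled = list(routes)
        for members in bins.values():
            values = [routes[i] for i in members]
            rng.shuffle(values)
            for i, v in zip(members, values):
                shuffled[i] = v
        if difference_in_means(shuffled, is_transmission, weights) <= observed:
            as_extreme += 1
    return as_extreme / n_samples


# Toy data: within each distance bin, transmissions have shorter road routes.
euclid = [0.5] * 40 + [1.5] * 40
routes = [1.0] * 10 + [5.0] * 30 + [2.0] * 5 + [6.0] * 35
labels = [True] * 10 + [False] * 30 + [True] * 5 + [False] * 35
print(route_risk_p_value(euclid, routes, labels, [1.0] * 80, n_samples=200))
```

Shuffling only within distance bins is what prevents the null distribution from being dominated by the more numerous long-distance pairs, the concern raised in the text.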
To test if Euclidean distance is a better predictor of risk than shortest or quickest route, the two variables under consideration are swapped, with d_{i,j} in Equation 1 representing shortest or quickest route.
Simulated epidemics
Epidemics were simulated in order to test the power and specificity of the statistical test. The simulations are based on the stochastic simulations of Keeling et al. [5]. Briefly, infections of susceptible farms are Poisson processes with rates determined by the susceptibility of the susceptible farms, the transmissibility of all infectious farms and a Euclidean-distance or road based transmission kernel. The rates and the Euclidean-distance based kernel are parameterised using the UK 2001 epidemic [5]. If the Euclidean-distance based transmission kernel is K_{ e }(e) (where e is Euclidean distance), and the joint density function of Euclidean distance and shortest or quickest route for IP-farm pairs (e.g., Fig. 2) is f(r, e) (where r is shortest or quickest route), then the shortest or quickest route based transmission kernel K_{ r }(r) is given by
The Euclidean-distance kernel is the black line in Fig. 1. Using farms in Devon for f(r,e), the shortest route kernel is the magenta line and the quickest route kernel is the green line. For the first 30 days of the simulated epidemics, IPs are slaughtered after 3 days of reporting and farms within 1.5 km of an IP are pre-emptively culled after 5 days of reporting. These reduce to 1 and 2 days respectively after the first 30 days. There is no dangerous contact culling. One thousand simulations using the shortest route based transmission kernel were analysed. For an α value of 0.05, shortest route was a significantly better predictor of transmission than Euclidean distance for 98% of cases. However, the test for Euclidean distance as a better predictor of transmission was significant in 15% of cases. Conservatively, therefore, our test has a power of about 85%. An additional 1000 simulations using the Euclidean-distance based kernel were analysed. For an α value of 0.05, Euclidean distance was a significantly better predictor of transmission than shortest route for > 99.9% of cases. However, the test for shortest route as a better predictor of transmission was significant in just 1% of cases. Conservatively, therefore, our test has a specificity of about 99%.
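One natural reading of the route based kernel K_{ r }(r) is the average of K_{ e }(e) over all IP-farm pairs whose route length is r, that is, the conditional mean of the Euclidean kernel given the route. The sketch below estimates this empirically by binning pairs on route length. The conditional-mean form, the bin width, the placeholder kernel and the function name are all assumptions made for illustration.

```python
from collections import defaultdict


def route_kernel(pairs, euclid_kernel, bin_km=1.0):
    """Empirical route based kernel as a conditional mean (assumed form).

    pairs: iterable of (route_km, euclid_km) for IP-farm pairs.
    Returns {route bin index: mean of euclid_kernel(e) over pairs in that bin}.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for route, euclid in pairs:
        b = int(route // bin_km)
        sums[b] += euclid_kernel(euclid)
        counts[b] += 1
    return {b: sums[b] / counts[b] for b in sums}


def placeholder_kernel(e):
    # Illustrative decreasing function of distance (assumption only).
    return 1.0 / (1.0 + e)


pairs = [(1.2, 1.0), (1.8, 0.5), (2.5, 2.0)]
print(route_kernel(pairs, placeholder_kernel))
```

Because the pairs themselves stand in for the density f(r, e), a kernel built this way depends on the region supplying the pairs, which matches the use of Devon farms for f(r,e) above.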
Test for best distance-based transmission kernel
The following statistical test was developed to see if transmission between farms on opposite sides of specific transmission barriers is better modelled using a shortest route based transmission kernel or a Euclidean-distance based one. The distribution of infection probabilities (Equation 1) is found for IPs on opposite sides of a barrier, first with d_{i,j} representing Euclidean distance. The same is then done with d_{i,j} representing shortest route. If these two infection-probability distributions are significantly different from each other, this suggests that transmission across the barrier will be modelled differently under the two kernels. Given that transmission did not occur directly over the barrier, this implies that the shortest route based transmission kernel would be the better model. If, however, the distributions are not significantly different from each other, then transmission across the barrier will not be modelled significantly differently under the two kernels; therefore we can assume that a simple Euclidean-distance based transmission kernel will suffice. The Kolmogorov-Smirnov test was used to compare the distributions.
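A comparison of two infection-probability distributions with a Kolmogorov-Smirnov test can be sketched in pure Python as below. The text does not state which variant or implementation was used; a two-sample test with the large-sample (asymptotic) p-value is assumed here, and the function names and example values are hypothetical.

```python
import math


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


def ks_p_value(d, n, m, terms=100):
    """Asymptotic two-sided p-value from the Kolmogorov distribution."""
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.2:
        return 1.0  # the series converges slowly here and the value is ~1
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))


# Hypothetical infection probabilities under the two distance measures.
p_euclid = [0.30, 0.25, 0.20, 0.15, 0.10]
p_route = [0.05, 0.04, 0.03, 0.02, 0.01]
d = ks_statistic(p_euclid, p_route)
print(d, ks_p_value(d, len(p_euclid), len(p_route)))
```

A small p-value indicates that the two kernels assign materially different infection probabilities across the barrier, which is the condition under which the text prefers the shortest route based kernel.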
Declarations
Acknowledgements
This research is supported by The Wellcome Trust. We thank Miles Thomas from the Central Science Laboratory, DEFRA, Sand Hutton, Yorkshire for his invaluable help with the data, Paul Bessell for production of the Ordnance Survey maps, and Cerian Webb for suggesting we see if Euclidean distance was a more significant risk factor than road-based measures.
References
1. Alexandersen S, Zhang Z, Donaldson AI, Garland AJM: The pathogenesis and diagnosis of foot-and-mouth disease. J Comp Path. 2003, 129: 1-36. 10.1016/S0021-9975(03)00041-0.
2. Anderson I: Foot and mouth disease 2001: Lessons to be learned inquiry. The Stationery Office, London 2002.
3. Ferguson NM, Donnelly CA, Anderson RM: The foot-and-mouth epidemic in Great Britain: Pattern of spread and impact of interventions. Science. 2001, 292: 1155-1160. 10.1126/science.1061020.
4. Ferguson NM, Donnelly CA, Anderson RM: Transmission intensity and impact of control policies on the foot and mouth epidemic in Great Britain. Nature. 2001, 414: 542-548. 10.1038/35097116.
5. Keeling MJ, Woolhouse MEJ, Shaw DJ, Matthews L, Chase-Topping M, Haydon DT, Cornell SJ, Kappey J, Wilesmith J, Grenfell BT: Dynamics of the 2001 UK foot and mouth epidemic: Stochastic dispersal in a heterogeneous landscape. Science. 2001, 294: 813-817. 10.1126/science.1065973.
6. Murray JD: Mathematical Biology. Springer-Verlag, London 1993.
7. Gibbens JC, Sharpe CE, Wilesmith JW, Mansley LM, Michalopoulou E, Ryan JBM, Hudson M: Descriptive epidemiology of the 2001 foot-and-mouth disease epidemic in Great Britain: the first five months. Vet Rec. 2001, 149: 729-743.
8. Kitching RP, Hutber AM, Thrusfield MV: A review of foot-and-mouth disease with special consideration for the clinical and epidemiological factors relevant to predictive modelling of the disease. Vet J. 2005, 169: 197-209. 10.1016/j.tvjl.2004.06.001.
9. DEFRA: Department for Environment, Food and Rural Affairs FMD Data archive. [http://footandmouth.csl.gov.uk]
10. EDINA: Digimap Service. [http://edina.ac.uk/digimap]
11. Floyd RW: Algorithm 97 (SHORTEST PATH). Comms ACM. 1962, 5: 345. 10.1145/367766.368168.
12. Warshall S: A theorem on boolean matrices. J ACM. 1962, 9: 11-12. 10.1145/321105.321107.
13. Department for Transport, The Stationery Office: Transport Statistics for Great Britain. 2002.
14. Burrows R: Excretion of foot-and-mouth disease virus prior to the development of lesions. Vet Rec. 1968, 82: 387-388.
15. Gibson CF, Donaldson AI: Exposure of sheep to natural aerosols of foot-and-mouth disease virus. Res Vet Sci. 1986, 41: 45-49.
16. Donaldson AI, Gibson CF, Oliver R: Infection of cattle by airborne foot-and-mouth disease virus: minimal doses with O_{1} and SAT 2 strains. Res Vet Sci. 1987, 43: 339-346.
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.