The key aspects of paratuberculosis infection that we intended to capture in our model were the impact of herd management practices, the spread of infection through the environment, vertical and pseudo-vertical transmission from dam to calf, and the influence of imperfect tests on the effectiveness of control strategies.

A continuous time stochastic model was developed by assigning mean rates to the various possible events, from which inter-event times can be generated using the Gillespie algorithm [2]. Although the infection model is in continuous time, management decisions are taken on a discrete time basis, with a time step of one week, in order to reflect typical farm routines. Individuals are assigned one of four management states (calf, heifer, pregnant and non-pregnant productive animals) as well as one of five infection states (susceptible, infected but not shedding, low shedding, high shedding and clinical). As the approach is computationally intensive, the model and reweighting procedure were constructed in the C programming language.

#### Herd Management

The herd was assumed to be closed (new animals are not introduced to the farm). Control methods which are found to be appropriate in the closed case will still be useful in the open herd situation if coupled with sensible purchasing strategies to minimize the risk of importing infection into the herd. The converse is not true, since the long-term epidemiological state of an open herd will be critically driven by the level of infection imported into the herd. Therefore we have considered the long-term behaviour of the epidemic following the introduction of a single infected animal onto the farm. For the purposes of the current study a herd size of 90 dairy cattle was used, a figure which reflects average herd sizes in Scotland circa 2000 [14], as well as the mean of 81 animals in the data set [15] used in the reweighting procedure, described in section “Latin hypercube sampling and reweighting parameter sets”. A spread rather than seasonal calving strategy was assumed, where individual births can occur at any point in the year so that no seasonal variation was included in the model. At birth, it was assumed that 50% of individuals are male, with these individuals being immediately culled. Furthermore 8% of female calves are born dead or abnormal and removed from the model. After a fixed time span of 6 months individuals are moved into the next management state, “heifer”, in order to form a supply of new animals for the milking herd. A calf mortality rate *μ*
_{
c
} of 0.0085 per animal per month was assumed, corresponding to an overall calf mortality of 5% over 6 months, as given by the Milk Development Council [16].

The herd was assumed to be operating under a set stocking management practice, in which a constant number of productive animals is maintained. Animals in the heifer class are matured until they reach an age of 16 months. During this period they have an instantaneous mortality rate *μ*
_{
h
} of 0.0011 per animal per month [17]. In order to keep the spread calving pattern intact, the death of an individual in the milking herd will result in the selection of a replacement from those heifers in the age range of 14 to 16 months, with a view to calving occurring at around 24 months. Heifers which have not been selected by the time they reach 16 months are culled [17, 18], while if a replacement is needed when the pool of 14 to 16 month old animals is empty, a clean animal is bought in.

Productive animals are pregnant for a period of 9 months, and then left empty for 4 months [

18]. In order to calculate mortality rates for productive animals an estimate for the annual replacement fraction in a typical Scottish dairy herd of 0.242 was used, based on the Scottish Government Economic Report on Scottish Agriculture [

14]. The proportion of culls which are due to age related factors was taken from literature estimates of 0.11 [

18] and 0.05 [

19]. Combining these figures gave estimates of the fraction of the herd culled annually for reasons not related to age,

*q*, of 0.216 and 0.230 respectively. As this range is fairly narrow we take a fixed intermediate value of

*q*=0.22. This proportion is further partitioned into a fraction due to reproductive failure, taken as 0.292 [

19], with the remaining 70.8

*%*being assigned to other non-age related causes. An exponential inter-event time distribution was assumed, giving an instantaneous mortality rate for non-pregnant animals

*μ*
_{
a
}=−ln(1−0.708

*q*) per year. As animals are pregnant for 9 months followed by 4 months empty, the expected length of pregnancy in a randomly selected 12 month block is 108/13. Using this to generate the excess mortality

*μ*
_{
p
}associated with pregnancy gives a total mean monthly rate for pregnant animals of

An additional source of mortality in productive animals is culling due to decline in milk yield. In this model it was assumed that individuals are kept for a fixed number of lactations, *n*, and culled upon reaching this limit. The value of *n* is fixed for a particular farm or, equivalently, sample parameter set, but due to the wide range of values quoted in the literature was treated as one of the sampled parameters, with values between 5 and 12 lactations.

#### Infection model

Due to the long incubation period of paratuberculosis, it was assumed that no calves are in the shedding states of the model [20]. The principal route of transmission applicable to all animals is by ingestion of bacteria from the environment. In addition to the average level of environmental bacteria, adult (heifer and productive) animals are exposed to a further bacterial source term representing an increased level around a clinical individual, whereas it was assumed that calves have little direct contact with animals other than their dam. Let *c*(*t*) be the level of bacteria in the environment at time *t* and *Z*
_{
a
}(*t*), *Y*
_{
ah
}(*t*) and *Y*
_{
al
}(*t*) be the number of clinical, high and low shedding adults respectively. The number of sub-clinical individuals is denoted by *X*
_{
a
}(*t*) for adults and *X*
_{
c
}(*t*) for calves.

Adult animals become infected at a rate

where

*β*
_{
d
}and

*β*
_{
ai
}are parameters controlling the overall rate of direct and indirect infection, while

*N*
_{
a
} is the total number of adult individuals. The function

*f*
_{
a
}(

*c*(

*t*)) defines the relationship between the level of bacteria in the environment and the force of infection on a susceptible adult. For calves, infection through the environment is modelled in a similar manner to that in adults, with a rate

where *β*
_{
ci
}is a further parameter controlling the rate of indirect infection of calves, and *f*
_{
c
}(*c*(*t*)) defines the relationship between the level of bacteria in the environment and the force of infection on a susceptible calf.

Where a calf is born to a shedding animal both

*in utero* and

*post partum* routes of infection were included. For

*in utero* infection, the probability

*p*
_{
iu
}of being infected at birth is given by

in which *α*
_{
low
} and *α*
_{
high
} are shedding rates, defined below, and *p*
_{max}is a sampled parameter.

In order to represent

*post partum* infection an additional term is added to the

*in utero* probability, so that the total probability of vertical infection is given by

Here *u* is a sampled parameter taken from a *U*(0,1) distribution, representing the extent to which *post partum* infection through routes such as infected milk/colostrum is present in the model, while *ϕ*is a parameter which is used to represent infection management on the farm by limiting dam-calf contact, as will be discussed in section “Control measures”. In the absence of management interventions (*ϕ*=1) the probability of *post partum* infection will range from *p*
_{
iu
} to 1. It was assumed that calves have little direct contact with animals other than their dam. Calf to calf transmission has recently been demonstrated [21], however we do not include this route as it is assumed to be a low probability event given the low levels of shedding.

The response functions

*f*
_{
a
}(

*c*) and

*f*
_{
c
}(

*c*) describe the infective impact of environmental bacteria. The true infective impact of any environmental infection is difficult to model as it will depend on the distribution of the bacteria and the grazing habits of the animals [

22,

23]. For this reason it is more useful to assume a simple functional form for this response, which will be nonlinear with a sigmoidal form obeying the natural constraints

*f*(0)=0 and lim

_{
c→∞
}
*f*(

*c*)=

*K*. The constant

*K* can be set to 1 without loss of generality as its value can be absorbed into the values of

*β*
_{
d
},

*β*
_{
ai
} and

*β*
_{
ci
}. Given these constraints a piecewise linear function was used, as this has relatively few parameters involved and will allow more effective sampling, hence

with different values
,
,
and
for calves and adults. Using a different function *f*
_{
c
}(*c*(*t*)) for calf infection from the environment allowed for calves to be modelled as being at a higher risk of infection for a given level of contamination. The models of [8–11] all assume that initial infection can only occur up to one year of age, whereas here it has merely been assumed that the rate of infection is higher for calves.

Once infected each animal is assigned a sub-clinical phase of duration *T*
_{
i
}, which is drawn from a Gamma distribution *Γ*(*ν*
_{
c
}
*λ*
_{
c
}) for calves and *Γ*(*ν*
_{
a
}
*λ*
_{
a
}) for adults. The parameters in these distributions were obtained by fitting to published data [24–26]. The sub-clinical period for each animal *i* is divided into the non-shedding phase of duration *f*
_{1}
*T*
_{
i
}, the low shedding phase of duration *f*
_{2}
*T*
_{
i
} and the high shedding phase of duration (1−*f*
_{1}−*f*
_{2})*T*
_{
i
}. These classes were designed to be analogous to those of [27], with disease states also being similar to those used in [10].

Animals in the low shedding phase excrete bacteria at a rate

*α*
_{
low
}, while those in the high shedding state do so at a rate

*α*
_{
high
}. In order to calculate these shedding parameters, an underlying exponential shedding process for an animal infected at time

*t*
_{0} of

was assumed. At the start of the low shedding phase, a shedding rate

*α*(0)=

*α*
_{
min
} was assumed, which is derived using the faecal test model described in section “Control measures” assuming a mean detection probability for an animal in the low shedding state equal to 0.2, as given in [

27]. The final shedding rate

*α*((1−

*f*
_{1})

*T*
_{
i
})=

*α*
_{
max
}, obtained when an animal reaches the clinical phase of infection, is likely to be the most critical to the epidemiology and was thus treated as a sampled parameter. Given these values it is straightforward to calculate

*A* and

*λ*. The values of

*α*
_{
low
}and

*α*
_{
high
}were then calculated as averages of the exponential process over the corresponding time periods,

Explicitly modelling shedding in this manner, allowing for some individuals to shed at extremely high rates, incorporates some of the issues addressed by Mitchell et al. [11], although the current model only allows for variability in shedding rates between realisations, rather than explicit inclusion of a small number of animals shedding at an exceptionally high rate. It is likely that such animals would have a relatively limited impact on the long term dynamics of infection as they would be easily picked up by faecal tests [11], however they would contribute to additional variability in the observed dynamics, a feature which would not be observed in a deterministic model such as that of Mitchell et al.

Individuals in the clinical state were assigned a disease induced mortality *μ*
_{
i
} which is added to their basic mortality rates, *μ*
_{
h
}, *μ*
_{
a
} or *μ*
_{
a
} + *μ*
_{
p
}, in order to reflect culling of such animals. It was assumed that an animal is culled one week after entering the clinical state, equivalent to a value for *μ*
_{
i
}of 2 per animal per month.

#### Environmental infection

Mitchell et al. do not model environmental infection, but suggest that doing so would be worthwhile. The level of environmental contamination

*c*(

*t*) was modelled as a deterministic process with a linear decay rate

*δ*, which is a sampled parameter having a range of values that gives an infective dose a half life of between 28 and 133 days, consistent with data in [

28]. The value of

*c*(

*t*) is governed by the differential equation

A deterministic process was chosen as this quantity is likely to be very high, and hence the size of fluctuations will be relatively small. A stochastic process for a population of such a size would be extremely numerically intensive, whereas the deterministic process can be solved relatively straightforwardly. As

*Y*
_{
al
},

*Y*
_{
ah
} and

*Z*
_{
a
} are stochastic variables, the function

*c*(

*t*) will exhibit random changes in the growth rate at discrete time points. There is no source term in this equation as the bacterium is an obligate parasite. As the stochastic variables are discrete valued, it is possible to solve this equation piecewise in a sequence of time windows over which these variables are constant. Over such an interval we can rewrite equation (

9) as

in which

*A* is constant. This equation has the solution

which allows forward calculation in time of *c*(*t*) by inserting its value just prior to any change in the random variables as *c*(0), with the new value for *A* arising from the change.