We work out a procedure for calculating a risk map for the neighborhood transmission of CSFV. In essence, two pieces of information are required as input for the calculation. The first is the set of spatial locations of all (current) farms with pigs in the country. The second is an estimate of the distance-dependent probability of neighborhood transmission of CSFV.

### Database of herd locations

The National Animal Health Service maintains a database of spatial location information for all locations with pigs in The Netherlands, in which the positions of locations are given in meters {X,Y} according to the official cadastral system in The Netherlands (RD-coordinates). From the 2004 database we used the geographical coordinates (point locations) and the type of location (market places, slaughterhouses, rendering plants, recreational farms (containing fewer than 5 pigs), and four types of commercial pig farms). We excluded slaughterhouses, market places, and rendering plants from the analysis, because no pigs were assumed to be present at these locations during an outbreak of CSFV due to the zoo-sanitary measures. All other farms were included in the analysis.

If more than one owner of pigs or more than one herd type was registered at the same geographical location, we considered that location as one single herd. If one farm had pigs housed at different geographical locations, we considered each location as a separate herd. Altogether, this yielded a net total of about 15,000 pig herds for the analysis.
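The herd definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (`Record`), the location-type labels in `EXCLUDED`, and the example coordinates are all hypothetical and do not reflect the actual database schema.

```python
from typing import NamedTuple


class Record(NamedTuple):
    """One database entry (hypothetical layout, for illustration only)."""
    owner: str
    x: int      # RD x-coordinate in meters
    y: int      # RD y-coordinate in meters
    kind: str   # type of location


# Location types without pigs during an outbreak (labels are assumed).
EXCLUDED = {"slaughterhouse", "market place", "rendering plant"}


def herds_from_records(records):
    """Return the unique (x, y) point locations that count as herds.

    Several owners or herd types at one location collapse into one herd;
    one owner with several locations yields one herd per location.
    """
    locations = {(r.x, r.y) for r in records if r.kind not in EXCLUDED}
    return sorted(locations)


records = [
    Record("owner A", 150000, 450000, "commercial"),
    Record("owner B", 150000, 450000, "commercial"),   # same location as A
    Record("owner A", 152500, 451000, "commercial"),   # second location of A
    Record("owner C", 160000, 455000, "slaughterhouse"),
    Record("owner D", 161000, 456000, "recreational"),
]
herds = herds_from_records(records)
print(len(herds), "herds")
```

Using a set of coordinate pairs makes the "one location, one herd" rule explicit without any reference to ownership.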

### Method to assign risk levels to herds

In this section we describe how we combined the herd location data with an estimate of the probability of transmission – the second piece of information in the method – to obtain risk maps for the neighborhood spread of CSFV in the presence of base-line control measures prescribed by EU legislation. In the analysis of the 1997–1998 CSF epidemic in The Netherlands by Stegeman et al. [4], a division was made between transmission via traceable contacts and short-range, distance-dependent "neighborhood transmission". In this work we focus on the latter type of transmission. The probability of neighborhood transmission, *p*(*r*), can be written as

*p*(*r*) = 1 - exp(- *λ*(*r*)*T*)

(1)

in which *r* is the Euclidean (straight-line) distance from infected herd to susceptible herd, *T* is the mean infectious period of a herd and *λ*(*r*) is the distance-dependent rate of neighborhood transmission. Stegeman et al. [4] estimate *λ*(*r*) from observed herd infections that arose from untraced neighborhood contacts in the 1997–1998 CSF epidemic in The Netherlands. Using a step function to approximate the rate *λ*(*r*), their result is:

\lambda (r)=\begin{cases}0.0270\ \text{(per week)} & 0\le r<0.5\ \text{km}\\ 0.0078\ \text{(per week)} & 0.5\le r<1\ \text{km}\\ 0.00006\ \text{(per week)} & 1\le r<2\ \text{km}\\ 0\ \text{(per week)} & r\ge 2\ \text{km}\end{cases}.

(2)

For the estimation of this rate, the numbers of both infected and uninfected herds in the 1997–1998 CSF epidemic in The Netherlands were used. The majority of infected herds were reported because of clinical symptoms. A small proportion of the pre-emptively slaughtered herds were diagnosed positive after slaughter, based on serum and blood samples taken shortly before culling [3]. As the sensitivity of the detection procedure used is considered to be very high, the data set should enable an accurate estimation of the rate of neighborhood transmission *λ*(*r*). The mean infectious period *T* of a herd (in the presence of base-line control measures) has also been estimated by Stegeman et al. [14].

It was found that *T* varied in time, being 6 weeks early on in the epidemic, and being reduced to about 3 weeks later on. To investigate the effect of *T* on the size of high-risk areas we have carried out calculations for *T* = 3, *T* = 6, and *T* = 9 weeks.
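As an illustration of Equations (1) and (2), the following minimal Python sketch evaluates *p*(*r*) for the three values of *T*. The function names `rate` and `transmission_probability` and the sample distances are our own choices for this example; the numerical rates are those of Equation (2), and distances are assumed to be non-negative.

```python
import math

# Step function of Equation (2): (upper bound of distance class in km,
# transmission rate per week). Beyond 2 km the rate is zero.
RATE_STEPS = [(0.5, 0.0270), (1.0, 0.0078), (2.0, 0.00006)]


def rate(r_km):
    """Distance-dependent rate lambda(r) per week, for r_km >= 0."""
    for upper_km, lam in RATE_STEPS:
        if r_km < upper_km:
            return lam
    return 0.0


def transmission_probability(r_km, T_weeks):
    """Equation (1): p(r) = 1 - exp(-lambda(r) * T)."""
    return 1.0 - math.exp(-rate(r_km) * T_weeks)


# One sample distance per distance class (arbitrary illustrative values).
for T in (3, 6, 9):
    probs = [transmission_probability(r, T) for r in (0.25, 0.75, 1.5, 2.5)]
    print(f"T = {T} weeks:", ["%.5f" % p for p in probs])
```

Because the exponent is small in every distance class, *p*(*r*) is close to *λ*(*r*)*T*, so the probability grows almost linearly with the infectious period.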

Using the calculational framework developed by Boender et al. [12], the availability of an estimated between-herd transmission probability *p*(*r*) allows one to calculate a measure for the expected amount of neighborhood transmission for each individual herd *i*, given by the local reproduction ratio *R*_{hi}. If a herd *i* infects a herd *j* a distance *r*_{ij} away with probability *p*(*r*_{ij}), on average herd *i* will infect the following number of herds [12]:

{R}_{\text{h}i}=\frac{1}{{f}_{c}}{\displaystyle \sum _{j}p({r}_{ij}),}

(3)

in which the summation is over all the herds *j* excluding herd *i*. The factor *f*_{c} compensates for the effects of "local depletion" of the pool of neighboring susceptible herds (i.e. effects relating to neighboring herds already being infected) and is defined and calculated in Ref. [12]. As local depletion reduces the number of herds potentially infected by the source herd, the factor *f*_{c} is larger than one (i.e. *f*_{c} > 1). In our case *f*_{c} equals 1.5, 1.7, and 1.9 for *T* = 3, *T* = 6, and *T* = 9 weeks, respectively.
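Equation (3) can be sketched as a direct double loop over herd locations. The code below is a minimal illustration, not the implementation of Ref. [12]: the function names and the toy coordinates are assumptions, the values of *f*_{c} are simply taken from the text rather than computed, and coordinates are assumed to be in meters as in the herd database.

```python
import math

RATE_STEPS = [(0.5, 0.0270), (1.0, 0.0078), (2.0, 0.00006)]  # Equation (2)
F_C = {3: 1.5, 6: 1.7, 9: 1.9}  # depletion factor per T (weeks), from the text


def transmission_probability(r_km, T_weeks):
    """Equation (1) with the step-function rate of Equation (2)."""
    lam = next((l for upper, l in RATE_STEPS if r_km < upper), 0.0)
    return 1.0 - math.exp(-lam * T_weeks)


def local_reproduction_ratios(herds_m, T_weeks):
    """Equation (3): R_hi = (1 / f_c) * sum over j != i of p(r_ij).

    herds_m is a list of (x, y) point locations in meters.
    """
    f_c = F_C[T_weeks]
    ratios = []
    for i, (xi, yi) in enumerate(herds_m):
        total = 0.0
        for j, (xj, yj) in enumerate(herds_m):
            if i == j:
                continue
            r_km = math.hypot(xi - xj, yi - yj) / 1000.0
            total += transmission_probability(r_km, T_weeks)
        ratios.append(total / f_c)
    return ratios


# Toy example: two herds 300 m apart and one isolated herd.
herds_m = [(0, 0), (300, 0), (50000, 50000)]
print(local_reproduction_ratios(herds_m, 6))
```

The double loop scales quadratically with the number of herds. For a data set of about 15,000 herds, restricting the inner loop to herds within 2 km (for example with a grid-based spatial index) would avoid most distance evaluations, since *p*(*r*) vanishes beyond that range.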

As the reproduction ratio *R*_{hi} is a weighted sum over all neighboring herds according to Equation (3), it is a measure of the local density of herds at each location *i*. A standard result in infectious disease epidemiology states that "major outbreaks" can only occur if the reproduction ratio (*R*) is larger than one, i.e. if *R* > 1 [17, 18]. In our context, the concept of a major outbreak translates into epidemic spread of the virus from herds with a local reproduction ratio larger than one. For local reproduction ratios below one, at most a few transmission events will take place, i.e. no progressing epidemic spread will occur locally (see below for a detailed explanation). As a consequence, we may classify herds with *R*_{hi} < 1 as low-risk herds and herds with *R*_{hi} > 1 as high-risk herds. By calculating *R*_{hi} for each pig herd *i* in the country we are able to identify high-risk areas as areas spanned by groups of neighboring high-risk herds.
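To make the link between the threshold and local herd density concrete, the sketch below classifies herds by their local reproduction ratio and computes how many neighbors, all lying in the same distance class, a herd would need to exceed the threshold. Both helper functions are hypothetical illustrations. Treating a ratio of exactly one as low-risk is our assumption, since the text only defines the cases below and above one.

```python
import math

RATE_STEPS = [(0.5, 0.0270), (1.0, 0.0078), (2.0, 0.00006)]  # Equation (2)
F_C = {3: 1.5, 6: 1.7, 9: 1.9}  # depletion factor per T (weeks), from the text


def transmission_probability(r_km, T_weeks):
    """Equation (1) with the step-function rate of Equation (2)."""
    lam = next((l for upper, l in RATE_STEPS if r_km < upper), 0.0)
    return 1.0 - math.exp(-lam * T_weeks)


def classify(ratios):
    """Label each herd by its local reproduction ratio.

    Assumption: a ratio of exactly 1 is labeled low-risk.
    """
    return ["high-risk" if R > 1.0 else "low-risk" for R in ratios]


def min_neighbors_for_high_risk(r_km, T_weeks):
    """Smallest number n of neighbors at distance r_km with n * p / f_c > 1.

    Returns None when p(r) = 0, i.e. beyond the 2 km range.
    """
    p = transmission_probability(r_km, T_weeks)
    if p == 0.0:
        return None
    return math.floor(F_C[T_weeks] / p) + 1


print(classify([0.4, 1.2]))
for T in (3, 6, 9):
    print(f"T = {T}:", min_neighbors_for_high_risk(0.25, T),
          "neighbors within 0.5 km")
```

The neighbor count is only a rough guide, because in practice a herd's neighbors are spread over several distance classes and all of them contribute to the sum in Equation (3).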

### Method to discriminate between high-risk and low-risk areas

The most straightforward way to construct a risk map is by simply color-coding the two classes of herds on a map, thus visualizing areas with high-risk herds and areas with low-risk herds. We note, however, that an outbreak of CSFV starting at a low-risk herd situated near an area with high-risk herds may often reach that area and still lead to a major outbreak. A way to take this effect into account is by adding border zones to high-risk areas.

We used a simulation approach to determine the border zone. This approach is based on the observation that the distance-dependent probability of virus transmission given in Equation (1) defines a spatial transmission model for which random epidemics can be generated on the set of all pig herds in the country. In detail, starting from a single infected herd, the second generation of infections can be generated by randomly assigning (according to the distance-dependent probability) infectious contacts of the neighboring herds with the infected herd. From the second generation the third generation of infected herds is randomly generated, and so on. At some point the epidemic terminates, either because no area with high-risk herds is reached (minor outbreak), or because, after such an area was hit, the number of high-risk herds in that area is exhausted (major outbreak). For each individual 'low-risk' herd we generated 100 different epidemics starting from that herd, and recorded the total number of herds infected in each epidemic (i.e. its final size). Subsequently we inspected these 100 final sizes to see if they contained any major outbreaks. For each of the 'low-risk' herds that nevertheless gave rise to major outbreaks, we calculated the distance to the nearest high-risk herd. We fixed the value of the border zone radius around the areas with high-risk herds such that it contained 95% of the 'low-risk' herds that gave rise to major outbreaks. We varied the critical size beyond which an outbreak was called 'major' in this calculation between 10 and 100 infected herds. As the results were found to be insensitive to the precise critical size as long as it was chosen larger than or equal to 40 infected herds, we have used the critical size of 40 in the results below.
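The simulation procedure can be sketched as follows. This is a simplified illustration under stated assumptions, not the code used for the results: each herd can be infected only once, an outbreak is called major when its final size exceeds the critical size, the radius is taken as the smallest observed distance that covers the requested fraction of herds, and all function and parameter names are hypothetical.

```python
import math
import random

RATE_STEPS = [(0.5, 0.0270), (1.0, 0.0078), (2.0, 0.00006)]  # Equation (2)


def transmission_probability(r_km, T_weeks):
    """Equation (1) with the step-function rate of Equation (2)."""
    lam = next((l for upper, l in RATE_STEPS if r_km < upper), 0.0)
    return 1.0 - math.exp(-lam * T_weeks)


def final_size(herds_m, seed_index, T_weeks, rng):
    """Generate one random epidemic, generation by generation.

    Returns the total number of infected herds, including the seed herd.
    """
    infected = {seed_index}
    current = [seed_index]
    while current:
        next_generation = []
        for i in current:
            xi, yi = herds_m[i]
            for j, (xj, yj) in enumerate(herds_m):
                if j in infected:
                    continue  # a herd is infected at most once (assumption)
                r_km = math.hypot(xi - xj, yi - yj) / 1000.0
                if rng.random() < transmission_probability(r_km, T_weeks):
                    infected.add(j)
                    next_generation.append(j)
        current = next_generation
    return len(infected)


def border_zone_radius(herds_m, high_risk, T_weeks, runs=100,
                       critical_size=40, coverage=0.95, seed=0):
    """Radius (meters) containing `coverage` of the low-risk herds that
    produced at least one major outbreak in `runs` simulated epidemics.

    high_risk is the set of indices of herds with R_hi > 1.
    """
    rng = random.Random(seed)
    high = [herds_m[i] for i in high_risk]
    if not high:
        return 0.0
    distances = []
    for i, (x, y) in enumerate(herds_m):
        if i in high_risk:
            continue
        sizes = [final_size(herds_m, i, T_weeks, rng) for _ in range(runs)]
        if max(sizes) > critical_size:  # "beyond" read as strictly larger
            distances.append(min(math.hypot(x - hx, y - hy)
                                 for hx, hy in high))
    if not distances:
        return 0.0
    distances.sort()
    return distances[math.ceil(coverage * len(distances)) - 1]
```

A call such as `border_zone_radius(herds_m, high_risk, T_weeks=6)` would then return the border zone radius in meters for a given herd set. As with the reproduction ratio, the inner loop over all herds would in practice be restricted to herds within 2 km of the infecting herd.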