### Study area and animal sampling

The study was carried out in the region of Castilla y León, in the NW of Spain, and included 17 commercial dairy flocks distributed in seven out of the nine provinces of the region (Burgos, León, Palencia, Segovia, Valladolid, Salamanca and Zamora) (Fig. 1). In the study area, the flocks are reared under a semi-extensive system in which sheep graze on natural pasture for six hours per day and are kept indoors for the rest of the day. The average size of the sampled flocks was 912, ranging from 302 to 2121 animals per flock.

The survey was conducted from December 2011 to June 2012. This period was extremely dry (Additional file 1). Two conditions had to be met to include a flock in the study: first, the last anthelmintic treatment must have been administered at least 2 months before collecting the samples, and second, the sheep had to be grazing at the time of sampling. The animals included in this study were ewes obtained by artificial insemination from farms belonging to the Selection Nucleus of the National Association of Churra Breeders (ANCHE). Moreover, these animals were a subset of those previously genotyped with the *Illumina* OvineSNP50 BeadChip by [19] which were still alive during the sampling period and for which both phenotypes related to parasite resistance were available. Faecal samples were collected for each ewe directly from the rectum and blood samples were obtained by venipuncture of the jugular vein. Serum samples were stored at -20 °C until processing. This study is based on 529 adult Churra sheep with faecal and blood serum samples. The mean number of sheep sampled per flock was 31 (range: 11–60 individuals). The age of the sheep included in the study varied between 4 and 11 years. All of the sheep were undergoing milking at the time of sampling and were experiencing at least their third lactation.

### Parasitological measures

A modified McMaster technique [20] using zinc sulphate as a flotation solution was used to determine the number of eggs in faeces. The minimum detection limit of this technique was 15 eggs per gram (epg). Faecal egg counts were determined by multiplying the number of eggs observed microscopically (Neggs) by 15.

In each flock, pooled faeces were cultured to recover and identify third-stage larvae (L3) following standard parasitological techniques [20]. A total of 100 L3 were identified per flock to estimate the percentage of each species.

### Titre of IgA

An indirect ELISA was carried out to determine the activity of IgA in the serum, results were scored as optical density (OD). The preparation of somatic antigen from fourth-stage larvae (L4) of *T. circumcincta* was conducted as previously described by [21]. Microtitre plates (Sigma) were coated with 100 μl of PBS containing 2.5 μg/ml of *T. circumcincta* L4 somatic antigen, after which the plates were stored overnight at 4 °C. After discarding their contents, the plates were blocked with 250 μl of PT-Milk (4 g powdered milk + 100 ml PBS-Tween; PBS-Tween: 1 L PBS pH 7.4 + 1 ml Tween) for 30 min at 37 °C. Then, the blocking buffer was discarded, and 100 μl of serum was added, followed by incubation for 30 min at 37 °C. After washing the plates four times with PBS-Tween, 100 μl of a rabbit anti-sheep IgA antibody, conjugated to horseradish peroxidase (Serotec), at a dilution of 1/500 in PT-Milk, was added, followed by incubation for 30 min at 37 °C. The plates were then washed again four times with PBS-Tween and subsequently incubated in a peroxidase substrate and tetramethylbenzidine solution to produce a colour reaction, which was stopped by the addition of 50 μl of 2 M H_{2}SO_{4}. Finally, the absorbance was measured at 450 nm in a microplate reader (Titertek Multiskan). Positive and negative controls were included in every plate. Positive controls were obtained from a pool of serum from experimentally infected sheep with *T. circumcincta* and negative controls from non-infected sheep that were kept indoors. The results were expressed as the optical density ratio (ODR):

$$ \mathrm{O}\mathrm{D}\mathrm{R}=\left(\mathrm{sample}\ \mathrm{O}\mathrm{D}\hbox{-} \mathrm{negative}\ \mathrm{O}\mathrm{D}\right)/\left(\mathrm{positive}\ \mathrm{O}\mathrm{D}\hbox{-} \mathrm{negative}\ \mathrm{O}\mathrm{D}\right) $$

(1)

### Descriptive statistics

Descriptive statistical analysis for the two traits was conducted for the 529 sampled animals with the ‘pastecs’ package [22] in R [23]. The Shapiro-Wilk test was carried out to determine if the data for each trait was normally distributed. Due to the large number of zero counts in the FEC data and the fact that the animals graze during short periods of time (semi-extensive rearing system), we decided to use a ZINB model to estimate the zero-inflation parameter and then extended it to discriminate between exposed and unexposed animals. The zero inflated model with IgA data was compared to a simpler negative binomial model using a likelihood ratio test. Moreover, in this particular study, a zero inflated model is a biologically meaningful description of the system; the adverse climatic conditions for larval development of the year studied will reduce pasture contamination, and the short grazing periods due to the semi-extensive rearing system will reduce exposure, which means that some animals would not have been infected at the time of sampling, and may not have been infected since the last anthelmintic treatment. The zero inflated model also allows for a more natural extension into discriminating between infected and uninfected animals.

### Estimation of zero-inflation

In the zero inflated model, positive FEC are derived from a NB distribution, while a zero count can arise from either the NB distribution or the zero distribution (a binary distribution that generates structural zeros). The probability of belonging to the zero distribution is called the zero-inflation parameter. The animals that have zero counts arising from the zero distribution are assumed to have not been infected since the last anthelmintic treatment, so these animals can be excluded from further analysis. A Markov Chain Monte Carlo model similar to the one described in Denwood et al. [15] using the ‘*runjags*’ package [24] was employed to estimate the zero-inflation parameter.

In this model, the negative binomial distribution arises from a gamma-Poisson mixture distribution. Uninformative priors were used for the parameters of the gamma distribution. The posterior distribution of the zero-inflation parameter is shown in Fig. 2.

### Extending the ZINB model

A zero-inflation model does not determine which animals are exposed and resistant (as opposed to unexposed). The classical ZINB model was therefore extended to accommodate IgA data as additional information for the animal status, i.e. infected or not recently infected. The animal status is calculated as,

$$ \mathrm{Status}=\Big\{\begin{array}{l}0;\kern1em \mathrm{not}\kern0.28em \mathrm{recently}\kern0.28em \mathrm{infected}\kern1em \mathrm{with}\kern0.28em \mathrm{probability}\kern0.28em 1-{P}^{\exp },\kern1em \\ {}1;\kern1em \mathrm{infected}\kern6em \mathrm{with}\kern0.28em \mathrm{probability}\kern0.28em {P}^{\exp}\end{array}\kern1em \operatorname{} $$

(2)

where status = 0 means that the animal has not been recently infected and status = 1 means that the animal is infected. *P* is the probability of being recently exposed and is equivalent to one minus the zero-inflation parameter. The raw egg counts (FEC/15) were used and it is assumed that for each animal *i*, the number of eggs counted arises from the following,

$$ \mathsf{Negg}{\mathsf{s}}_i\sim \Big\{\begin{array}{l}0\kern6em \mathrm{if}\kern0.28em \mathrm{Status}=0,\kern1em \\ {}\mathrm{Poisson}\left({\uplambda}_i\right)\kern1.48em \mathrm{if}\kern0.28em \mathrm{Status}=1\end{array}\kern1em \operatorname{} $$

(3)

where *λ*
_{
i
} is the number of eggs arising from the gamma distribution (equation 4).

$$ {\uplambda}_i\sim \mathrm{gamma}\left(\mathrm{shape},\;\mathrm{rate}\right) $$

(4)

with the shape and the rate parameters of the gamma being calculated by the model. Similarly the IgA data can be partitioned in 2 gamma distributions (equation 5) based on the animal status.

$$ {\mathrm{IgA}}_i\sim \left\{\begin{array}{l}\mathrm{gamma}\left({\mathrm{sh}}_1,{\mathrm{rt}}_1\right)\kern2em \mathrm{if}\;\mathrm{Status}=0,\hfill \\ {}\mathrm{gamma}\left({\mathrm{sh}}_2,{\mathrm{rt}}_2\right)\kern2em \mathrm{if}\;\mathrm{Status}=1\hfill \end{array}\right. $$

(5)

with sh_{1}, sh_{2}, rt_{1} and rt_{2} being the two shapes and two rates respectively that parametrize the two gamma distributions. In the model, samples are drawn for sh_{1} and sh_{2} as well as for mn_{1} and mn_{2}, which are the two means of the two gamma distributions. The rates are calculated by rate = shape/mean and the mean for the animals not recently infected (mn_{1}) is always smaller than the mean of the infected (mn_{2}). The fully annotated R code of the model is given in the Additional file 2.

The number of iterations sampled was 50,000, with the first 5,000 being discarded (burn in), and assessed convergence with the Gelman-Rubin statistic from the ‘*coda*’ package [25] being under 1.05.

Using the realisations of the animal status across the iterations (unexposed animals have status = 0, exposed and infected have status = 1), it is possible to calculate the probability for each animal to be in one status or the other, \( {P}_i^{\exp } \); animals without zero FEC will always be in the infected status. The animals that were estimated to be unexposed, i.e. the animals with status = 0, in each sample of the Markov Chain were excluded from further analyses, allowing the use of simple statistical tools to analyse the remaining dataset for each sample.

### Correlations between phenotypes

Considering FEC, IgA and the realisations of animal status, \( {P}_i^{\exp } \), the Kendall’s rank correlation coefficient was used to estimate the relationships among these three parameters. We used Kendall’s rank because it is an appropriate non-parametric hypothesis test. Correlations were calculated in R, using the ‘*ltm*’ package [26], for each sample of the Markov Chain and the average across the samples is reported below.