The study included five Atlantic salmon (Salmo salar) sites (A, B, C, D and E) located in Sogn og Fjordane county in western Norway. Average seaway distances between sites were below 50 km, except for site E, which was around 70 km from the nearest site. All bath treatments were conducted with full tarpaulin enclosures using Betamax Vet® (Novartis Aqua Norge, Oslo, Norway) in accordance with the manufacturer’s recommendations. Bath treatments were performed in all pens at each production site between November 2011 and February 2012. The number of pens (p) per site varied from five to nine: site A (p = 6), B (p = 7), C (p = 9), D (p = 6) and E (p = 5).
All treatments were completed within five days at a given site. Pens were treated consecutively at a rate of one or two pens per day. Water temperatures (based on monthly average values at the time of treatment) were similar at three sites (C, D and E), ranging from 6.8 to 8.2°C. The coldest and warmest temperatures were recorded at site B (5.4°C) and site A (10.5°C), respectively. Delousing treatments in Norway are mandatory when lice levels exceed the thresholds set in Norwegian regulations. As all pens on a site must be treated, there is no opportunity to leave some pens untreated to act as controls, as would be possible in a clinical trial.
Sampling was performed before and after treatment at weekly or biweekly intervals. Pens were sampled from ten days before treatment until approximately 50 days after treatment. All pens (n = 33) except one at site D (where fish were slaughtered) were sampled before treatment and between 3 and 16 days after treatment; only two sites and around half of the pens were sampled after day 23. The total numbers of fish sampled were 455 before and 412 after treatment. Sample size per pen at each sampling point ranged from ten to 24 fish, with most groups comprising ten fish (62%). Each sample included a count of mobile Caligus elongatus and counts of L. salmonis for three life cycle stages: chalimus, PAAM (pre-adult and adult males) and adult females. In this communication, we did not analyse data associated with C. elongatus, as infestation with this species was only detected at one site before treatment and at very low levels. Counts of sea lice are routinely conducted by farmers and the results reported to authorities as mandated by Norwegian regulations. Fish are sampled from each pen with a dip pen net, anesthetized for examination and returned to the pen after recovering from anaesthesia.
Statistical analyses
Summary statistics before and after treatment
Arithmetic mean and median abundance (number of lice per fish) and prevalence (number of fish with lice) values were calculated for each stage group of L. salmonis to characterize the level of infestation at each site. For the analysis, we aggregated the count values for all pens within a site. Estimates and 95% confidence intervals were calculated using the adjusted bootstrap percentile (bias-corrected and accelerated, BCa) method in the boot package in R [28, 29]. We generated 1,000 bootstrap samples from the original data set. ANOVA was used to identify differences in the mean counts of L. salmonis between sites after treatment. When the effect of site was significant, we used Tukey’s Honestly Significant Difference (HSD) test to determine which sites differed from each other. All statistical analyses were carried out in R version 2.15.1.
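The BCa interval described above can be illustrated with a minimal sketch. The study itself used the boot package in R; the Python function below is only an illustrative re-implementation of the general BCa idea using the standard library. The function name `bca_interval`, the seed and the lice counts are hypothetical.

```python
import random
from statistics import NormalDist, mean

def bca_interval(data, stat=mean, n_boot=1000, alpha=0.05, seed=1):
    """Approximate BCa bootstrap confidence interval for stat(data)."""
    rng = random.Random(seed)
    n = len(data)
    theta_hat = stat(data)
    # Bootstrap replicates: resample fish with replacement and recompute the statistic.
    reps = sorted(stat(rng.choices(data, k=n)) for _ in range(n_boot))
    # Bias correction z0 from the share of replicates below the observed estimate
    # (clamped away from 0 and 1 so the normal quantile is defined).
    below = sum(r < theta_hat for r in reps)
    prop = min(max(below / n_boot, 1 / (n_boot + 1)), n_boot / (n_boot + 1))
    nd = NormalDist()
    z0 = nd.inv_cdf(prop)
    # Acceleration from leave-one-out (jackknife) estimates.
    jack = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    jbar = mean(jack)
    num = sum((jbar - j) ** 3 for j in jack)
    den = 6 * sum((jbar - j) ** 2 for j in jack) ** 1.5
    a = num / den if den else 0.0

    def adjusted_quantile(level):
        # Shift the nominal percentile using z0 and a, then read it off the replicates.
        z = nd.inv_cdf(level)
        p = nd.cdf(z0 + (z0 + z) / (1 - a * (z0 + z)))
        idx = min(max(int(p * n_boot), 0), n_boot - 1)
        return reps[idx]

    return adjusted_quantile(alpha / 2), adjusted_quantile(1 - alpha / 2)

# Hypothetical lice counts per fish at one site (not data from the study).
counts = [0, 0, 1, 0, 2, 5, 0, 1, 3, 0, 0, 7, 1, 0, 2]
low, high = bca_interval(counts)
print(f"mean = {mean(counts):.2f}, 95% BCa CI = ({low:.2f}, {high:.2f})")
```

Passing `statistics.median` as `stat` would give the corresponding interval for the median, although the jackknife-based acceleration is less well behaved for medians of small samples.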
Calculation of treatment effectiveness
Traditionally, treatment effectiveness is calculated as the percentage reduction in PAAM or in all mobile stages (pre-adult and adult sea lice). We calculated treatment effectiveness based on counts taken approximately two weeks after treatment (between days 10 and 20 after treatment). Pre- and post-treatment comparisons are widely used to detect treatment effects and reflect efficacy. In sea lice trials in particular, control pens are often unavailable, and any observed lice reduction following treatment can be safely attributed to the treatment, whose effectiveness has already been widely demonstrated in order to obtain marketing authorisation. We calculated 95% confidence intervals using the quasi-Poisson method, as this has previously been shown to be effective for this purpose [31, 32].
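As a worked illustration of the percentage reduction, the sketch below assumes the conventional form 100 × (1 − mean post-treatment count / mean pre-treatment count). The function name and the counts are hypothetical, and the quasi-Poisson confidence interval used in the study is not reproduced here.

```python
def percent_reduction(pre_counts, post_counts):
    """Effectiveness (%) = 100 * (1 - mean post-treatment count / mean pre-treatment count)."""
    pre_mean = sum(pre_counts) / len(pre_counts)
    post_mean = sum(post_counts) / len(post_counts)
    if pre_mean == 0:
        raise ValueError("no lice before treatment; reduction is undefined")
    return 100.0 * (1.0 - post_mean / pre_mean)

# Hypothetical PAAM counts per fish before and about two weeks after treatment.
pre = [4, 6, 5, 3, 7, 5, 6, 4, 5, 5]
post = [1, 0, 0, 1, 0, 1, 0, 0, 1, 1]
print(round(percent_reduction(pre, post), 1))  # prints 90.0
```

Here the mean count falls from 5.0 to 0.5 lice per fish, which corresponds to a 90% reduction.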
The composition of L. salmonis stage groups was studied using Bray-Curtis distances in combination with non-metric multidimensional scaling (NMDS). These procedures are well suited for arthropod community analysis [33, 34] since they avoid assumptions of linear relationships and are less susceptible to bias introduced by large numbers of zero counts in the data. Bray-Curtis distances were calculated on untransformed data with the R package vegan. Guidelines for the interpretation of NMDS plots have been provided by Dufrêne. Briefly, objects that are closer together within the NMDS plot are more similar (i.e. in terms of stage group composition) than those further apart. The stress value is used as a measure of goodness of fit between the original data (matrix of distances) and the ordered position of objects in the two-dimensional space (NMDS configuration). Small stress values indicate a solution with good fit. In particular, stress values below 0.1 indicate a good configuration, while values greater than 0.2 indicate a poor fit. We did not calculate correlations between community dissimilarities and ordination distances, as this can be misleading when using a non-linear method (NMDS). Ordination using principal coordinate analysis produced similar results to those obtained with NMDS (data not shown).
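For readers unfamiliar with the Bray-Curtis measure, the following minimal sketch computes it as the sum of absolute differences divided by the sum of all counts. The study used the vegan package in R; the Python function and the pen-level values below are hypothetical and serve only to illustrate the calculation.

```python
def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two vectors of non-negative counts."""
    total = sum(x) + sum(y)
    if total == 0:
        raise ValueError("both samples are empty; dissimilarity is undefined")
    return sum(abs(a - b) for a, b in zip(x, y)) / total

# Hypothetical mean counts per pen: (chalimus, PAAM, adult females).
pen_1 = [2.0, 1.0, 1.0]
pen_2 = [1.0, 1.0, 2.0]
pen_3 = [0.0, 0.0, 4.0]
pens = [pen_1, pen_2, pen_3]

# Symmetric distance matrix of the kind passed to an ordination or permutation test.
matrix = [[0.0 if i == j else bray_curtis(p, q) for j, q in enumerate(pens)]
          for i, p in enumerate(pens)]
for row in matrix:
    print(row)
```

In this example pens 1 and 2 differ by 0.25, whereas pen 3, which carries only adult females, lies 0.75 from pen 1 and 0.5 from pen 2.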
In addition to ANOVA, we used three non-parametric procedures to statistically examine differences in the composition of L. salmonis between sites in response to treatment. Non-parametric procedures are preferred for data with skewed distributions, such as parasitic infestations. These procedures were the multiple response permutation procedure (MRPP), the analysis of similarities (ANOSIM) and permutational multivariate analysis of variance (Adonis). All of these permutation procedures compared the ranks of distances between groups (farm sites) with the ranks of distances within groups. The site factor was tested in 1,000 permutations of residuals under the null hypothesis. To avoid finding falsely significant results, we performed an inferential statistical procedure similar to Levene’s test. This procedure is based on a permutation-based test of multivariate homogeneity of group dispersions (variance in the sites). Results from inferential testing indicated that the within-group dispersion was not significantly different between sites before and after treatment (data not shown).
The MRPP tests the relationship of entities in multidimensional space by comparing the weighted mean of within-site distances with the within-site means obtained when pens are randomly assigned to sites. A significant p-value (<0.05) indicates that differences detected between sites are greater than would be expected from random assignment to sites. It also provides a measure of the magnitude of differences between group means (A), computed as A = 1 - (δ/mδ), where the observed delta (δ) is the weighted mean within-site distance and the expected delta (mδ) is the mean delta over all possible partitions of the data. For example, when the composition of sea lice is identical in all pens within each site, δ = 0 and A = 1. The value of A becomes smaller as within-site agreement declines towards that expected by chance, at which point A = 0. The advantage of the MRPP statistic is that it is robust to unequal variance, non-normally distributed data and unbalanced designs.
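The relationship A = 1 - (δ/mδ) can be made concrete with a small sketch. The study used R; this Python function is a hypothetical illustration that weights each site by its share of pens and approximates the expected delta with random permutations rather than enumerating all possible partitions. The data are invented so that pens within each site are identical.

```python
import random

def mrpp(dist, groups, n_perm=1000, seed=1):
    """Return (delta, expected delta, A, p-value) for a distance matrix and site labels."""
    rng = random.Random(seed)
    n = len(groups)

    def delta(labels):
        # Weighted mean of within-site mean distances (weight = n_g / N).
        total = 0.0
        for g in sorted(set(labels)):
            idx = [i for i, lab in enumerate(labels) if lab == g]
            pairs = [(i, j) for k, i in enumerate(idx) for j in idx[k + 1:]]
            if pairs:  # sites with a single pen contribute no within-site distance
                mean_d = sum(dist[i][j] for i, j in pairs) / len(pairs)
                total += len(idx) / n * mean_d
        return total

    observed = delta(groups)
    labels = list(groups)
    permuted = []
    for _ in range(n_perm):
        rng.shuffle(labels)  # random reassignment of pens to sites
        permuted.append(delta(labels))
    expected = sum(permuted) / n_perm
    a_stat = 1.0 - observed / expected
    p_value = (1 + sum(d <= observed for d in permuted)) / (n_perm + 1)
    return observed, expected, a_stat, p_value

# Hypothetical example: one value per pen, identical within each site.
values = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
sites = ["X", "X", "X", "Y", "Y", "Y"]
dist = [[abs(a - b) for b in values] for a in values]
observed, expected, a_stat, p_value = mrpp(dist, sites)
print(observed, a_stat)  # within-site distances are all zero, so delta = 0 and A = 1
```

With only three pens per site, few distinct partitions exist, so the permutation p-value cannot become very small even when A = 1.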
Analysis of similarities (ANOSIM) is similar in concept to MRPP but uses a different test statistic. The result is summarized in the R statistic, which indicates the magnitude of difference between group means. The R statistic ranges from 0 (no separation) to 1 (high separation). Values of R >0.75 indicate high separation, R >0.5 indicates groups that are separated but overlapping, and R <0.25 indicates groups that are barely separable.
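A minimal sketch of the R statistic follows, assuming its usual form: the difference between the mean rank of between-site distances and the mean rank of within-site distances, divided by n(n - 1)/4, where n is the number of pens. The function name and the data are hypothetical, and the permutation test for significance is omitted.

```python
def anosim_r(dist, groups):
    """ANOSIM R = (mean between-group rank - mean within-group rank) / (n(n-1)/4)."""
    n = len(groups)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    values = sorted(dist[i][j] for i, j in pairs)
    # Assign average ranks to tied distances (ranks start at 1).
    rank_of = {}
    pos = 0
    while pos < len(values):
        end = pos
        while end + 1 < len(values) and values[end + 1] == values[pos]:
            end += 1
        rank_of[values[pos]] = (pos + end) / 2 + 1
        pos = end + 1
    within = [rank_of[dist[i][j]] for i, j in pairs if groups[i] == groups[j]]
    between = [rank_of[dist[i][j]] for i, j in pairs if groups[i] != groups[j]]
    r_w = sum(within) / len(within)
    r_b = sum(between) / len(between)
    return (r_b - r_w) / (n * (n - 1) / 4)

# Hypothetical example: two well separated sites with two pens each.
points = [0.0, 1.0, 10.0, 12.0]
sites = ["A", "A", "B", "B"]
dist = [[abs(a - b) for b in points] for a in points]
print(anosim_r(dist, sites))  # prints 1.0
```

In this example every within-site distance is smaller than every between-site distance, so the mean ranks are 1.5 and 4.5 and R reaches its maximum of 1.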
The permutational multivariate analysis of variance (PERMANOVA; Adonis) is a permutation-based version of the multivariate analysis of variance. Like the other permutation tests, it uses distances between sites to partition variance. Significance testing is carried out using F-tests derived from permutations of the raw data.
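The partitioning of variance from a distance matrix can be sketched as follows. This is a hypothetical, simplified illustration for a single factor (site), not the routine used in the study: sums of squares are obtained from squared pairwise distances, and the pseudo-F statistic is compared with values obtained after shuffling the site labels.

```python
import random

def pseudo_f(dist, groups):
    """Pseudo-F statistic for one grouping factor, computed from a distance matrix."""
    n = len(groups)
    labels = sorted(set(groups))
    # Total sum of squares: squared distances over all pairs, divided by N.
    ss_total = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    # Within-group sum of squares: the same quantity computed inside each group.
    ss_within = 0.0
    for g in labels:
        idx = [i for i, lab in enumerate(groups) if lab == g]
        ss_within += sum(dist[i][j] ** 2
                         for k, i in enumerate(idx) for j in idx[k + 1:]) / len(idx)
    ss_between = ss_total - ss_within
    return (ss_between / (len(labels) - 1)) / (ss_within / (n - len(labels)))

def permanova(dist, groups, n_perm=1000, seed=1):
    """Return the observed pseudo-F and a permutation p-value."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, groups)
    labels = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # reassign pens to sites at random
        if pseudo_f(dist, labels) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical one-dimensional example with two sites of three pens each.
points = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
sites = ["A", "A", "A", "B", "B", "B"]
dist = [[abs(a - b) for b in points] for a in points]
f_obs, p_value = permanova(dist, sites)
print(round(f_obs, 6))  # prints 54.0
```

With Euclidean distances on a single variable, as in this example, the pseudo-F equals the classical one-way ANOVA F (here 54); with Bray-Curtis distances the same arithmetic applies, but significance relies on the permutations.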
Indicator species analysis
The concept of indicator species has previously been used in the field of marine ecology [42–44]. For this purpose, Dufrêne and Legendre proposed a flexible and asymmetrical approach to identify indicator species. This method combines the relative abundance (specificity) with the frequency (fidelity) of species (or stage groups) at a site and finds the stage groups that are significantly concentrated at a site or group of sites. Stage groups with significant indicator species values provide some measure of the characteristics of a site and can be used to monitor changes.
For each stage group i in each pen of site j, the relative abundance $RA_{ij}$ and the relative frequency $RF_{ij}$ are computed as

\[ RA_{ij} = \frac{A_{ij}}{A_{i\cdot}}, \qquad RF_{ij} = \frac{S_{ij}}{S_{j}} \]

where $A_{ij}$ is the mean abundance of stage group i across pens of the site j, $A_{i\cdot}$ is the sum of mean abundances of stage group i over all sites, $S_{ij}$ is the number of pens in site j where stage group i is present, and $S_{j}$ is the total number of pens in that site. Combining the relative abundance and frequency gives the indicator species value of stage group i at site j:

\[ IndVal_{ij} = RA_{ij} \times RF_{ij} \times 100 \]
A stage group may be considered characteristic of a site if it has an IndVal value greater than 25% and a p-value < 0.1, as discussed in . The significance threshold was relaxed to reduce the risk of Type II error, which is common when a permutation test has low power because of a small number of replicates (pens). The p-value from a Monte Carlo test (1,000 permutations) evaluates the statistical significance of the IndVal.
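The indicator value calculation can be illustrated with a short sketch. It assumes IndVal = 100 × RA × RF, where RA is the site's mean abundance divided by the sum of mean abundances over all sites and RF is the share of pens at the site in which the stage group is present. The factor of 100 is an assumption made so that values read as percentages, and the function name and counts are hypothetical. The Monte Carlo test of significance is not shown.

```python
def indval(abundance_by_site):
    """Map {site: [pen-level mean counts of one stage group]} to {site: IndVal in %}."""
    site_means = {s: sum(p) / len(p) for s, p in abundance_by_site.items()}
    total = sum(site_means.values())
    result = {}
    for site, pens in abundance_by_site.items():
        ra = site_means[site] / total if total else 0.0   # relative abundance (specificity)
        rf = sum(1 for c in pens if c > 0) / len(pens)     # relative frequency (fidelity)
        result[site] = 100.0 * ra * rf
    return result

# Hypothetical adult female counts per pen at three sites (not data from the study).
females = {
    "A": [2.0, 4.0, 0.0, 2.0],
    "B": [0.0, 0.0, 1.0, 1.0],
    "C": [0.0, 0.0, 0.0, 0.0],
}
values = indval(females)
for site in sorted(values):
    print(site, round(values[site], 1))
```

In this example site A has RA = 0.8 and RF = 0.75, giving an IndVal of 60%, which would exceed the 25% threshold; site B reaches only 10% and site C 0%.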