Table 3 Differences in results for the analytical criteria when recombinants are excluded versus included, for the different algorithmsᵃ

From: Impact of alignment algorithm on the estimation of pairwise genetic similarity of porcine reproductive and respiratory syndrome virus (PRRSV)

| Criterion | Clustal W | MAFFT | T-Coffee | Muscle | Clustal Omega |
|---|---|---|---|---|---|
| 1. Difference in similarity: average pairwise genetic similarity (%) of aligned sequences within the dataset (mean) | | | | | |
| Replicate 1 | 0.01 | −0.05 | 0.01 | 0.01 | 0.01 |
| Replicate 2 | 0.02 | 0.01 | 0.15 | 0.01 | 0.01 |
| 2. Difference in proportion of pairwise comparisons of sequences having ≥ 97.5% genetic similarity (%) | | | | | |
| Replicate 1 | 0.07 | 0.07 | 0.07 | 0.07 | 0.07 |
| Replicate 2 | 0.05 | 0.05 | 0.27 | 0.05 | 0.04 |
| 3. Difference in length of aligned dataset: number of sites per sequence in the aligned dataset | | | | | |
| Replicate 1 | 0 | −3 | 0 | 0 | 0 |
| Replicate 2 | 0 | 0 | −1 | 0 | 0 |
| 4. Difference in average sum of pairs (SP) score: proportion of shared homologies with reference alignment (%)ᵇ | | | | | |
| Clustal W as reference | | 0.01 | 0.04 | 0.01 | 0.01 |
| MAFFT as reference | 0.01 | | 0.04 | 0.01 | 0.00 |
| T-Coffee as reference | 0.01 | 0.01 | | 0.01 | 0.00 |
| Muscle as reference | 0.01 | 0.01 | 0.04 | | 0.00 |
| Clustal Omega as reference | 0.01 | 0.00 | 0.04 | 0.00 | |
| Average | 0.01 | 0.01 | 0.04 | 0.01 | 0.01 |
| 5. Difference in congruent cells ≥ 97.5% similarity: proportion of cells between two pairwise similarity matrices having the same binary value (0: < 97.5%; 1: ≥ 97.5%) for genetic similarityᵇ | | | | | |
| Clustal W as reference | | −0.01 | 0.11 | 0.00 | 0.00 |
| MAFFT as reference | −0.01 | | 0.11 | 0.01 | 0.00 |
| T-Coffee as reference | 0.11 | 0.11 | | 0.11 | 0.11 |
| Muscle as reference | 0.00 | 0.01 | 0.11 | | 0.00 |
| Clustal Omega as reference | 0.00 | 0.00 | 0.11 | 0.00 | |
| Average | 0.02 | 0.02 | 0.11 | 0.03 | 0.03 |

  1. ᵃ The gap opening penalties used were 30 for Clustal W, 7 for MAFFT, −200 for T-Coffee, −1000 for Muscle, and the default for Clustal Omega. The five criteria presented in Table 1 for the two replicates including recombinants (Replicates 1 and 2, n = 1191) were re-evaluated for each replicate without recombinants (Replicates 1 and 2, n = 1183). Then, differences in results were computed (i.e. the result obtained with recombinants was subtracted from the result obtained without recombinants).
  2. ᵇ Average of 2 replicates of 1183 sequences
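To make the definitions of criteria 1, 2 and 5 and the sign convention of the differences concrete, here is a minimal Python sketch. It is not the authors' implementation: the function names, the toy sequences and the choice to count every alignment column (gaps included) when computing percent identity are assumptions made only for illustration.

```python
from itertools import combinations

THRESHOLD = 97.5  # percent similarity cutoff used in criteria 2 and 5


def pairwise_similarity(a, b):
    """Percent identity between two aligned sequences of equal length.

    Assumption: every column counts, gaps included; the table does not
    state how gaps were handled.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def similarity_matrix(alignment):
    """Map each unordered pair of sequence names to its percent similarity."""
    return {
        (p, q): pairwise_similarity(alignment[p], alignment[q])
        for p, q in combinations(sorted(alignment), 2)
    }


def mean_similarity(sim):
    """Criterion 1: mean pairwise similarity (%)."""
    return sum(sim.values()) / len(sim)


def proportion_similar(sim, threshold=THRESHOLD):
    """Criterion 2: percent of pairs at or above the threshold."""
    return 100.0 * sum(v >= threshold for v in sim.values()) / len(sim)


def congruent_cells(sim_a, sim_b, threshold=THRESHOLD):
    """Criterion 5: percent of pairs on the same side of the threshold
    in two similarity matrices built over the same sequences."""
    same = sum((sim_a[k] >= threshold) == (sim_b[k] >= threshold) for k in sim_a)
    return 100.0 * same / len(sim_a)


def difference(without_recombinants, with_recombinants):
    """Sign convention of the table: result without minus result with."""
    return without_recombinants - with_recombinants


# Toy alignment (hypothetical data): three similar sequences plus one
# divergent sequence standing in for a recombinant.
with_rec = {
    "s1": "A" * 40,
    "s2": "A" * 39 + "C",
    "s3": "A" * 38 + "CC",
    "rec": "C" * 20 + "A" * 20,
}
without_rec = {name: seq for name, seq in with_rec.items() if name != "rec"}

sim_with = similarity_matrix(with_rec)
sim_without = similarity_matrix(without_rec)

diff_mean = difference(mean_similarity(sim_without), mean_similarity(sim_with))
diff_prop = difference(proportion_similar(sim_without), proportion_similar(sim_with))
```

In this toy case both differences are positive, because removing the divergent sequence raises the mean similarity and the share of pairs at or above 97.5%. The values in the table are much smaller because only a handful of the 1191 sequences were recombinants.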