Impact of alignment algorithm on the estimation of pairwise genetic similarity of porcine reproductive and respiratory syndrome virus (PRRSV)

Table 3 Differences in results for the analytical criteria between datasets excluding and including recombinants, for the different algorithms^a

Criterion	Algorithm
	Clustal W	MAFFT	T-Coffee	Muscle	Clustal Omega
1. Difference in similarity: average pairwise genetic similarity (%) of aligned sequences within the dataset (mean)
Replicate 1	0.01	−0.05	0.01	0.01	0.01
Replicate 2	0.02	0.01	0.15	0.01	0.01
2. Difference in proportion of pairwise comparisons of sequences having ≥ 97.5% genetic similarity (%)
Replicate 1	0.07	0.07	0.07	0.07	0.07
Replicate 2	0.05	0.05	0.27	0.05	0.04
3. Difference in length of aligned dataset: number of sites per sequence in the aligned dataset
Replicate 1	0	−3	0	0	0
Replicate 2	0	0	−1	0	0
4. Difference in average sum of pairs (SP) score: proportion of shared homologies with reference alignment (%)^b
Clustal W as reference	–	0.01	0.04	0.01	0.01
MAFFT as reference	0.01	–	0.04	0.01	0.00
T-Coffee as reference	0.01	0.01	–	0.01	0.00
Muscle as reference	0.01	0.01	0.04	–	0.00
Clustal Omega as reference	0.01	0.00	0.04	0.00	–
Average	0.01	0.01	0.04	0.01	0.01
5. Difference in congruent cells ≥ 97.5% similarity: proportion of cells between two pairwise similarity matrices having the same binary value (0: < 97.5%; 1: ≥ 97.5%) for genetic similarity^b
Clustal W as reference	–	−0.01	0.11	0.00	0.00
MAFFT as reference	−0.01	–	0.11	0.01	0.00
T-Coffee as reference	0.11	0.11	–	0.11	0.11
Muscle as reference	0.00	0.01	0.11	–	0.00
Clustal Omega as reference	0.00	0.00	0.11	0.00	–
Average	0.02	0.02	0.11	0.03	0.03

^aThe gap opening penalties used were 30 for Clustal W, 7 for MAFFT, −200 for T-Coffee, −1000 for Muscle, and the default for Clustal Omega. The five criteria presented in Table 1 for the two replicates including recombinants (Replicates 1 and 2, n = 1191) were re-evaluated for each replicate without recombinants (Replicates 1 and 2, n = 1183). Differences in results were then computed (i.e. the result obtained with recombinants was subtracted from the result obtained without recombinants).
^bAverage of 2 replicates of 1183 sequences
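
As a rough illustration of how criteria 1, 2 and 5, and the differences reported in this table, could be computed, the following Python sketch works on toy aligned sequences. It is not the authors' pipeline: the function names, the toy sequences, and the treatment of gaps (columns in which both sequences carry a gap are skipped) are assumptions made here for illustration only. The sign convention follows footnote a (result without recombinants minus result with recombinants).

```python
from itertools import combinations

THRESHOLD = 97.5  # similarity cut-off (%) used in criteria 2 and 5


def pairwise_similarity(a, b):
    """Percent identity between two aligned sequences of equal length.

    Assumption: columns where both sequences have a gap are ignored;
    the source does not state how gaps were handled.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for x, y in zip(a, b):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def similarity_values(alignment):
    """Similarity (%) for every unordered pair, in a fixed pair order."""
    return [pairwise_similarity(alignment[i], alignment[j])
            for i, j in combinations(range(len(alignment)), 2)]


def mean_similarity(values):
    """Criterion 1: average pairwise genetic similarity (%)."""
    return sum(values) / len(values)


def proportion_above(values, threshold=THRESHOLD):
    """Criterion 2: share (%) of pairs with similarity >= threshold."""
    return 100.0 * sum(v >= threshold for v in values) / len(values)


def congruent_cells(values_a, values_b, threshold=THRESHOLD):
    """Criterion 5: share (%) of pairs given the same binary value
    (below / at-or-above threshold) by two alignments of the same sequences."""
    if len(values_a) != len(values_b):
        raise ValueError("both alignments must cover the same pairs")
    same = sum((a >= threshold) == (b >= threshold)
               for a, b in zip(values_a, values_b))
    return 100.0 * same / len(values_a)


def difference(without_recombinants, with_recombinants):
    """Difference as defined in footnote a: without minus with."""
    return without_recombinants - with_recombinants


# Hypothetical toy data: three sequences plus one stand-in "recombinant".
base = "ACGT" * 10
seq_one_mismatch = base[:-1] + "A"
seq_three_mismatches = "TTTT" + base[4:]
toy_recombinant = "GGGG" * 10

without = [base, seq_one_mismatch, seq_three_mismatches]
with_rec = without + [toy_recombinant]

without_vals = similarity_values(without)
with_vals = similarity_values(with_rec)

print("Criterion 1 difference:",
      difference(mean_similarity(without_vals), mean_similarity(with_vals)))
print("Criterion 2 difference:",
      difference(proportion_above(without_vals), proportion_above(with_vals)))
```

For criterion 5, `congruent_cells` would be called on the similarity values obtained from two different algorithms' alignments of the same sequences, once with and once without recombinants, and the two results passed to `difference`.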

ISSN: 1746-6148