The wild boar (Sus scrofa) Lymphocyte function-associated antigen-1 (CD11a/CD18) receptor: cDNA sequencing, structure analysis and comparison with homologues

Background The most predominant beta2-integrin, lymphocyte function-associated antigen-1 (LFA-1, CD11a/CD18, alphaLbeta2), expressed on all leukocytes, is essential for many adhesive functions of the immune system. Interestingly, RTX toxin-producing bacteria specifically target this leukocyte beta2-integrin, which exacerbates lesions and disease development. Results This study reports the sequencing of the wild boar beta2-integrin CD11a and CD18 cDNAs. Predicted CD11a and CD18 subunits share all the main structural characteristics of their mammalian homologues, with greater interspecies conservation for CD18 than for CD11a. Besides these strong overall similarities, wild boar and domestic pig LFA-1 differ by 2 (CD18) and 1 or 3 (CD11a) substitutions, of which one is located in the crucial I-domain (CD11a, E168D). Conclusion As most wild boars are seropositive to the RTX toxin-producing bacterium Actinobacillus pleuropneumoniae and because they have sustained continuous natural selection, future studies addressing the functional impact of these polymorphisms could bring interesting new information on the physiopathology of Actinobacillus pleuropneumoniae-associated pneumonia in domestic pigs.


Background
Cell adhesion receptors play crucial roles in multicellular organisms by mediating direct cell-cell or cell-extracellular matrix protein interactions. These molecular interactions condition the structural integrity of cells and tissues and contribute to the signal transduction involved in cellular dynamics [1]. Cell adhesion receptors are subdivided into several membrane-associated protein families, including integrins, cadherins, immunoglobulin superfamily cell adhesion molecules, selectins, and syndecans. Integrins are a family of cell surface adhesion and signalling glycoproteins made up of non-covalently associated 120-180 kDa α and 90-110 kDa β subunits [2]. There are 19 distinct α subunits and 8 β subunits that combine to form 25 different heterodimeric receptors [1]. Each subunit possesses (i) a large extracellular N-terminal domain associating with that of the companion subunit to form the integrin headpiece, which contains the ligand binding site, (ii) a single transmembrane stretch, and (iii) a short cytoplasmic C-terminal tail which mediates interactions with cytoskeleton and signalling proteins [1,3,4].
The β2-integrin LFA-1 is essential for the following functions of the immune system [15-20]: (i) interaction between lymphocytes, (ii) interaction between T-cells and antigen presenting cells, (iii) adhesion of naïve lymphocytes to post-capillary high endothelial venules of secondary lymphoid tissues, (iv) adhesion of leukocytes to activated endothelium at sites of inflammation for extravasation, (v) control of cell differentiation and proliferation, and (vi) antibody-dependent killing by natural killer cells and granulocytes. Leukocyte LFA-1-mediated adhesion is engaged via the binding of LFA-1, in an activated conformational state, to membrane proteins, the so-called intercellular adhesion molecules (ICAM)-1 to -5 and the junctional adhesion molecule (JAM)-A [21].
Interestingly, several pathogens target the leukocyte β2-integrins, which leads to lesions and disease development [22]. Several studies have highlighted the central role of LFA-1 in the pathogenesis of diseases caused by repeats-in-toxin (RTX)-producing bacteria. The virulence of Aggregatibacter (Actinobacillus) actinomycetemcomitans (localised aggressive periodontitis in humans), Mannheimia haemolytica (pneumonia in cattle), and pathogenic strains of Escherichia coli (extraintestinal infections) has been associated with a ligand/receptor interaction between their respective RTX toxins (LtxA, LktA, and HlyA) and the CD11a/CD18 receptor, resulting in leukocyte alterations [23-26]. This interaction triggers synthesis and release of a wide array of cytokines and chemoattractants by the leukocytes, which exacerbates inflammation and ultimately results in much greater leukolysis, worsening the lesions [25,27]. Actinobacillus pleuropneumoniae, a causative agent of pleuropneumonia in domestic pigs (Sus scrofa domestica), responsible for economic losses and antibiotic usage in the pork industry, also produces RTX toxins (ApxIA, -IIA, -IIIA, and -IVA) [28,29]. We therefore hypothesize that the pathogenesis of this disease similarly relies on an interaction with the Sus scrofa domestica LFA-1, whose CD11a (αL) and CD18 (β2) subunits have been well characterised [30,31]. On the basis of the report that approximately 50% of wild boars in their natural environment are serologically positive for Actinobacillus pleuropneumoniae [32], and because these wild pigs sustain losses due to natural selection pressure, we hypothesize that some LFA-1 molecular peculiarities conferring resistance to wild boars could have been selected.
In this context, the purpose of this study was to report the sequence and analysis of the cDNAs encoding wild boar LFA-1 (WbCD11a/WbCD18) and to point out the wild boar LFA-1 specificities that might confer resistance to Actinobacillus pleuropneumoniae-associated pneumonia.

Characterization of WbCD11a-encoding cDNA and deduced amino-acid sequence
The WbCD11a cDNA sequence contains an ORF of 3519 bp [GenBank:EF585976] that codes for 1172 aa (Fig. 1). Starting from the N-terminal end, the 1172 aa mature WbCD11a contains a 23-residue putative leader peptide (M1-S23), an extracellular domain of 1064 residues (Y24-D1086), a single hydrophobic transmembrane region of 24 residues (M1087-Y1110) and a short cytoplasmic tail of 62 residues (K1111-A1172) (Fig. 1). Six putative N-linked glycosylation sites (Asn-Xaa-Ser/Thr) are found in the extracellular domain (Fig. 1). The WbCD11a possesses 22 cysteine residues, one of which is located in the cytoplasmic tail (Fig. 1). A subset of integrin α chains (α1, α2, α10, α11, αD, αE, αL, αM and αX), including CD11a, contains an I-domain (for Inserted domain, also called αL I-domain or αL A-domain) that is homologous to the family of von Willebrand Factor (vWF) A-type domains and to cartilage matrix protein [33,34]. The I-domain has been associated with ligand binding. Its three-dimensional structure consists of a five-stranded parallel β-sheet core surrounded on both faces by seven α-helices (Fig. 1). A short antiparallel strand occurs on one edge of this sheet [35]. The I-domain (I149-D331) contains a metal ion-dependent adhesion site (MIDAS) (residues D160-S164, T229, D262) [35,36] (Fig. 1). Crystallisation of the I-domain has demonstrated that a "closed" (low affinity) and an "open" (high affinity) form both exist, and that the major conformational changes during the transition from the closed to the open state include a rearrangement of the cation-coordinating residues in the MIDAS site, accompanied by a small inward movement of the α1 helix and a large downward shift of the mobile C-terminal α7 helix [37]. The extracellular domain of WbCD11a contains seven internal repeats (FG-GAP) (G40-T89, S90-E147, S348-R398, A399-Q453, G455-D511, G513-F573, I576-P628) that surround the I-domain (Fig. 1) [38,39].
The degree of identity is highest among the three COOH-terminal repeats (18-31%), and their central regions (D466-E474, D528-D536 and D588-D596) are similar to the EF-hand divalent cation-binding motifs (DCBM) of troponin C, parvalbumin and galactose binding protein [38] (Fig. 1). All the cysteine residues and all but one of the N-glycosylation sites are found outside the I-region and the divalent cation-binding motifs (Fig. 1), consistent with the hypothesis that these regions may undergo conformational changes important in ligand binding [38,40]. Between FG-GAP 7 and the transmembrane domain lie the thigh domain (M629-K766), the genu (N767-C774), and the CALF domains (E775-L920, N921-D1086) [41]. The cytoplasmic portion of WbCD11a contains four potential phosphorylation sites and a conserved "G1113FFKR" basic sequence near the transmembrane region (Fig. 1). Integrins become constitutively active when this sequence is deleted. The "G1113FFKR" motif thus normally holds the integrins in an inactive state [11,42].
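Two of the sequence features described above are simple enough to check by hand or by script: the relation between ORF length and protein length, and the location of Asn-Xaa-Ser/Thr sequons. The following is a minimal sketch, not the procedure used in this study. It assumes that the reported ORF length counts the stop codon, and the function names and the optional `exclude_proline` switch (a commonly applied refinement that is not stated in the text) are illustrative assumptions.

```python
import re


def orf_to_protein_length(orf_bp: int) -> int:
    """Residues encoded by an ORF whose length includes the stop codon (assumption)."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_bp // 3 - 1  # the last codon is the stop codon


def find_sequons(protein: str, exclude_proline: bool = False) -> list[int]:
    """Return 1-based positions of Asn residues in Asn-Xaa-Ser/Thr motifs.

    exclude_proline=True additionally rejects Pro at the Xaa position;
    this refinement is an assumption, not part of the motif as written above.
    """
    middle = "[^P]" if exclude_proline else "."
    # A lookahead is used so that overlapping motifs are all reported.
    pattern = re.compile(rf"N(?={middle}[ST])")
    return [m.start() + 1 for m in pattern.finditer(protein.upper())]


if __name__ == "__main__":
    print(orf_to_protein_length(3519))      # ORF length reported for WbCD11a
    print(find_sequons("AANVSSPIFKANSSV"))  # toy peptide, not the full protein
```

Applied to a full-length deduced protein, `find_sequons` only lists candidate sites; whether a given site is actually glycosylated cannot be inferred from the sequence alone.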
Besides the complex mechanisms of affinity/avidity regulation of the integrins, the existence of several isoforms arising from alternative splicing complicates the biological understanding of these glycoproteins [43]. Previously, we have characterised two different forms of PoCD11a due to the presence of a supplementary "cag" codon that codes for a glutamine (Q) in position 744 [GenBank:DQ013284, GenBank:DQ013285] [31]. The addition of a Gln at the same position was also observed in the human (Q746) [GenBank:NM_002209, GenBank:AY892236], simian (Q746) [44], ovine (Q743) [45] and caprine (Q743) [GenBank:AY773018, GenBank:AY773019] CD11a cDNAs (Fig. 2). This addition, located in the thigh domain of the extracellular part of CD11a, just above the genu, increases the length of an α-helix in the PoCD11a according to the GOR IV bioinformatic program. Until now, it was not clear whether this addition represented two alleles or was generated by alternative splicing. We have recently cloned and sequenced a third PoCD11a form characterised by an insertion [GenBank:DQ474234] which is predicted to lengthen the α-helix further. The nucleotide sequence of this insertion corresponds to that of the 3' end of the adjacent bovine and human intron 18 (79% and 70% identity, respectively), suggesting that the insertion of the glutamine or of the 27 amino acid-long stretch in position 744 of the thigh domain comes from alternative splicing rather than from different alleles. Although these two insertions have not yet been observed in WbCD11a, the between-species conservation of this potential alternative splicing site leads us to hypothesize that it may have biological importance for the mature CD11a, for example in regulating ligand binding and signalling activity.

Figure 1. The nucleotide and deduced amino acid sequences of wild boar CD11a cDNA.

WbCD11a comparison among species
Overall, the general organization of wild boar (Sus scrofa), porcine (Sus scrofa domestica) [31], bovine (Bos taurus) [46], ovine (Ovis aries) [45], caprine (Capra hircus) [47], human [38], simian (Pan troglodytes) [44], canine (Canis familiaris) [GenBank:XM_547024], rat (Rattus norvegicus) [GenBank:NP_001029170], and murine (Mus musculus) [48] CD11a proteins is quite similar (Fig. 2). Comparison between the mature WbCD11a sequence and its porcine, bovine, ovine, caprine, human, simian, canine, rat and murine counterparts shows, respectively, 99%, 77%, 77%, 77%, 76%, 76%, 76%, 70% and 69% overall identity, and 99%, 87%, 86%, 86%, 86%, 86%, 85%, 81%, and 80% similarity (BLOSUM62 table) (Table 1). The highest identity is found for the "G1113FFKR" motif, the genu, the MIDAS motif and the transmembrane region, and the lowest for the cytoplasmic tail and the putative signal peptide (Table 1). Although DCBM3 presents a weak identity, its similarity score is high. The "G1113FFKR" sequence is highly conserved, which is consistent with the stabilizing role of this motif for the alpha/beta complex, possibly because of its direct involvement in heterodimer formation [42]. The genu seems to play a key role in the activation of LFA-1 through the deployment of the receptor [41], and its strong conservation is therefore not surprising. The high conservation of the MIDAS and the putative cation binding motifs is consistent with an involvement of these regions in the functional activity of the LFA-1 α subunit, as suggested by the requirement of Mg2+ and Ca2+ for CD11a/CD18-dependent cellular interactions [40] or [49,50]. The transmembrane region also shows a high degree of conservation, probably due to shared physicochemical and functional constraints. Indeed, residues lying in the membrane first have to possess a hydrophobic character to ensure liposolubility, which is confirmed by the presence of many leucine residues (Fig. 2).
Secondly, bidirectional integrin signalling (inside-out and outside-in) is accomplished by transmission of information across the plasma membrane [51]. By contrast, the low conservation of the COOH-terminal part of the cytoplasmic tail suggests that it is not required to guarantee adequate functioning of LFA-1. This is in agreement with the observation that truncation of the LFA-1 α subunit cytoplasmic domain has no effect on binding to ICAM-1, whereas binding is markedly diminished by β subunit cytoplasmic domain truncation [52].
The "I149KGN" motif, known to participate in the binding to ICAM-3 [53], shows a high degree of conservation (Fig. 2). The amino acid P215, which participates in the binding to ICAM-1 [36], is highly conserved too (Fig. 2). Residue E333, located in the linker following the I-domain, which is critical for communication with the β2 I-like domain, rolling, integrin extension and activation by Mn2+ [16], is, as expected, strictly conserved. The K1120 residue, critical for Rap1-dependent LFA-1 activation and affinity up-regulation [5], is also strongly conserved (Fig. 2). Every cysteine residue in the mature WbCD11a is present at the same location in bovine, ovine, human, simian, and murine CD11a, which is consistent with a role in maintaining the global structure of the protein. Finally, of the six potential Asn-glycosylation sites in WbCD11a, those present at amino acids 186 and 724 are strictly conserved (Fig. 2). In addition, although WbCD11a sequences were obtained from only four wild boars, one of them was heterozygous. Both alleles differed from those found in pigs by a G736A substitution. One allele displayed 2 additional substitutions compared to pigs: E168D (in the I-domain) and D621E (in FG-GAP7, Fig. 1). According to the BLOSUM62 table, these substitutions are theoretically predicted to have a weak impact on the general structure of CD11a.
However, these two wild boar-specific CD11a isoforms might display an altered/improved function compared to those described among domestic pigs [45].

Figure 3. The nucleotide and deduced amino acid sequences of wild boar CD18 cDNA.
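Identity and similarity percentages such as those reported for the CD11a comparison can be computed from a pairwise alignment. The sketch below is only one possible convention, not necessarily the one used to build the tables: it assumes that columns containing a gap are excluded from the denominator, and that two different residues count as "similar" when a caller-supplied scoring function (for example a BLOSUM62 lookup) returns a positive value. The function names and the toy scoring function are hypothetical.

```python
from typing import Callable


def _gap_free_columns(a: str, b: str) -> list[tuple[str, str]]:
    """Pair up aligned residues, skipping columns where either sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    cols = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not cols:
        raise ValueError("no gap-free columns to compare")
    return cols


def percent_identity(a: str, b: str) -> float:
    """Percentage of gap-free columns holding identical residues."""
    cols = _gap_free_columns(a, b)
    return 100.0 * sum(x == y for x, y in cols) / len(cols)


def percent_similarity(a: str, b: str, score: Callable[[str, str], int]) -> float:
    """Percentage of gap-free columns that are identical or score above zero."""
    cols = _gap_free_columns(a, b)
    return 100.0 * sum(x == y or score(x, y) > 0 for x, y in cols) / len(cols)


if __name__ == "__main__":
    # Toy scoring function (assumption): only the D/E pair is treated as similar.
    def toy_score(x: str, y: str) -> int:
        return 2 if {x, y} == {"D", "E"} else -1

    print(percent_identity("ACDE-F", "ACEEGF"))
    print(percent_similarity("ACDE-F", "ACEEGF", toy_score))
```

Other conventions (for instance dividing by the full alignment length or by the shorter sequence) give slightly different percentages, so values are only comparable when the same convention is used throughout.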

WbCD18 comparison among species
Overall, the general organization of wild boar (Sus scrofa), porcine (Sus scrofa domestica) [30], bovine (Bos taurus) [60], water buffalo (Bubalus bubalis) (GenPept AAW29104), caprine (Capra hircus) [61], ovine (Ovis aries and Ovis canadensis) [62,63], human [64], canine (Canis familiaris) [65], murine (Mus musculus) [66], rat (Rattus norvegicus) [GenBank:NM_001037780], chicken (Gallus gallus) [67], carp and channel catfish (Cyprinus carpio and Ictalurus punctatus) [GenBank:AB031070] [68] CD18 proteins is quite similar (Fig. 4). Sequence comparisons between WbCD18 and its porcine, bovine, water buffalo, caprine, ovine, human, canine, murine, rat, chicken, carp and channel catfish counterparts show, respectively, 99%, 88%, 88%, 88%, 88%, 87%, 83%, 81%, 81%, 80%, 62%, 49% and 48% identity, and 99%, 93%, 93%, 93%, 93%, 93%, 90%, 89%, 88%, 88%, 76%, 64% and 63% similarity (BLOSUM62 table) (Table 2). The MIDAS-like, ADMIDAS and LIMBS motifs, the I-like domain, the EGF-2 domain and the cytoplasmic tail have the highest identity, while the putative signal peptide, the β-tail domain, and the EGF-1 domain show the lowest identity (Table 2). The very high interspecies conservation of the putative MIDAS-like, ADMIDAS and LIMBS motifs, the I-like domain and the cytoplasmic tail is consistent with an involvement of these regions in the functional activities of β2-integrins. Overall, the high evolutionary conservation of the I-like domain supports its importance in β2-integrin functions, which is compatible with the observation that monoclonal antibodies binding epitopes mapped within this region inhibit binding of LFA-1 to ICAMs 1-3 [56]. As the maximum conservation is observed for the CD18 MIDAS-like motif, it is tempting to speculate that it plays a fundamental role in β2-integrin function. In this respect, it has been demonstrated that the C169PNKEKEC sequence conserved among mammalian species (Fig. 4) constitutively activates LFA-1 binding to ICAM-1 [69].
LIMBS and ADMIDAS sites modulate binding of ligand to the MIDAS-like motif in the integrins that lack the αI domain [70-72], but the ADMIDAS also seems to regulate αL I-domain affinity and to participate in outside-in signalling [58]. The high degree of conservation in the cytoplasmic tail, with many Ser, Thr, and Tyr residues, is compatible with the important role that phosphorylation of these residues plays in regulating adhesive activity [73] and with the observation that cytoplasmic domain truncation of CD18 markedly diminishes binding of LFA-1 to ICAM-1 [52]. Importantly, it was shown that the phosphorylation of the highly conserved residues T758, T759 and T760 plays a crucial role in the activation of the receptor and the binding to ICAM-1 [59,74]. In addition, residue F766, which is key for binding to ICAM-1, is strictly conserved [59]. Although EGF-1 has a weaker identity, its similarity is very high (Table 2). The weaker conservation of the β-tail domain could reflect a lesser importance of this domain for CD18 function. Every cysteine residue in the extracellular portion of mature wild boar CD18 is present at the same location in CD18 from other species, which is consistent with a role in maintaining the global structure of the protein. Similarly, all five potential Asn-glycosylation sites observed in wild boar are present at the same location in other mammalian species. The wild boar-specific CD18 isoform is characterized by two amino acid substitutions compared to domestic pigs, G560S and A721V, which should not result in structural differences according to the BLOSUM62 table, but might impact CD18 function.
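The BLOSUM62-based reading of the wild boar substitutions can be illustrated with a small lookup. This is a sketch under stated assumptions: only the BLOSUM62 entries needed for the substitutions named in the text are hard-coded (D/E = 2, G/S = 0, A/V = 0, A/G = 0), the helper names are hypothetical, and the labels are treated as amino acid substitutions written as reference residue, position, variant residue. How a given score translates into "weak structural impact" is an interpretation, since the criterion applied is not spelled out in the text.

```python
import re

# Subset of the BLOSUM62 matrix covering only the residue pairs discussed here.
BLOSUM62_SUBSET = {
    frozenset("DE"): 2,
    frozenset("GS"): 0,
    frozenset("AV"): 0,
    frozenset("AG"): 0,
}


def parse_substitution(label: str) -> tuple[str, int, str]:
    """Split a label such as 'E168D' into (reference, position, variant)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def blosum62_score(label: str) -> int:
    """Look up the BLOSUM62 score of the residue exchange in a label."""
    ref, _, alt = parse_substitution(label)
    return BLOSUM62_SUBSET[frozenset((ref, alt))]


if __name__ == "__main__":
    for label in ("E168D", "D621E", "G736A", "G560S", "A721V"):
        print(label, blosum62_score(label))
```

None of these exchanges carries a negative score, which fits the description of them as conservative at the sequence level, but a substitution matrix says nothing about the functional effect of a change at a specific position such as the I-domain.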

Conclusion
This study reports the sequencing of the cDNAs encoding the wild boar β2-integrin CD11a and CD18 subunits. Predicted CD11a and CD18 subunits share all the main structural characteristics of their mammalian homologues, with greater interspecies conservation for CD18 than for CD11a. Besides these strong overall similarities, wild boar and domestic pig LFA-1 differ by 2 (CD18) and 1 or 3 (CD11a) substitutions, of which one is located in the crucial I-domain (CD11a, E168D). As most wild boars are seropositive to Actinobacillus pleuropneumoniae and because they have sustained continuous natural selection, future studies of the functional impact of these polymorphisms could bring interesting new information on the physiopathology of pneumonia in domestic pigs.

Black columns with white letters represent identity among the 14 species. Cysteine residues (¤), potential N-glycosylation sites (#) and potential cytoplasmic-tail phosphorylation sites (+) are marked at the bottom of the sequences in red for 100% identity and in blue for less. The important "C191PNKEKEC", L732, S756, T758TT and F766 residues are marked by (£) in red for 100% identity and in blue for less. The stripes above the sequences represent the deduced different constitutive parts of the protein: the signal peptide (…