Canine candidate genes for dilated cardiomyopathy: annotation of and polymorphic markers for 14 genes

Background
Dilated cardiomyopathy is a myocardial disease occurring in humans and domestic animals and is characterized by dilatation of the left ventricle, reduced systolic function and increased sphericity of the left ventricle. Dilated cardiomyopathy has been observed in several, mostly large and giant, dog breeds, such as the Dobermann and the Great Dane. A number of genes have been identified that are associated with dilated cardiomyopathy in the human, mouse and hamster. These genes mainly encode structural proteins of the cardiac myocyte.
Results
We present the annotation of, and marker development for, 14 of these genes of the dog genome, i.e. α-cardiac actin, caveolin 1, cysteine-rich protein 3, desmin, lamin A/C, LIM-domain binding factor 3, myosin heavy polypeptide 7, phospholamban, sarcoglycan δ, titin cap, α-tropomyosin, troponin I, troponin T and vinculin. A total of 33 single nucleotide polymorphisms were identified for these canine genes and 11 polymorphic microsatellite repeats were developed.
Conclusion
The presented polymorphisms provide a tool to investigate the role of the corresponding genes in canine dilated cardiomyopathy by linkage analysis or association studies.

DCM has been described in many different breeds of mostly giant and large dogs, including the Dobermann [20], Great Dane [21], Newfoundland [22] and Irish Wolfhound [23]. The presentation and progression of DCM vary clinically between dog breeds, and breed-specific variation has also been found in the histology of DCM-affected heart tissue [24]. Since clinical DCM may be a late-onset disease, following a long pre-symptomatic course, dogs are often used for breeding before the disease becomes apparent [25]. So far, no causative mutation has been discovered in canine DCM. The phenotype of the adult-onset forms of canine DCM in most breeds is consistent with a defect in components of the cytoskeleton.
Of the 14 autosomal DCM candidate genes for the dog (ACTC, CAV1, CSRP3, DES, LDB3, LMNA, MYH7, PLN, SGCD, TCAP, TNNI3, TNNT2, TPM1 and VCL), genomic information and/or polymorphic markers were already available for ACTC [26,27], DES [28], PLN [29], SGCD [30] and TPM1 [31]. In this article, we describe a complete set of polymorphic markers for these 14 candidate genes for canine DCM. The markers, both microsatellites and Single Nucleotide Polymorphisms (SNPs), provide a useful tool to perform linkage and association studies between each of these genes and DCM in the different dog breeds. Furthermore, we present the annotation of the 14 candidate genes in the canine genome, which will facilitate mutation screening of these genes.

Genomic Annotation
The 14 canine DCM candidate genes were identified in the canine genome by means of a BLAST analysis [32], using available canine and human DNA sequences as queries (Table 1). The exons were identified based on the corresponding human exon sequences (retrieved from [33], Table 1). Each gene was found to be covered by 1 to 5 contigs of the Canis familiaris genome build 1.1 (Additional file 1 and Table 1). TNNT2 exon 6 showed 1 extra bp compared to human (G, bp 5622 of [Genbank: AAEX01013360]); however, this nucleotide was not found in the 2 traces covering this DNA sequence. Without this additional bp, exon 6 matched the corresponding human exon exactly in length. Exon 12 had 1 codon less than the human gene. Exon 13 was located at the end of genomic contig [Genbank: AAEX01013360] and, although its terminal 2 putative bp were not included in this contig, the exon otherwise seemed to match the human exon. For the remaining candidate genes, ACTC, CSRP3, PLN, SGCD, TCAP, TPM1 and VCL, the annotated canine exons matched the corresponding human exons exactly. We could not identify non-coding exons. Apparently, the conservation of these exons is too low for identification by sequence similarity, so complementary DNA sequencing is necessary to identify them. All of the predicted introns of the 14 candidate genes started and ended with the canonical GT and AG dinucleotides, respectively [35]. Even though a high-quality DNA sequence of the canine genome has recently become available, it has not yet been fully annotated.
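The intron-boundary check described above can be illustrated with a minimal sketch. It is not the procedure used in this study: the function name, the coordinate convention (0-based, half-open exon intervals on the forward strand) and the toy sequence are all assumptions made for illustration.

```python
def has_canonical_splice_sites(genomic, exons):
    """Return True if every intron between consecutive exons starts with GT
    and ends with AG.

    genomic: forward-strand genomic sequence (str).
    exons:   sorted list of (start, end) tuples, 0-based and half-open
             (hypothetical convention chosen for this sketch).
    """
    seq = genomic.upper()
    for (_, prev_end), (next_start, _) in zip(exons, exons[1:]):
        intron = seq[prev_end:next_start]
        # An intron must at least hold the donor and acceptor dinucleotides.
        if len(intron) < 4:
            return False
        if not (intron.startswith("GT") and intron.endswith("AG")):
            return False
    return True


# Toy example (invented sequence): exon 1, a 12-bp intron, exon 2.
toy_gene = "ATGGCC" + "GTAAGTTTTCAG" + "GGATAA"
toy_exons = [(0, 6), (18, 24)]
print(has_canonical_splice_sites(toy_gene, toy_exons))
```

For a gene on the reverse strand, the sequence would first have to be reverse-complemented so that the donor and acceptor sites read as GT and AG.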
The conservation of the coding region of each gene was assessed by BLAST comparison of the cDNA and derived amino acid sequences with those of human (at the website of NCBI [36], BLASTN and TBLASTX analysis, respectively). The percentages of identity at the nucleotide level varied between 88 and 95% (Table 1). At the amino acid level, the percentages of identity generally varied between 90 and 100%, except for the canine LDB3 protein, which was 79% identical to the human protein. The canine ACTC protein appeared to be identical to the human protein. For LDB3, a relatively low percentage of identity was found between the canine and human gene, both at the cDNA and the protein level. This was caused by the large in-frame loss of parts of exons 4, 7, 8 and 9 compared to the human gene: the canine gene had 660 codons, whereas the human gene had 734 codons.
The chromosomal position of the 14 canine candidate genes can be found in Table 1.
When analysing the location of the genes in the dog genome (Table 1), using the canine-human comparative map of Guyon et al. [37], each was found to be syntenic to the human location.

Single Nucleotide Polymorphism detection
We used denaturing high-performance liquid chromatography (DHPLC) analysis for the detection of SNPs in amplified genomic canine DNA fragments. Polymorphisms were assessed in DNA from Newfoundland dogs. For each gene, several DNA fragments of approximately 500 bp were selected based on their melting profile (analyzed with WAVEMAKER™ software from Transgenomic), with a maximum of 2 melting temperatures covering each product. The melting behaviour of a fragment depends on its DNA sequence. Primers were designed using Primer3 [38] and the annealing temperatures of the PCRs were optimized (Table 2). Touchdown PCR amplification of these fragments was performed with DNA of Newfoundland dogs (n = 16; 8 unrelated founders of a pedigree of Newfoundland dogs and 8 family members), using HotStartTaq DNA Polymerase (Qiagen). The touchdown (TD) PCR program consisted of a denaturing step of 5 min at 95°C, followed by 14 cycles of 30 sec at 95°C, 30 sec at Ta + 7°C and 20 sec at 72°C, with the annealing temperature decreasing by 0.5°C/cycle, followed by 25 cycles of 30 sec at 94°C, 30 sec at Ta and 30 sec at 72°C, and a final extension at 72°C for 2 min (Ta in Table 2). Subsequently, a heteroduplex formation step was carried out to allow formation of hetero- and homoduplex products: the PCR products were heated for 5 min at 95°C, after which the temperature was decreased gradually (38 cycles of 1 min, temperature decreasing 1.5°C/cycle), followed by a final step of 5 min at 10°C. Mutation analysis of the PCR products, based on the presence of heteroduplexes, followed on a WAVE instrument (WAVE Nucleic Acid Fragment Analysis System, Transgenomic). Multiple WAVE patterns of a single PCR fragment in different dogs pointed to the existence of both homoduplexes and heteroduplexes and therefore indicated the potential presence of SNPs in the fragment.
In that case, the PCR fragment (from at least 2 dogs per WAVE pattern) was cleaned (Shrimp Alkaline Phosphatase/ExoI) and sequenced by a commercial company (Lark Technologies™, UK) to determine the identity of the SNPs.
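To make the cycling conditions easier to follow, the touchdown program and the heteroduplex formation step can be written out as explicit lists of steps. The sketch below is only an illustration: the function names are hypothetical, and it assumes that the first touchdown cycle anneals at Ta + 7°C (so that the 14th cycle anneals at Ta + 0.5°C) and that the first 1-min step of the heteroduplex ramp is already 1.5°C below 95°C.

```python
def touchdown_program(ta):
    """Touchdown PCR steps as (name, temperature in deg C, seconds) tuples."""
    steps = [("initial denaturation", 95.0, 300)]
    # Phase 1: 14 cycles, annealing temperature lowered by 0.5 deg C per cycle
    # (assumption: the first cycle anneals at Ta + 7).
    for cycle in range(14):
        steps.append(("denaturation", 95.0, 30))
        steps.append(("annealing", ta + 7.0 - 0.5 * cycle, 30))
        steps.append(("extension", 72.0, 20))
    # Phase 2: 25 cycles at the optimized annealing temperature Ta.
    for _ in range(25):
        steps.append(("denaturation", 94.0, 30))
        steps.append(("annealing", ta, 30))
        steps.append(("extension", 72.0, 30))
    steps.append(("final extension", 72.0, 120))
    return steps


def heteroduplex_program():
    """Heteroduplex formation: denature, cool in 38 steps of 1.5 deg C, hold."""
    steps = [("denaturation", 95.0, 300)]
    # Assumption: each 1-min step is 1.5 deg C cooler than the previous one.
    for cycle in range(1, 39):
        steps.append(("cooling", 95.0 - 1.5 * cycle, 60))
    steps.append(("hold", 10.0, 300))
    return steps


# Example with a hypothetical annealing temperature of 55 deg C.
for name, temperature, seconds in touchdown_program(55.0)[:4]:
    print(f"{name}: {temperature:.1f} C for {seconds} s")
```

Under these assumptions the gradual cooling ends at 38°C before the final step at 10°C.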
Twenty-eight SNPs were discovered by WAVE analysis (Table 2). No indication of the presence of a SNP was found in WAVE fragments of LMNA, MYH7 and TNNI3 (3, 5 and 3 fragments analyzed, respectively). One new SNP, TCAP SNP 29,957 T/C in genomic contig [Genbank: AAEX01022011], was found when we resequenced a TCAP fragment in a group of Newfoundland dogs. WAVE analysis of this fragment had not indicated the presence of a potential SNP, although the obtained DNA sequences showed that both homozygous and heterozygous animals were among the dogs used for WAVE analysis. Conversely, WAVE analysis sometimes indicated the potential presence of SNPs, yet sequencing of dogs with different WAVE patterns did not confirm these. This could be due to the sequencing procedure used.
Footnotes to Table 1: 1 Sequence used to identify the canine gene in the dog genome, Genbank accession numbers; C = canine sequence, H = human sequence; 2 Transcript ID numbers of human annotation [33] used to annotate the canine gene; 3 canine genomic contig in which the gene's coding exons were identified; 4 the percentage identity of each canine protein compared to the human protein (Genbank accession number is listed); 5 canine genomic contig containing only intronic sequence.
In search of additional SNPs for canine ACTC and DES, genomic DNA fragments containing SNPs annotated by others (Table 2) were resequenced. After PCR amplification of these fragments, 1 µl of 1:15 diluted PCR product was used in a Tercycle big dye reaction with the F-PCR primer for the ACTC SNP and an HPLC-purified M13 F-primer (5'-GTTTTCCCAGTCACGAC-3') for the DES SNPs. The Tercycle consisted of 25 cycles of 30 sec at 96°C, 15 sec at 55°C and 2 min at 60°C. After purification (Sephadex™ G50 Superfine, Amersham Biosciences), each product was processed with an ABI PRISM® 3100 Genetic Analyzer (Applied Biosystems). Five SNPs (ACTC 5,452G/A; DES 19,196C/T and 19,105G/A; LDB3 25,452A/G and TCAP 29,957T/C) were identified by resequencing areas of earlier described SNPs (Table 2).
Of the total of 33 identified SNPs, 4 were in coding regions (DES 15,006C/T, LDB3 14,090C/T, TCAP 29,957T/C and TNNT2 10,466C/T). These exonic SNPs, however, did not cause polymorphisms at the amino acid level. Comparing the 33 newly discovered SNPs to the dog SNP database of the Broad Institute [39] showed 25 of our SNPs to be new; the remaining 8 SNPs matched SNPs present in the Broad database (see Table 2). This indicates that, in addition to the many SNPs that have become available by random sequencing of the dog genome, many more canine SNPs exist. Our limited search for SNPs in 14 DCM candidate genes took place in a single breed, the Newfoundland dog. However, a high percentage of SNPs found in one breed can be expected to be polymorphic in other breeds too [40]. All identified SNPs were submitted to dbSNP and the respective accession numbers are listed in Table 2.
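Whether an exonic SNP changes the encoded amino acid can be checked by translating the affected codon with both alleles. The following is a minimal sketch using the standard genetic code; the helper name and the example codons are hypothetical and are not taken from the genes or SNPs reported here.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def is_synonymous(codon, position, alternative):
    """Return True if replacing the base at `position` (0, 1 or 2) of `codon`
    by `alternative` leaves the encoded amino acid unchanged."""
    ref = codon.upper()
    alt = ref[:position] + alternative.upper() + ref[position + 1:]
    return CODON_TABLE[ref] == CODON_TABLE[alt]


# Hypothetical examples: a third-position C/T change in an alanine codon is
# silent, whereas ATG -> ATA replaces methionine by isoleucine.
print(is_synonymous("GCC", 2, "T"))
print(is_synonymous("ATG", 2, "A"))
```

The codon containing a SNP has to be read in the reading frame and on the strand of the annotated transcript, which is why the exon annotation is a prerequisite for this check.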

Detection of microsatellite polymorphisms
Simple DNA sequences composed of CA, GAAA or GA repeats were identified in the genomic contigs that contain the candidate genes or in neighbouring contigs. For VCL, a polymorphic microsatellite became available through personal communication with P. Stabej (Table 3; a repeat was obtained from BAC RP81-251B5, isolated using methods as described in [28] with an overgo probe based on murine VCL exon 17, F-overgo CCAAGGTCAGAGAAGCCTTCCAAC, R-overgo AAGTCAGGCTCCTGAGGTTGGAAG). Primers were designed from the DNA sequence flanking the repeats and the forward primer was fluorescently labelled with 6-FAM or HEX. For some microsatellites, a 3-primer protocol was used for the PCR amplification (Table 3), using an M13-tailed F-primer (5'-GTTTTCCCAGTCACGAC-----3'), a 6-FAM-labelled M13 primer (5'-GTTTTCCCAGTCACGAC-3') and an R-primer. Genotyping PCR reactions were incubated for 12 min at 94°C, followed by 35 cycles of 10 sec at 94°C, 15 sec at Ta and 30 sec at 72°C, and a final step of 20 min at 72°C (Ta in Table 3). An ABI PRISM® 3100 Genetic Analyzer (Applied Biosystems) was used for genotyping and allele sizes were determined with Genescan Analysis 3.7 and Genotyper 3.7 software (Applied Biosystems). Eleven polymorphic microsatellites were developed for ACTC (2   Table 3). The markers, mostly CA-repeats, showed multiple allele sizes (2-6 alleles/marker) in a group of 16 Newfoundland dogs (Table 3).
To describe the informativeness of our microsatellite markers, the polymorphism information content (PIC) was obtained based on the genotypes of unrelated founders of a family of Newfoundland dogs (Table 3). According to [41], 2 of the 11 newly designed microsatellites were considered highly informative (PIC>0.50), 7 reasonably informative (0.25<PIC<0.50) and 2 slightly informative (PIC<0.25) in the Newfoundland founder dogs. Besides the 11 polymorphic microsatellites, 2 other markers were found to be monomorphic in the group of Newfoundland dogs, but might be polymorphic in other breeds. These were a MYH7 CA-repeat (position 11,730 of [Genbank: AAEX010141100]) and a TNNI3 CA-repeat (position 17,739 of [Genbank: AAEX01053915]). An already available microsatellite for TPM1 [31] was shown to be highly informative in our group (Table 3).
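As an illustration of how a PIC value follows from founder genotypes, the sketch below uses the widely cited formula of Botstein et al., PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j², where p_i is the frequency of allele i. That this is the exact definition behind [41] is an assumption, as are the function names, the handling of values exactly on a class boundary and the example genotypes.

```python
from collections import Counter
from itertools import combinations


def pic_from_genotypes(genotypes):
    """PIC of one marker from diploid genotypes given as (allele, allele) pairs."""
    alleles = [allele for genotype in genotypes for allele in genotype]
    total = len(alleles)
    freqs = [count / total for count in Counter(alleles).values()]
    sum_of_squares = sum(p * p for p in freqs)
    # Correction for matings between identical heterozygotes.
    cross_term = sum(2 * p * p * q * q for p, q in combinations(freqs, 2))
    return 1.0 - sum_of_squares - cross_term


def informativeness(pic):
    """Classes as used in the text; boundary values fall in the lower class
    (an assumption, since the text only gives strict inequalities)."""
    if pic > 0.50:
        return "highly informative"
    if pic > 0.25:
        return "reasonably informative"
    return "slightly informative"


# Hypothetical allele sizes (bp) for 8 unrelated founders: two alleles at
# equal frequency.
founders = [(152, 154)] * 4 + [(152, 152)] * 2 + [(154, 154)] * 2
value = pic_from_genotypes(founders)
print(round(value, 3), informativeness(value))
```

A marker with only two alleles cannot exceed PIC = 0.375, so the highly informative class requires at least three alleles at reasonably balanced frequencies.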
The distance between each microsatellite and the corresponding gene was derived from the dog genome build 1.1 [42] and can be found in Table 3. This distance varied from zero for an intragenic microsatellite to 179.8 kb. The genomic locations of polymorphic microsatellites already available for DES, SGCD, TPM1 and VCL were also determined. For DES, a CA-repeat [28] was located at position 5,688 of [Genbank: AAEX01055032], 9.0 kb downstream of the stop codon. For SGCD, both a GAAA-repeat and a CA-repeat were available [30]. The first was located at position 76,364 of [Genbank: AAEX4801016848] and the second at position 42,047 of the same genomic contig; both markers are in intron 7 of SGCD. For TPM1, a GA-repeat [31] was located at position 88,113 of [Genbank: AAEX01008742], 6.5 kb downstream of the stop codon. A polymorphic GAAA-repeat for VCL was found to be located at position 12,680 of [Genbank: AAEX01016406] in the dog genome, 88.6 kb downstream of the stop codon.

Conclusion
With the annotation of these 14 candidate genes for DCM and the identification of polymorphic markers, the genes can be evaluated for their involvement in breed-specific DCM. The SNPs and microsatellites presented in this paper are a powerful tool to analyse linkage between the 14 candidate genes encoding cytoskeletal proteins and DCM in the dog. The annotation of each gene facilitates screening of these genes for mutations in naturally occurring canine DCM in specific breeds, which are potential models for forms of human DCM.