Prevalence of the prion protein gene E211K variant in U.S. cattle

Background: In 2006, an atypical U.S. case of bovine spongiform encephalopathy (BSE) was discovered in Alabama and later reported to be polymorphic for glutamate (E) and lysine (K) codons at position 211 in the bovine prion protein gene (Prnp) coding sequence. A bovine E211K mutation is important because it is analogous to the most common pathogenic mutation in humans (E200K), which causes hereditary Creutzfeldt-Jakob disease, an autosomal dominant form of prion disease. The present report describes a high-throughput matrix-assisted laser desorption/ionization-time-of-flight mass spectrometry assay for scoring the Prnp E211K variant and its use to determine an upper limit for the K211 allele frequency in U.S. cattle.
Results: The K211 allele was not detected in 6062 cattle, including those from five commercial beef processing plants (3892 carcasses) and 2170 registered cattle from 42 breeds. Multiple nearby polymorphisms in the Prnp coding sequence of 1456 diverse purebred cattle (42 breeds) did not interfere with scoring E211 or K211 alleles. Based on these results, the upper bound for prevalence of the E211K variant was estimated to be extremely low, less than 1 in 2000 cattle (Bayesian analysis based on the 95% quantile of the posterior distribution with a uniform prior).
Conclusion: No groups or breeds of U.S. cattle are presently known to harbor the Prnp K211 allele. Because a carrier was not detected, the number of additional atypical BSE cases with K211 will also be vanishingly low.


Background
Transmissible spongiform encephalopathies (TSE), or prion diseases, are fatal neurological disorders of humans and other mammals that are characterized by accumulation of an abnormal, protease-resistant isoform of the prion protein (PrP TSE) in the brain. In cattle, the largest disease outbreak was first recognized in 1986 in Great Britain and peaked in the early 1990s, when the number of confirmed bovine spongiform encephalopathy (BSE) cases rose to more than 30,000 per year [1,2]. During this time, BSE transmission among cattle was caused primarily by feeding meat and bone meal derived from other BSE-affected cattle [3,4]. This so-called classical, or orally acquired, BSE has since been identified in 24 additional countries around the world [5]. Consumption of beef from BSE-affected animals was implicated as the most likely cause of one human prion disease, variant Creutzfeldt-Jakob Disease (vCJD) [6][7][8][9]. However, after regulations were implemented to prevent BSE-contaminated tissues from entering the animal feed supplies and active BSE surveillance was increased, the number of BSE cases dropped dramatically [5]. This was followed by a corresponding reduction in vCJD cases [10].
An important outcome of intensive worldwide BSE surveillance has been the detection of atypical BSE in cattle. Atypical BSE cases may be distinguished from classical BSE by differences in: 1) distribution in the central nervous system, 2) molecular typing profile of PrP TSE by Western blot, 3) distribution of cases over time, and 4) outcomes of transmission studies in animal models [11][12][13][14]. A striking feature of atypical BSE cases is their advanced age at time of detection. For example, the average age of atypical BSE cases is 12 years at the time of detection, compared to an average of 6 years for orally acquired BSE cases [1,15,16]. Although few atypical BSE cases have been identified worldwide (approximately 30), they are significant because of their possible link to sporadic CJD in humans (i.e., CJD with unknown origins) [15].
Recent evidence suggests that specific bovine prion protein gene (Prnp) variants may represent genetic risk factors for atypical BSE in older cattle. The first of two indigenous U.S. BSE cases (a 12-year-old, cream-colored, Brahman cross from a Texas farm in November 2004) was found to be homozygous for a particular Prnp haplotype associated with atypical BSE [17]. The second BSE case (an approximately 10-year-old, deep red-colored, crossbred beef cow from an Alabama farm in March 2006) had a previously unidentified, non-synonymous E211K mutation in the Prnp coding sequence (unpublished results, J.A. Richt and S.M. Hall). Genetic risk factors for TSE diseases are well known in human populations where more than 20 pathogenic Prnp mutations have been discovered in families with inherited prion diseases [18]. These are transmitted as autosomal dominant disorders and include familial CJD, Gerstmann-Sträussler-Scheinker disease, and fatal familial insomnia. Because both indigenous U.S. BSE cases have arisen without any known exposure to other infectious prion agents, the possibility remains that they represent a type of inherited BSE that occurs in older cattle with certain Prnp haplotypes.
The discovery of a Prnp E211K variant in an atypical BSE case is particularly remarkable because it is analogous to the most common pathogenic mutation in humans (E200K), which causes hereditary CJD (Figure 1A). In the human E200K mutation, the normal glutamate (E) codon GAG is replaced by a lysine (K) codon AAG. First reported in 1989, this human G to A transition at codon 200 has arisen independently at least four times in human history [19,20]. The Alabama atypical BSE case had a normal GAA (E) and a novel AAA (K) codon at position 211 in the prion gene (unpublished results, J.A. Richt and S.M. Hall). Investigators have not been able to identify the source of this apparent bovine G to A transition or the affected cow's herd of origin [21].
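The parallel between the human and bovine substitutions can be illustrated with a minimal sketch. It uses only the four codons named in the text; the table and the function name `g_to_a_first_base` are illustrative assumptions, not part of the original analysis.

```python
# Subset of the standard genetic code covering only the codons named in the text.
CODON_TABLE = {"GAA": "E", "GAG": "E", "AAA": "K", "AAG": "K"}


def g_to_a_first_base(codon):
    """Apply a G-to-A transition at the first base of a codon (hypothetical helper)."""
    if not codon.startswith("G"):
        raise ValueError("codon does not start with G")
    return "A" + codon[1:]


# (species, codon position, normal codon) as described in the text.
CASES = [("human", 200, "GAG"), ("bovine", 211, "GAA")]

for species, position, normal in CASES:
    mutant = g_to_a_first_base(normal)
    print(
        f"{species}: {normal} ({CODON_TABLE[normal]}) -> "
        f"{mutant} ({CODON_TABLE[mutant]}), "
        f"i.e. {CODON_TABLE[normal]}{position}{CODON_TABLE[mutant]}"
    )
```

In both species the same first-base transition turns a glutamate codon into a lysine codon, even though the third base of the codon differs.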
Knowledge of the distribution and frequency of Prnp E211K variants in cattle populations is critical for understanding and managing atypical BSE. The K211 allele had not previously been observed in a diverse sample of 311 full blood U.S. beef and dairy cattle [22][23][24]. However, it was not known whether the K211 allele was present at a low frequency in commercially produced crossbred cattle or was detectable in a more extensive sample of purebred cattle. To evaluate these possibilities, cattle from U.S. beef processing plants and registered purebred animals were genotyped at the Prnp locus. The present report describes a matrix-assisted laser desorption/ionization-time-of-flight mass spectrometry (MALDI-TOF MS) genotype assay for accurate high-throughput scoring of the Prnp E211K variant and its use to determine an upper limit for the K211 allele prevalence in U.S. cattle.

Scoring haplotypes and diplotypes of Prnp codons 210 and 211
The position of the E211K mutation is adjacent to that of a synonymous C/T polymorphism in codon 210. Thus it may be possible to observe four haplotype combinations of Prnp codons 210 and 211 (Figure 1B, haplotype abbreviations: cE211, tE211, cK211, and tK211). Moreover, ten diplotypes are possible when all paired combinations of four haplotypes are considered. To account for these ten possible paired haplotype combinations, homogeneous mass extension (hME) reactions were designed to generate both the sense and antisense allele-specific extension products with unique molecular masses (Table 1).
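The count of ten diplotypes follows from pairing four haplotypes without regard to order, 4 x 5 / 2 = 10. A minimal sketch of that enumeration is given below; the function name `enumerate_diplotypes` and the "hap1/hap2" label format are illustrative assumptions.

```python
from itertools import combinations_with_replacement

# The four haplotypes of Prnp codons 210 and 211 named in the text.
HAPLOTYPES = ["cE211", "tE211", "cK211", "tK211"]


def enumerate_diplotypes(haplotypes):
    """Return every unordered pair of haplotypes, including homozygous pairs."""
    return ["/".join(pair) for pair in combinations_with_replacement(haplotypes, 2)]


diplotypes = enumerate_diplotypes(HAPLOTYPES)
k211_diplotypes = [d for d in diplotypes if "K211" in d]

print(len(diplotypes), "diplotypes in total")
print(len(k211_diplotypes), "of them carry at least one K211 allele")
for diplotype in diplotypes:
    print(diplotype)
```

Only the three diplotypes built from cE211 and tE211 lack a K211 allele, which is why an assay has to resolve the remaining seven from them.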
Because most of the ten diplotypes had not been identified in cattle and thus could not be used as DNA controls, double-stranded synthetic DNA controls (42 bp) were used in their place (Table 1). MALDI-TOF MS analysis of the resulting extension products showed that the alleles were sufficiently resolved for accurate diplotype assignment (data not shown). Importantly, all possible K211 alleles were well resolved in both the sense and the antisense reactions, providing internal confirmation of the presence of any potential K211 allele. The predicted relative positions of all four haplotype alleles in their ten paired combinations are shown graphically in Additional file 1. These assays and their synthetic DNA controls provide the basis for accurate, high-throughput screening in cattle.

Figure 1. [...] cK211, acC-Aaa; tK211, acT-Aaa). The map features include: thick shaded arrow, coding sequence; black arrow, 5' and 3' untranslated regions of exon 3; and hatched arrow, bovine repetitive elements. The numbers above the vertical tick marks indicate the polymorphism position relative to the first base of the Prnp start codon (GenBank Accession number AY335912). The letters below the vertical tick marks are International Union of Biochemistry (IUB) ambiguity codes for SNPs in the sense direction (B = c/g/t, K = g/t, M = a/c, R = a/g, W = a/t, and Y = c/t). R1 through R7 refer to octapeptide repeats. R1 through R5 are octapeptide repeats orthologous to those in sheep. The previously reported non-synonymous polymorphisms at codons 46 (S46I) and 145 (S145N) are indicated at nt positions 137 and 461, respectively. The asterisks at nt positions -403, -399, and -276 denote polymorphisms in intron 2 that were not previously reported. Panel C: Amplicons for DNA sequencing or MALDI-TOF MS genotyping. Arrows denote positions of oligonucleotide primers used for amplification, sequencing, or primer extension and MALDI-TOF MS genotyping. The relative position of the 42-bp double-stranded synthetic alleles is shown below positions 630 and 631 (i.e., below bovine codons 210 and 211).

[...] Figure 2). The upper bound for the prevalence of Prnp K211 was estimated to be less than 1 in 2000 cattle based on the 95% quantile of a beta posterior distribution conditional on the observation of 6062 diverse U.S. cattle without any Prnp K211 alleles and a conservative uniform prior distribution (Additional file 2). Changing the two shape parameters, alpha and beta, to those of a more realistic but less conservative probability density function (e.g., alpha = 2 and beta = 5) did not affect the outcome.
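The uniform-prior bound can be checked with a short calculation. The sketch below is not the authors' spreadsheet: it assumes zero observed carriers, in which case the posterior is Beta(1, n + 1) and its cumulative distribution, F(x) = 1 - (1 - x)^(n + 1), can be inverted in closed form. The function name `zero_carrier_upper_bound` is hypothetical.

```python
def zero_carrier_upper_bound(n_sampled, quantile=0.95):
    """Quantile of the Beta(1, n_sampled + 1) posterior for carrier prevalence.

    Assumes a uniform Beta(1, 1) prior and zero carriers among n_sampled
    animals. The closed form below does not hold once a carrier is observed;
    a general inverse beta CDF would be needed in that case.
    """
    if n_sampled < 0:
        raise ValueError("n_sampled must be non-negative")
    if not 0.0 < quantile < 1.0:
        raise ValueError("quantile must lie strictly between 0 and 1")
    beta_shape = n_sampled + 1  # second shape parameter of the posterior
    return 1.0 - (1.0 - quantile) ** (1.0 / beta_shape)


upper = zero_carrier_upper_bound(6062)
print(f"95% upper bound on carrier prevalence: {upper:.6f}")
print(f"equivalent to about 1 in {1.0 / upper:.0f} cattle")
```

Under these assumptions the bound falls just below 1 in 2000, in line with the estimate reported above.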

Additional unrecognized Prnp sequence variation in U.S. cattle
Because unrecognized DNA sequence variation near Prnp codon 211 may interfere with genotype scoring, it was important to sequence the complete Prnp coding sequence in diverse sires and dams from purebred collections. Analysis of Prnp coding sequences from more than 1400 diverse purebred cattle from 42 breeds identified three previously unreported polymorphisms. All three single nucleotide polymorphisms (SNPs) were in intron 2, at positions -403, -399, and -276 relative to the Prnp start codon (Figure 1B) [27]. Because the germplasm of elite full blood animals is the foundation of seed stock and commercial cattle in the U.S., it represents a more thorough sampling of U.S. bovine germplasm than a random sample. In spite of the biased sampling, K211 carriers were not detected in any cattle tested. These results indicate that, like atypical BSE, the Prnp K211 allele is vanishingly rare among U.S. cattle.
The origin of the K211 allele from the Alabama BSE case remains unknown. None of its ancestors were traceable and thus they were not available for testing [21]. The question remains, what was the origin of this Prnp K211 allele? Possibilities include: 1) it was a mutation that arose independently in early embryonic development of the 2006 Alabama BSE case, 2) it was a mutation that arose independently in a gamete of one of its parents, 3) it was a mutation that arose recently in a line of cattle whose descendants include the Alabama BSE case, or 4) it was an allele present in a population or breed not tested. Given the apparent scarcity of carriers in the U.S. cattle population, strategies for identifying K211 carriers might include sampling cattle from geographic areas relevant to the Alabama BSE case and the continued sampling of diverse purebred sires and dams from previously untested breeds.
Regardless of the origin of the Prnp K211 allele from the Alabama BSE case, there is always a remote chance that a new K211 mutation may arise independently in any calf.
In spite of the homology with the pathogenic K200 allele in humans, the functional significance of the K211 allele [...] A recent report showed that five of six other atypical BSE cases shared a relatively uncommon haplotype that is distinct from those of the Alabama BSE case [17]. Regardless of the type, identifying undesirable bovine Prnp alleles provides the opportunity to manage them before they cause disease.

Figure 2. Mass spectrograms of Prnp genotypes for SNPs at codons 210 and 211 in U.S. cattle. Spectral peaks represent singly-charged ions whose mass-to-charge ratio (m/z) was compared with calibrants for mass determination. Panels A, B, and C: Representative mass spectrograms showing the three haplotype combinations observed in U.S. cattle in their order of prevalence: cE211/cE211, cE211/tE211, and tE211/tE211. The "antisense nE211" designation refers to a peak generated by either a cE211 or a tE211 allele because the genotype for this analyte is ambiguous in the antisense direction. Although conservative hME assay designs include 40 Dalton spacing between analytes, only 15 Dalton spacing was possible in this assay design. Nevertheless, the typical instrument resolution of approximately 4 Daltons is sufficient to resolve the analytes. Panel D: Mass spectrogram of the diplotype reconstituted from cloned cDNA of the 2006 Alabama BSE case.

Conclusion
No groups or breeds of U.S. cattle are presently known to harbor the Prnp K211 variant. Because a carrier was not detected, the number of additional atypical BSE cases with K211 will also be vanishingly low.

Cattle samples
Bovine samples from large commercial beef processing plants consisted of approximately 50% muscle (longissimus dorsi) collected in the winter of 2005-2006 and 50% whole blood collected in the spring of 2007. The five beef processing plants were located in three states but received cattle from all over the continental U.S. Registered purebred cattle samples consisted of those from cow-calf herds and diverse sire and dam collections. The Angus cow-calf herd samples were from seven herds in four Midwestern states [28], whereas the Santa Gertrudis cow-calf herd samples were from a single Texas herd.
Samples of male and female registered purebred cattle with diverse pedigrees were taken from semen, blood, or tail hair follicles, depending on gender and availability. Semen from sires used in artificial insemination (AI sires) was obtained from commercial and private suppliers. Where possible, animals within breed were chosen so they did not share parents or grandparents. To estimate the relative diversity among animals in breed groups, the average relationships were computed with CFC version 1.0 (Coancestry, Inbreeding (F), Contribution) [29]. For breed groups where pedigrees were readily available, 24 animals with four-generation pedigrees were evaluated. The average numerator relationships (i.e., average coancestries) among groups of 24 purebred or full blood animals were: Angus, 0.057; Beefmaster, 0.122; Brahman, [...]. Selection criteria for this group included sire prominence (i.e., high usage in the dairy industry) and diversity. This set was complemented with liver samples from market dairy cattle [30] and various semen or hair samples available from willing private owners of registered purebred cattle including: Ayrshire, Brown Swiss, Guernsey, Holstein, Jersey, and Montbeliard. No experiments were performed on any of these animals for this research. All cattle samples were collected during the normal processes of working, showing, or federally inspected processing of animals.

DNA extraction and sequencing
DNA from muscle and whole blood samples was extracted by use of a solid-phase system incorporating either spin-columns or 96-well microtitration plates according to the manufacturer's instructions (Gentra Systems, Inc., Minneapolis, MN, USA). DNA from liver, muscle, skin, or hair samples was extracted by standard procedures. Briefly, minced tissue (35 mg) was suspended in a lysis solution of 2.5 ml 10 mM TrisCl, 400 mM NaCl, 2 mM EDTA, 1% wt/vol sodium dodecyl sulfate, RNase A (250 µg/ml; Sigma Chemical Co., St. Louis, MO, USA), pH 8.0. The solution was incubated at 37°C with gentle agitation. After 1 hour, 1 mg proteinase K was added (Sigma Chemical Co.) and the solution was incubated overnight at 37°C with continued agitation. The sample was extracted twice with 1 vol of phenol:chloroform:isoamyl alcohol (25:24:1), and once with 1 vol of chloroform before precipitation with 2 vol of 100% ethanol. The precipitated DNA was washed once in 70% ethanol, briefly air dried, and dissolved in a solution of 10 mM TrisCl, 1 mM EDTA (pH 8.0). DNA from bull semen was extracted similarly with slight modification, including the presence of 40 mM dithiothreitol [31].
Polymerase chain reaction (PCR) cocktails and DNA sequencing reactions were carried out as previously described [22,23]. Following exonuclease I digestion, the amplicons were sequenced with BigDye terminator chemistry on an ABI 3730 capillary sequencer (PE Applied Biosystems, Foster City, CA, USA). Oligonucleotide primers were designed to amplify 1498 bp and 935 bp amplicons in separate reactions (Figure 1C). The 935 bp amplicon was contained within the 1498 bp amplicon and each contained the entire 795 bp Prnp coding sequence (with six octapeptide repeats). Oligonucleotide primers for amplification were chosen so that they did not overlie known SNPs or bovine repetitive elements. Both strands of each amplicon were sequenced for each animal to increase the quality of their consensus sequence and assist in recognizing "allelic drop out" due to misamplification. This phenomenon commonly occurs when an individual's genome contains a previously unrecognized polymorphism in the binding site of the amplification primer. The sequence mismatch between the amplification primer and the genomic DNA reduces the stability of heteroduplex formation and often results in allele misamplification (sometimes referred to as null alleles). The DNA sequences, allele frequencies, SNP genotypes of animals, and their tracefiles are publicly available [26].

MALDI-TOF MS genotyping of adjacent SNPs
The hME chemistry (Sequenom, Inc., San Diego, CA, USA) was used to genotype the adjacent SNPs in Prnp codons 210 and 211 from both sense and antisense DNA strands (Table 1). Unlike single-base extension chemistries that use all four dideoxynucleotides together, hME chemistry uses selected combinations of dideoxynucleotides and deoxynucleotides and allows for multiple base extension across adjacent SNPs. The result may be used to unambiguously infer haplotypes for multiple, sequential SNPs. The assay was designed to score both DNA strands from the same PCR reaction for each animal. Because each genotype result must agree, this strategy provides an internal control for error checking and increases the confidence of detecting a rare allele. A 203 bp region, approximately centered about codon 211, was chosen for PCR amplification (Figure 1C). The amplification primer binding sites for this amplicon are not known to be polymorphic in any cattle population. Additional non-specific sequences (i.e., mass tags) were added to the hME amplification primers to increase their mass and thereby shift them out of the useful region of the MALDI-TOF mass spectrum. The amplification primers with these mass tags produced a 226 bp amplicon for genetic analysis (Table 1). For some DNA samples, particularly those from hair follicles, success in amplification was influenced by the type of Taq polymerase used. Results presented here were produced with Thermo-Start® PCR Master Mix (ABgene USA, Rochester, NY, USA). After PCR, a few microliters of each reaction were analyzed by agarose gel electrophoresis to monitor the amplification results. Shrimp alkaline phosphatase enzyme was subsequently added according to the manufacturer's instructions to convert unincorporated dNTPs to dNDPs so they do not interfere with subsequent reactions. Each sample was split into two hME reactions containing different termination mixtures (Table 1, ddA or ddC/ddG) for the respective sense and antisense reactions.
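The internal control described above, in which the sense and antisense calls must agree, can be sketched as a simple concordance check. The call format ("hap1/hap2" strings, with "n" marking an antisense allele whose codon 210 base is ambiguous) and the function names are assumptions made for illustration; they do not reproduce the MassARRAY software output.

```python
def codon211_alleles(diplotype_call):
    """Reduce a call such as 'cE211/tK211' to its sorted codon 211 residues.

    Each haplotype label is assumed to be a codon 210 base ('c', 't', or 'n'
    for ambiguous) followed by the codon 211 residue ('E' or 'K') and '211'.
    """
    return tuple(sorted(label[1] for label in diplotype_call.split("/")))


def strands_concordant(sense_call, antisense_call):
    """True when both strands report the same codon 211 genotype."""
    return codon211_alleles(sense_call) == codon211_alleles(antisense_call)


# Hypothetical (sense, antisense) call pairs for three animals.
EXAMPLE_CALLS = [
    ("cE211/cE211", "nE211/nE211"),  # concordant E/E
    ("cE211/tE211", "nE211/nE211"),  # concordant E/E, codon 210 heterozygote
    ("cE211/cE211", "nE211/cK211"),  # discordant: K211 seen on one strand only
]

for sense, antisense in EXAMPLE_CALLS:
    status = "agree" if strands_concordant(sense, antisense) else "DISAGREE, re-test"
    print(f"sense={sense} antisense={antisense}: {status}")
```

Comparing only the codon 211 residue keeps the check valid even though the antisense reaction cannot distinguish cE211 from tE211.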
After thermocycling, the hME reactions of the paired samples were either reconstituted to conserve reagents and increase MALDI-TOF MS throughput or processed individually. A cation-exchange resin was added to remove salts that may interfere with analysis by mass spectrometry. Samples were spotted in nanoliter amounts onto a matrix-arrayed silicon chip with 384 elements and analyzed with the manufacturer's MassARRAY compact system and software (Sequenom, Inc.). Synthetic DNA controls (42 bp) were designed to be nested within the 226 bp hME PCR product to minimize the potential amplification of any cross-contamination between synthetic K211 alleles and animal samples.

[...] this could be estimated directly by calculating a simple proportion (e.g., one K211 carrier per 6062 cattle tested), the animals from beef processing facilities were not randomly sampled and the diverse registered AI sires contribute a disproportionate number of alleles to U.S. cattle populations. Thus, prevalence was estimated with a posterior distribution given the number of observed carriers, the number of animals sampled, and a prior distribution that reflected prior knowledge and uncertainty. The beta distribution was used as a prior distribution for binomial proportions in Bayesian analysis [32]. The beta family of distributions was used for the posterior and prior distributions because they are conjugate (i.e., they yield the same functional forms for both prior and posterior distributions). The uniform distribution, a member of the beta family, was used as the prior distribution because it is conservative and does not underestimate the frequency of Prnp K211 carriers. The prior knowledge was that the prevalences of both atypical BSE and Prnp K211 carriers are low, based on previous BSE surveillance and prion gene DNA sequencing, respectively [22][23][24][33].
All values of prevalence are equally likely under the uniform prior distribution; hence, the posterior distribution with a uniform prior is biased to the high side, yielding an inflated (i.e., conservative) 95% quantile. The posterior distribution for the prevalence of K211 carriers with a uniform prior was modeled by the beta distribution as follows:

$$f(x \mid \alpha, \beta) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\, x^{\alpha - 1} (1 - x)^{\beta - 1}, \qquad 0 \le x \le 1$$

where x is the prevalence of carriers (heterozygotes), α is the number of carriers observed plus 1, and β is the number of animals sampled without the allele plus 1 (equal here to the total number of animals sampled plus 1, because no carriers were observed). The 95% quantile of f was computed using the BETAINV function of Microsoft Office Excel, which returns the inverse of the cumulative beta probability density function (BETADIST).

[...] port; S. Kluver and J. Rosch for secretarial assistance; R. Goode for invaluable assistance in identifying and collecting registered purebred cattle. We also thank the many commercial providers of bull semen, cattle breed associations, private cattle producers, and beef processors that generously donated their time, resources, and cattle germplasm to make this study possible. Products and company names are necessary to accurately report the methods and results; however, the USDA neither guarantees nor warrants the standard of the product. Use of names by USDA implies no approval of the product to the exclusion of others that may also be suitable.