New doppel neighbor: a tough little gene
5' prion neighbor: ADRA1A (alpha-adrenergic receptor)
Microsatellites in the prion-doppel region
Chr 20 genes with unigene links
Doppel polymorphisms in nvCJD: no apparent connection
2x prion repeats in lemurs
T188A prion mutation in an octogenarian with CJD
Normal function: a lesson from frataxin
Peyer's patch prion distribution: enhancing IHC sensitivity
Prion mRNA in sheep placenta
Mon, 7 Aug 2000 webmaster research
Genome Annotation Tutorial #19 ...
When and how to mask for GenScan: intronic coding pseudogenes ...
Determining 3' ends of genes: end-aligned EST stacks ...
Mixups at GenBank: no transcriptome target database, finished genomic sequence misplaced ...
Pushing the limits of what is annotatable: orphans with few transcripts
Bottom line: a sprawling but small new gene on chr 20p, rarely transcribed, with intronic pseudogenes
Gene order: telomere ... Gol ... ADRA1A ... PRNP ... PRND ... new ... KIAA0168 ... centromere
It will never be possible to adequately annotate the human genome by computer methods alone; it has been hugely wasteful not to experimentally confirm features in parallel with genomic sequencing. The gene described below (located immediately centromeric to doppel) illustrates some of the problems: the ill-conceived EST program never received a mid-course correction (yielding millions of short, badly annotated mostly 3' non-coding sequences), GenBank placed crucial sequences in the wrong databases (notably mRNAs in non-redundant nucleotides), and gene prediction can fail because the human genome has a very convoluted history of improbable superimposed events.
New work on human doppel polymorphisms and transcripts warrants an update of observed transcripts of doppel. It quickly emerges that doppel transcripts are far more abundant in rodents (79) than in humans (21), even though more human ESTs have been sequenced (2,212,467 entries vs. 1,600,223). Nothing is known of course about relative translational efficiencies. Only 3 ESTs reach coding sequence and only 1 contains a splice junction. One sees the hopelessness of adequate annotation of prion/doppel from the complete lack of the intergenic and chimeric mRNAs in the EST collections.
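The raw counts can be put on a common footing by normalizing to collection size. The sketch below assumes the 1,600,223 figure is the rodent EST total and 2,212,467 the human total, as the sentence order suggests; the function name is illustrative only.

```python
# Doppel EST hits and dbEST collection sizes quoted in the text.
# ASSUMPTION: 1,600,223 is the rodent total and 2,212,467 the human total.
HITS = {"rodent": 79, "human": 21}
TOTAL_ESTS = {"rodent": 1_600_223, "human": 2_212_467}

def hits_per_million(species: str) -> float:
    """Doppel ESTs per million ESTs sequenced for one species."""
    return HITS[species] / TOTAL_ESTS[species] * 1_000_000

rodent_rate = hits_per_million("rodent")
human_rate = hits_per_million("human")
fold_difference = rodent_rate / human_rate

print(f"rodent: {rodent_rate:.1f} per million ESTs")
print(f"human:  {human_rate:.1f} per million ESTs")
print(f"fold difference: {fold_difference:.1f}")
```

On these numbers the rodent collections carry doppel ESTs at roughly five times the human rate, which says nothing about tissue sampling or library normalization.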
Both mouse and human have alternate poly A signals (manifested as 3' end-justified EST stacks) but have little in common in their 3' UTRs. Frequencies of non-canonical signals are tabulated genome-wide in the July 2000 issue of Genome Research. The most common variants, all occurring about 16 bp before the polyadenylation attachment site itself, are:
AATAAA  58%  canonical poly A signal
CATAAA   1%  less common
ATTAAA  15%  main alternate, often internal
GATAAA   1%
TATAAA   3%
AATACA   1%
AGTAAA   3%
AATAGA   1%
AATATA   2%
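Looking for these hexamers near the 3' end of an EST stack is easy to automate. The following is a minimal sketch, not the procedure used for the annotation here: the function name, the 40-base window and the toy sequence are all assumptions made for illustration.

```python
# Hexamer frequencies as tabulated in the text (percent of poly A signals).
POLYA_SIGNALS = {
    "AATAAA": 58, "ATTAAA": 15, "TATAAA": 3, "AGTAAA": 3, "AATATA": 2,
    "CATAAA": 1, "GATAAA": 1, "AATACA": 1, "AATAGA": 1,
}

def find_polya_signals(utr: str, window: int = 40):
    """Return (distance_to_3prime_end, hexamer) for each listed signal found
    in the last `window` bases of a 3' UTR whose final base is the
    polyadenylation site. Distance is measured from the hexamer's first base."""
    seq = utr.upper().replace("U", "T")
    start = max(0, len(seq) - window)
    found = []
    for i in range(start, len(seq) - 5):
        hexamer = seq[i:i + 6]
        if hexamer in POLYA_SIGNALS:
            found.append((len(seq) - i, hexamer))
    return found

# Toy 3' end, made up for illustration (not doppel sequence).
toy = "GCGTCCGTGCAATAAAGCTGTCTGCCTTGGCACC"
print(find_polya_signals(toy))
```

Ranking hits by the tabulated frequency would be a natural extension when several candidate hexamers fall in the window.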
Given the reality of alternative polyadenylation sites, how then does one determine the true downstream end of the doppel gene? The doppel 3' UTR may sprawl out over many kilobases, or perhaps the next cluster of transcripts belongs to some other gene. Because GenBank allowed extremely inadequate annotation of millions of ESTs, it is seldom possible to infer the originating strand. Thus it is not possible to say whether these ESTs are on the same strand as doppel, which would allow them to be positively assigned to doppel (there not being room for another gene), or come from the anti-sense strand.
The basic tool for studying the transcriptome is blastn of repeat-masked genomic sequence at dbEST (plus the nr 'non-redundant' database, because of misplaced mRNAs). [Note: NCBI offers a 'filter' checkbox on its Blast server as well as retrotransposon annotation within entries, but these are less reliable than a fresh run using slow mode at the RepeatMasker site itself.]
This gives a cluster of EST stacks in the region 3-4 kbp downstream of doppel coding sequence that, from custom transcript studies reported in the Oct 99 JMB, we know belong to doppel. The problem is interpreting the next two ESTs (AA913922, 479 bp, which contains stSG53387 from 300-443 of UniGene cluster Hs.126516; and AA957982, 372 bp) that occur some 3 kbp further downstream. (Earlier, only one of these ESTs was available so it was not pursued: singlet ESTs, often artefactual, are unsuitable for analysis.) None of these ESTs belong to a known UniGene cluster.
Do these two ESTs belong to human doppel, are they artefacts of the EST process, or do they come from a new divergently transcribed gene? Lurking inappropriately in the genomic database is a better-annotated mRNA (AL137296 length 1893 bp, Unigene Hs.126516) that contains these two loose ESTs at its 3' end as perfect matches. Further, a canonical poly A signal is seen at the appropriate position 18 bp before the polyadenylated 3' end. The long mRNA is not contiguous genomically; instead it has two sections joined by a canonical GT-AG splice site: 5' ... 69 bp ... 7514 bp intron ... 1804 bp terminus 3'
Translating AL137296 into all 3 positive frames, the best candidate (later to be confirmed, after some tweaking, by GenScan) has a good open reading frame at the 5' end with standard splice donors and acceptors. This provides a gene model for the 3' end of the putative new gene. The penultimate exon is found with less certainty by GenScan, so may be smaller than full length (phase-maintaining splice acceptors shown in red, others in green). However AL137296 appears to terminate 5' at the beginning of an exon, as database transcripts often do.
New gene model: span of 43403 bp, 369 bp coding (0.8%); 4492 bp intergenic to doppel telomerically
40 bp     1753-1792    promoter followed by start of transcription and 5'UTR; RepeatMasker identifies 1597..2385 as LTR1
67 bp     2065-2131    exon 1  MRFFRYTRQEHRDTESPLSLRQ  22 aa  GenScan support only, no ESTs
14486 bp  2132-16617   intron 1 with 27 retrotransposons (6760 bp 47%) + dynein 3'UTR pseudogene
94 bp     16618-16711  exon 2  ESRHLSVLLAAVFLMFKTAFLLALKEPSFAFT  32 aa
20725 bp  16712-37436  intron 2 with 37 retrotransposons (14474 bp 70%) including (TAAAA)14 microsatellite
69 bp     37437-37505  exon 3  ..SLETAWKDWVPAAERRARRPNAG  23 aa
                       VLVFPREWDRILVEFRSQAGGKALRTPSQGLE..  potential 5'extending open reading frame, unsupported
7514 bp   37506-45019  intron 3 with 13 retrotransposons (3027 bp 40%) + 1970 bp IPP pseudogene spanning 41811-43780
1804 bp   45020-46823  exon 4  CGCLQPILPPAPRVIFRNASLIVESTFLTPFSALLFVTPLLYKR-  45 aa  45020-45155
                       196   291  +  MER5A   11-109
                       637   759  +  L2      3178-3302
                       1024  1166 -  MER63A  6-204
                       1257  1348 +  L2      3172-3267
1667 bp   3'UTR not including last genomic A (not distinguishable from poly A tail)
poly signal AATAAA  poly A 46799-46804
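The feature lengths and retrotransposon percentages in the gene model follow directly from the coordinates, so they are easy to cross-check. The sketch below does only that arithmetic; the variable names are illustrative and the coordinates are copied from the model above.

```python
# Coordinates copied from the gene model (1-based, inclusive).
FEATURES = [
    ("promoter", 1753, 1792),
    ("exon 1",   2065, 2131),
    ("intron 1", 2132, 16617),
    ("exon 2",   16618, 16711),
    ("intron 2", 16712, 37436),
    ("exon 3",   37437, 37505),
    ("intron 3", 37506, 45019),
    ("exon 4",   45020, 46823),
]
# Retrotransposon bases reported for each intron.
REPEAT_BP = {"intron 1": 6760, "intron 2": 14474, "intron 3": 3027}

def length(start: int, end: int) -> int:
    """Length of a 1-based inclusive interval."""
    return end - start + 1

lengths = {name: length(s, e) for name, s, e in FEATURES}
repeat_pct = {name: round(100 * bp / lengths[name]) for name, bp in REPEAT_BP.items()}

# From exon 1 onward, each feature should start one base after the previous one ends.
contiguous = all(
    FEATURES[i][2] + 1 == FEATURES[i + 1][1] for i in range(1, len(FEATURES) - 1)
)

for name, bp in lengths.items():
    print(f"{name:9s} {bp:6d} bp")
print(repeat_pct, "contiguous:", contiguous)
```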
GenScan with psIPP masked, exon detection sensitivity 1.0 [sensitivity 0.10 produces no further exon candidates]
1.00 Prom +  1753  1792   40
1.01 Init +  2065  2131   67  0  1
1.02 Intr + 16618 16711   94  2  1
1.03 Term + 45020 45155  136  2  1
1.04 PlyA + 46799 46804    6
>New gene 98 aa
MRFFRYTRQEHRDTESPLSLRQ ESRHLSVLLAAVFLMFKTAFLLALKEPSFAFT SCSCGCLQPILPPAPRVIFRNASLIVESTFLTPFSALLFVTPLLYKR-
DNA for coding part of exons 1-4:
atgcgatttttccggtacacaaggcaagaacaccgggatacagaaagccctctgtccttgcgacaag
aaagcaggcacctctctgtcttgcttgctgctgtattcctcatgttcaagacagccttcttactggcactgaaggaaccatcttttgcatttacg
tccctagagacagcctggaaagattgggtgccagctgcagagaggagagcaaggcgaccaaacgcagg
ctgtggctgcctgcagcccatcctgcccccagcacctagagtcatctttagaaatgcaagcctgatcgtggaatccaccttcctaacacccttcagtgccctcctcttcgttacacccttactatacaaacgatag
Translation, exon by exon:
atgcgatttttccggtacacaaggcaagaacaccgggatacagaaagccctctgtccttgcgacaag
M R F F R Y T R Q E H R D T E S P L S L R Q
aaagcaggcacctctctgtcttgcttgctgctgtattcctcatgttcaagacagccttcttactggcactgaaggaaccatcttttgcatttacg
E S R H L S V L L A A V F L M F K T A F L L A L K E P S F A F T
tccctagagacagcctggaaagattgggtgccagctgcagagaggagagcaaggcgaccaaacgcagg
S L E T A W K D W V P A A E R R A R R P N A G
ctgtggctgcctgcagcccatcctgcccccagcacctagagtcatctttaga
C G C L Q P I L P P A P R V I F R
aatgcaagcctgatcgtggaatccaccttcctaacacccttcagtgccctcctcttcgtt
N A S L I V E S T F L T P F S A L L F V
acacccttactatacaaacgatag tccatggaccagcagcatcaacataaccagggagct
T P L L Y K R -
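The exon 1 translation can be reproduced with a few lines of code. This is a minimal sketch using the standard genetic code; the `translate` helper is written here for illustration and is not taken from any tool mentioned in the text. It also shows how the single base left over at the end of exon 1 combines with the first two bases of exon 2 to give the glutamate that opens the exon 2 peptide.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(dna: str, frame: int = 0) -> str:
    """Translate complete codons from `frame`; unknown codons become X."""
    seq = dna.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(c, "X") for c in codons)

# Exon 1 coding sequence as listed in the text.
EXON1 = "atgcgatttttccggtacacaaggcaagaacaccgggatacagaaagccctctgtccttgcgacaag"

exon1_peptide = translate(EXON1)
# The trailing base of exon 1 plus the first two bases of exon 2 ("aa")
# complete the codon split across the splice junction.
junction_residue = translate(EXON1 + "aa")[-1]

print(exon1_peptide)
print(junction_residue)
```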
Amino acid sequence of new gene adjacent to doppel, 153 aa version
MRFFRYTRQEHRDTESPLSLRQ
ESRHLSVLLAAVFLMFKTAFLLALKEPSFAFT
vlvfprewdrilvefrsqaggkalrtpsqgle
SLETAWKDWVPAAERRARRPNAG
CGCLQPILPPAPRVIFRNASLIVESTFLTPFSALLFVTPLLYKR-
Is this really part of a gene, and if so, where is the rest of it? The gene prediction tool GenScan does not find any bona fide gene between doppel and KIAA0168, reporting instead a fictitious fractional IPP gene that in fact is a processed pseudogene of a master IPP gene on unfinished chr 10 contig AC022536. It turns out that the psIPP confuses GenScan because even though its introns were spliced out prior to retro-insertion, either its splice donor/acceptor sites have survived or two subsequent retrotransposon events created intron-like features, making it appear translocation-like (ie, with exons) rather than splice-processed. A second pseudogene, pure 3' UTR of dynein, gives rise to 85% quasi-matches in both transcripts and genomic sequence but these seem not to be an issue for GenScan.
After masking (replacing with N's, not deleting which confuses numbering) the IPP pseudogene, GenScan now finds the EST carboxy terminal peptide as well as a promoter, two upstream coding exons, and the expected poly A termination signal. The second of the upstream exons has singlet EST support (AI539384), though problematically not all of this EST is utilized. Still, to have a small GenScan prediction encompass a small EST in a 50 kbp stretch cannot be coincidental. Exon 3 is not found by GenScan even when set (0.10) for exaggerated exon tolerance; perhaps AL137296 is not fully splice processed. (Other gene prediction programs such as Grail might be used but we know in advance there cannot be further transcript support.) The final protein can then be brought to 153 aa.
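Masking in place, rather than deleting, keeps every downstream coordinate valid. A minimal sketch of that step follows; the helper name is hypothetical and the toy sequence stands in for the contig, where the masked interval would be the IPP pseudogene at 41811-43780.

```python
def mask_region(seq: str, start: int, end: int) -> str:
    """Replace the 1-based inclusive interval [start, end] with N's.
    The sequence length, and so all other coordinates, are unchanged."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("region lies outside the sequence")
    return seq[:start - 1] + "N" * (end - start + 1) + seq[end:]

# Toy example; on the real contig one would call mask_region(contig, 41811, 43780).
toy = "ACGTACGTACGTACGTACGT"
masked = mask_region(toy, 5, 12)
print(masked)
```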
The upstream part of the gene has its problematic aspects as well. The EST AI539384, while a singlet, is of high quality, matching genomic AL133396 continuously at 394/397 (99%) from 16409-16805; even though its annotation states high quality sequence stops at base 374, it is really bases 1-3 TTT that are off. Oddly, immediately downstream, positions 16806-17405 are annotated at GenBank as a loose CpG island. Indeed, this 600 bp region has 45 CpG or 7.5 per 100 bp (versus 55 GpC, ratio 0.81) whereas flanking DNA has only 1 CpG per 100 bp and a GpC ratio of 0.12. In the scenario where GenScan exon 1 is a retrotransposon artefact, this might be the leading exon: it is not unusual for a CpG island to be "off-center" to the 3' side.
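The CpG figures quoted here are simple dinucleotide counts, which can be reproduced for any window. The sketch below takes the ratio to be CpG count divided by GpC count, as the parenthetical suggests; the function name and toy sequence are illustrative.

```python
def dinucleotide_stats(seq: str):
    """Return (CpG count, GpC count, CpG per 100 bp, CpG/GpC ratio)."""
    s = seq.upper()
    cpg = sum(1 for i in range(len(s) - 1) if s[i:i + 2] == "CG")
    gpc = sum(1 for i in range(len(s) - 1) if s[i:i + 2] == "GC")
    per_100 = 100 * cpg / len(s) if s else 0.0
    ratio = cpg / gpc if gpc else float("inf")
    return cpg, gpc, per_100, ratio

# Toy window for illustration; the real input would be positions 16806-17405.
print(dinucleotide_stats("ACGCGTGCAA"))
```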
The problem here is that all reading frames have multiple stop codons (though one is harmlessly at the predicted exon boundary). This cDNA, which begins some 1873 bp upstream from the poly A site, may not represent mature mRNA (ie, not be fully spliced). While a coding extension 5' for exon 2 is possible, CHLAWQDSVVLIYLFVVSP, and even has an AG splice acceptor at the alanine, there is no support for this at GenScan and no suitable initiator methionine in a scenario where this is the first exon.
On balance, a new gene is the best way to account for four ESTs in a 54 kb intergenic region that include one spliced transcript and 3 beginning at a polyadenylation site. Ab initio gene prediction, at least after masking out a known pseudogene, predicts a small gene that includes all or part of these 4 ESTs and little else. GenScan is not aware of EST databases, so makes an independent prediction. Since the ORFs are short, it cannot be a coincidence that the same 300 bp of a 54,000 bp region are identified as coding by two procedures, one theoretical and one experimental. Still, serious uncertainties remain as to whether all the exons are valid and whether others have not been missed. The feature is not likely to be a pseudogene because of its exonic structure, lack of stop codons, and transcriptional expression.
A further upstream singlet cDNA of length 429 bp, AW339576, is apparently an artefact (or possibly an alternate 3' UTR end to the KIAA0168 gene). It bridges over to the KIAA0168-containing contig AL133354 (which overlaps AL133396 over its first 104 bp in opposite orientation). Only 83-346 (57%) are not masked by MER94, AT low complexity, and L1MA9 retrotransposon flanking regions, which make it unsuitable for coding; even the unmasked section has numerous stop codons in all reading frames. AA485158, another unsupported singlet human cDNA clone of length 362, is found just across the contig border, in a repeat-free zone spanning position 1295-1573 of AL133354.
The KIAA0168 gene consists of 5426 bp spanning 2024-45645 of AL133354 on the minus strand. When the first 2023 residues are reverse-complemented and butted to AL133396 and GenScan run again (ie, on everything between KIAA0168 and doppel), exactly the same new gene emerges. When all 59726 bp of AL133354 are adjoined, exons 1, 2 and 4 of the new gene are incorporated in a bogus 685 aa protein that includes 21-304 of KIAA0168 (ie, GenScan has problems finding all of the well-characterized KIAA0168 gene). Doppel is found reliably by GenScan under all circumstances. A gene-within-a-gene scenario seems improbable here but cannot be pursued as no adjoining genomic sequence is available centromeric to KIAA0168.
Upstream of KIAA0168, two further stacks of ESTs are found in masked sequence, represented by AW304777 spanning 46093-46603 and AA904658 55202-55801. Neither translates to an open reading frame, suggesting 3' UTR and a gene to the centromere terminating here. There being no coding exons, the feature is not recognized by GenScan. The large gap in genomic sequence stymies further interpretation. These ESTs could also represent unrecognized or alternative 5'UTR exons of KIAA0168 itself.
Proprietary data resources, such as the TIGR gene indexes, Celera, and InCyte, offer no further accessible new information, though potentially singlet ESTs could receive additional backing or lengthening. Mouse genome would also help but is not currently available in this region. LabonWeb (essentially collating a large number of blast searches) does a respectable job, finding the whole EST family plus a candidate protein, KIAA1011 (AB023228), at 24% identity.
This apparent gene represents a new orphaned protein family, though there is weak support with blastp over 80 residues to various isoforms of the Numb gene. It is premature to annotate protein properties at SwissProt. Spanning almost 50 kbp with 2 intronic pseudogenes, it accounts for all known ESTs and cDNAs between doppel and KIAA0168 (other than as noted above). The most reliable predicted portions are the 76 aa of the exons ESRHLSVLLAAVFLMFKTAFLLALKEPSFAFT ... CGCLQPILPPAPRVIFRNASLIVESTFLTPFSALLFVTPLLYKR-.
The bottom line overall for weakly transcribed genomic features (which are very abundant in the human genome) is that further directed experimental transcript work is required, similar to that needed to validate doppel. Indeed, we see again that genomic sequencing is best balanced with both annotation and feature confirmation effort.
18 Jun 00 webmaster research
ADRA1A: adrenergic, alpha-1A-, receptor
OMIM: 104219 .. UniGene: Hs.557 .. GeneCard .. SwissProt .. NM_000678 RefSeq (cDNA): 1860 bp
The 572 aa 7x transmembrane protein alpha-adrenergic receptor belongs to the family of G-protein coupled receptors that activate a phosphatidylinositol-calcium second messenger system. Its effect is mediated by G(q) and G(11) proteins. It has asparagine glycosylation sites at 65 and 82. The distal portion of the long arm of chromosome 5 contains a large number of genes encoding membrane receptors belonging to various gene families, including homologous adrenergic receptors ADRB2 and ADRA1B, which are however several million bases apart.
ADRA1A is part of a very large gene family (121 blastp matches) that includes receptors such as alpha1C adrenergic, beta-1-adrenergic, 5-hydroxytryptamine 1a, beta-3-adrenergic, serotonin, beta2-adrenergic, adrenergic, alpha-2c and so on, making it completely unsuitable for synteny investigations. The nomenclature of this group of genes is totally garbled, with pharmacological effect confused with phylogenetic relationships, and grossly inconsistent use of protein names. A 1991 BBRC paper mistakenly added a negative strand ORF as amino terminus to the ADRA1A gene (M76446: MAAALRSVMMAGYLSEWRTPTYRSTEMVQRLRMEAVQHSTSTAA); this was noticed by 1994 but the erroneous GenBank entry persists to this day and has permeated many secondary genomic resources.
ADRA1A should not be confused with an unrelated but similar acronym ARIA, for acetylcholine receptor-inducing activity, which was once thought to correspond to normal prion function, based on extensive copurification observed by Harris et al. PNAS 1991 Sep 1;88(17):7664-8. This would have been a most provocative result had the proteins of genomically adjacent genes copurified.
The account below apparently describes the first partial gene model for ADRA1A, as well as its flanking genes and direction of transcription. The amino terminal 86 residues (for the time being, exon 1) lie in the highly fragmented contig AL357040. Residues 87-371 are on a larger fragment of this piece and comprise exon 2. The known mRNA finishes nicely on another contig, AL121675 = dJ779E11, where coding exon 3 (201 aa, residues 367-572) is fused with 361 bp of 3'UTR. Exon 3 begins at 101921 of minus dJ779E11; the stop codon of exon 3 ends at position 101322 leaving 101,321 bp for a 3' linking probe.
Indeed, the ADRA1A gene is quickly extended in the telomeric direction to AL356414 and to a finished sequence containing the goliath gene and a ferritin light pseudogene in contig [NT_001989 = AL031670 = dJ681N20]. ADRA1A is also extendable in the centromeric direction to [AL121781*11 = HSJ1164C1], a 215,128 bp fragmented contig known to partially overlap the prion-doppel finished contig [NT_001001 = AL12191 + AL133396] of length 100167 + 148497 bp which is known to connect to the almost finished KIAA0168 contig.
This establishes the gene order as telomere -...- ferritin light -- goliath -- ADRA1A -- prion -- doppel -- KIAA0168 -..- centromere, as predicted by GenMap00. ADRA1A is transcribed towards the telomere. It is apparently the nearest 5' flanking gene to the prion-doppel complex. Some fragment ordering and assembling is feasible now, affording better gene prediction; however, in a month or two, these sequences may be finished.
The gene density is quite low, about 40% of the genome-wide average: 5 active genes in 1.18 million bp, which works out to only 15,217 genes overall. While genes can be missed, it is very unlikely that the 5-7 genes needed to bring density up to average will be found. Overlap of contigs raises observed gene density. GDB reports 179 genes so far on all of chr 20, which is 1/50 of the whole genome.
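The extrapolation to a whole-genome gene count is a one-line proportion. The text does not state the genome size used; the 3.6 Gbp value below is an assumption, back-calculated because it reproduces the quoted 15,217 from the 1,182,902 bp regional total.

```python
GENES_OBSERVED = 5
REGION_BP = 1_182_902   # regional total quoted in the contig table
GENOME_BP = 3.6e9       # ASSUMPTION: not stated in the text, back-calculated

genes_per_mbp = GENES_OBSERVED / REGION_BP * 1e6
genome_estimate = GENES_OBSERVED * GENOME_BP / REGION_BP

print(f"{genes_per_mbp:.2f} genes per Mbp")
print(f"{genome_estimate:.0f} genes genome-wide at this density")
```

A different genome size scales the estimate linearly, so the comparison with the genome-wide average matters more than the absolute figure.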
The genomic neighborhood of the prion gene is gene-poor:
AL353194   198987 bp  31 contigs; 49715, 30978, 70931 useable; 17 kbp overlap C-term goliath
NT_001989  130263 bp  goliath, ferritin light ps
AL356414   188914 bp  too fragmented to annotate*               12-JUN-2000  16 contigs
AL121675   113168 bp  ADRA1A carboxy                            08-JUN-2000  1 fragment
AL357040   155950 bp  ADRA1A amino*                             12-JUN-2000  54 fragments
AL121781   215128 bp  no gene features in large fragments psL7* 18-JUN-2000  11 pieces
NT_001001  248564 bp  prion, doppel, two ps
AL133354   130915 bp  KIAA0168                                  01-JUN-2000  2 fragments
total    1,182,902 bp (* needs adjustment to discount overlap)
Some oddities might be pursued upon the Sanger Centre finishing these contigs. First, there is some indication of a weak tandem ADRA1A pseudogene. Second, there is an astonishing 99% genomic match of AL121781 (or distal AL357040) to 75 kbp AL133403, a chromosome 6 unfinished contig, over almost the entire 215128 bp (note AC021974 provides a seemingly redundant further chr 20 match). This chr 6 match does not extend into NT_001001 nor AL121675 whether finished or unfinished. Chromosome 6 is also being sequenced by the Sanger Centre. AL133403 is 260,049 bp in 30 contigs; the largest is 48504 bp. If not some data mixup, the correspondence would have to reflect a very recent, possibly allelic, translocative duplication. Further telomerically, the newly determined gene model for attractin, ATRN, had counterparts on chr 2 and chr 10 but not chr 6.
Neuroscience Letters: 290 (2) 2000 117-120
Simon Mead, Jonathan Beck, Andrew Dickinson, Elizabeth M. C. Fisher, John Collinge
Highlights:
A novel human gene named Doppel (DPL) that has homology to the prion protein gene (PRNP) has recently been identified on chromosome 20p. By automated sequencing we have found a common (M174T, 48%) and an uncommon coding polymorphism. The polymorphic frequency of the M174T allele was examined in cases of new variant and sporadic Creutzfeldt-Jakob Disease and compared with the frequency in the normal UK population.
In sharp distinction to the M129V polymorphism of PRNP we have not found any evidence of disease association nor is there any association with age of onset, disease duration, or PrPsc strain type.
While the central role of PrP in prion pathogenesis is clear, many aspects of prion-related neurodegeneration remain obscure, notably the normal cellular function of PrP and the cause of neuronal loss. Searches for other PrP-like genes that may be involved in the pathogenesis of prion disease have, until recently, proved negative. However, a mouse gene has now been characterized which lies approximately 16kb downstream of the murine PrP locus. This gene, designated Doppel (Dpl), encodes a 179 residue protein with 19% identity and 50% amino-acid similarity to the C-terminal globular domain of PrP.
Importantly, the progressive cerebellar ataxia which has been described in some lines of Prnp knockout mice is associated with over-expression of Dpl. Conversely, other lines of Prnp knockout mice without Doppel upregulation lack this phenotype. Transgenic mice expressing N-terminally truncated PrP, leaving only the globular domain homologous to Doppel, also develop ataxia. It is possible that Doppel overexpression may be responsible for the neurodegeneration and that Doppel interacts with PrP either directly or by competing for the same receptor.
The human genomic sequence AL133396 comprises 150kb centered on PRNP and DPL. We used a general sequence analysis package called NIX (which runs a selection of bioinformatic programs on a given sequence, available at www.hgmp.mrc.ac.uk) and BLAST. The human DPL ORF was immediately identified by >70% nucleotide homology to the mouse Dpl ORF. The start of the human DPL ORF is located 25.1kb 3' to the start of the PRNP ORF. Mouse expression data has shown the presence of three normal Dpl exons, the ORF contained within the second exon.
The second mouse exon possesses a short or a long 3'UTR designated 2a and 2b. There are also two intergenic exons in the mouse, found in weakly expressed chimeric Prnp-Dpl transcripts. These chimeric transcripts provide the mechanism for over-expression of Dpl in the transgenic lines where the splice acceptor of Prnp exon 3 is deleted.
There is evidence to suggest a similar gene structure in human. ESTs have been identified for the human DPL ORF and three clusters 3' to the ORF. There is also a region of human genomic sequence 3kb 5' to the DPL ORF with 70% homology to a region around mouse exon 1 and a putative promoter region. There are a number of other regions of mouse-human homology, some of which correspond to known mouse exons and human EST clusters. Comparison of the human EST clusters identifies evidence of a common polymorphism in the human DPL 3'UTR. G at position 50624 is found in AA758081, AI288920, and AI242370 whereas A is found in AL133396, AI337054 and AI825182. There are no sequencing errors in neighbouring nucleotides.
The precise gene structure in human must be determined by expression analysis. More than 97 normal, 9 sporadic CJD and 23 new variant CJD patients were sequenced and 4 polymorphisms were identified. The first polymorphism was an ATG-ACG nucleotide change at codon 174, leading to a methionine to threonine substitution (M174T). This polymorphism lies three residues from the C-terminus of DPL and so would be unlikely to affect Doppel protein structure. The second polymorphism is a non-coding C-T change and lies 38 bases 3' to the first. The two polymorphisms are, unsurprisingly, in tight linkage disequilibrium: the 174T allele has only been seen in association with a C at the 3'UTR site. Thirdly, a non-coding polymorphism was identified in codon 174 at the third position (ACG-ACA) in 3/104 normal individuals.
Finally, an uncommon ACG-ATG (T26M) polymorphism at codon 26 was found (figure 2c). The methionine allele occurred in 3/98 normal individuals. This change occurs in a cluster of hydrophobic residues that follow a possible signal peptide at the N-terminus. Although the second nucleotide in this codon is also polymorphic (M174T) the three normal individuals were homozygous at this position (ACA-ACG).
The M174T polymorphism alters an Nla III restriction site allowing rapid genotyping. We typed 79 normal individuals from the CEPH family collection and 96 normal UK individuals. These were compared with 41 patients with nvCJD and 76 with sporadic CJD. Allele frequencies are given in the table. There was no difference in the M174T genotype frequency between nvCJD, sporadic CJD and normal individuals. Neither was there any association between age at onset, or duration of disease in sporadic CJD or new variant CJD and M174T genotype.
CJD can be classified into four molecular strain types by analysis of protease digested PrPSc. Also there was no obvious association between PrPSc type and this polymorphism. The M allele of T26M was found in 2/35 nvCJD cases and 0/9 sporadic CJD cases. The ACG-ACA change at codon 174 was found in 1/23 nvCJD and 0/9 sporadic CJD cases. PRNP M129V allele status was known in 137/174 normal individuals we screened. The frequency distribution of the four haplotypes comprising PRNP M129V and DPL M174T (MM, MT, VM, VT) is consistent with random segregation between the two loci.
We have not found evidence that polymorphism of the Doppel gene is associated with susceptibility to or phenotypic expression of new variant or sporadic CJD. Our findings are similar to those recently reported, although the P56L polymorphism was not seen in our prion disease or normal samples. It remains possible that there is a lesser effect that we did not have the power to detect, particularly with the necessarily small number of variant CJD cases available for study. It is also possible that undetected polymorphisms in non-coding sequences involved in Doppel gene regulation, and not in linkage disequilibrium with M174T, may play a role in disease susceptibility. A more extensive study of marker haplotypes across the whole genomic region and quantitative expression data will be necessary to address this.
Reference List
1. Moore RC, Lee IY, Silverman GL, Harrison PM, Strome R, Heinrich C et al. Ataxia in Prion Protein (PrP)-deficient Mice is Associated with Upregulation of the Novel PrP-like Protein Doppel. Journal of Molecular Biology 1999;292: 797-817.
2. Sakaguchi S, Katamine S, Nishida N, Moriuchi R, Shigematsu K, Sugimoto T et al. Loss of cerebellar Purkinje cells in aged mice homozygous for a disrupted PrP gene. Nature 1996;380: 528-31.
3. Bueler H, Fischer M, Lang Y, Bluethmann H, Lipp H-P, DeArmond SJ et al. Normal development and behaviour of mice lacking the neuronal cell-surface PrP protein. Nature 1992;356: 577-82.
4. Manson JC, Clarke AR, Hooper ML, Aitchison L, McConnell I, Hope J. 129/Ola mice carrying a null mutation in PrP that abolishes mRNA production are developmentally normal. Mol Neurobiol 1994;8: 121-7.
5. Shmerling D, Hegyi I, Fischer M, Blättler T, Brandner S, Götz J et al. Expression of amino-terminally truncated PrP in the mouse leading to ataxia and specific cerebellar lesions. Cell 1998;93: 203-14.
10. Peoc'h K, Guerin C, Brandel J-P, Launay J-M, Laplanche J-L. First report of polymorphisms in the prion-like protein gene (PRND): implications for human prion diseases. Neuroscience Letters 2000;286: 144-148.
Arch Neurol 2000 Jul;57(7):1058-1063 unobstructed full text
Collins S, Boyd A, Fletcher A, Byron K, Harper C, McLean CA, Masters CL
... An 82-year-old woman with pathologically confirmed CJD was found unexpectedly to harbor a novel mutation in PRNP.
Routine clinical investigations were undertaken to elucidate the cause of the rapidly progressive dementia and neurological decline manifested by the patient, including magnetic resonance imaging of the brain, electroencephalography, and cerebrospinal fluid analysis for the 14-3-3 beta protein. Standard postmortem neuropathological examination of the brain was performed, including immunocytochemistry of representative sections to detect the prion protein. Posthumous genetic analysis of the open reading frame of PRNP was performed on frozen brain tissue using polymerase chain reaction and direct sequencing.
Immunostaining for PrP was consistently negative in all brain regions, despite the use of 3 antibodies recognizing different regions of the protein, even though simultaneous control sections always stained positive with all 3 antibodies.
Concomitant with the exclusion of alternative diagnoses, the presence of characteristic periodic sharp-wave complexes on the electroencephalogram in combination with a positive result for 14-3-3 beta protein in the cerebrospinal fluid led to a confident clinical diagnosis of CJD, confirmed at autopsy. There was no family history of dementia or similar neurological illness, but patrilineal medical information was incomplete. Unexpectedly, full sequencing of the PRNP open reading frame revealed a single novel mutation consisting of an adenine-to-guanine substitution at nucleotide 611, causing alanine to replace threonine at codon 188. [60 controls lacked this change. The patient was admitted to the hospital with confusion following a fall at home, yet another apparent trauma-associated onset.--webmaster]
In addition to expanding the range of PRNP mutations associated with human prion diseases, we believe this case is important for the following reasons. First, from an epidemiological perspective, the avoidance of occasional incorrect classification of patients manifesting neurodegenerative disorders that may have a genetic basis requires systematic genotyping, particularly when there are uncertainties regarding the family history. Second, the incidence of spongiform encephalopathy in elderly patients beyond the typical age range may be underestimated and does not preclude a genetic basis. Finally, as a corollary, this case highlights problematic issues in human transmissible spongiform encephalopathies, as illustrated by disease penetrance and age of onset in genotype-phenotype correlations. ...
Unfortunately, because of issues relating to the patient's illegitimate birth, no information was available concerning the medical history of the father or his family, and a protective code of secrecy was adopted by informed matrilineal relatives so that direct sourcing of information from the other surviving maternal relatives was not possible.
Working within this constraint, the only history of neurological disease in the patient's mother was sudden death from a stroke at age 75 years (verified on death certificate). Among the 11 maternal half-siblings of the proband, 7 were dead, with the family's understanding of cause of death as "heart" or "stroke" in 5 (ages at death ranged from 48 to 67 years), motor neurone disease in 1 (died at age 65 years), and a motor vehicle accident in 1 (died at age 50 years). The 4 living siblings (ages ranged from 58 to 68 years) were all well. The index case had 3 children, aged 52, 60, and 65 years, who were alive and well. At the time of this report, the proband's children all declined PRNP genotyping, and because of the code of secrecy adopted by the informed family members, none of the other relatives was offered screening for the T188A mutation.
Comment (webmaster): With a single case in an 82 year old and no family history, the case could not be made for familial CJD (recall N171S). Such a late onset means that even had history on the father's side been better known, little would be learned as few survived into their 80's in the last century. Of course no diagnosis could have been made until the 1920's (and even that is far-fetched). Note that the proband's children declined genotyping and that the large matrilineal family was never offered it. It is unlikely that anyone would ever pursue in vitro conversion efficiency; perhaps an independent kindred, or 100 year old controls, will someday show up.
They overlooked earlier work establishing familial CJD in T188R and T188K, by Windl (1999) Hum Gen 105.244 and Finckh U (2000) Am J Hum Genet 66:110. [See the standard compilation of point mutations.]
As noted on this site 6 months ago, codon 188 is already a little bizarre. Of 12 CpG mutations in the CJD table, 10 have standard CpG outcomes (C to T or G to A). But at codon 188, two distinct changes are seen (now three), none canonical. The usual CpG outcomes give ATG met or ACA thr [silent], neither observed. (T188M could be neutral and so undetected.) Possibly local DNA secondary structure is affecting the mutational spectrum. Codon 188 is both a CpG site and a hotspot for CJD but the two concepts don't line up. The deamination of methylated cytosine may be repaired oddly. There is no other codon with 3 amino acid variants though P105 and Q212 have two.
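The codon argument above can be checked mechanically. The following is a minimal sketch, assuming the wild-type codon 188 is ACG (inferred from the two CpG outcomes ATG and ACA named in the text, not stated outright). It enumerates all nine single-base changes and marks which are the canonical CpG outcomes and which give the reported amino acids R, K and A. The nucleotide changes underlying T188R, T188K and T188A are not given here, so the code only shows which single substitutions could produce them.

```python
# Standard genetic code, indexed in the conventional TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Purine<->purine and pyrimidine<->pyrimidine changes are transitions.
TRANSITIONS = {"A": "G", "G": "A", "C": "T", "T": "C"}


def single_base_changes(codon):
    """Return every single-nucleotide variant of a codon as
    (position, new_codon, amino_acid, kind) tuples."""
    variants = []
    for pos, old in enumerate(codon):
        for new in BASES:
            if new == old:
                continue
            mutant = codon[:pos] + new + codon[pos + 1:]
            kind = "transition" if TRANSITIONS[old] == new else "transversion"
            variants.append((pos + 1, mutant, CODON_TABLE[mutant], kind))
    return variants


WILD_TYPE = "ACG"            # ASSUMED codon 188 (Thr), inferred from the text
CANONICAL = {"ATG", "ACA"}   # the two usual CpG outcomes named in the text
OBSERVED = {"R", "K", "A"}   # amino acids of T188R, T188K, T188A

for pos, mutant, aa, kind in single_base_changes(WILD_TYPE):
    tags = []
    if mutant in CANONICAL:
        tags.append("canonical CpG outcome")
    if aa in OBSERVED:
        tags.append("amino acid reported in CJD")
    print(pos, mutant, aa, kind, "; ".join(tags))
```

Under that assumption, arginine (AGG) and lysine (AAG) would both require a transversion at the C of the CpG, and alanine (GCG) a transition at the first base, which fits the remark that none of the three is a canonical CpG outcome.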
T188 occurs in helix B. It is a conserved element in mammalian prions including marsupial but not alignable to doppel or birds.
In summary, the preferred scenario has a CpG site spewing off alternatives at hotspot rates that are slightly off due to secondary structure. While not mission-critical over the longest time frames, its absolute conservation within mammals implies that any change is destabilizing. This fact, a better kindred story for causation in T188R and T188K, and confirmation of CJD in the single T188A case favor the interpretation of T188A as CJD-causative. The later onset then is due to similarly-sized alanine being a milder substitution for polar threonine than the charged residues arginine and lysine.
Finally, the authors should be applauded for these conclusions concerning ascertainment:
"patients manifesting neurodegenerative disorders that may have a genetic basis require systematic genotyping, particularly when there are uncertainties regarding the family history. Second, the incidence of spongiform encephalopathy in elderly patients beyond the typical age range may be underestimated and does not preclude a genetic basis. Finally, as a corollary, this case highlights problematic issues in human transmissible spongiform encephalopathies, as illustrated by disease penetrance and age of onset in genotype-phenotype correlations. "
12 Aug 00 webmaster research
It is feasible to look systematically at microsatellites and low-complexity regions in the human prion-doppel region (as defined by Repeatmasker: pattern of 1 to 6 bp, repeated enough to give 20+ bp). There are 305 of them in the 600,000 bp of completed human sequence flanking the prion gene. These are provided as a comma-delimited text file suitable for importing into a spreadsheet.
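The working definition above (a unit of 1 to 6 bp repeated to a total of 20 bp or more) can be illustrated with a short scanner. This is only a sketch of the definition, not RepeatMasker's algorithm: it finds perfect, whole-copy repeats only, whereas RepeatMasker tolerates imperfections and reports fractional copy numbers. The function names and the toy sequence are made up for illustration.

```python
import re
from math import ceil


def is_primitive(unit):
    """True if the unit is not itself a repetition of a shorter motif
    (so 'AA' or 'TATA' are rejected in favour of 'A' or 'TA')."""
    n = len(unit)
    return not any(n % k == 0 and unit == unit[:k] * (n // k) for k in range(1, n))


def find_simple_repeats(seq, max_unit=6, min_len=20):
    """Return (start, end, unit, copies) for perfect tandem repeats.

    Coordinates are 1-based and inclusive, as in the data tables on this page.
    """
    seq = seq.upper()
    hits = []
    for k in range(1, max_unit + 1):
        min_copies = ceil(min_len / k)
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_copies - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            if is_primitive(unit):
                copies = (m.end() - m.start()) // k
                hits.append((m.start() + 1, m.end(), unit, copies))
    return sorted(hits)


# Toy sequence (hypothetical): a 31-copy (A)n run and a 6-copy (TATG)n run.
toy = "ACGT" + "A" * 31 + "CCGTAGCTAC" + "TATG" * 6 + "GATTACA"
for hit in find_simple_repeats(toy):
    print(hit)
```

On the toy sequence this reports the (A)n run at positions 5-35 and the (TATG)n run at positions 46-69; shorter tracts such as the CC dinucleotide fall below the 20 bp threshold and are ignored.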
What is interesting is that 3 separate humans have been sequenced in the prion region with some overlap (not counting a 4th at Celera). Sequencing error is unlikely due to multiple passes taken by experienced labs. By comparing these genomic sequences, it turns out that 3 di-allelic and 1 tri-allelic polymorphic microsatellite sites have already been found:
1. (A)n between prion and doppel: 31 copies in U29185; 33 in AL133396; 34 in AF106918. [U29185: positions 35416-35446]
2. (TATG)n just prior to doppel CDS: 39.8 copies in AF106918; 41.8 in AL133396. [AF106918: positions 20366-20526]
3. (T)n post-doppel: 31 in AF106918, 32 copies in AL133396. [AF106918: positions 26728-26758]
4. A low complexity GC-rich site of length 54 in U29185, positions 13166-13219 but of length 55 in AL133396.
In other words, these give some very close-in markers (low recombination, lots of variation) useful in tracking distant kindreds with various types of familial CJD. It might also be worthwhile to see whether nvCJD victims have a common set of markers as well. Note however that microsatellite mutation rates are so high that changes can occur even within a fairly recent kindred; SNPs might be preferred today, though many of these will be of the rapid CpG type (which occurs but once in 350 microsatellites here).
Since many amyloidoses are triplet-expansion diseases, either within exons or introns, it is worth looking at the potential for these in microsatellites. However, neither the CAG repeats of Huntington's disease nor the GAA/TTC repeats of Friedreich's ataxia occur:
(ATG)n 3 (ATG)n 3 (CAA)n 3 (CAA)n 3 (CAA)n 3 (CAT)n 3 (CCA)n 3 (GGA)n 3 (GGA)n 3 (TCC)n 3
Now the prion octapeptide repeat region, at 24 bp, is not a microsatellite. However, within it, runs of 3-4 glycines (codons GGx) naturally give rise to small runs of 3-5 G's and GGTGGT. While these are too short to be enumerated as microsatellites and phase-changing length mutations are excluded, nonetheless this is a very frequent site of length variation within mammalian repeats.
cctcagggcggtggtggctgggggcagcctcatggtggtggctgggggcagcctcatggtggtggctgggggcagccccatggtggtggctggggacagcctcatggtggtggctgg
P Q G G G G W G Q P H G G G W G Q P H G G G W G Q P H G G G W G Q P H G G G W G Q
It is not totally trivial to align long genomic sequences with Blast or ClustalW. This is because so many retrotransposons occur that false alignments can be overwhelming. On the other hand, they can hardly be filtered out when simple repeats are the very object of inquiry. The comma-delimited mini-databases below align homologous microsatellites if placed side-by-side:
U29185,1,337,380,+,AT_rich,.,44,Low_complexity,1,44
U29185,2,1924,1951,+,AT_rich,.,28,Low_complexity,1,28
U29185,3,6961,6984,+,AT_rich,.,24,Low_complexity,1,24
U29185,4,7267,7290,+,AT_rich,.,24,Low_complexity,1,24
U29185,1,7533,7558,+,(GA)n,2,13,Simple_repeat,2,27
U29185,5,12565,12586,+,GC_rich,.,22,Low_complexity,1,22
U29185,6,13166,13219,+,GC_rich,.,54,Low_complexity,1,54
U29185,7,13489,13536,+,G-rich,.,48,Low_complexity,4,51
U29185,8,21385,21411,+,AT_rich,.,27,Low_complexity,1,27
U29185,9,24833,24858,+,AT_rich,.,26,Low_complexity,1,26
U29185,10,27483,27513,+,AT_rich,.,31,Low_complexity,1,31
U29185,2,28227,28255,+,(TTTTG)n,5,6.2,Simple_repeat,1,31
U29185,11,30028,30048,+,AT_rich,,21,Low_complexity,1,21
U29185,3,33600,33631,+,(CAAAA)n,5,6.4,Simple_repeat,1,32
U29185,4,35416,35446,+,(A)n,1,31,Simple_repeat,1,31

aligns with (reverse complementarity numbering):

AL133396 ,44476,44519,+,AT_rich,,44,Low_complexity,1,44
AL133396 ,46063,46090,+,AT_rich,,28,Low_complexity,1,28
AL133396 ,51100,51123,+,AT_rich,,24,Low_complexity,1,24
AL133396 ,51406,51429,+,AT_rich,,24,Low_complexity,1,24
AL133396 ,51672,51697,+,(GA)n,2,13,Simple_repeat,2,27
AL133396 ,56703,56724,+,GC_rich,,22,Low_complexity,1,22
AL133396 ,57305,57359,dif,GC_rich,,55,Low_complexity,1,55
AL133396 ,57629,57676,+,G-rich,,48,Low_complexity,4,51
AL133396 ,65525,65551,+,AT_rich,,27,Low_complexity,1,27
AL133396 ,68973,68998,+,AT_rich,,26,Low_complexity,1,26
AL133396 ,71647,71677,+,AT_rich,,31,Low_complexity,1,31
AL133396 ,72391,72419,+,(TTTTG)n,5,6.2,Simple_repeat,1,31
AL133396 ,74192,74212,+,AT_rich,,21,Low_complexity,1,21
AL133396 ,77764,77795,+,(CAAAA)n,5,6.4,Simple_repeat,1,32
AL133396 ,79580,79612,dif,(A)n,1,33,Simple_repeat,1,33

and the last 3 also align with:

AF106918,12,228,248,+,AT_rich,.,21,Low_complexity,1,21
AF106918,5,3800,3831,+,(CAAAA)n,5,6.4,Simple_repeat,1,32
AF106918,6,5616,5649,dif,(A)n,1,34,Simple_repeat,1,34

-=-=-=-=-=-=- from intergenic to doppel:

AF106918,7,7368,7391,+,(T)n,1,24,Simple_repeat,1,24
AF106918,13,8358,8399,+,T-rich,.,42,Low_complexity,4,45
AF106918,8,8369,8403,+,(TTTA)n,4,8.8,Simple_repeat,2,36
AF106918,14,17856,17880,+,GC_rich,.,25,Low_complexity,1,25
AF106918,9,20366,20526,+,(TATG)n,4,39.8,Simple_repeat,2,160
AF106918,10,20527,20549,+,(CA)n,2,11.5,Simple_repeat,2,24
AF106918,11,22548,22573,+,(T)n,1,26,Simple_repeat,1,26
AF106918,12,24213,24234,+,(T)n,1,22,Simple_repeat,1,22
AF106918,13,26728,26758,+,(T)n,1,31,Simple_repeat,1,31
AF106918,15,31090,31110,+,AT_rich,.,21,Low_complexity,1,21

aligns with:

AL133396,81332,81355,+,(T)n,1,24,Simple_repeat,1,24
AL133396,82322,82363,+,T-rich,,42,Low_complexity,4,45
AL133396,82333,82367,+,(TTTA)n,4,8.8,Simple_repeat,2,36
AL133396,91820,91844,+,GC_rich,,25,Low_complexity,1,25
AL133396,94329,94498,dif,(TATG)n,4,41.8,Simple_repeat,2,168
AL133396,94499,94521,+,(CA)n,2,11.5,Simple_repeat,2,24
AL133396,96245,96271,?,(T)n,1,27,Simple_repeat,1,27
AL133396,?,?,?,?,?,?,?,?,?
AL133396,100690,100721,dif,(T)n,1,32,Simple_repeat,1,32
AL133396,105053,105073,+,AT_rich,,21,Low_complexity,1,21
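The side-by-side comparison above can also be done programmatically rather than in a spreadsheet. The sketch below parses a few of the records listed above, pairs them in order, and reports the coordinate offset between accessions and any difference in repeat length. The field interpretation (accession, optional running index, start, end, flag, repeat type) is an assumption read off the rows themselves, and the function names are illustrative only.

```python
def parse_row(line):
    """Parse one comma-delimited record into a dict.

    Rows appear in two layouts: 11 fields (a running index follows the
    accession) or 10 fields (no index). The index is dropped if present.
    """
    fields = [f.strip() for f in line.split(",")]
    if len(fields) == 11:
        del fields[1]
    if len(fields) != 10:
        raise ValueError("unexpected field count: %r" % line)
    return {
        "acc": fields[0],
        "start": int(fields[1]),
        "end": int(fields[2]),
        "type": fields[4],
    }


def compare(rows_a, rows_b):
    """Pair records in order; return (type, coordinate offset, length difference)."""
    result = []
    for a, b in zip(rows_a, rows_b):
        if a["type"] != b["type"]:
            raise ValueError("rows out of register: %s vs %s" % (a["type"], b["type"]))
        len_a = a["end"] - a["start"] + 1
        len_b = b["end"] - b["start"] + 1
        result.append((a["type"], b["start"] - a["start"], len_b - len_a))
    return result


# Four homologous pairs copied from the tables on this page.
u29185 = [parse_row(r) for r in (
    "U29185,6,13166,13219,+,GC_rich,.,54,Low_complexity,1,54",
    "U29185,7,13489,13536,+,G-rich,.,48,Low_complexity,4,51",
    "U29185,3,33600,33631,+,(CAAAA)n,5,6.4,Simple_repeat,1,32",
    "U29185,4,35416,35446,+,(A)n,1,31,Simple_repeat,1,31",
)]
al133396 = [parse_row(r) for r in (
    "AL133396 ,57305,57359,dif,GC_rich,,55,Low_complexity,1,55",
    "AL133396 ,57629,57676,+,G-rich,,48,Low_complexity,4,51",
    "AL133396 ,77764,77795,+,(CAAAA)n,5,6.4,Simple_repeat,1,32",
    "AL133396 ,79580,79612,dif,(A)n,1,33,Simple_repeat,1,33",
)]

for rtype, offset, delta in compare(u29185, al133396):
    flag = "dif" if delta else "+"
    print(rtype, offset, delta, flag)
```

A nonzero length difference reproduces the rows flagged "dif", and a change in offset between successive rows indicates a length difference somewhere in the intervening sequence (the offset steps from 44139 to 44140 immediately after the 54/55 bp GC-rich site).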
|Type||Freq|
|AT_rich||113|
|(T)n||32|
|(A)n||23|
|(CA)n||17|
|(TG)n||17|
|GA-rich||16|
|(TA)n||11|
|T-rich||9|
|GC_rich||8|
|(CAAA)n||6|
|CT-rich||6|
|(TTTA)n||5|
|(TTTTG)n||5|
|(TTTTTG)n||5|
|(TAAA)n||4|
|A-rich||4|
|(CAA)n||3|
|(CAAAA)n||3|
|(CAAAAA)n||3|
|(CTGGGG)n||3|
|(TATG)n||3|
|(TTCC)n||3|
|(TTTC)n||3|
|(TTTG)n||3|
|(TTTTA)n||3|
|(TTTTC)n||3|
|(ATG)n||2|
|(GA)n||2|
|(GCTTG)n||2|
|(GGA)n||2|
|(TCTG)n||2|
|(TTCA)n||2|
|(TTGG)n||2|
|G-rich||2|
|(AACTG)n||1|
|(AGGGGG)n||1|
|(CAAAT)n||1|
|(CAGTA)n||1|
|(CAT)n||1|
|(CATATA)n||1|
|(CCA)n||1|
|(CCCCG)n||1|
|(G)n||1|
|(GAAA)n||1|
|(GAAAA)n||1|
|(GGGA)n||1|
|(GGGGA)n||1|
|(TAAAA)n||1|
|(TAAAAA)n||1|
|(TAGA)n||1|
|(TATAA)n||1|
|(TC)n||1|
|(TCC)n||1|
|(TTAAA)n||1|
|C-rich||1|
|polypyrimidine||1|
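Whether any trinucleotide unit in this tally is really a disease-type repeat in disguise depends on strand and phase: (GAA)n, (AAG)n and (TTC)n all describe the same tract. A minimal sketch of that check follows, reducing each unit to a canonical representative over all rotations of the unit and of its reverse complement. The function name and the choice of the alphabetically smallest form as representative are arbitrary conventions chosen for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def motif_class(unit):
    """Canonical form of a repeat unit: the alphabetically smallest string
    among all rotations of the unit and of its reverse complement."""
    unit = unit.upper()
    revcomp = unit.translate(COMPLEMENT)[::-1]
    forms = {s[i:] + s[:i] for s in (unit, revcomp) for i in range(len(s))}
    return min(forms)


# Trinucleotide units tallied for the prion-doppel region on this page.
OBSERVED = ["ATG", "CAA", "CAT", "CCA", "GGA", "TCC"]
# Expansion-disease units named in the text.
DISEASE = {"CAG": "Huntington's disease", "GAA": "Friedreich's ataxia"}

observed_classes = {motif_class(u) for u in OBSERVED}
for unit, disease in DISEASE.items():
    found = motif_class(unit) in observed_classes
    print(unit, disease, "present" if found else "absent")
```

Both disease units come out absent. The same reduction shows that (ATG)n and (CAT)n are one class seen on opposite strands, as are (GGA)n and (TCC)n, so the six listed units collapse to four distinct repeat classes.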
Bovine sequence AB001468 has 4.8 copies of a single microsatellite, (CAAAAA)n upstream of its prion gene. However this has no counterpart in human. Sheep sequence U67922 has 4 copies of (AACTG)n and 14.5 copies of (CA)n, both upstream. Here too there is no useful comparison across species. Microsatellites, by their nature, are too subject to change to survive over long evolutionary time frames.
12 Aug 00 webmaster compilation
See also Unigene Navigator for full set of 432 Blast Unigene or search Unigene
|ADA||Hs. 1217||adenosine deaminase||GDB:119649|
|ADRA1A||Hs. 557||adrenergic, alpha-1A-, receptor||..|
|AHCY||Hs. 85111||S-adenosylhomocysteine hydrolase||GDB:118983|
|BMP2||Hs. 73853||BMP2A||bone morphogenetic protein 2|
|BMP7||Hs. 51117||OP-1||bone morphogenetic protein 7|
|CENPB||Hs. 85004||centromere protein B (80kD)||GDB:118768|
|CHGB||Hs. 2281||chromogranin B (secretogranin 1)||...|
|CHRNA4||Hs. 10734||cholinergic receptor nicotinic alpha 4||neuronal nicotinic acetylcholine receptor a-4|
|COL9A3||Hs. 53563||collagen, type IX, alpha 3||GDB:625400|
|CSNK2A1||Hs. 12740||casein kinase 2, alpha 1 polypeptide||GDB:129560|
|CST1||Hs. 36995||cystatin SN||GDB:119815|
|CST3||Hs. 75780||cystatin C (amyloid angiopathy)||GDB:119817|
|CST4||Hs. 56319||cystatin S||GDB:136381|
|EDN3||Hs. 1408||ET3||endothelin 3|
|EEF1A2||Hs. 2642||translation elongation 1 alpha 2||...|
|FKBP1A||Hs. 752||FKBP1||FK506-binding protein 1A (12kD)|
|GNAS1||Hs. 77288||GNAS||guanine nucleotide binding, alpha stim 1|
|GSS||Hs. 82327||glutathione synthetase||GDB:637022|
|HCK||Hs. 89555||JTK9||hemopoietic cell kinase|
|ID1||Hs. 75424||inhibitor of DNA binding 1, dom neg hlh||GDB:434745|
|INSM1||Hs. 89584||IA-1||insulinoma-associated 1|
|OXT||Hs. 1843||OT||oxytocin, prepro-neurophysin I|
|PCK1||Hs. 1872||phosphoenolpyruvate carboxykinase 1||GDB:125349|
|PCNA||Hs. 78996||proliferating cell nuclear antigen||GDB:120261|
|PLCB4||Hs. 74014||phospholipase C, beta 4||GDB:547787|
|PLCG1||Hs. 993||PLC1||phospholipase C, gamma 1|
|PPGB||Hs. 985||GSL||protective for beta-galactosidase|
|PYGB||Hs. 75658||phosphorylase, glycogen; brain||GDB:120326|
|RBL1||Hs. 87||retinoblastoma-like 1 (p107)||GDB:226502|
|RPN2||Hs. 75722||ribophorin II||GDB:119571|
|RPS21||Hs. 1948||ribosomal protein S21||...|
|SDC4||Hs. 72082||syndecan 4 (amphiglycan, ryudocan)||SYND4|
|SEMG1||Hs. 1968||SEMG||semenogelin I|
|SEMG2||Hs. 1981||semenogelin II||GDB:132657|
|SNRPB||Hs. 83753||SNRPB1||small nuclear ribonucleoprotein B B1|
|SNT1||Hs. 31121||syntrophin, alpha dystrophin A1, 59k||GDB:433889|
|TGM3||Hs. 2022||transglutaminase 3 E polypeptide||GDB:128014|
|TNNC2||Hs. 94377||troponin C2, fast||...|
|TOP1||Hs. 317||topoisomerase (DNA) I||GDB:120444|
|YWHAB||Hs.||tyrosine 3-monooxygenase/ beta||GDB:217081|
|ZNF133||Hs. 78434||zinc finger protein 133||GDB:137032|
|ZNF8||Hs. 99971||zinc finger protein 8||Hs. 2077|
10 Aug 00 Medline
Comment (webmaster): Friedreich's ataxia, the most common inherited ataxia, is a pseudo-polyglutamine disorder often due to a trinucleotide (GAA) expanded repeat within an intronic Alu. It is an autosomal recessive neurodegenerative disease, with an estimated prevalence of approximately 1 in 50,000 and a deduced carrier rate of higher than 1 in 110 in European populations. [Analysis of 600,000 bp of genomic flanking sequence shows no GAA/TTC microsatellites with potential for expansion.]
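The step from prevalence to carrier rate in the comment above is a Hardy-Weinberg calculation. The sketch below assumes random mating, a single pooled disease allele class, and that prevalence equals the homozygote frequency; these are simplifying assumptions, not claims from the cited work, and the function name is illustrative.

```python
from math import sqrt


def carrier_frequency(prevalence):
    """Heterozygote (carrier) frequency for a recessive disease under
    Hardy-Weinberg assumptions, where prevalence = q**2."""
    q = sqrt(prevalence)          # frequency of the disease allele
    return 2 * q * (1 - q)        # 2pq


carriers = carrier_frequency(1 / 50_000)
print("carrier rate: about 1 in %.0f" % (1 / carriers))
```

With a prevalence of 1 in 50,000 this gives a carrier rate of roughly 1 in 112, in the neighbourhood of the quoted figure.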
Like many proteins involved in neurodegenerative disorders, the normal function of frataxin has been hard to pin down. Like the prion protein, frataxin binds a transition metal, has a novel (thus unhelpful) 3D fold, has numerous characterized point mutations, and has been sequenced in a great variety of species.
Unlike the prion protein, the frataxin orthologue can be readily identified in worm, fly, yeast, and even bacteria, with some residues remaining invariant for billions of years. While indicating some very basic and quasi-universal role in metabolism, that role in human could not be inferred because the protein function was equally not understood in these model organisms. The lesson here is that an immense amount may be known yet normal function is still very difficult to get at. Yet this is essential in many therapeutic scenarios.
This may or may not hold for the prion protein. Fish prion sequence, to be determined shortly, is the gateway to the model organisms. The prion protein, despite its above-average conservation, has amino acid compositional properties in its very best conserved regions that make homology searches difficult. The protein is obviously quite ancient, yet it remains orphaned within the amniotes. Will the normal function of the prion/doppel orthologs be already known in model organisms, or will it turn out to be another frataxin?
PNAS 2000 Aug 1;97(16):8932-7 Cho SJ, Lee MG, Yang JK, Lee JY, Song HK, Suh SW
Friedreich ataxia is an autosomal recessive neurodegenerative disease caused by defects in the FRDA gene, which encodes a mitochondrial protein called frataxin. Frataxin is evolutionarily conserved, with homologs identified in mammals, worms, yeast, and bacteria. The CyaY proteins of gamma-purple bacteria are believed to be closely related to the ancestor of frataxin.
In this study, we have determined the crystal structure of the CyaY protein from Escherichia coli at 1.4-A resolution. It reveals a protein fold consisting of a six-stranded antiparallel beta-sheet flanked on one side by two alpha-helices. This fold is likely to be shared by all members of the conserved frataxin family. This study also provides a framework for the interpretation of disease-associated mutations in frataxin and for understanding the possible functions of this protein family.
Frataxin shows a remarkable evolutionary conservation, with homologs present in mammals, Caenorhabditis elegans, yeast, and Gram-negative bacteria. Significant similarity between the C-terminal portion of frataxin (residues 90-210) and the CyaY proteins of gamma-purple bacteria implies that the FRDA gene evolved from a CyaY gene of the mitochondrial ancestor. Disruption of the FRDA homolog in yeast, Yfh1, resulted in accumulation of iron in mitochondria and deficiency in Fe-S dependent respiratory enzymes and aconitase. The neurodegeneration observed in FRDA is believed to be the result of mitochondrial iron accumulation and oxidative stress. It has been also observed that the frataxin family bears limited sequence homology to a bacterial protein family, which confers resistance to tellurium.
Figure 1 provides sequence alignment of 10 members of frataxin family from Escherichia coli, Yersinia pestis, Yersinia intermedia, Erwinia chrysanthemi, Haemophilus influenzae, Rickettsia prowazekii, Saccharomyces cerevisiae, Schizosaccharomyces pombe, mouse, and human.
Nucleic Acids Res 2000 Jul 15;28(14):2815-2822 Grabczyk E, Usdin K
Large expansions of the trinucleotide repeat GAA*TTC within the first intron of the X25 (frataxin) gene cause Friedreich's ataxia, the most common inherited ataxia. Expansion leads to reduced levels of frataxin mRNA in affected individuals. Here we show that GAA*TTC tracts, in the absence of any other frataxin gene sequences, can reduce the amount of GAA-containing transcript produced in a defined in vitro transcription system.
This effect is due to an impediment to elongation that forms in the GAA*TTC tract during transcription, a phenomenon that is exacerbated by both superhelical stress and increased tract length. On supercoiled templates the major truncations of the GAA-containing transcripts occur in the distal (3') end of the GAA repeat.
To account for these observations we present a model in which an RNA polymerase advancing within a long GAA*TTC tract initiates the transient formation of an R*R*Y intramolecular DNA triplex. The non-template (GAA) strand folds back creating a loop in the template strand, and the polymerase is paused at the distal triplex-duplex junction.
Am J Hum Genet 2000 Sep;67(3):549-562 Adamec J, Rusnak F, Owen WG, Naylor S, Benson LM, Gacy AM, Isaya G
Frataxin deficiency is the primary cause of Friedreich ataxia (FRDA), an autosomal recessive cardiodegenerative and neurodegenerative disease. Frataxin is a nuclear-encoded mitochondrial protein that is widely conserved among eukaryotes.
Genetic inactivation of the yeast frataxin homologue results in mitochondrial iron accumulation and hypersensitivity to oxidative stress. Increased iron deposition and evidence of oxidative damage have also been observed in cardiac tissue and cultured fibroblasts from patients with FRDA. These findings indicate that frataxin is essential for mitochondrial iron homeostasis and protection from iron-induced formation of free radicals.
The functional mechanism of frataxin, however, is still unknown. We have expressed the mature form of Yfh1p (mYfh1p) in Escherichia coli and have analyzed its function in vitro. Isolated mYfh1p is a soluble monomer (13,783 Da) that contains no iron and shows no significant tendency to self-associate. Aerobic addition of ferrous iron to mYfh1p results in assembly of regular spherical multimers with a molecular mass of approximately 1.1 MDa (megadaltons) and a diameter of 13+/-2 nm. Each multimer consists of approximately 60 subunits and can sequester >3,000 atoms of iron.
In yeast mitochondria, native mYfh1p exists as monomer and a higher-order species with a molecular weight >600,000. After addition of (55)Fe to the medium, immunoprecipitates of this species contain 16 atoms of (55)Fe per molecule of mYfh1p. We propose that iron-dependent self-assembly of recombinant mYfh1p reflects a physiological role for frataxin in mitochondrial iron sequestration and bioavailability.
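The figures quoted in this abstract can be checked for internal consistency with simple arithmetic. The sketch below adds the mass of about 60 subunits to that of 3,000 iron atoms. The atomic mass of iron is a standard value supplied here, not taken from the abstract, and any oxygen, hydroxide or phosphate accompanying the iron is ignored, so the total is only a lower-bound estimate.

```python
SUBUNIT_DA = 13_783   # monomer mass of mYfh1p quoted in the abstract
SUBUNITS = 60         # approximate subunits per multimer (abstract)
IRON_ATOMS = 3_000    # lower bound (">3,000") from the abstract
FE_DA = 55.845        # standard atomic mass of iron (supplied, not from the abstract)

protein_da = SUBUNIT_DA * SUBUNITS
iron_da = IRON_ATOMS * FE_DA
total_da = protein_da + iron_da
iron_per_subunit = IRON_ATOMS / SUBUNITS

print("protein shell: %.2f MDa" % (protein_da / 1e6))
print("iron alone:    %.2f MDa" % (iron_da / 1e6))
print("total:         %.2f MDa" % (total_da / 1e6))
print("iron atoms per subunit: %.0f" % iron_per_subunit)
```

Protein plus bare iron comes to just under 1 MDa, compatible with the reported mass of approximately 1.1 MDa, and corresponds to about 50 iron atoms per subunit in the recombinant multimer, against the 16 per molecule reported for the native species.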
Journal of General Virology (2000), 81, 2327-2337. Ragna Heggebø, Charles McL. Press, G. Gunnes, Kai Inge Lie, ... M. H. Groschup, Thor Landsverk
A sensitive immunohistochemical procedure was used to investigate the presence of prion protein (PrP) in the ileal Peyer's patch of PrP-genotyped lambs, including scrapie-free lambs and lambs naturally and experimentally exposed to the scrapie agent. The tyramide signal amplification system was used to enhance the sensitivity of conventional immunohistochemical procedures to show that PrP was widely distributed in the enteric nervous plexus supplying the gut wall. In scrapie-free lambs, PrP was also detected in scattered cells in the lamina propria and in the dome and interfollicular areas of the Peyer's patch.
In the follicles, staining for PrP was mainly confined to the capsule and cells associated with vascular structures in the light central zone. In lambs naturally exposed to the scrapie agent, staining was prominent in the dome and neck region of the follicles and was also found to be associated with the follicle-associated epithelium. Similar observations were made in lambs that had received a single oral dose of scrapie-infected brain material from sheep with a homologous and heterologous PrP genotype 1 and 5 weeks previously.
These studies show that the ileal Peyer's patch in young sheep may be an important site of uptake of the scrapie agent and that the biology of this major gut-associated lymphoid tissue may influence the susceptibility to oral infection in sheep. Furthermore, these studies suggest that homology or heterology between PrP genotypes or the presence of PrP genotypes seldom associated with disease does not impede uptake of PrP.
Comment (webmaster): Has a far-from-optimal technique for IHC been used worldwide for all these years? This has many implications for under-diagnosis and too-low incidence of TSEs. Recall that:
Appl Immunohistochem Molecul Morphol 2000 Jun;8(2):162-5 Holm R
In situ hybridization is a technique that allows detection of specific DNA and RNA sequences in tissue sections. Nonisotopic techniques are fast and give a precise localization of the hybridization product, but a drawback is the low sensitivity. However, the sensitivity is dependent on the detection system used.
To evaluate a sensitive in situ hybridization method with nonradioactive probes we compared three different detection systems, using biotin-labeled human papillomavirus (HPV) 16 probes.
The three detection systems included (i) STAV-FITC method (streptavidin-fluorescein isothiocyanate/alkaline phosphatase anti-FITC), (ii) APAAP method (mouse anti-biotin/anti-mouse IgG/alkaline phosphatase mouse anti-alkaline phosphatase), and (iii) tyramide signal amplification (TSA) method (STAV-horseradish peroxidase (HRP)/biotinyl tyramide/STAV-HRP).
The in situ hybridization methods were tested on CaSki and SiHa cells and two cervical carcinomas known to be HPV16 positive. The cells and tissues had been fixed in 4% buffered formalin and paraffin embedded. The three different detection systems gave satisfactory nuclear staining in CaSki cells (CaSki cells contain > 500 copies of HPV16 DNA) and the two cervical carcinomas. However, demonstration of HPV16 DNA in SiHa cells (SiHa cells contain one to two HPV16 genome copies) was possible only by use of the APAAP method. It was concluded that the APAAP method provides the best sensitivity among the nonisotopic detection systems and can detect single viral copies in formalin-fixed and paraffin-embedded material.
14 Aug 2000 Medline
Comment (webmaster): Schatzl's group has here a puzzling result on prion repeat lengths in two species of lemur: Varecia variegata variegata and Lemur macaco albifrons. Heterozygotes of 2x, 5x repeats are found. Recall Siberian goats had 3x; no 2x has ever been reported in other species or in disease. Due to rarity, it was not possible to look at more than two individuals. The GenBank entries will be posted shortly; reference sequences are provided below to the 165 aa determined.
Variable intensity of the 2x band is reported, depending on PCR conditions. A technical issue thus arises in regards to repeat DNA slippage: did this happen genomically or during PCR? A PCR artefact seems very unlikely: parallel human controls only had 5x, PCRs were repeated numerous times with thorough denaturation, the lab has sequenced many dozens of prion genes without such effects, no 3x or 4x slippage was seen, and nothing in the 5x sequence indicates additional secondary structure.
The two species of lemur differed at 4 positions in DNA, all silent transitions. Two are clustered in adjacent glycines (third codon position) in repeat 3, raising a question about their independence. Lemur taxonomy remains a mess (1, 2, 3), but it is not surprising to find the amino acid sequences identical given the slow rate of change of the prion protein elsewhere.
Lemur prion protein is quite conventional. Note that the first octapeptide repeat is 8 residues instead of 9; a missing fourth glycine (which occurs consistently in New World monkeys and as a rabbit allele) may facilitate replication slippage to 2x (16/17 nucleotides match between R1 and R5), perhaps unfavored in species with a 9-residue first repeat. A lysine at position 164 (human numbering) is unusual; mammalian prions are invariably arginine here except for camel, llama, mink, and squirrel monkey. The serine at 174 is unusual for a primate (asparagine); however lemurs are basal and serine/threonine are common variations at this position. The signal peptide was not sequenced; presumably it would begin as the short form, MA--NLGC/Y.
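The repeat structure described above can be read directly off the protein sequence. The sketch below uses the 165-residue lemur reference sequence given further down this page and, for comparison, the octapeptide-region translation shown in the microsatellite section (assumed here to be human). The regular expression is an ad hoc pattern for these two sequences, not a general repeat finder, and at the protein level it cannot tell which three of the four PHGGGWGQ copies were deleted.

```python
import re

# 165-aa lemur 5 OR reference sequence from this page, in 20-residue rows.
LEMUR_5OR = (
    "KPGGGWNTGGSRYPGQGSPG"
    "GNRYPPQGGGWGQPHGGGWG"
    "QPHGGGWGQPHGGGWGQPHG"
    "GGWGQGGGSHGQWNKPSKPK"
    "TNMKHVAGAAAAGAVVGGLG"
    "GYMLGSAMSRPLIHFGNDYE"
    "DRYYRENMYRYPNQVYYKPV"
    "DQYSNQNSFVHDCVNITIKQ"
    "HTVTT"
)

# Octapeptide-region translation shown earlier on the page (assumed human).
HUMAN_REPEATS = "PQGGGGWGQPHGGGWGQPHGGGWGQPHGGGWGQPHGGGWGQ"

# Ad hoc pattern: P, then Q or H, a glycine run, then WGQ.
REPEAT = re.compile(r"P[QH]G+WGQ")


def repeat_lengths(protein):
    """Lengths of successive octapeptide-type repeats found in a sequence."""
    return [len(m.group(0)) for m in REPEAT.finditer(protein)]


# Model the 2 OR allele by deleting three consecutive PHGGGWGQ copies.
lemur_2or = LEMUR_5OR.replace("PHGGGWGQ" * 3, "", 1)

print("lemur 5 OR:", len(LEMUR_5OR), "aa, repeats", repeat_lengths(LEMUR_5OR))
print("lemur 2 OR:", len(lemur_2or), "aa, repeats", repeat_lengths(lemur_2or))
print("human repeat region:", repeat_lengths(HUMAN_REPEATS))
```

The lemur first repeat comes out at 8 residues against 9 in the comparison sequence, and removing three repeats takes the 165 aa sequence to 141 aa, matching the lengths quoted for the 5 OR and 2 OR alleles.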
Since 2x and 5x in two species are found by looking at only a few individuals, both alleles must be common, say 50% frequency. This means 2x homozygotes must also be common (about 25%). Thus it is important to acquire more lemur samples to determine whether there is counter-selection against 2x (or 5x) homozygotes. Binding copper and superoxide dismutase activity may still be possible with only two repeats. However, this is the absolute minimum and the first and/or last repeats may normally have capping roles. See structural model of copper bound to repeat.
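The inference that both alleles must be common can be made quantitative. The sketch below assumes Hardy-Weinberg proportions, independent sampling, and (purely for illustration) the same 2x allele frequency in both species. It gives the expected genotype frequencies and the chance that two sampled animals would both turn out to be 2x/5x heterozygotes.

```python
def genotype_frequencies(q):
    """Hardy-Weinberg genotype frequencies for a 2x allele at frequency q."""
    p = 1.0 - q
    return {"5x/5x": p * p, "2x/5x": 2 * p * q, "2x/2x": q * q}


def prob_all_heterozygous(q, n):
    """Chance that n independently sampled animals are all 2x/5x heterozygotes."""
    return genotype_frequencies(q)["2x/5x"] ** n


for q in (0.05, 0.2, 0.5):
    freqs = genotype_frequencies(q)
    print("q=%.2f  2x/2x=%.4f  P(both het)=%.4f"
          % (q, freqs["2x/2x"], prob_all_heterozygous(q, 2)))
```

At a 2x frequency of 50% a quarter of animals would be 2x homozygotes and two heterozygotes in two draws is unremarkable (probability 0.25); at 5% the same observation would have a probability below 1%, which is the sense in which a rare 2x allele is hard to reconcile with the data.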
The other oddity here is that two rather different genera of lemurs have the unusual 2x allele. Does this mean that all Lemuridae (resp. all Strepsirhini) carry this polymorphism? Is it a fairly old allele that persisted across several speciation events? Since it is unlikely that it arose twice in two separate lemur groups that happened to be the only two sequenced, it seems to persist through some selective advantage, possibly down-modulating some function associated with the repeat region (partial gene copy mechanism).
The high reported susceptibility in conjunction with unprecedented 2x repeats raises the question of whether lemurs are on the verge of sporadic TSE with 2x; perhaps, as in 7x cattle breeds, sporadic TSE is masked by short lifespans.
Lemurs are MKHV instead of MKHM at the 3F4 epitope (aa 109-112), raising questions about its use by Bons et al. in PNAS 1999 Mar 30;96(7):4046-51 (note: a second antibody to 106-126 was also used). The 3F4 epitope is probably not as clean and simple as MKHM. In fact, it will depend on the whole adjacent 3D surface to some extent. Thus it must be tested experimentally in each species separately, no matter how many MKHV species have been negative so far. The region is tabulated for 91 species below. From this, it follows that M and V are probably allelic in most groups of mammals if enough individuals are sequenced. Thus Bons' individual lemurs could well have M at 112.
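A first-pass screen of the tabulated 109-112 region against the linear MKHM determinant is easy to script. The sketch below parses a few rows copied from the comma-delimited table on this page. The column meaning (four residues, species, running number, then four difference columns relative to MKHV with "." for identity) is inferred from the rows and should be treated as an assumption. As the text stresses, a sequence match or mismatch says nothing definitive about actual 3F4 binding.

```python
REFERENCE = "MKHV"   # assumed reference for the table's difference columns
EPITOPE = "MKHM"     # linear 3F4 determinant as described in the text

ROWS = [
    "M,K,H,V,Bos taurus,5,.,.,.,.",
    "T,K,H,V,Ovis aries,12,T,.,.,.",
    "M,K,H,M,Giraffa camelopardalis reticulitis,21,.,.,.,M",
    "M,K,H,V,Felis catus,36,.,.,.,.",
]


def parse(line):
    """Split one table row into species, 109-112 motif and difference columns."""
    f = [x.strip() for x in line.split(",")]
    return {"species": f[4], "motif": "".join(f[0:4]), "diff": f[6:10]}


def diff_columns(motif, reference=REFERENCE):
    """Recompute the difference columns: residue where it differs, else '.'."""
    return [m if m != r else "." for m, r in zip(motif, reference)]


def mismatches(motif, epitope=EPITOPE):
    """Residue numbers (109-112) at which the motif differs from the epitope."""
    return [109 + i for i, (m, e) in enumerate(zip(motif, epitope)) if m != e]


for row in map(parse, ROWS):
    print(row["species"], row["motif"], mismatches(row["motif"]) or "matches MKHM")
print("lemur", "MKHV", mismatches("MKHV"))
```

In this sample only the giraffe row carries the full MKHM motif; the lemur sequence, like cattle and cat, differs at residue 112 alone.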
Lemur species studied by Bons and by Schatzl (*)
Eulemur fulvus mayottensis
Eulemur fulvus albifrons
Eulemur mongoz
Eulemur macaco*
Lemur catta
Microcebus murinus
Varecia variegata variegata*
Varecia variegata rubra
Biol Chem. 2000 May-Jun;381(5-6):521-3. Gilch S, Spielhaupter C, Schatzl HM.
Lemur reference sequences. DNA variations (4) shown in caps.

Varecia variegata variegata 5 OR:
aagcctggaggaggctggaacactggggggagccgatacccggggcaaggcagccctgga
K P G G G W N T G G S R Y P G Q G S P G
ggcaaccgctacccaccccagggcggcggctggggacagcctcatggcggtggctgggga
G N R Y P P Q G G G W G Q P H G G G W G
cagccccatggCggCggctggggacaaccccatgggggtggctggggacagcctcatggt
Q P H G G G W G Q P H G G G W G Q P H G
ggtggctggggtcaaggaggtggctctcacggtcagtggaacaagcccagtaaaccaaaa
G G W G Q G G G S H G Q W N K P S K P K
accaacatgaagcacgtggcaggtgccgcagcggctggggcagtggtgggtggccttggt
T N M K H V A G A A A A G A V V G G L G
ggctacatgctAgggagtgcCatgagcaggcccctcatacattttggcaatgactatgag
G Y M L G S A M S R P L I H F G N D Y E
gaccgttactatcgcgaaaacatgtaccgttaccccaaccaagtgtactacaaaccggtg
D R Y Y R E N M Y R Y P N Q V Y Y K P V
gatcagtacagcaaccagaacagcttcgtgcacgactgcgtcaatatcaccatcaagcag
D Q Y S N Q N S F V H D C V N I T I K Q
cacacggtcaccacca
H T V T T

Varecia variegata variegata 2 OR:
aagcctggaggaggctggaacactggggggagccgatacccggggcaaggcagccctgga
K P G G G W N T G G S R Y P G Q G S P G
ggcaaccgctacccaccccagggcggcggctggggacagcctcatggtggtggctggggt
G N R Y P P Q G G G W G Q P H G G G W G
caaggaggtggctctcacggtcagtggaacaagcccagtaaaccaaaaaccaacatgaag
Q G G G S H G Q W N K P S K P K T N M K
cacgtggcaggtgccgcagcggctggggcagtggtgggtggccttggtggctacatgctA
H V A G A A A A G A V V G G L G G Y M L
gggagtgcCatgagcaggcccctcatacattttggcaatgactatgaggaccgttactat
G S A M S R P L I H F G N D Y E D R Y Y
cgcgaaaacatgtaccgttaccccaaccaagtgtactacaaaccggtggatcagtacagc
R E N M Y R Y P N Q V Y Y K P V D Q Y S
aaccagaacagcttcgtgcacgactgcgtcaatatcaccatcaagcagcacacggtcacc
N Q N S F V H D C V N I T I K Q H T V T
acca
T

Black lemur (Lemur macaco albifrons; probably Eulemur macaco macaco) 5 OR:
aagcctggaggaggctggaacactggggggagccgatacccggggcaaggcagccctgga
K P G G G W N T G G S R Y P G Q G S P G
ggcaaccgctacccaccccagggcggcggctggggacagcctcatggcggtggctgggga
G N R Y P P Q G G G W G Q P H G G G W G
cagccccatggTggTggctggggacaaccccatgggggtggctggggacagcctcatggt
Q P H G G G W G Q P H G G G W G Q P H G
ggtggctggggtcaaggaggtggctctcacggtcagtggaacaagcccagtaaaccaaaa
G G W G Q G G G S H G Q W N K P S K P K
accaacatgaagcacgtggcaggtgccgcagcggctggggcagtggtgggtggccttggt
T N M K H V A G A A A A G A V V G G L G
ggctacatgctGgggagtgcTatgagcaggcccctcatacattttggcaatgactatgag
G Y M L G S A M S R P L I H F G N D Y E
gaccgttactatcgcgaaaacatgtaccgttaccccaaccaagtgtactacaaaccggtg
D R Y Y R E N M Y R Y P N Q V Y Y K P V
gatcagtacagcaaccagaacagcttcgtgcacgactgcgtcaatatcaccatcaagcag
D Q Y S N Q N S F V H D C V N I T I K Q
cacacggtcaccacca
H T V T T

Black lemur (Lemur macaco albifrons; probably Eulemur macaco macaco) 2 OR:
aagcctggaggaggctggaacactggggggagccgatacccggggcaaggcagccctgga
K P G G G W N T G G S R Y P G Q G S P G
ggcaaccgctacccaccccagggcggcggctggggacagcctcatggtggtggctggggt
G N R Y P P Q G G G W G Q P H G G G W G
caaggaggtggctctcacggtcagtggaacaagcccagtaaaccaaaaaccaacatgaag
Q G G G S H G Q W N K P S K P K T N M K
cacgtggcaggtgccgcagcggctggggcagtggtgggtggccttggtggctacatgctG
H V A G A A A A G A V V G G L G G Y M L
gggagtgcTatgagcaggcccctcatacattttggcaatgactatgaggaccgttactat
G S A M S R P L I H F G N D Y E D R Y Y
cgcgaaaacatgtaccgttaccccaaccaagtgtactacaaaccggtggatcagtacagc
R E N M Y R Y P N Q V Y Y K P V D Q Y S
aaccagaacagcttcgtgcacgactgcgtcaatatcaccatcaagcagcacacggtcacc
N Q N S F V H D C V N I T I K Q H T V T
acca
T
Lemur prion reference sequences: no variation other than deletion of R2R3R4 octapeptides. Lemur prion 5 OR: 165 aa; 2 OR: 141 aa
KPGGGWNTGGSRYPGQGSPGGNRYPPQGGGWGQPHGGGWGQPHGGGWGQPHGGGWGQPHGGGWGQGGGSHGQWNKPSKPKTNMKHVAGAAAAGAVVGGLGG
YMLGSAMSRPLIHFGNDYEDRYYRENMYRYPNQVYYKPVDQYSNQNSFVHDCVNITIKQHTVTT
Below are the 91 species and 1 sheep and 1 mouse allele for 109-112 in phylogenetic grouping that can be imported into Excel or similar databases. Note that M and V at 112 substituted for each other 6 times in various lineages during evolution. Even in primates, valine occurs in two OW monkeys and one NW monkey, in addition to lemurs. Hamsters are instructive also.
J Mol Biol 1999 Nov 5;293(4):855-63 Kanyo ZF, Pan KM, Williamson RA, Burton DR, Prusiner SB, Fletterick RJ, Cohen FE
The X-ray crystallographic structures of the anti-Syrian hamster prion protein (SHaPrP) monoclonal Fab 3F4 alone, as well as the complex with its cognate peptide epitope (SHaPrP 104-113), have been determined to atomic resolution. The conformation of the decapeptide is an omega-loop. There are substantial alterations in the antibody combining region upon epitope binding. The peptide binds in a U-shaped groove on the Fab surface, with the two specificity determinants, Met109 and Met112, penetrating deeply into separate hydrophobic cavities formed by the heavy and light chain complementarity-determining regions. In addition to the numerous contacts between the Fab and the peptide, two intrapeptide hydrogen bonds are observed, perhaps indicating the structure bound to the Fab exists transiently in solution....
Comment (Alex Bossers): "3F4 was made by Richard Kascsak in Staten Island, New York. Much molecular work has been done since using 3F4, for example by Caughey et al. Basically, 3F4 is often used for switching between the species mouse and hamster: hamster is 3F4+ while mouse isn't. In vitro conversion assays using mouse mutated to 3F4+ don't differ much from conversion assays using normal mouse. This effect has been used for determining species tropism switches, as attempted by Collinge et al. (which unfortunately failed).
The 3F4 is a 'clean' monoclonal recognizing mainly the sequence MKHM, for which both M residues (at hamster positions 109 and 112) are a prerequisite for binding. I.e., mouse is MKHV (like sheep and many others) and normally does not bind in immunoprecipitation and standard Western blots.
However, this 'clean' reaction seems to be determined by the assay used: in Western blots and immunoprecipitations, epitopes are able to (partly) refold or stay undenatured, giving a good reaction with 3F4. Immunohistochemistry (many different methods), however, tends to use drastic denaturing methods (with or without PK), mainly to get rid of the PrP-C background signal. The tissue might therefore respond differently to antibodies, and the 3F4 reaction may not be that clean. We've seen this also with different antibodies (even polyclonals to peptides), which do react with, say, BSE on Western blot but can't immunoprecipitate bovine PrP protein."
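The binding rule described in the comment above (Met at both 109 and 112) can be written as a simple check on the 109-112 tetrapeptide. This is only a sketch: the function name is hypothetical, the example tetrapeptides are those listed for each species in the 109-112 table in this section, and the test encodes a necessary condition for the assays where 3F4 behaves cleanly. It does not predict immunohistochemistry results, for the reasons given in the comment.

```python
def meets_3f4_met_requirement(tetrapeptide):
    """Return True if residues 109 and 112 (human numbering) are both Met.

    tetrapeptide: the four residues at positions 109-112, e.g. "MKHM".
    """
    residues = tetrapeptide.strip().upper()
    if len(residues) != 4:
        raise ValueError("expected exactly four residues (109-112)")
    return residues[0] == "M" and residues[3] == "M"


# Tetrapeptides as listed in the 109-112 table (illustrative selection).
examples = {
    "Mesocricetus auratus": "MKHM",
    "Homo sapiens": "MKHM",
    "Ovis aries": "MKHV",
    "Mus musculus": "LKHV",
}

for species, tetra in examples.items():
    print(species, tetra, meets_3f4_met_requirement(tetra))
```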
91 species and 1 sheep and 1 mouse allele for 109-112 (human numbering)
Comma-delimited data imports cleanly into Excel
M,K,H,V,Antilocapra americana,1,.,.,.,.
M,K,H,V,Bison bonasus,2,.,.,.,.
M,K,H,V,Bos javanicus,3,.,.,.,.
M,K,H,V,Bos primigenius taurus,4,.,.,.,.
M,K,H,V,Bos taurus,5,.,.,.,.
M,K,H,V,Tragelaphus angasi,6,.,.,.,.
M,K,H,V,Tragelaphus strepsiceros,7,.,.,.,.
M,K,H,V,Capra hircus,8,.,.,.,.
M,K,H,V,Capra ibex nubiana,9,.,.,.,.
M,K,H,V,Ovibos moschatus moschatus,10,.,.,.,.
M,K,H,V,Ovis aries,11,.,.,.,.
T,K,H,V,Ovis aries,12,T,.,.,.
M,K,H,V,Ovis canadensis,13,.,.,.,.
M,K,H,V,Cervus elaphus nelsoni,14,.,.,.,.
M,K,H,V,Cervus nippon dybowskii,15,.,.,.,.
M,K,H,V,Cervus elaphus,16,.,.,.,.
M,K,H,V,Muntiacus muntjak,17,.,.,.,.
M,K,H,V,Odocoileus hemionus hemionus,18,.,.,.,.
M,K,H,V,Odocoileus virginianus,19,.,.,.,.
M,K,H,V,Dama dama,20,.,.,.,.
M,K,H,M,Giraffa camelopardalis reticulata,21,.,.,.,M
M,K,H,V,Gazella subgutturosa,22,.,.,.,.
M,K,H,V,Budorcas taxicolor,23,.,.,.,.
M,K,H,V,Addax nasomaculatus,24,.,.,.,.
M,K,H,V,Hippotragus niger,25,.,.,.,.
M,K,H,V,Oryx leucoryx,26,.,.,.,.
M,K,H,V,Tursiops truncatus,27,.,.,.,.
M,K,H,V,Sus scrofa,28,.,.,.,.
M,K,H,V,Camelus dromedarius,29,.,.,.,.
M,K,H,V,Lama glama,30,.,.,.,.
M,K,H,V,Equus familiaris caballus,31,.,.,.,.
M,K,H,V,Equus quagga boehmi,32,.,.,.,.
M,K,H,V,Canis lupus canadensis,33,.,.,.,.
M,K,H,V,Canis lupus familiaris,34,.,.,.,.
M,K,H,V,Canis lupus hallstromii,35,.,.,.,.
M,K,H,V,Felis catus,36,.,.,.,.
M,K,H,V,Mustela putorius,37,.,.,.,.
M,K,H,V,Mustela vison,38,.,.,.,.
M,K,H,M,Homo sapiens,39,.,.,.,M
M,K,H,M,Pan troglodytes,40,.,.,.,M
M,K,H,M,Pongo pygmaeus,41,.,.,.,M
M,K,H,M,Gorilla gorilla,42,.,.,.,M
M,K,H,M,Symphalangus syndactylus,43,.,.,.,M
M,K,H,M,Hylobates lar,44,.,.,.,M
M,K,H,M,Presbytis francoisi,45,.,.,.,M
M,K,H,M,Papio hamadryas,46,.,.,.,M
M,K,H,M,Mandrillus sphinx,47,.,.,.,M
M,K,H,M,Macaca sylvanus,48,.,.,.,M
M,K,H,M,Macaca nemestrina,49,.,.,.,M
M,K,H,M,Macaca mulatta,50,.,.,.,M
M,K,H,M,Macaca fuscata,51,.,.,.,M
M,K,H,M,Macaca fascicularis,52,.,.,.,M
M,K,H,M,Macaca arctoides,53,.,.,.,M
M,K,H,M,Colobus guereza,54,.,.,.,M
M,K,H,M,Cercocebus torquatus atys,55,.,.,.,M
M,K,H,M,Cercopithecus patas,56,.,.,.,M
M,K,H,M,Cercopithecus neglectus,57,.,.,.,M
M,K,H,M,Cercopithecus mona,58,.,.,.,M
M,K,H,M,Cercopithecus dianae,59,.,.,.,M
M,K,H,M,Cercocebus aterrimus,60,.,.,.,M
M,K,H,M,Cercopithecus aethiops,61,.,.,.,M
M,K,H,V,Callicebus moloch,62,.,.,.,.
M,K,H,M,Theropithecus gelada,63,.,.,.,M
M,K,H,M,Saimiri sciureus,64,.,.,.,M
M,K,H,V,Cebus apella,65,.,.,.,.
M,K,H,V,Callithrix jacchus,66,.,.,.,.
M,K,H,M,Ateles paniscus x Ateles fusciceps,67,.,.,.,M
M,K,H,M,Ateles geoffroyi,68,.,.,.,M
M,K,H,M,Aotus trivirgatus,69,.,.,.,M
M,K,H,V,Lemur macaco albifrons,70,.,.,.,.
M,K,H,V,Varecia variegata variegata,71,.,.,.,.
M,K,H,M,Cavia porcellus,72,.,.,.,M
M,K,H,V,Oryctolagus cuniculus,73,.,.,.,.
M,K,H,M,Mesocricetus auratus,74,.,.,.,M
F,K,H,V,Mus musculus,75,F,.,.,.
L,K,H,V,Mus musculus,76,L,.,.,.
L,K,H,V,Rattus norvegicus,77,L,.,.,.
L,K,H,V,Rattus rattus,78,L,.,.,.
M,K,H,V,Meriones unguiculatus,79,.,.,.,.
M,K,H,V,Sigmodon fulviventer,80,.,.,.,.
M,K,H,V,Sigmodon hispidus,81,.,.,.,.
M,K,H,V,Cricetulus griseus,82,.,.,.,.
M,K,H,M,Cricetulus migratorius,83,.,.,.,M
L,K,H,V,Trichosurus vulpecula,84,L,.,.,.
L,K,H,V,Grus americana,85,L,.,.,.
F,K,H,V,Gallus gallus,86,F,.,.,.
F,K,H,V,Pachyptila desolata,87,F,.,.,.
F,K,H,V,Pachyptila turtur,88,F,.,.,.
F,K,H,V,Vultur gryphus,89,F,.,.,.
F,K,H,V,Struthio camelus,90,F,.,.,.
F,K,H,V,Anas platyrhynchos,91,F,.,.,.
F,K,H,V,Tyto alba,92,F,.,.,.
M,K,A,M,Trachemys scripta,93,M,.,A,M
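For readers who prefer a script to a spreadsheet, the comma-delimited rows above can also be read with Python's csv module. The sketch below embeds only a handful of rows copied from the table and tallies the residue at position 112. The column interpretation (first four fields are residues 109-112, the fifth is the species) is an assumption inferred from the data, and the function name is illustrative.

```python
import csv
import io
from collections import Counter

# A few rows copied from the table above; paste the full table here to analyze all 93.
SAMPLE = """\
M,K,H,V,Bos taurus,5,.,.,.,.
T,K,H,V,Ovis aries,12,T,.,.,.
M,K,H,M,Homo sapiens,39,.,.,.,M
M,K,H,V,Cebus apella,65,.,.,.,.
M,K,H,V,Lemur macaco albifrons,70,.,.,.,.
M,K,H,M,Mesocricetus auratus,74,.,.,.,M
L,K,H,V,Mus musculus,76,L,.,.,.
M,K,H,M,Cricetulus migratorius,83,.,.,.,M
"""


def read_rows(text):
    """Parse rows into dicts with the 109-112 tetrapeptide and the species name."""
    rows = []
    for fields in csv.reader(io.StringIO(text)):
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        rows.append({"tetrapeptide": "".join(fields[:4]), "species": fields[4].strip()})
    return rows


rows = read_rows(SAMPLE)

# Tally the residue at position 112 (fourth residue of the tetrapeptide).
at_112 = Counter(row["tetrapeptide"][3] for row in rows)
print(at_112)

# Species carrying Met at both 109 and 112.
met_met = [row["species"] for row in rows if row["tetrapeptide"] in ("MKHM",)]
print(met_met)
```

Run on the full table, the same tally gives the M and V counts at 112 per lineage without any spreadsheet step.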
BBRC 2000 Jul 14;273(3):890-893
Kubosaki A, Ueno A, Matsumoto Y, Doi K, Saeki K, Onodera T
In this study, prion protein (PrP) mRNA was focally detected in brain and placenta of pregnant sheep by Northern blot analysis. In addition, host-encoded cellular prion protein (PrP(C)) was observed in brain and placenta of the ruminant by Western blot analysis as well. Localization of PrP mRNA in pregnant sheep tissues was rendered possible with in situ hybridization. In sheep brain, PrP mRNA was predominantly localized within large neocortical neurons in the cerebrum, Purkinje cells and neurons of the molecular and granule cell layers in the cerebellum.
In the placenta, signals were observed in the myometrium, including stratum longitudinale tunicae muscles and circular layers of muscular tunics. In the caruncle and placentome, signals were stronger by in situ hybridization. Since accumulation of the scrapie isoform of PrP (PrP(Sc)) requires PrP(C), these results suggest that brain and placenta of sheep may be important organs and sites for the conversion of PrP(C) to PrP(Sc).