Prion paralogue found adjacent to main gene!
Long and short incubation mouse prion genes compared
Zebrafish RH panel locates prion gene
Could a bee sting cause scrapie? A near miss 29.2 MY ago
Anti-prion 4.5 kb mRNA: a harmless prion pseudogene?
Alternate exon 1 splicing in Holstein
MRI: specific diagnosis of nvCJD?
Cerebral amyloid angiopathy in an aged great spotted woodpecker.
Prion octapeptide: intermolecular copper
webmaster research posted 29 Aug 99 to web and BSE-listserve; updated 31 Aug 99
This new mouse sequence by IY Lee, D. Westaway, et al. was motivated in part by the fact that a defective IAP retrovirus element of 6,593 bp is inserted into the short incubation period mouse gene (Prnp-a alleles) in the long intron 2, some 5,404 bp upstream of the translation start codon. That sequence is of comparable length, 38,418 bp, indicating that thousands of new positions have now been sequenced in Prnp-b, which lacks the long retrovirus insertion.
We shall see shortly that 10,403 bp further downstream have been sequenced in long incubation mouse, ie, 29358-39760. This region contains an active new gene from 36212-36748 with a domain paralogous to the distal region of the prion gene. Thus a stated goal of Lee et al., to determine the chromosomal environment of the prion gene, has been attained on the 3' side, with the added bonus that the neighboring gene is related in sequence to the prion protein.
The large stretches of homologous overlap between long and short incubation period mouse sequences are also of considerable interest [compared below]. These mice differ in amino acid sequence at 108 and 189 and evidence in isogenic strains suggests that this is adequate to account for the different phenotypes, though the IAP insertion in commonly used strains of lab mice raises serious issues based on its influence on transcription in other situations (as noted by Lee et al.).
A simple alignment of long and short incubation period mouse genes shows 18,085 bp of sequence is available after the end of the mRNA for long incubation mouse but that the first 7,731 bp of this had already been examined for other genes by Lee with negative outcome (as with the 8,611 bp upstream of exon 1) in short incubation mouse, human, and sheep. This leaves 10,403 bp of sequence with no counterpart in any sequenced mammalian prion gene. The human genome project has not yet reached this region of chromosome 20 (and mouse projects lag further). The idea here is to use the chromosome 20-restricted Blast server with the new mouse query, which 'extends' the Lee human prion sequence.
Four retrotransposons are noted in the GenBank annotation in this new region by Lee et al. These were confirmed, up to notational conventions, by the webmaster using Jurka's methodology [see below]. These insertions total 650 bp and so account for 6.2% of the new section. The ORF for the new gene is intercalated between the third and fourth of these retrotransposons. The GenBank annotation should be revised to reflect this, and perhaps a separate entry is needed for the new gene (which is known in human as well as mouse from sequenced mRNAs).
Lee's previous methods for searching for adjacent genes and pseudogenes included locating promoter CpG islands, using Blastn searches with masked sequences (here 29358-39760), seeking long ORFs in any frame conserved across species (not testable here as human sequence ends near short incubation mouse; sheep and cow end far earlier, at 5117 bp and 0 bp post-mRNA, resp.), and using programs such as GenscanW.
The webmaster took a slightly different approach on 28 Aug 99, using NCBI gapped tBlastx with the new distal portion of long incubation mouse prion as query against mouse and human databases of expressed sequence tags (dbEST). Recall tBlastx compares the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database.
tBlastx is a more sensitive method for detecting homology than Blastn because protein sequences are more conserved than nucleotide sequences, due to synonymous codon drift, conserved residues, and the statistical advantage of 20 choices instead of 4. The dbEST databases contain many mRNA sequences from factory labs not found in the non-redundant main GenBank repository; the human set is used despite the mouse probe because it is more extensive.
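The six-frame translation that tBlastx applies to both query and database can be illustrated with a short sketch. This is only a minimal illustration assuming the standard genetic code; the function names are hypothetical and it is not the NCBI implementation, which also performs the scoring and database search.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: no
# alternative genetic codes are needed for nuclear mammalian genes).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (upper case)."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate in frame 1; unknown or ambiguous codons become X."""
    dna = dna.upper()
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frames(dna):
    """Translate all three frames of both strands, as tBlastx does."""
    frames = {}
    for strand, seq in (("+", dna.upper()), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    # First 21 bases of the mouse prnd ORF as listed in this article.
    for name, protein in six_frames("atgaagaaccggctgggtaca").items():
        print(name, protein)
```

Only frame +1 of this fragment reproduces the start of the prnd protein (MKNRLGT); a protein-level search sees that frame directly, whereas a nucleotide search has to contend with every silent third-position change.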
Blastn in fact will fail to pull up statistically significant matches from either repository in this case. Thus the new gene is easy to miss. (However GenscanW also suggests this new gene with all of U29187 used as query, in addition to finding the standard prion gene.) tBlastn and Blastp are successful on the main GenBank database once the correct probe is in hand.
While it is welcome news to have found the 3' boundary of the prion gene after all this sequencing effort, a most surprising bonus is that the neighboring gene is related in sequence to the distal part of the prion gene. This vindicates the reasoning given by Lee et al. on page 1023 of their 1998 paper for using large-scale sequencing to find "adjacent, perhaps functionally related, genes... in mammalian gene complexes."
For the time being, this new gene and protein will be called Prnd, an acronym for prion-like doppelganger. Its function is unknown; note that it is missing the 106-126 amyloidogenic region as well as the repeat section. Based on the small number of dbEST sequences, the gene is probably rarely transcribed (or responds poorly to oligo dT amplification) relative to the prion gene itself, which has been found several hundred times in EST sequencing.
[After the webmaster's posting of the discovery of the new prion paralogue gene on 29 Aug 99, D.Westaway at U Toronto kindly wrote to say that the 4 labs involved in sequencing the mouse gene already have a paper in press addressing the new gene, with new sequences to be released at the time the paper appears. This paper was submitted to the Journal of Molecular Biology on 19 May 99, accepted 24 June 99, and will be published imminently, possibly 3 Sep 99. While apparently reaching similar conclusions, they have additional mRNA and protein data that should greatly illuminate this remarkable situation. The GenBank sequence was re-annotated on 31 Aug 99 to show the Prnd gene. The protein sequence appears identical to the one posted earlier by the webmaster; some UTR detail could not have been deduced from the ESTs without further sequencing. They have not modified the human entry though prnd can be inferred there too and also, barely, in rat, from AI136375 3' UTR.
On 8 Sep 99 the GenBank entry was updated again. This time the provocative title of the upcoming JMB article was included:
J. Mol. Biol. (1999) In press [Sept 24, 1999 or later]. Moore,R.C., Lee,I.Y., Silverman,G.L., Harrison,P.M., Strome,R., Heinrich,C., Karunaratne,A., Pasternak,S.H., Chishti,M.A., Liang,Y., Mastrangelo,P., Wang,K., Smit,A.F.A., Katamine,S., Carlson,G.A., Cohen,F.E., Prusiner,S.B., Melton,D.W., Tremblay,P., Hood,L.E. and Westaway,D.
They also came up with a clever name for the new gene, prion-like doppelganger, or prnd. This literally means a ghost-like companion. The webmaster has agreed to use this name because the mouse genome nomenclature committee has accepted it, noting however it has already been taken at GenBank, Medline, and Swissprot for PrnD for tryptophan halogenase, prnD for both aminopyrrolnitrin oxidase and delta1-pyrroline-5-carboxylate dehydrogenase, and prnd for an intergenic cis-acting regulatory region. (The word prion has been in use in other contexts for a century for that matter.)
In theory, the discovery of this gene really dates to results of the Washington University EST and German Genome Projects, posted to GenBank 15 Feb 97 (AA104098) and 9 Feb 98 (AA796652) for mouse; 6 Aug 97 (AA234322) and 8 Jun 99 (AL042906) for human. There is no need to have the mouse genomic sequence in hand and no immediate need to know of the chromosome 20 adjacency because tBlastn with mouse prion protein terminal domain should and does, despite the asymmetry of tBlastn, pull out AA796652 below normal mouse prion ESTs (though just barely -- the full EST would then need to be run with Blastx to really establish convincing homology). An even more distal prion sequence should also pull AI588048 and allied sequences. The adjacency could then be readily established with FISH, YAC, or RH panels. However, as seen in the EST diagram below, the EST data has a gap of about 15 bp within the ORF and about a 1000 bp gap before the third set of ESTs.
However, AA796652, an EST dated 09-FEB-1998 from WashU, does a nice job of suggesting the spliced-out intron; its length (but not its existence) depends on having long incubation period mouse sequence. The 38 bp leader exon and the 9 bp leader into the ORF have no matches to prion gene exons or introns. The prion gene also lacks a 3' UTR exon and intron. Neither the 3' UTR portions nor exon 2a and 2b really fit the EST data, so they must represent direct mRNA data. For exon 3 at 38013-39315, the available ESTs are AI509336 (386 bp, 38347-38726), its 186 bp subsequence AA190150, and AI136375 (38726-39082); the latter reflects oligo dT priming on oligo dA at 39082 (however rat could have short indels).
The human EST AL042906 is tantalizing in that the 221 bp do not correspond to anything in mouse gene, suggesting a remote exon in humans. The sequence has 6 undetermined nucleotides 3' in addition to seemingly having errors towards the 5' end. The abrupt failure to align past the ORF could reflect a splice site to a human exon so far downstream that it does not correspond to anything in mouse or that non-ORF has evolved too rapidly to allow homology detection. However, this region does in fact correspond at 89% identity to 866-1034 of dJ1068H6.00023, an unfinished sequence at the chromosome 20 Blast server.
The annotation does not describe a promoter region; perhaps this is analyzed in the paper. If co-regulated by some of the same cofactors as the prion gene, then the upstream region should carry similar SP-1 sites etc and perhaps the well-conserved motif region. But the rodent reference prion promoter shows no match above exon 1 of prnd. While some alternative splicing is seen in bovine prion exon 1, it is not yet clear whether the prnd gene has alternative splicing and/or alternate polyA signals. The polyA signal AATAAA at 39291 is not annotated: 1261 tgttcttatc ttgcaaacAA TAAAcaccca tacatataca gac. Mouse exon 2b exhibits no polyA signal so is presumably a splice site. Note that the prnd gene is not previously known from mapping of chromosome 20 nor its synteny counterparts and there is no known disease associated with mutations in it.
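The position of the unannotated polyA signal can be cross-checked from the snippet quoted above. The sketch below rests on one assumption that is inferred rather than stated in the text: that the leading number 1261 counts from the first base of exon 3 (38013). Under that assumption the AATAAA hexamer lands at 39291, the position quoted.

```python
# Snippet as quoted in the text; spaces are only GenBank-style grouping.
SNIPPET = "tgttcttatc ttgcaaacAA TAAAcaccca tacatataca gac"
SNIPPET_FIRST_BASE = 1261   # assumption: numbered from the start of exon 3
EXON3_START = 38013         # exon 3 start from the 31 Aug 99 annotation


def find_motif(seq, motif="AATAAA"):
    """Return 0-based offsets of every occurrence of motif in seq."""
    clean = seq.replace(" ", "").upper()
    hits = []
    pos = clean.find(motif)
    while pos != -1:
        hits.append(pos)
        pos = clean.find(motif, pos + 1)
    return hits


def to_genomic(offset):
    """Map a 0-based snippet offset to a genomic coordinate."""
    exon_pos = SNIPPET_FIRST_BASE + offset   # 1-based position within exon 3
    return EXON3_START + exon_pos - 1


if __name__ == "__main__":
    for offset in find_motif(SNIPPET):
        print("AATAAA at genomic position", to_genomic(offset))
```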
The 31 Aug 99 annotation provides the following structure for the gene:
standard prion gene 6205-21675 [3' end of prion mRNA is separated by 12,411 bp from prnd exon 1]
L1ME3A retrotransposon 30341-30573 [delimits prnd promoter region 5' of exon 1]
exon 1 34086-34124 [5' UTR of 39 bp, no coding sequence]
intron 1 34125-36204 [includes retrotransposon URR1A 35130-35341]
exon 2a 36205-36799 [9 bp of 5' UTR, then coding sequence, then 49 bp of 3' UTR]
exon 2b 36205-37472 [3' UTR extended form of exon 2 with 722 bp of 3' UTR]
intron 2 37472-38012 [contains retrotransposon cPB1D9 37553-37665 = 113 bp]
exon 3 38013-39315 [1302 bp of 3' UTR, no coding sequence, classical polyA signal AATAAA at 39291]
CDS 36212-36751
mRNA 34086-34124 + 36205-37472 + 38013-39315

Changing numbers to post CDS:
exon 2a ends at 48
exon 2b ends at 721
exon 3 extends from 1262-2564
PB1D9- extends from 802-914
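The conversion to post-CDS numbering above is a simple subtraction of the last CDS position (36751). A minimal sketch, with hypothetical helper names and coordinates copied from the annotation, makes the arithmetic explicit:

```python
# Coordinates from the 31 Aug 99 annotation (1-based, inclusive).
FEATURES = {
    "exon 1": (34086, 34124),
    "intron 1": (34125, 36204),
    "exon 2a": (36205, 36799),
    "exon 2b": (36205, 37472),
    "exon 3": (38013, 39315),
    "CDS": (36212, 36751),
    "PB1D9": (37553, 37665),
}
PRION_MRNA_END = 21675
CDS_END = FEATURES["CDS"][1]


def length(start, end):
    """Length of a feature with inclusive coordinates."""
    return end - start + 1


def post_cds(pos):
    """Renumber a genomic position relative to the last base of the CDS."""
    return pos - CDS_END


if __name__ == "__main__":
    for name, (start, end) in FEATURES.items():
        print(f"{name:9s} {start}-{end}  length {length(start, end)}  "
              f"post-CDS {post_cds(start)} to {post_cds(end)}")
    print("prion mRNA end to prnd exon 1:",
          FEATURES["exon 1"][0] - PRION_MRNA_END, "bp")
```

Positions upstream of the stop codon come out as zero or negative, which is why only the exon ends and the downstream elements are quoted in post-CDS form.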
...agaggtgagctggtgggcaaaggta....... end of exon 2a
tgtttgctggttgggcccctgcaatgtc....... end of exon 2b
............................gactccaggagttgctgagc start of exon 3
There are 9 ESTs put in the databases by other researchers that are relevant to the prnd gene: 6 from mouse, 2 from human, and 1 from rat, which form 3 non-overlapping groups. While homology allows "mixing and matching" of species for purposes of tiling the gene or creating a more complete ancestral mammalian probe, protein sequences must be drawn from a single species' ESTs. The single rat sequence is all 3' UTR so it mainly confirms the presence of the mRNA in this species and allows a partial measurement of how fast this region is changing relative to mouse [83% identity over 12 million years since divergence].
Upstream group:
>AA796652 426 bp mouse mammary, covers 38 bp of exon 1 and 390 bp of exon 2.
  1 GGGCTCCAAG CTTCAGAGGC CACAGTAGCA GAGAACCG ag attcacc atg aagaaccggc
 61 tgggtacatg gtgggtggcc atcctctgca tgctgcttgc cagccacctc tccacggtca
121 aggcaagggg cataaagcac aggttcaagt ggaccggaag tcctgcccag cagcggcggc
181 cagatcaccg aagctcgggt agctgagaac cgcccaggag ccttcatcaa gcaaggccgg
241 aagctggaca tcgactttgg agcagagggc aacaggtact acgcggctaa ctattggcag
301 ttccctgatg ggatctacta cgaaggctgc tctgaagcca acgtgaccaa ggagatgctg
361 gtgaccagct gcgtcaacgc cacccaggcg gccaaccagg ctgagttctc ccgggagaag
421 caggat

>AL042906 731 bp human testis. Main guide to human prnd protein sequence.
  1 tggctggcca ctgtctgcat gctgctcttc agccacctct ctgcggtcca gacgaggggc
 61 atcaagcaca gaatcaagtg gaaccggaag gccctgccca gcactgccca gatcactgag
121 gcccaggtgg ctgagaaccg cccgggagcc ttcatcaagc aaggccgcaa gctcgacatt
181 gacttcggag ccgagggcaa caggtactac gaggccaact actggcagtt ccccgatggc
241 atccactaca acggctgctc tgaggctaat gtgaccaagg aggcatttgt caccggctgc
301 atcaatgcca cccaggcggc gaaccagggg gagttccaga agccagacaa caagctccac
361 cagcaggtgc tctggcggct ggtccaggag ctctgctccc tcaagcattg cgagttttgg
421 ttggagaggg gcgcaggact tcgggtcacc atgcaccagc cagtgctcct ctgccttctg
481 gctttgatct ggctcatggt gaaataagct tgccaggagg ctggcagtac agagcgcagc
541 agcgagcaaa tcctgcnagt gaccaactnt tcttccccaa accacgcgtg ttcttgaaag
601 gtgcccagaa cggcgatgca cttcgcactt gcaaatgccc cttnccacgt attgcncccc
661 tggtattgtg cccttgccgt ttctgataaa atggggggac ttgtgggctt tttccgtcac
721 ttccattt

>AA234322 190 bp human melanocytes etc.; covers first 125 bp of ORF, near subsequence of AL042906
  1 aaggttctga cgccgatgag gaagcacctg tatctggtgg tggctggcca ctgtctgcat
 61 gctgctcttc agccacctct ctgcggtcca gacgaggggc atcaagcaca gaatcaagtg
121 gaaccggaag ccctgctcag cactgcccag atcactgagg cccaggtggc tgagaaccgc
181 ccgggagcct

>AA183841 383 bp mouse testis has 19/19 hit but apparently coincidental.
Distal ORF/proximal 3' UTR group:
>AA104098 472 bp from mouse heart; 120 bp of distal ORF; identities = 325/325 (100%) 3/148 to 327/472 of 3' UTR 36607-37078
  1 gcgagtcctg tggcggctga tcaaagagat ctgctccgcc aagcactgcg atttctggct
 61 ggaaagggga gctgcgcttc gggtcgccgt ggaccaaccg gcgatggtct gcctgctggg
121 tttcgtttgg ttcattgtga agtaaaaatc aatgaagctg gcagccacag aggtgagctg
181 gtgggcaaag gtagacagag gtagcccagt tctctctatc tagcccccga gtgttctgaa
241 agtacaacgt gtagcgtttc agggcatttc aaaagtccct cccaagtact ccccctactc
301 catgtgtttg ataatgtgtt tcagtgcccc tatgctccac ccctgtgaga cctggcctgt
361 tcctgccttt gcagctacac taggtgagaa accagccaaa ggatacaaga attgtcctgt
421 gcactccacc tgattttcca actccaggaa actgaggctc acattgcagg ag

>AI428337 418 bp from mouse heart. Identities = 386/390 (98%) 187/418 to 576/29 of 3' UTR 29/37327 - 418/36938, no hit to first 28 bp, no ORF overlap
  1 ttttatttgt cagaagacag attttcaaat gtctacttta gcttaatttt tgttatattg
 61 tctcacgtaa ggtgtcaagg tgcacggttg tcgctcaggg cttcccactt agttcagggt
121 gtgacttcca catttcagtc cacattcggg actgcacttg gacctgagtc ttggagggat
181 caggaggcag ctctaggcta cacagagcag aaaacctgag ggaagacaga gcaggggaac
241 cccagctctt ctgttttcta actgtgctgg acagggactc ctgcaatgtg agcctcagtt
301 tcctggagtt ggaaaatcag gtggagtgca caggacaatt cttgtatcct ttggctggtt
361 tctcacccag tgtagctgca aaggcaggaa caggccaggt ctcaccaggg tggagcat

>AI588048 120 bp from mouse heart (a subsequence of AA104098) 120 bp of distal ORF 36606-36725
  1 agcgagtcct gtggcggctg atcaaagaga tctgctccgc caagcactgc gatttctggc
 61 tggaaagggg agctgcgctt cgggtcgccg tggaccaacc ggcgatggtc tgcctgctgg

Distal 3' UTR group:
>AI509336 386 bp mouse spleen. Identities = 379/386 (98%) 1596/1 to 1975/386 of 3' UTR 38347-38726 full alignment
  1 ggcccagatc acacagttta ccccttgcac gccattttaa atatcagaca ataaagaagg
 61 aatagttagt tcttagatgg tgatgttggt gtttcagctc caagcttaag gcttagttac
121 ccgggtgcac aaagctgtag acaaccaact tccttccagc gtgaaaatgt gtaggctgga
181 cagagtcccg tgctccacga tcgctgagtt ctttcagacc attggttgtc acacccacaa
241 gcctaactgt gacagccaat gagtgtccat gtgattttga ctgcaaattt cgcttatttg
301 atttttagcg gagacgtcct tgctcttact tgtggatcag gagaggtttt cttagccctt
361 ctaagtgttc ccaacccgtg cgctgt

>AA190150 186 bp mouse spleen (subsequence of AI509336). Identities = 185/186 (99%) 1596/1 to 1781/186 of 3' UTR
  1 ggcccagatc acacagttta ccccttgcac gccattttaa atatcagaca ataaagaagg
 61 aatagttagt tcttagatgg tgatgttggt gtttcagctc caagcttaag gcttagttac
121 ccgggtgcac aaagctgtag acaaccaact tccttccagc gtgaaaatgt gtaggctgga
181 cagagt

>AI136375 368 bp rat; most distal sequence. Matches 38718-39072 at 83% identity with lowered gap penalties (note poly T, rat numbering)
  1 tttttttttt tttgtatttt aaaggttgta atgaaacaaa attaaaaccc tcatctttat
 61 atttagacac agtaaagggg tacgggtctc atactaaggt agaaggaggg gatagggtta
121 aagaaacaag aaaacaaaag tcattaaaaa atggacagaa agcactagat gatgtcaagt
181 aaagctttaa ggtacccagc gttctggccc ggtattagga ttgcaaattg ctacatttgc
241 agttgcaaat gagtaaagga gtggagagtc tggagaggtg aaaactttgt gaccgatcag
301 actcaggaat tgcagccccg ccccctttta ctcggctgtc agtttgcgtg cagttctaca
361 gagcacac

Be this as it may, the EST researchers did not recognize or pursue the similarity -- there is no annotation in the entry signifying similarity to prion gene (which would have enabled other researchers to find it by full text search). Indeed, EST labs would have no way of knowing that this would be of tremendous interest to researchers in the prion field.
In other words, the prnd gene could have been found on 16 Feb 97 had anyone troubled to ask an idle computer to take each entry in dbEST (or a non-redundant subset) at regular intervals and run Blastx against the main GenBank protein repository. [tBlastx is not allowed online; Blastn is too weak.] A simpler method, apparently not available, is an alerting service whereby a researcher submits a protein sequence of interest and asks for a daily email alert from Blastx runs of new entries against this single-protein database.
Thus this landmark result for the prion gene could have been found 30 months ago. If a paralogue for prion gene could be discovered in this manner, what manner of other important data lies buried in dbESTs? Are there additional paralogues for the prion gene -- or for that matter the prnd gene -- still to be found within the EST collection? Perhaps these could be picked up by tblastn using prion protein itself (or a proper fragment thereof) with favorable parameter settings and a great deal of patience in EST chromosome tiling.]
The initial hits from tBlastx provide homology at the protein level. This is sharply delimited to a distal portion of mammalian prion protein, approximately the last 100 residues (including the GPI anchor region). Extensive searches do not find any match to the proximal part of the prion protein. The full length of the new gene is 179 amino acids, its molecular weight 20442 and pI quite basic at 9.51 (due to 26 arg + lys).
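The length and basic-residue count quoted above can be checked directly against the mouse prnd protein sequence given in this article. The sketch below only counts residues; molecular weight and pI are not recomputed here, since they depend on residue mass and pKa tables that the text does not specify.

```python
# Mouse prnd protein as listed in this article (179 residues).
MOUSE_PRND = (
    "MKNRLGTWWVAILCMLLASHLSTVKARGIKHRFKWNRKVLPSSGGQITEARVAENRPGAFIKQGRKLD"
    "IDFGAEGNRYYAANYWQFPDGIYYEGCSEANVTKEMLVTSCVNATQAANQAEFSREKQDSKLHQRVLW"
    "RLIKEICSAKHCDFWLERGAALRVAVDQPAMVCLLGFVWFIVK"
)


def count_residues(protein, residues):
    """Total count of the given one-letter residues in the protein."""
    return sum(protein.count(r) for r in residues)


if __name__ == "__main__":
    basic = count_residues(MOUSE_PRND, "RK")    # arg + lys
    acidic = count_residues(MOUSE_PRND, "DE")   # asp + glu
    print("length:", len(MOUSE_PRND))
    print("arg + lys:", basic)
    print("asp + glu:", acidic)
    print("basic minus acidic:", basic - acidic)
```

An excess of arg + lys over asp + glu is what pushes the predicted pI into the basic range.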
The ORF runs from position 36212-36748 of the long incubation mouse sequence, coding for 179 amino acids. The human sequence is 74% identical and 85% similar to mouse (the EST situation for human is of limited quality, leaving its N-terminus uncertain). These genes are unlikely to be pseudogenes because polyadenylated mRNA is recovered from various tissues (heart, mammary, testis); mouse and human proteins are still strongly homologous after nearly 100 million years (indicating selective pressure); and there is a long open reading frame with initiating methionine and terminal ochre (taa) stop codon.
>mouse prnd protein
MKNRLGTWWVAILCMLLASHLSTVKARGIKHRFKWNRKVLPSSGGQITEARVAENRPGAFIKQGRKLD
IDFGAEGNRYYAANYWQFPDGIYYEGCSEANVTKEMLVTSCVNATQAANQAEFSREKQDSKLHQRVLW
RLIKEICSAKHCDFWLERGAALRVAVDQPAMVCLLGFVWFIVK
>human prnd protein
.......WWLATVCMLLFSHLSAVQTRGIKHRIKWNRKALPSTAQITEAQVAENRPGAFIKQGRKLD
IDFGAEGNRYYEANYWQFPDGIHYNGCSEANVTKEAFVTGCINATQAANQGEFQKPDNKLHQQVLW
RLVQELCSLKHCEFWLERGAGLRVTMHQPVLLCLLALIWLMVK
>human prnd protein from contig 00023
MRKHLSWWWLATVCMLLFSHLSAVQTRGIKHRIKWNRKALPSTAQITEAQVAENRPGAFI
KQGRKLDIDFGAEGNRYYEANYWQFPDGIHYNGCSEANVTKEAFVTGCINATQAANQGEF
QKPDNKLHQQVLWRLVQELCSLKHCEFWLERGAGLRVTMHQPVLLCLLALIWLtVK
>mouse prnd U29187 36212-36748+stop or AA796652-AA104098 tiled ESTs
atg aagaaccggctgggtacatggtgggtggccatcctctgcatgctgcttgccagccac
ctctccacggtcaaggcaaggggcataaagcacaggttcaagtggaaccggaaggtcctgc
ccagcagcggcggccagatcaccgaagctcgggtagctgagaaccgcccaggagccttcat
caagcaaggccggaagctggacatcgactttggagcagagggcaacaggtactacgcggct
aactattggcagttccctgatgggatctactacgaaggctgctctgaagccaacgtgacca
aggagatgctggtgaccagctgcgtcaacgccacccaggcggccaaccaggctgagttctc
ccgggagaagcaggatagcaagctccaccagcgagtcctgtggcggctgatcaaagagatc
tgctccgccaagcactgcgatttctggctggaaaggggagctgcgcttcgggtcgccgtgg
accaaccggcgatggtctgcctgctgggtttcgtttggttcattgtgaag taa
>human prnd from AL042906 and subsequence AA234322
GGAAGCACCTGTATCTGGTGGtggctggccactgtctgcatgctgctcttcagccacctc
tctgcggtccagacgaggggcatcaagcacagaatcaagtggaaccggaaggccctgccc
agcactgcccagatcactgaggcccaggtggctgagaaccgcccgggagccttcatcaag
caaggccgcaagctcgacattgacttcggagccgagggcaacaggtactacgaggccaac
tactggcagttccccgatggcatccactacaacggctgctctgaggctaatgtgaccaag
gaggcatttgtcaccggctgcatcaatgccacccaggcggcgaaccagggggagttccag
aagccagacaacaagctccaccagcaggtgctctggcggctggtccaggagctctgctcc
ctcaagcattgcgagttttggttggagaggggcgcaggacttcgggtcaccatgcaccag
ccagtgctcctctgccttctggctttgatctggctcatggtgaaataa
Blast alignment of mouse and human prnd
Score = 278 bits (704), Expect = 2e-74
Identities = 127/171 (74%), Positives = 146/171 (85%)
Frame = +1
Query: 9 MKNRLGTWWVAILCMLLASHLSTVKARGIKHRFKWNRKVLPSSGGQITEARVAENRPGAFIKQGRKLD 68
......WW+A +CMLL SHLS V+ RGIKHR KWNRK LPS+ QITEA+VAENRPGAFIKQGRKLD
Sbjct: 1 STCIWWWLATVCMLLFSHLSAVQTRGIKHRIKWNRKALPSTA-QITEAQVAENRPGAFIKQGRKLD 177
Query: 69 IDFGAEGNRYYAANYWQFPDGIYYEGCSEANVTKEMLVTSCVNATQAANQAEFSREKQDS 128
IDFGAEGNRYY ANYWQFPDGI+Y GCSEANVTKE VT C+NATQAANQ EF +K D+
Sbjct: 178 IDFGAEGNRYYEANYWQFPDGIHYNGCSEANVTKEAFVTGCINATQAANQGEF--QKPDN 351
Query: 129 KLHQRVLWRLIKEICSAKHCDFWLERGAALRVAVDQPAMVCLLGFVWFIVK 179
KLHQ+VLWRL++E+CS KHC+FWLERGA LRV + QP ++CLL +W +VK
Sbjct: 352 KLHQQVLWRLVQELCSLKHCEFWLERGAGLRVTMHQPVLLCLLALIWLMVK 504
A Wu-Blast alignment suggests the homology could even be more extensive. If the repeat is left out and the signal region considered a parallel not needing sequence resemblance, then the alignment becomes closer to global:
Score = 108 (38.0 bits), Expect = 0.00023, P = 0.00023
Identities = 40/155 (25%), Positives = 77/155 (49%), Frame = +2

Query:  27 KWNRKALPST-----AQITEAQVAENRPGAFI---KQGRKLDIDFGAE-GNRYYEANYWQ 77
           +WN+ + P T     A    A       G ++      R + I FG++  +RYY  N  +
mprnp: 341 QWNKPSKPKTNMKHMAGAAAAGAVVGGLGGYMLGSAMSRPI-IHFGSDYEDRYYRENMHR 517

Query:  78 FPDGIHYNGCSEANVTKEAFVTGCINAT------QAANQGE-FQKPDNKLHQQVLWRLVQ 130
           +P+ ++Y    E +  +  FV  C+N T          +GE F + D K+ +    R+V+
mprnp: 518 YPNQVYYRPMDEYS-NQNNFVHDCVNITIKQHTVTTTTKGENFTETDVKMME----RVVE 682

Query: 131 ELCSLKH---CEFWLERGAGLRVTMHQPVLLCLLALIWLMV 168
           ++C  ++    + + +RG+ + +    PV+L +  LI+L+V
mprnp: 683 QMCITQYERESQAYYQRGSSMVLFSSPPVILLISFLIFLIV 805

Now the prnd gene is well-established by several independent criteria, regardless of any similarity to prion protein. Homology to prion protein is not particularly striking at 22% identity and 55% similarity and a 0.03% chance of finding a match of this quality by chance at the current size of the non-redundant GenBank database (462,722 sequences, 1,273,519,603 bp). These are low levels of similarity. Yet it cannot be coincidence that the prnd Blast probe pulls out 102 prion sequences and nothing else from this vast database, especially when the Blast tool does not know that the context is chromosome 20 adjacency to the prion gene.
There are no other sequences in the databases homologous to DHAP proteins or DNA. Bird prions do not consistently show significant homology, suggesting that the gene dates to the mammalian lineage. The question arises, how can the low percent identity and slow evolutionary change be reconciled with a late origin of this domain? (Note the region of sequence overlap does indeed amount to almost the full globular domain of the prion protein, ie alignment occurs shortly after the 106-126 region at no later than 139 in human numbering.) If the partial gene doubling or domain module switching occurred prior to bird/mammal split, sequence divergence is better explained.
The 3D structure is not as hopeless to predict as it first seems, due to conserved anchors and alpha helix, though it is worse than threading chicken onto mammal. The purported 28569-28608 promoter, 3' UTR, and polyA site and signal could be studied further and the meagre EST situation in humans is error-salvageable 5' though still unsatisfactory and a poor substitute for directed experimental effort. No homologue is apparent in nematode or drosophila, which is aggravating as far as normal function of prnd/prp or easier experimental models.
The prnd ORF should be sequenced in at least 3 species best chosen dynamically to minimize sequencing effort with respect to information gained. The proper choice of species depends very much on the rate of evolution of the particular protein and its individual domains of interest. Some blast and secondary structure tools are far more effective given a clustalw alignment.
The conserved residues are of interest: mouse prnd protein retains the first NVTK glycosylation site (but not the second, though it has a second NATQ site), the two disulphide cysteines, many of the aromatic amino acids, and seemingly the GPI terminus (or at least a membrane tail):
CVNATQAANQAEFSREKQDSKLHQRVLW
RLIKEICSAKHCDFWLERGAALRVAVDQPAMVCLLGFVWFIVK
Psort and SignalP 2.0 predict a possible signal region ending at position 26 (MKNRLGTWWVAILCMLLASHLSTVKA/RGIKHRF), with a possible transmembrane tail from positions 162-178, and the protein otherwise located in the cytoplasm. Some alpha helix is predicted but little beta sheet:
The function of this mRNA and protein is totally unknown. There is no sign of a chromosomal tandem duplication on streak tests. It cannot be the 'anti-prion' because the gene is on the same strand so its mRNA would not be complementary to prion mRNA nor would it hybridize stringently. No amount of forcing causes protein homology to the earlier region of the prion protein; however, weak homology and a long intron are hard to rule out. 35,341 is the end of the last known transposon; 37,553 begins the next one: this avoids conflict.
The next easy test is seeking CpG islands that correlate with the protein homology domain. This is most easily done by replacing CpG with XX in a colored font. The prion gene itself has a CpG island to serve as internal control. There is enrichment for CpG around 36300, beginning at 36241, but it is not dramatic; it could work for a simple gene with no untranslated leading exons. This area has 26 CpG within 424 bp.
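The CpG masking and counting described above is easy to script. The following is a minimal sketch with hypothetical function names; the observed/expected ratio is the conventional measure (CpG count times length, divided by the product of C and G counts) and is offered as an assumption about how one might quantify "enrichment", not as the procedure used for the figures in the text.

```python
def count_cpg(seq):
    """Number of CG dinucleotides on the given strand."""
    return seq.upper().count("CG")


def mask_cpg(seq):
    """Replace every CG with XX so islands stand out by eye."""
    return seq.upper().replace("CG", "XX")


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CG * N) / (#C * #G)."""
    s = seq.upper()
    c, g = s.count("C"), s.count("G")
    if c == 0 or g == 0:
        return 0.0
    return count_cpg(s) * len(s) / (c * g)


def densest_window(seq, window):
    """Return (0-based start, CpG count) of the first richest window."""
    s = seq.upper()
    best_start, best_count = 0, -1
    for start in range(0, max(1, len(s) - window + 1)):
        n = s[start:start + window].count("CG")
        if n > best_count:
            best_start, best_count = start, n
    return best_start, best_count


if __name__ == "__main__":
    toy = "aaaacgcgaaaattcgaa"   # toy sequence, not taken from U29187
    print(mask_cpg(toy))
    print(count_cpg(toy), round(cpg_obs_exp(toy), 2))
    print(densest_window(toy, 4))
    # For scale: 26 CpG in 424 bp is roughly 6 CpG per 100 bp.
    print(round(100 * 26 / 424, 1))
```

Run over the genomic entry with a window of a few hundred bp, the same functions would allow the prnd region to be compared against the prion gene's own island as the internal control.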
One could also look for promoters, splice boundaries, and polyA signals to complete the picture of this gene. GenScanW predicts a 40 bp promoter at 28569-28608, a single exon gene at 36212-36751 of 540 bp (or 179 amino acids) and a polyA at 37877-37882 of sequence AATAA. This is very impressive agreement with alignment methods.
Interestingly, this software does not quite predict the main prion gene, but rather a 56 aa proximal extension (with no Blastp neighbors and no apparent significance):
GENSCAN prediction of normal prion
mskifvtnfllpkfndgflaplapacpapfhsrlprvvgsadrfwalrriggrsviMANL
GYWLLALFVTMWTDVGLCKKRPKPGGWNTGGSRYPGQGSPGGNRYPPQGGTWGQPHGGGW
GQPHGGSWGQPHGGSWGQPHGGGWGQGGGTHNQWNKPSKPKTNFKHVAGAAAAGAVVGGL
GGYMLGSAMSRPMIHFGNDWEDRYYRENMYRYPNQVYYRPVDQYSNQNNFVHDCVNITIK
QHTVVTTTKGENFTETDVKMMERVVEQMCVTQYQKESQAYYDGRRSSSTVLFSSPPVILL
ISFLIFLIVG

To sum up, a viable gene has been found as the next-door neighbor of the mammalian prion gene. Further, it bears a faint resemblance distally to the globular domain of the prion protein itself, ending the orphan status of this protein. Whether this domain exercises a similar function as a module of the new protein, interacts (say in a heterodimer) in any way with normal prion protein, or affects the prion disease process remains to be established.
At this point, the most useful information would be sequences of the prnd protein from a few more species, tissue expression (especially brain), immunolocalization in vivo, and perhaps knockouts and recombinant protein production and interaction studies. The partial paralogue at this point has yielded no information about normal function of the prion protein nor does it quite illustrate the paralogous Gibbs principle lacking the 106-126 domain.
29 Aug 99 webmaster commentary
As noted above, alignment of long and short incubation period mouse genes is quite worthwhile for purposes of comparing laboratory strains.
The new long sequence begins 6204 bp above exon 1 versus 8,611 for the old short sequence. The mRNAs are 2,154 bp and 2,153 bp in length and differ only in 2 intermediate positions in the 3' UTR (other than the well-known non-synonymous coding changes). Prior to exon 1, aligning the comparable regions gives identities = 6109/6216 (98%) and gaps = 58/6216. The gaps largely represent replication slippage tandem repeats, eg:
Query: 1021 ggtgtggccatctgcatctggtatctggtctgaaggtgcgtggataccctctgtgcccgt 1080
            ||||||||||||||||||||||       |||||||||||||||||||||||||||||||
Sbjct: 3462 ggtgtggccatctgcatctggt-------ctgaaggtgcgtggataccctctgtgcccgt 3514

Query: 1856 gacacgcatggatacacatacacacacacacacacacacacacacacacacacacacaca 1915
            ||||||||||||||||||                ||||||||||||||||||||||||||
Sbjct: 4295 gacacgcatggatacaca----------------cacacacacacacacacacacacaca 4338
In the promoter region, a run of perfect agreement over nearly 500 bp surprisingly gives way to 4 small changes near exon 1. It is difficult to assess whether these are important to regulation of transcription but the main motif region is conserved. Recall the promoter region in rodents may be delimited by the 1245 extraneous bp of rat cytochrome c pseudogene beginning 452 bp upstream of exon 1 (see graphic or alignment of known pre-exon 1 sequences).
long : 6049 agcatttaagccagtccggagcggtgactcatccccccccacccccacccccccgcgaga 6108
short:......................................t.....a.-................... 8515
long : 6109 gacgcggcgcggccattggtgagcatcacgccccgcccctcgccccgcctagctcccgcc 6168
short:...................................................a.............. 8575
possible AP-2 site
long : 6169 tgccccgcccctttccactcccggctcccccgcgtt 6204
short:.......................................... 8611
Alignment of intron 1 is excellent, identities = 2184/2191 (99%), gaps = 2, with no changes near splice boundaries. Intron 2 also aligns well with the exception of the IAP region; the retrotransposon structure is otherwise identical.
Looking downstream of the polyA site at 7732 bp of comparable region, the identity stays high to the end at 99% [position 29,358 in the new sequence]. The retrotransposon structures agree except for long incubation mouse having MLT2D of 186 and for differences in simple repeats. More interestingly, the new terminal stretch sequenced in long incubation mouse only has an L1ME3 of 92 bp, an L1ME3A of 233 bp, a URR1A of 212 bp, and a PB1D9 of 113 bp. From the annotation, we know how to account for only 650 bp of the residual 10,403 bp (6.2%). This compares fairly well to retrotransposons found by the Censor server.
GenBank annotation:36748
30254 30345 L1ME3 896 987 92 bp
30341 30573 L1ME3A 983 1215 233
35130 35341 URR1A 5772 5983 212
37553 37665 PB1D9 8195 8307 113

Censor server:
897 987 L1ME2 449 +
1086 1157 L1ME2 671 +
3666 3684 (CAAA) 20 +
5344 5369 (TTTTTG) 32 +
5772 5983 URR1 236 +
8159 8203 RSINE2 45 -
8223 8307 B1 85 -
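The bookkeeping behind the 650 bp and 6.2% figures, and the shift between GenBank coordinates and coordinates relative to the start of the new region (29358), can be reproduced with a few lines. This is a sketch using the annotation values listed above; the helper names are hypothetical.

```python
REGION_START, REGION_END = 29358, 39760   # new stretch of long incubation mouse

# (name, GenBank start, GenBank end) from the annotation, 1-based inclusive.
ELEMENTS = [
    ("L1ME3", 30254, 30345),
    ("L1ME3A", 30341, 30573),
    ("URR1A", 35130, 35341),
    ("PB1D9", 37553, 37665),
]


def length(start, end):
    """Length of an element with inclusive coordinates."""
    return end - start + 1


def relative(pos):
    """Coordinate relative to the first base of the new region."""
    return pos - REGION_START


def total_element_bp(elements):
    """Sum of element lengths, as in the text (overlaps are not merged)."""
    return sum(length(s, e) for _, s, e in elements)


if __name__ == "__main__":
    region_bp = length(REGION_START, REGION_END)
    for name, start, end in ELEMENTS:
        print(name, relative(start), relative(end), length(start, end))
    total = total_element_bp(ELEMENTS)
    print(total, "of", region_bp, "bp =", round(100 * total / region_bp, 1), "%")
```

Note that the annotated L1ME3 and L1ME3A intervals overlap by a few bases; the simple sum counts that overlap twice, as the figure in the text does.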
GenBank AF163764
28 Aug 99 webmaster commentary
This is a 25 Aug 99 GenBank entry for the bovine prion gene that contains upstream regions never sequenced before. No journal article has appeared yet. The work was done by Follet, J., Schulze, T., Cesbron, J.Y. and Lemaire-Vieille, C. of the Laboratoire de Physiopathologie des Affections Neurodegeneratives Transmissibles, Institut Pasteur in Lille, France.
The region sequenced covers 6923 bp, most of which (4329 bp) are 5' of exon 1ab. The sequence stops at the end of exon 2 and so does not contain intron 2 or the ORF. While we can only guess at what will be in the article, it seems clear that the focus will be the alternative splicing of exon 1 described in the entry [exon 1a: 4330-4382 = 53 bp versus exon 1b: 4330-4497 = 168 bp]. Tissue specificity in alternative splice site use was likely studied; it is possible that prion protein levels and BSE susceptibility correlate with greater use of one alternative.
Recall that no other mammal, including closely related sheep, is known to exhibit alternative splicing. The longer splice variant is the anomalous one in the comparative sense. This same alternative splicing was characterized in 1997 in bovines by Horiuchi M et al who found both mRNAs used in all tissues except spleen and found similar in vitro translation efficiencies. Exon 1 has been aligned from all species available (as has exon 2) and the hypothetical unutilized ovine 1b analyzed by homology.
So the question arises, what is new here? This region of the bovine prion gene was sequenced in 1992 by Yoshimoto et al. D26150 covers 3404 bp, 802 bp upstream of exon 1 to 9 bp past exon 2, making it a subset of the new sequence if the intron 2 portion is not considered. Restricting gapped Blastn to bovine with the older sequence as query shows a disturbing number of differences to the new sequence [region 3525-6923], identities = 3353/3413 (98%), gaps = 32/3413. Are these sequence errors (and if so, whose?) or bovine breed differences (Japan and France) or rapid evolution of intronic sequence?
Exon 1a itself is identical in all 4 bovine GenBank entries; however exon 1b shows 4 changes from D26150, though none are near splice boundaries:
AF163764 4330 gccagtcgctgacagccgcagagctgagagcgtcttctctctcgcagaagcaggtaaata 4389
D26150    803 ............................................................ 862

AF163764 4390 gccgcgtagtcctttaaactcccagcggaggacgcccaaccctgggtcttgcagccgagg 4449
D26150    863 ..................................-.................g....... 921

AF163764 4450 ccccagggcacccagccgaatcggattggtgggaggcagaccttgacc 4497
D26150    922 -.......-....................................... 967
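The count of 4 changes can be read straight off the dot notation. The sketch below assumes the usual convention for such alignments ("." means identical to the top sequence, "-" a gap, a letter a substitution); the rows are copied from the D26150 lines above and the function name `count_differences` is made up for illustration.

```python
# D26150 rows in dot notation, copied from the exon 1b alignment above.
D26150_ROWS = [
    "............................................................",
    "..................................-.................g.......",
    "-.......-.......................................",
]


def count_differences(rows):
    """Return (gaps, substitutions) found in dot-notation alignment rows."""
    gaps = sum(row.count("-") for row in rows)
    substitutions = sum(1 for row in rows for ch in row if ch not in ".-")
    return gaps, substitutions


gaps, substitutions = count_differences(D26150_ROWS)
print(f"{gaps} gaps + {substitutions} substitution(s) = {gaps + substitutions} changes")
```

On these rows the 4 changes break down into three single-base gaps and one substitution, all past the exon 1a splice point.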
Exon 2 also exhibits 4 substitutions. Here the new sequence differs from three other bovine exon 2 sequences and also from sheep (lower two sequences). This level of change in a conserved region suggests sequencing error in the new posting -- surely two breeds of cow do not differ more from each other than they do from sheep.
AF163764 6826 gacttctgaatatatttgcaaactgaacagtttcaaccgccccgaagcatctgtcttccc 6885
D10612     54 ..................a...................aag................... 113
D26150   3298 ..................a...................aag................... 3357
AB001468   54 ..................a...................aag................... 113
X79913     56 ..................a...................aag.t................. 115
U67922   8139 ..................a...................aag.t................. 8198

AF163764 6886 agagacacaaatccaacttgagctgaatcacagcagat 6923
D10612    114 ...................................... 151
D26150   3358 ...................................... 3395
AB001468  114 ...................................... 151
X79913    116 .........g............................ 153
U67922   8199 .........g............................ 8236
It is of interest to compare the 4329 bp region upstream of exon 1 of the new bovine sequence to sheep U67922 [1321-5665 = 4345 bp], also determined by Lee et al for a long distance 5' of exon 1. Here the alignment extends for the full length of the shorter bovine sequence at 95% identity, ie, the retrotransposon structure is identical. There is one significant 28 bp gap in bovine beginning at position 2736, corresponding to CCTCA GACAC TGAGT CTTCC CAACA GCA in sheep.
Analysis of bovine retrotransposons using J Jurka's Censor (shows general mammalian elements only):
425 796 LINE2
797 866 MER5A
1177 1238 MER5A
1272 1365 LINE2
1712 1808 LINE2
2330 2624 MLT1G
2626 2752 MLT1G2
2753 2874 MLT1G
3578 3665 MER94
Sheep GenBank annotation:
1745..2082 LINE2
2092..2203 MER5A
2194..2492 Bov-B
2482..2561 MER5A
2590..2683 LINE2
2684..2842c Bov-tA2
3011..3174 LINE2
3650..3942 MLT1G
3756..4215 MLT1F
PNAS Vol. 96, Issue 17, 9745-9750, August 17, 1999
Locating and sequencing the prion gene in fish is long overdue. The sequence would greatly illuminate the origin of this protein, its normal function, and the timing of emergence of key domains. Enough of the 106-126 region may still be present to cause concern in intensive fish farming, as inferred from antibody binding to the 112 region.
RH mapping, a familiar method from the human genome project, creates a panel of markers produced by fusion of irradiated zebrafish chromosomes with mouse cells. The authors characterized 849 simple sequence length polymorphism markers, 84 cloned genes and 122 expressed sequence tags. These allowed the production of an RH map, with an average of one breakpoint per 148 kilobases, covering 88% of the zebrafish genome. Comparison of marker positions in RH and meiotic maps indicated a 96% concordance. Mapping expressed sequence tags and cloned genes will help identify candidate genes for specific mutations in zebrafish.
The prion gene was not one of their markers (even though the aunt of a lead author died of CJD in 1997). However, as the webmaster noted on 16 Dec 98, by looking at the human RH map for enough genes nearby (some of which _were_ zebrafish markers), synteny could safely be inferred between human chr 20 p12-pter and linkage group 17 (or its tetraploidization doppelganger, LG 20) in zebrafish. Fish experienced a tetraploidization event, with the other copy of the prion gene landing on LG20. Whether both copies were retained, or whether they diverged in function as paralogues, will be an interesting question.
Looking now at the searchable databases (1, 2, 3) in conjunction with Figure 1A of the PNAS paper we see a much improved map of LG17 and the associated small clone lines that hopefully carry the fish prion gene.
19 Aug 99 webmaster
Could a bee sting cause scrapie? Yes, indeed -- there was an apparent near miss 23.6 million years ago in a common ancestor of sheep and cow -- a retrotransposon event that might have boosted prion protein production to levels fostering sporadic TSE.
Ruminants carry a 1220 bp mariner transposon in the 3' UTR portion of their prion mRNA. This element, with its terminal inverted repeats, is described by Lee as a fossil transposase pseudogene with homology to the Mellifera (bee) subfamily. It is probably an old insertion shared by all ruminants since it has 7-8 frameshifts and 5 stop codons -- figure 3 of the Lee paper shows a guided translation and the correct flanking human gene alignment. The insertion in cow/sheep occurred between 27587 and 27588 in terms of human 3' UTR numbering, just downstream of the Bov-tA3, greatly increasing the length of ruminant mRNA.
Dating the 3 retrotransposon insertions in ruminant prion 3' UTR mRNA is accomplished by simply aligning cow and sheep 3' UTR mRNA sequences in ClustalW and comparing the rate of fixation of mutations in the retrotransposon regions to that of the main 3' UTR mRNA. An estimate of when the insertion events took place relative to cow/sheep divergence can then be obtained by simple proportionality. The main source of error is selection distorting the rates of fixation -- some regions important to mRNA are no doubt conserved whereas retrotransposons in general are less constrained. Note however that the 2.1k polyA signal is quite close (65 bp) to the start of the first retrotransposon and that many regions of non-transposon mRNA are poorly conserved. There are no saturation effects -- the rates are so low that multiple hits at the same site are insignificant. Gaps are treated as a fifth base.
Since the cow/sheep divergence, there have been about 60 mutational events fixed in evolution in the 1503 bp of the main mRNA 3' UTR. Restricting attention to the 56 simple events (= 3.73% difference or 96.27% identity), the rate is 3.73 events per 100 bp since divergence:
47 changes of form *x*
7 changes of form *xy*
1 *xyz*
2 *xyzq*
3 *xyzqr*
Aligning the 3 retrotransposons gives rates of
Bov-B LINE (387 bp): 21 events in 385 bp, for a rate of 5.45 per 100 bp
Bov-tA3 SINE (159 bp): 11 events in 159 bp, for a rate of 6.91 per 100 bp
OaMAR1 (mariner) (1,220 bp): 54 events in 1223 bp, for a rate of 4.41 per 100 bp
Note first that these rates are consistent with the requirement that the retrotransposon insertions occurred in a common ancestor. The relative rates within the 3 retrotransposons give a relative order of insertion: the Sine element earliest, followed by the Line element, and most recently the mariner element. These scale as (Line:Sine:mariner)=(1.23:1.56:1). Since none of these were present at the time ruminants diverged from carnivores, the events are bounded by that divergence. For illustrative purposes, assuming 40 million years as the date of insertion of the mariner element gives 49.2 my as the date of the Line element and 62.4 my for the Sine.
The divergence of cow and sheep may be taken as 20my. The relative mutation rates of (mRNA:Line:Sine:mariner) = (1:1.46:1.85:1.18) then scale to 20my, 29.2my, 37my, and 23.6my for purposes of dating the events:
Bov-B LINE 29.2 my
Bov-tA3 SINE 37.0 my
Mariner 23.6 my
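The proportionality argument is short enough to write out as a Python sketch. The event counts, aligned lengths and the 20 my cow/sheep divergence are the figures quoted above; the names `rate_per_100bp` and `insertion_age` are illustrative, and the clock is assumed to run at a constant neutral rate, with all the caveats about selection already mentioned.

```python
# Events and aligned lengths quoted in the text (cow versus sheep 3' UTR).
MAIN_UTR = (56, 1503)            # simple events, bp of non-transposon 3' UTR
ELEMENTS = {
    "Bov-B LINE": (21, 385),
    "Bov-tA3 SINE": (11, 159),
    "OaMAR1 mariner": (54, 1223),
}
COW_SHEEP_DIVERGENCE_MY = 20.0   # divergence date assumed in the text


def rate_per_100bp(events, length):
    """Fixed events per 100 aligned bp."""
    return 100.0 * events / length


def insertion_age(events, length, reference=MAIN_UTR,
                  divergence=COW_SHEEP_DIVERGENCE_MY):
    """Scale the divergence date by (element rate / main 3' UTR rate)."""
    return divergence * rate_per_100bp(events, length) / rate_per_100bp(*reference)


ages = {name: insertion_age(*counts) for name, counts in ELEMENTS.items()}
for name, age in sorted(ages.items(), key=lambda item: item[1], reverse=True):
    rate = rate_per_100bp(*ELEMENTS[name])
    print(f"{name}: {rate:.2f} per 100 bp, about {age:.1f} my")
```

Carried through without intermediate rounding, the dates come out about a tenth of a million years above the tabulated ones, which were built from ratios truncated to two decimals. The order of insertion (Sine, then Line, then mariner) is unaffected.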
This predicts that all 3 retrotransposons will be present in deer and elk mRNA (the mariner element just barely) but not in cetaceans, pigs, or camels, since the bovid/cervid divergence is said to be 21my and whales diverged at 60my. Thus the mariner insertion event -- resulting from a rare bee sting long ago that transferred DNA into an unfortunate ruminant -- might be quite useful in resolving or confirming details of artiodactyl divergence (giraffe, gazelle, etc.).
One can only wonder what happened to tissue-specific regulation of translation of the prion gene subsequent to this event. Or if a bee sting today could cause scrapie.
Human prion mRNA corresponding to the less-used 2.5k mRNA species aligns fairly well with the sheep 2.1 polyA site:
sheep:   613 tgttt-aagca-cct-tcaagtgatattcctttctttagtaacataaagtatagataatt 669
             ||||| ||| | ||| | | |||  |||||||||||||  |||      ||||| |||||
human: 26979 tgtttaaaggaccctat-atgtggcattcctttcttta--aac------tataggtaatt 27029

sheep:   670 aaggtacct--taattaaactaccttctagacactg-agagcaaat 712
             |||| | ||   || |||| | |||||||||||||| || ||||||
human: 27030 aaggcagctgaaaagtaaattgccttctagacactgaag-gcaaat 27074
It is interesting to compare this dating method to that recently used by HS Lee et al to track the origins of E200K founding mutations, Am J Hum Genet 1999 Apr;64(4):1063-1070. Here, neighboring microsatellite markers were used to separate distinct mutational events, which could then be dated by known historic diasporas and point changes as here. The results are somewhat muddled however by ambiguous haplotypes, slippage events, and recombination -- radiation hybrid panel markers were not used. Point mutations such as E200K are harder to date because the same one can occur over and over, especially in CpG sites, as well as revert, whereas retrotransposon bursts are irreversible and uniquely clocked.
Arch Neurol 1999 Aug;56(8):951-7 Na DL, Suh CK, Choi SH, Moon HS, Seo DW, Kim SE, Na DG, Adair JC
CJD is a rare transmissible disease that typically causes a rapidly progressive dementia and leads to death in less than 1 year. Although a few anecdotal reports suggest that diffusion-weighted magnetic resonance imaging may help substantiate premortem diagnosis of CJD, detailed correlation between radiographic data and clinical, electrophysiologic, and metabolic parameters is not available.
Signal abnormalities on diffusion-weighted images in 3 consecutive patients with probable CJD were correlated with psychometric features, electroencephalographic findings, and functional images with either positron emission tomography or single photon emission computed tomography.
Focality of abnormalities on diffusion-weighted image, not apparent on routine magnetic resonance images, correlated closely with clinical manifestations of CJD. The topographic distribution of signal abnormality on diffusion-weighted image corresponded with abnormal metabolism or perfusion on positron emission and single photon emission computed tomographic scans. In 2 cases, the laterality of diffusion abnormalities correlated with periodic sharp wave activity on electroencephalograms. These findings extend previous observations that suggested a diagnostic and localizing utility of diffusion-weighted imaging in CJD.
Reuters North America Wed, Aug 18, 1999
Scientists may be getting closer to diagnosing the human equivalent of mad cow disease with a simple brain scan, New Scientist magazine said Wednesday. The new variant of Creutzfeldt-Jakob Disease (nvCJD), a brain-wasting illness, is difficult to diagnose and is usually only confirmed with an autopsy.
But researchers at the Royal Victoria Infirmary in Newcastle upon Tyne in northern England and the National CJD Surveillance Unit in Edinburgh, Scotland, said scarring deep in the brain's posterior thalamus could be a sign of the disease. That part of the brain controls sensory information like hearing, touch and vision.
"This scarring shows up as increased signal intensity on magnetic resonance images (MRI) of the brain, and isn't seen in forms of CJD not linked to the consumption of BSE (bovine spongiform encephalopathy, or mad cow disease) infected beef," the magazine said.
But picking up the scarring on an MRI is difficult and takes a trained eye. Detection is made harder because many radiologists see only one case of nvCJD in several years.
To overcome the problem, Alan Coulthard and his colleagues in Newcastle upon Tyne developed a standardized method to read MRI scans. To devise the method, they compared scans of three nvCJD patients with 14 other people with no neurological problems. The study was small so the researchers are not sure how reliable the test is but they are planning a larger study that they hope will confirm their results.
21 Aug 99 webmaster
The prion literature from the early 1990s contains various bizarre papers relating to an open reading frame on the anti-sense strand (that could be revisited using the many dozens of new species sequenced) and a polyadenylated 4.5 kb mRNA from brain (but not liver, spleen, or lung) that hybridized under stringent conditions with prion mRNA. The last of these papers showed that the knockout mice still had the 4.5 kb mRNA, ie, it originates at a second locus. Expression is unchanged in TSE. The 4.5 kb species has been sought and found in mouse, hamster, and bovine. It has never been cloned or sequenced.
This RNA presumably is mRNA for some secondary gene because it is not found experimentally in non-polyadenylated RNA. The boundaries of hybridization are poorly established. Probe A of Moser et al. raises concerns about mere retrotransposons 5' in hamster (but not 3', see below); however probe B is completely within the ORF, running from the NcoI to the HincII restriction site of hamster, or approximately positions 190-460 of the ORF.
Now mouse brain has been the subject of tens of thousands of sequencing runs to determine expressed sequence tags (ESTs), ie, cDNA made with oligo dT primers from the mRNA of expressed genes. The 4.5 kb mRNA, which appears to be abundant, should thus appear in the mouse EST database at GenBank. The 4.5 kb mRNA has not been sought in humans but is presumably present, given its expression in a range of other mammals, putting it in the even larger human brain EST collection.
However, RT PCR runs in the EST databases do not generally extend for more than a few hundred base pairs, ie, the last 10% of the 4.5 kb mRNA. Since the extent and region of the 4.5 kb mRNA hybridizing to prion mRNA has never been determined, it is possible that the hybridizing portion of the 4.5 kb species is upstream of the region that would be in the EST collection. Probe B is however a favorable query because few prion ESTs extend past the 1606 bp 3' UTR into the probe B region. Probe B could be extended in the 5' direction (omitting prion introns) to pick up more of the 3' region of the 4.5 kb mRNA EST sequence. However, this may not be enough to find the 4.5 kb species even if it is in the EST collection.
Note that the 4.5 kb mRNA is + (complementary to - prion mRNA), so cDNA made from it is -, and matches of a prion DNA probe with its cDNA EST sequences are +/+, unlike the +/- matches of prion DNA with prion mRNA cDNA. (The EST database is actually mixed.) This is irrelevant to Blastn searching, which allows both senses, but it sharply restricts the number of sequences that need be examined in detail.
The 4.5 kb mRNA's gene could also have been completely sequenced in the course of human or mouse genomics projects; the former is scheduled to be completed by spring 2000. This would appear as an imperfect +/- hit on a Blastn search using probe B. It is not clear how to translate stringent hybridization into a Blast score.
Note that any protein made from the 4.5 kb mRNA would not have any sequence similarity to prion protein, no matter what its reading frame or ORF location. However, Blastx compares a nucleotide query sequence translated in all reading frames against a protein sequence database, so prion mRNA as a probe would search for the 4.5 kb protein as well. However, the part that hybridizes with prion mRNA may not be translated into protein, ie, could correspond to 5' or 3' UTR.
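What "translated in all reading frames" means can be shown with a small Python sketch of six-frame translation: three frames on the query strand and three on its reverse complement, which is why a nucleotide probe also covers the opposite-sense strand. This is not Blastx itself, only the translation step under the standard genetic code; the function names and the 9-base toy query are hypothetical and have nothing to do with probe B.

```python
# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the opposite strand, read 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from position 0; unknown codons become X."""
    dna = dna.upper()
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frames(dna):
    """Map frame labels (+1..+3, -1..-3) to their conceptual translations."""
    frames = {}
    for strand, seq in (("+", dna.upper()), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


# Hypothetical toy query, for illustration only.
for label, protein in six_frames("ATGGCCAAG").items():
    print(label, protein)
```

The minus frames are the ones relevant to an anti-sense transcript: a protein encoded by the 4.5 kb species in the hybridizing region would show up, if at all, in frames -1 to -3 of a prion query.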
Protein made from this alternative gene would have nothing in common with the prion protein sequence, properties, or regulation, though interference of mRNAs remains an issue. If statistics rules out a fluke occurrence of a hybridizing species and no regulatory role is found, then the likeliest explanation is that a prion pseudogene has become integrated into another gene, oriented so that the prion minus strand has become (a harmless?) part of the mRNA of the second gene, just as the mariner pseudogene appears in the ruminant prion 3' UTR mRNA.
On 21 Aug 99, dozens of Blastn and Blastx searches using probe B and its extensions with a range of parameter options failed to turn up a strong candidate for the 4.5 kb species. So this will have to be revisited as the human genome project progresses.
Nature 1991 Jul 25;352(6333):291-2 Hewinson RG et al [omitted at Medline]
Comment on: Nature 1991 Feb 14;349(6310):569-71 [SEs: The prion's progress. Weissmann C]
Comment on: Nature 1991 May 9;351(6322):106 [Anticipating the anti-prion protein? Goldgaber D]
This paper was the first to show the 4.5 kb mRNA.
Nature 1993 Mar 18;362(6417):213-4 Moser M, Oesch B, Bueler H
This paper showed that the polyadenylated 4.5 kb mRNA was present in hamster and in mouse knockouts.
Neurobiol Aging 1999 Jan-Feb;20(1):53-6 Nakayama H, Katayama K, Ikawa A, Miyawaki K, Shinozuka J, Uetsuka K, Nakamura S, Kimura N, Yoshikawa Y, Doi K
A male great spotted woodpecker (Picoides major), which was at least 16 years old, died due to general weakening. Cerebral vascular walls, including capillaries, were positively stained with Congo red with green-gold birefringence, and some of them showed a severe deposition of the Congophilic materials resulting in a corona-like fibrillar radiating structure.
The Congophilic materials were positive for beta amyloid protein, but negative for prion protein. Only a few senile plaque-like structures were observed in the cortex by PAM stain and beta amyloid immunostain. The present case is the first observation of cerebral amyloid angiopathy in avian species and will indicate the presence of such age-related cerebral lesions also in birds.
Comment (webmaster): This is quite interesting in that the amyloidosis may be part of the normal aging process. The Congo red test is solid support. This is a different kettle of fish from assertions of TSE in squirrels etc.
Biochemistry 1999 Aug 31;38(35):11560-11569 Miura T, Hori-i A, Mototani H, Takeuchi H
The cellular form of prion protein is a precursor of the infectious isoform, which causes fatal neurodegenerative diseases through intermolecular association. One of the characteristics of the prion protein is a high affinity for Cu(II) ions. The site of Cu(II) binding is considered to be the N-terminal region, where the octapeptide sequence PHGGGWGQ repeats 4 times in tandem. We have examined the Cu(II) binding mode of the octapeptide motif and its pH dependence by Raman and absorption spectroscopy. At neutral and basic pH, the single octapeptide PHGGGWGQ forms a 1:1 complex with Cu(II) by coordinating via the imidazole N(pi) atom of histidine together with two deprotonated main-chain amide nitrogens in the triglycine segment. A similar 1:1 complex is formed by each octapeptide unit in (PHGGGWGQ)(2) and (PHGGGWGQ)(4).
Under weakly acidic conditions (pH approximately 6), however, the Cu(II)-amide(-) linkages are broken and the metal binding site of histidine switches from N(pi) to N(tau) to share a Cu(II) ion between two histidine residues of different peptide chains.
The drastic change of the Cu(II) binding mode on going from neutral to weakly acidic conditions suggests that the micro-environmental pH in the brain cell regulates the Cu(II) affinity of the prion protein, which is supposed to undergo pH changes in the pathway from the cell surface to endosomes. The intermolecular His(N(tau))-Cu(II)-His(N(tau)) bridge may be related to the aggregation of prion protein in the pathogenic form.
Comment (webmaster): There have been numerous studies of isolated repeats and copper; no studies of copper in the intact repeat domain (pre-106) nor in native protein. This paper is remarkable for the inter-molecular bridging reported and for the modest pH change bringing this about. pH has been studied before so it is not immediately clear why this phenomenon was not seen earlier. Copper bridging makes no real sense in the pathology: the repeat region is inessential for this and the aggregate has been known for decades to be a classical cross-beta amyloidosis. However there could be a role here in normal function.
Liu JJ, Lindquist S Nature 1999 Aug 5;400(6744):573-6
The yeast [PSI+] element represents a new type of genetic inheritance, in which changes in phenotype are transmitted by a 'protein only' mechanism reminiscent of the 'protein-only' transmission of mammalian prion diseases. The underlying molecular mechanisms for both are poorly understood and it is not clear how similar they might be. Sup35, the [PSI+] protein determinant, and PrP, the mammalian prion determinant, have different functions, different cellular locations and no sequence similarity; however, each contains five imperfect oligopeptide repeats-PQGGYQQYN in Sup35 and PHGGGWGQ in PrP. Repeat expansions in PrP produce spontaneous prion diseases.
Here we show that replacing the wild-type SUP35 gene with a repeat-expansion mutation induces new [PSI+] elements, the first mutation of its type among these newly described elements of inheritance. In vitro, fully denatured repeat-expansion peptides can adopt conformations rich in beta-sheets and form higher-order structures much more rapidly than wild-type peptides. Our results provide insight into the nature of the conformational changes underlying protein-based mechanisms of inheritance and suggest a link between this process and those producing neurodegenerative prion diseases in mammals.
Comment (webmaster): The paper looks at a double deletion and a double insertion, finding that only the latter leads to the phenotype. While there is no harm in this for yeast, no analogy to the repeat situation in mammalian prion exists despite claims made in the paper. First, the repeat region can be deleted altogether and prion disease can still occur in animal models. Second, the repeat region is typically clipped and missing from in vivo amyloid. Third, two extra repeats are not associated with enhanced risk for CJD. Fourth, the goat double deletion is accompanied by a significant point mutation, and only one heterozygous goat showed an increased incubation period when exposed to full repeat sheep agent also missing the point mutation (so simply attributable to interference and under-production). The bottom line is that the yeast system has nothing more in common with mammalian prion than any of the other 25 amyloidoses do.