Second genomic copy of human prion gene !

Prion paralogue found adjacent to main gene!
Long and short incubation mouse prion genes compared
Zebrafish RH panel locates prion gene
Could a bee sting cause scrapie? A near miss 29.2 MY ago
Anti-prion 4.5 kb mRNA: a harmless prion pseudogene?
Alternate exon 1 splicing in Holstein
MRI: specific diagnosis of nvCJD?
Cerebral amyloid angiopathy in an aged great spotted woodpecker.
Prion octapeptide: intermolecular copper

Prion paralogue found adjacent to main gene !

 webmaster research posted 29 Aug 99 to web and BSE-listserve; updated 31 Aug 99 
A 39,760 bp sequence for long incubation period mouse prion gene Prnp-b, U29187, appeared at GenBank on 30 June 99. It is alluded to briefly in an earlier heroic sequencing paper focused on short incubation mouse Prnp-a (U29186), human, and sheep prion sequences.

This new mouse sequence by IY Lee, D. Westaway, et al. was motivated in part by the defective IAP retrovirus element of 6,593 bp that is inserted into the short incubation period mouse (Prnp-a alleles) in the long intron 2, some 5,404 bp upstream of the translation start codon. That sequence is of comparable length, 38,418 bp, indicating that thousands of new positions have now been sequenced in Prnp-b, which lacks the long retrovirus insertion.

We shall see shortly that 10,403 bp further downstream have been sequenced in long incubation mouse, ie, 29358-39760. This region contains an active new gene from 36212-36748 with a domain paralogous to the distal region of the prion gene. Thus a stated goal of Lee et al. to determine the chromosomal environment of the prion gene has been attained on the 3' side, with the added bonus that the neighboring gene is related in sequence to the prion protein.

The large stretches of homologous overlap between long and short incubation period mouse sequences are also of considerable interest [compared below]. These mice differ in amino acid sequence at 108 and 189 and evidence in isogenic strains suggests that this is adequate to account for the different phenotypes, though the IAP insertion in commonly used strains of lab mice raises serious issues based on its influence on transcription in other situations (as noted by Lee et al.).

A simple alignment of long and short incubation period mouse genes shows 18,085 bp of sequence is available after the end of the mRNA for long incubation mouse but that the first 7,731 bp of this had already been examined for other genes by Lee with negative outcome (as with the 8,611 bp upstream of exon 1) in short incubation mouse, human, and sheep. This leaves 10,403 bp of sequence with no counterpart in any sequenced mammalian prion gene. The human genome project has not yet reached this region of chromosome 20 (and mouse projects lag further). The idea here is to use the chromosome 20-restricted Blast server with the new mouse query, which 'extends' the Lee human prion sequence.

Four retrotransposons are noted in the GenBank annotation in this new region by Lee et al. These were confirmed, up to notational conventions, by the webmaster using Jurka's methodology [see below]. These insertions total 650 bp and so account for 6.2% of the new section. The ORF for the new gene is intercalated between the third and fourth of these retrotransposons. The GenBank annotation should be revised to reflect this, and perhaps a separate entry is needed for the new gene (which is known in human as well as mouse from sequenced mRNAs).

Lee's previous methods for searching for adjacent genes and pseudogenes included locating promoter CpG islands, using Blastn searches with masked sequences (here 29358-39760), seeking long ORFs in any frame conserved across species (not testable here as human sequence ends near short incubation mouse; sheep and cow end far earlier, at 5117 bp and 0 bp post-mRNA, resp.), and using programs such as GenscanW.

The webmaster took a slightly different approach on 28 Aug 99, using NCBI gapped tBlastx with the new distal portion of long incubation mouse prion as query against mouse and human databases of expressed sequence tags (dbEST). Recall tBlastx compares the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database.
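The six-frame translation that tBlastx applies to both query and database can be sketched in a few lines. This is only an illustration of the idea, not the NCBI implementation: the function names are made up for this example, the standard genetic code is assumed, and the test string is the first 24 bases of the mouse prnd ORF given further down this page.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (an assumption of this sketch).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate in frame 1; unknown codons (e.g. containing N) become X."""
    dna = dna.upper()
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frames(dna):
    """Return the three forward and three reverse-strand translations."""
    frames = {}
    for strand, seq in (("+", dna.upper()), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    orf_start = "atgaagaaccggctgggtacatgg"  # start of the mouse prnd ORF
    for name, protein in six_frames(orf_start).items():
        print(name, protein)
```

A protein-level search then compares every frame of the query with every frame of each database entry, which is why tBlastx is so much more expensive than Blastn.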

tBlastx is a more sensitive method for detecting homology than Blastn because protein sequences are more conserved than nucleotide sequences, due to synonymous codon drift, conserved residues, and statistical advantages of 20 choices instead of 4. The dbEST databases contain many mRNA sequences from factory labs not found in the non-redundant main GenBank repository; the human set is used despite the mouse probe because it is more extensive.

Blastn in fact will fail to pull up statistically significant matches from either repository in this case. Thus the new gene is easy to miss. (However GenscanW also suggests this new gene with all of U29187 used as query, in addition to finding the standard prion gene.) tBlastn and Blastp are successful on the main GenBank database once the correct probe is in hand.

While it is welcome news to have found the 3' boundary of the prion gene after all this sequencing effort, a most surprising bonus is that the neighboring gene is related in sequence to the distal part of the prion gene. This vindicates the reasoning given by Lee et al. on page 1023 of their 1998 paper for using large-scale sequencing to find "adjacent, perhaps functionally related, genes... in mammalian gene complexes."

For the time being, this new gene and protein will be called Prnd, an acronym for prion-like doppelganger. Its function is unknown; note that it is missing the 106-126 amyloidogenic region as well as the repeat section. Based on the small number of dbEST sequences, the gene is probably rarely transcribed (or responds poorly to oligo dT amplification) relative to the prion gene itself which has been found several hundred times in EST sequencing.

[After the webmaster's posting of the discovery of the new prion paralogue gene on 29 Aug 99, D. Westaway at U Toronto kindly wrote to say that the 4 labs involved in sequencing the mouse gene already have a paper in press addressing the new gene, with new sequences to be released at the time the paper appears. This paper was submitted to the Journal of Molecular Biology on 19 May 99, accepted 24 June 99, and will be published imminently, possibly 3 Sep 99. While apparently reaching similar conclusions, they have additional mRNA and protein data that should greatly illuminate this remarkable situation. The GenBank sequence was re-annotated on 31 Aug 99 to show the Prnd gene. The protein sequence appears identical to the one posted earlier by the webmaster; some UTR detail could not have been deduced from the ESTs without further sequencing. They have not modified the human entry though prnd can be inferred there too and also, barely, in rat, from AI136375 3' UTR.

On 8 Sep 99 the GenBank entry was updated again. This time the provocative title of the upcoming JMB article was included:

Ataxia in Prion Protein (PrP) Deficient Mice is Associated with Upregulation of the Novel PrP-like Protein Doppel

 J. Mol. Biol. (1999) In press [Sept 24, 1999  or later].
Moore,R.C., Lee,I.Y., Silverman,G.L., Harrison,P.M., Strome,R., Heinrich,C., Karunaratne,A., Pasternak,S.H., Chishti,M.A., Liang,Y., Mastrangelo,P., Wang,K., Smit,A.F.A., Katamine,S.,  Carlson,G.A., Cohen,F.E., Prusiner,S.B., Melton,D.W., Tremblay,P.,  Hood,L.E. and Westaway,D.

They also came up with a clever name for the new gene, prion-like doppelganger, or prnd. This literally means a ghost-like companion. The webmaster has agreed to use this name because the mouse genome nomenclature committee has accepted it, noting however it has already been taken at GenBank, Medline, and Swissprot for PrnD for tryptophan halogenase, prnD for both aminopyrrolnitrin oxidase and delta1-pyrroline-5-carboxylate dehydrogenase, and prnd for an intergenic cis-acting regulatory region. (The word prion has been in use in other contexts for a century for that matter.)

In theory, the discovery of this gene really dates to results of the Washington University EST and German Genome Projects, posted to GenBank 15 Feb 97 (AA104098) and 9 Feb 98 (AA796652) for mouse; 6 Aug 97 (AA234322) and 8 Jun 99 (AL042906) for human. There is no need to have the mouse genomic sequence in hand and no immediate need to know of the chromosome 20 adjacency because tBlastn with mouse prion protein terminal domain should and does, despite the asymmetry of tBlastn, pull out AA796652 below normal mouse prion ESTs (though just barely -- the full EST would then need to be run with Blastx to really establish convincing homology). An even more distal prion sequence should also pull AI588048 and allied sequences. The adjacency could then be readily established with FISH, YAC, or RH panels. However, as seen in the EST diagram below, the EST data has a gap of about 15 bp within the ORF and about a 1000 bp gap before the third set of ESTs.

However, AA796652, an EST dated 09-FEB-1998 from WashU, does a nice job of suggesting the spliced-out intron; its length (but not its existence) depends on having long incubation period mouse sequence. The 38 bp leader exon and the 9 bp leader into the ORF have no matches to prion gene exons or introns. The prion gene also lacks a 3' UTR exon and intron. Neither the 3' UTR portions nor exons 2a and 2b really fit the EST data, so the annotation must represent direct mRNA data. The same applies to exon 3 at 38013-39315: the available ESTs are AI509336 (386 bp, 38347-38726), its 186 bp subsequence AA190150, and AI136375 (38726-39082), and the latter responds to oligo dA with oligo dT at 39082 (however rat could have short indels).

The human EST AL042906 is tantalizing in that the 221 bp do not correspond to anything in mouse gene, suggesting a remote exon in humans. The sequence has 6 undetermined nucleotides 3' in addition to seemingly having errors towards the 5' end. The abrupt failure to align past the ORF could reflect a splice site to a human exon so far downstream that it does not correspond to anything in mouse or that non-ORF has evolved too rapidly to allow homology detection. However, this region does in fact correspond at 89% identity to 866-1034 of dJ1068H6.00023, an unfinished sequence at the chromosome 20 Blast server.

The annotation does not describe a promoter region; perhaps this is analyzed in the paper. If co-regulated by some of the same cofactors as the prion gene, then the upstream region should carry similar SP-1 sites etc and perhaps the well-conserved motif region. But the rodent reference prion promoter shows no match above exon 1 of prnd. While some alternative splicing is seen in bovine prion exon 1, it is not yet clear whether the prnd gene has alternative splicing and/or alternate polyA signals. The polyA signal AATAAA at 39291 is not annotated: 1261 tgttcttatc ttgcaaacAA TAAAcaccca tacatataca gac. Mouse exon 2b exhibits no polyA signal so is presumably a splice site. Note that the prnd gene is not previously known from mapping of chromosome 20 nor its synteny counterparts and there is no known disease associated with mutations in it.

The 31 Aug 99 annotation provides the following structure for the gene:

standard prion gene     6205-21675  [3' end of prion mRNA is separated by 12,411 bp from prnd exon 1]
L1ME3A retrotransposon 30341-30573  [delimits prnd promoter region 5' of exon 1]

exon 1                 34086-34124  [5' UTR of 39 bp, no coding sequence]
intron 1               34125-36204  [includes retrotransposon URR1A 35130 - 35341]
exon 2a                36205-36799  [9 bp of 5' UTR, then coding sequence, then 49 bp of 3' UTR]
exon 2b                36205-37472  [3'UTR extended form of exon 2 with 722 bp of 3' UTR]
intron 2               37472-38012  [contains retrotransposon cPB1D9 37553-37665 = 113 bp]
exon 3                 38013-39315  [1302 bp of 3' UTR, no coding sequence, classical polyA signal AATAAA at 39291]

CDS                    36212-36751
mRNA                   34086-34124 + 36205-37472 + 38013-39315

Changing numbers to post CDS:
exon 2a ends at 48
exon 2b ends at 721
exon 3 extends from 1262-2564
PB1D9- extends from 802-914
...agaggtgagctggtgggcaaaggta.......               end of exon 2a
tgtttgctggttgggcccctgcaatgtc.......               end of exon 2b
............................gactccaggagttgctgagc start of exon 3
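The renumbering above is simple offset arithmetic, and a short script makes the convention explicit. This is a sketch that assumes 1-based inclusive coordinates and takes the last base of the CDS (36751) as position zero; the helper names are invented for the example. Note that inclusive counting can differ by one from some of the lengths quoted in the bracketed notes of the annotation.

```python
# Coordinates copied from the 31 Aug 99 annotation listed above (1-based, inclusive).
CDS_END = 36751

FEATURES = {
    "exon 1": (34086, 34124),
    "exon 2a": (36205, 36799),
    "exon 2b": (36205, 37472),
    "exon 3": (38013, 39315),
    "PB1D9": (37553, 37665),
    "CDS": (36212, 36751),
}


def length(start, end):
    """Inclusive length of a feature."""
    return end - start + 1


def post_cds(position, cds_end=CDS_END):
    """Convert a genomic position to 'post CDS' numbering (bases past the stop codon)."""
    return position - cds_end


if __name__ == "__main__":
    for name, (start, end) in FEATURES.items():
        print(f"{name:8s} {start}-{end}  length {length(start, end):5d}  "
              f"post-CDS {post_cds(start)} to {post_cds(end)}")
```

Run as a script, it prints each feature with its inclusive length and its post-CDS span, so the four renumbered values can be compared directly against the annotation.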

There are 9 ESTs put in the databases by other researchers that are relevant to the prnd gene; 6 from mouse, 2 from human, and 1 from rat that form 3 non-overlapping groups. While homology allows "mixing and matching" of species for purposes of tiling the gene or creating a more complete ancestral mammalian probe, protein sequences must be drawn from a single species' ESTs. The single rat sequence is all 3' UTR so it mainly confirms the presence of the mRNA in this species and allows a partial measurement of how fast this region is changing relative to mouse [83% identity over 12 million years since divergence].

Upstream group:

>AA796652 426 bp mouse mammary, covers 38 bp of exon 1 and 390 bp of exon 2.
       1 GGGCTCCAAG CTTCAGAGGC CACAGTAGCA GAGAACCG  ag attcacc atg aagaaccggc  
       61 tgggtacatg gtgggtggcc atcctctgca tgctgcttgc cagccacctc tccacggtca
      121 aggcaagggg cataaagcac aggttcaagt ggaccggaag tcctgcccag cagcggcggc
      181 cagatcaccg aagctcgggt agctgagaac cgcccaggag ccttcatcaa gcaaggccgg
      241 aagctggaca tcgactttgg agcagagggc aacaggtact acgcggctaa ctattggcag
      301 ttccctgatg ggatctacta cgaaggctgc tctgaagcca acgtgaccaa ggagatgctg
      361 gtgaccagct gcgtcaacgc cacccaggcg gccaaccagg ctgagttctc ccgggagaag
      421 caggat
 
>AL042906  731 bp human testis.  Main guide to human prnd protein sequence.  
        1 tggctggcca ctgtctgcat gctgctcttc agccacctct ctgcggtcca gacgaggggc
       61 atcaagcaca gaatcaagtg gaaccggaag gccctgccca gcactgccca gatcactgag
      121 gcccaggtgg ctgagaaccg cccgggagcc ttcatcaagc aaggccgcaa gctcgacatt
      181 gacttcggag ccgagggcaa caggtactac gaggccaact actggcagtt ccccgatggc
      241 atccactaca acggctgctc tgaggctaat gtgaccaagg aggcatttgt caccggctgc
      301 atcaatgcca cccaggcggc gaaccagggg gagttccaga agccagacaa caagctccac
      361 cagcaggtgc tctggcggct ggtccaggag ctctgctccc tcaagcattg cgagttttgg
      421 ttggagaggg gcgcaggact tcgggtcacc atgcaccagc cagtgctcct ctgccttctg
      481 gctttgatct ggctcatggt gaaataagct tgccaggagg ctggcagtac agagcgcagc
      541 agcgagcaaa tcctgcnagt gaccaactnt tcttccccaa accacgcgtg ttcttgaaag
      601 gtgcccagaa cggcgatgca cttcgcactt gcaaatgccc cttnccacgt attgcncccc
      661 tggtattgtg cccttgccgt ttctgataaa atggggggac ttgtgggctt tttccgtcac
      721 ttccattt

>AA234322 190 bp human melanocytes etc.; covers first 125 bp of ORF, near subsequence of AL042906
        1 aaggttctga cgccgatgag gaagcacctg tatctggtgg tggctggcca ctgtctgcat
       61 gctgctcttc agccacctct ctgcggtcca gacgaggggc atcaagcaca gaatcaagtg
      121 gaaccggaag ccctgctcag cactgcccag atcactgagg cccaggtggc tgagaaccgc
      181 ccgggagcct 


>AA183841 383 bp mouse testis has 19/19 hit but apparently coincidental.

Distal ORF/proximal 3' UTR group:

>AA104098 472 bp from mouse heart; 120 bp of distal ORF; identities = 325/325 (100%) 3/148 to 327/472 of 3' UTR 36607-37078 
        1 gcgagtcctg tggcggctga tcaaagagat ctgctccgcc aagcactgcg atttctggct
       61 ggaaagggga gctgcgcttc gggtcgccgt ggaccaaccg gcgatggtct gcctgctggg
      121 tttcgtttgg ttcattgtga agtaaaaatc aatgaagctg gcagccacag aggtgagctg
      181 gtgggcaaag gtagacagag gtagcccagt tctctctatc tagcccccga gtgttctgaa
      241 agtacaacgt gtagcgtttc agggcatttc aaaagtccct cccaagtact ccccctactc
      301 catgtgtttg ataatgtgtt tcagtgcccc tatgctccac ccctgtgaga cctggcctgt
      361 tcctgccttt gcagctacac taggtgagaa accagccaaa ggatacaaga attgtcctgt
      421 gcactccacc tgattttcca actccaggaa actgaggctc acattgcagg ag

>AI428337 418 bp from mouse heart.  Identities = 386/390 (98%) 187/418 to 576/29 of 3' UTR 29/37327 - 418/36938, no hit to first 28 bp, no ORF overlap
        1 ttttatttgt cagaagacag attttcaaat gtctacttta gcttaatttt tgttatattg
       61 tctcacgtaa ggtgtcaagg tgcacggttg tcgctcaggg cttcccactt agttcagggt
      121 gtgacttcca catttcagtc cacattcggg actgcacttg gacctgagtc ttggagggat
      181 caggaggcag ctctaggcta cacagagcag aaaacctgag ggaagacaga gcaggggaac
      241 cccagctctt ctgttttcta actgtgctgg acagggactc ctgcaatgtg agcctcagtt
      301 tcctggagtt ggaaaatcag gtggagtgca caggacaatt cttgtatcct ttggctggtt
      361 tctcacccag tgtagctgca aaggcaggaa caggccaggt ctcaccaggg tggagcat

>AI588048 120 bp from mouse heart (a subsequence of AA104098) 120 bp of distal ORF 36606-36725
        1 agcgagtcct gtggcggctg atcaaagaga tctgctccgc caagcactgc gatttctggc
       61 tggaaagggg agctgcgctt cgggtcgccg tggaccaacc ggcgatggtc tgcctgctgg

Distal 3' UTR group:

>AI509336 386 bp  mouse spleen. Identities = 379/386 (98%) 1596/1 to 1975/386 of 3' UTR 38347-38726 full alignment
        1 ggcccagatc acacagttta ccccttgcac gccattttaa atatcagaca ataaagaagg
       61 aatagttagt tcttagatgg tgatgttggt gtttcagctc caagcttaag gcttagttac
      121 ccgggtgcac aaagctgtag acaaccaact tccttccagc gtgaaaatgt gtaggctgga
      181 cagagtcccg tgctccacga tcgctgagtt ctttcagacc attggttgtc acacccacaa
      241 gcctaactgt gacagccaat gagtgtccat gtgattttga ctgcaaattt cgcttatttg
      301 atttttagcg gagacgtcct tgctcttact tgtggatcag gagaggtttt cttagccctt
      361 ctaagtgttc ccaacccgtg cgctgt

>AA190150 186 bp mouse spleen (subsequence of AI509336). Identities = 185/186 (99%) 1596/1 to 1781/186 of 3' UTR  
        1 ggcccagatc acacagttta ccccttgcac gccattttaa atatcagaca ataaagaagg
       61 aatagttagt tcttagatgg tgatgttggt gtttcagctc caagcttaag gcttagttac
      121 ccgggtgcac aaagctgtag acaaccaact tccttccagc gtgaaaatgt gtaggctgga
      181 cagagt

>AI136375 368 bp rat; most distal sequence. Matches 38718-39072 at 83% identity with lowered gap penalties (note poly T, rat numbering) 
        1 tttttttttt tttgtatttt aaaggttgta atgaaacaaa attaaaaccc tcatctttat
       61 atttagacac agtaaagggg tacgggtctc atactaaggt agaaggaggg gatagggtta
      121 aagaaacaag aaaacaaaag tcattaaaaa atggacagaa agcactagat gatgtcaagt
      181 aaagctttaa ggtacccagc gttctggccc ggtattagga ttgcaaattg ctacatttgc
      241 agttgcaaat gagtaaagga gtggagagtc tggagaggtg aaaactttgt gaccgatcag
      301 actcaggaat tgcagccccg ccccctttta ctcggctgtc agtttgcgtg cagttctaca
      361 gagcacac
Be this as it may, the EST researchers did not recognize or pursue the similarity -- there is no annotation in the entry signifying similarity to prion gene (which would have enabled other researchers to find it by full text search). Indeed, EST labs would have no way of knowing that this would be of tremendous interest to researchers in the prion field.

In other words, the prnd gene could have been found on 16 Feb 97 had anyone troubled to ask an idle computer to take each entry in dbEST (or a non-redundant subset) at regular intervals and run Blastx against the main GenBank protein repository. [tBlastx is not allowed online; Blastn is too weak.] A simpler method, apparently not available, is an alerting service whereby a researcher submits a protein sequence of interest and asks for a daily email alert of blastx run against this single protein database.

Thus this landmark result for the prion gene could have been found 30 months ago. If a paralogue for prion gene could be discovered in this manner, what manner of other important data lies buried in dbESTs? Are there additional paralogues for the prion gene -- or for that matter the prnd gene -- still to be found within the EST collection? -- perhaps these could be picked up by tblastn using prion protein itself (or a proper fragment thereof) with favorable parameter settings and a great deal of patience in EST chromosome tiling.]

The initial hits from tBlastx provide homology at the protein level. This is sharply delimited to a distal portion of mammalian prion protein, approximately the last 100 residues (including the GPI anchor region). Extensive searches do not find any match to the proximal part of the prion protein. The full length of the new gene is 179 amino acids, its molecular weight 20442 and pI quite basic at 9.51 (due to 26 arg + lys).
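The length and the arginine plus lysine count quoted here are easy to check directly from the mouse prnd protein sequence given in this section. The snippet below is a minimal sketch; the helper name is invented for the example, and it does not attempt the molecular weight or pI, which need residue mass and pKa tables.

```python
from collections import Counter

# Mouse prnd protein as listed in this section.
MOUSE_PRND = (
    "MKNRLGTWWVAILCMLLASHLSTVKARGIKHRFKWNRKVLPSSGGQITEARVAENRPGAFIKQGRKLD"
    "IDFGAEGNRYYAANYWQFPDGIYYEGCSEANVTKEMLVTSCVNATQAANQAEFSREKQDSKLHQRVLW"
    "RLIKEICSAKHCDFWLERGAALRVAVDQPAMVCLLGFVWFIVK"
)


def composition(protein):
    """Count each amino acid (one-letter code) in a protein string."""
    return Counter(protein.upper())


if __name__ == "__main__":
    counts = composition(MOUSE_PRND)
    basic = counts["R"] + counts["K"]    # arginine + lysine
    acidic = counts["D"] + counts["E"]   # aspartate + glutamate
    print("length:", len(MOUSE_PRND))
    print("Arg + Lys:", basic)
    print("Asp + Glu:", acidic)
```

An excess of basic over acidic residues is what pushes the predicted pI well above neutral.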

The ORF runs from position 36212-36748 of the long incubation mouse sequence, coding for 179 amino acids. The human sequence is 74% identical and 85% similar to mouse (the EST situation for human is of limited quality, leaving its N-terminus uncertain). These genes are unlikely to be pseudogenes because polyadenylated mRNA is recovered from various tissues (heart, mammary, testis); mouse and human proteins are still strongly homologous after nearly 100 million years (indicating selective pressure); and there is a long open reading frame with initiating methionine and terminal ochre (TAA) stop codon.

 
>mouse prnd protein 
MKNRLGTWWVAILCMLLASHLSTVKARGIKHRFKWNRKVLPSSGGQITEARVAENRPGAFIKQGRKLD
IDFGAEGNRYYAANYWQFPDGIYYEGCSEANVTKEMLVTSCVNATQAANQAEFSREKQDSKLHQRVLW
RLIKEICSAKHCDFWLERGAALRVAVDQPAMVCLLGFVWFIVK  

>human prnd protein
.......WWLATVCMLLFSHLSAVQTRGIKHRIKWNRKALPSTAQITEAQVAENRPGAFIKQGRKLD
IDFGAEGNRYYEANYWQFPDGIHYNGCSEANVTKEAFVTGCINATQAANQGEFQKPDNKLHQQVLW
RLVQELCSLKHCEFWLERGAGLRVTMHQPVLLCLLALIWLMVK

>human prnd protein from contig 00023
MRKHLSWWWLATVCMLLFSHLSAVQTRGIKHRIKWNRKALPSTAQITEAQVAENRPGAFI
KQGRKLDIDFGAEGNRYYEANYWQFPDGIHYNGCSEANVTKEAFVTGCINATQAANQGEF
QKPDNKLHQQVLWRLVQELCSLKHCEFWLERGAGLRVTMHQPVLLCLLALIWLtVK

>mouse prnd U29187  36212-36748+stop or AA796652-AA104098 tiled ESTs
atg aagaaccggctgggtacatggtgggtggccatcctctgcatgctgcttgccagccac
ctctccacggtcaaggcaaggggcataaagcacaggttcaagtggaaccggaaggtcctgc
ccagcagcggcggccagatcaccgaagctcgggtagctgagaaccgcccaggagccttcat
caagcaaggccggaagctggacatcgactttggagcagagggcaacaggtactacgcggct
aactattggcagttccctgatgggatctactacgaaggctgctctgaagccaacgtgacca
aggagatgctggtgaccagctgcgtcaacgccacccaggcggccaaccaggctgagttctc
ccgggagaagcaggatagcaagctccaccagcgagtcctgtggcggctgatcaaagagatc
tgctccgccaagcactgcgatttctggctggaaaggggagctgcgcttcgggtcgccgtgg
accaaccggcgatggtctgcctgctgggtttcgtttggttcattgtgaag taa

>human prnd from AL042906 and subsequenceAA234322  
GGAAGCACCTGTATCTGGTGGtggctggccactgtctgcatgctgctcttcagccacctc
tctgcggtccagacgaggggcatcaagcacagaatcaagtggaaccggaaggccctgccc
agcactgcccagatcactgaggcccaggtggctgagaaccgcccgggagccttcatcaag
caaggccgcaagctcgacattgacttcggagccgagggcaacaggtactacgaggccaac
tactggcagttccccgatggcatccactacaacggctgctctgaggctaatgtgaccaag
gaggcatttgtcaccggctgcatcaatgccacccaggcggcgaaccagggggagttccag
aagccagacaacaagctccaccagcaggtgctctggcggctggtccaggagctctgctcc
ctcaagcattgcgagttttggttggagaggggcgcaggacttcgggtcaccatgcaccag
ccagtgctcctctgccttctggctttgatctggctcatggtgaaataa
Blast alignment of mouse and human prnd
 Score =  278 bits (704), Expect = 2e-74
 Identities = 127/171 (74%), Positives = 146/171 (85%)
 Frame = +1

Query: 9  MKNRLGTWWVAILCMLLASHLSTVKARGIKHRFKWNRKVLPSSGGQITEARVAENRPGAFIKQGRKLD 68
           ......WW+A +CMLL SHLS V+ RGIKHR KWNRK LPS+  QITEA+VAENRPGAFIKQGRKLD
Sbjct: 1    STCIWWWLATVCMLLFSHLSAVQTRGIKHRIKWNRKALPSTA-QITEAQVAENRPGAFIKQGRKLD 177

Query: 69  IDFGAEGNRYYAANYWQFPDGIYYEGCSEANVTKEMLVTSCVNATQAANQAEFSREKQDS 128
           IDFGAEGNRYY ANYWQFPDGI+Y GCSEANVTKE  VT C+NATQAANQ EF  +K D+
Sbjct: 178 IDFGAEGNRYYEANYWQFPDGIHYNGCSEANVTKEAFVTGCINATQAANQGEF--QKPDN 351

Query: 129 KLHQRVLWRLIKEICSAKHCDFWLERGAALRVAVDQPAMVCLLGFVWFIVK 179
           KLHQ+VLWRL++E+CS KHC+FWLERGA LRV + QP ++CLL  +W +VK
Sbjct: 352 KLHQQVLWRLVQELCSLKHCEFWLERGAGLRVTMHQPVLLCLLALIWLMVK 504
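The percentages in the Blast summary are just ratios over aligned columns (127/171 identities, 146/171 positives). As a rough sketch, the identity count can be recomputed for any pair of aligned strings; the function names here are illustrative, and the example pair is the last, ungapped block of the alignment above.

```python
def alignment_stats(query, subject, gap="-"):
    """Return (identical columns, total columns, percent identity) for two aligned strings."""
    if len(query) != len(subject):
        raise ValueError("aligned strings must have equal length")
    identical = sum(1 for q, s in zip(query, subject) if q == s and q != gap)
    total = len(query)
    return identical, total, 100.0 * identical / total


def percent(part, whole):
    """Plain percentage, as reported in the Blast header line."""
    return 100.0 * part / whole


# Last block of the mouse/human prnd alignment (residues 129-179 of the mouse query).
QUERY = "KLHQRVLWRLIKEICSAKHCDFWLERGAALRVAVDQPAMVCLLGFVWFIVK"
SBJCT = "KLHQQVLWRLVQELCSLKHCEFWLERGAGLRVTMHQPVLLCLLALIWLMVK"

if __name__ == "__main__":
    print("block:", alignment_stats(QUERY, SBJCT))
    print("whole alignment identity: %.0f%%" % percent(127, 171))
    print("whole alignment positives: %.0f%%" % percent(146, 171))
```

Counting "positives" would additionally need a substitution matrix such as the one Blast uses, which is omitted here.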

A Wu-Blast alignment suggests the homology could even be more extensive. If the repeat is left out and the signal region considered a parallel not needing sequence resemblance, then the alignment becomes closer to global:

Score = 108 (38.0 bits), Expect = 0.00023, P = 0.00023
 Identities = 40/155 (25%), Positives = 77/155 (49%), Frame = +2

Query:    27 KWNRKALPST-----AQITEAQVAENRPGAFI---KQGRKLDIDFGAE-GNRYYEANYWQ 77
             +WN+ + P T     A    A       G ++      R + I FG++  +RYY  N  +
mprnp:   341 QWNKPSKPKTNMKHMAGAAAAGAVVGGLGGYMLGSAMSRPI-IHFGSDYEDRYYRENMHR 517

Query:    78 FPDGIHYNGCSEANVTKEAFVTGCINAT------QAANQGE-FQKPDNKLHQQVLWRLVQ 130
             +P+ ++Y    E +  +  FV  C+N T          +GE F + D K+ +    R+V+
mprnp:   518 YPNQVYYRPMDEYS-NQNNFVHDCVNITIKQHTVTTTTKGENFTETDVKMME----RVVE 682

Query:   131 ELCSLKH---CEFWLERGAGLRVTMHQPVLLCLLALIWLMV 168
             ++C  ++    + + +RG+ + +    PV+L +  LI+L+V
mprnp:   683 QMCITQYERESQAYYQRGSSMVLFSSPPVILLISFLIFLIV 805
Now the prnd gene is well-established by several independent criteria, regardless of any similarity to prion protein. Homology to prion protein is not particularly striking at 22% identity and 55% similarity and a 0.03% chance of finding a match of this quality by chance at the current size of the non-redundant GenBank database (462,722 sequences, 1,273,519,603 bp). These are low levels of similarity. Yet it cannot be coincidence that the prnd Blast probe pulls out 102 prion sequences and nothing else from this vast database, especially when the Blast tool does not know that the context is chromosome 20 adjacency to the prion gene.

There are no other sequences in the databases homologous to DHAP proteins or DNA. Bird prions do not consistently show significant homology, suggesting that the gene dates to the mammalian lineage. The question arises, how can the low percent identity and slow evolutionary change be reconciled with a late origin of this domain? (Note the region of sequence overlap does indeed amount to almost the full globular domain of the prion protein, ie alignment occurs shortly after the 106-126 region at no later than 139 in human numbering.) If the partial gene doubling or domain module switching occurred prior to bird/mammal split, sequence divergence is better explained.

The 3D structure is not as hopeless to predict as it first seems, due to conserved anchors and alpha helix, though it is worse than threading chicken onto mammal. The purported 28569-28608 promoter, 3' UTR, and polyA site and signal could be studied further and the meagre EST situation in humans is error-salvagable 5' though still unsatisfactory and a poor substitute for directed experimental effort. No homologue is apparent in nematode or drosophila, which is aggravating as far as normal function of prnd/prp or easier experimental models.

The prnd ORF should be sequenced in at least 3 species best chosen dynamically to minimize sequencing effort with respect to information gained. The proper choice of species depends very much on the rate of evolution of the particular protein and its individual domains of interest. Some blast and secondary structure tools are far more effective given a clustalw alignment.

The conserved residues are of interest: mouse prnd protein retains the first NVTK glycosylation site (but not the second though it has a second NATQ site), the two disulfide cysteines, many of the aromatic amino acids, and seemingly the GPI terminus (or at least a membrane tail):

CVNATQAANQAEFSREKQDSKLHQRVLW
RLIKEICSAKHCDFWLERGAALRVAVDQPAMVCLLGFVWFIVK 
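The two candidate N-glycosylation sites can be located mechanically by scanning for the usual N-X-S/T sequon (X not proline). The following is a sketch under that standard motif definition; it says nothing about whether the sites are actually used in vivo, and the function name is made up for the example.

```python
import re

# Mouse prnd protein as listed earlier in this section.
MOUSE_PRND = (
    "MKNRLGTWWVAILCMLLASHLSTVKARGIKHRFKWNRKVLPSSGGQITEARVAENRPGAFIKQGRKLD"
    "IDFGAEGNRYYAANYWQFPDGIYYEGCSEANVTKEMLVTSCVNATQAANQAEFSREKQDSKLHQRVLW"
    "RLIKEICSAKHCDFWLERGAALRVAVDQPAMVCLLGFVWFIVK"
)


def find_sequons(protein):
    """Return (1-based position, tripeptide) for every N-X-[ST] motif with X != P."""
    # A lookahead is used so that overlapping motifs would also be reported.
    return [(m.start() + 1, protein[m.start():m.start() + 3])
            for m in re.finditer(r"(?=N[^P][ST])", protein)]


if __name__ == "__main__":
    for position, motif in find_sequons(MOUSE_PRND):
        print(position, motif)
```

On the mouse sequence this reports the NVT and NAT sites discussed above, at positions 99 and 111 of the unprocessed protein.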
Psort and SignalP 2.0 predict a possible signal region ending at position 26 (MKNRLGTWWVAILCMLLASHLSTVKA/RGIKHRF), with a possible transmembrane tail from positions 162-178; the protein is otherwise predicted to be located in the cytoplasm. Some alpha helix is predicted but little beta sheet.

The function of this mRNA and protein is totally unknown. There is no sign of a chromosomal tandem duplication on streak tests. It cannot be the 'anti-prion' because the gene is on the same strand so its mRNA would not be complementary to prion mRNA nor would it hybridize stringently. No amount of forcing causes protein homology to the earlier region of the prion protein; however, weak homology and a long intron are hard to rule out. 35,341 is the end of the last known transposon; 37,553 begins the next one: this avoids conflict.

The next easy test is seeking CpG islands that correlate with the protein homology domain. This is most easily done by replacing CpG with XX in a colored font. The prion gene itself has a CpG island to serve as internal control. There is enrichment for CpG around 36300, beginning at 36241, but it is not dramatic; it could work for a simple gene with no untranslated leading exons. This area has 26 CpG within 424 bp.
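The CpG masking and counting described here can be sketched as follows. The observed/expected ratio is a commonly used CpG island measure added for illustration, not something the webmaster reports; the function names are invented, and the sample string is just the start of the prnd ORF, not the 424 bp window itself.

```python
def count_cpg(dna):
    """Number of CpG dinucleotides on the given strand."""
    return dna.upper().count("CG")


def mask_cpg(dna, mask="XX"):
    """Replace every CpG with a marker so the dinucleotides stand out by eye."""
    return dna.upper().replace("CG", mask)


def cpg_obs_exp(dna):
    """Observed/expected CpG ratio: (CpG count * length) / (C count * G count)."""
    dna = dna.upper()
    c, g = dna.count("C"), dna.count("G")
    if c == 0 or g == 0:
        return 0.0
    return count_cpg(dna) * len(dna) / (c * g)


def cpg_per_100bp(cpg_count, window_len):
    """CpG density expressed per 100 bp of window."""
    return 100.0 * cpg_count / window_len


if __name__ == "__main__":
    sample = "atgaagaaccggctgggtacatggtgggtggcc"
    print(mask_cpg(sample))
    print("obs/exp:", round(cpg_obs_exp(sample), 2))
    # Density for the window quoted in the text: 26 CpG within 424 bp.
    print("per 100 bp:", round(cpg_per_100bp(26, 424), 1))
```

Sliding such a window along both the prion gene (the internal control) and the prnd region would give comparable numbers for the two candidate islands.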

One could also look for promoters, splice boundaries, and polyA signals to complete the picture of this gene. GenScanW predicts a 40 bp promoter at 28569-28608, a single exon gene at 36212-36751 of 540 bp (or 179 amino acids) and a polyA at 37877-37882 of sequence AATAAA. This is very impressive agreement with alignment methods.

Interestingly, this software does not quite predict the main prion gene, but rather a 56 aa proximal extension [with no Blastp neighbors and no apparent significance]:

GENSCAN prediction of normal prion
mskifvtnfllpkfndgflaplapacpapfhsrlprvvgsadrfwalrriggrsviMANL
GYWLLALFVTMWTDVGLCKKRPKPGGWNTGGSRYPGQGSPGGNRYPPQGGTWGQPHGGGW
GQPHGGSWGQPHGGSWGQPHGGGWGQGGGTHNQWNKPSKPKTNFKHVAGAAAAGAVVGGL
GGYMLGSAMSRPMIHFGNDWEDRYYRENMYRYPNQVYYRPVDQYSNQNNFVHDCVNITIK
QHTVVTTTKGENFTETDVKMMERVVEQMCVTQYQKESQAYYDGRRSSSTVLFSSPPVILL
ISFLIFLIVG
To sum up, a viable gene has been found as the next-door neighbor of the mammalian prion gene. Further it bears a faint resemblance distally to the globular domain of the prion protein itself, ending the orphan status of this protein. Whether this domain exercises a similar function as a module of the new protein or interacts (say in a heterodimer) in any way with normal prion protein or affects the prion disease process remain to be established.

At this point, the most useful information would be sequences of the prnd protein from a few more species, tissue expression (especially brain), immunolocalization in vivo, and perhaps knockouts and recombinant protein production and interaction studies. The partial paralogue at this point has yielded no information about normal function of the prion protein nor does it quite illustrate the paralogous Gibbs principle lacking the 106-126 domain.

Long and short incubation mouse prion genes compared

29 Aug 99 webmaster commentary
As noted above, alignment of long and short incubation period mouse genes is quite worthwhile for purposes of comparing laboratory strains.

The new long sequence begins 6204 bp above exon 1 versus 8,611 for the old short sequence. The mRNAs are 2,154 bp and 2,153 bp in length and differ only in 2 intermediate positions in the 3' UTR (other than the well-known non-synonymous coding changes). Prior to exon 1, aligning the comparable regions gives identities = 6109/6216 (98%) and gaps = 58/6216. The gaps largely represent replication slippage tandem repeats, eg:

Query: 1021 ggtgtggccatctgcatctggtatctggtctgaaggtgcgtggataccctctgtgcccgt 1080
            ||||||||||||||||||||||       |||||||||||||||||||||||||||||||
Sbjct: 3462 ggtgtggccatctgcatctggt-------ctgaaggtgcgtggataccctctgtgcccgt 3514

                                                                        
Query: 1856 gacacgcatggatacacatacacacacacacacacacacacacacacacacacacacaca 1915
            ||||||||||||||||||                ||||||||||||||||||||||||||
Sbjct: 4295 gacacgcatggatacaca----------------cacacacacacacacacacacacaca 4338

In the promoter region, a run of perfect agreement over nearly 500 bp surprisingly gives way to 4 small changes near exon 1. It is difficult to assess whether these are important to regulation of transcription but the main motif region is conserved. Recall the promoter region in rodents may be delimited by the 1245 extraneous bp of rat cytochrome c pseudogene beginning 452 bp upstream of exon 1 (see graphic or alignment of known pre-exon 1 sequences).

long : 6049 agcatttaagccagtccggagcggtgactcatccccccccacccccacccccccgcgaga 6108
short:......................................t.....a.-................... 8515
                                                                        
long : 6109 gacgcggcgcggccattggtgagcatcacgccccgcccctcgccccgcctagctcccgcc 6168
short:...................................................a.............. 8575  possible AP-2 site
                                                
long : 6169 tgccccgcccctttccactcccggctcccccgcgtt 6204
short:.......................................... 8611

Alignment of intron 1 is excellent, identities = 2184/2191 (99%), gaps = 2, with no changes near splice boundaries. Intron 2 also aligns well with the exception of the IAP region; the retrotransposon structure is otherwise identical.

Looking downstream of the polyA site at 7732 bp of comparable region, the identity stays high to the end at 99% [position 29,358 in the new sequence]. The retrotransposon structures agree except for long incubation mouse having MLT2D of 186 and for differences in simple repeats. More interestingly, the new terminal stretch sequenced in long incubation mouse only has a L1ME3 of 92 bp, a L1ME3A of 233 bp, a URR1A of 212 bp, and a PB1D9 of 113 bp. From the annotation, we know how to account for only 650 bp of the residual 10,403 bp (6.2%). This compares fairly well to retrotransposons found by the Censor server.

GenBank annotation (columns: start, end, element, start and end counted from position 29,358, length):

30254   30345   L1ME3    896   987     92 bp
30341   30573   L1ME3A   983   1215   233
35130   35341   URR1A   5772   5983   212
37553   37665   PB1D9   8195   8307   113

Censor server:

  897    987   L1ME2     449 +
 1086   1157   L1ME2     671 +
 3666   3684   (CAAA)     20 +
 5344   5369   (TTTTTG)   32 +
 5772   5983   URR1      236 +
 8159   8203   RSINE2    45 -
 8223   8307   B1        85 -
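
The 650 bp and 6.2% figures follow from the GenBank coordinates by simple interval arithmetic. Here is a small Python sketch under stated assumptions: coordinates are taken as inclusive, the region bounds 29358-39760 are those given earlier for the newly sequenced stretch, and the helper names are hypothetical.

```python
# Annotated elements in the new 3' stretch (GenBank coordinates, inclusive).
elements = [
    ("L1ME3", 30254, 30345),
    ("L1ME3A", 30341, 30573),
    ("URR1A", 35130, 35341),
    ("PB1D9", 37553, 37665),
]
REGION_START, REGION_END = 29358, 39760  # newly sequenced terminal stretch

def summed_length(items):
    """Sum of element lengths, counting both end coordinates."""
    return sum(end - start + 1 for _, start, end in items)

def covered_length(items):
    """Bases covered by at least one element (overlaps counted once)."""
    covered, last_end = 0, None
    for _, start, end in sorted(items, key=lambda e: e[1]):
        if last_end is None or start > last_end:
            covered += end - start + 1
            last_end = end
        elif end > last_end:
            covered += end - last_end
            last_end = end
    return covered

region_length = REGION_END - REGION_START + 1
total = summed_length(elements)
unique = covered_length(elements)
print(total, unique, region_length, round(100 * total / region_length, 1))
```

The summed lengths reproduce the 650 bp quoted in the text; because the L1ME3 and L1ME3A annotations overlap by 5 bp, the number of distinct bases covered is slightly smaller.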


Changing numbers to post CDS:
exon 2a ends at 48
exon 2b ends at 721
exon 3 extends from 1262-2564
PB1D9- extends from 802-914


Alternate exon 1 splicing in Holstein

GenBank AF163764 28 Aug 99 webmaster commentary
This is a 25 Aug 99 GenBank entry for the bovine prion gene that contains upstream regions never sequenced before. No journal article has appeared yet. The work was done by Follet, J., Schulze, T., Cesbron, J.Y. and Lemaire-Vieille, C. of the Laboratoire de Physiopathologie des Affections Neurodegeneratives Transmissibles, Institut Pasteur in Lille, France.

The region sequenced covers 6923 bp, most of which (4329 bp) are 5' of exon 1ab. The sequence stops at the end of exon 2 and so does not contain intron 2 or the ORF. While we can only guess at what will be in the article, it seems clear that the focus will be the alternative splicing of exon 1 described in the entry [exon 1a: 4330-4382 = 53 bp versus exon 1b: 4330-4497 = 168 bp]. Tissue specificity in alternative splice site use was likely studied; it is possible that prion protein levels and BSE susceptibility correlate with greater use of one alternative.

Recall that no other mammal, including closely related sheep, is known to exhibit alternative splicing. The longer splice variant is the anomalous one in the comparative sense. This same alternative splicing was characterized in 1997 in bovines by Horiuchi M et al who found both mRNAs used in all tissues except spleen and found similar in vitro translation efficiencies. Exon 1 has been aligned from all species available (as has exon 2) and the hypothetical unutilized ovine 1b analyzed by homology.

So the question arises, what is new here? This region of the bovine prion gene was sequenced in 1992 by Yoshimoto et al. D26150 covers 3404 bp, 802 bp upstream of exon 1 to 9 bp past exon 2, making it a subset of the new sequence if the intron 2 portion is not considered. Restricting gapped Blastn to bovine with the older sequence as query shows a disturbing number of differences to the new sequence [region 3525-6923], identities = 3353/3413 (98%), gaps = 32/3413. Are these sequence errors (and if so, whose?) or bovine breed differences (Japan and France) or rapid evolution of intronic sequence?

Exon 1a itself is identical in all 4 bovine GenBank entries; however exon 1b shows 4 changes from D26150, though none are near splice boundaries:

AF163764 4330 gccagtcgctgacagccgcagagctgagagcgtcttctctctcgcagaagcaggtaaata 4389
D26150   803  ............................................................ 862

AF163764 4390 gccgcgtagtcctttaaactcccagcggaggacgcccaaccctgggtcttgcagccgagg 4449
D26150   863  ..................................-.................g....... 921

AF163764 4450 ccccagggcacccagccgaatcggattggtgggaggcagaccttgacc 4497
D26150   922  -.......-....................................... 967
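
In this dot notation a "." means the base matches the top sequence, a letter is a substitution, and a "-" is a missing base. A rough Python sketch of how the four exon 1b changes can be counted from the D26150 rows follows; classify_row is a hypothetical helper and the rows are transcribed from the alignment above.

```python
def classify_row(dotted):
    """Return (substitutions, gap columns) for one dot-notation row."""
    gap_columns = dotted.count("-")
    substitutions = sum(1 for ch in dotted if ch not in ".-")
    return substitutions, gap_columns

# D26150 rows of the exon 1b comparison (first row is all matches).
d26150_rows = [
    "." * 60,
    "..................................-.................g.......",
    "-.......-.......................................",
]

substitutions = sum(classify_row(row)[0] for row in d26150_rows)
gap_columns = sum(classify_row(row)[1] for row in d26150_rows)
print(substitutions, gap_columns, substitutions + gap_columns)
```

Read this way, the four changes consist of one substitution and three single-base gaps.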

Exon 2 also exhibits 4 substitutions. Here the new sequence differs from three other bovine exon 2 sequences and also from sheep (lower two sequences). This level of change in a conserved region suggests sequencing error in the new posting -- surely two breeds of cow do not differ more from each other than they do from sheep.

AF163764 6826 gacttctgaatatatttgcaaactgaacagtttcaaccgccccgaagcatctgtcttccc 6885
D10612   54   ..................a...................aag................... 113
D26150   3298 ..................a...................aag................... 3357
AB001468 54   ..................a...................aag................... 113
X79913   56   ..................a...................aag.t................. 115
U67922   8139 ..................a...................aag.t................. 8198
 
AF163764 6886 agagacacaaatccaacttgagctgaatcacagcagat 6923
D10612   114  ...................................... 151
D26150   3358 ...................................... 3395
AB001468 114  ...................................... 151
X79913   116  .........g............................ 153
U67922   8199 .........g............................ 8236

It is of interest to compare the 4329 bp region upstream of exon 1 of the new bovine sequence to sheep U67922 [1321-5665 = 4345 bp], also determined by Lee et al for a long distance 5' of exon 1. Here the alignment extends for the full length of the shorter bovine sequence at 95% identity, ie, the retrotransposon structure is identical. There is one significant 28 bp gap in bovine beginning at position 2736, corresponding to CCTCA GACAC TGAGT CTTCC CAACA GCA in sheep.

Analysis of bovine retrotransposons using J Jurka's Censor (shows general mammalian elements only):
 425    796   LINE2 
 797    866   MER5A 
1177   1238   MER5A 
1272   1365   LINE2 
1712   1808   LINE2 
2330   2624   MLT1G 
2626   2752   MLT1G2 
2753   2874   MLT1G 
3578   3665   MER94 

Sheep GenBank annotation:
1745..2082 LINE2
2092..2203 MER5A
2194..2492 Bov-B
2482..2561 MER5A
2590..2683 LINE2
2684..2842c Bov-tA2
3011..3174 LINE2
3650..3942 MLT1G
3756..4215 MLT1F

Zebrafish RH panel locates prion gene

PNAS Vol. 96, Issue 17, 9745-9750, August 17, 1999
Locating and sequencing the prion gene in fish is long overdue. The sequence would greatly illuminate the origin of this protein, its normal function, and the timing of emergence of key domains. Enough of the 106-126 region may still be present to cause concern in intensive fish farming, as inferred from antibody binding to the 112 region.

RH mapping, a familiar method from the human genome project, creates a panel of markers produced by fusion of irradiated zebrafish chromosomes with mouse cells. The authors characterized 849 simple sequence length polymorphism markers, 84 cloned genes, and 122 expressed sequence tags, which allowed the production of an RH map with an average breakpoint frequency of 148 kilobases, covering 88% of the zebrafish genome. Comparison of marker positions in RH and meiotic maps indicated a 96% concordance. Mapping expressed sequence tags and cloned genes will help identify candidate genes for specific mutations in zebrafish.

The prion gene was not one of their markers (even though the aunt of a lead author died of CJD in 1997). However, as the webmaster noted on 16 Dec 98, by looking at the human RH map for enough genes nearby (some of which _were_ zebrafish markers), synteny could safely be inferred between human chr 20 12 pter and linkage group 17 (or its tetraploidization doppelganger, LG 20) in zebrafish. Fish experienced a tetraploidization event, with the other copy of the prion gene landing on LG20. Whether both copies were retained or whether they diverged in function as paralogues will be an interesting question.

Looking now at the searchable databases (1, 2, 3) in conjunction with Figure 1A of the PNAS paper we see a much improved map of LG17 and the associated small clone lines that hopefully carry the fish prion gene.

Dating retrotransposon insertions in ruminant prion mRNA

19 Aug 99 webmaster
Could a bee sting cause scrapie? Yes, indeed -- there was an apparent near miss 23.6 million years ago in a common ancestor of sheep and cow -- a retrotransposon event that might have boosted prion protein production to levels fostering sporadic TSE.

Ruminants contain a 1220 bp mariner retrotransposon in the 3' UTR portion of their mRNA. This element, with its terminal inverted repeats, is described by Lee as a fossil transposase pseudogene with homology to the Mellifera (bee) subfamily. It is probably an old insertion shared by all ruminants since it has 7-8 frameshifts and 5 stop codons -- figure 3 of the Lee paper shows a guided translation and the correct flanking human gene alignment. The insertion in cow/sheep occurred between 27587 and 27588 in terms of human 3' UTR numbering, just downstream of the Bov-tA3, greatly increasing the length of ruminant mRNA.

Dating the 3 retrotransposon insertions in ruminant prion 3' UTR mRNA is accomplished by simply aligning cow and sheep 3' UTR mRNA sequences in ClustalW and comparing the rate of fixation of mutations in the retrotransposon regions to that of the main 3' UTR mRNA. An estimate of when the insertion events took place relative to cow/sheep divergence can then be obtained by simple proportionality. The main error is in the effects of selection in distorting rates of fixation -- some regions important to mRNA are no doubt conserved whereas retrotransposons in general are less constrained. Note however that the 2.1k polyA signal is quite close (65bp) to the start of the first retrotransposon and that many regions of non-transposon mRNA are poorly conserved. There are no saturation effects -- the rates are so low that multiple hits at the same site are insignificant. Gaps are treated as a fifth base.

Since the cow/sheep divergence, there have been about 60 mutational events fixed in evolution in the 1503 bp of the main mRNA 3' UTR. Restricting attention to the 56 simple events (= 3.73% difference or 96.27% identity), the rate is 3.73 events per 100 bp since divergence:

47 changes of form *x*
7 changes of form *xy*
1 *xyz*
2 *xyzq*
3 *xyzqr*
Aligning the 3 retrotransposons gives rates of
Bov-B LINE (387 bp):        21 events in 385 bp,  for a rate of 5.45 per 100 bp
Bov-tA3 SINE (159 bp):      11 events in 159 bp,  for a rate of 6.91 per 100 bp
OaMAR1 mariner (1,220 bp):  54 events in 1223 bp, for a rate of 4.41 per 100 bp
Note first that these rates are consistent with the requirement that the retrotransposon insertions occurred in a common ancestor. The relative rates within the 3 retrotransposons give a relative order of insertion: the Sine element earliest, followed by the Line element, and most recently the mariner element. These scale as (Line:Sine:mariner)=(1.23:1.56:1). Since none of these were present at the time ruminants diverged from carnivores, the events are bounded by that divergence. For illustrative purposes, assuming 40 million years as the date of insertion of the mariner element gives 49.2 my as the date of the Line element and 62.4 my for the Sine.

The divergence of cow and sheep may be taken as 20my. The relative mutation rates of (mRNA:Line:Sine:mariner) = (1:1.46:1.85:1.18) then scale to 20my, 29.2my, 37my, and 23.6my for purposes of dating the events:

Bov-B LINE    29.2 my
Bov-tA3 SINE  37.0 my 
Mariner       23.6 my
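
The proportionality argument can be written out in a few lines. This is a sketch only, using the event counts and lengths quoted above and the assumed 20 my cow/sheep calibration; the function and variable names are hypothetical.

```python
# Fixed differences between cow and sheep, from the counts quoted above.
MRNA_EVENTS, MRNA_LENGTH = 56, 1503      # non-transposon 3' UTR mRNA
ELEMENTS = {
    "Bov-B LINE": (21, 385),
    "Bov-tA3 SINE": (11, 159),
    "mariner": (54, 1223),
}
COW_SHEEP_DIVERGENCE_MY = 20.0           # assumed calibration date

def rate_per_100bp(events, length):
    """Fixed mutational events per 100 aligned bp."""
    return 100.0 * events / length

def insertion_age(events, length):
    """Scale the calibration date by (element rate / mRNA rate)."""
    baseline = rate_per_100bp(MRNA_EVENTS, MRNA_LENGTH)
    return COW_SHEEP_DIVERGENCE_MY * rate_per_100bp(events, length) / baseline

ages = {name: insertion_age(ev, ln) for name, (ev, ln) in ELEMENTS.items()}
for name, (ev, ln) in ELEMENTS.items():
    print(f"{name:13s} {rate_per_100bp(ev, ln):5.2f} per 100 bp  {ages[name]:5.1f} my")
```

Carrying full precision through the division gives ages within about 0.1 my of the figures listed above, which were obtained from ratios rounded to two decimals. The ordering (Sine, then Line, then mariner) is unaffected.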

This predicts that all 3 retrotransposons will be present in deer and elk mRNA (the mariner element just barely) but not in cetaceans, pigs, or camels, since the bovid/cervid divergence is said to be 21my and whales diverged at 60my. Thus the mariner insertion event -- resulting from a rare bee sting long ago that transferred DNA into an unfortunate ruminant -- might be quite useful in resolving or confirming details of artiodactyl divergence (giraffe, gazelle, etc.).

One can only wonder what happened to tissue-specific regulation of translation of the prion gene subsequent to this event. Or if a bee sting today could cause scrapie.

Human prion mRNA corresponding to the less-used 2.5k mRNA species aligns fairly well with the sheep 2.1 polyA site:

sheep: 613   tgttt-aagca-cct-tcaagtgatattcctttctttagtaacataaagtatagataatt 669
             ||||| ||| | ||| | | |||  |||||||||||||  |||      ||||| |||||
human: 26979 tgtttaaaggaccctat-atgtggcattcctttcttta--aac------tataggtaatt 27029

                                                           
sheep: 670   aaggtacct--taattaaactaccttctagacactg-agagcaaat 712
             |||| | ||   || |||| | |||||||||||||| || ||||||
human: 27030 aaggcagctgaaaagtaaattgccttctagacactgaag-gcaaat 27074
It is interesting to compare this dating method to that recently used by HS Lee et al to track the origins of E200K founding mutations, Am J Hum Genet 1999 Apr;64(4):1063-1070. Here, neighboring microsatellite markers were used to separate distinct mutational events, which could then be dated by known historic diasporas and point changes as here. The results are somewhat muddled however by ambiguous haplotypes, slippage events, and recombination -- radiation hybrid panel markers were not used. Point mutations such as E200K are harder to date because the same one can occur over and over, especially in CpG sites, as well as revert, whereas retrotransposon bursts are irreversible and uniquely clocked.

Diffusion-weighted magnetic resonance imaging in probable Creutzfeldt-Jakob disease: a clinical-anatomic correlation.

Arch Neurol 1999 Aug;56(8):951-7
Na DL, Suh CK, Choi SH, Moon HS, Seo DW, Kim SE, Na DG, Adair JC
CJD is a rare transmissible disease that typically causes a rapidly progressive dementia and leads to death in less than 1 year. Although a few anecdotal reports suggest that diffusion-weighted magnetic resonance imaging may help substantiate premortem diagnosis of CJD, detailed correlation between radiographic data and clinical, electrophysiologic, and metabolic parameters is not available.


Signal abnormalities on diffusion-weighted images in 3 consecutive patients with probable CJD were correlated with psychometric features, electroencephalographic findings, and functional images with either positron emission tomography or single photon emission computed tomography.

Focality of abnormalities on diffusion-weighted image, not apparent on routine magnetic resonance images, correlated closely with clinical manifestations of CJD. The topographic distribution of signal abnormality on diffusion-weighted image corresponded with abnormal metabolism or perfusion on positron emission and single photon emission computed tomographic scans. In 2 cases, the laterality of diffusion abnormalities correlated with periodic sharp wave activity on electroencephalograms. These findings extend previous observations that suggested a diagnostic and localizing utility of diffusion-weighted imaging in CJD.

Brain scans may help to diagnose new strain of CJD

Reuters North America  Wed, Aug 18, 1999
Scientists may be getting closer to diagnosing the human equivalent of mad cow disease with a simple brain scan, New Scientist magazine said Wednesday. The new variant of Creutzfeldt-Jakob Disease (nvCJD), a brain-wasting illness, is difficult to diagnose and is usually only confirmed with an autopsy.

But researchers at the Royal Victoria Infirmary in Newcastle upon Tyne in northern England and the National CJD Surveillance Unit in Edinburgh, Scotland, said scarring deep in the brain's posterior thalamus could be a sign of the disease. That part of the brain controls sensory information like hearing, touch and vision.

"This scarring shows up as increased signal intensity on magnetic resonance images (MRI) of the brain, and isn't seen in forms of CJD not linked to the consumption of BSE (bovine spongiform encephalopathy, or mad cow disease) infected beef," the magazine said.

But picking up the scarring on an MRI is difficult and takes a trained eye. Detection is made harder because many radiologists see only one case of nvCJD in several years.

To overcome the problem, Alan Coulthard and his colleagues in Newcastle upon Tyne developed a standardized method to read MRI scans. To devise the method, they compared scans of three nvCJD patients with 14 other people with no neurological problems. The study was small so the researchers are not sure how reliable the test is but they are planning a larger study that they hope will confirm their results.

Anti-prion 4.5 kb mRNA: a harmless prion pseudogene?

21 Aug 99 webmaster
The prion literature from the early 1990's contains various bizarre papers relating to an open reading frame on the anti-sense strand (that could be revisited using the many dozens of new species sequenced) and a polyadenylated 4.5 kb mRNA from brain (but not liver, spleen, or lung) that hybridized under stringent conditions with prion mRNA. The last of these papers showed that the knockout mice still had the 4.5 kb mRNA, ie, it originates at a second locus. Expression is unchanged in TSE. The 4.5 kb species has been sought and found in mouse, hamster, and bovine. It has never been cloned or sequenced.

This RNA presumably is mRNA for some secondary gene because it is not found experimentally in non-polyadenylated RNA. The boundaries of hybridization are poorly established. Probe A of Moser et al. raises concerns about mere retrotransposons 5' in hamster (but not 3', see below); however probe B is completely within the ORF, running from NcoI to HinCII restriction fragments of hamster, or approximately positions 190-460 of the ORF.

Now mouse brain has been the subject of tens of thousands of sequencing runs to determine expressed sequence tags (ESTs), ie, cDNA made from oligo dT primers and so representing mRNA of expressed genes. The 4.6 kb mRNA, which appears to be abundant, should thus appear in the mouse EST database at GenBank. The 4.5 kb mRNA has not been sought in humans but is presumably present, given its expression in a range of other mammals, putting it in the even larger human brain EST collection.

However, RT PCR runs in the EST databases do not generally extend for more than a few hundred base pairs, ie, the last 10% of the 4.5 kb mRNA. Since the extent and region of the 4.5 kb mRNA hybridizing to prion mRNA has never been determined, it is possible that the portion of the 4.5 kb species hybridizing is upstream of the region that would be in the EST collection. Probe B is however a favorable query because few prion ESTs extend past the 1606 bp 3' UTR into the probe B region. Probe B could be extended in the 5' direction (omitting prion introns) to pick up more of the 3' region of 4.6 kb mRNA EST sequence. However, this may not be enough to find the 4.5 kb species even if it is in the EST collection.

Note that the 4.5 kb mRNA is + (complementary to - prion mRNA) so cDNA made from it is - so matches of prion DNA probe with cDNA EST sequences are +/+ unlike +/- matches of prion DNA with prion mRNA cDNA. (The EST database is actually mixed.) This is irrelevant to Blastn searching, which allows both senses, but it sharply restricts the number of sequences that need be examined in detail.
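
To make the strand bookkeeping concrete, here is a toy Python sketch that reports whether a probe matches a target as written (+/+) or only as its reverse complement (+/-). It uses exact substring matching rather than Blastn scoring, and both the function names and the short sequences are made up for illustration.

```python
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")

def reverse_complement(seq):
    """Complement each base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]

def match_orientation(probe, target):
    """Return '+/+', '+/-', or None for an exact match of probe in target."""
    if probe in target:
        return "+/+"
    if reverse_complement(probe) in target:
        return "+/-"
    return None

# Hypothetical probe and two hypothetical database entries.
probe = "atggcgaacctt"
same_sense = "ccc" + probe + "ggg"
opposite_sense = "ccc" + reverse_complement(probe) + "ggg"

print(match_orientation(probe, same_sense))
print(match_orientation(probe, opposite_sense))
print(match_orientation(probe, "cccccccccccc"))
```

Filtering hits by orientation in this way is what allows most database entries to be set aside before any are examined in detail.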

The 4.6 kb mRNA's gene could also have been completely sequenced in the course of human or mouse genomics projects; the former is scheduled to be completed by spring, 2000. This would appear as an imperfect +/- hit on a Blastn search using probe B. It is not clear how to translate stringent hybridization into a Blast score.

Note that any protein made from 4.6kb mRNA would not have any sequence similarity to prion protein, no matter what its reading frame or ORF location. However, Blastx compares a nucleotide query sequence translated in all reading frames against a protein sequence database, so prion mRNA as a probe would search for 4.6 kb protein as well. However, the part that hybridizes with prion mRNA may not be translated into protein, ie, could correspond to 5' or 3' UTR.
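
The "translated in all reading frames" step can be illustrated with a short six-frame translation. This is a simplified Python sketch using the standard genetic code, not Blastx itself; the function names and the 12-base example sequence are hypothetical.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def reverse_complement(seq):
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate(seq):
    """Translate complete codons; unknown codons become 'X'."""
    seq = seq.upper()
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))

def six_frames(seq):
    """Three forward frames (+1..+3) and three reverse-strand frames (-1..-3)."""
    frames = {}
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(s[offset:])
    return frames

frames = six_frames("ATGGCGAACTAA")
for name, peptide in frames.items():
    print(name, peptide)
```

A protein encoded on the strand complementary to the prion ORF would show up only in the minus frames, which is why its sequence need bear no resemblance to prion protein.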

Protein made from this alternative gene would have nothing in common with the prion protein sequence, properties, or regulation, though interference of mRNAs remains an issue. If statistics rules out a fluke occurrence of a hybridizing species and no regulatory role is found, then the likeliest explanation is that a prion pseudogene has become integrated into another gene, oriented so that the prion minus strand has become (a harmless?) part of the mRNA of a second gene, just as the mariner pseudogene appears in the ruminant prion 3' UTR mRNA.

On 21 Aug 99, dozens of Blastn and Blastx searches using probe B and its extensions with a range of parameter options failed to turn up a strong candidate for the 4.6 kb species. So this will have to be revisited as the human genome project progresses.

References:

Anti-prions and other agents.

Nature 1991 Jul 25;352(6333):291-2 
Hewinson, RG et al [omitted at Medline]
Comment on: Nature 1991 Feb 14;349(6310):569-71 [SEs: The prion's progress. Weissmann C]
Comment on: Nature 1991 May 9;351(6322):106 [Anticipating the anti-prion protein? Goldgaber D]
This paper was the first to show the 4.6 kb mRNA.

An anti-prion protein?

Nature 1993 Mar 18;362(6417):213-4 
Moser M, Oesch B, Bueler H
This paper showed that the polyadenylated 4.6 kb mRNA was present in hamster and mouse knockouts.

Cerebral amyloid angiopathy in an aged great spotted woodpecker.

Neurobiol Aging 1999 Jan-Feb;20(1):53-6
Nakayama H, Katayama K, Ikawa A, Miyawaki K, Shinozuka J, Uetsuka K, Nakamura S, Kimura N, Yoshikawa Y, Doi K
A male great spotted woodpecker (Picoides major), which was at least 16 years old, died due to general weakening. Cerebral vascular walls, including capillaries, were positively stained with Congo red with green-gold birefringence, and some of which showed a severe deposition of the Congophilic materials resulting in a corona-like fibrillar radiating structure.

The Congophilic materials were positive for beta amyloid protein, but negative for prion protein. Only a few senile plaque-like structures were observed in the cortex by PAM stain and beta amyloid immunostain. The present case is the first observation of cerebral amyloid angiopathy in avian species and will indicate the presence of such age-related cerebral lesions also in birds.

Comment (webmaster): This is quite interesting in that the amyloidosis may be part of the normal aging process. The congo red test is solid support. This is a different kettle of fish from assertions of TSE in squirrels etc.

Raman Study of Copper(II) Binding Mode of Prion Octapeptide

Biochemistry 1999 Aug 31;38(35):11560-11569 
Miura T, Hori-i A, Mototani H, Takeuchi H
 
The cellular form of prion protein is a precursor of the infectious isoform, which causes fatal neurodegenerative diseases through intermolecular association. One of the characteristics of the prion protein is a high affinity for Cu(II) ions. The site of Cu(II) binding is considered to be the N-terminal region, where the octapeptide sequence PHGGGWGQ repeats 4 times in tandem. We have examined the Cu(II) binding mode of the octapeptide motif and its pH dependence by Raman and absorption spectroscopy. At neutral and basic pH, the single octapeptide PHGGGWGQ forms a 1:1 complex with Cu(II) by coordinating via the imidazole N(pi) atom of histidine together with two deprotonated main-chain amide nitrogens in the triglycine segment. A similar 1:1 complex is formed by each octapeptide unit in (PHGGGWGQ)(2) and (PHGGGWGQ)(4).

Under weakly acidic conditions (pH approximately 6), however, the Cu(II)-amide(-) linkages are broken and the metal binding site of histidine switches from N(pi) to N(tau) to share a Cu(II) ion between two histidine residues of different peptide chains.

The drastic change of the Cu(II) binding mode on going from neutral to weakly acidic conditions suggests that the micro-environmental pH in the brain cell regulates the Cu(II) affinity of the prion protein, which is supposed to undergo pH changes in the pathway from the cell surface to endosomes. The intermolecular His(N(tau))-Cu(II)-His(N(tau)) bridge may be related to the aggregation of prion protein in the pathogenic form.

Comment (webmaster): There have been numerous studies of isolated repeats and copper; no studies of copper in the intact repeat domain (pre-106) nor in native protein. This paper is remarkable for the inter-molecular bridging reported and for the modest pH change bringing this about. pH has been studied before so it is not immediately clear why this phenomenon was not seen earlier. Copper bridging makes no real sense in the pathology: the repeat region is inessential for this and the aggregate has been known for decades to be a classical cross-beta amyloidosis. However there could be a role here in normal function.

Oligopeptide-repeat expansions modulate 'protein-only' inheritance in yeast.

Liu JJ, Lindquist S
Nature 1999 Aug 5;400(6744):573-6 
The yeast [PSI+] element represents a new type of genetic inheritance, in which changes in phenotype are transmitted by a 'protein only' mechanism reminiscent of the 'protein-only' transmission of mammalian prion diseases. The underlying molecular mechanisms for both are poorly understood and it is not clear how similar they might be. Sup35, the [PSI+] protein determinant, and PrP, the mammalian prion determinant, have different functions, different cellular locations and no sequence similarity; however, each contains five imperfect oligopeptide repeats-PQGGYQQYN in Sup35 and PHGGGWGQ in PrP. Repeat expansions in PrP produce spontaneous prion diseases.

Here we show that replacing the wild-type SUP35 gene with a repeat-expansion mutation induces new [PSI+] elements, the first mutation of its type among these newly described elements of inheritance. In vitro, fully denatured repeat-expansion peptides can adopt conformations rich in beta-sheets and form higher-order structures much more rapidly than wild-type peptides. Our results provide insight into the nature of the conformational changes underlying protein-based mechanisms of inheritance and suggest a link between this process and those producing neurodegenerative prion diseases in mammals.

Comment (webmaster): The paper looks at a double deletion and a double insertion, finding that only the latter leads to the phenotype. While there is no harm in this for yeast, no analogy to the repeat situation in mammalian prion exists despite claims made in the paper. First, the repeat region can be deleted altogether and prion disease can still occur in animal models. Second, the repeat region is typically clipped and missing from in vivo amyloid. Third, two extra repeats are not associated with enhanced risk for CJD, and fourth, the goat double deletion is accompanied by a significant point mutation and only one heterozygous goat showed an increased incubation period when exposed to full repeat sheep agent also missing the point mutation (so simply attributable to interference and under-production). The bottom line is that the yeast system has nothing more in common with mammalian prion than any of the other 25 amyloidoses do.
