Human Ghost Prion

Prnd and Prnp gobbled up by human genome project
Which came first, the prion or the ghost?
Ghost prion promoters
Bridging the gap
Pseudogene prion neighbor: isopentenyl diphosphate:dimethylallyl diphosphate isomerase IPP
Mutant ubiquitinylation tied to brain diseases
Pentosan binding to mature prion protein
French review of FFI: haplotypes and phenotypes
Protection of personnel and environment against CJD in pathology laboratories.
Leptomeningeal melanoma and CJD in a patient with chronic lymphocytic leukaemia.

Human Ghost Prion

3 Sep 99 webmaster research
The sequencing of the human genome, and of chromosome 20 in particular, is moving at a rapid pace. This will allow a global search for other genes related to the prion and ghost prion proteins and will have significant ramifications for determining the normal function as well as the origin of these gene products. Other genes of interest include diverged duplicates (paralogues with different but probably related function), fossil pseudogenes (snapshots of the active gene at the time of creation that evolved subsequently under neutral selection), orthologues (that for some reason escaped detection by Southern blot), and locally adjacent non-homologous genes (that might offer clues to overall cluster function). In the last few weeks, an adjacent paralogue of the prion protein and an adjacent isoprenyl isomerase pseudogene have been found.

On 11 Aug 99, the chromosome 20 team at the Sanger Centre uploaded a huge swath of chromosome 20 sequence enveloping the human Prnd doppelganger gene 5' as well as the main prnp gene (without having the time for analysis or annotation). That sequence of 159,851 bp in 30 unordered non-overlapping pieces appeared on GenBank as AL109808.

One contig in particular, dJ1068H6.00023, covers the new ghost prion gene [Prnd: prion-like doppel] and provides a better translation to protein than the mRNA-derived EST AL042906, which grounded out 5' and had 1 error 3'. The correct sequence for the human prnd protein is 72% identical to mouse and 82% similar:


 M  R  K  H  L  S  W  W  W  L  A  T  V  C  M  L  L  F  S  H 
 L  S  A  V  Q  T  R  G  I  K  H  R  I  K  W  N  R  K  A  L 
 P  S  T  A  Q  I  T  E  A  Q  V  A  E  N  R  P  G  A  F  I 
 K  Q  G  R  K  L  D  I  D  F  G  A  E  G  N  R  Y  Y  E  A 
 N  Y  W  Q  F  P  D  G  I  H  Y  N  G  C  S  E  A  N  V  T 
 K  E  A  F  V  T  G  C  I  N  A  T  Q  A  A  N  Q  G  E  F 
 Q  K  P  D  N  K  L  H  Q  Q  V  L  W  R  L  V  Q  E  L  C 
 S  L  K  H  C  E  F  W  L  E  R  G  A  G  L  R  V  T  M  H 
 Q  P  V  L  L  C  L  L  A  L  I  W  L  T  V  K  -  
The new sequence contains 334 bp upstream of ORF of the human gene, but no particular match to the 7bp 5' UTR of exon 2 of mouse, ATTCACC. The contig provides 4855 bp of distal human chromosome [beginning at position 866] so 3009 bp more than for mouse. The alignment with mouse is spotty below the ORF but with a 78% identical hit over 89 bp within exon 3 of mouse prnd (38013-39315): 38124/2521-38209/2608 [notation: mouse U29187/human 00023].
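The coordinates quoted above can be cross-checked from the protein listing itself. The following is a minimal sketch, assuming the ORF is uninterrupted, begins immediately after the 334 bp of upstream sequence, and that contig 00023 is 5717 bp long (the length given in the contig build further down the page). The names `orf_layout`, `PRND_ROWS` and `DISTAL_BP` are illustrative only.

```python
# Residues transcribed from the listing above; the trailing "-" (stop) is left out.
PRND_ROWS = [
    "MRKHLSWWWLATVCMLLFSH",
    "LSAVQTRGIKHRIKWNRKAL",
    "PSTAQITEAQVAENRPGAFI",
    "KQGRKLDIDFGAEGNRYYEA",
    "NYWQFPDGIHYNGCSEANVT",
    "KEAFVTGCINATQAANQGEF",
    "QKPDNKLHQQVLWRLVQELC",
    "SLKHCEFWLERGAGLRVTMH",
    "QPVLLCLLALIWLTVK",
]
PRND_PROTEIN = "".join(PRND_ROWS)

UPSTREAM_BP = 334   # bp of contig 00023 ahead of the initiating ATG (from the text)
CONTIG_LEN = 5717   # length of contig 00023 as listed in the contig build


def orf_layout(n_residues, upstream_bp):
    """Return 1-based contig coordinates implied by an uninterrupted ORF."""
    atg = upstream_bp + 1
    last_sense_base = upstream_bp + 3 * n_residues  # last base of the final codon
    stop_end = last_sense_base + 3                  # last base of the stop codon
    return {
        "atg": atg,
        "last_sense_base": last_sense_base,
        "stop_end": stop_end,
        "post_orf_start": stop_end + 1,
    }


LAYOUT = orf_layout(len(PRND_PROTEIN), UPSTREAM_BP)
# Sequence distal to the last sense codon (stop codon included in the count).
DISTAL_BP = CONTIG_LEN - LAYOUT["last_sense_base"]

if __name__ == "__main__":
    print(len(PRND_PROTEIN), LAYOUT, DISTAL_BP)
```

Under these assumptions the 176 residues occupy 528 bp, the stop codon ends at 865, and post-ORF numbering begins at position 866, in agreement with the text. The 4855 bp figure is reproduced if the distal sequence is counted from the base after the last sense codon.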

This contig explains two mysteries: why so few human ESTs were in the database and why the last 221 bp of human mRNA AL042906 had no homology to mouse. Mouse prnd is simply a poor probe for the rapidly evolving 3' UTR. A good human probe like contig 00023 easily finds human mRNAs at 161-653 (eg, AL041968) and 2087-2557 (eg, AI655440) and also shows that AA234322 extends to 11 bp of 5' UTR.

9 Blastn hits of contig 00023 to human dbEST

The 9 mRNA fragments in the human EST database on 3 Sep 99 that match the post-CDS part of contig 00023 occur in two distinct groups, both strongly nested. The first group (AL041968: 161-693, AI656950, and AI637716) defines a region extending 533 bp from 161-693 below the end of the CDS. The second group (AI242370, AI288920, AI337054, AI655440:2087-2558, AI825182, AA758081: 2344-2644) defines a transcribed region of 578 bp from positions 2087-2664. (There are two additional ESTs in the CDS, AL042906 and AA234322.)

The human sequence has a set of 5 post-ORF retrotransposons within contig 00023, 2 of which are primate-specific (Alu) and 3 of which were possibly inserted in the era of the common ancestor with rodents. (However, the LINE2B and MIR are not found in mouse sequence; the L1ME4 is possibly too distal.)

The question now is how to align the post-CDS sequences of mouse and human so that gene structure, mRNAs, polyadenylation, alternate splicing, and alternate termination can be compared and the evolutionary conservation of these features assessed. Despite the lack of retrotransposon anchors common to mouse and human, the clustering pattern of human ESTs suggests a similar exon/intron pattern to mouse. There is no analogue to a short exon 2a in human in the ESTs; the key EST, AL042906, extends 209 bp contiguously into the post-CDS (versus 49 bp for mouse exon 2b) after correcting many small errors in its alignment with contig 00023. (AL042906 is the flawed sequence, because AL041968 agrees perfectly with contig 00023.) AL042906 overlaps 3' UTR AL041968 by 48 bp and begins 34 bp within the LINE2B. All 11 human ESTs are contiguous; none span a splice site.

Mouse is not an ideal probe to retrieve human ESTs, given the divergence. The two human CDS ESTs retrieved by mouse are AL042906 and a shorter pseudo-subsequence AA234322. The former sequence starts at the 25th bp of the coding sequence and so has no information about 5' UTR or any upstream exon. Contig 00023 is a better probe to proximal human ESTs (though not to an exon 1 like that of mouse): it finds no new ESTs but does improve the alignment significantly with AA234322, from 322-334 of contig 00023.

In other words, this EST does in fact go upstream 11 bp of the beginning of the CDS, agGTTCTGACGCG. (It was sequenced at WashU 06-AUG-1997). This establishes that human, like mouse, has a stretch of 5' UTR as part of the CDS exon. (There is no splice site here in contig 00023. Mouse only had 7 bp prior to a splice site with exon 1.) GRAIL analysis suggests that the AG do not belong with the leader but instead are the last 2 bp of a putative exon 1, which in fact is consistent. Indeed, a splice site is reported at position 323 of contig 00023, ie at just this spot.

Contig:    321 CAGGTTCTGACGC-Gatgaggaagcacctg-agctggtggtggctggccactgtctgcatg 379
                |||||||||||| |||||||||||||||| | ||||||||||||||||||||||||||||
AA234322:  1   AAGGTTCTGACGCCgatgaggaagcacctgtatctggtggtggctggccactgtctgcatg 61
Human and mouse sequences do not align in the short 5' UTR (nor in the 334 bp available from the contig):
AGGTTCTGACGCG  atg...  human
A..TTC..AC.C.  atg...  mouse 36205

34081 gtgagGGCTCCAAGCTTCAGAGGCCACAGTAGCAGAGAACCGAGgtatgtggcggggatt  mouse splice junction
....................36191 gacagcccagcctttcccttgcagATTCACCatgaagaac 
...........................tgacccaccgccgtttctctggcAGGTTCTGACGCGatg  human splice junction?

Exon 2ab of mouse barely aligns significantly (if generously gapped: open penalty 2, extension 1) with human ESTs from the proximal group: 346-455. However, this is almost entirely contained in a human MIR element, suggesting a fragment of this ancient retrotransposon has gone unrecognized in mouse (positions 346-455 human; 301-403 mouse). If so, this suggests that prnd diverged long before the MIR insertion, which in turn preceded human/mouse divergence. GenScanW reports a poly A signal at AATAAA at position 551 of human.

Exon 3 of mouse has a similarity to contig 00023 (masked for 2 Alu) at positions 1656-1743 human; 112-197 [38125] mouse exon 3 or 2095-2180 post-CDS. Here the comparative mapping places this sequence agreement within both mouse mRNA and a reasonable extension of distal human ESTs. Allowing for the 264 bp of human-only Alu-Sq and 113 bp of mouse-only PB1D9 puts the exon 3 splice sites at 1396 and 1262 bp post-CDS, respectively. Allowing for the later Alu-Sx, the human mRNA would terminate shortly before the L1ME4 element at 3162 (if it is the same length). In this scenario, the mouse sequence does not have this retrotransposon. But the overall lack of homology suggests that the mRNA situations will not have strongly conserved parallels, in contrast to the prion gene situation.

Provided gap penalties are set loosely (1 for opening, 1 for extension), a match is found between a portion of exon 3 of mouse and a region upstream of the distal human ESTs:

exon 3 mouse/human Alu masked contig (but numbered post-ORF).  Identities = 70/89 (78%), Positives = 70/89 (78%), Gaps = 4/89 (4%)  Expect = 3e-05  4 gaps of 1 bp, 7 transitions, and 8 transversions.  Rat sequence may allow some gaps to be categorized as deletions or insertions; for now, rat EST may have sequence error.
human: 1656 cattgcagaacatgagtgctgatgagg-gcacctcttgtgctgagtcccctcagctatca 1714
            |||| ||||||| |||||||||| ||  | || ||  ||||||||| |||   || ||| 
mouse: 112  cattccagaacacgagtgctgat-agcagtacttcgagtgctgagttcccagtgcaatc- 169

human: 1715 gtgttcttctcaaggacacatttggaagg 1743
            | ||||||||||||||||| | ||| |||
mouse: 170  g-gttcttctcaaggacacctctggcagg 197

exon 2b mouse/human 3'UTR Identities = 82/112 (73%),  Gaps = 11/112 (9%)
human: 346 aggaaactgaggcccagagagctgaagtac-tgcacccagcatcaccagctagaaagtgg 404
           ||||||||||||| || |  || | ||| | ||   ||||||    ||| ||||||   |
mouse: 301 aggaaactgaggctcacattgcaggagtccctgt--ccagca----cagttagaaaacag 354
human: 405 cagagccaggattcaaccctggct-tgtctaaccccaggttttctgctctgt 455
            |||||  || |||  ||||| || |||||  || |||||||||||||||||
mouse: 355 aagagctggggttc--ccctg-ctctgtcttccctcaggttttctgctctgt 403
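The row-end coordinates and the summary line of a gapped alignment such as the exon 2b one above can be checked mechanically. The sketch below is only a sanity check, not the program that produced the alignment; the helper names `end_coordinate` and `alignment_stats` are made up for illustration, and the rows are transcribed from the exon 2b alignment.

```python
# Gapped rows of the exon 2b alignment, transcribed from the listing above.
HUMAN_ROWS = [
    (346, "aggaaactgaggcccagagagctgaagtac-tgcacccagcatcaccagctagaaagtgg"),
    (405, "cagagccaggattcaaccctggct-tgtctaaccccaggttttctgctctgt"),
]
MOUSE_ROWS = [
    (301, "aggaaactgaggctcacattgcaggagtccctgt--ccagca----cagttagaaaacag"),
    (355, "aagagctggggttc--ccctg-ctctgtcttccctcaggttttctgctctgt"),
]


def end_coordinate(start, gapped_row):
    """1-based coordinate of the last base in a row; gap characters consume no bases."""
    return start + len(gapped_row.replace("-", "")) - 1


def alignment_stats(top, bottom):
    """Return (columns, gap positions, identical columns) for two gapped strings."""
    if len(top) != len(bottom):
        raise ValueError("alignment rows must have equal length")
    columns = len(top)
    gaps = top.count("-") + bottom.count("-")
    identities = sum(a == b and a != "-" for a, b in zip(top, bottom))
    return columns, gaps, identities


HUMAN = "".join(row for _, row in HUMAN_ROWS)
MOUSE = "".join(row for _, row in MOUSE_ROWS)
COLUMNS, GAPS, IDENTITIES = alignment_stats(HUMAN, MOUSE)

if __name__ == "__main__":
    for start, row in HUMAN_ROWS + MOUSE_ROWS:
        print(start, end_coordinate(start, row))
    print(f"Identities = {IDENTITIES}/{COLUMNS}, Gaps = {GAPS}/{COLUMNS}")
```

For these rows the check gives end coordinates of 404 and 455 for human, 354 and 403 for mouse, with 82 identities and 11 gap positions over 112 columns, matching the header line. The same helpers apply to any of the other alignments on this page.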
AL041968 494 bp upstream hit whole length human testis mRNA EST 08-JUL-1999 begins 161 post ORF, ends 693, picks up 21 bp of mystery 221 distal as well as first two retros LINE2B and MIR, possible polyA.
        1 ctgcgttctg atagatgggg gactgtggct tctccgtcac tccattctca gcccctagca
       61 gagcgtctgg cacactagat tagtagtaaa tgcttgatga gaagaacaca tcaggcactg
      121 cgccacctgc ttcacagtac ttcccaacaa ctcttagagg taggtgtatt cccgttttac
      181 agataaggaa actgaggccc agagagctga agtactgcac ccagcatcac cagctagaaa
      241 gtggcagagc caggattcaa ccctggcttg tctaacccca ggttttctgc tctgtccaat
      301 tccagagctg tctggtgatc actttatgtc tcacagggac ccacatccaa acatgtatct
      361 ctaatgaaaa tgtgaaagct ccatgtttag aaataaatga aaacacctga gctggtggct
      421 gcgtactttg actggtacat ggaggctatt ttaatccttc taactaaact aaaatagatt
      481 aaagggaATT AAAc
The contig 00023 contains 5 retrotransposons and a simple repeat. The Alu repeats, which are primate-specific, should be masked out prior to alignment with other species; the other repeats could have counterparts in mouse (post-ORF numbering) but do not seem to (indicating poor correspondence):
human post-ORF retrotransposons:
 179    266   LINE2B   88 bp +
 302    441   MIR     140 bp -
 788   1051   Alu-Sq  264 bp -
1055   1072   (T)      18 bp +
2714   3005   Alu-Sx  292 bp -
3162   3417   L1ME4   256 bp -

mouse post-ORF retrotransposons:
35130   35341   URR1a   5772   5983   212
37553   37665   PB1D9   8195   8307   113
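Post-ORF numbering and contig numbering for the human repeats differ by a fixed offset, and element lengths follow from inclusive coordinates. A minimal sketch, assuming the offset is 865 (the ORF plus stop codon ends at contig position 865, so post-ORF position 1 is contig position 866); the constant and function names are illustrative.

```python
POST_ORF_OFFSET = 865  # assumed: contig position = post-ORF position + 865

# (name, start, end) in post-ORF numbering, from the human table above.
HUMAN_REPEATS = [
    ("LINE2B", 179, 266),
    ("MIR", 302, 441),
    ("Alu-Sq", 788, 1051),
    ("(T)", 1055, 1072),
    ("Alu-Sx", 2714, 3005),
    ("L1ME4", 3162, 3417),
]


def span_length(start, end):
    """Length of a 1-based inclusive interval."""
    return end - start + 1


def to_contig(position, offset=POST_ORF_OFFSET):
    """Convert a post-ORF coordinate to a contig 00023 coordinate."""
    return position + offset


LENGTHS = {name: span_length(s, e) for name, s, e in HUMAN_REPEATS}
CONTIG_SPANS = {name: (to_contig(s), to_contig(e)) for name, s, e in HUMAN_REPEATS}

if __name__ == "__main__":
    for name, s, e in HUMAN_REPEATS:
        print(f"{name:7s} {s:5d} {e:5d} {LENGTHS[name]:4d} bp  contig {CONTIG_SPANS[name]}")
```

With this offset, LINE2B at 179-266 post-ORF lands at 1044-1131 in contig numbering, and the inclusive span 3162-3417 of L1ME4 works out to 256 bp.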

Which came first, the prion or the ghost?

5 Sep 99 webmaster research
Because the prnd coding sequence has been evolving more rapidly than the prion gene over the last 100 million years, it appears to be under less selective pressure. Over the same time interval, wildtype (short incubation) mouse prion gene and human wildtype prion (met 129) have accumulated 24 amino acid changes out of 214 positions or 11.2% change [repeat regions excluded], for a rate of 11 accepted substitutions per 100 amino acids per 100 million years, whereas mouse and human prnd proteins differ at 49 positions out of 179 or 27.4%, for a rate of 27; prnd is thus evolving some 2.44 x faster than the prion proteins over this time frame. This would apply perhaps even more strongly to non-coding 3' UTR, of which little is needed for splice sites, polyA signals, and polyadenylation sites.
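The rate comparison is simple arithmetic on the counts quoted above. The sketch below uses raw percent difference, with no correction for multiple substitutions at the same site, which is an assumption about how the figures were derived; the function and variable names are illustrative.

```python
def percent_difference(differences, positions):
    """Observed percent of aligned positions that differ (no multiple-hit correction)."""
    return 100.0 * differences / positions


# Counts quoted in the text for mouse versus human.
PRION_PCT = percent_difference(24, 214)   # prion protein, repeat regions excluded
PRND_PCT = percent_difference(49, 179)    # prnd protein
RATE_RATIO = PRND_PCT / PRION_PCT         # same time span, so the ratio of rates

if __name__ == "__main__":
    print(f"prion: {PRION_PCT:.1f}%  prnd: {PRND_PCT:.1f}%  ratio: {RATE_RATIO:.2f}x")
```

Because both proteins are compared over the same mouse/human divergence time, the ratio of percent differences is also the ratio of rates, whatever divergence date is assumed.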

On the whole, prnd matches mammalian prion protein much more poorly than avian prion does. This suggests the globular domain duplication took place prior to the split of birds and mammals. [There is no sign of the pre-repeat, the repeat, or that incredibly conserved 106-126 region. Prnd and prion protein are, however, unmistakable paralogues.]

Yet because prnd is evolving faster, its divergence point might still have come after the bird/mammal divergence at 310 million years ago, though necessarily before the mouse/human divergence at 100 mya. A prnd sequence from sheep or cow would not help to align mouse and human prnd (there is no ambiguity now) but might resolve the ancestral residue at some of the 49 positions of disagreement through outgroup arbitration (three positions, shown below, are resolved with eutherian prion protein as distant outgroup). Positions where sheep agrees with mouse but not human are resolvable given the known tree topology. This in turn reduces ambiguity in aligning prnd with ancestral prion protein but, more importantly, allows comparison of reconstructed 100 my prnd with 100 my prion.

These sequences would have to be converging rapidly as one goes backwards in time to permit inference of the time of duplication (fix tree geometry). However, there is not any striking improvement in comparing well-established ancestral eutherian mammal prion to prnd. If rates of change have held roughly constant, this imperceptible rate of convergence supports very early divergence of prnd. In the definitely alignable regions, bird and mammal prions agree almost twice as often as prnd agrees with mammal prion to the exclusion of bird. This could reflect similar functions causing residues to be constrained differently in evolutionary rate than in prnd, which surely has a distinct function.

Only a single (hence idiosyncratic) marsupial sequence is available; that species is said to represent a divergence from the mammalian lineage at 150 million years. It may be easier to determine a prnd sequence here than in bird. It does not align with prnd noticeably better than placental mammals do. Note that prnd and prion protein together define a deep probe pattern that might be used to find an ancestral protein in nematode or Drosophila or zebrafish (whose function would hopefully be known); however, that protein would be dimly related to either contemporary mammal protein.

I[H/D]FG[N/A][E/D][Y/-][E/G][D/N]RYY[R/A][E/A]N[M/Y][Y/W][R/Q][F/Y]PDG[I/V]YY[R/E][P/G][C/V][D/S]E[A/Y][S/N][N/V][Q/T][K/N][E/N]XFVTDCVN[A/I]T is the best available probe at this time; there are no tBlastn matches to ESTs from zebrafish etc. as of 20 Sept 99.
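The bracket notation of this probe can be turned into a regular expression for scanning translated sequences locally. This is only a sketch of one way to use the pattern and is not a substitute for the tBlastn search described above; it assumes `[A/B]` means either residue, `-` inside brackets means the position may be absent, and `X` means any residue. The function name `probe_to_regex` is illustrative.

```python
import re

PROBE = (
    "I[H/D]FG[N/A][E/D][Y/-][E/G][D/N]RYY[R/A][E/A]N[M/Y][Y/W][R/Q][F/Y]"
    "PDG[I/V]YY[R/E][P/G][C/V][D/S]E[A/Y][S/N][N/V][Q/T][K/N][E/N]XFVTDCVN[A/I]T"
)


def probe_to_regex(probe):
    """Translate the slash/bracket probe notation into a regular expression string."""
    parts = []
    for token in re.findall(r"\[[^\]]+\]|[A-Z]", probe):
        if token.startswith("["):
            options = token[1:-1].split("/")
            residues = "".join(o for o in options if o != "-")
            # A "-" alternative is read as "this position may be missing".
            parts.append(f"[{residues}]" + ("?" if "-" in options else ""))
        elif token == "X":
            parts.append(".")      # any residue
        else:
            parts.append(token)    # fixed residue
    return "".join(parts)


PROBE_RE = re.compile(probe_to_regex(PROBE))

if __name__ == "__main__":
    print(PROBE_RE.pattern)
```

An exact regular expression is far stricter than a similarity search: a single substitution outside the listed alternatives prevents a match, so a negative result here says little about distant homologues.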

(Note that although prnd is called the ghost, it is not clear at this time which stretch of chromosome holds the older segment, ie the prion gene may be the duplicated section. Its slower rate of change may indicate that it retained the conserved ancestral role after duplication even though it was the new copy. Prnd lacks enough other structure to be a functional gene without the prion globular domain.)

The only regions of persuasive matching of all three groups of proteins are in the C-terminal tail, about the first disulphide and glycosylation site, FVHDCVNIT, and most strikingly in the earlier DRYYRENMYRYPNQVYYR region (146-164 in human prion notation). This latter region might be called the cross-under region in terms of the 3D structure. It forms the deep interior core of the protein in conjunction with the interior FVHDCVNIT region. As noted many times earlier on this site, this is by far the most conservative part of the prion 3D fold. By anchoring the prnd structure with these coordinates borrowed from the mammalian prion protein and using the alpha helices of prnd, the 3D structure of central prnd can be worked out with a considerable degree of confidence. The level of homology, 22% or so, is more than adequate for strong conservation of the overall fold.

mouse     human     mouse     human     bird
 prnp      prnp      prnd      prnd      prnp

D.........D.........N.........N.........Y....prnd closer to mammal prion than to bird
R.........R.........R.........R.........R....all agree
Y.........Y.........Y.........Y.........W....mammal/mammal prions/prnd agree
Y.........Y.........Y.........Y.........W....mammal/mammal prions/prnd agree
E.........E.........A.........A.........E....mammal/bird prions agree
N.........N.........N.........N.........N....all agree
M.........M.........Y.........Y.........S....marsupial is Q here
Y.........H.........W.........W.........A....mammal/mammal prions/prnd aromatic
R.........R.........Q.........Q.........R....mammal/bird prions agree
Y.........Y.........F.........F.........Y....mammal/bird prions agree
P.........P.........P.........P.........P....all agree
N.........N.........D.........D.........N....mammal/bird prions agree
Q.........Q.........G.........G.........Q....mammal/bird prions agree
V.........V.........I.........I.........V....mammal/bird prions agree
Y.........Y.........Y.........H.........Y....ancestral resolved to Y (marsupial is M)
Y.........Y.........Y.........Y.........Y....all agree 
R.........R.........E.........N.........R....mammal/bird prions agree

F.........F.........L.........F.........F....ancestral resolved to F
V.........V.........V.........V.........V....all agree
H.........H.........H.........H.........A....mammal/mammal prions/prnd agree
D.........D.........S.........S.........D....mammal/bird prions agree
C.........C.........C.........C.........C....first disulphide cysteine preserved
V.........V.........V.........I.........F....ancestral prnd resolved to V
N.........N.........N.........N.........N....signal for glycosylation preserved
I.........I.........A.........A.........I....mammal/bird prions agree
T.........T.........T.........T.........T....first glycosylation site preserved

best probe to nematode

Ghost prion promoters

5 Sep 99 webmaster research
Sanger Centre contig dJ1068H6.00894 has no homology to any part of human sequence U29185 nor to prnd contig 00023, yet positions 33873-34136 of mouse U29187 match (with astronomical odds, 5e-22) positions 2386 to 2646 of this 2849 bp contig with overall identity of 77%.

This therefore represents 2849 bp of new human chromosome 20 sequence. It is no coincidence that mouse exon 1 occupies a portion of the overlap, 34086-34124. Since exon 2 of mouse does not begin until 36205, some 2082 bp might be expected before exon 2 of human begins. However, this contig extends mainly in the 'wrong' 5' direction and so provides only another 203 bp in the direction of contig 00023 or about 10% of the expected gap. (Overall, a 4706 bp gap exists in humans between the end of U29185 and the beginning of contig 00023, assuming similarity to mouse; 00894 thus bridges 60% of it.)

It is not surprising that this is the best conserved stretch in the region: intron 1 will be poorly conserved except for splice junctions (inter-gene regions should drift apart rapidly); exon 1 and its upstream promoter and downstream splice junction and post-promoter region would be conserved because of functional constraints. Mouse and human promoters thus differ at 11 of 39 positions, so are 72% conserved. This is better than exon 1 of prion promoter, which has a very problematic situation, but is similar to prion exon 2. The human exon 1 and promoter regions need experimental confirmation; when abutted with contig 00023 (assuming missing intervening sequence is simply irrelevant intron 1), GenScanW does not confirm any exon 1 here. GRAIL indicates that the terminal AG belongs to the end of the AA234322 EST.

Assuming that the upstream boundary of aligning regions delimit the key regulatory region of the prnd promoter, it follows that mouse promoter extends from 33873-34085 or 213 bp and that human prnd promoter extends 210 bp from 2386-2595.

Human exon 1 and promoter with predicted transcription start underlined:
5' gccatcaagattttcacgtggtttccttagtaaagtgtgatgagaaggtccatccttctcaggatgaaggagtggtccaggaagccctgattggtctgccggggagggaa
Mouse and human prnd promoters and exon 1's aligned
mouse: 33873 catcgagatcttcacgtggttttatcagtgaagagtaacaagaggctccaaccttc-cag 33931
             |||| |||| ||||||||||||  | ||| ||| || |  ||| | |||| ||||| |||
00894: 2386  catcaagattttcacgtggtttccttagtaaagtgtgatgagaaggtccatccttctcag 2445
mouse: 33932 ggtggaggagtgatgcagga-gcccttgattggtcctgctgtggagggaagggctgccct 33990
             | || ||||||| | ||||| ||||| |||||||| ||| | |||||||||||||||| |
00894: 2446  gatgaaggagtggtccaggaagccct-gattggtc-tgccggggagggaagggctgcctt 2503
mouse: 33991 atttggagggctggagctcggtagagaggccccctccccctgcagcgcctaTATAgctga 34050
             ||||||||  ||| ||       || | ||| |||||||| ||||| |||||||||||| 
00894: 2504  atttggagacctgcag-------ga-atgccacctcccccggcagctcctaTATAgctgg 2555
mouse: 34051 gtgg---tggc--ccagggaaggtgttccggagaagtgaG-GGCTCCAAGCTTCAGAGGC 34104
             | ||   ||||  |||  || ||||| | || | | || |  |||| | ||| |||||| 
00894: 2556  gcggacctggctgcca-agagggtgtgctgggggactgtGCAGCTCGAGGCTCCAGAGG- 2613
mouse: 34105 CACAGT--AGCAGAGAACCGAGgtatgtggcggg 34136
             |||| |  || ||||| || ||||| |||| |||
00894: 2614  CACACTCCAG-AGAGAGCCAAGgtacgtgggggg 2646
The putative splice junction, which would remove human intron 1 should be:
................tgacccaccgccgtttctctggcagGTTCTGACGCGatg  human splice junction
Contig 00894 pulls up no ESTs in either human or mouse by Blastn. No genes or exons are found by GenScanW within it. There seems no simple way of extending it within currently available chromosome 20 data. Analysis for retrotransposons shows a single early insertion, 6-158 MLT1A- of 153 bp (missing in mouse).

The human prnd gene is usefully analyzed using the dozens of programs (most of which are mediocre) at Baylor:

GENIE finds on all of contig 00023:

Exon 0: 335- 561
Exon 1: 960- 1005 or 1017
Exon 2: 1208- 1300
Number of Donor    sites:  
    2   1018  0.81
    3   1185  0.83
   12   4696  0.82
Number of Acceptor sites:   
    1    323  0.80
    8   3040  0.69
     6 potential polyA sites were predicted
 1417 LDF-  4.10
 2212 LDF-  2.59
 2239 LDF-  1.28
 3178 LDF-  3.20
 3503 LDF-  5.33
 4220 LDF-  4.52
Transcription factor binding sites are reported by Transfac's MatInspector; their significance awaits experimentation. They are given below in order of confidence level (highest at top), ie, there are two (fused) TATA type 1 recognition sites with high scores (search sequence is ctaTATAggagctgc) about 27 bp ahead of the transcription start site, in both human and mouse.
# Name	    Confidence
2 TATA_01	5.33
1 AP2_Q6	4.78
1 GKLF_01	4.76
2 NF1_Q6	4.11
4 IK2_01	3.95
3 MZF1_01	3.84
1 CREL_01	2.74
2 USF_Q6	2.52
2 DELTAEF1	2.42
1 CAAT_01	2.21
2 USF_C	    2.19
2 GC_01	    2.12
1 CEBPB_01	2.07
1 TH1E47_01	1.95
1 NFAT_Q6	1.91
3 SP1_Q6	1.7
2 MYCMAX_02	1.67
2 NMYC_01	1.64
1 GFI1_01	1.57
2 CETS1P54	1.53
2 USF_01	1.15
4 AP4_Q5	0.96

Bridging the gap

6 Sep 99 webmaster research
How much sequence is missing between human prion and human prnd -- and how do we bridge the gap? One method is to wait for the Sanger Centre to finish this region of chromosome 20; perhaps the imminent JMB paper has it also. There may be additional related genes in either direction -- possibly it was a region spanning prnp-prnd that was duplicated (though the tandem nature today suggests not). For that reason, extensions 5' and 3' should be analyzed as well. (The contigs here are only a small part of the enveloping 159,851 bp of a larger region being sequenced in 30 as-yet unordered pieces, HSJ1187J4. On quick analysis with GenScanW, this region has other gene candidates.)

To estimate the prnp-prnd gap, note mouse CDS-CDS gap is 15,764 bp and mouse mRNA-CDS gap is 12,411 bp. Mouse prion CDS ends at 20441 and the mRNA at 21675; exon 1 of the prnd gene begins at 34086 and the CDS leader at 36205. The human prion gene CDS ends at 26211 and its mRNA ends at 27817 but the sequence extends 7705 beyond this. Both the gap to human prnd exon 1 and that to contig 00023 are to be determined. Assuming similarity to mouse, there is an estimated 4706 bp gap in humans. As noted above, contig 00894 spans 2849 bp of this; as an unfinished sequence it may still contain artifacts from E. coli, yeast, vector, phage etc.
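The gap estimate follows directly from the coordinates just quoted. A minimal sketch of the arithmetic, assuming (as the text does) that the human spacing from the end of the prion mRNA to prnd exon 1 equals the mouse spacing; the dictionary keys and variable names are illustrative.

```python
# Coordinates quoted in the text (mouse U29187, human U29185).
MOUSE = {
    "prnp_cds_end": 20441,
    "prnp_mrna_end": 21675,
    "prnd_exon1_start": 34086,
    "prnd_cds_start": 36205,
}
HUMAN = {
    "prnp_cds_end": 26211,
    "prnp_mrna_end": 27817,
    "flank_beyond_mrna": 7705,  # sequenced bp past the end of the prion mRNA
}

MOUSE_CDS_GAP = MOUSE["prnd_cds_start"] - MOUSE["prnp_cds_end"]
MOUSE_MRNA_GAP = MOUSE["prnd_exon1_start"] - MOUSE["prnp_mrna_end"]

# Last sequenced position of the human record.
HUMAN_SEQUENCE_END = HUMAN["prnp_mrna_end"] + HUMAN["flank_beyond_mrna"]

# Assumption: human mRNA-to-exon-1 spacing equals the mouse spacing.
HUMAN_GAP_ESTIMATE = MOUSE_MRNA_GAP - HUMAN["flank_beyond_mrna"]

CONTIG_00894_LEN = 2849
FRACTION_BRIDGED = CONTIG_00894_LEN / HUMAN_GAP_ESTIMATE

if __name__ == "__main__":
    print(MOUSE_CDS_GAP, MOUSE_MRNA_GAP, HUMAN_SEQUENCE_END, HUMAN_GAP_ESTIMATE)
    print(f"{FRACTION_BRIDGED:.0%} of the estimated gap is covered by contig 00894")
```

The estimate is only as good as the assumption of equal spacing; repeat insertions on either lineage would shift it.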

This contig family could be partly assembled using U29185 and U29187 as probes. Further, the set could be used to 'patch' missing human sequence between the end of the prnp gene and the beginning of prnd, by looking for staggered extensional sequence or by using mouse homology to assemble contigs. Similarly, the 5' and 3' ends of the chromosome 20 region might be expanded.

Contig dJ1068H6.00167 seems to extend human prion from its 3' end; dJ1068H6.00024 seems to extend off mouse U29187 some 1100 bp in a favorable direction; and dJ1068H6.00905 also may cover human promoter (or contain a retrotransposon related to that of 00023).

The bottom line is that contig 00024 extends the published 3' end of human U29185 by 6755, followed by 2884 bp of contig 00905, then 2849 bp of contig 00894 (with putative exon 1 of human prnd), then apparently 1703 bp of contig 01225 (intron 1 of prnd), and finally the 5717 bp of contig 00023 (CDS of prnd).

This allows the intergene distances to be calculated for both human and mouse: inter-CDS distances (stop to met) are 14472 bp and 15771 bp, resp.

Pseudogene neighbor: isopentenyl diphosphate:dimethylallyl diphosphate isomerase IPP

6 Sep 99 webmaster research
If this region of the chromosome covered by dJ1068H6 averages one active gene per 25-35 kilobase, then dJ1068H6 should contain 2-3 neighboring genes (in addition to prnp and prnd). Since prnd turned out to be of exceptional interest, it is worth analyzing all 30 fragments of dJ1068H6 [fasta] even if these neighbors cannot yet be sited.

In searching for genes, one must be aware of pseudogenes, retrotransposon transposases etc. that appear in many places as weak homologues, as well as alternate splicing products (even of prnp and prnd), immature analytic software, and similar complications.

The first step is to analyze each of the 30 contigs to see if they cover known parts of chromosome 20 spanned by prion-prnd, so that these can be excluded from further analysis. This can be done by selecting non-repeated elements along the prion-prnd axis (eg, exon 2 of prnp) as probes and blasting them against Sanger Centre chromosome 20 to recover the hopefully single contig that contains them.

Next the residual contigs can have their retrotransposons mapped and masked. The resulting sequences can then be examined by GenScanW for candidate genes -- even though these are unplaceable for the time being. These candidates can be tested further against EST and non-redundant databases to see if they are active or known genes.
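The masking step can be pictured as overwriting each mapped repeat interval before the sequence goes to a gene finder. The sketch below is a generic illustration, not the masking tool used for this analysis; it assumes repeats are given as 1-based inclusive (start, end) pairs like those in the repeat tables on this page, and `mask_repeats` is a hypothetical helper name.

```python
def mask_repeats(sequence, repeats, mask_char="N"):
    """Return a copy of sequence with each 1-based inclusive interval overwritten."""
    masked = list(sequence)
    for start, end in repeats:
        if start < 1 or end < start:
            raise ValueError(f"bad interval: {start}-{end}")
        # Convert to 0-based slice bounds and clip to the sequence length.
        for i in range(start - 1, min(end, len(masked))):
            masked[i] = mask_char
    return "".join(masked)


if __name__ == "__main__":
    toy = "ACGTACGTAC"
    print(mask_repeats(toy, [(2, 4), (8, 9)]))
```

Masking with `N` keeps every downstream coordinate unchanged, which is why positions on a masked contig can still be quoted in the original numbering.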

This approach quickly turns up a degenerate pseudogene for isopentenyl diphosphate:dimethylallyl diphosphate isomerase IPP on contig 01219 which is examined in detail below.

The 1998 radiation hybrid map of chromosome 20 displays various genes such as CDP-diacylglycerol synthase and various microsatellites as immediate neighbors of the prion gene. Again, these can be tested for coverage by dJ1068H6. A recent E200K study also constructed a fine-structure YAC map of the prion gene relative to numerous microsatellites, most of which are on the RH map as well. The goal here is to determine the orientation of the prion-prnd axis relative to the centromere-telomere axis as well as to locate neighboring features and anchor the contig on chr 20.

Build of Chromosome 20 12pter using dJ1068H6

Ordered 5' to 3' relative to prion gene + strand
56,347 bp are joined and ordered; 9,226 are new.  U29185 had 35,522 bp.

00030   3223 bp  5' end of prion, 2525 bp new 5'overhang, 1626 bp retros, dJ1164I1.352 close
.......... 239    528   Alu-Y -
.......... 529    559   (A) -
.......... 960   1246   Alu-Sx -
..........1630   1756   Alu-J -
..........1891   1995   (CT) -
..........1996   2254   Alu-Sg +
..........2320   2731   MER74 -
..........2813   2927   MER3 -
00471  11942 bp  exon 1 of prion gene
00270   2597 bp  exon 2 of prion gene
01097   4040 bp  intron 2 15895-19934 within U29187, minus strand
00808   5381 bp  5' end of exon 3 prion CDS
00562   2907 bp  3' end of exon 3 of prion gene
00167   6349 bp  almost 3' end of prion gene, 54 bp short, 2684 bp retros
.......... 731   1191   MLT2B2-
..........1346   1632   Alu-
..........1871   2157   Alu-Sg-
..........2168   2336   Alu-Jo-
..........2812   3098   MER115-
..........3529   3661   Alu-Jo+
..........4048   4202   MER5A+
..........4481   4663   MER5A+
..........5445   5736   Alu-Sz-
..........5844   6291   MER88+
00024   6755 bp  3' end of prion gene +6701 bp new human sequence
..........   1    130   Alu-J +
.......... 148    426   L1ME3A +
.......... 642    856   MER80 +
.......... 861    911   MER81 -
.......... 928   1023   MER81 -
..........1028   1263   MER80 +
..........1294   1583   Alu-Sx +
..........2669   3068   L1PA7 -
..........3719   4007   Alu-Sq -
..........4061   4419   L1MB7 +
..........4733   5020   Alu-Jb -
..........6237   6616   MLT1B -
00905   2884 bp  matches inter-gene region of mouse U29187
..........1914   2077   L1ME2-
..........2078   2391   L1ME3A-
..........2393   2681   Alu-Sc-
..........2770   2883   MLT1+

00894   2849 bp  contains exon 1 of prnd gene
.............6    158   MLT1A- of 153 bp 
01225?  1703 bp  intron 1: matches 168 bp post exon 1 prnd not URR1A 35130-35341 artefact as 35517-35678
.......... 920   1091   MIR  -
..........1373   1637   Alu-Jo -
..........1640   1692   (GT) -
00023   5717 bp  contains prnd gene, 5 retrotransposons, analyzed above.
..........1044   1131   LINE2B-d
..........1167   1306   MIR+
..........1653   1916   Alu-Sq+
..........1920   1937   (T)-
..........3579   3870   Alu-Sx+
..........4027   4282   L1ME4+

18 contigs not locatable at this time (not homologous to proximal or distal mouse)
80,304 bp that extend prnp-prnd region without gaps or overlaps

01219+  3156 bp IPP pseudogene, analyzed below
..........   1     81   Alu-Jo-
..........1032   1308   Alu-Jo-
..........2006   2297   Alu-Jb+
..........2309   2587   Alu-Sx+
00241   2587 bp 
00306   1852 bp 
00318   8852 bp 197_aa no blastp, fair ests, 4142 bp retros
.......... 895   1017   MER5A-
..........1631   1822   MER4C+
..........1846   2292   MER4A+
..........2314   2382   LINE2+
..........2456   2635   LINE2+
..........2672   3170   LOR1-
..........3814   4171   THE1B+
..........4193   4294   MER5A-
..........4333   4471   MER5A-
..........5685   5968   Alu-Sx+
..........6073   6173   L1ME_ORF2+ [does not contain predicted protein]
..........6174   6311   L1ME_ORF2+ [does not contain predicted protein]
..........6468   7256   LTR1+
..........7659   7821   L1P_MA2+
..........7830   8119   Alu-Jb+
..........8127   8178   L1P_MA2+
..........8193   8521   L1MA9-
00329   2663 bp 
00400   3339 bp 
00445   8012 bp 
00511   6751 bp 101 aa no blastp, strong ests, 4243 bp in retros
..........   2    121   L1ME_ORF2 -
..........1165   2395   THE1BR -
..........2396   2730   THE1B -
..........2745   3287   LTR26E -
..........3364   3759   LTR26 +
..........3859   4069   LINE2 -
..........4530   4710   LINE2 -
..........4750   5329   LINE2 -
..........5411   5772   LTR16A1 -
..........5780   6071   Alu-Jo -
..........6187   6215   LTR16A1 -
00643   4317 bp 
00635   5342 bp  
00744   5000 bp  
00769   1868 bp 
00802   3186 bp 
01014   2894 bp 
01079 3336 bp 137aa no blastp, very strong ests, but 3213 bp retros
..........   7    123   L1MB7  +
.......... 125    361   MER8 -
.......... 375   1190   L1ME_ORF2 +
..........1191   1381   L1P_MA2 +
..........1389   1677   Alu-Sx -
..........1684   1749   L1P_MA2 +
..........1767   1895   TIGGER5_B +
..........1925   2033   TIGGER1 +
..........2036   2181   Alu-Sxzg +
..........2188   3328   THE1BR  -
01094   3752 bp 45 aa no blastp, very weak ests, 918 bp of retros.
.......... 225    508   Alu-Sp +
..........1895   2035   MER63A -
..........2768   2848   MER5A +
..........2851   2980   LINE2 +
..........3041   3322   Alu-Jb -
01129   7341 bp 
01347   6056 bp 
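
The retrotransposon totals quoted on the contig header lines can be re-derived from the interval rows beneath them. What follows is only a minimal sketch, assuming each row is a dotted prefix followed by inclusive start and end coordinates and a repeat label, and that the intervals do not overlap. The function names are illustrative and do not belong to any tool used for the listing. It is run on the 01094 rows, where the row sum reproduces the quoted 918 bp; the simple row sum need not match the quoted total for every contig.

```python
import re

# One row looks like "..........1895   2035   MER63A -":
# dotted prefix, start, end (assumed inclusive, 1-based), then the repeat label.
ROW = re.compile(r"^\.+\s*(\d+)\s+(\d+)\s+(\S.*)$")

def parse_rows(text):
    """Return (start, end, label) tuples for every line that matches ROW."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m:
            rows.append((int(m.group(1)), int(m.group(2)), m.group(3).strip()))
    return rows

def covered_bp(rows):
    """Sum interval lengths, assuming the intervals do not overlap."""
    return sum(end - start + 1 for start, end, _ in rows)

CONTIG_01094 = """
.......... 225    508   Alu-Sp +
..........1895   2035   MER63A -
..........2768   2848   MER5A +
..........2851   2980   LINE2 +
..........3041   3322   Alu-Jb -
"""

rows = parse_rows(CONTIG_01094)
total = covered_bp(rows)
# 3752 bp is the 01094 contig length taken from the listing.
print(total, f"{100 * total / 3752:.1f}%")
```

For 01094 this gives 918 bp, about a quarter of the contig.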

Checking adjacent markers for presence on chromosome 20 contig:
no:  CDP-diacylglycerol synthase, not sequenced by HGP yet
yes: Homo sapiens mRNA for KIAA0548 protein, in dJ581P3.00110 etc.
yes: stSG29963  is simply a prion gene EST for exon 3  and 00562
no:  stSG8643 is a short EST not covered
no:  Human putative cyclin G1 interacting protein
yes: stSG8000  Heat Shock Cognate 71 Kd Protein, in dj681N20.C20.1
yes: Bdyc4e10 CENPB Centromere protein B (80kD), in dJ1009E24.01256
yes:  sts-M76446 ADRA1A Adrenergic, alpha-1A-, receptor, dJ753D10
yes: sts-H22126 EST, in dJ1009E24.01546

no:  CHGB Chromogranin B (secretogranin 1)
no:  PCNA Proliferating cell nuclear antigen
no:  SHGC-16916  EST
no:  SGC34960 Human mRNA for KIAA0168 gene, complete cd..
Note that the recent paper mapping E200K microsatellites by HS Lee et al provides a definitive ordering of these in the vicinity of the prion gene. The order of the markers (which disagrees with the RH location of the prion gene) is below. The hope here is that some of the microsatellites or other markers will be locatable in dJ1068H6:
D20S842
D20S181 CA repeat: dJ991B18.01764
D20S193 CA repeat: dJ310O13.00912, weak to dJ1068H6.01129 
D20S116 CA repeat: dJ1009E24.01546
prion gene according to RH mapping
D20S97 CA repeat: dJ734C18,  weak hit to 00445 and 00318, on both maps
D20S835 dJ1007P8.02421 plus 00024, 808; on RH map but not E200K paper
D20S482: at GenBank under alternative name GATA51D03 or G08052 or L16411; dJ744A17 match
prion gene according to YAC mapping
D20S895 CA repeat: dJ839B4.01060, weak 00241, not on RH
D20S849 CA repeat: dJ461P17, 00024, not on RH
D20S882 CA repeat: dJ1022P6.02021dJ734C18, on both maps
The IPP (or IDI1) pseudogene occurs in contig 01219 on the minus strand, later part of a larger segment dJ1187J4.01598 that also contains doppel prion (CDS 434-961) on the positive strand. It has a large number of point mutations, stop codons, and reading frame changes, establishing that it is quite degenerate and incapable of producing functional protein -- the event that created it must be quite old. It matches the 228 aa human protein used as a tBlastn probe with probability of coincidence 1.2e-62.

The pseudogene includes 31 bp of 5' UTR of IDI isomerase mRNA (position 12589 of contig), the CDS region, and 1256 bp of 3' UTR (position 9823 of contig). Since the human gene was determined from mRNA, its introns are not known, so it is difficult to evaluate whether this is a processed pseudogene, though that seems likely. The human protein of 228 amino acids is involved in early steps of cholesterol biosynthesis. It is located on human chromosome 10 (official name: IDI1: isopentenyl-diphosphate delta isomerase 1) and is reduced in the peroxisomal deficiency diseases, Zellweger syndrome and neonatal adrenoleukodystrophy. This location (provided it is the true parent) rules out origin of the pseudogene by tandem duplication.

5'3' Frame 2 really old pseudogene

01219:  1294 ..........DRCEPLCTTEKCIL
              MP+    HL +QQVQLL + CIL

             D+ +     E CILIDEND+ IG ETKKNCH NENI  GLLH+A SVFL NT+      +

                 K+ F        CSHPLSNP +LE +D + +   AQR LKAELGIP+EE PPEE  



   E  K  C  I  L  I  D  E  N  D  S  N  I  G  T  E  T  K  K  N 
 C  H  E  N  E  N  I  G  N  G  L  L  H  Q  A  L  S  V  F  L 
 L  N  T  K  M  S  Y  S  D  S  R  D  Q  M  L  K  L  P  F  E 
 P  V  S  P  I  L  G  C  S  H  P  L  S  N  P  D  K  L  E  R 
 N  D  V  I  D  I  S  -  V  A  Q  R  H  L  K  A  E  L  G  I 
 P  M  E  E  A  P  P  E  E  T  Y  Y  L  I  S  T  C  S  E  V 
 Q  S  D  G  I  W  G  E  H  K  T  D  Y  I  L  F  V  R  K  N 
 I  T  S  N  S  D  P  S  E  I  K  S  Y  F  L  C  V  K  G  R 
 A  R  T  S  E  K  S  S  H  W  -  N  -  N  N  T  M  I  S  S 
 Y  Y  R  D  F  S  L  -  M  V  G  -  L  K  S  F  E  S  V  Y 
 -  P  -  K    

frame 2 translation fits distally:


Human reference gene for IPP delta isomerase: NM_004508 1807 bp mRNA of 07-MAY-1999 CDS 51..737
Rat and hamster genes are also available and could help date the origin of the pseudogene.

        1 tctgtggccg gaggctgatc agtgttctag aacagatcag acattttgta ATGATGCCTG
      721 AAATATACAG AATGTGAata tgtaggtaaa tgattacaga aaaatttatg tgcttaacaa
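
The CDS coordinates quoted for NM_004508 can be checked against the 228 aa length of the tBlastn probe and against the two sequence excerpts. The sketch below assumes the coordinates are 1-based and inclusive and that the CDS ends with its stop codon; the helper names are illustrative only.

```python
def protein_length(cds_start, cds_end):
    """Residues encoded by an inclusive CDS range that ends with a stop codon."""
    nt = cds_end - cds_start + 1
    if nt % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return nt // 3 - 1  # drop the stop codon

# Excerpt rows as printed: (position of the first base, bases in blocks of ten).
first_row = (1, "tctgtggccg gaggctgatc agtgttctag aacagatcag acattttgta ATGATGCCTG")
last_row = (721, "AAATATACAG AATGTGAata tgtaggtaaa tgattacaga aaaatttatg tgcttaacaa")

def codon_at(row, position):
    """Return the three bases starting at a 1-based sequence position."""
    row_start, text = row
    bases = text.replace(" ", "")
    offset = position - row_start
    return bases[offset:offset + 3].upper()

CDS_START, CDS_END = 51, 737
print(protein_length(CDS_START, CDS_END))  # 228
print(codon_at(first_row, CDS_START))      # ATG
print(codon_at(last_row, CDS_END - 2))     # TGA
```

The 687 nt CDS gives 229 codons, that is, 228 residues plus the stop, and the upper-case stretch of the excerpts begins with ATG at position 51 and ends with TGA at 735-737.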
Analysis of the other gene candidates predicted by GenScanW involves checking them in repeat-masked contigs, looking for homologies to known proteins, and seeing if they are represented in EST databases. These genes could be artefacts, pseudogenes, processed pseudogenes, distant paralogues, actual working genes, or some combination thereof. While they seem to bear no relation to prion or prnd genes, they might still provide clues as to normal function.

The contig 00318 is nearly half retrotransposons. Some sections of these correspond to ORFs. The protein found by GenScanW occurs in both native and masked contig -- at least the first 43aa do. However, the next sections do not, in any reading frame.

>00318 prot

>native contig frame1-

>masked contig frame 2-
The 00511 predicted protein of 101 amino acids had very strong EST matches (eg, AA663548, e-15 chr 21q21). Again, only part is found in masked sections. The interpretation is not immediate due to the unreliability of GenScanW predictions. However, the translation of the best EST hit (despite two stop codons) has a highly significant hit with the first 77 residues of the 00511 predicted protein. The stop codons may simply be EST errors; indeed, a Blastn match such as L81803 1760-2092 has no stop codons in this region.
>511 prot
>native contig frame 2+

>native contig frame 2+ rest of predicted protein

>masked contig frame 1+
mfpsllcchlsrlephcawaqhtlwpdngngpa  sslhslltqqSCTCGKQAAALVFHRTLLRPCPQTI-

The best dbEST match to tBlastn aligns significantly as protein to the 00511 contig protein:
Score = 97.9 bits (240), Expect = 9e-21
Identities = 49/77 (63%), Positives = 59/77 (75%), Gaps = 2/77 (2%)


           CISNE+P     +N  C

>L81803 ORF containing 511 match
Candidate gene 01079 consists almost completely of 10 retrotransposons comprising a staggering 96% of this stretch of chromosome 20. Any gene here would belong to one of these elements. Candidate 01094 is also problematic.
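
The retrotransposon fractions behind these judgments follow directly from the contig lengths and retro totals quoted in the listing. A small sketch, using only those quoted figures (the dictionary and function names are illustrative):

```python
# (contig length in bp, retrotransposon bp) as quoted in the contig listing.
CONTIGS = {
    "00318": (8852, 4142),
    "00511": (6751, 4243),
    "01079": (3336, 3213),
    "01094": (3752, 918),
}

def retro_percent(length_bp, retro_bp):
    """Percentage of the contig occupied by annotated retrotransposons."""
    return 100.0 * retro_bp / length_bp

for name, (length_bp, retro_bp) in CONTIGS.items():
    print(f"{name}: {retro_percent(length_bp, retro_bp):.1f}% retrotransposon")
```

On these figures 01079 comes out at about 96% and 00318 at just under 47%, in line with the descriptions above.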

Mutant ubiquitinylation tied to brain diseases

Tuesday August 31 1999 press release
Nature Genetics 1999;23:10-11, 47-51.
A gene defect that impairs a cell's ability to "houseclean" proteins can lead to degeneration of brain cells, report Japanese researchers. The findings are the first to confirm the theory that proteins that cannot be broken down accumulate in brain cells, leading to neurodegenerative disorders such as Alzheimer's, Huntington's and Parkinson's disease.

Normally, cells mark proteins that are destined to be broken down as waste, with a protein called ubiquitin. Ubiquitin is subsequently released from the degraded proteins and recycled for future use. The team of researchers found that mice with neurodegenerative disorders have a defective gene, which prevents an enzyme from producing and recycling ubiquitin. As a result, there is a build-up of waste in nerve cells, which leads to their degeneration.

"Our data suggest that altered function of the ubiquitin system directly causes neurodegeneration," they write in the September issue of Nature Genetics.

The team of researchers, led by Kazumasa Saigoh and Yu-Lai Wang of the National Center of Neurology and Psychiatry in Tokyo, studied a strain of genetically altered mice called "gad" mice whose brains show changes similar to those of humans with inherited neurodegenerative diseases. They compared the diseased mouse brains with those of normal mice, homing in on the enzyme responsible for producing ubiquitin.

The authors write that their findings provide "a useful model for investigating human neurodegenerative disorders." In an accompanying editorial, Marcy Macdonald of Harvard Medical School notes that it remains to be seen why the genetic defect affects neurons specifically. "This is not an easy challenge," she writes. "Certainly, the discovery bodes well for our understanding of this complex and essential cellular pathway in the nervous system."

Comment (webmaster): 42 medline abstracts contain 'prion' and 'ubiquitin.'

Characterization and polyanion-binding properties of purified recombinant prion protein.

Biochem J 1999 Sep 15;342(Pt 3):605-613
Brimacombe DB, Bennett AD, Wusteman FS, Gill AC, Dann JC, Bostock CJ
Certain polysulphated polyanions have been shown to have prophylactic effects on the progression of transmissible spongiform encephalopathy disease, presumably because they bind to prion protein (PrP). Until now, the difficulty of obtaining large quantities of native PrP has precluded detailed studies of these interactions. We have over-expressed murine recombinant PrP (recPrP), lacking its glycophosphoinositol membrane anchor, in modified mammalian cells. Milligram quantities of secreted, soluble and partially glycosylated protein were purified under non-denaturing conditions and the identities of mature-length aglycosyl recPrP and two cleavage fragments were determined by electrospray MS.

Binding was assessed by surface plasmon resonance techniques using both direct and competitive ligand-binding approaches. recPrP binding to immobilized polyanions was enhanced by divalent metal ions. Polyanion binding was strong and showed complex association and dissociation kinetics that were consistent with ligand-directed recPrP aggregation.

The differences in the binding strengths of recPrP to pentosan polysulphate and to other sulphated polyanions were found to parallel their in vivo anti-scrapie and in vitro anti-scrapie-specific PrP formation potencies. When recPrP was immobilized by capture on metal-ion chelates it was found, contrary to expectation, that the addition of polyanions promoted the dissociation of the protein.

French review of FFI: haplotypes and phenotypes

Clin Exp Pathol 1999;47(3-4):176-80
Delisle MB, Uro-Coste E, Gray F, Vital C
Since its description in 1986, Fatal Familial Insomnia (FFI) became the third most common inherited prion diseases (23 described families, 3 isolated cases). It is characterized by a mutation at codon 178 of the prion protein gene cosegregating with the methionine polymorphism at codon 129 of the mutated allele.

Insomnia, dysautonomia, disruption of circadian rhythms and motor dysfunctions (myoclonus, ataxia, dysarthria, spasticity) are the main clinical symptoms in the homozygote patients (met/met at codon 129). Heterozygotes have motor dysfunctions from onset and cognitive changes. Phenotypic variability does not appear to be strictly related to codon 129 polymorphism as recently stressed in some reports.

Neuropathology shows marked neuronal loss and gliosis in the thalamus, especially in the medio-dorsal and antero-ventral nuclei, without any amyloid deposits. Some spongiosis may be seen essentially in the cerebral cortex, in patients with longer duration disease.

The D178N mutation coupled with the 129 valine codon is linked to a subtype of Creutzfeldt-Jakob disease. However, in these two phenotypically different diseases, two protease resistant fragments of the pathogenic PrP (PrPres) are accumulated. They differ in molecular mass. In FFI PrPres, the unglycosylated form is underrepresented. This particularity does not result from the preferential conversion of the glycosylated forms but from an inaccessibility of non glycosylated form to conversion. PrPres has been shown to be form allelic origin. [sic]

Neuronal apoptosis was found to contribute to neuronal loss in FFI. Its presence correlates with neuronal loss, being invariably noticed in the thalamus and medullary nuclei. It is not correlated with PrPres accumulation. The quantity of deposits is globally low in FFI brains and rarely immunohistochemically detected. Pathogenesis of lesions and clinical signs remain to be assessed.

Protein dysfunction could be hypothesized according to some clinical and experimental data as well as to the discordance between protein accumulation and programmed cell death. Neurotoxicity is also postulated. Studies on this pathology led to consider the existence of "strains" in human prion diseases. Despite remarkable advances, many issues remain unsolved in this non spongiform prion disease.

GSS disease and the French-Alsatian A117V variant.

Clin Exp Pathol 1999;47(3-4):161-75
Mohr M, Tranchant C, Steinmetz G, Floquet J, Grignon Y, Warter JM
GSS disease is a rare familial form of prion disease. This autosomal dominant disorder is constantly associated with a point mutation on the PrP gene. Eight mutations affecting respectively codons 102, 105, 117, 145, 202, 212 and 218, have been so far described. Symptoms are variable and include ataxia and dementia. They generally appear between the fourth and sixth decade. Mean duration of the disease (5 years) is on the whole longer than that of other familial forms of prion diseases. GSS disease is neuropathologically characterized by the presence of numerous multicentric or unicentric PrP amyloid deposits widespread throughout the encephalon. Spongiform change is inconstant. Neurofibrillary tangles have been described in some families. Clinicopathological features show considerable variability. Pathogenesis of amyloidosis and associated lesions as well as factors underlying the phenotypic polymorphism of the disease remain only partially known.

Prion disease with octapeptide repeat insertion.

Clin Exp Pathol 1999;47(3-4):153-9
Vital C, Gray F, Vital A, Ferrer X, Julien J
About 8% of prion disease cases are familial and a few are due to an octapeptide repeat insertion (OPRI) in the prion protein gene. A suitable neuropathological examination has been performed in 20 cases from 9 families and in 3 isolated cases. The number of OPRI ranges from 4 to 9 multiples of 24 base-pair.

Results from routine histopathological preparations and from immunohistochemistry performed after special tissue pretreatment were compared with those of molecular genetic investigation. Eight cases with 4 to 7 multiples of OPRI exhibited characteristic elongated deposits in the cerebellar molecular layer, which were visible on slides prepared with antibodies against the prion protein only. Conversely, 6 cases with 8 or 9 multiples of OPRI presented typical plaques already obvious on routine preparations.

These variable modifications in the cerebellar molecular layer deserve to be underlined, in particular the elongated deposits which are characteristic for cases presenting 4 to 7 OPRI.

Protection of personnel and environment against CJD in pathology laboratories.

Clin Exp Pathol 1999;47(3-4):192-200
Richard M, Biacabe AG, Perret-Liaudet A, McCardle L, Ironside JW, Kopp N
Most neuropathology laboratories have been faced with the question of dealing with cases of Creutzfeldt-Jakob disease (CJD) which is a rare neurodegenerative disorder. Neuropathologists have been long aware of the transmissibility and unique properties of the agent which make it resistant to conventional inactivating reagents. The emergence of iatrogenic cases and of the bovine spongiform encephalopathy (BSE) crisis has induced anxiety among laboratory staff and raised questions about the efficiency of safety measures and procedures hitherto applied in pathology laboratories. This article aims at presenting an overview of the risk involved in handling CJD material. It gives practical advice and a key to more detailed procedures, guidelines and recommendations available in scientific literature and through government agencies. Neuropathologists and biochemists are at a higher potential risk than others since the diagnosing of CJD involves the handling of nervous tissue which contains the highest levels of infectivity.

Case report: Leptomeningeal melanoma and Creutzfeldt-Jakob disease in a patient with chronic lymphocytic leukaemia.

Neuropathol Appl Neurobiol 1999 Jul;25(4):345-348
King A, Ryan P, Puranik A, Doey L, Barnes P
A 78-year-old woman with known chronic lymphocytic leukaemia (CLL) was admitted to a psychiatric unit because of rapidly declining cognitive function. Clinical examination also revealed cerebellar signs and she later became akinetic and mute. She deteriorated and died of bronchopneumonia. The histology from the post-mortem confirmed the presence of CLL in the lymph nodes and she was also found to have diffuse leptomeningeal melanoma. In addition, there was extensive prion protein deposition in the cerebral cortex, but without significant spongiosis. The astrocytosis that was present appeared superficial only.

Furthermore, prion protein appeared to be co-expressed with betaA4 in the form of plaques. The patient therefore had evidence of sporadic Creutzfeldt-Jakob disease (CJD) in addition to meningeal melanoma and CLL. This case further illustrates the importance of employing prion protein immunohistochemistry in suspected cases of CJD, especially where the histology is atypical.
