Model 10 hamster prion 90-231
3D Crunch to benefit prion research
Crunch prion sequence alignment and links
GenBank: false alarm on chromosome 7
Retrotransposon structure of the prion gene region
Longitudinal trends in the UK BSE epizootic
Beta intermediate found for 121-231
Intergen BSA and aprotonin validation study
Detection of PrpSc by multi-spectral ultraviolet fluorescence
Homology of prion globular domain to signal peptidases?
Signal peptidases are not homologous to prion protein:-- review
2 June 98 webmaster
Recall that the UCSD-UCSF groups determined the NMR structure of golden hamster [Mesocricetus auratus] prion 90-231, which begins GQGGGTHN QWNKPSKPKTWMKHAGAAAAGAVVGGLGGYMLG... and so is longer than mouse, which begins 121 VVGGLGGYMLG..., though hamster 90-107 does not really have a fixed structure past threonine 107 [domain recognising software puts the boundary here -- special thanks to SA Islam of the Biomolecular Modelling Laboratory, Imperial Cancer Research Fund for running this]. An ensemble of 15 structures was provided to PDB; it has now been determined that model 10 is the best single representative of 2PRP. This file may be downloaded by itself.
The hamster structure, while strongly affirming the general folding scheme found in mouse, is not yet refined. PDB posts numerous incongruities found by a quality-checking program called WhatIf. These are not unreasonable for an unrefined structure, but they are a cautionary note against using 2PRP for critical tasks such as modelling changes due to CJD mutations. Also new is the automatically generated page of HSSP results. This provides an unusual horizontal alignment of hamster to all 41 prion sequences at SwissProt, together with quantitative measures of the degree of variation at each codon position. In turn, these allow color-coding the backbone chain according to degree of evolutionary constraint. These generally neutral sites should be contrasted with the locations of the 20 known point mutations causing CJD.
Silicon Graphics, Glaxo Wellcome, Imperial Cancer Research Fund, Swiss Institute of Bioinformatics and the Lyon Bioinformatics Center

See also: SWISS-MODEL, PDB, SWISS-PROT, and EBI (TrEMBL database).
Cortaillod, Switzerland -- Silicon Graphics, Inc. today announced that it is working with leading international bioinformatics research organizations in a major scientific and technological undertaking called "3D Crunch." The project, launched today, will analyze over 200,000 public protein sequences and use advanced technology to predict 50,000 new 3D protein structures. Its findings will be accessible via the World Wide Web, expanding scientists' knowledge of proteins and ultimately accelerating drug discovery to combat disease.
A Silicon Graphics CRAY Origin2000™ server will serve as the computing engine for 3D Crunch. The software component of the project will be spearheaded by bioinformatics and protein modeling experts from Glaxo Wellcome, the Imperial Cancer Research Fund (ICRF), The Swiss Institute of Bioinformatics (SIB) and the Lyon Bioinformatics Center (PBIL) at the University Claude Bernard.
... Located at the Silicon Graphics European Advanced Technology Center in Cortaillod, Switzerland, the 64-processor CRAY Origin2000 server will power SWISS-MODEL, the 3D protein modeling software developed by Dr. Manuel Peitsch, worldwide director of scientific computing at Glaxo Wellcome and his research team. The SWISS-MODEL program will analyze the 200,000 public protein sequences in the SWISS-PROT and TrEMBL databases [ie, not all of GenBank], and predict their 3D structures by comparing them to related proteins with known structures stored in the Protein Databank (PDB).
Then, for those sequences that cannot be predicted by SWISS-MODEL, additional analysis with software and databases developed at the ICRF and the PBIL will be performed. The program FOLDFIT, developed jointly by the ICRF and Glaxo Wellcome, will be used to suggest a function for the bacterial protein sequences in the databases that cannot be modeled with comparative methods such as SWISS-MODEL.
"Today, most of the 4,500 protein structures available to the world have been generated by the difficult and time-consuming work required in traditional laboratory methods," said Dr. Peitsch. "We're compressing essentially a year of computing time down to a week, thanks to the power of the CRAY Origin2000 server. 3D Crunch will provide a significantly larger resource of computationally-generated structural information to researchers throughout the global scientific community, profoundly advancing our ability to understand the function and structure of proteins important in drug discovery and design."
3D Crunch has been completed. Scientists are now compiling and making the resulting database publicly available through the project participants' Web sites. The SWISS-PROT and TrEMBL databases are developed jointly by the group of Amos Bairoch at the newly created Swiss Institute of Bioinformatics (SIB) and the group of Rolf Apweiler at the EMBL outstation - the European Bioinformatics Institute (EBI). PDB was established at Brookhaven National Laboratory, New York, in 1971 and has been developed by the Protein Data Bank team over the past 26 years - with now 11 mirror sites throughout the world. Joel L. Sussman is head of the PDB.
Webmaster commentary 4 June 98

This is a highly significant announcement with potential applications to identifying prion normal function and understanding alleles in livestock as well as CJD mutations (even though these are not in SwissProt or TrEMBL). Note that many of the 3D Crunch results for prions have already been obtained by individuals using SwissModel to thread prion sequences from various species, sheep alleles, human mutants, and ancestral sequences onto the known mouse NMR structure.
However, 3D CRUNCH will do a far better job at optimizing structures and taking on fainter relationships, as no one can know the capabilities as well as the programmers. Perhaps a second run would make sense, on the OMIM database of human mutations for all genes (even though the changes are tiny, they are important).
Three immediate benefits to prion research:
First, 3D Crunch will result in the convenient availability of prion 3D structures for 41 species. These will be available at PDB and hence at SwissModel for a starting point for further modelling.
Second, they will produce a 3D structure for marsupial and avian prions that will illuminate deeper features of mammalian prion structure and function. Avian globular domain will be done using a deletion, probably of length 10, EAVAAANQTE beginning at position 198 in mouse numbering. The full sequence will then have to be restored by insertion and re-modelled by energy minimization.
Third, while no other protein at any database has sequence homology to prion protein, still some other protein might be found to have the same fold and so a distant relationship with possible implications for function. Massive use of analytic tools not available online such as FOLDFIT could potentially reduce the embarrassing number of URFs (though here they talk only about suggesting a function for orphaned bacterial protein sequences). This could suggest structures for the repeat and 106-126 domain and find the long-sought relatives of the prion protein in non-vertebrates by creating a vast number of hypothetical structures not currently represented at PDB.
Species and link | Codon | Hamster | Comparison of species in listed order |
---|---|---|---|
prio_mesau P04273 prio_mouse P04925 prio_rat P13852 prio_aottr P40245 prio_atege P40246 prio_thege Q95270 prio_calmo P40248 prio_calja P40247 prio_rabit Q95211 prio_cerpa Q95174 prio_cerne Q95173 prio_cerae P40250 prio_macsy Q95200 prio_cerat Q95145 prio_cermo Q95172 prio_colgu P40251 prio_ponpy P40256 prio_saisc P40258 prio_atepa P51446 prio_prefr P40257 prio_macfa P40254 prio_caphi P52113 prio_certo Q95176 prio_cebap P40249 prio_mansp P40255 prio_odohe P47852 prio_sheep P23907 prio_cerel P79142 prio_gorgo P40252 prp2_bovin Q01880 prp1_trast P40242 prp2_trast P40243 prio_bovin P10279 prio_pantr P40253 prio_human P04156 prio_musvi P40244 prio_muspf P52114 prio_camdr P79141 prio_pig P49927 prio_trivu P51780 prio_chick P27177 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - |
90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 | G Q G G G T H N Q W N K P S K P K T N M K H M A G A A A A G A V V G G L G G Y M L G S A M S R P M M H F G N D W E D R Y Y R E N M N R Y P N Q V Y Y R P V D Q Y N N Q N N F V H D C V N I T I K Q H T V T T T T K G E N F T E T D I K I M E R V V E Q M C T T Q Y Q K E S Q A Y Y D G R R S | gg-ggggg-gggggggggggg-ggg---g----gggggg-- qqqqqqqq-qqqqqqqqqqqq-qqq---q----qqqqqq-s gggggggg-gggggggggagg-ggg---g----gggggggs gggggggggggggggggggggggggggggggggggggggqg ggggggggggggggggggggggggggggggggggggggggg tttttttttttttttttttttsttttsttstttttssasgs hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhyy nnnnnnnsnnnnnnnssnnsnsnnnsssssgggssggggnh qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqkn wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwq nnnnnhnnghhhhhhnnnnnhnhnhnnnnnnnnnnggnn-k kkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkp ppppppppppppppppppppppppppppppppppppppppw ssssssssssssssssssssssssnssssssssssssssdk kkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkp ppppppppppppppppppppppppppppppppppppppppp kkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk ttttttttttttttttttttttttttttttttttttttttt nnnnnsnnssssssssnnnssnsssnnnnnnnnnnnnsnnn mllmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmlf kkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh mvvmmmvvvmmmmmmmmmmmmvmvmvvvmvvvvmmvvvvvv aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa ggggggggggggggggggggggggggggggggggggggggg aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa ggggggggggggggggggggggggggggggggggggggggg aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv ggggggggggggggggggggggggggggggggggggggggg ggggggggggggggggggggggggggggggggggggggggg lllllllllllllllllllllllllllllllllllllllll ggggggggggggggggggggggggggggggggggggggggg ggggggggggggggggggggggggggggggggggggggggg yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy mmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmma llllllllllllllllllllllllllllllllllllllllm ggggggggggggggggggggggggggggggggggggggggg ssssssssssssssssssssssssssssssssssssssssr aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaav mmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm sssssssssssssssssssssssssnsssssssssssssss rrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrg ppppppppppppppppppppppppppppppppppppppppm mmmlllllllllllllilllllllllllilllliillllvn miliiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiy hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh fffffffffffffffffffffffffffffffffffffffff ggggggggggggggggggggggggggggggggggggggggd nnnnnnnnnnnnnnnnnnnnnnnnnnnnsnsssssnnnsns ddddddddddddddddddddddeddddddddddddddddep wwwyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyd eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee ddddddddddddddddddddddddddddddddddddddddy rrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyw yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyw rrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrs eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn mmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmqs nyyyyyyyyyyyyyyyyyyyyyyyyyyyhhyyhhhyyyyya rrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy ppppppppppppppppppppppppppppppppppppppppp nnnnnnnnnnnnnnnnnsnnnnnnnnnnnnnnnnnnnnnnn qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqr vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyymy yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy 
rrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrkkkrrr ppppppppppppppppppppppppppppppppppppppppd vvvvvvvvvvvvvvvvvvvvvvvvvvvvmvvvvmmvvvviy dddddddddddddddddddddddddddddddddddddddds qqqqqqqqqqqqqqqqqqqqqqqqqqrqqqqqqqeqqqqqs yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyp nsssnssnssssssssssnssssssnsnssssssssssssv nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnsnnnnnsp qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnd nnnnnnnnsnnnnnnnnnnnnnnnntntnnnnnnnnnssnv fffffffffffffffffffffffffffffffffffflffff vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhha ddddddddddddddddddddddddddddddddddddddddd ccccccccccccccccccccccccccccccccccccccccc vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvf nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn iiiiiiiiiiiiiiiiiviiiiiiiiiiiiiiiiiiiiiii ttttttttttttttttttttttttttttttttttttttttt iiiiiiiiviiiiiiiiiiiiviiivvvivvvviivvvvvv kkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkt qqqqqqqqqqqqqqqqqqqqqqqqqqqqqeqqeqqqqqqqe hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhy tttttttttttttttttttttttttttttttttttttttts vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvti ttttttttttttttttttttttttttttttttttttttttg ttttttttttttttttttttttttttttttttttttttttp tttttttttttttttttttttttttttttttttttttttta tttttttttttttttttttttttttttttttttttttttta kkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk ggggggggggggggggggggggggggggggggggggggggk eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeen nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnt fffffffffffffffffflfffffffffffffffffffffs ttttttttttttttttttttttttttttttttttttttttv eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee ttttttttttttttttttttttttttttttttttttttttm dddddddddddddddddddddddddddddddddddddddde ivvvvvvvivvvvvvvvvvvvivvviiiviiiivvmmvvin kkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk immimmmmimmmmmmmmmmmmimmmmimmmmmmmmiimmiv mmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmimv eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeet rrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrk vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvi 
eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeer qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqe mmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm ccccccccccccccccccccccccccccccccccccccccc tvviiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiivviiiv ttttttttttttttttttttttttttttttttttttttttq qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy qqqeeqeeqeeeeeeeeeeeeqeeeqqqeqqqqeeqqqqqr kkkkrkkkqkkkkkkkrkrkkrkrkrrrrrrrrrrrqrkae eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeey sssssssssssssssssssssssssssssssssssssyyyr qqqqqqqqqqqqqqqqqqqqqqqqqqqeqqeeqqqeeqeel aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa yyyyyyyyayyyyyyyyyyyyyyyyyyyyyyyyyyyysy-- yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyya-- dddqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqgq-- gggrrrrr-rrrrrrrrrrrrrrrrrrrrrrrrrrrrrr-- rrrggggg-gggggggggggggggggggggggggggggg-- rrrsssss-ssssssssssssasssaaasaaaassaaaa-- ssssssss-ssssssssssssssssssssssssssssss-- |
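
The per-codon variation that the HSSP page reports can be approximated directly from alignment columns like those in the table above. The following is a minimal sketch, assuming each column is a string with one residue letter per species and `-` for a gap. Shannon entropy is used here as an illustrative measure and is not claimed to be HSSP's exact formula. The three example columns are the first eight species of codons 90, 106 and 112 from the table; the function names are hypothetical.

```python
import math
from collections import Counter


def column_entropy(column, gap="-"):
    """Shannon entropy in bits of one alignment column, ignoring gaps."""
    residues = [c for c in column.lower() if c != gap]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    # Summing -p*log2(p) term by term avoids returning a negative zero.
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def distinct_residues(column, gap="-"):
    """Number of different residues seen in the column, ignoring gaps."""
    return len(set(column.lower()) - {gap})


# First eight species only, copied from the table (truncated for brevity).
columns = {
    90: "gg-ggggg",   # invariant apart from one gap
    106: "kkkkkkkk",  # fully conserved lysine
    112: "mvvmmmvv",  # Met/Val polymorphic site
}

for codon, col in sorted(columns.items()):
    print(codon, distinct_residues(col), round(column_entropy(col), 3))
```

A column with entropy 0 is fully conserved among the species sampled; higher values mark positions that tolerate substitution, which is the information needed to color-code the backbone by evolutionary constraint.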
webmaster 25 May 98

An odd new GenBank prion entry, AF053356 [large file]: it seems some people in Germany sequenced 227,968 bases in the 22nd region of the distal q arm of human chromosome 7. Computer analysis of their sequence gave, among other things, a database match [P = 1.8e-31] to a region called complement (158406..158527) in the human prion gene. The piece is 124 bp long.
The prion gene has long been said to be on chromosome 20 in single copy. There are quite a few papers on anti-prion sequence from somewhere else in the genome. The sequence might also be compared to that of the in-press analysis of the human prion introns noted earlier, U29185.
158404-158527 is the sequence at issue:
ctgta gtcctagcta caaaggaggc tgaggcagga gagtcacttg aacccaggag gtggaggttc cagtgagcct tgattgtacc actgcactcc agcctgggcg acagagcgag actccat
This aligns with, say, 3290-3411 of U29185 prion gene with 82% agreement over 122 residues [an AluSx] but also aligns in 10 other places in the + strand and 4 other places in the - strand. The bottom line is that this has nothing to do with prion coding or regulatory exons, it is simply an Alu (abundant retrotransposon DNA) that happens to have also parasitized the prion DNA region in 15 places.
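
The coordinate arithmetic behind a hit like this is easy to check by hand. Below is a minimal sketch, assuming GenBank-style inclusive coordinates and a gap-free comparison of two equal-length strings; the helper names are hypothetical and this is not the search program that produced the match.

```python
def span_length(start, end):
    """Length of an inclusive coordinate range such as 158406..158527."""
    return end - start + 1


def reverse_complement(seq):
    """Sequence of the opposite strand, as implied by complement(...)."""
    table = str.maketrans("acgtACGT", "tgcaTGCA")
    return seq.translate(table)[::-1]


def percent_identity(a, b):
    """Percentage of matching positions between two aligned, equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


print(span_length(158406, 158527))   # range given in the feature table
print(span_length(158404, 158527))   # range quoted for the sequence at issue
print(reverse_complement("ctgta"))   # first five bases of the sequence above
print(percent_identity("acgt", "acga"))
```

The feature-table range 158406..158527 spans 122 positions and 158404-158527 spans 124, so 82% agreement over 122 residues corresponds to roughly 100 matching bases.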
- misc_feature complement(158406..158527) /note="90, 2:xyz, 6988..7109, of, GB|U29185|HSU29185, Homo, sapiens, prion, protein, (PrP), gene,, complete, cds., 2/98, Length, =, 35,522, P, =, 1.8e-31, S, =, 90" /note="Region: Data base match"
- misc_feature complement(158406..158527) /note="86, 3:xyz, 28258..28379, of, GB|U29185|HSU29185, Homo, sapiens, prion, protein, (PrP), gene,, complete, cds., 2/98, Length, =, 35,522, P, =, 1.0e-28, S, =, 86" /note="Region: Data base match"
- misc_feature 158406..158527 /note="84, 6:xyz, 34001..34122, of, GB|U29185|HSU29185, Homo, sapiens, prion, protein, (PrP), gene,, complete, cds., 2/98, Length, =, 35,522, P, =, 1.0e-28, S, =, 84" /note="Region: Data base match"
- misc_feature complement(158406..158526) /note="80, 4:xyz, 11524..11644, of, GB|U29185|HSU29185, Homo, sapiens, prion, protein, (PrP), gene,, complete, cds., 2/98, Length, =, 35,522, P, =, 4.7e-25, S, =, 80" /note="Region: Data base match"
- misc_feature complement(158406..158524) /note="84, 4:xyz, 10007..10125, of, GB|U29185|HSU29185, Homo, sapiens, prion, protein, (PrP), gene,, complete, cds., 2/98, Length, =, 35,522, P, =, 4.7e-25, S, =, 84" /note="Region: Data base match"
- misc_feature complement(158406..158527) /note="86, 3:xyz, 28258..28379, of, EMB|U29185|U29185, Homo, sapiens, prion, protein, (PrP), gene,, complete, cds., 2/98, Length, =, 35,522, P, =, 1.2e-28, S, =, 86" /note="Region: Data base match"
- misc_feature complement(158406..158527) /note="90, 2:xyz, 6988..7109, of, EMB|U29185|U29185, Homo, sapiens, prion, protein, (PrP), gene,, complete, cds., 2/98, Length, =, 35,522, P, =, 2.1e-31, S, =, 90" /note="Region: Data base match"
- misc_feature 158406..158527 /note="84, 6:xyz, 34001..34122, of, EMB|U29185|U29185, Homo, sapiens, prion, protein, (PrP), gene,, complete, cds., 2/98, Length, =, 35,522, P, =, 1.2e-28, S, =, 84" /note="Region: Data base match"
- misc_feature complement(158406..158526) /note="80, 4:xyz, 11524..11644, of, EMB|U29185|U29185, Homo, sapiens, prion, protein, (PrP), gene,, complete, cds., 2/98, Length, =, 35,522, P, =, 5.4e-25, S, =, 80" /note="Region: Data base match"
- misc_feature complement(158406..158524) /note="84, 4:xyz, 10007..10125, of, EMB|U29185|U29185, Homo, sapiens, prion, protein, (PrP), gene,, complete, cds., 2/98, Length, =, 35,522, P, =, 5.4e-25, S, =, 84" /note="Region: Data base match"

.......
Lee,I.Y., Westaway,D., Smit,A.F., Wang,K., Cooper,C., Yao,H., Prusiner,S.B. and Hood,L.

These authors evidently sequenced very extended regions of mouse, sheep, and human prion genes and then processed them with software that could identify parasitic regions in the introns (that account in part for their length). Here, I processed the GenBank entry to sort repeat types alphabetically and by number of occurrences. -- webmaster.
U29185 human prion 35,522 bases; mouse 38,418 bases, and sheep 31,412 bases. Interspersed repeats were identified with RepeatMasker; simple sequence repeats were identified with Sputnik. [These are not online services.]
These GenBank entries are nearly unreadable as computer output, so they are presented here re-sorted by repeat name, length, and species as seen in the table below [start and stop not shown here for simplicity]. Human prion had 89 garbage insertions in 53 classes, mouse 56 in 47 classes, and sheep 43 in 38 classes. These elements, plus exons, accounted for only 41,272 bases of 105,422 total bases sequenced, or 39%, ie 61% is either unrecognized, regulatory, or unique.
Even anchoring at common exons, there is little similarity of inserts across these species, though sheep and cow might be close enough to align. This illustrates an interesting and unwanted aspect of mammalian genomes: 95% or more of the sequence seems to be non-functional debris. AluSx was the subject of the previous posting.
species | strand | #reps | length | repeat class |
---|---|---|---|---|
human | - | 3 | 111 | AluJb |
human | + | 5 | 172 | AluJo |
human | + | 2 | 297 | AluSg |
human | + | 2 | 304 | AluSq |
human | + | 6 | 293 | AluSx |
human | + | 2 | 287 | AluY |
human | + | 1 | 32 | caaaa |
human | + | 1 | 133 | exon1 |
human | + | 1 | 98 | exon2 |
human | + | 1 | 2,353 | exon3 |
human | - | 2 | 131 | FLAM_A |
human | + | 2 | 132 | FLAM_C |
human | - | 10 | 269 | L1M2_orf2 |
human | + | 1 | 66 | L1MA9 |
human | + | 2 | 150 | L1MB8 |
human | + | 2 | 85 | L1MC1 |
human | + | 2 | 195 | L1ME3 |
human | - | 2 | 534 | L1PA9 |
human | + | 11 | 45 | LINE2 |
human | - | 1 | 230 | LTR16C |
human | - | 2 | 370 | MER28 |
human | + | 1 | 118 | MER3 |
human | - | 5 | 100 | MER5A |
human | - | 2 | 133 | MER5B |
human | + | 1 | 420 | MER65A |
human | - | 1 | 264 | MER74B |
human | + | 2 | 450 | MER88 |
human | + | 6 | 70 | MIR |
human | - | 1 | 312 | MLT1A1 |
human | + | 1 | 207 | MLT1F |
human | + | 2 | 97 | MLT1G |
human | + | 2 | 460 | MLT2CB |
human | - | 2 | 44 | MSTC |
human | - | 2 | 71 | TIGGER2 |
mouse | + | 1 | 41 | ACx20 |
mouse | - | 3 | 109 | B1-F |
mouse | - | 6 | 61 | B1_MM |
mouse | - | 1 | 210 | B2 |
mouse | - | 3 | 184 | B3 |
mouse | - | 2 | 49 | B4 |
mouse | + | 1 | 121 | B4A |
mouse | + | 1 | 24 | CAx12 |
mouse | + | 1 | 46 | exon1 |
mouse | + | 1 | 97 | exon2 |
mouse | + | 1 | 2,007 | exon3 |
mouse | + | 1 | 24 | GATTx6 |
mouse | + | 1 | 66 | GGAn |
mouse | + | 1 | 61 | GGAx20 |
mouse | - | 2 | 1,703 | IAP |
mouse | - | 2 | 381 | IAPLTR1_MM |
mouse | - | 2 | 66 | ID4 |
mouse | - | 1 | 59 | ID5 |
mouse | - | 1 | 48 | ID6 |
mouse | - | 2 | 78 | ID_RN |
mouse | - | 1 | 541 | L1MA4 |
mouse | - | 1 | 169 | L1_MM |
mouse | - | 1 | 181 | LINE2 |
mouse | + | 1 | 82 | LTR10A |
mouse | - | 1 | 151 | MER5A |
mouse | - | 1 | 424 | MER67C |
mouse | - | 1 | 33 | MLT1A1 |
mouse | - | 1 | 199 | MLT1B |
mouse | + | 1 | 432 | MLT2CB |
mouse | + | 1 | 319 | ORR1A3 |
mouse | - | 1 | 383 | ORR1B |
mouse | + | 1 | 362 | ORR1D |
mouse | - | 3 | 51 | PB1D10 |
mouse | + | 3 | 89 | PB1D7 |
mouse | + | 1 | 55 | PB1D9 |
mouse | + | 1 | 56 | RSINE1 |
mouse | + | 1 | 28 | TAAAx7 |
mouse | + | 1 | 38 | TTGTx9 |
sheep | + | 1 | 25 | ACx12 |
sheep | - | 2 | 124 | Bov-A2 |
sheep | - | 6 | 98 | Bov-B |
sheep | + | 1 | 227 | Bov-tA1 |
sheep | + | 8 | 88 | Bov-tA2 |
sheep | + | 1 | 158 | Bov-tA3 |
sheep | + | 1 | 51 | exon1 |
sheep | + | 1 | 97 | exon2 |
sheep | + | 1 | 4,027 | exon3 |
sheep | - | 2 | 147 | L1M2_orf2 |
sheep | - | 3 | 99 | L1_Art |
sheep | + | 3 | 93 | LINE2 |
sheep | - | 1 | 106 | LINE2a |
sheep | - | 1 | 219 | MER21B |
sheep | - | 1 | 421 | MER57int |
sheep | - | 5 | 50 | MER5A |
sheep | - | 3 | 175 | MLT1F |
sheep | + | 1 | 292 | MLT1G |
sheep | + | 1 | 1,219 | Oamar1 |
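
The re-sorting described above is simple to reproduce. Here is a minimal sketch, assuming the pipe-delimited row layout of the table and treating `#reps` as the number of occurrences; the six sample rows are copied from the table, while the function names are hypothetical and this is not the script used to prepare it.

```python
from collections import defaultdict

SAMPLE = """\
species | strand | #reps | length | repeat class |
human | + | 6 | 293 | AluSx |
human | - | 10 | 269 | L1M2_orf2 |
human | + | 11 | 45 | LINE2 |
mouse | - | 6 | 61 | B1_MM |
sheep | + | 8 | 88 | Bov-tA2 |
sheep | - | 6 | 98 | Bov-B |
"""


def parse_rows(text):
    """Turn pipe-delimited table lines into (species, strand, reps, length, name)."""
    rows = []
    for line in text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 5 or not cells[2].isdigit():
            continue  # skip the header, separators and blank lines
        species, strand, reps, length, name = cells
        rows.append((species, strand, int(reps), int(length.replace(",", "")), name))
    return rows


def occurrences_per_species(rows):
    """Total number of repeat occurrences listed for each species."""
    totals = defaultdict(int)
    for species, _strand, reps, _length, _name in rows:
        totals[species] += reps
    return dict(totals)


def sort_by_occurrence(rows):
    """Sort by species, then by descending occurrence count, then by name."""
    return sorted(rows, key=lambda r: (r[0], -r[2], r[4]))


rows = parse_rows(SAMPLE)
for row in sort_by_occurrence(rows):
    print(row)
print(occurrences_per_species(rows))
print(round(100 * 41272 / 105422))  # share of sequence in recognized elements plus exons
```

Run over the full human block of the table, the `#reps` column sums to 89, which agrees with the insertion count quoted in the text. The three sequence lengths quoted earlier sum to 105,352, slightly less than the 105,422 total given; the source does not explain the difference.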
Torsten Brinch ... Ind. Ag. Observer

"Longitudinal trends in the UK BSE epizootic have drawn considerable interest -- with main focus being on indications of the decline of disease incidence, and future virtual disappearance. The dataset most frequently worked from is the time series of number of cases by month (or year) of disease onset.
Another option would be to use data on number of confirmed cases by month of birth, which is known for appr. 2/3 of all confirmed BSE cases. Little has been published using these data, however. Recently I have used this data set (graphed data from refs (1) and (2)), in an attempt to obtain some indication of the changing exposure to birth cohorts over time.
The unadjusted data set of cases by month of birth does not depict changes in exposure well, because there is a huge seasonal variation. This variation can reasonably be assigned to two main factors:
- a distinct annual calving pattern, UK calving season being in the autumn.
- calves born early in the calving season are preferentially chosen for
replacement into the herd.
My seasonal adjustment method is very simple: - sum all cases born in each calendar month across the years (12 sums), - calculate the proportion of these sums born in each year, - plot all proportions as a function of time; -- and this graph emerges:
1) Significant disease-causing exposure is apparent for cows born from autumn 1982 onwards. Exposure increased strongly for cows born during 1986 and the first half of 1987.
2) For cows born during the year preceding the feed ban in July 1988, a high plateau of exposure to disease-causing agent was reached (it is noteworthy that approximately 1/3 of all confirmed cases were born during this period).
3) At the time of the feed ban, a dramatic drop is seen (apparently exposure of cows born just a few months after the feed ban was only half the exposure of cows born a few months before the ban and during the preceding high-exposure plateau).
4) The immediate dramatic drop was not sustained, and exposure continued at significant levels for cattle born several years after the feed ban.
Two weaknesses to the simple approach to data used to produce the graph have been identified:
a) Not all cases born in 1989 and onwards have reached the age where disease sets on, thus data for the latest birth cohorts are incomplete -- this leads to an impression of misleadingly low exposure to cows born during 1989, 1990, and 1991. (This weakness can be addressed by correction using the result from the SMR analysis (Fig.18, (3)) which was produced by the method described by Hoinville (3))
b) It is assumed that the annual calving and replacement selection pattern has been constant during the period; but it is known that the UK calving season has been broadened and shifted forward during the period. This change is not easily corrected for and should lead to spurious secondary exposure peaks for calves born during June and July in the latest years (and valleys in the early years).
Finally I will add, that I find that the results of my method are not fully consistent with some of the assumptions of the Anderson et al (4) analysis of the longitudinal trends of BSE, mainly:
- the assumption of a strong seasonal variation of -feed risk- (I think the seasonal variation can be more simply explained by the annual pattern of calving and replacement selection.)
- the assumption of feed risk peaking -after the feed ban- (I think the feed risk peaked before or at the time of the feed ban)
- the estimated/assumed mean age at infection being 1.3 years (I think the mean age of disease causing exposure must have been lower, and that it might well be as early as the weaning period)."
Refs:
(1) MAFF, UK: BSE Progress report (1995)
(2) MAFF, UK: Programme to eradicate BSE in the United Kingdom (1996)
(3) Hoinville "Decline in the incidence of BSE in cattle born after
the introduction of the 'feed ban'" (1994) Vet.Rec. v.370, p.274-275
(4) Anderson et al. "Transmission dynamics and epidemiology of BSE
in British cattle" (1996) Nature, v 382, p.779-788
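
The seasonal adjustment Brinch describes (sum cases over all years for each calendar month, then express each month's cases as a proportion of that calendar month's sum) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name is hypothetical and the counts are invented toy numbers, not MAFF data.

```python
def seasonally_adjust(cases):
    """Express each (year, month) count as a share of that calendar month's total.

    cases: dict mapping (year, month) -> confirmed cases born in that month.
    Returns a dict with the same keys, in chronological order.
    """
    month_totals = {}
    for (_year, month), n in cases.items():
        month_totals[month] = month_totals.get(month, 0) + n

    adjusted = {}
    for (year, month), n in sorted(cases.items()):
        total = month_totals[month]
        adjusted[(year, month)] = n / total if total else 0.0
    return adjusted


# Invented toy counts: September births outnumber March births threefold
# every year, mimicking an autumn calving season.
toy_cases = {
    (1986, 3): 100, (1986, 9): 300,
    (1987, 3): 200, (1987, 9): 600,
    (1988, 3): 100, (1988, 9): 300,
}

for key, share in seasonally_adjust(toy_cases).items():
    print(key, share)
```

In the toy data the raw counts swing threefold between March and September within each year, but the adjusted shares are identical for both months of a given year, so only the year-to-year change in exposure remains. As noted in weakness (b), this works only if the calving and replacement pattern is constant across the period.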
27 May 98 webmaster review and comment

This is from Hornemann and Glockshuber at Zurich, PNAS Vol. 95, Issue 11, 6010-6014, May 26, 1998.
Basically, they gradually denature [long incubation] mouse prion fragment 121-231 in urea at acid pH, finding a dominantly populated beta-structure unfolding intermediate even without the urea at pH 4.0 (at constant ionic strength). Guanidinium acid-induced unfolding transitions to beta of human 90-231 were previously reported in J. Biol. Chem. 272: 27517 (1997) and in Biochemistry 36, 3543-3553. This is also somewhat reminiscent of transthyretin.
They don't determine the actual structure here; it is just far-UV CD spectroscopy, with the urea itself masking events below 210 nm (though surely alpha helices are gone), but it sounds like the new state is populated sufficiently to proceed with solution NMR, which would be quite exciting. While they don't get aggregations and didn't measure CR birefringence, I would still vote for cross-beta.
It might seem interesting to compare short incubation mouse, L108F and T189V relative to long, but I would guess that L108F is the main effect, so would be moot here.
Note 90-120, the traditional conformationally shifting region, is missing here, that is, the globular domain 121-231 also may [reversibly] change conformation as the rogue isoform is created. Recall that the epitope recognized by the Prionics PrPSc-specific antibody is composed of three segments from 121-231. Somehow I had missed an earlier study characterizing the frayed amino terminus of the infective species as 73-90 of Weissmann et al, CSHSQB 61, 511-522 (1996).
Their perspective is reasonable enough:
"This is consistent with the nucleation/condensation model for the formation of the infectious PrPSc oligomer (12), which postulates a fast equilibrium between monomers of PrPC and PrPSc and a rate-limiting association of PrPSc monomers to an oligomer of critical size that forms a nucleus for further irreversible incorporation of PrPSc monomers into a growing PrPSc oligomer. Within the framework of this model, any change in solvent conditions that shifts the PrPC/PrPSc equilibrium toward PrPSc would accelerate the formation of oligomeric PrPSc. A change in solvent conditions, namely a shift from physiological to acidic pH, may indeed occur during the conversion of PrPC into PrPSc in vivo: The cell surface protein PrPC is normally exposed to physiological pH. PrPC appears to be clustered in cholesterol-rich invaginations of the plasma membrane termed caveolae (30), which may bud from the membrane and fuse with endosomes in a clathrin-independent endocytosis pathway. Intriguingly, insoluble PrPSc amyloid accumulates in the endosomal lumen of scrapie-infected cells (31, 32) where the pH varies between 4.0 and 6.0 (33), indicating that the propagation of the infectious scrapie agent may occur during endocytosis at acidic pH. "
The data here weigh in favor of the model of a fast equilibrium between monomeric PrPC and a monomeric PrPSc precursor, with the rate-limiting reaction being the formation of a PrPSc oligomer, which acts as a nucleus for the growth of larger oligomers of the infectious agent, as opposed to the UCSF model, in which the rate-limiting step is the irreversible autocatalytic conversion of monomeric PrPC to monomeric PrPSc.
They would explain the CJD-causative mutations Asp178Asn and Glu200Lys as accomplishing the same end result as protonation of acidic residues at low pH: eliminating favorable interactions with the helix dipoles and facilitating formation of the intermediate. Glu219Lys, a neutral polymorphism also affecting an acidic residue, is not discussed.
Note that the beta unfolding intermediate here has not been connected in any way to the beta structure found in PrPSc, much less to infectivity. It may simply be that it is very common for proteins in general to form such unfolding intermediates at low pH in urea; for example, src homology domains are somewhat analogous. There were no control proteins or literature review in the present study.
The hard part is to prove that this beta is the beta of the rogue conformer -- it could simply be unrelated residues forming unrelated strands.
Biopharm 25 May 98 Blum, M et al.

This is a 6 page study that looked at the effect of process purification steps on removing deliberately introduced spikes of hamster scrapie brain, using one-year bioassay and western blot. The authors report quite large reductions in ID50, 16-17 logs. Brain homogenate does not necessarily simulate infectivity sources plausibly present in blood fractions used to manufacture BSA and other ubiquitous products; sourcing from low-BSE countries provides another level of protection.

Biochem Biophys Res Commun 1998 May 8;246(1):100-106 Rubenstein R, Gray PC, Wehlburg CM, Wagner JS, Tisone GC

Prion diseases are progressive degenerative disorders of the central nervous system. The transmissibility and fatal nature of these diseases necessitate their rapid and accurate diagnosis. The hallmark of these diseases is the accumulation of PrPSc, a protease-resistant form of a host-coded glycoprotein. We have been evaluating the use of multi-spectral ultraviolet fluorescent spectroscopy as a means of detecting and distinguishing between different forms of PrPSc. Spectroscopic measurements of fluorescence from untreated and proteinase K (PK)-treated PrPSc, purified from 263K scrapie strain-infected hamster brains and ME7 scrapie strain-infected mouse brains, were performed.
Spectra of untreated and PK-treated PrPSc samples for 263K and ME7 appeared qualitatively different. The identification and discrimination of PrPSc were possible based on these spectral signatures, calculations of their fluorescence cross sections, and determination of the orthogonal differences. This technique has the potential not only for the sensitive, specific, and direct detection of PrPSc, but also for the ability to distinguish between different forms of the prion protein.
FEBS Lett 1998 Apr 24;426(3):291-296 Glockshuber R, Hornemann S, Billeter M, Riek R, Wider G, Wuthrich K
... Here we report the observation of structural similarities between the domain of PrP(121-231) and the soluble domains of membrane-anchored signal peptidases. At the level of the primary structure we find 23% identity and 41% similarity between residues 121-217 of the C-terminal domain of murine PrP and a catalytic domain of the rat signal peptidase. The invariant PrP residues Tyr-128 and His-177 align with the two presumed active-site residues of signal peptidases and are in close spatial proximity in the three-dimensional structure of PrP(121-231).
29 May 98 webmaster review
I do not find the slightest shred of support in this article for any relationship of the globular domain of prion protein to signal peptidases (nor to any protease). On the contrary.
Corrections are posted to a previous flawed hemoglobin homology and repeat random coil model from this group. High standards for research are necessary because of the serious human health threat posed by TSEs -- these are not diseases of fruit flies.
Here is a partial list of factual errors, inconvenient observations, methodological omissions, and experimental gaps:
* The authors are initially motivated by their finding that prion protein is slowly degraded in nmr experiments as it sits for several days at room temperature, concluding that the prion protein itself possesses auto-catalytic proteolytic activity. More commonly, however, trace amounts of protease from E. coli copurify, or are introduced from air (fungal spores), glassware, etc. E. coli cannot be made protease-free; ironically, the signal peptidase lepB itself is among the essential proteases. Domain junctions are extremely vulnerable to proteases -- here, the globular domain is folded and protected but the sites of the nicks are non-native fragments. Finally, many proteins have very weak, functionally irrelevant catalytic properties; the classic example is bovine serum albumin.
* There are 5 major homological classes of proteases [serine, aspartic, zinc, cysteine, and signal]; effective cross-kingdom commercially available inhibitors exist for each class, including signal peptidases [a beta-lactam]. The authors did not determine which class of protease was causing their problem. It is most unlikely that the signal peptidase class is involved because there is no substrate present. That is, signal peptides are highly specialized substrates; residues 1-24 of prion protein are missing and no other region remotely qualifies using neural nets trained on thousands of signals.
* Inappropriate assays for intrinsic prion proteolytic activity: why test substrates for the other 4 classes but not for the class under discussion? An effective cross-kingdom substrate for signal peptidases is readily available: a fusion protein of outer membrane protein with nuclease A [called pro-Omp nuclease]. Ironically, the authors quote extensively from a paper [ref 31: Prot Sci 6: 1129 (1997)] where this substrate is discussed in detail. Alternatively, a synthetic peptide corresponding to the junction of nicks observed during nmr proteolysis could be used (of course, the class of protease involved can usually be inferred from the never-characterized nicks themselves).
* Their source paper for signal peptidase says on page 1130 that "the mechanism of action of signal peptidases remains undefined even today [1997]" and that "the functional significance of this apparent substitution [lys to his in domains D] is not understood [p1133]." Does it follow from this that his 177 of prion protein functions in an active site? Contrary to the paper, his 177 is not invariant: for example, it is arginine in wolf, dingo, and dog [and far worse, alanine in 10 species of bird in a strongly alignable area, with no Lewis base in sight]. Similarly, asn 173 (lys in horse), asn 174, and the 'tripeptide insertion' are lost in birds (which are equally valid -- see below). Such volatility in active site residues is totally unprecedented.
* Tyrosine cannot plausibly substitute for serine in an active site. The residue is far bulkier, far more hydrophobic, never seen as a protease nucleophile (the authors here can only invoke topoisomerases), and already committed to an invariant beta strand. It is not legitimate to allow arbitrary rotamer substitutions for 3 amino acids simultaneously: serious steric conflicts result and the whole protein must be rearranged. Tyrosine, histidine, and asparagine are found in the 'wrong' place in the nmr structures from two different species for a very good reason: energy minimization. If these residues are so out of place, what does this say about the rest of the structure? Note that the active site alignment argument evaporates when all of the key residues of signal peptidase must be replaced and displaced.
* The burden of proof falls on the authors when they seek to spatially align signal peptidases with the prion globular domain. The secondary structure is readily determined with high confidence by online prediction tools, and the xray structure of the E. coli signal protease fragment is available by private correspondence from Kuo. If the authors feel that UmuD' is similar and relevant, simply compute the rms difference of the best alignment. Try threading signal peptidase onto prion PDB coordinates. These 4 tests take less than 10 minutes to complete -- there are 6 authors. The lex A protease family has only the vaguest ties to signal peptidases -- this is to wander off-topic. And why even test trypsin substrates, or start comparing the 'active site' to cysteine proteases, or wonder about beta sheet aggregates in UmuD'? These protease classes have nothing whatsoever to do with each other or with the comparison of signal peptidases to prion protein.
* The critical assertion of 23% identity is, in my opinion, completely bogus (non-robust). It is dependent on cherry-picking sequences to align from within these large families and upon hand-gapping without penalty, both of which are known to vastly inflate homology estimators. Blast2p and DSSP queries with prion protein do not return signal peptidases; queries with signal peptidases do not return prion proteins. The reason: there is a 99.9995% chance of finding proteins with this level of alignment to prion protein simply by chance. So the degree of homology is not remotely in the twilight zone; it is completely lacking in statistical significance. Signal peptidase proteins among themselves hardly have this level of identity.
Looking at Figure 1 in the paper, the authors chose a 'favorable' denominator to get their 23%. That is, they found 20 identities; the prion fragment is 100 residues and rat signal is 91, plus gaps of 6 and 3. Note 20/100 = 20% while 20/91 = 22%, so perhaps non-matching C-terminal residues were not used. Both numbers are already below the cutoff for significance. However, it gets much worse: the authors use a standard gap penalty of 3.0 for opening and 0.1 for elongation from GCG BestFit, evidently not reading the software documentation. For gaps of 6 and 3, this amounts to a penalty of 3.0 + 0.6 + 3.0 + 0.3 = 6.9. This penalty must be taken from the residue identities count in order to use the 25% identity level for statistical significance [as determined by the authors of DSSP, C Sander and R Schneider, Proteins 9: 56-68 1991, see page 61]. In other words, (20 - 6.9)/100 = 13.1%, not remotely in the gray area for statistical significance.
* Signal peptidases evidently originated in the early pre-Cambrian. Although they are poorly conserved, with less than 7% identical residues as a group, weakly related members can be found in species from all kingdoms. Since mammals already have an orthologous signal peptidase (eg, SPC 18/21 in the heteropentamer from rat), the authors are evidently thinking of prion protein as a paralogue. Since prion protein is reported from salmon, this implies a gene duplication and divergence of function at least 400 million years ago (first vertebrates date to 546 Mya). There would have been no repeat region, crucial 106-126 domain, carbohydrate sites, intra-molecular disulphide bond, signal peptide, or GPI anchor at the time of gene duplication. [The authors make the astonishing suggestion that the mitochondrial peptidase [nearly] homo-dimer implies a prion hetero-dimer with 'Protein X' [p294], despite billions of years of divergence and monomers and pentamers elsewhere in the family.]
Thus it is arbitrary and capricious to align rat peptidase with mouse prion -- one might equally well use chicken as the prion representative, but then the asserted correspondences become significantly weaker. The methodology commonly used in this situation is to reconstruct ancestral sequences (to suppress noise): ie, reconstruct the bird-mammal common ancestor and compare this to a reconstructed early vertebrate signal peptidase, in primary and secondary structure and threaded 3D rms. Again, this takes a few minutes to accomplish with online software -- again, there are 6 authors listed. Perhaps the results were inconvenient.
* No rationale is given for the proposed activity. Prion protein sits on the extracellular membrane surface -- of what use is a signal peptidase activity here when the substrate is in the ER? Signal peptidases have integral membrane domains: why were these discarded in favor of a GPI anchor? What is the function of the other two domains, where and when did they arise, and how do they dovetail with the proposed peptidase activity?
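For readers who want to check the identity arithmetic from the Figure 1 discussion above, here is a minimal sketch in Python. It uses only the numbers quoted in this review (20 identities, fragment lengths of 100 and 91, gaps of 6 and 3, gap penalties of 3.0 and 0.1); the function and constant names are invented for illustration and do not come from GCG BestFit or any other package. Subtracting the gap penalty from the identity count follows the reasoning given above and is not a general-purpose significance test.

```python
# Illustrative check of the percent-identity arithmetic quoted in the review.
# All names here are hypothetical; the numbers are those cited in the text.

IDENTITIES = 20        # identical residues counted in Figure 1
PRION_LEN = 100        # length of the prion fragment
RAT_LEN = 91           # length of the rat signal peptidase fragment
GAP_LENGTHS = (6, 3)   # the two gaps in the alignment
GAP_OPEN = 3.0         # opening penalty quoted in the text
GAP_EXTEND = 0.1       # per-residue elongation penalty quoted in the text


def gap_penalty(gap_lengths, gap_open=GAP_OPEN, gap_extend=GAP_EXTEND):
    """Affine penalty: one opening charge per gap plus a charge per gapped residue."""
    return sum(gap_open + gap_extend * length for length in gap_lengths)


def percent_identity(identities, denominator, penalty=0.0):
    """Identities (optionally reduced by a gap penalty) as a percentage of denominator."""
    return 100.0 * (identities - penalty) / denominator


if __name__ == "__main__":
    penalty = gap_penalty(GAP_LENGTHS)
    print("gap penalty:", round(penalty, 1))
    print("identity over prion length:", round(percent_identity(IDENTITIES, PRION_LEN), 1))
    print("identity over rat length:", round(percent_identity(IDENTITIES, RAT_LEN), 1))
    print("gap-corrected identity:", round(percent_identity(IDENTITIES, PRION_LEN, penalty), 1))
```

Changing the denominator or the gap lengths shows how sensitive the headline percentage is to these choices, which is the point of the objection.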
The function and relationships of the globular domain of prion protein remain a total mystery.
[Note that the nascent prion protein must of course interact transiently with the signal peptidase in the ER in order to have its leader peptide removed. Mutations blocking this process are easily created, yet none of the known CJD mutations are of this type.]