Analysis of the prion gene: introduction
Feature tables of mammalian prion genes
Rat prion cytochrome c pseudogene
Outgroup arbitration applied to prion promoter
Alignment of cyt c pseudogene delimited pre-exon 1
Ancestral size of intron 1
Full length alignment and analysis of cow and sheep
Supplemental resources: on-site, off-page
...Masked human prion: 13,885 bp removed, 21,637 bp left
...Masked mouse prion: 11,692 bp removed, 26,726 bp left
...Masked rat prion: 917 bp removed, 7,577 bp left
...Masked sheep prion
...Masked cattle prion
Supplemental resources: off-site
...Censor Server at GIRI
...Blast services: 2 sequences, personal database, or by taxon
25 May 99 webmaster
IY Lee et al. published a major article on the human, sheep, and mouse prion gene sequences [Genome Res. 1998 Oct;8(10):1022-37]. Recall that the coding part of the mammalian gene comprises only some 774 nucleotides, whereas these researchers sequenced an envelope of roughly 35,000 bp around the gene in each of the three species. Many improvements and new observations were made relative to the version that appeared in the 1996 Erice symposium, previously summarized and expanded upon at this site.
The article is well worth reading in detail. A processed cytochrome c pseudogene was observed in rat prion, delimiting the promoter region; a laminin receptor pseudogene in mouse far upstream (a curious coincidence given its purported binding to prion protein); a 4.5 S rRNA pseudogene; and various SINE and LINE retrotransposons, which were classified as either ancestral mammalian or lineage-specific. They also reconstructed the history of intron 2, which is 9,970, 18,012, and 14,031 bp in human, mouse, and sheep, respectively. On average, the mammalian prion gene has grown longer over time because retrotranspositional insertions have outpaced deletional loss.
Intron 2 was 9,100 bp at the time of mammalian divergence. Since the average is 14,004 bp today, an extra 4,904 bp (54%) has crept in over the eons, about 50 bp per million years. This process is no doubt continuing today (viz., the IAP defective virus in some mouse strains). A more recent retrotransposon can split an earlier one upon insertion. This gives a strip-and-join method for finding otherwise unrecognizable older insertions, and a way of ordering events that supplements dating techniques based on point mutation rates (as used with the primate Alu family).
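The intron 2 arithmetic can be checked in a few lines. This is only a sketch using the lengths quoted above; the implied time span is an assumption obtained by taking the 50 bp per million years figure at face value, and it is not a number stated by Lee et al.

```python
# Intron 2 lengths (bp) as quoted above.
intron2 = {"human": 9970, "mouse": 18012, "sheep": 14031}
ancestral = 9100  # reconstructed size at mammalian divergence (from the text)

mean_now = sum(intron2.values()) / len(intron2)
gain = mean_now - ancestral
pct_gain = 100 * gain / ancestral

# Assumption: back out the time span implied by the quoted 50 bp/Myr rate.
implied_myr = gain / 50

print(round(mean_now), round(gain), round(pct_gain), round(implied_myr))
# prints: 14004 4904 54 98
```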
SINEs and LINEs are a specialized world. The literature is immense, the sequences an acquired taste, the subtypes unending. Because retrotransposon analysis is constantly improving, the paper by Lee et al. is not fully consistent either with their GenBank entries or with a 15 Mar 99 analysis by retrotransposon expert Jerzy Jurka, whose Censor Server analyzes and annotates genes in real time. Indeed, many GenBank entries are under-annotated or mis-annotated with respect to retrotransposons.
Below, various topics concerning the structure of the prion gene are explored in more depth than was possible in the IY Lee paper, exploiting the more contemporary feature analysis by Jurka and considering additional sequenced species. By stripping off insertional elements and pseudogenes found in particular lineages, better alignments and Blast queries are possible.
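The stripping step can be sketched as follows, assuming the annotated elements are available as 0-based, half-open coordinate pairs. The function name `strip_features` and the toy sequence are hypothetical illustrations; this is not the Censor Server output format or the procedure actually used for the masked files listed above.

```python
def strip_features(seq, intervals):
    """Remove 0-based, half-open [start, end) intervals from seq.

    Returns (stripped_sequence, bp_removed). Overlapping or nested
    intervals are merged first so that no base is counted twice.
    Assumes every interval lies within the sequence.
    """
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])

    kept, pos = [], 0
    for start, end in merged:
        kept.append(seq[pos:start])  # flank before the element
        pos = end                    # resume after the element
    kept.append(seq[pos:])

    removed = sum(end - start for start, end in merged)
    return "".join(kept), removed


# Toy example (made-up sequence): the lowercase run stands for a younger
# insertion that split an older element GGGGCCCC in two.
toy = "AAAAGGGG" + "ttttt" + "CCCCAAAA"
stripped, removed = strip_features(toy, [(8, 13)])
print(stripped, removed)  # prints: AAAAGGGGCCCCAAAA 5
```

Joining the flanks in this way puts the two halves of an older, interrupted element back in contact, so that it can be recognized in a second pass.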
The rat cytochrome c pseudogene conveniently limits the extent of the prion promoter to 454 bp. A primitive rodent promoter can be reconstructed using repeat masking and a tree-topology driven consensus technique called outgroup arbitration, which resolves most indels into either deletions or insertions and also determines which lineage acquired the point mutation where sequences differ.
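A minimal sketch of outgroup arbitration for two ingroup sequences and one outgroup, assuming the three are already aligned with '-' for gaps. The function names, the 'N' placeholder, and the toy sequences are illustrative assumptions, not the procedure actually used to build the consensus on this site. Each column is judged independently, so a multi-base indel is reported once per column.

```python
def classify(ancestral_state, derived_state):
    """Name the change that turns the ancestral state into the derived one."""
    if ancestral_state == "-":
        return "insertion"
    if derived_state == "-":
        return "deletion"
    return "substitution"


def arbitrate(a, b, outgroup, names=("rat", "mouse")):
    """Column-by-column outgroup arbitration.

    Where the ingroup sequences disagree, the outgroup casts the deciding
    vote: the ingroup state it matches is taken as ancestral and the other
    lineage is charged with the change. Returns (ancestral, events), where
    ancestral uses 'N' for unresolved columns and omits columns inferred
    to be absent in the ancestor.
    """
    if not (len(a) == len(b) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")
    ancestral, events = [], []
    for i, (x, y, o) in enumerate(zip(a, b, outgroup)):
        if x == y:
            state = x                       # ingroup agrees; nothing to arbitrate
        elif o == x:
            state = x
            events.append((i, names[1], classify(x, y)))
        elif o == y:
            state = y
            events.append((i, names[0], classify(y, x)))
        else:
            state = "N"                     # outgroup matches neither
            events.append((i, None, "unresolved"))
        if state != "-":
            ancestral.append(state)
    return "".join(ancestral), events


# Toy alignment (made-up sequences).
rat   = "ACGT-ACCT"
mouse = "ACGTTAC-A"
human = "ACATTAC-T"
ancestor, events = arbitrate(rat, mouse, human)
print(ancestor)
for event in events:
    print(event)
```

In the toy case the gap at column 4 is resolved as a deletion in rat, the extra base at column 7 as an insertion in rat, and the mismatch at column 8 as a substitution in mouse.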
It turns out that in non-coding DNA, whether promoter or not, small deletions or insertions occur at roughly the same rate as point mutations. (Note these changes are 'accepted' rates, not incidence rates; only changes fixed by genetic drift and selection across the species are under consideration.) The size distribution of indels shows a rapid fall-off with increasing size; many occur in a repeat context, suggesting replication slippage.
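One way to tabulate such a comparison from a pairwise alignment is sketched below. It assumes equal-length aligned strings with '-' for gaps, counts each contiguous gap run as a single indel event, and counts mismatches one column at a time; the function name and toy sequences are hypothetical.

```python
from collections import Counter
from itertools import groupby


def indel_and_substitution_counts(a, b):
    """Tally indel sizes and mismatches in a pairwise alignment.

    Returns (sizes, substitutions): sizes maps gap-run length to the
    number of runs of that length; substitutions counts mismatched columns.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    labels = []
    for x, y in zip(a, b):
        if x == "-" and y == "-":
            continue                 # gapped in both: uninformative for this pair
        if x == "-":
            labels.append("A")       # gap in a
        elif y == "-":
            labels.append("B")       # gap in b
        elif x != y:
            labels.append("X")       # mismatch
        else:
            labels.append("M")       # match

    sizes, substitutions = Counter(), 0
    for label, run in groupby(labels):
        n = len(list(run))
        if label in ("A", "B"):
            sizes[n] += 1            # one indel event of length n
        elif label == "X":
            substitutions += n
    return sizes, substitutions


# Toy alignment (made-up sequences).
sizes, subs = indel_and_substitution_counts("ACGT--ACGTTAC", "ACTTGGAC-TTAC")
print(dict(sizes), subs)
```

A histogram of `sizes` over a real alignment would show whether the fall-off with increasing size holds for a given pair of species.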
The whole region delimited 5' by the rat pseudogene and 3' by the start of exon 2 is aligned here for all species available. Erratic changes can be understood by first aligning close pairs such as sheep and cow that present no difficulties. The human sequence can often cast a deciding vote when sheep and cow differ. Anchor regions (long conserved stretches) allow rodents to be added and localize residual uncertainty.
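Anchor regions can be located mechanically once a multiple alignment exists. The sketch below assumes a list of equal-length aligned strings and treats a column as conserved only when every sequence carries the same base and none has a gap; the name `anchor_regions`, the `min_len` threshold, and the toy sequences are assumptions made for illustration.

```python
def anchor_regions(aligned, min_len=20):
    """Return half-open (start, end) column ranges that are fully conserved.

    A range qualifies when it is at least min_len columns long and every
    column in it is identical across all sequences and gap-free.
    """
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned to the same length")

    anchors, run_start = [], None
    for i in range(length + 1):      # one extra step closes a run at the end
        conserved = (
            i < length
            and aligned[0][i] != "-"
            and all(s[i] == aligned[0][i] for s in aligned)
        )
        if conserved and run_start is None:
            run_start = i
        elif not conserved and run_start is not None:
            if i - run_start >= min_len:
                anchors.append((run_start, i))
            run_start = None
    return anchors


# Toy alignment (made-up sequences); a short threshold is used for display.
sheep = "ACGTACGT-TTGACCA"
cow   = "ACGTACGTATTGACCA"
human = "ACGTTCGTATTCACCA"
print(anchor_regions([sheep, cow, human], min_len=4))  # prints: [(0, 4), (12, 16)]
```

The stretches between consecutive anchors are where residual alignment uncertainty is confined.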
This sets the stage for asking why exon 1 aligns so poorly between rodents and other mammals; indeed, the presumptive upstream transcription start and control signals are far better preserved. The ancestral size and history of intron 1 can be deduced using rodent, primate, and artiodactyl sequences. Here, after discounting retrotransposons, non-rodents average 2,386 bp while rodents average 1,871 bp, yet no conspicuous large deletion can be identified that accounts for the 515 bp difference.
15 May 99 GenBank U29185 human gene feature analysis by Lee et al., Censor Server, webmaster
The table below shows that 57 retrotransposons make up 39.1% of the 35,522 bp human prion gene region sequenced. Exons 1, 2, and 3 total about 991 bp, or only 2.8%. That leaves the origin and function of about 58% of the region unexplained.
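The two percentages can be reproduced from figures given on this page. The sketch assumes that the 13,885 bp removed in the masked human sequence listed in the contents corresponds to the 57 retrotransposons; that equivalence is an inference from the matching percentage, not something stated in the table.

```python
# Figures quoted on this page for the human region (GenBank U29185).
region_bp = 35_522
masked_bp, unmasked_bp = 13_885, 21_637  # masked human prion: removed, left
exon_bp = 991                            # exons 1-3, approximate

# The two masking figures should tile the whole region.
assert masked_bp + unmasked_bp == region_bp

# Assumption: masked bp is equated with retrotransposon bp.
retro_pct = 100 * masked_bp / region_bp
exon_pct = 100 * exon_bp / region_bp
print(f"retrotransposons {retro_pct:.1f}%, exons {exon_pct:.1f}%")
# prints: retrotransposons 39.1%, exons 2.8%
```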
None of the insertion events below is recent. The Alu elements mainly date to a period 35 million years ago when a master element was active in primates. The LINE elements were in some cases inserted prior to the mammalian radiation. Thus, these elements are not plausible human population polymorphisms. Simple sequence repeats (SSRs) are more useful markers in this regard: see Figure 1 of the Lee paper.
Features of the human prion gene
dir: + + + - + - - + - - + + - - + + + + + + - + + - + + + + + - - - - + - + + + + - - - - - + - - - - + - - - - - + + (29 +)