Prion Gene Regulation By Alternate Polyadenylation

What are prion ESTs good for? Introduction
Human and mouse ESTs
Alignment of sheep and cow 3' UTR
Prion gene 3' UTR resources at GenBank
Prion gene expression in sheep modulated by alternative polyadenylation
References/Abstracts
Terminal alignment of the major mRNA
Tissues where mouse and human prion ESTs are found
Anti-prion 4.5 kb mRNA: harmless prion pseudogene?

What are prion ESTs good for? Introduction

22 Aug 1999 webmaster
Prion research has many bizarre aspects, most notably a culture of denial vis-a-vis related amyloidoses, forgoing of easy and instructive experiments in favor of yet another pathology phenotyping or flock ORF genotyping, widespread unawareness of protein chemistry central to the subject, and missing skills pertaining to basic online bioinformatic resources used so widely elsewhere in molecular medicine.

For example, there are 443 prion sequences at GenBank. Some 265 of these are expressed sequence tags (ESTs): various factory labs perform RT-PCR with oligo dT on mRNA from a bazillion tissues of mouse and human to see which genes are expressed in which tissues. There is not a single use or mention of this resource in the 6,000-paper prion literature.

For historical reasons, many people take a dim view of the reliability of EST sequences, so it is important to recognize that sequencing accuracy is actually excellent, often at the level of 0-1 errors per 400 base pairs. (Compare this to the guinea pig or kudu prion sequences.)

Note that EST sequences are not full length for a mRNA as long as the human prion gene, which typically has a 1606 bp 3' UTR, so they are not suitable for finding a tissue where exon 2 is utilized. EST sequences, despite their large numbers, do not represent large numbers of individuals at this time and so do not often detect polymorphisms.

What then are these prion EST sequences good for? Because they start at the 3' end of an in vivo processed transcript and work their way upstream, the ESTs carry direct information on alternative polyadenylation, acceptable polyA signals and sites, and tissue use, and thus have implications for tissue-specific regulation of mRNA types, stability, and utilization.

This in turn has possible applications to sporadic CJD and nvCJD or scrapie susceptibility via polymorphisms and mutations (possibly in a secondary gene) resulting in prion protein overproduction. After all these years, we still have no idea how much sporadic CJD is non-ORF familial nor what else beyond met/met distinguishes the nvCJD victims genetically.

There are two motivating precedents for looking at the 3' UTR:

(1) A 1996 paper [Hum Mut 7:280]: the -A21G nucleotide polymorphism upstream of the met initiator codon is found exclusively in A117V. In other words, these two polymorphisms are too tightly linked to have ever been separated by recombination. More bluntly, does A117V cause CJD or does -A21G?

(2) A meeting abstract [#869 Am Soc Hum Gen 46, Mahal SP et al] from 1996 also found two polymorphisms within 600 bp upstream of exon 1 in a screen of sporadic CJD and apparently nvCJD. Nothing further has been released by August 1999 -- a long delay that raises questions.

Some specific mouse and human ESTs are discussed below, along with a graphic comparing alternate prion mRNA organization across 8 species and sequence alignments around the various alternative polyA sites, which have held up well over evolutionary time scales.

Terminal alignment of the major mRNA

20 Aug 99 webmaster
The seven species studied so far have predominant mRNA of similar size, after discounting retrotransposons, alternative exon 1 and 2 use, and secondary mRNA polyadenylation sites. Lee et al aligned 4 of 7 species' terminal 3' UTR mRNA in Fig. 6C of their 1998 Genome Research paper, noting a common poly A signal and site, as well as a GT-rich post-adenylation stretch and a conserved 40bp box upstream of the signal.

Here this alignment is extended to all available species (some with multiple independent sequences) and continued in both directions. In effect, the alignability suggests that this site has been the major polyA site since these species last had a common ancestor some 100 million years ago. This is not a universal terminus -- no homology exists to any other known human or mouse gene. The low rate at which point mutations and small indels become fixed (relative to intermediate mRNA sequences) suggests that considerable selective pressure constrains this region and that polymorphisms here may affect mature prion mRNA formation, stability, and levels of protein produced, with possible implications for sporadic TSE and susceptibility to infection.

Note that a search for secondary structure (hairpins) in full length mRNA [R Luck, JMB 258 813 1996] found no significant structures either 5' or 3' of the open reading frame (Fig 5, pg 820). The only mRNA feature conserved across species was hairpin C in the repeat region, which found support in the use of minor codons at 3rd position. Therefore it is unlikely that conserved 3' UTR stretches are related to hairpins. More likely, they have to do with polyadenylation sites or other unknown sequence features important to mRNA stability and translation or to chromosomal structure.

Using the alignment clamped to the known phylogenetic tree, an ancestral mammal 3' UTR sequence is developed (which serves as a noise-reduced query probe and a baseline for mutational rate), as well as a predicted cervid terminus (the species most likely to be next sequenced). As noted by Lee, rodents have experienced a more rapid rate of mutational fixation than the other lineages.

Pre-signal alignment is good:

human     1     gaatccaaagtggacaccattaacaggtctttgaaatatgcatgtactttatattttcta tatttgtaactttgcatgt-tcttgttttgttatataaaaaaattgtaaatgtttaatatctgactgaa 128 
hamster   2321  ...............c..c.-...t.....c......c...........c.c........ ...................-c..........c.........gt..a.........gc... .......   2446
mouse     30565                          ..........-.c...........c.cg....... ...................a.t.........c.........gt..a.........gc... a......  30666
rat       2504                           ......c...-.c...........c.cg....... ............c......a.-.........c.........gt..a.........gc...           2596
bovine    4127                               .........c......g...-.......... ...................-a.........-.g.-.-....gt..a.....a..............     4218
sheep     26196                              .........c......g...-.......t.. ...................-a.........-.g.-.-....gt..a.....a..............    26287
mink      2286                           ......c............................ ...........gg......-a.....                                   ........  2604

>bovinenopost-mRNA.
4021 cacgttttggccaacccaatactgaatacttaaaggaaactcttccgtgttgtccttagc
4081 cttacagcgtgcactgaatagttttgtataagaatccagagtgatatttgaaatacgcat
4141 gtgcttatattttctatatttgtaactttgcatgtacttgttttgtgttaaaagtttata
4201 aatatttaatatctgactaaaattaaacaggagctaaaaggagg
>sheep
26101 caacccaatactgaatacttaaaggaaactcttctgtgttgtccttagccttacagtgtg
26161 cactgaatagttttgtataagaatccagagtgatatttgaaatacgcatgtgcttatatt
26221 ttttatatttgtaactttgcatgtacttgttttgtgttaaaagtttataaatatttaata
26281 tctgactaaaattaaacaggagctaaaaggagtatcttccacggagtgtctggctgtgtt
26341 caccagtgtgcacactatgttggcagcttcatttggggggttaatatgagaaaagtgaca
>mink
ctaaatacttaatatgtaga
2221 aatccttttgcgtggtcctcaggcttacacgtgcactgaatagttttgtatgatagagcc
2281 catgtggtcttcgaaatatgcatgtactttatattttctatatttgtaactgggcatgta
2341 cttgtataaaaaatgtataaacattcgaactcttgactagaattaaacaggaactgagtg
2401 tgtcccatgtgtttgcagtgacattcaccaccgcaccctgtgttgg
>human
27601 atatgtgggaaacccttttgcgtggtccttaggcttacaatgtgcactgaatcgtttcat
27661 gtaa gaatccaaagtggacaccattaacaggtctttgaaatatgcatgtactttatattt
27721 tctatatttgtaactttgcatgttcttgttttgttatataaaaaaattgtaaatgtttaa
27781 tatctgactgaaATTAAAcgagcgaagatgagcaccacctcccgtgtctgcagttgtatt
27841 tcctggtgcttgccctgtgttggggactgttttgggggttaatctgagccaagtggcgct
27901 ttctgtcctc ccttctcaag tgatggccga tggttcacgc acttccccct gttcctgccc
27961 ttgtcctcac ttcccagtca cccactagtt catctctgcg gcttttgcat tttctccaca
>mouse
30481 atccagtactaaatgcttaccgtgtgacccttgggctttcagcgtgcactcagttccgta
30541 ggattccaaagcagacccctagctggtctttgaatctgcatgtacttcacgttttctata
30601 tttgtaactttgcatgtattttgttttgtcatataaaaagtttataaatgtttgctatca
30661 gactgacattaaatagaagctatgatgaacacctggcggggtttgttctctctccaatgc
30721 tccgagtccactgtttatcgccagggtggcttgggctcatttcacatccctgtccctgag
>rat
2401 ctgaagtgtggaacgcactggccgttctgtgcagtactaagtgtgacccttgggctttca
2461 atgtgcactcggttccgtatgattccaaagtagagccctagctggtcttcgaatctgcat
2521 gtacttcacgttttctatatttgtaacttcgcatgtatttgttttgtcatataaaaagtt
2581 tataaatgtttgctatctgactgacattaaatagaagctatgatgagcacgtgtgggggt
2641 ttttctccttcaatgctcctggccctgtgtttgtcacgagggtggcttgggctcatctga
>hamster
2221 atgcatccgaagtacgtaatgcactgaccatttcacccggtatcagatgttttctgtgtg
2281 gcccctagctttccttcaacatgcattcggttccatatatgaatccaaagtggaccccct
2341 aactggtctctgaaatctgcatgtacttcacattttctatatttgtaactttgcatgtcc
2401 ttgttttgtcatataaaaagtttataaatgtttgctatctgactgacattaaataggagc
2461 tatgatgagcacccctgcagggtttgttctctgttctctgcttctggcccttgtgtttgt
2521 tgccagggta acttgggctc acacaaggta ggtaatggct aatttcacat gccttcccct
>rodent
atccagtactaaatgcttaccgtgtgacccttgggctttcagcgtgcactcagttccgta
ggattccaaagTagacccctagctggtctttgaatctgcatgtacttcacgttttcta
tatttgtaactttgcatgtattttgttttgtcatataaaaagtttataaatgtttgctat
cTgactgacattaaatagaagctatgatgaacacctggcggggtttgttctctctcc
aatgctccgagtccactgtttatcgccagggtggcttgggctcatttcacatccc
tgtccctgaTgggcctcgggtcttacctctggtcctgtcttgtttccactggc
tttgcatTttcccctaagttGtacttagccctgctgaaacacaaaagcactcctggg
gaggaggggtggggagagga

CLUSTAL W (1.74) multiple sequence alignment


bovine          CACGTTTTGGCCAACCCAATACTGAATACTTAAAGGAA--ACTCTTCC------------ 46
sheep           -----------CAACCCAATACTGAATACTTAAAGGAA--ACTCTTCT------------ 35
mink            --------------CTAAATACTTAATATGTA--G-AA--ATCCTTTT------------ 29
human           -------------------------ATATGTG--GGAA--ACCCTTTT------------ 19
mouse           ------------------ATCC--AGT-----ACTAAATG------CT------TACC-- 21
rat             ------------------CTGA--AGTGTGGAACGCACTGGCCGTTCTGTGCAGTACTAA 40
hamster         --------------ATGCATCCGAAGTACGTAATGCACTGACCATTTCACCCGGTATCAG 46
                                          *         *                       

bovine          ---------GTGTTGTCCTTAGCCTT---ACAGCGTGCACTGAATAGTTTTGTATAAGA- 93
sheep           ---------GTGTTGTCCTTAGCCTT---ACAGTGTGCACTGAATAGTTTTGTATAAGA- 82
mink            ---------GCGTGGTCCTCAGGCTT---ACA-CGTGCACTGAATAGTTTTGTATGATAG 76
human           ---------GCGTGGTCCTTAGGCTT---ACAATGTGCACTGAATCGTTTCATGTAAGA- 66
mouse           ---------GTGTGACCCTTGGGCTT---TCAGCGTGCACTCA---GTTCCGTAG--GA- 63
rat             ---------GTGTGACCCTTGGGCTT---TCAATGTGCACTCG---GTTCCGTAT--GA- 82
hamster         ATGTTTTCTGTGTGGCCCCTAGCTTTCCTTCAACATGCATTCG---GTTCCATATATGA- 102
                         * **   **   *  **    **   **** *     ***   *     * 

bovine          ATCCAGAGTGA--------------TATTTGAAATACGCATGTGCTT-ATATTTTCTATA 138
sheep           ATCCAGAGTGA--------------TATTTGAAATACGCATGTGCTT-ATATTTTTTATA 127
mink            AGCCCATGTGG--------------TCTTCGAAATATGCATGTACTTTATATTTTCTATA 122
human           ATCCAAAGTGGACACCATTAACAGGTCTTTGAAATATGCATGTACTTTATATTTTCTATA 126
mouse           TTCCAAAGCAGACCCC--TAGCTGGTCTTTGAA-TCTGCATGTACTTCACGTTTTCTATA 120
rat             TTCCAAAGTAGAGCCC--TAGCTGGTCTTCGAA-TCTGCATGTACTTCACGTTTTCTATA 139
hamster         ATCCAAAGTGGACCCCC-TAACTGGTCTCTGAAATCTGCATGTACTTCACATTTTCTATA 161
                  **   *                 * *  *** *  ****** *** *  **** ****

bovine          TTTGTAACTTTGCATGTACTT-GTTTTGTGTT---AAAAGTTTATAAATATTTAATATCT 194
sheep           TTTGTAACTTTGCATGTACTT-GTTTTGTGTT---AAAAGTTTATAAATATTTAATATCT 183
mink            TTTGTAACTGGGCATGTACTT-GTATA--------AAAAATGTATAAACATTCGAACTCT 173
human           TTTGTAACTTTGCATGTTCTT-GTTTTGTTATATAAAAAAATTGTAAATGTTTAATATCT 185
mouse           TTTGTAACTTTGCATGTATTTTGTTTTGTCATATAAAAAGTTTATAAATGTTTGCTATCA 180
rat             TTTGTAACTTCGCATGTATTT-GTTTTGTCATATAAAAAGTTTATAAATGTTTGCTATCT 198
hamster         TTTGTAACTTTGCATGTCCTT-GTTTTGTCATATAAAAAGTTTATAAATGTTTGCTATCT 220
                *********  ******  ** ** *         ****   * ****  **     ** 

bovine          -GACTAAAATTAAACAGGAGCTAAAAGGAGG----------------------------- 224
sheep           -GACTAAAATTAAACAGGAGCTAAAAGGAGTATCTTC--CACGGAGTGTCTGGCTGTG-- 238
mink            TGACTAGAATTAAACAGGAACT----G-AGTGTGTCC--CATG---TGTTTG-CAGTGAC 222
human           -GACTGAAATTAAAC--GAGCGAAGATGAGCACCACCT-CCCG---TGTCTG-CAGTTGT 237
mouse           -GACTGACATTAAATAGAAGCTATGATGAACACC--TGGCGGG----GTTTG----TTCT 229
rat             -GACTGACATTAAATAGAAGCTATGATGAGCACG--TGTGGGG----GTTT-----TTCT 246
hamster         -GACTGACATTAAATAGGAGCTATGATGAGCACCCCTG-CAGG----GTTTG----TTCT 270
                 ****   ******    * *       *                               

bovine          ------------------------------------------------------------
sheep           ---TTCACCAGTGTGCACACT-ATGTTGGCAGCTTC-ATTTGGGGGGTTAATATGAGAAA 293
mink            A--TTCACCACCG--CACCCT-GTGTTGG------------------------------- 246
human           A--TTTCCTGGTGCTTGCCCT-GTGTTGGGGACT---GTTTTGGGGGTTAATCTGAGCCA 291
mouse           CTCTCCAATGCTCCGAGTCCA-CTGTTTATCGCCAGGGTGGCTTGGGCTCATTTCACATC 288
rat             C-CTTCAATGCTCCTGGCCCT-GTGTTTGTCACGAGGGTGGCTTGGGCTCAT-------- 296
hamster         CTGTTCTCTGCTTCTGGCCCTTGTGTTTGT------------------------------ 300
sheep U67922
cattle AB001468
mink S46825
human U29185
mouse U29187
... U29186
rat D50093
golden hamster M14054
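
The asterisk row beneath each CLUSTAL W block above marks columns where every aligned sequence carries the same base. As a minimal sketch (the function names and the toy rows are illustrative assumptions, not part of the original analysis), such a row and a simple conserved-column fraction could be recomputed from any aligned block like this:

```python
def conservation_line(rows):
    """Return '*' for columns identical in every row (gap columns never count), else ' '."""
    if len({len(r) for r in rows}) != 1:
        raise ValueError("aligned rows must all have the same length")
    marks = []
    for column in zip(*rows):
        bases = {b.upper() for b in column}
        marks.append("*" if len(bases) == 1 and "-" not in bases else " ")
    return "".join(marks)


def conserved_fraction(rows):
    """Fraction of alignment columns that are fully conserved."""
    line = conservation_line(rows)
    return line.count("*") / len(line)


# Toy block (hypothetical data): three aligned rows of six columns.
toy_block = ["ACGT-A", "ACCT-A", "ACGTTA"]
print(repr(conservation_line(toy_block)))
print(round(conserved_fraction(toy_block), 3))
```

Comparing this fraction between the terminal block and intermediate 3' UTR stretches is one rough way to put a number on "held up well over evolutionary time".
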
Beginning with the 8 partly redundant rodent sequences, it is best to reconstruct the common ancestor of rat, mouse, and hamster. Bovine sequence agrees with the Lee sequence at all points where the Goldmann sequences differ, suggesting strongly that the Goldmann sequences are erroneous. This gives:
>rodent
atccagtactaaatgcttaccgtgtgacccttgggctttcagcgtgcactcagttccgta
ggattccaaagTagacccctagctggtctttgaatctgcatgtacttcacgttttcta
tatttgtaactttgcatgtattttgttttgtcatataaaaagtttataaatgtttgctat
cTgactgacattaaatagaagctatgatgaacacctggcggggtttgttctctctcc
aatgctccgagtccactgtttatcgccagggtggcttgggctcatttcacatccc
tgtccctgaTgggcctcgggtcttacctctggtcctgtcttgtttccactggc
tttgcatTttcccctaagttGtacttagccctgctgaaacacaaaagcactcctggg
gaggaggggtggggagagga

Tissues where mouse and human prion ESTs are found

20 Aug 99 webmaster
Of the 204 Soares ESTs, 14 are from brain, 16 from mammary, 18 from embryo, and 60 from tumors; the rest come from uterus, lung, kidney, skin, T cell, colon, liver, myotubes, heart, melanocyte, spleen, and ovary.

(((((prion[All Fields] AND soares[All Fields]) NOT brain[All Fields]) NOT mammary[All Fields]) NOT embryo[All Fields]) NOT tumor[All Fields]) --

Human and mouse ESTs

16 Aug 99 webmaster
A great many of the 443 prion sequences at GenBank are actually expressed sequence tags, that is, products of RT-PCR using polyT primers on bulk mRNA in various tissues of mouse and human. These thus give information about the 3' UTR and possibly about tissue use of various alternative polyadenylation sites. The GenBank search term ((((((prion[All Fields] NOT NCI_CGAP[All Fields]) NOT Soares[All Fields]) NOT Sugano[All Fields]) NOT Stratagene[All Fields]) NOT patent[All Fields]) NOT primer[All Fields]) leaves 174 sequences of actual prion genes from various species.
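
The nested search term above is tedious to type by hand. A minimal sketch of building it programmatically follows; the helper name `exclude_terms` is hypothetical, and the code only assembles the query string, it does not contact GenBank.

```python
def exclude_terms(base, exclusions, field="All Fields"):
    """Wrap `base` in one parenthesised NOT clause per excluded word."""
    term = base
    for word in exclusions:
        term = f"({term} NOT {word}[{field}])"
    return term


# EST libraries and other non-gene entries to filter out, in the order used above.
est_sources = ["NCI_CGAP", "Soares", "Sugano", "Stratagene", "patent", "primer"]
query = exclude_terms("prion[All Fields]", est_sources)
print(query)
```

The same helper reproduces the tissue query in the previous section if the base term is "(prion[All Fields] AND soares[All Fields])" and the exclusions are brain, mammary, embryo, and tumor.
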

Blastn, set to human ests, with the 3'UTR of human prion + 100 bp past the mRNA as query, ie 26212 to 27817 to 27917, pulls up 168 ESTs, with somewhat ragged ends 3' but mostly corresponding to the expected main mRNA. However, there were ones that began farther upstream. Does this reliably mean that in some tissues, mRNA is made and polyadenylated at an earlier site?

The NCI_CGAP series has 49 sequences (all human), the Soares 203 (122 human, 81 mouse), Stratagene has 33 (27 mouse, 4 rat) and the Sugano 13 (all mouse), accounting for 265 of the prion entries. The entries give the primer used. Some of the more recent NCI_CGAP simply use oligo dT as primer and recover mRNA sequences of excellent quality. For example AI801189 has 381 bp in 100% agreement with the main human sequence, U29185. The EST covers 27387 to 27816 of that sequence (plus 1 additional A), ie, perfectly reflects the most common in vivo mRNA, of length 2587 bp, the so-called 2.5 kb mRNA, with probable polyA signal aaattaaacg agcgaagatg agcacc. 123 ESTs have similar terminations.
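
To make the signal-to-site relationship concrete, here is a minimal sketch that scans a 3' terminal fragment for polyA signal hexamers and reports how far each lies from the 3' end. The hexamer set (AATAAA plus the ATTAAA variant) and the function names are assumptions made for illustration; the 36-base test string is the end of the major human mRNA as printed on this page (27781 to 27816 of U29185).

```python
# Canonical hexamer plus the common ATTAAA variant; rarer variants are ignored here.
SIGNALS = ("AATAAA", "ATTAAA")


def find_polya_signals(seq, signals=SIGNALS):
    """Return sorted (0-based start, hexamer) pairs for every occurrence, overlaps included."""
    s = seq.upper()
    hits = []
    for sig in signals:
        start = s.find(sig)
        while start != -1:
            hits.append((start, sig))
            start = s.find(sig, start + 1)
    return sorted(hits)


def distance_to_end(seq, hit_start, k=6):
    """Number of bases between the end of the hexamer and the 3' end of seq."""
    return len(seq) - (hit_start + k)


# Last 36 bases of the major human prion mRNA, as shown in the human record above.
human_end = "tatctgactgaaattaaacgagcgaagatgagcacc"
for start, sig in find_polya_signals(human_end):
    print(sig, start, distance_to_end(human_end, start))
```

On this fragment the only hit is ATTAAA, with 18 bases between the signal and the last templated base, which sits inside the 10 to 30 base window usually expected between signal and cleavage site.
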

By including flanking sequence past the end of the main human mRNA, it can be seen that 111 EST sequences begin as expected and none continue on downstream (though minor ragged ends are seen). Thus the 2586 bp mRNA (called the 2.5k species) is the longest seen.

The NCI_CGAP series almost all start at the normal mRNA end. However, there were 3 unusual ones, one starting at 501, 1239 upstream, and 312 bp downstream of the usual end. The one shown runs in U29185 from 26325-26705, whereas the 3' UTR runs 26212-27817, ie, this mRNA is 1113 bases shorter than the full 1606 bp 3' UTR.

    26281 gcccttttag tggtggtgtc tcactctttc ttctctcttt gtcccggata ggctaatcaa
    26341 tacccttggc actgatgggc actggaaaac atagagtaga cctgagatgc tggtcaagcc
    26401 ccctttgatt gagttcatca tgagccgttg ctaatgccag gccagtaaaa gtataacagc
    26461 AAATAAccat tggttaatct ggacttattt ttggacttag tgcaacaggt tgaggctaaa
    26521 acaaatctca gaacagtctg aaataccttt gcctggatac ctctggctcc ttcagcagct
    26581 agagctcagt atactaatgc cctatcttag tagagatttc atagctattt agagatattt
    26641 tccattttaa gaaaacccga caacatttct gccaggtttg ttaggaggcc acatgatact
    26701 tattc  aaaaa aatcctagag
Query: 1598 ATCTTCGCTCGTTTAATTTCAGTCAGATATTAAACATTTACAATTTTTTTATATAACAAA 1539
Query: 1585 TAATTTCAGTCAGATATTAAACATTTACAATTTTTTTATATAACAAAACAAGAACATGCA 1526
Query: 1539 AACAAGAACATGCAAAGTTACAAATATAGAAAATATAAAGTACATGCATATTTCAAAGAC 1480
Query: 501  TTTTTTTGAATAAGTATCATGTGGCCTCCTAACAAACCTGGCAGAAATGTTGTCGGGTTT 442

>gb|AI354282.1|AI354282
Query: 1239 AACATTGCAGAAAAGTAATACATATCTGCTAGGTGACAATATCAAACAATTCAGGGAATA 1180

>gb|AI828378.1|AI828378
>gb|AA906777.1|AA906 Length = 292
Query: 1606 TGGTGCTCATCTTCGCTCGTTTAATTTCAGTCAGATATTAAACATTTACAATTTTTTTAT 1547
            TGGTGCTCATCTTCGCTCGTTTAATTTCAGTCAGATATTAAACATTTACAATTTTTTTAT
Sbjct: 312  TGGTGCTCATCTTCGCTCGTTTAATTTCAGTCAGATATTAAACATTTACAATTTTTTTAT 371

>gi|4094435
TTTTTTTTTTTTTTTTTTTTTTTTTTTGAATAAGTATCATGGGGCCTCCTAACAAACCTGGCAAAAATGT
TGTCGGGTTTTCTTAAAAGGGAAAATATCTTTAAATAGCTATGAAATCTCTACTAAAATAGGGCATTAGT
ATACTGAGCTCTAGCTGCTGAAGGAGCCAGAGGTATCCAGGCAAAGGTATTTCAAACTGTTCTGAGATTT
GTTTTAGCCTCAACCTGTTGCACTAAGTCCAAAAATAAGTCCAGATTAACCAATGGTTATTTGCTGTTAT
ACTTTTACTGGCCTGGCATTAGCAACGGCTCATGATGAACTCAATCAAAGGGGGCTTGACCAGCATCTCA
GGTCTACTCTATGTTTTCCAGGGCCCATCAGGGCCAAGGGTATTGATTAGCCTATCCG
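
The gi|4094435 EST above reads from the oligo-dT primer inward, so it is the reverse complement of the genomic strand shown at 26281-26705. A minimal sketch of flipping such an EST to mRNA sense before comparing it to genomic sequence follows; the function names are hypothetical, and stripping every leading T is a simplifying assumption that can also remove genuine terminal A residues of the transcript.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Reverse-complement a DNA string, preserving case."""
    return seq.translate(_COMPLEMENT)[::-1]


def est_as_mrna_sense(est):
    """Drop the leading T run left by oligo-dT priming, then flip to the sense strand."""
    return reverse_complement(est.upper().lstrip("T"))


# Start of the gi|4094435 read and the genomic bases just before the A run at 26706.
est_start = "TTTTTTTTTTTTTTTTTTTTTTTTTTTGAATAAGTATCATG"
genomic_end = "acatgatacttattc"

sense = est_as_mrna_sense(est_start)
print(sense, genomic_end.upper().endswith(sense))
```

The sense-strand tail of the EST matches the genomic bases ending at 26705, which is how the end point of this shorter transcript was placed.
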

Probing the mouse EST database with the 3' UTR of mouse mRNA turns up a number of sequences shorter than full length mRNA. For example, gi|4617056 matches U29186 genomic mouse prion from 29425 - 29950 whereas normal exon 3 UTR runs from 29455- 30687, ie, there must be a polyA site at 29950 and a polyA signal preceding this. This gives rise to a 3'UTR of 496 bp and a mRNA of 1415 bp. At least 10 other ESTs begin in the same region, often up to 50 bp shorter, from embryo, T cell, mammary, and myotubes but not mouse brain [AA56217984, AI11755268, AA64593576, AA16379486, AA9606664, AI6078895, AA26019379, AA47607827, W99102, AA72698185, AI89319821]


    29701 AAATAActgc tggctagttg gggctttgtt ttggtctagt gAATAAAtac tggtgtatcc
    29761 cctgacttgt acccagagta caaggtgaca gtgacacatg taacttagca taggcaaagg
    29821 gttctacaac caaagaagcc actgtttggg gatggcgccc tggaaaacag cctcccacct
    29881 gggatagcta gagcatccac acgtggaatt ctttctttac taacaaacga tagctgattg
    29941 aaggcaacag gaaaaaaaaa atcaaattgt 
>AI607889  Soares mouse mammary gland  
oligoT-ctgttgccttcaatcagctatcgtttgttagtaaagaaagaattccacgtgtggatgctctagctatc
ccaggtgggaggctgttttccagggcgccatccccaaacagtggcttctttggttgtagaaccctttgcc
tatgctaagttacatgtgtcactgtcaccttgtactctgggtacaagtcaggggatacaccagtatttat
tcactagaccaaaacaaagccccaactagccagcagttatttggtgttatattcttattggcccggtgtt
agcactggctgatgacagactccatcaaagggacctgaagcaaagagcaactggtctactgtacatttcc
cagggcccatcagtgccaggggtattagcctatgggggacacagagaagcaagaatgagaaccacctcaa
ttgaaagagctacaggtggataacccctcccccagcctagaccacgagaatgcgaaggaacaagcaggaa
agcctccctcatcccacgatcagangatgaggaaagga
A later site corresponding to 29791-30434 (ie, 253bp less than full mRNA) is represented by several sequences as well:
 
    30181 tagcttctgc cctatgtttc tgtacttcta tttgaactgg ataacagaga gacaatctaa
    30241 acattctctt aggctgcaga taagagaagt aggctccatt ccaaagtggg aaagaaattc
    30301 tgctagcatt gtttaaatca ggcaaaattt gttcctgaag ttgcttttta ccccagcaga
    30361 cataaactgc gatagcttca gcttgcactg tggattttct gtatagAATA TATAAAacat
    30421 aacttcaagc ttat  gtcttc tttttaaaac atctgaagta tgggacgccc tggccgttcc

>AA522175.1 Soares mouse mammary 
GTGACACATGTAACTTAGCATAGGCAAAGGGTTCTACAACCAAAGAAGCCACTGTTTGGGGATGGCGCCC
TGGAAAACAGCCTCCCACCTGGGATAGCTAGAGCATCCACACGTGGAATTCTTTCTTTACTAACAAACGA
TAGCTGATTGAAGGCAACAGGAAAAAAAAAATCAAATTGTCCTACTGACGTTGAAAGCAAACCTTTGTTC
ATTCCCAGGGCACTAGAATGATCTTTAGCCTTGCTTGGATTGAACTAGGAGATCTTGACTCTGAGGAGAG
CCAGCCCTGTAAAAAGCTTGGTCCTCCTGTGACGCGAGGGATGGTTAAGGTACAAAGGCTAGAAACTTGA
GTTTCTTCATTTCTGTCTCACAATTATCAAAAGCTAGAATTAGCTTCTGCCCTATGTTTCTGTACTTCTA
TTTGAACTGGATAACAGAGAGACAATCTAAACATTCTCTTAGGCTGCAGATAAGAGAAGTAGGCTCCATT
CCAAAGTGGGAAAGAAATTCTGCTAGCATTGTTTAAATCAGGCAAAATTTGTTCCTGAAGTTGCTTTTTA
CCCCAGCAGACATAAATTGCGATACGTCAGCTTGCACTGTGGATTTTCTGTATAGAATATATAAAACATA
ACTTCAAGCTTAT
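
The lengths quoted for the two shorter mouse transcripts follow from simple inclusive coordinate arithmetic on U29186. A minimal sketch, using only the coordinates given above (the helper name `span` is an illustrative choice):

```python
def span(start, end):
    """Inclusive length of a GenBank-style coordinate range."""
    return end - start + 1


# Exon 3 UTR limits in U29186, as quoted in the text.
UTR_START, UTR_END = 29455, 30687
full_utr = span(UTR_START, UTR_END)

# The two alternative polyA sites discussed above.
for site in (29950, 30434):
    utr = span(UTR_START, site)
    print(site, utr, full_utr - utr)
```

This reproduces the 496 bp 3' UTR for the site at 29950 and the 253 bp shortfall for the site at 30434.
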

Alignment of sheep and cow 3' UTR

7 Aug 99 webmaster 
The Blast server can be used to provide an excellent alignment. Below Lee's sheep sequence (with retrotransposons masked out) was aligned against other ruminant prion sequences. After small adjustments by hand near indels, outgroup arbitration was used to construct the ancestral ruminant 3' UTR. (This parsimony method, at positions where sheep sequences differ, lets the bovine sequence rule. This can resolve some point mutations and indels, as well as sequencing errors and other noise. Here 22 improvements can be made in the ruminant sequence; these are shown in caps. Use of mink 3' UTR can refine this further.)
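
The outgroup arbitration described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the procedure actually run for this page: the function name `arbitrate` is hypothetical, the sequences must already be aligned to equal length, and a column is marked "n" when the outgroup agrees with none of the ingroup variants.

```python
def arbitrate(ingroup, outgroup):
    """Column-wise ancestral call from aligned ingroup sequences and one outgroup.

    Agreed columns are copied in lower case. Where the ingroup disagrees, the
    outgroup base rules if it matches one of the ingroup variants and is written
    in caps; otherwise the column is left unresolved as 'n'.
    """
    if any(len(s) != len(outgroup) for s in ingroup):
        raise ValueError("all sequences must be aligned to the same length")
    calls = []
    for i, ref in enumerate(outgroup):
        column = {s[i].lower() for s in ingroup}
        if len(column) == 1:
            calls.append(column.pop())
        elif ref.lower() in column:
            calls.append(ref.upper())
        else:
            calls.append("n")
    return "".join(calls)


# Toy data: three "sheep" rows and one "bovine" outgroup row (hypothetical).
sheep_rows = ["acgt-a", "acct-a", "acgtta"]
bovine_row = "acctta"
print(arbitrate(sheep_rows, bovine_row))
```

Positions resolved by the outgroup come out in caps, matching the convention used for the 22 improvements in the ruminant sequence below.
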

In Figure 3 of Goldmann et al 1999, clear homology is established about the 2.1k polyA signal at 680 3' UTR for cow and sheep. Indeed, 84 consecutive nucleotides are identical, mostly distal of the ATTAAA signal. The signal is found 73 bp upstream of the first retrotransposon. This signal region was not tested for longer range homology in, say, mink or human. Blastn of the 84 nucleotides pulls up mink and human prion but not rodent, with the first AATTAA putative signal conserved in mink and human (27052 region homology). [Probe: gtgatattcctttctttagtaacataaagtatagatAATTAAggtaccttAATTAAactaccttctagacactgagagcaaatctgttgtttatctggaacccaggatgattttgacattgct]

  
QUERY     1     gtgatattcctttctttagtaacataaagtatagatAATTAAggtacctt---AATT--- 54  AAacta
M31313    1472  ..................................................---....--- 1525
M31313    1526                                     ...............---....--- 1508
AJ223072  1400  ..................................................---....--- 1453
AJ223072  1454                                     ...............---....--- 1436
U67922    23678 ..................................................---....--- 23731
U67922    23732                                    ...............---....--- 23714
AB001468  1591      ....................--.................t......---....taa 1641
D10612    1591      ....................--.................t......---....taa 1641
D38179    2147  ................................................             2194
S46825    1614       ............c.....c....c...............c.-g..aaa....--- 1664   mink

QUERY     6     attcctttctttagtaacataaagtatagatAATTAAggtaccttAATTAAacta--ccttc 65   sheep
U29185    27052                    taaactataggtAATTAAggcagctgaaaagtaaattgccttc     human

QUERY     66    tagacactgagag-caaatctg--ttgt----ttatctggaacccaggatgattttgaca 118
U29185    27057 ..........-..g.......cct....ccat...c......a....a............ 27115

..........27001 gcattccttt ctttaaacta taggtAATTA Aggcagctga aaagtaaatt gccttctaga non-aligning human
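
Claims such as "84 consecutive nucleotides are identical" can be checked directly on a pairwise alignment. A minimal sketch, with a hypothetical function name and toy input, that finds the longest run of identical ungapped columns in two aligned strings:

```python
def longest_identical_run(a, b):
    """Return (length, 0-based start) of the longest stretch of matching, ungapped columns."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    best_len = best_start = run = 0
    for i, (x, y) in enumerate(zip(a.lower(), b.lower())):
        if x == y and x != "-":
            run += 1
            if run > best_len:
                best_len, best_start = run, i - run + 1
        else:
            run = 0
    return best_len, best_start


# Toy aligned pair (hypothetical data): one mismatch at column 3.
print(longest_identical_run("acgtacgt", "acgaacgt"))
```

Run on the aligned cow and sheep rows around the 2.1k signal, the returned start position also shows whether the identical stretch lies mostly distal of ATTAAA, as stated above.
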

Ancestral ruminant 3' UTR sequence (unresolvable alternative bases shown above)

....................................................a...................a..
RUMIN    1     gggcaaccttcctgttttcattatcttcttaatctttgccaggttgggggagggagtgtc 60
 
...........................c.....................g............t...........g
RUMIN    61    tacctgcagccctgtagtggtggtgtctcatttcttgcttctctctt-gttacctgtata 119
 
...............c..................t...............-g......................
RUMIN    120   ataatacccttggcgcttacagcactgggaaatgacaagcagacatgagatgctGttta 178     1

.........................g............g......c.............................
RUMIN    179   ttcaagtcccattagctcagtattctaatgtcccatcttagcagtgattttgtagcaatt 238

.....................................g.................t...................
RUMIN    239   ttctcatttgtttcaagaacacctgactacatttccctttgggaatagcatttctgccaa 298

........................................................c.-................
RUMIN    299   gtctggaaggaggccacataatattcattcaaaaaaacaaaactggaaatccttagttca 358

................................tg..........t..............................
RUMIN    359   tagacccagggtccaccctgttgagagcatgtgtcctgtgtctgcagagaactataaag 417

...........................................................g...............
RUMIN    418   gatattctgcattttgcaggttacatttgcaggtaacacagccatctattgcatcaagaa 477

.............................................t.............................
RUMIN    478   tggatattcatgcaacctttgacttatgggcagaggacatTttcacaaggaatgaacata 537    1

.......................................................g...................
RUMIN    538   atacGaaaggcttctgagactaaaaaattccaacatatggaagaggtgcccttggtggca 597    1

...................................................g...................--..
RUMIN    598   gccttccattttgtatgtttAaagcaccttcaagtgatattcctttctttagtaacataa 656     1

...............................t....ttaat..................................
RUMIN    657   agtatagataattaaggtacc-----ttaATTAAActaccttctagacactgagagcaaa 711  2.1k site

.................................................................-.........
RUMIN    712   tctgttgtttatctggaacccaggatgattttgacattgctcccagattttaacatagag 771

.........................t...................................t.............
RUMIN    772   aatgcagatacaaaaActcCatattcatttgattgaatcttttcctgaaccagtgctagt 831    2

..............................t.......a..a................................a
RUMIN    832   gttggactggtaagAgtataacagcatatataggttatgtgatgaagagaAtagtgtac- 890     2

...............aagaa.a.........................a..--...............c.......
RUMIN    891   -----atgaaatatgtgcatttctttattgctgtcttataattgtcaaaaaagaaaAtta 945    1

.....................................a..tt.........c...................tatc
RUMIN    946   ggtccttggtttctgtaaaattgac--ttgaatcaaaagggaggcatttaaagaaa---- 999

...........................ca.-............a..........g.............t.....-
RUMIN    1000  -taaattagaga-tgatagaaatctgatccattcagagtagaaaaagaaattccattact 1057

................a............................gg............................
RUMIN    1058  g-ttattTaagaaggtaaaattattTcctgaattgttcaatattgtcacctagcagatag 1116  2

.......................g.....c......a.....g........t....................at..
RUMIN    1117  acacTATtattctgtactgtttttactagcttgcaccttgtggtatcctatgtaaaa--a 1174   3i

....................a............g....aagaa............................t.t.
RUMIN    1175  cAtat-ttgcatatgacaaactt-----tttctgttagagcaattaacatctgaaccacc 1228  1

...........................................................................
RUMIN    1229  taatgcattacctgtttttgtaaggtactttttgtaaggtactaagaacaataaggacaa 1288

...............................................c.....................c....
RUMIN    1289  taaatgtactgaatActTaaaggaaactcttctgtgtTgtCCTtAgccttacagtgtgc 1347 7d

...........................................................................
RUMIN    1348  actgaAtagtttTgtataagaatccagagtgatatttgaaatacgcatgtgcttatattt 1407  2d

................c..........................................................
RUMIN    1408  tttatatttgtaactttgcatgtacttgttttgtgttaaaagtttataaatatttaatat 1467

...........................................................................
RUMIN    1468  ctgactaaaattaa 1481

best corrected reduced sheep sequence
gggcaaccttcctgttttcattatcttcttaatctttgccaggttgggggagggagtgtc
tacctgcagccctgtagtggtggtgtctcatttcttgcttctctctt-gttacctgtata
ataatacccttggcgcttacagcactgggaaatgacaagcagacatgagatgctGttta
ttcaagtcccattagctcagtattctaatgtcccatcttagcagtgattttgtagcaatt
ttctcatttgtttcaagaacacctgactacatttccctttgggaatagcatttctgccaa
gtctggaaggaggccacataatattcattcaaaaaaacaaaactggaaatccttagttca
tagacccagggtccaccctgttgagagcatgtgtcctgtgtctgcagagaactataaag
gatattctgcattttgcaggttacatttgcaggtaacacagccatctattgcatcaagaa
tggatattcatgcaacctttgacttatgggcagaggacatTttcacaaggaatgaacata
atacGaaaggcttctgagactaaaaaattccaacatatggaagaggtgcccttggtggca
gccttccattttgtatgtttAaagcaccttcaagtgatattcctttctttagtaacataa
agtatagataattaaggtacc-----ttaattaaactaccttctagacactgagagcaaa
tctgttgtttatctggaacccaggatgattttgacattgctcccagattttaacatagag
aatgcagatacaaaaActcCatattcatttgattgaatcttttcctgaaccagtgctagt
gttggactggtaagAgtataacagcatatataggttatgtgatgaagagaAtagtgtac-
-----atgaaatatgtgcatttctttattgctgtcttataattgtcaaaaaagaaaAtta
ggtccttggtttctgtaaaattgac--ttgaatcaaaagggaggcatttaaagaaa----
-taaattagaga-tgatagaaatctgatccattcagagtagaaaaagaaattccattact
g-ttattTaagaaggtaaaattattTcctgaattgttcaatattgtcacctagcagatag
acacTATtattctgtactgtttttactagcttgcaccttgtggtatcctatgtaaaa--a
cAtat-ttgcatatgacaaactt-----tttctgttagagcaattaacatctgaaccacc
taatgcattacctgtttttgtaaggtactttttgtaaggtactaagaacaataaggacaa
taaatgtactgaatActTaaaggaaactcttctgtgtTgtCCTtAgccttacagtgtgc
actgaAtagtttTgtataagaatccagagtgatatttgaaatacgcatgtgcttatattt
tttatatttgtaactttgcatgtacttgttttgtgttaaaagtttataaatatttaatat
ctgactaaaattaa
best corrected reduced sheep sequence (gaps removed)
gggcaaccttcctgttttcattatcttcttaatctttgccaggttgggggagggagtgtc
tacctgcagccctgtagtggtggtgtctcatttcttgcttctctcttgttacctgtata
ataatacccttggcgcttacagcactgggaaatgacaagcagacatgagatgctGttta
ttcaagtcccattagctcagtattctaatgtcccatcttagcagtgattttgtagcaatt
ttctcatttgtttcaagaacacctgactacatttccctttgggaatagcatttctgccaa
gtctggaaggaggccacataatattcattcaaaaaaacaaaactggaaatccttagttca
tagacccagggtccaccctgttgagagcatgtgtcctgtgtctgcagagaactataaag
gatattctgcattttgcaggttacatttgcaggtaacacagccatctattgcatcaagaa
tggatattcatgcaacctttgacttatgggcagaggacatTttcacaaggaatgaacata
atacGaaaggcttctgagactaaaaaattccaacatatggaagaggtgcccttggtggca
gccttccattttgtatgtttAaagcaccttcaagtgatattcctttctttagtaacataa
agtatagataattaaggtaccttaattaaactaccttctagacactgagagcaaa
tctgttgtttatctggaacccaggatgattttgacattgctcccagattttaacatagag
aatgcagatacaaaaActcCatattcatttgattgaatcttttcctgaaccagtgctagt
gttggactggtaagAgtataacagcatatataggttatgtgatgaagagaAtagtgtac
atgaaatatgtgcatttctttattgctgtcttataattgtcaaaaaagaaaAtta
ggtccttggtttctgtaaaattgacttgaatcaaaagggaggcatttaaagaaa
taaattagagatgatagaaatctgatccattcagagtagaaaaagaaattccattact
gttattTaagaaggtaaaattattTcctgaattgttcaatattgtcacctagcagatag
acacTATtattctgtactgtttttactagcttgcaccttgtggtatcctatgtaaaaa
cAtatttgcatatgacaaactttttctgttagagcaattaacatctgaaccacc
taatgcattacctgtttttgtaaggtactttttgtaaggtactaagaacaataaggacaa
taaatgtactgaatActTaaaggaaactcttctgtgtTgtCCTtAgccttacagtgtgc
actgaAtagtttTgtataagaatccagagtgatatttgaaatacgcatgtgcttatattt
tttatatttgtaactttgcatgtacttgttttgtgttaaaagtttataaatatttaatat
ctgactaaaattaa
  
gb|U67922|OAPRP  Ovis aries prion protein gene, complete cds     1491  0.0
emb|AJ223072|OAPRION  Ovis aries PrP gene, complete cds          1477  0.0
gb|M31313|SHPPRP  Ovis aries prion protein (PrP) gene, compl...  1465  0.0
dbj|D38179|SHPPRPA  Sheep gene for prion protein PrP, comple...  1342  0.0
dbj|D10cow|BOVPRP1  Bovine mRNA for prion protein                1271  0.0
 
QUERY    1     gggcaaccttcctgttttcattatcttcttaatctttgccaggttgggggagggagtgtc 60
U67922   23049 ............................................................ 23108
AJ223072 772   ............................................................ 831
M31313   843   ............................................................ 902
D38179   1518  ............................................................ 1577
D10cow   957   .....................................a...................a.. 1016
 

QUERY    61    tacctgcagccctgtagtggtggtgtctcatttcttgcttctctctt-gttacctgtata 119
U67922   23109 ...............................................-............ 23167
AJ223072 832   ...............................................-............ 890
M31313   903   ...............................................-............ 961
D38179   1578  ...............................................-............ 1636
D10cow   1017  ............c.....................g............t...........g 1076
 

QUERY    120   ataatacccttggcgcttacagcactgggaaatgaca-agcagacatgagatgctgttta 178
U67922   23168 .....................................-...................... 23226
AJ223072 891   .....................................-...................... 949
M31313   962   .....................................-.................a.... 1020
D38179   1637  .....................................-...................... 1695
D10cow   1077  c..................t...............-.g...................... 1135

QUERY    179   ttcaagtcccattagctcagtattctaatgtcccatcttagcagtgattttgtagcaatt 238
U67922   23227 ............................................................ 23286
AJ223072 950   ............................................................ 1009
M31313   1021  ............................................................ 1080
D38179   1696  ............................................................ 1755
D10cow   1136  ..........g............g......c............................. 1195

QUERY    239   ttctcatttgtttcaagaacacctgactacatttccctttgggaatagcatttctgccaa 298
U67922   23287 ............................................................ 23346
AJ223072 1010  ............................................................ 1069
M31313   1081  ............................................................ 1140
D38179   1756  ............................................................ 1815
D10cow   1196  ......................g.................t................... 1255

QUERY    299   gtctggaaggaggccacataatattcattcaaaaaaacaaaactggaaatccttagttca 358
U67922   23347 ............................................................ 23406
AJ223072 1070  ............................................................ 1129
M31313   1141  ............................................................ 1200
D38179   1816  ............................................................ 1875
D10cow   1256  .........................................c.-................ 1314

QUERY    359   tagacccagggtccaccct-gttgagagcatgtgtcctgtgtctgcagagaactataaag 417
U67922   23407 ...................-........................................ 23465
AJ223072 1130  ...................-........................................ 1188
M31313   1201  ...................-........................................ 1259
D38179   1876  ...................-........................................ 1934
D10cow   1315  ...............-...g.........t.............................. 1373

QUERY    418   gatattctgcattttgcaggttacatttgcaggtaacacagccatctattgcatcaagaa 477
U67922   23466 ............................................................ 23525
AJ223072 1189  ............................................................ 1248
M31313   1260  ............................................................ 1319
D38179   1935  ............................................................ 1994
D10cow   1374  ............................................g............... 1433

QUERY    478   tggatattcatgcaacctttgacttatgggcagaggacattttcacaaggaatgaacata 537
U67922   23526 ............................................................ 23585
AJ223072 1249  ........................................c................... 1308
M31313   1320  ........................................c................... 1379
D38179   1995  ............................................................ 2054
D10cow   1434  ..............................t............................. 1493

QUERY    538   atacgaaaggcttctgagactaaaaaattccaacatatggaagaggtgcccttggtggca 597
U67922   23586 ............................................................ 23645
AJ223072 1309  ....-....................................................... 1367
M31313   1380  ....-....................................................... 1438
D38179   2055  ............................................................ 2114
D10cow   1494  ........................................g................... 1553

QUERY    598   gccttccattttgtatgttt-aagcaccttcaagtgatattcctttctttagtaacataa 656
U67922   23646 ....................-....................................... 23704
AJ223072 1368  ....................-....................................... 1426
M31313   1439  ....................a....................................... 1498
D38179   2115  ....................-....................................... 2173
D10cow   1554  ....................a...............g...................--.. 1611

QUERY    657   agtatagataattaaggtacc-----ttaattaaactaccttctagacactgagagcaaa 711
U67922   23705 .....................-----.................................. 23759
AJ223072 1427  .....................-----.................................. 1481
M31313   1499  .....................-----.................................. 1553
D38179   2174  .....................                                        2194
D10cow   1612  ................t....ttaat.................................. 1671

QUERY    712   tctgttgtttatctggaacccaggatgattttgacattgctcccagattttaacatagag 771
U67922   23760 .........................................                    23800
U67922   24188                                          ................... 24206
AJ223072 1482  .........................................                    1522
AJ223072 1909                                           .........-......... 1926
M31313   1554  .........................................                    1594
M31313   1981                                           .........-......... 1998
D10cow   1672  .......................................                      1710

QUERY    772   aatgcagatacaaaaactccatattcatttgattgaatcttttcctgaaccagtgctagt 831
U67922   24207 ............................................................ 24266
AJ223072 1927  ...............-...-........................................ 1984
M31313   1999  ...............-...-........................................ 2056
AB001cow 2113  ..........t...................................t............. 2172

QUERY    832   gttggactggtaagagtataacagcatatataggttatgtgatgaagagaatagtgtac- 890
U67922   24267 ...........................................................- 24325
AJ223072 1985  ..............g...................................-........- 2042
M31313   2057  ..............g...................................-........- 2114
AB001cow 2173  ...............t.......a..a................................a 2232

QUERY    891   -----atgaaatatgtgcatttctttattgctgtcttataattgtcaaaaaagaaagtta 945
U67922   24326 -----....................................................... 24380
AJ223072 2043  -----...................................................a... 2097
M31313   2115  -----...................................................a... 2169
AB001cow 2233  aagaa.a.........................a..--...............c...a... 2290

QUERY    946   ggtccttggtttctgtaaaattgac--ttgaatcaaaagggaggcatttaaagaaa---- 999
U67922   24381 .........................--.............................---- 24434
AJ223072 2098  .........................--.............................---- 2151
M31313   2170  .........................--.............................---- 2223
AB001cow 2291  ......................a..tt.........c...................tatc 2350

QUERY    1000  -taaattagaga-tgatagaaatctgatccattcagagtagaaaaagaaattccattact 1057
U67922   24435 -...........-............................................... 24492
AJ223072 2152  -...........-............................................... 2209
M31313   2224  -...........-............................................... 2281
AB001cow 2351  t...........ca.-............a..........g.............t.....- 2408

QUERY    1058  g-ttatttaagaaggtaaaattatttcctgaattgttcaatattgtcacctagcagatag 1116
U67922   24493 .-.......................................................... 24551
AJ223072 2210  .-.....a.................c.................................. 2268
M31313   2282  .-.....a.................c.................................. 2340
AB001cow 2409  .a............................gg............................ 2468

QUERY    1117  acactattattctgtactgtttttactagcttgcaccttgtggtatcctatgtaaaa--a 1174
U67922   24552 .........................................................--. 24609
AJ223072 2269  ....---..................................................--. 2323
M31313   2341  ....---..................................................--. 2395
AB001cow 2469  ........g.....c......a.....g........t....................at. 2528

QUERY    1175  cgtat-ttgcatatgacaaactt-----tttctgttagagcaattaacatctgaaccacc 1228
U67922   24610 .....-.................-----................................ 24663
AJ223072 2324  .a...-.................-----................................ 2377
M31313   2396  .a...-.................-----................................ 2449
AB001cow 2529  .a...a............g....aagaa............................t.t. 2588

QUERY    1229  taatgcattacctgtttttgtaaggtactttttgtaaggtactaagaacaataaggacaa 1288
U67922   24664 ..............................................               24709
AJ223072 2378  ..............................................               2423
M31313   2450  ..............................................               2495
AB001cow 2589  ..............................................               2634

QUERY    1289  taaatgtactgaatacttaaaggaaactcttctgtgttgtccttag-ccttacagtgtgc 1347
U67922   26109       ........................................-............. 26161
AJ223072 3819        ........-..-...................-..---.-.t............. 3865
M31313   3889        ........-..-...................-..---.-.t............. 3935
AB001cow 4040        ..........................c.............-........c.... 4092

QUERY    1348  actgaatagttttgtataagaatccagagtgatatttgaaatacgcatgtgcttatattt 1407
U67922   26162 ............................................................ 26221
AJ223072 3866  .....-......-............................................... 3923
M31313   3936  .....-......-............................................... 3993
AB001cow 4093  ............................................................ 4152

QUERY    1408  tttatatttgtaactttgcatgtacttgttttgtgttaaaagtttataaatatttaatat 1467
U67922   26222 ............................................................ 26281
AJ223072 3924  ............................................................ 3983
M31313   3994  ............................................................ 4053
AB001cow 4153  .c.......................................................... 4212

QUERY    1468  ctgactaaaattaa 1481
U67922   26282 .............. 26295
AJ223072 3984  .............. 3997
M31313   4054  .............. 4067
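
In the alignment above, each subject row is written relative to the QUERY row: a dot means the same base as the query, a letter is a substitution, and a dash is a gap. A minimal Python sketch of how such a row could be expanded back into sequence and its differences counted is given here. The function names are illustrative only, and the example row is the first cow line (query positions 1-60).

```python
def expand_row(query: str, row: str) -> str:
    """Replace each '.' in an identity-dot row with the query base in that column."""
    if len(query) != len(row):
        raise ValueError("query and row must cover the same alignment columns")
    return "".join(q if r == "." else r for q, r in zip(query, row))


def count_differences(query: str, row: str) -> int:
    """Count columns where the row is aligned (not blank) and differs from the query."""
    expanded = expand_row(query, row)
    return sum(1 for q, e in zip(query, expanded) if e != " " and e != q)


# First 60 columns of the alignment: the QUERY row and the cow row.
query = "gggcaaccttcctgttttcattatcttcttaatctttgccaggttgggggagggagtgtc"
cow_row = "." * 37 + "a" + "." * 19 + "a" + ".."

cow_seq = expand_row(query, cow_row)
print(cow_seq)
print(count_differences(query, cow_row))
```

A dash present in both rows is not counted, since it marks a column where both sequences are gapped relative to some third sequence.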

Prion gene 3' UTR resources at GenBank


What 3' prion UTR sequences are known and what features do they have?  The best starting point is the set of three long prion sequences determined by Inyoul Lee [Genome Res. 8 (10), 1022-1037 (1998)], together with the sequences obtainable from them through Blast services:

sheep U67922
... AJ223072
... M31313
... D38179
goat Z71825 mRNA EST
cattle AB001468
... D10612
mink S46825
human U29185
mouse U29187
rat D50093
... M20313
golden hamster M14054
... M37381
... K02234

Other ruminant sequences for this region extracted by Blastn include 2 full-length sheep sequences and 1 full-length bovine sequence, all of which contain a slight mismatch about 220 bp before the terminus, suggesting that the Lee sequence has a slight glitch (or reflects a breed difference). The overall alignment shows that the retrotransposon insertions occurred prior to the divergence of sheep and cattle.

None of these sequences has internal repeats of any significant length. Blastn with sheep against human pulls up only the prion gene but, oddly, only a poor alignment from position 1400 on, matching the human terminus at 27,694 to 27,797. Using reduced sheep improves this slightly, aligning poorly over the 550 terminal base pairs. Lowering gap penalties in the advanced Blastn options to -G2 -E1 is appropriate given that small deletions and insertions are about as common as point mutations elsewhere in non-coding regions of this gene. Another good Blast trick is to use the 'flat master-slave with identities' view and adjust the expect value (e equal to any convenient number in custom settings) to eliminate undesirable sequences from the alignment.
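
To see why lowering the gap penalties helps when small insertions and deletions are common, here is a minimal Python sketch. It assumes the usual affine model in which one gap of length k costs G + k*E. The stricter comparison values (G=5, E=2) are chosen purely for illustration and are not taken from the text.

```python
def gap_cost(length: int, gap_open: int, gap_extend: int) -> int:
    """Affine cost of a single gap of `length` bases: open + length * extend."""
    if length < 1:
        raise ValueError("a gap must span at least one base")
    return gap_open + gap_extend * length


# Cost of a 1-, 2- and 3-base indel under the relaxed settings from the text
# (-G2 -E1) and under an assumed stricter pair (G=5, E=2).
for k in (1, 2, 3):
    relaxed = gap_cost(k, gap_open=2, gap_extend=1)
    strict = gap_cost(k, gap_open=5, gap_extend=2)
    print(k, relaxed, strict)
```

Under these assumptions a 3-base indel costs 5 with the relaxed settings and 11 with the stricter ones, so an alignment crossing several small indels is far less likely to be broken off.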

These settings very much improve the alignment. With the statistical significance threshold for reporting matches set at 0.001, the best alignment extends from 506-1481 of reduced sheep to 26867-27797 of human; with the threshold set at 0.01, the alignment is full-length (with 4 areas of mismatch).

Sheep sequence, either full or reduced, with advanced gapping or not, recovers 9 rodent prion sequences with a very poor but extensive alignment, except for about 60 bp terminally. Thus it will be possible to align sheep and cattle along their entire length, with some help distally from human as an arbitrating outgroup, to determine variable positions and an ancestral sequence. Again, a deer or elk sequence would be a great help.

A dozen rodent 3' UTR sequences have been determined, 9 of which are full length. Mouse, rat, and golden hamster are available in multiple entries, allowing good reconstruction of an ancestral rodent 3' UTR. These align fairly well with human over the first 250 bp and last 108 bp, but poorly (apart from overall length) in between. The 7 mouse sequences differ from each other at only 5 scattered point mutations. However, mouse and rat differ at 130 sites plus 37 small deletions out of 1234 bp considered, or about 13.5%, a rather rapid rate of evolution over the 12 million years since divergence. Mouse and hamster differ at 143 point changes and 44 deletions.
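
The mouse-rat figure can be checked with a few lines of Python. This is only a sketch of the arithmetic, counting each small deletion as one difference, which appears to be how the 13.5% was reached. The function name is illustrative.

```python
def percent_divergence(point_changes: int, indels: int, aligned_length: int) -> float:
    """Differences (substitutions plus indel events) as a percentage of aligned length."""
    return 100.0 * (point_changes + indels) / aligned_length


# Mouse vs rat: 130 sites plus 37 small deletions out of 1234 bp considered.
mouse_rat = percent_divergence(130, 37, 1234)
print(f"{mouse_rat:.1f}%")

# Mouse vs hamster: only the raw counts are given, so just total them.
mouse_hamster_differences = 143 + 44
print(mouse_hamster_differences)
```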

Exon 3 (post stop codon) is given as:

sheep Lee sequence 23,049-26,295 = 3,247 bp, of which 1,766 bp is found in 3 retrotransposons, leaving a residual length of 1,481 bp comparable to other mammals. See also the Goldmann sequence AJ223072, M31313 [with 82 differences, probably mostly sequence error], and D38179.

     repeat_region   23,801..24,187  Bov-B    =   387 bp [alt: BOV2; LINE-like element]
     repeat_region   24,709..24,867  Bov-tA3  =   159 bp [alt: 24,708..24,872 BOVTA; art-2 SINE]
     repeat_region   24,889..26,108  OaMAR1   = 1,220 bp [alt: none, off Medline, Lee entry only, not Jurka]
     poly A sites predicted at 998 and 1287 (numbering of the reduced sequence), type AATAAA
..................gg gcaaccttcc tgttttcatt atcttcttaa tctttgccag gttgggggag
    23101 ggagtgtcta cctgcagccc tgtagtggtg gtgtctcatt tcttgcttct ctcttgttac
    23161 ctgtataata atacccttgg cgcttacagc actgggaaat gacaagcaga catgagatgc
    23221 tgtttattca agtcccatta gctcagtatt ctaatgtccc atcttagcag tgattttgta
    23281 gcaattttct catttgtttc aagaacacct gactacattt ccctttggga atagcatttc
    23341 tgccaagtct ggaaggaggc cacataatat tcattcaaaa aaacaaaact ggaaatcctt
    23401 agttcataga cccagggtcc accctgttga gagcatgtgt cctgtgtctg cagagaacta
    23461 taaaggatat tctgcatttt gcaggttaca tttgcaggta acacagccat ctattgcatc
    23521 aagaatggat attcatgcaa cctttgactt atgggcagag gacattttca caaggaatga
    23581 acataatacg aaaggcttct gagactaaaa aattccaaca tatggaagag gtgcccttgg
    23641 tggcagcctt ccattttgta tgtttaagca ccttcaagtg atattccttt ctttagtaac
    23701 ataaagtata gataattaag gtaccttAAT TAAActacct tctagacact gagagcaaat  signal for 2.1k polyA
    23761 ctgttgttta tctggaaccc aggatgattt tgacattgct tagggatgtg agagttggac
    23821 tgtaaagaaa gctgagtgct gaagagttga tgcttttgaa ctatagtgtt ggagaaaact
    23881 cttgagagtc ccttggactg aaaggagatc agtcctgaat attcattgga aggactgatg
    23941 ctgaagctga aactccaata ctttggtcac ctgatgggaa gaactgaagg caggagggat
    24001 gctaggaaag actgaaggca ggaggagaag gggacgacag aggatgagat ggctagatgg
    24061 catcatggac tcaatggaca tgagcttaag taaactccag gagttggcga tggacaggga
    24121 gacctggcgt cctgcagtcc atggtgtcgc agagtcggac acgattgagt gactaaattg
    24181 aggtgaaccc agattttaac atagagaatg cagatacaaa aactccatat tcatttgatt
    24241 gaatcttttc ctgaaccagt gctagtgttg gactggtaag agtataacag catatatagg
    24301 ttatgtgatg aagagaatag tgtacatgaa atatgtgcat ttctttattg ctgtcttata
    24361 attgtcaaaa aagaaagtta ggtccttggt ttctgtaaaa ttgacttgaa tcaaaaggga
    24421 ggcatttaaa gaAATAAAtt agagatgata gaaatctgat ccattcagag tagaaaaaga
    24481 aattccatta ctgttattta agaaggtaaa attatttcct gaattgttca atattgtcac
    24541 ctagcagata gacactatta ttctgtactg tttttactag cttgcacctt gtggtatcct
    24601 atgtaaaaac gtatttgcat atgacaaact ttttctgtta gagcaattaa catctgaacc
    24661 acctaatgca ttacctgttt ttgtaaggta ctttttgtaa ggtactaagg agatgtgggt
    24721 ttaatcccta ggtcaggtaa atcccctaga ggaagaaatg gcaacccact ccagtattct
    24781 tgccaggaaa atccagtggg cagaggagcc tggcagggta cagtctaagc atggggttgc
    24841 aaagagtgag acaagacttg agctactgaa caataaggac AATAAAtgct gggtcggcta
    24901 aaaggttcat taggtttttt ttctgtaaga tggctctagt agtacttgtc tttatcttca
    24961 ttcgaaacaa ttttgttaga ttgtatgtga cagctcttgt atcagcatgc atttgaaaaa
    25021 aacatcaaaa ttggtaaatt tttgtatagc catcttacta ttgaagatgg aagaaaagaa
    25081 gcaaaatttt cagcatatca tgctgtatta tttcaagaaa gataaccaaa atgcaaaaat
    25141 gtatttgtga agtgtatgga gaaggggctg caactgatca agcttgtcaa agtagtttgt
    25201 gaagtttcgt gctggagatt tcttattgga cgatgctcca cagttggata taccagttga
    25261 agttgatagt gatcaaattg agatattgag aataatcgat gttataccac gcgggagata
    25321 gctgacatac tcaaaatatc caaatagaac cttgaaaacc atttgcacca tctcagttat
    25381 gttaatcact ttgatgtttg agttccacat aagcaaaaaa acaacaacaa caaaaaaaaa
    25441 cacaaccttg accatatttg cgcatgcagt tctctactga aatgattgaa aacactttgt
    25501 ttttaaaaac agattttgat taacagtggg tacgatacaa taacgtagaa tggaagaaat
    25561 tgtagggtga gcaaaatgaa ccaccaccac caaaggccag tcttcctcta aagaagatgt
    25621 gtgtatggtg ggattggaaa gtaatcctct attatgaatt cttctggaaa acactgctcc
    25681 taattagacc aactgaaagc agcactcaac gaaaagcatc cagaattagt caatagaaaa
    25741 cataatcttc catcaggata acgcaagact acatatttct ttgatgaccc agcatggctg
    25801 gagtttctga ttcatctgtt gtattcagac gttgcatctt tggatttttt ccatttattt
    25861 cagtctacaa aattatcata atggaaaaaa tttccattcc ctggaagatt gtaaagtgca
    25921 tctggaaaat ttctttgctc aaaaagataa aaagttttgt gaacacagaa ttatgacgtt
    25981 gcctgaaaaa tggcagaagg tagtggaaca aaagagtgac tatgttgttt ggtaaagttc
    26041 ttagtgaaaa tgaaaaatgt gtcttttatt tttatttaaa caccaaaggc acattttggc
    26101 caacccaata ctgaatactt aaaggaaact cttctgtgtt gtccttagcc ttacagtgtg
    26161 cactgaatag ttttgtataa gaatccagag tgatatttga aatacgcatg tgcttatatt
    26221 ttttatattt gtaactttgc atgtacttgt tttgtgttaa aagtttataa atatttaata
    26281 tctgactaaa attaa
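
The bookkeeping behind the reduced sequence can be sketched in Python as follows. The coordinates are those of the feature table above; the helper names and the placeholder string standing in for the real sequence are illustrative assumptions.

```python
def span_length(start: int, end: int) -> int:
    """Length of an inclusive 1-based coordinate range."""
    return end - start + 1


def excise(seq: str, seq_start: int, regions) -> str:
    """Return seq with each inclusive (start, end) region removed.

    Coordinates are positions in the source entry; seq_start is the
    coordinate of the first character of seq.
    """
    kept = []
    cursor = seq_start
    for start, end in sorted(regions):
        kept.append(seq[cursor - seq_start : start - seq_start])
        cursor = end + 1
    kept.append(seq[cursor - seq_start :])
    return "".join(kept)


UTR = (23049, 26295)
REPEATS = {
    "Bov-B": (23801, 24187),
    "Bov-tA3": (24709, 24867),
    "OaMAR1": (24889, 26108),
}

total = span_length(*UTR)
repeat_total = sum(span_length(s, e) for s, e in REPEATS.values())
net = total - repeat_total

# A placeholder of the right length stands in for the real sequence here.
reduced = excise("n" * total, UTR[0], REPEATS.values())
print(total, repeat_total, net, len(reduced))
```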
Reduced sheep 3' UTR (deleted retrotransposons, suitable for comparison with human and mouse): net length 1,481
..................gg gcaaccttcc tgttttcatt atcttcttaa tctttgccag gttgggggag
    23101 ggagtgtcta cctgcagccc tgtagtggtg gtgtctcatt tcttgcttct ctcttgttac
    23161 ctgtataata atacccttgg cgcttacagc actgggaaat gacaagcaga catgagatgc
    23221 tgtttattca agtcccatta gctcagtatt ctaatgtccc atcttagcag tgattttgta
    23281 gcaattttct catttgtttc aagaacacct gactacattt ccctttggga atagcatttc
    23341 tgccaagtct ggaaggaggc cacataatat tcattcaaaa aaacaaaact ggaaatcctt
    23401 agttcataga cccagggtcc accctgttga gagcatgtgt cctgtgtctg cagagaacta
    23461 taaaggatat tctgcatttt gcaggttaca tttgcaggta acacagccat ctattgcatc
    23521 aagaatggat attcatgcaa cctttgactt atgggcagag gacattttca caaggaatga
    23581 acataatacg aaaggcttct gagactaaaa aattccaaca tatggaagag gtgcccttgg
    23641 tggcagcctt ccattttgta tgtttaagca ccttcaagtg atattccttt ctttagtaac
    23701 ataaagtata gataattaag gtaccttaat taaactacct tctagacact gagagcaaat
    23761 ctgttgttta tctggaaccc aggatgattt tgacattgct ccc........agattttaac 
          atagagaatg cagatacaaa aactccatat tcatttgatt
    24241 gaatcttttc ctgaaccagt gctagtgttg gactggtaag agtataacag catatatagg
    24301 ttatgtgatg aagagaatag tgtacatgaa atatgtgcat ttctttattg ctgtcttata
    24361 attgtcaaaa aagaaagtta ggtccttggt ttctgtaaaa ttgacttgaa tcaaaaggga
    24421 ggcatttaaa gaAATAAAtt agagatgata gaaatctgat ccattcagag tagaaaaaga
    24481 aattccatta ctgttattta agaaggtaaa attatttcct gaattgttca atattgtcac
    24541 ctagcagata gacactatta ttctgtactg tttttactag cttgcacctt gtggtatcct
    24601 atgtaaaaac gtatttgcat atgacaaact ttttctgtta gagcaattaa catctgaacc
    24661 acctaatgca ttacctgttt ttgtaaggta ctttttgtaa ggtactaagaa caataagga
          cAATAAAtgtactgaatactt aaaggaaact cttctgtgtt gtccttagcc ttacagtgtg
    26161 cactgaatag ttttgtataa gaatccagag tgatatttga aatacgcatg tgcttatatt
    26221 ttttatattt gtaactttgc atgtacttgt tttgtgttaa aagtttataa atatttaata
    26281 tctgactaaa attaa
reduced sheep with poly A sites identified: 998 LDF- 4.97 1287 LDF- 2.52 gggcaaccttcctgttttcattatcttcttaatctttgccaggttgggggagggagtgtctacctgcagccctgtagtggtggtgtctcatttcttgcttctctcttgttacctgtataataatacccttggcgcttacagcactgggaaatgacaagcagacatgagatgctgtttattcaagtcccattagctcagtattctaatgtcccatcttagcagtgattttgtagcaattttctcatttgtttcaagaacacctgactacatttccctttgggaatagcatttctgccaagtctggaaggaggccacataatattcattcaaaaaaacaaaactggaaatccttagttcatagacccagggtccaccctgttgagagcatgtgtcctgtgtctgcagagaactataaaggatattctgcattttgcaggttacatttgcaggtaacacagccatctattgcatcaagaatggatattcatgcaacctttgacttatgggcagaggacattttcacaaggaatgaacataatacgaaaggcttctgagactaaaaaattccaacatatggaagaggtgcccttggtggcagccttccattttgtatgtttaagcaccttcaagtgatattcctttctttagtaacataaagtatagataattaaggtaccttaattaaactaccttctagacactgagagcaaatctgttgtttatctggaacccaggatgattttgacattgctcccagattttaacatagagaatgcagatacaaaaactccatattcatttgattgaatcttttcctgaaccagtgctagtgttggactggtaagagtataacagcatatataggttatgtgatgaagagaatagtgtacatgaaatatgtgcatttctttattgctgtcttataattgtcaaaaaagaaagttaggtccttggtttctgtaaaattgacttgaatcaaaagggaggcatttaaaga.AATAAA.ggtatcctatgtaaaaacgtatttgcatatgacaaactttttctgttagagcaattaacatctgaaccacctaatgcattacctgtttttgtaaggtactttttgtaaggtactaagaacaataaggac.AATAAA.tgtactgaatacttaaaggaaactcttctgtgttgtccttagccttacagtgtgcactgaatagttttgtataagaatccagagtgatatttgaaatacgcatgtgcttatattttttatatttgtaactttgcatgtacttgttttgtgttaaaagtttataaatatttaatatctgactaaaattaa
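
A plain hexamer scan is the simplest way to locate candidate signals such as those marked above. The Python sketch below is an assumption-laden stand-in: it only finds AATAAA and the ATTAAA variant and does no scoring, so it is not the discriminant method that produced the LDF values. The two test fragments are copied from the sheep sequence.

```python
SIGNALS = ("AATAAA", "ATTAAA")


def find_polya_signals(seq: str, signals=SIGNALS):
    """Return sorted (1-based start, hexamer) pairs for every signal occurrence."""
    seq = seq.upper()
    hits = []
    for signal in signals:
        start = seq.find(signal)
        while start != -1:
            hits.append((start + 1, signal))
            start = seq.find(signal, start + 1)  # allow overlapping matches
    return sorted(hits)


# Fragment around the first predicted site (canonical AATAAA).
print(find_polya_signals("ggcatttaaagaaataaattagagatgatag"))

# Fragment around the signal marked for the 2.1k polyA (ATTAAA variant).
print(find_polya_signals("aggtaccttaattaaactacc"))
```

Run on the full reduced sequence, a scan like this may report more hexamers than the two predicted sites, since it applies no context scoring.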
Bov-B 23801..24187 (378 Blast hits, including muntjac, goat, deer, and viper)
        1 tagggatgtg agagttggac tgtaaagaaa gctgagtgct gaagagttga tgcttttgaa
       61 ctatagtgtt ggagaaaact cttgagagtc ccttggactg aaaggagatc agtcctgaat
      121 attcattgga aggactgatg ctgaagctga aactccaata ctttggtcac ctgatgggaa
      181 gaactgaagg caggagggat gctaggaaag actgaaggca ggaggagaag gggacgacag
      241 aggatgagat ggctagatgg catcatggac tcaatggaca tgagcttaag taaactccag
      301 gagttggcga tggacaggga gacctggcgt cctgcagtcc atggtgtcgc agagtcggac
      361 acgattgagt gactaaattg aggtgaa 

Bov-tA3 24709..24867 (527 Blast hits, including deer, bison, and goat)
        1 ggagatgtgg gtttaatccc taggtcaggt aaatccccta gaggaagaaa tggcaaccca
       61 ctccagtatt cttgccagga aaatccagtg ggcagaggag cctggcaggg tacagtctaa
      121 gcatggggtt gcaaagagtg agacaagact tgagctact

OaMAR1 24889..26108 (few Blast hits, retrotransposon basis unknown)
        1 ctgggtcggc taaaaggttc attaggtttt ttttctgtaa gatggctcta gtagtacttg
       61 tctttatctt cattcgaaac aattttgtta gattgtatgt gacagctctt gtatcagcat
      121 gcatttgaaa aaaacatcaa aattggtaaa tttttgtata gccatcttac tattgaagat
      181 ggaagaaaag aagcaaaatt ttcagcatat catgctgtat tatttcaaga aagataacca
      241 aaatgcaaaa atgtatttgt gaagtgtatg gagaaggggc tgcaactgat caagcttgtc
      301 aaagtagttt gtgaagtttc gtgctggaga tttcttattg gacgatgctc cacagttgga
      361 tataccagtt gaagttgata gtgatcaaat tgagatattg agaataatcg atgttatacc
      421 acgcgggaga tagctgacat actcaaaata tccaaataga accttgaaaa ccatttgcac
      481 catctcagtt atgttaatca ctttgatgtt tgagttccac ataagcaaaa aaacaacaac
      541 aacaaaaaaa aacacaacct tgaccatatt tgcgcatgca gttctctact gaaatgattg
      601 aaaacacttt gtttttaaaa acagattttg attaacagtg ggtacgatac aataacgtag
      661 aatggaagaa attgtagggt gagcaaaatg aaccaccacc accaaaggcc agtcttcctc
      721 taaagaagat gtgtgtatgg tgggattgga aagtaatcct ctattatgaa ttcttctgga
      781 aaacactgct cctaattaga ccaactgaaa gcagcactca acgaaaagca tccagaatta
      841 gtcaatagaa aacataatct tccatcagga taacgcaaga ctacatattt ctttgatgac
      901 ccagcatggc tggagtttct gattcatctg ttgtattcag acgttgcatc tttggatttt
      961 ttccatttat ttcagtctac aaaattatca taatggaaaa aatttccatt ccctggaaga
     1021 ttgtaaagtg catctggaaa atttctttgc tcaaaaagat aaaaagtttt gtgaacacag
     1081 aattatgacg ttgcctgaaa aatggcagaa ggtagtggaa caaaagagtg actatgttgt
     1141 ttggtaaagt tcttagtgaa aatgaaaaat gtgtctttta tttttattta aacaccaaag
     1201 gcacattttg gccaacccaa 

bovine AB001468 and D10612 (from 957 to 2096 = 1140 bp of 3' UTR), D26150

     3'UTR           957..4244           = 3,288 bp
     repeat_region  1713..2093  Bov-B    =   381 bp [alt: BOV2; LINE-like element 1717-2093]
     repeat_region  2634..2792  Bov-tA3  =   159 bp [alt: 2633-2797 BOVTA; art-2 SINE; sheep equivalent 24,708..24,872]
     repeat_region  2814..4039  OaMAR1   = 1,226 bp [mariner element]
         total bovine retrotransposons   = 1,766 bp
                                   net   = 1,522 bp

     old_sequence    2096 D10612 replace g by c
     variation       2182 replace a by g
     misc_signal     4003..4011 ttatttaaa  AU-rich element that is thought to mediate mRNA degradation
     polyA_signal    4207..4212 taatat
     polyA_signal    4222..4227 attaaa
PolyA signal predicted at position 1850 of the 3' UTR (1850 + 956 = 2806 in the numbering below)
      957 .............................................................gggc
      961 aaccttcctg ttttcattat cttcttaatc tttaccaggt tgggggaggg agtatctacc
     1021 tgcagccccg tagtggtggt gtctcatttc gtgcttctct ctttgttacc tgtatgctaa
     1081 tacccttggc gcttatagca ctgggaaatg aagagcagac atgagatgct gtttattcaa
     1141 gtcccgttag ctcagtatgc taatgcccca tcttagcagt gattttgtag caattttctc
     1201 atttgtttca agaacacgtg actacatttc ccttttggaa tagcatttct gccaagtctg
     1261 gaaggaggcc acataatatt cattcaaaaa aacaaaccgg aaatccttag ttcatagacc
     1321 cagggtccac ctggttgaga gcttgtgtcc tgtgtctgca gagaactata aaggatattc
     1381 tgcattttgc aggttacatt tgcaggtaac acagccagct attgcatcaa gaatggatat
     1441 tcatgcaacc tttgacttat gggtagagga cattttcaca aggaatgaac ataatacgaa
     1501 aggcttctga gactaaaaaa ttccaacata tgggagaggt gcccttggtg gcagccttcc
     1561 attttgtatg tttaaagcac cttcaagtgg tattcctttc tttagtaaca aagtatagat
     1621 aattaagtta ccttaattta attaaactac cttctagaca ctgagagcaa atctgttgtt
     1681 tatctggaac ccaggatgat tttgacattg tttagagatg tgagagttga actgtaaaga
     1741 aagctgagtg ctgaagaatt gatgcttttg aactctagtg ttggagaaaa cttgagagtc
     1801 ccttggactg caaggagatc aaattagtcc atcctaaagg agatcagtcc tgaatattca
     1861 ttggaaggac tgatgctgaa cgtgaaactc caatactttg gccacctgat gggaagaact
     1921 gaaggcagga ggagaagggg atgacagagg atgaagatgg ctggatggca tcatggattc
     1981 aatggacatg agcttgagta aactccagga gttggcaatc gacggagtcc tggcatcctg
     2041 cagtccatgg tgtcgcagag ttggacacga ctgagtgact gaactgaggt gaacccagat
     2101 tttaacatag agaatgcaga tataaaaact ccatattcat ttgattgaat cttttcctta
     2161 accagtgcta gtgttggact ggtaagatta taacaacaaa tataggttat gtgatgaaga
     2221 gaatagtgta caaagaaaag aaatatgtgc atttctttat tgctatcata attgtcaaaa
     2281 aacaaaatta ggtccttggt ttctgtaaaa ttaacttttg aatcaacagg gaggcattta
     2341 aagaaatatc ttaaattaga gacagtagaa atctgataca ttcagagtgg aaaaagaaat
     2401 tctattacga ttatttaaga aggtaaaatt atttcctggg ttgttcaata ttgtcaccta
     2461 gcagatagac actattgttc tgcactgtta ttactggctt gcactttgtg gtatcctatg
     2521 taaaaataca tatattgcat atgacagact taagaatttc tgttagagca attaacatct
     2581 gaactatcta atgcattacc tgtttttgta aggtactttt tgtaaggtac taaggagacg
     2641 tgggtttaat ccctaggtca tgtaaatccc ctggaggagg aaatagcaac ccactccagt
     2701 attcttgcca ggagaatccc atgggcagag gagcctggca gggtgcagtc catgcatagg
     2761 gttgcaaaga gtcagacaag acttgagcta ctaaacaata acaacaataa atgctgggtt
     2821 ggctaaaagg ttcattaggt tttttttctg taagatggct gtctttaact tcattcgaaa
     2881 caattttgtt agattgtatg tgacagctct tgtatcagca tgcatttgaa aaagaaaaca
     2941 acttaccaaa attggtgaat ttttgtatag ccattttact attgaagatg gaagaaaaga
     3001 agcaaaattt tcagcatatc atgctgtatt atttcaagaa agataacaca accaaaatgc
     3061 gaaaatgtat ttgtgcagtg tatggagaag gtgctgcaac tgatcaagct tgtcaaagta
     3121 gtttgtgaag tattgtgctg gagatttctt actggacaat gctccacagt cgggtatacc
     3181 agttgaagtt gatagtgatc aaattgagat attgagaaca atcaatgtta taccacgtgg
     3241 gagatagctg acatactcaa aatatccaaa tagaaccttg aaaaccattt gcaccatctc
     3301 agttatgtta ataactttga tgtttgagtt ccacataaat taagcaaaaa aaaaacaaaa
     3361 acaaaaacac acaaccttga ccatatttgc atatgcagtt ctctactgaa atgaatgaaa
     3421 acacttttgt ttttaaaaac agattttgat gaacagtgga tactatacaa taacgtagaa
     3481 tggaaaagac tgtggggtga gcaaaatgaa ccagcaccac caaaggccag gcttcatcca
     3541 aagaagatgt gtgtatggtg ggattggaaa gtaatcctct attatgggat tcttctggaa
     3601 aaccaaaaaa tcaattccaa caagtactgc tcctaattag accaactgaa agcagcattc
     3661 aatgaaaagc atccagaatt agtcaataga aagcatataa tcttccatca ggataacaca
     3721 agactacatt tctttgatga cccagcatgg ctgagaggtt ctgattcacc tgctgtattc
     3781 agacattgca tctttggatt tccatttatt tcagtctaca gaattatcat catgaaaaaa
     3841 atttccattc cctggaagat tgtaaagtgc atctggaaaa cttctttgct caaaaagata
     3901 aaaagttttg tgaacacaga attatgaagt tgcctgaaaa acagcagaag atagtgacta
     3961 tgttgttcag taaagttctt ggtgcaaatg tgtcttttat ttttatttaa acactaaagg
     4021 cacgttttgg ccaacccaat actgaatact taaaggaaac tcttccgtgt tgtccttagc
     4081 cttacagcgt gcactgaata gttttgtata agaatccaga gtgatatttg aaatacgcat
     4141 gtgcttatat tttctatatt tgtaactttg catgtacttg ttttgtgtta aaagtttata
     4201 aatatttaat atctgactaa aattaaacag gagctaaaag gagg
  
bovine Bov-B 1713..2093 (GenBank numbering retained)
     1713 tagagatg tgagagttga actgtaaaga
     1741 aagctgagtg ctgaagaatt gatgcttttg aactctagtg ttggagaaaa cttgagagtc
     1801 ccttggactg caaggagatc aaattagtcc atcctaaagg agatcagtcc tgaatattca
     1861 ttggaaggac tgatgctgaa cgtgaaactc caatactttg gccacctgat gggaagaact
     1921 gaaggcagga ggagaagggg atgacagagg atgaagatgg ctggatggca tcatggattc
     1981 aatggacatg agcttgagta aactccagga gttggcaatc gacggagtcc tggcatcctg
     2041 cagtccatgg tgtcgcagag ttggacacga ctgagtgact gaactgaggt gaa
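
The bovine sequence appears in two numbering schemes, GenBank coordinates and positions counted from the first base of the 3' UTR, and the two differ by a constant offset. A minimal Python sketch of the conversion follows, taking 957 as the UTR start from the feature table; the function names are illustrative.

```python
UTR_START = 957  # GenBank coordinate of the first base of the bovine 3' UTR


def utr_to_genbank(pos: int) -> int:
    """Convert a 1-based position within the 3' UTR to the GenBank coordinate."""
    return pos + UTR_START - 1


def genbank_to_utr(pos: int) -> int:
    """Convert a GenBank coordinate to a 1-based position within the 3' UTR."""
    return pos - UTR_START + 1


print(utr_to_genbank(1), utr_to_genbank(3288))  # ends of the 3' UTR
print(utr_to_genbank(1850))                     # predicted polyA signal
print(genbank_to_utr(1713), genbank_to_utr(2093))  # Bov-B element
```

Because both schemes are 1-based, the offset is 956 rather than 957: UTR position 1 is GenBank position 957.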
Sequence numbered from beginning of 3' UTR: 3288 bp
PolyA predicted at position 1850
        1 gggcaacctt cctgttttca ttatcttctt aatctttacc aggttggggg agggagtatc
       61 tacctgcagc cccgtagtgg tggtgtctca tttcgtgctt ctctctttgt tacctgtatg
      121 ctaataccct tggcgcttat agcactggga aatgaagagc agacatgaga tgctgtttat
      181 tcaagtcccg ttagctcagt atgctaatgc cccatcttag cagtgatttt gtagcaattt
      241 tctcatttgt ttcaagaaca cgtgactaca tttccctttt ggaatagcat ttctgccaag
      301 tctggaagga ggccacataa tattcattca aaaaaacaaa ccggaaatcc ttagttcata
      361 gacccagggt ccacctggtt gagagcttgt gtcctgtgtc tgcagagaac tataaaggat
      421 attctgcatt ttgcaggtta catttgcagg taacacagcc agctattgca tcaagaatgg
      481 atattcatgc aacctttgac ttatgggtag aggacatttt cacaaggaat gaacataata
      541 cgaaaggctt ctgagactaa aaaattccaa catatgggag aggtgccctt ggtggcagcc
      601 ttccattttg tatgtttaaa gcaccttcaa gtggtattcc tttctttagt aacaaagtat
      661 agataattaa gttaccttaa tttaattaaa ctaccttcta gacactgaga gcaaatctgt
      721 tgtttatctg gaacccagga tgattttgac attgtttaga gatgtgagag ttgaactgta
      781 aagaaagctg agtgctgaag aattgatgct tttgaactct agtgttggag aaaacttgag
      841 agtcccttgg actgcaagga gatcaaatta gtccatccta aaggagatca gtcctgaata
      901 ttcattggaa ggactgatgc tgaacgtgaa actccaatac tttggccacc tgatgggaag
      961 aactgaaggc aggaggagaa ggggatgaca gaggatgaag atggctggat ggcatcatgg
     1021 attcaatgga catgagcttg agtaaactcc aggagttggc aatcgacgga gtcctggcat
     1081 cctgcagtcc atggtgtcgc agagttggac acgactgagt gactgaactg aggtgaaccc
     1141 agattttaac atagagaatg cagatataaa aactccatat tcatttgatt gaatcttttc
     1201 cttaaccagt gctagtgttg gactggtaag attataacaa caaatatagg ttatgtgatg
     1261 aagagaatag tgtacaaaga aaagaaatat gtgcatttct ttattgctat cataattgtc
     1321 aaaaaacaaa attaggtcct tggtttctgt aaaattaact tttgaatcaa cagggaggca
     1381 tttaaagaaa tatcttaaat tagagacagt agaaatctga tacattcaga gtggaaaaag
     1441 aaattctatt acgattattt aagaaggtaa aattatttcc tgggttgttc aatattgtca
     1501 cctagcagat agacactatt gttctgcact gttattactg gcttgcactt tgtggtatcc
     1561 tatgtaaaaa tacatatatt gcatatgaca gacttaagaa tttctgttag agcaattaac
     1621 atctgaacta tctaatgcat tacctgtttt tgtaaggtac tttttgtaag gtactaagga
     1681 gacgtgggtt taatccctag gtcatgtaaa tcccctggag gaggaaatag caacccactc
     1741 cagtattctt gccaggagaa tcccatgggc agaggagcct ggcagggtgc agtccatgca
     1801 tagggttgca aagagtcaga caagacttga gctactaaac aataacaaca ataaatgctg
     1861 ggttggctaa aaggttcatt aggttttttt tctgtaagat ggctgtcttt aacttcattc
     1921 gaaacaattt tgttagattg tatgtgacag ctcttgtatc agcatgcatt tgaaaaagaa
     1981 aacaacttac caaaattggt gaatttttgt atagccattt tactattgaa gatggaagaa
     2041 aagaagcaaa attttcagca tatcatgctg tattatttca agaaagataa cacaaccaaa
     2101 atgcgaaaat gtatttgtgc agtgtatgga gaaggtgctg caactgatca agcttgtcaa
     2161 agtagtttgt gaagtattgt gctggagatt tcttactgga caatgctcca cagtcgggta
     2221 taccagttga agttgatagt gatcaaattg agatattgag aacaatcaat gttataccac
     2281 gtgggagata gctgacatac tcaaaatatc caaatagaac cttgaaaacc atttgcacca
     2341 tctcagttat gttaataact ttgatgtttg agttccacat aaattaagca aaaaaaaaac
     2401 aaaaacaaaa acacacaacc ttgaccatat ttgcatatgc agttctctac tgaaatgaat
     2461 gaaaacactt ttgtttttaa aaacagattt tgatgaacag tggatactat acaataacgt
     2521 agaatggaaa agactgtggg gtgagcaaaa tgaaccagca ccaccaaagg ccaggcttca
     2581 tccaaagaag atgtgtgtat ggtgggattg gaaagtaatc ctctattatg ggattcttct
     2641 ggaaaaccaa aaaatcaatt ccaacaagta ctgctcctaa ttagaccaac tgaaagcagc
     2701 attcaatgaa aagcatccag aattagtcaa tagaaagcat ataatcttcc atcaggataa
     2761 cacaagacta catttctttg atgacccagc atggctgaga ggttctgatt cacctgctgt
     2821 attcagacat tgcatctttg gatttccatt tatttcagtc tacagaatta tcatcatgaa
     2881 aaaaatttcc attccctgga agattgtaaa gtgcatctgg aaaacttctt tgctcaaaaa
     2941 gataaaaagt tttgtgaaca cagaattatg aagttgcctg aaaaacagca gaagatagtg
     3001 actatgttgt tcagtaaagt tcttggtgca aatgtgtctt ttatttttat ttaaacacta
     3061 aaggcacgtt ttggccaacc caatactgaa tacttaaagg aaactcttcc gtgttgtcct
     3121 tagccttaca gcgtgcactg aatagttttg tataagaatc cagagtgata tttgaaatac
     3181 gcatgtgctt atattttcta tatttgtaac tttgcatgta cttgttttgt gttaaaagtt
     3241 tataaatatt taatatctga ctaaaattaa acaggagcta aaaggagg 
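A predicted polyA position like the one above can be checked by scanning the numbered sequence for signal hexamers. The following is a minimal sketch, not the prediction method actually used for this page: the function name `find_polya_signals` and the choice of hexamers (aataaa plus the attaaa variant) are assumptions made for illustration.

```python
import re

# Canonical signal plus one common variant; this short list is an assumption.
POLYA_HEXAMERS = ("aataaa", "attaaa")


def find_polya_signals(seq, first_pos=1, hexamers=POLYA_HEXAMERS):
    """Return sorted (position, hexamer) pairs; positions are 1-based."""
    # Drop the spaces (and any digits) left over from GenBank-style layout.
    seq = re.sub(r"[^a-z]", "", seq.lower())
    hits = []
    for hexamer in hexamers:
        start = seq.find(hexamer)
        while start != -1:
            hits.append((first_pos + start, hexamer))
            start = seq.find(hexamer, start + 1)  # allow overlapping hits
    return sorted(hits)


# Line 1801 of the bovine 3' UTR as numbered above.
line_1801 = ("tagggttgca aagagtcaga caagacttga gctactaaac "
             "aataacaaca ataaatgctg")
print(find_polya_signals(line_1801, first_pos=1801))
# prints [(1850, 'aataaa')]
```

On this one line the only hit is an aataaa hexamer starting at 1850, which agrees with the predicted position quoted above.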
Reduced bovine sequence
  957 .............................................................gggc
      961 aaccttcctg ttttcattat cttcttaatc tttaccaggt tgggggaggg agtatctacc
     1021 tgcagccccg tagtggtggt gtctcatttc gtgcttctct ctttgttacc tgtatgctaa
     1081 tacccttggc gcttatagca ctgggaaatg aagagcagac atgagatgct gtttattcaa
     1141 gtcccgttag ctcagtatgc taatgcccca tcttagcagt gattttgtag caattttctc
     1201 atttgtttca agaacacgtg actacatttc ccttttggaa tagcatttct gccaagtctg
     1261 gaaggaggcc acataatatt cattcaaaaa aacaaaccgg aaatccttag ttcatagacc
     1321 cagggtccac ctggttgaga gcttgtgtcc tgtgtctgca gagaactata aaggatattc
     1381 tgcattttgc aggttacatt tgcaggtaac acagccagct attgcatcaa gaatggatat
     1441 tcatgcaacc tttgacttat gggtagagga cattttcaca aggaatgaac ataatacgaa
     1501 aggcttctga gactaaaaaa ttccaacata tgggagaggt gcccttggtg gcagccttcc
     1561 attttgtatg tttaaagcac cttcaagtgg tattcctttc tttagtaaca aagtatagat
     1621 aattaagtta ccttaattta attaaactac cttctagaca ctgagagcaa atctgttgtt
     1681 tatctggaac ccaggatgat tttgacattg tttaga
     2094 cccagat
     2101 tttaacatag agaatgcaga tataaaaact ccatattcat ttgattgaat cttttcctta
     2161 accagtgcta gtgttggact ggtaagatta taacaacaaa tataggttat gtgatgaaga
     2221 gaatagtgta caaagaaaag aaatatgtgc atttctttat tgctatcata attgtcaaaa
     2281 aacaaaatta ggtccttggt ttctgtaaaa ttaacttttg aatcaacagg gaggcattta
     2341 aagaaatatc ttaaattaga gacagtagaa atctgataca ttcagagtgg aaaaagaaat
     2401 tctattacga ttatttaaga aggtaaaatt atttcctggg ttgttcaata ttgtcaccta
     2461 gcagatagac actattgttc tgcactgtta ttactggctt gcactttgtg gtatcctatg
     2521 taaaaataca tatattgcat atgacagact taagaatttc tgttagagca attaacatct
     2581 gaactatcta atgcattacc tgtttttgta aggtactttt tgtaaggtac taa
     2793 aaacaata acaacaataa atg
     4040 t actgaatact taaaggaaac tcttccgtgt tgtccttagc
     4081 cttacagcgt gcactgaata gttttgtata agaatccaga gtgatatttg aaatacgcat
     4141 gtgcttatat tttctatatt tgtaactttg catgtacttg ttttgtgtta aaagtttata
     4201 aatatttaat atctgactaa aattaaacag gagctaaaag gagg
Alignment of sheep and cow Bov-B:
Sheep: 1    tagggatgtgagagttggactgtaaagaaagctgagtgctgaagagttgatgcttttgaa 60
            ||| ||||||||||||| ||||||||||||||||||||||||||| ||||||||||||||
Bovin: 1713 tagagatgtgagagttgaactgtaaagaaagctgagtgctgaagaattgatgcttttgaa 1772

                                                                        
Sheep: 61   ctatagtgttggagaaaactcttgagagtcccttggactg----------aaa------- 103
            || |||||||||||||||  ||||||||||||||||||||          |||       
Bovin: 1773 ctctagtgttggagaaaa--cttgagagtcccttggactgcaaggagatcaaattagtcc 1830

                                                                        
Sheep: 104  --------ggagatcagtcctgaatattcattggaaggactgatgctgaagctgaaactc 155
                    ||||||||||||||||||||||||||||||||||||||||||  ||||||||
Bovin: 1831 atcctaaaggagatcagtcctgaatattcattggaaggactgatgctgaacgtgaaactc 1890

                                                                        
Sheep: 156  caatactttggtcacctgatgggaagaactgaaggcaggagggatgctaggaaagactga 215
            ||||||||||| |||||||||||||||||||||||||||| |||              ||
Bovin: 1891 caatactttggccacctgatgggaagaactgaaggcagga-gga--------------ga 1935

                                                                        
Sheep: 216  aggcaggaggagaaggggacgacagaggatgagatggctagatggcatcatggactcaat 275
            |||  ||| ||  || ||| || |||        ||||| |||||||||||||| |||||
Bovin: 1936 agg--ggatga-cagaggatga-aga--------tggctggatggcatcatggattcaat 1983

                                                                        
Sheep: 276  ggacatgagcttaagtaaactccaggagttggcgatggacagggagacctggcgtcctgc 335
            |||||||||||| |||||||||||||||||||| || |||  |||| |||||| ||||||
Bovin: 1984 ggacatgagcttgagtaaactccaggagttggcaatcgac--ggagtcctggcatcctgc 2041

                                                                
Sheep: 336  agtccatggtgtcgcagagtcggacacgattgagtgactaaattgaggtgaa 387
            |||||||||||||||||||| |||||||| ||||||||| || |||||||||
Bovin: 2042 agtccatggtgtcgcagagttggacacgactgagtgactgaactgaggtgaa 2093
Alignment of sheep and cow Bov-tA3:
Sheep: 1    ggagatgtgggtttaatccctaggtcaggtaaatcccctagaggaagaaatggcaaccca 60
            ||||| ||||||||||||||||||||| ||||||||||| ||||| ||||| ||||||||
Bovin: 2634 ggagacgtgggtttaatccctaggtcatgtaaatcccctggaggaggaaatagcaaccca 2693

                                                                        
Sheep: 61   ctccagtattcttgccaggaaaatccagtgggcagaggagcctggcagggtacagtctaa 120
            |||||||||||||||||||| |||||  ||||||||||||||||||||||| ||||| | 
Bovin: 2694 ctccagtattcttgccaggagaatcccatgggcagaggagcctggcagggtgcagtccat 2753

                                                   
Sheep: 121  gcatggggttgcaaagagtgagacaagacttgagctact 159
            |||| |||||||||||||| |||||||||||||||||||
Bovin: 2754 gcatagggttgcaaagagtcagacaagacttgagctact 2792
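Identity across a Blast-style block like those above can be tallied directly from the two gapped rows. This is a small sketch assuming identity is counted as matching columns over all alignment columns, gaps included; the function name `identity` is hypothetical and not part of any Blast tool.

```python
def identity(row_a, row_b):
    """Return (matching columns, total columns) for two gapped rows."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    matches = sum(
        1
        for a, b in zip(row_a.lower(), row_b.lower())
        if a == b and a != "-"  # a gap column never counts as a match
    )
    return matches, len(row_a)


# Last block of the Bov-tA3 alignment above (sheep 121-159, bovine 2754-2792).
sheep = "gcatggggttgcaaagagtgagacaagacttgagctact"
bovin = "gcatagggttgcaaagagtcagacaagacttgagctact"
matches, columns = identity(sheep, bovin)
print(matches, columns, round(100.0 * matches / columns, 1))
# prints 37 39 94.9
```

Summing the counts over every block of an alignment, rather than averaging the per-block percentages, gives the overall figure.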
Alignment of mariner pseudogene of sheep and cow:
SHEEP    1    ctgggtcggctaaaaggttcattaggttttttttctgtaagatggctctagtagtacttg 60
AB001468 2814 ......t........................................------------. 2861

SHEEP    61   tctttatcttcattcgaaacaattttgttagattgtatgtgacagctcttgtatcagcat 120
AB001468 2862 ......a..................................................... 2921

SHEEP    121  gcatttgaaaaa---aaca---t--caaaattggtaaatttttgtatagccatcttacta 172
AB001468 2922 ............gaa....act.ac..........g.................t...... 2981

SHEEP    173  ttgaagatggaagaaaagaagcaaaattttcagcatatcatgctgtattatttcaagaaa 232
AB001468 2982 ............................................................ 3041

SHEEP    233  gataac-----caaaatgcaaaaatgtatttgtgaagtgtatggagaaggggctgcaact 287
AB001468 3042 ......acaac........g..............c...............t......... 3101

SHEEP    288  gatcaagcttgtcaaagtagtttgtgaagt-ttcgtgctggagatttcttattggacgat 346
AB001468 3102 ..............................a..-.................c.....a.. 3160

SHEEP    347  gctccacagttggatataccagttgaagttgatagtgatcaaattgagatattgagaata 406
AB001468 3161 ..........c..g............................................c. 3220

SHEEP    407  atcgatgttataccacgcgggagatagctgacatactcaaaatatccaaatagaaccttg 466
AB001468 3221 ...a.............t.......................................... 3280

SHEEP    467  aaaaccatttgcaccatctcagttatgttaatcactttgatgtttgagttccaca----- 521
AB001468 3281 ................................a......................taaat 3340

SHEEP    522  taagcaaaaaaacaacaacaacaaa-aaaaa-acacaaccttgaccatatttgcgcatgc 579
AB001468 3341 ............-.--.....-...c.....c......................at.... 3396

SHEEP    580  agttctctactgaaatgattgaaaacac-tttgtttttaaaaacagattttgattaacag 638
AB001468 3397 ..................a.........t.........................g..... 3456

SHEEP    639  tgggtacgatacaataacgtagaatggaagaa-attgtagggtgagcaaaatgaaccacc 697
AB001468 3457 ...a...t.....................-..g.c...g...................g. 3515

SHEEP    698  accaccaaaggccagtcttcctctaaagaagatgtgtgtatggtgggattggaaagtaat 757
AB001468 3516 ...............g....a..c.................................... 3575

SHEEP    758  cctctattat-gaattcttctggaaaacactgctcctaattagaccaactgaaagcagca 816
AB001468 3576 ..........g.g...............                                 3603
AB001468 3626                             ................................ 3657

SHEEP    817  ctcaacgaaaagcatccagaattagtcaatagaaaacata--atcttccatcaggataac 874
AB001468 3658 t....t.............................g....ta.................. 3717

SHEEP    875  gcaagactacatatttctttgatgacccagcatggctg-gagtttctgattcatctgttg 933
AB001468 3718 a...........--........................a...g..........c...c.. 3775

SHEEP    934  tattcagacgttgcatctttggattttttccatttatttcagtctacaaaattatcataa 993
AB001468 3776 .........a................---...................g.........c. 3832

SHEEP    994  tggaaaaaatttccattccctggaagattgtaaagtgcatctggaaaatttctttgctca 1053
AB001468 3833 ..a.............................................c........... 3892

SHEEP    1054 aaaagataaaaagttttgtgaacacagaattatgacgttgcctgaaaaatggcagaaggt 1113
AB001468 3893 ...................................a.............ca.......-- 3950

SHEEP    1114 agtggaacaaaagagtgactatgttgtttggtaaagttcttagtgaaaatgaaaaatgtg 1173
AB001468 3951 .-.--.--------..............ca...........-.-.----..-c....... 3992

SHEEP    1174 tcttttatttttatttaaacaccaaaggcacattttggccaacccaa 1220
AB001468 3993 ......................t........g............... 4039
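In the dot notation used above, a dot in the AB001468 row means "same base as the sheep row" and a dash is a gap. To recover the explicit bovine row for other tools, the dots can be expanded as sketched here; the helper name `expand_dots` is an assumption for illustration.

```python
def expand_dots(reference_row, dotted_row):
    """Replace each '.' in dotted_row with the base above it in reference_row."""
    if len(reference_row) != len(dotted_row):
        raise ValueError("rows must cover the same alignment columns")
    return "".join(
        ref if dot == "." else dot  # dashes and explicit bases pass through
        for ref, dot in zip(reference_row, dotted_row)
    )


# First ten columns of the mariner alignment above.
print(expand_dots("ctgggtcggc", "......t..."))
# prints ctgggttggc
```

The expanded row still contains dashes; strip them to get ungapped sequence.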

human: positions 26,212-27,817 = 1,606 bp, with no retrotransposons or polyA annotated. (The 4 other full-length human 3' UTRs have 6 distal sites with point changes.)

Sequence D00015 from a 1986 Science article is said to have a polyA site at 2422 with 8 A's. ( 2341 tgcatgttct tgttttgtta tataaaaaaa ttgtaaatgt ttaatatctg actgaaatta 2401 aacgagccaa gatgagcacc aa).

X83416, a 1991 Hood sequence, is annotated for polyA signals at 2242..2247 ataaaa and 2277..2282 attaaa:

     2221 tgcatgttct tgttttgtta tataaaaaaa ttgtaaatgt ttaatatctg actgaaatta
     2281 aacgagcgaa gatgagcacc a 
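Annotated coordinates such as these are easy to verify against the numbered lines themselves. The sketch below reads GenBank-style lines into a position-to-base table and pulls out an inclusive start..end range; the function names `parse_numbered_lines` and `feature` are hypothetical helpers, not part of any GenBank tool.

```python
def parse_numbered_lines(text):
    """Map each 1-based coordinate to its base from GenBank-style lines."""
    bases = {}
    for line in text.strip().splitlines():
        fields = line.split()
        if not fields or not fields[0].isdigit():
            continue  # skip anything that is not a numbered sequence line
        pos = int(fields[0])
        for base in "".join(fields[1:]):
            bases[pos] = base
            pos += 1
    return bases


def feature(bases, start, end):
    """Return the bases of an inclusive start..end annotation."""
    return "".join(bases[p] for p in range(start, end + 1))


x83416_tail = """
     2221 tgcatgttct tgttttgtta tataaaaaaa ttgtaaatgt ttaatatctg actgaaatta
     2281 aacgagcgaa gatgagcacc a
"""
bases = parse_numbered_lines(x83416_tail)
print(feature(bases, 2242, 2247))  # prints ataaaa
print(feature(bases, 2277, 2282))  # prints attaaa
```

Both annotated signals come out as stated, and the second one spans the break between line 2221 and line 2281.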
..................................................................ggaaggtct
    26221 tcctgttttc accatctttc taatcttttt ccagcttgag ggaggcggta tccacctgca
    26281 gcccttttag tggtggtgtc tcactctttc ttctctcttt gtcccggata ggctaatcaa
    26341 tacccttggc actgatgggc actggaaaac atagagtaga cctgagatgc tggtcaagcc
    26401 ccctttgatt gagttcatca tgagccgttg ctaatgccag gccagtaaaa gtataacagc
    26461 aaataaccat tggttaatct ggacttattt ttggacttag tgcaacaggt tgaggctaaa
    26521 acaaatctca gaacagtctg aaataccttt gcctggatac ctctggctcc ttcagcagct
    26581 agagctcagt atactaatgc cctatcttag tagagatttc atagctattt agagatattt
    26641 tccattttaa gaaaacccga caacatttct gccaggtttg ttaggaggcc acatgatact
    26701 tattcaaaaa aatcctagag attcttagct cttgggatgc aggctcagcc cgctggagca
    26761 tgagctctgt gtgtaccgag aactggggtg atgttttact tttcacagta tgggctacac
    26821 agcagctgtt caacaagagt aaatattgtc acaacactga acctctggct agaggacata
    26881 ttcacagtga acataactgt aacatatatg aaaggcttct gggacttgaa atcaaatgtt
    26941 tgggaatggt gcccttggag gcaacctccc attttagatg tttaaaggac cctatatgtg
    27001 gcattccttt ctttaaacta taggtAATTA Aggcagctga aaagtaaatt gccttctaga
    27061 cactgaaggc aaatctcctt tgtccattta cctggaaacc agaatgattt tgacatacag
    27121 gagagctgca gttgtgaaag caccatcatc atagaggatg atgtaattaa aaaatggtca
    27181 gtgtgcaaag aaaagaactg cttgcatttc tttatttctg tctcataatt gtcaaaaacc
    27241 agaattaggt caagttcata gtttctgtaa ttggcttttg aatcaaagaa tagggagaca
    27301 atctaaaaaa tatcttaggt tggagatgac agaaatatga ttgatttgaa gtggaaaaag
    27361 aaattctgtt aatgttaatt aaagtaaaat tattccctga attgtttgat attgtcacct
    27421 agcagatatg tattactttt ctgcaatgtt attattggct tgcactttgt gagtattcta
    27481 tgtaaaaata tatatgtata taaaatatat attgcatagg acagacttag gagttttgtt
    27541 tagagcagtt aacatctgaa gtgtctaatg cattaacttt tgtaaggtac tgaatactta
    27601 atatgtggga aacccttttg cgtggtcctt aggcttacaa tgtgcactga atcgtttcat
    27661 gtaagaatcc aaagtggaca ccattaacag gtctttgaaa tatgcatgta ctttatattt
    27721 tctatatttg taactttgca tgttcttgtt ttgttatata aaaaaattgt aaatgtttaa
    27781 tatctgactg aaattaaacg agcgaagatg agcacca

cct cccgtgtctg cagttgtatt
    27841 tcctggtgct tgccctgtgt tggggactgt tttgggggtt aatctgagcc aagtggcgct
    27901 ttctgtcctc ccttctcaag tgatggccga tggttcacgc acttccccct gttcctgccc
    27961 ttgtcctcac ttcccagtca cccactagtt catctctgcg gcttttgcat tttctccaca
    28021 agcatctaag tgggcttagc actggtaaac tgcaaaggca ctattgcagc aggaggaaca
    28081 gtctgggagc ttttttcagt cctggattta gaaatagatt ttcttgatta aaatgaaaat
    28141 taacaagctc taaagaactg ttgacccttg aactacacag ggattagagg cactgacctg
    28201 ccgcacagtc gaaaatctgc agagaagttt tttttgtttt gttttgtttt ttttgagacg
    28261 gagtctcgct ctgtcgccca ggctggagtg cagtggcggg atctcggctc actgcaacct
    28321 ccgcctcccg ggttcaggcg attctcctgc ctcagcctcc tgagtagctg ggactacagg
    28381 catatgccac catgcccggc taatttttgt atttttagta gagatggagt ttcaccatat
    28441 tggccaggct gttctcaaac tcggcctcaa gtgatctgct cgcctcagcc acccaaagtg
    28501 ctaggattac aagcatgagc caccgcgccc ggcctgcata gaacttttaa ctcccccaaa


Extra human sequence after the break: nothing found in human ESTs extends significantly beyond this (111 hits).

mouse Lee sequence position 20,442-21,675 = 1,234 bp with no retrotransposons or poly A annotated

.......................................................gggaggcct tcctgcttgt
    20461 tccttcgcat tctcgtggtc taggctgggg gaggggttat ccacctgtag ctctttcaat
    20521 tgaggtggtt ctcattcttg cttctctgtg tcccccatag gctaataccc ctggcactga
    20581 tgggccctgg gaaatgtaca gtagaccagt tgctctttgc ttcaggtccc tttgatggag
    20641 tctgtcatca gccagtgcta acaccgggcc aataagaata taacaccaaa taactgctgg
    20701 ctagttgggg ctttgttttg gtctagtgaa taaatactgg tgtatcccct gacttgtacc
    20761 cagagtacaa ggtgacagtg acacatgtaa cttagcatag gcaaagggtt ctacaaccaa
    20821 agaagccact gtttggggat ggcgccctgg aaaacagcct cccacctggg atagctagag
    20881 cgtccacacg tggaattctt tctttactaa caaacgatag ctgattgaag gcaacaggaa
    20941 aaaaaaaaat caaattgtcc tactgacgtt gaaagcaaac ctttgttcat tcccagggca
    21001 ctagaatgat ctttagcctt gcttggattg aactaggaga tcttgactct gaggagagcc
    21061 agccctgtaa aaagcttggt cctcctgtga cgggagggat ggttaaggta caaaggctag
    21121 aaacttgagt ttcttcattt ctgtctcaca attatcaaaa gctagaatta gcttctgccc
    21181 tatgtttctg tacttctatt tgaactggat aacagagaga caatctaaac attctcttag
    21241 gctgcagata agagaagtag gctccattcc aaagtgggaa agaaattctg ctagcattgt
    21301 ttaaatcagg caaaatttgt tcctgaagtt gctttttacc ccagcagaca taaactgcga
    21361 tagcttcagc ttgcactgtg gattttctgt atagaatata taaaacataa cttcaagctt
    21421 atgtcttctt tttaaaacat ctgaagtgtg ggacgccctg gccgttccat ccagtactaa
    21481 atgcttaccg tgtgaccctt gggctttcag cgtgcactca gttccgtagg attccaaagc
    21541 agacccctag ctggtctttg aatctgcatg tacttcacgt tttctatatt tgtaactttg
    21601 catgtatttt gttttgtcat ataaaaagtt tataaatgtt tgctatcaga ctgacattaa
    21661 atagaagcta tgatg

rat D50093 and M20313. Positions 1404-2625 of D50093 shown, 1222 bp (seq continues to 3090):

     
     polyA_signal    2570..2575 tataaa
     polyA_signal    2581..2586 tataaa 
     polyA_signal    2606..2611 attaaa
     polyA_site      2625 (or 2627, 2628)      g
     1404..........................ggaggcc ttcctgcttg ttccttctca ttctcgtggt
     1441 ctaggctggg ggaggggtta cccacctgta gctctttcaa ttgaggtggt gtctcattct
     1501 tgcttctctt tgtcccccat aggctaatac ccttggcagt gatgggtctg gggaaatgta
     1561 cagtagacca gatgctattc gcttcagcgt cctttgattg agtccatcat gggccagggt
     1621 taacaccagg ccagtaagaa tataacacca aataactgct ggctagtcag ggctttgttt
     1681 tggtctactg agtaaatact gtgtaacccc tgaattgtac ccagaggaca tggtgacaga
     1741 gacacacata acttagtata ggcaaagggt tctatagcca aagaagccac tgtgtgggca
     1801 tggcaccctg gataacagcc tcccgcctgg gatatctaga gcatccacat gtggaattct
     1861 ttcttttcta acataaacca tagctgattg aaggcaacaa gaaaaagaat caaattatcc
     1921 tactgacatt gaaagcaaac tgtgttcatt ccctaggcgc tggaatgatt tttagccttg
     1981 gattaaacca ggagattttg actctgagga gaaccagcag tacaaaagca tggtctcctg
     2041 tgatgggaga gatggtgaag ggacaaaggc aagacccctg cgtttcttca tttctgtctc
     2101 ataattatca agagctagaa ttaggtcgtg ccctaagttt ctgtactcgt atttgaactg
     2161 gacaacaaag agacaatcta caaattctct tgggctgcag aggagagaaa taggctccat
     2221 tccaaagtgg aaagagaaat tctgctagca ttgtctaagt aaggctaact tttccttaaa
     2281 tcgctttgta tttcccccag cagacatcac aaccctgtga tcggttcagc ctgcaccgcg
     2341 ggtgttctgt gtagaatata taaatataac ttcaagctta ggccttctat tttaaaacat
     2401 ctgaagtgtg gaacgcactg gccgttctgt gcagtactaa gtgtgaccct tgggctttca
     2461 atgtgcactc ggttccgtat gattccaaag tagagcccta gctggtcttc gaatctgcat
     2521 gtacttcacg ttttctatat ttgtaacttc gcatgtattt gttttgtcat ataaaaagtt
     2581 tataaatgtt tgctatctga ctgacattaa atagaagcta tgatg

golden hamster M14054, M37381, K02234. Positions 1249-2462 of M14054 shown, 1214 bp (seq continues to 3002). Rat/mouse gives 87% identity, rat/hamster and mouse/hamster 78% identity.

    1249.....................................................ag gaagcctccc
     1261 tgcttgtact tcctcgttct tgtggtctag gctgggggag gggttatcca ccgtagctct
     1321 tttaattgag gtggtgtctc attcctgctt ctctttgtcc cccataggct aatgcccttg
     1381 gcactagtgg gccctgggaa tgtacagtag accagatgct attcgatcca gagcctttga
     1441 attgagtcca tcacgggcca gcactaacac caggcctatc tgaatataac agcaagtaat
     1501 ggctggctag tcagggcttt gttttggtct agtgagtaaa tactgatgtg accctctgac
     1561 ttccacacag agtacgcagt gacagacaca cctaactgtt aaaataggcg aagggttcta
     1621 cagccaaaga agtcactgtt tggcatggtc cctaagaaac agcctcccat ttgggatatt
     1681 taaagcatcc atatgaggca ttcctccttc actaacaaac tctagctgag taaggcaacg
     1741 ggaaaaaaac aaaattaccc tactaacatg gaaagcaaac ctgtgttcat ttcctaggaa
     1801 ctagaatgat gttttagcct tgcttggatt gaaccaggag attttggctc tgaagagcca
     1861 acactgtaaa aatgtggtcc tcctgcaaag ggagagatgg ttaggacaca aagtcacggc
     1921 gcttggcgtt tcttcatttc tgtctcataa ttgtcaaaag tcacaattag gtcatgccct
     1981 tagttaatat acttgtattt gaatcggacg acaagagaca atctaaaaat tctcctaggt
     2041 tgtagatgaa ataggctcca ttcaaggtga aaagacagtt tgttagcgtt gcttatgtaa
     2101 ggcaaacttt gttccttaag ttgctccgtg tttccctgag cagacataac cactctgcaa
     2161 cagcattgcc ctgctgtaga atatataaag tgtaactaca agcttagacc ttctgttctg
     2221 atgcatccga agtacgtaat gcactgacca tttcacccgg tatcagatgt tttctgtgtg
     2281 gcccctagct ttccttcaac atgcattcgg ttccatatat gaatccaaag tggaccccct
     2341 aactggtctc tgaaatctgc atgtacttca cattttctat atttgtaact ttgcatgtcc
     2401 ttgttttgtc atataaaaag tttataaatg tttgctatct gactgacatt aaataggagc
     2461 ta
  
Ancestral rodent 3' UTR, from outgroup arbitration on (hamster, (rat,mouse)): gives 50% identity with the human sequence (which is 1606 bp vs 1242 bp; large gaps occur centrally).
>ancest_rod
gggaggccttcctgcttgttccttctcattctcgtggtctaggctgggggaggggttatccacctgtagctctttcaattgaggtggtgtctcattcttgcttctctttgtcccccataggctaatacccttggcactgatgggcccggggaaatgtacagtagaccagatgctattcgcttcagcgtcctttgattgagtccatcatgggccagggctaacaccaggccaataagaatataacaccaaataactgctggctagtcagggctttgttttggtctagtgagtaaatactggtgtaacccctgacttgtacccagagtacatggtgacagagacacacataacttagtataggcaaagggttctacagccaaagaagccactgtttgggcatggcaccctggataacagcctcccacctgggatatctagagcatccacatgtggaattctttctttactaacaaaccatagctgattgaaggcaacaggaaaaaaaaatcaaattatcctactgacattgaaagcaaacctgtgttcattccctaggcactagaatgatttttagccttgcttggattgaaccaggagattttgactctgaggagagccagcactgtacaaaagcatggtcctcctgtgatgggagagatggttaagggacaaaggcaagacccttgcgtttcttcatttctgtctcataattatcaaaagctagaattaggtcgtgccctaagtttctgtacttgtatttgaactggacaacaaagagacaatctaaaaattctcttaggctgcagatgagagaaataggctccattccaaagtggaaagagaaattctgctagcattgtttaagtaaggcaaactttgttccttaagtcgctttgtatttcccccagcagacataacaaccctgcgatcggttcagcttgcactgcgggtgttctgtgtagaatatataaatataacttcaagcttaggccttctattttaaaacatctgaagtgtggaacgcactggccgttccatccagtactaaatgcttaccgtgtgacccttgggctttcaacgtgcactcggttccgtatgattccaaagtagacccctagctggtcttcgaatctgcatgtacttcacgttttctatatttgtaactttgcatgtatttgttttgtcatataaaaagtttataaatgtttgctatctgactgacattaaatagaagctatgatg
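Outgroup arbitration on (hamster, (rat,mouse)) can be sketched column by column on pre-aligned rows. The rules below are one plausible reading, not necessarily the procedure used to build ancest_rod: where rat and mouse agree, their base is taken; where they differ, hamster breaks the tie; where all three differ, the column is marked unresolved. The function name `arbitrate` and the "n" placeholder are assumptions.

```python
def arbitrate(rat, mouse, hamster, unresolved="n"):
    """Infer an ancestral row from three aligned rows of equal length."""
    if not (len(rat) == len(mouse) == len(hamster)):
        raise ValueError("rows must be aligned to the same length")
    columns = []
    for r, m, h in zip(rat, mouse, hamster):
        if r == m:
            columns.append(r)           # sister species agree
        elif h == r or h == m:
            columns.append(h)           # outgroup sides with one of them
        else:
            columns.append(unresolved)  # three-way disagreement
    # Columns inferred as gaps drop out of the ungapped ancestral sequence.
    return "".join(columns).replace("-", "")


# Toy rows for illustration only, not taken from the sequences above.
print(arbitrate("acgt-a", "acct-a", "acgtta"))
# prints acgta
```

Unresolved columns are left visible as "n" so they can be inspected by hand.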
mink S46825: 1632 bp post gene; the 4.6k polyA site is at 1593; only exon 3 is known. The mink 3' UTR is extremely valuable because it forms an outgroup to cow and sheep that is more recent than the human or rodent sequences. It aligns fairly well with the reduced sheep sequence, averaging 75% identity in the blocks of alignment. Mink lacks any sequence similarity to the 3 ruminant insertions Bov-B, Bov-tA3, and OaMAR1, indicating that these appeared subsequently in the artiodactyl lineage.
                                               ggatgg ccttcccatt ctctccatcg
      841 tcttcacctt ttacaggttg ggggaggggg tgtctaccta cagccctgta gtggtggtgt
      901 ctcattcctg cttctcttta tcacccatag gctaatcccc ttggccctga tggccctggg
      961 aaatgtagag cagacccagg atgctattta ttcaagcccc catgtgttgg agtccttcag
     1021 gggccaatgc tagtgcaggg ctgagaataa cagcaaatca tcattggttg acctagggct
     1081 gcttttttgt tgttgttgtc tagtgcagct gaccgaggct aaaacaattc tcaaaacagt
     1141 tttcaaatac ctttgcctgg aaacctctgg ctcctgctgc agctagagct cagtacatta
     1201 atgtcccatc ttagccgtgt cttcatagca acttggggaa gtttttctcc ccactctaaa
     1261 agaacgcgat tgcacttccc tgtgcaaaga acatttctgc caaatttgaa aggaggccac
     1321 atgatattca ttcaaaaagc aaaactagaa accctttgct cttggacgca agcccggcct
     1381 gctaggagca ccaaactggg gcgatggttt gcattctgcg gcgtgggcta tgcggcagcc
     1441 gaggtgtcca gcgtaaatat tgatgcgacg ctagacctag gcagaggatg tttgcacagg
     1501 gaatgaacat aatcaacagt gcgaaaatgc tacaaaaaat cccacactgg ggagcagtgt
     1561 ccttggaggc aagttttttt ccttttggga catttaaagc ccctatatgt ggcattcctt
     1621 tctttcgtaa cctaaactat agatAATTAA ggcagttaaa aattgaactt ccttccaggc  2.1k sheep homologue
     1681 cccaagagca aatctttgtt cacttacctg gaaaccagaa tgattttgac acagaggaag
     1741 gtgcagctgt taaaataacc ctcatcctag aagattgcat catggagaaa acgatccgta
     1801 gacaaaaatg atcgcatttc ttcattgctg tctcgtaatt gacagaaacc agaattatgt
     1861 caagtcctag tttctataat cagcttttga atcaaagaat ggaagtccat ccaaaaaaaa
     1921 aaaagaaata ccttaggtca cccatgacag aaatacccat tcaggttaga aaaaaggaat
     1981 tctgttaact gttatttaag taaggcaaaa ttattgtccg gattgttcga tatcatcagc
     2041 tagcagataa attagcattc tgcaatgttc ccggcttgca ctgtgcgggt atttgatgtt
     2101 aaaaaaaatt attatatata ttgtgtatga caaacttaga agtttttgct agaggagtta
     2161 acatctgata tatctaatgc accaccagtt ttggaaggta ctaaatactt aatatgtaga
     2221 aatccttttg cgtggtcctc aggcttacac gtgcactgaa tagttttgta tgatagagcc
     2281 catgtggtct tcgaaatatg catgtacttt atattttcta tatttgtaac tgggcatgta
     2341 cttgtataaa aaatgtataa acattcgaac tcttgactag aATTAAAcag gaactgagtg  poly A signal
     2401 TGTCCCA.tgt gtttgcagtg acattcacca ccgcaccctg tgttgg                poly A site

Mink aligns well with the human 3' UTR, at 71% identity with either Blast or ClustalW:
Human: 1    ggaaggtcttcctgttttcaccatc-t-ttctaatctttttccagcttgagggaggcggt 58
            ||| || |||||  || || ||||| | |||  | ||||| | || ||| |||||| |||
Mink : 1    ggatggccttcccattctctccatcgtcttc--accttttac-aggttgggggagggggt 57

                                                                        
Human: 59   atccacctgcagcccttttagtggtggtgtctcactctttcttctctctttgtc--ccgg 116
             || |||| ||||||| | ||||||||||||||| || | |||||||  || ||  ||  
Mink : 58   gtctacctacagccctgt-agtggtggtgtctcattcctgcttctct--ttatcaccc-- 112

                                                                        
Human: 117  ataggctaatcaatacccttggcactgatgggcactggaaaacatagagtagacctgag- 175
            |||||||||||    |||||||| ||||||| | |||| |||  ||||| |||||  || 
Mink : 113  ataggctaatc----cccttggccctgatggcc-ctgggaaatgtagagcagaccc-agg 166

                                                                        
Human: 176  atgctggt----caagccccctt-tgattg-agttcatcatgagccgttgctaatgccag 229
            |||||  |    ||||||||| | || ||| ||| | ||| | |||  ||||| ||| ||
Mink : 167  atgctatttattcaagcccccatgtg-ttggagtccttcaggggccaatgctagtgc-ag 224

                                                                        
Human: 230  gccagtaaaagtataacagcaaataaccattggttaatct--gga--cttattt----tt 281
            | | |    || |||||||||||| | |||||||| | ||  ||   ||| |||    ||
Mink : 225  ggctg----agaataacagcaaatcatcattggttgacctagggctgcttttttgttgtt 280

                                                                        
Human: 282  g--gacttagtgcaacaggttgaggctaaaacaaatctcagaacagtctg-aaatacctt 338
            |  | || |||||| | |   ||||||||||||| ||||| |||||| |  |||||||||
Mink : 281  gttgtct-agtgcagctgaccgaggctaaaacaattctcaaaacagttttcaaatacctt 339

                                                                        
Human: 339  tgcctggatacctctggctccttcagcagctagagctcagtatactaatg-ccctatctt 397
            |||||||| ||||||||||||| | ||||||||||||||||| | ||||| ||| |||||
Mink : 340  tgcctggaaacctctggctcctgctgcagctagagctcagtacattaatgtccc-atctt 398

                                                                        
Human: 398  agtagagat-ttcatagctatttagagata-tttt-----ccatttt--aagaa----a- 443
            ||  | | | |||||||| | || | || | ||||     ||| | |  |||||    | 
Mink : 399  agccgtg-tcttcatagcaacttgggga-agtttttctccccactctaaaagaacgcgat 456

                                                                        
Human: 444  ---ac--cc--g-----acaacatttctgccaggtttgttaggaggccacatgatactt- 490
               ||  ||  |     | |||||||||||||  ||||  |||||||||||||||| || 
Mink : 457  tgcacttccctgtgcaaagaacatttctgccaaatttgaaaggaggccacatgata-ttc 515

                                                                        
Human: 491  attcaaaaa--aatcctagagattcttagctcttgggatgcaggctcagcccgctggagc 548
            |||||||||  ||  ||||| |  ||| |||||| ||| ||| || | ||| |||  || 
Mink : 516  attcaaaaagcaaaactagaaaccctttgctctt-ggacgcaagcccggcctgct--ag- 571

                                                                        
Human: 549  atgagctctgtgtgtaccgagaactggggtgatgttttac-ttttcacagtatgggcta- 606
              ||||         ||| | |||||||| |||| ||| | || |  | |  ||||||| 
Mink : 572  --gagc---------acc-a-aactggggcgatggtttgcattct-gcggcgtgggctat 617

                                                                        
Human: 607  -c-acagc--agctgttcaacaagagtaaatattg-tcacaacact-gaacctctggcta 660
             |  ||||  || || ||  | || |||||||||| |  | || || | ||||  ||| |
Mink : 618  gcggcagccgaggtg-tc--c-agcgtaaatattgat-gcgacgctag-acct-aggc-a 669

                                                                        
Human: 661  gaggacatatt--cacag----tgaacataactgtaacatatatg---aaaggcttctgg 711
            ||||  || ||  |||||    ||||||||| |  ||||    ||   ||| |||     
Mink : 670  gagg--atgtttgcacagggaatgaacataa-t-caaca---gtgcgaaaatgct----- 717

                                                                        
Human: 712  gacttgaaat--ca-aatgtttggga--atggtgcccttggaggcaa-----cctcccat 761
             ||   ||||  || | ||   ||||  |  ||| ||||||||||||       |||  |
Mink : 718  -acaaaaaatcccacactg---gggagca--gtgtccttggaggcaagtttttttcc--t 769

                                                                        
Human: 762  ttt-agatgtttaaaggaccctatatgtggcattcctttctt--------taaactatag 812
            |||  ||  |||||| | ||||||||||||||||||||||||        ||||||||||
Mink : 770  tttgggacatttaaa-gcccctatatgtggcattcctttctttcgtaacctaaactatag 828

                                                                        
Human: 813  gtAATTAAggcagctgaaaagt-aaattgccttctagacactgaag-gcaaatctccttt 870  2.1k sheep homologue
             |||||||||||| | |||| | || || ||||| || | |  ||| ||||||   ||||
Mink : 829  atAATTAAggcagttaaaaattgaactt-ccttccaggc-cccaagagcaaat---cttt 883  2.1k sheep homologue

                                                                        
Human: 871  gtccatttacctggaaaccagaatgattttgacatacaggagagctgcagttg-tgaaag 929
            || || |||||||||||||||||||||||||||| | |||| || ||||| || | ||| 
Mink : 884  gttcacttacctggaaaccagaatgattttgacacagagga-aggtgcagctgttaaaat 942

                                                                        
Human: 930  caccatcatcatagaggatgatg--taat-taaaaaatggtcagtgtgcaaaga-aaaga 985
             ||| ||||| |||| |||  ||  | ||  | |||| | || ||      ||| ||| |
Mink : 943  aaccctcatcctagaagat--tgcatcatggagaaaacgatccgt------agacaaa-a 993

                                                                        
Human: 986  actgcttgcatttctttatttctgtctcataattgtcaaaaaccagaattaggtcaagtt 1045
            | || | ||||||||| ||| ||||||| |||||| || |||||||||||| |||||| |
Mink : 994  a-tgatcgcatttcttcattgctgtctcgtaattgacagaaaccagaattatgtcaag-t 1051

                                                                        
Human: 1046 catagtttctgtaattggcttttgaatcaaagaatagggagacaatctaaa--------- 1096
            | |||||||| ||||  |||||||||||||||||| || || | ||| |||         
Mink : 1052 cctagtttctataatcagcttttgaatcaaagaat-ggaagtccatccaaaaaaaaaaaa 1110

                                                                        
Human: 1097 -aaatatcttaggttgga--gatgacagaaata-tgatt--gatttgaagtggaaaaaga 1150
             ||||| |||||||   |   ||||||||||||   |||  | || ||     |||||| 
Mink : 1111 gaaataccttaggt--cacccatgacagaaatacccattcaggttaga-----aaaaagg 1163

                                                                        
Human: 1151 aattctgttaa-tgttaattaa--a--gtaaaattat--tccctgaattgtttgatattg 1203
            ||||||||||| ||||| ||||  |  | ||||||||  |||  | |||||| |||||  
Mink : 1164 aattctgttaactgttatttaagtaaggcaaaattattgtcc--ggattgttcgatatca 1221

                                                                        
Human: 1204 tcacctagcagatatgtatta-cttttctgcaatgttattattggcttgcactttgtgag 1262
            ||| ||||||||||   |||| |  ||||||||||   ||   |||||||||| || | |
Mink : 1222 tcagctagcagata--aattagc-attctgcaatg---ttcccggcttgcactgtgcggg 1275

                                                                        
Human: 1263 tattct-atg-taaaaatatatatgtatataaaatatatattgcataggacagacttagg 1320
            |||| | ||| |||||| | | || ||| |   ||||||||||  || |||| |||||| 
Mink : 1276 tatt-tgatgttaaaaa-a-a-at-tat-t---atatatattgtgtatgacaaacttaga 1326

                                                                        
Human: 1321 ag-ttttgtttagagcagttaacatctga-agtgtctaatgcatta--acttttgtaagg 1376
            || |||||  ||||| ||||||||||||| | | |||||||||  |  | ||||| ||||
Mink : 1327 agtttttg-ctagaggagttaacatctgata-tatctaatgcaccaccagttttggaagg 1384

                                                                        
Human: 1377 tactgaatacttaatatgtgggaaacccttttgcgtggtccttaggcttacaatgtgcac 1436
            |||| ||||||||||||||  |||| |||||||||||||||| |||||||| | ||||||
Mink : 1385 tactaaatacttaatatgt-agaaatccttttgcgtggtcctcaggcttac-acgtgcac 1442

                                                                        
Human: 1437 tgaatcgtttcatgtaagaatccaaagtggacaccat-taacaggtctttgaaatatgca 1495
            ||||| ||||  |||| | ||  | ||    | |||| |    |||||| ||||||||||
Mink : 1443 tgaatagttt--tgtatg-at--agag----c-ccatgt----ggtcttcgaaatatgca 1488

                                                                        
Human: 1496 tgtactttatattttctatatttgtaactttgcatgttcttgttttgttatataaaaaaa 1555
            |||||||||||||||||||||||||||||  |||||| ||||      |||   ||||||
Mink : 1489 tgtactttatattttctatatttgtaactgggcatgtacttg------tat---aaaaaa 1539

                                                        
Human: 1556 t-tgtaaatgtt-taa-tatctgact-gaaattaaacgagcgaa 1595
            | | ||||  ||  || | | ||||| | |||||||| || |||
Mink : 1540 tgtataaacattcgaactct-tgactag-aattaaac-ag-gaa 1579
CLUSTAL W (1.74) multiple sequence alignment

human           GGAAGGTCTTCCTGTTTTCACCATCTTTCTAATCTTTTTCCAGCTTGAGGGAGGCGGTAT 60
mink            GGATGGCCTTCCCATTCTCTCCATCGTCTTCACCTTTTAC-AGGTTGGGGGAGGGGGTGT 59
                *** ** *****  ** ** ***** *  * * ***** * ** *** ****** *** *

human           CCACCTGCAGCCCTTTTAGTGGTGGTGTCTCACTCTTTCTTCTCTCTTTGTCCCGGATAG 120
mink            CTACCTACAGCCCTGT-AGTGGTGGTGTCTCATTCCTGCTTCTC--TTTATCACCCATAG 116
                * **** ******* * *************** ** * ******  *** ** *  ****

human           GCTAATCAATACCCTTGGCACTGATGGGCACTGGAAAACATAGAGTAGACCTGAGATGCT 180
mink            GCTAATC----CCCTTGGCCCTGATGG-CCCTGGGAAATGTAGAGCAGACCCAGGATGCT 171
                *******    ******** ******* * **** ***  ***** *****   ******

human           GGT----CAAGCCCCCTT-TGATTGAGTTCATCATGAGCCGTTGCTAATGCCAGGCCAGT 235
mink            ATTTATTCAAGCCCCCATGTGTTGGAGTCCTTCAGGGGCCAATGCTAGTGC-AGGGCTG- 229
                  *    ********* * ** * **** * *** * ***  ***** *** *** * * 

human           AAAAGTATAACAGCAAATAACCATTGGTTAATCT-GGACTTATTTTTGGACTT------- 287
mink            ---AGAATAACAGCAAATCATCATTGGTTGACCTAGGGCTGCTTTTTTGTTGTTGTTGTC 286
                   ** ************ * ******** * ** ** **  ***** *   *       

human           -AGTGCAACAGGTTGAGGCTAAAACAAATCTCAGAACAGTCTG-AAATACCTTTGCCTGG 345
mink            TAGTGCAGCTGACCGAGGCTAAAACAATTCTCAAAACAGTTTTCAAATACCTTTGCCTGG 346
                 ****** * *   ************* ***** ****** *  ****************

human           ATACCTCTGGCTCCTTCAGCAGCTAGAGCTCAGTATACTAATGCCCTATCTTAGTAGAGA 405
mink            AAACCTCTGGCTCCTGCTGCAGCTAGAGCTCAGTACATTAATGTCCCATCTTAGCCGTGT 406
                * ************* * ***************** * ***** ** *******  * * 

human           TTTCATAGCTATTTAGAGATATTTT-----CCATTTTAAGAAAACCCGAC---------- 450
mink            CTTCATAGCAACTTGGGGAAGTTTTTCTCCCCACTCTAAAAGAACGCGATTGCACTTCCC 466
                 ******** * ** * **  ****     *** * *** * *** ***           

human           ---------AACATTTCTGCCAGGTTTGTTAGGAGGCCACATGATACTTATTCAAAAA-- 499
mink            TGTGCAAAGAACATTTCTGCCAAATTTGAAAGGAGGCCACATGATATTCATTCAAAAAGC 526
                         *************  ****  **************** * *********  

human           AATCCTAGAGATTCTTAGCTCTTGGGATGCAGGCTCAGCCCGCT-GGAGCATGAGCTCTG 558
mink            AAAACTAGAAACCCTTTGCTCTTGG-ACGCAAGCCCGGCCTGCTAGGAGCACCA------ 579
                **  ***** *  *** ******** * *** ** * *** *** ******  *      

human           TGTGTACCGAGAACTGGGGTGATGTTTTACTTTTCACAGTATGGGCTACACAGCAGC--- 615
mink            -----------AACTGGGGCGATGGTTTGCATTCTGCGGCGTGGGCTATGCGGCAGCCGA 628
                           ******** **** *** * **   * *  *******  * *****   

human           --TGTTCAACAAGAGTAAATATTGTCACAACACTGAACCTCTGGCTAGAGGACATATTCA 673
mink            GGTGTCCAGC----GTAAATATTGATGCGACGCTAGACCTA-GGC-AGAGGATGTTTGCA 682
                  *** ** *    **********   * ** **  ****  *** ******  * * **

human           CAG----TGAACATAACTGTAACATATATGAAAGGCTTCTGGGACTTGAAATCAAATGTT 729
mink            CAGGGAATGAACATAATC--AACAGTGCGAAAATGCT------ACAAAAAATCCCACACT 734
                ***    *********    ****      *** ***      **   *****  *   *

human           TGGGAATGGTGCCCTTGGAGGCAACCTC----CCATTTTAGATGTTTAAAGGACCCTATA 785
mink            GGGGAGCAGTGTCCTTGGAGGCAAGTTTTTTTCCTTTTGGGACATTTAAAGC-CCCTATA 793
                 ****   *** ************  *     ** ***  **  *******  *******

human           TGTGGCATTCCTTTCTTT--------AAACTATAGGTAATTAAGGCAGCTGAAAAGTAAA 837
mink            TGTGGCATTCCTTTCTTTCGTAACCTAAACTATAGATAATTAAGGCAGTTAAAAATTGAA 853
                ******************        ********* ************ * **** * **

human           TTGCCTTCTAGACACTGAAGGCAAATCTCCTTTGTCCATTTACCTGGAAACCAGAATGAT 897
mink            CTTCCTTCCAGGCCCCAAGAGCAAATCT---TTGTTCACTTACCTGGAAACCAGAATGAT 910
                 * ***** ** * *  *  ********   **** ** *********************

human           TTTGACATACAGGAGAGCTGCAGTTGTGAAAGCA-CCATCATCATAGAGGATGATGTAAT 956
mink            TTTGACACAGAGGA-AGGTGCAGCTGTTAAAATAACCCTCATCCTAGAAGATTGCATCAT 969
                ******* * **** ** ***** *** ***  * ** ***** **** ***    * **

human           TAA-AAAATGGTCAGTGTGCAAAGAAAAGAACTGCTTGCATTTCTTTATTTCTGTCTCAT 1015
mink            GGAGAAAACGATCCGT------AGACAA-AAATGATCGCATTTCTTCATTGCTGTCTCGT 1022
                  * **** * ** **      *** ** ** ** * ********* *** ******* *

human           AATTGTCAAAAACCAGAATTAGGTCAAGTTCATAGTTTCTGTAATTGGCTTTTGAATCAA 1075
mink            AATTGACAGAAACCAGAATTATGTCAAGTCC-TAGTTTCTATAATCAGCTTTTGAATCAA 1081
                ***** ** ************ ******* * ******** ****  *************

human           AGAATAGGGAGACAATCTAAAAAA----------TATCTTAGGTTGGAGATGACAGAAAT 1125
mink            AGAAT-GGAAGTCCATCCAAAAAAAAAAAAGAAATACCTTAGGTCACCCATGACAGAAAT 1140
                ***** ** ** * *** ******          ** *******     ***********

human           ATG-ATTGATTTGAAGTGGAAAAAGAAATTCTGTTAA-TGTTAATTAAA----GTAAAAT 1179
mink            ACCCATTCAGGTTA---GAAAAAAGGAATTCTGTTAACTGTTATTTAAGTAAGGCAAAAT 1197
                *   *** *  * *   * ****** *********** ***** ****     * *****

human           TATTCCCTGAATTGTTTGATATTGTCACCTAGCAGATATGTATTACTTTTCTGCAATGTT 1239
mink            TATTGTCCGGATTGTTCGATATCATCAGCTAGCAGATA--AATTAGCATTCTGCAATGTT 1255
                ****  * * ****** *****  *** **********   ****   ************

human           ATTATTGGCTTGCACTTTGTGAGTATTCTATGTAAAAATATATATGTATATAAAATATAT 1299
mink            CC---CGGCTTGCACTGTGCGGGTATTTGATGTTAAAAAAAAT-------TATTATATAT 1305
                      ********** ** * *****  **** **** * **       **  ******

human           ATTGCATAGGACAGACTTAGGAGTTTTGTTTAGAGCAGTTAACATCTGAAGTGTCTAATG 1359
mink            ATTGTGTATGACAAACTTAGAAGTTTTTGCTAGAGGAGTTAACATCTGATATATCTAATG 1365
                ****  ** **** ****** ******   ***** *************  * *******

human           CATTAAC--TTTTGTAAGGTACTGAATACTTAATATGTGGGAAACCCTTTTGCGTGGTCC 1417
mink            CACCACCAGTTTTGGAAGGTACTAAATACTTAATATGTAG-AAATCCTTTTGCGTGGTCC 1424
                **  * *  ***** ******** ************** * *** ***************

human           TTAGGCTTACAATGTGCACTGAATCGTTTCATGTAAGAATCCAAAGTGGACACCATTAAC 1477
mink            TCAGGCTTACAC-GTGCACTGAATAGTTT--TGTATGA--------TAGAGCCCAT---G 1470
                * *********  *********** ****  **** **        * **  ****    

human           AGGTCTTTGAAATATGCATGTACTTTATATTTTCTATATTTGTAACTTTGCATGTTCTTG 1537
mink            TGGTCTTCGAAATATGCATGTACTTTATATTTTCTATATTTGTAACTGGGCATGTACTTG 1530
                 ****** ***************************************  ****** ****

human           TTTTGTTATATAAAAAAATTGTAAATGTTTAATATCT-GACTGAAATTAAAC--GAGCGA 1594
mink            T--------ATAAAAAATGTATAAACATTCGAACTCTTGACTAGAATTAAACAGGAACTG 1582
                *        ********  * ****  **  *  *** ****  ********  ** *  

human           AGATGAGCACCA---------------- 1606
mink            AG-TGTGTCCCAtgtgtttgcagtgacattcaccaccgcaccctgtgttgg 1632
                ** ** *  ***              
Gallus gallus (chicken) M61145

3' UTR [1210 bp; no homology to any known sequence]

 ga tgccgtgccc cggccctgtg gcagtgagat gacatcgtgt
     1021 ccccgtgccc acccatgggg tgttccttgt cctcgctttt gtccatcttt ggtgaagatg
     1081 tccccccgct gcctccccgc aggctctgat ttgggcaaat gggaggggat tttgtcctgt
     1141 cctggtcgtg gcaggacggc tgctggtggt ggagtgggat gcccaaaaaa tggccttcac
     1201 cacttcctcc tcctcttcct ttctggggcg gagatatggg ctcgtccagc ccttattgtc
     1261 cctgcaagag cgtatctgaa aatcctcttt gctaacaagc agggttttac ctaatctgct
     1321 tagccccagt gacagcagag cgcctttccc cagggcacac caaccccaag ctgaggtgct
     1381 tggcagccac acgtcccatg gaggctgatg ggttttgggg cgtcccaagc aacaccctgg
     1441 gctactgagg tgcaattgta gctctttaat ctgccaatcc caaccctacc gtgtagatag
     1501 gaactgcctg ctctgcattt tgcatgctgc aaacacctcc tgccgcagcg cccccaaaat
     1561 agagtgattt gggaatagtg aggctgaagc cacagcagct tgggattggg ctcatcatat
     1621 caatccatga tgctttgctt ccagctgagc ctcactgccc ttttatagcc tgcccagagg
     1681 aagggagcgc tgctaaatgc ccaaaaaggt aacactgagc aaaagcttat ttcaatgtat
     1741 gatagagaac gagtgcatct cgcacagatc agccatggga gcatcgtttg ccatcagccc
     1801 caaaacccaa aggatgctaa aatgcagcca aaggggaatc aagcacgcag ggaaggactt
     1861 gaatcagctc aactggattg aaatggcaaa aggcatgagt agaacgaacg gcaaggggat
     1921 gctggagatc cacctcctgt gagcaaattg ttcgatgcag ccaatggaac tattgcttct
     1981 tgtgcttcag ttgctgctga tgtgtacata ggctgtagca tatgtaaagt tacacgtgtc
     2041 aagctgctcg caccgcgtag agctaatatg tatcatgtat gtgggcactg aatgccaccg
     2101 ttggccatac ccaaccgtcc taaacgattt tcacgtcgct gtaacttaag tggagataca
     2161 ctttcagtat attcagcaaa aggaattc
set of fasta sequences
Sequence 1: sheep_reduced      1481 bp
Sequence 2: bovine_reduced     1526 bp
Sequence 3: human              1606 bp
Sequence 4: mink               1632 bp
Start of Pairwise alignments
Aligning...
Sequences (2:3) Aligned. Score:  58
Sequences (1:2) Aligned. Score:  94
Sequences (3:4) Aligned. Score:  70
Sequences (2:4) Aligned. Score:  54
Sequences (1:3) Aligned. Score:  57
Sequences (1:4) Aligned. Score:  54
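
The pairwise scores above are, as far as I know, percent identities over the aligned positions. Below is a minimal sketch of such a score for two gapped, equal-length alignment strings. The function name and the choice to skip every column containing a gap are assumptions for illustration; this is not CLUSTAL W's own code, and its gap handling may differ in detail.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity over columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned strings must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a.lower(), b.lower()):
        if x == "-" or y == "-":
            continue  # gap columns are not counted (assumption)
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy example: 7 ungapped columns, 6 of them identical.
print(round(percent_identity("acgtacgt", "acgaac-t"), 1))  # 85.7
```

Feeding it the gapped human and mink rows of the alignment would give a figure comparable to, though not necessarily identical with, the reported score.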
>sheep_reduced 1481 bp
gggcaaccttcctgttttcattatcttcttaatctttgccaggttgggggagggagtgtctacctgcagccctgtagtggtggtgtctcatttcttgcttctctcttgttacctgtataataatacccttggcgcttacagcactgggaaatgacaagcagacatgagatgctgtttattcaagtcccattagctcagtattctaatgtcccatcttagcagtgattttgtagcaattttctcatttgtttcaagaacacctgactacatttccctttgggaatagcatttctgccaagtctggaaggaggccacataatattcattcaaaaaaacaaaactggaaatccttagttcatagacccagggtccaccctgttgagagcatgtgtcctgtgtctgcagagaactataaaggatattctgcattttgcaggttacatttgcaggtaacacagccatctattgcatcaagaatggatattcatgcaacctttgacttatgggcagaggacattttcacaaggaatgaacataatacgaaaggcttctgagactaaaaaattccaacatatggaagaggtgcccttggtggcagccttccattttgtatgtttaagcaccttcaagtgatattcctttctttagtaacataaagtatagataattaaggtaccttaattaaactaccttctagacactgagagcaaatctgttgtttatctggaacccaggatgattttgacattgctcccagattttaacatagagaatgcagatacaaaaactccatattcatttgattgaatcttttcctgaaccagtgctagtgttggactggtaagagtataacagcatatataggttatgtgatgaagagaatagtgtacatgaaatatgtgcatttctttattgctgtcttataattgtcaaaaaagaaagttaggtccttggtttctgtaaaattgacttgaatcaaaagggaggcatttaaagaAATAAAttagagatgatagaaatctgatccattcagagtagaaaaagaaattccattactgttatttaagaaggtaaaattatttcctgaattgttcaatattgtcacctagcagatagacactattattctgtactgtttttactagcttgcaccttgtggtatcctatgtaaaaacgtatttgcatatgacaaactttttctgttagagcaattaacatctgaaccacctaatgcattacctgtttttgtaaggtactttttgtaaggtactaagaacaataaggacAATAAAtgtactgaatacttaaaggaaactcttctgtgttgtccttagccttacagtgtgcactgaatagttttgtataagaatccagagtgatatttgaaatacgcatgtgcttatattttttatatttgtaactttgcatgtacttgttttgtgttaaaagtttataaatatttaatatctgactaaaattaa

>bovine_reduced 1526 bp
gggcaaccttcctgttttcattatcttcttaatctttaccaggttgggggagggagtatctacctgcagccccgtagtggtggtgtctcatttcgtgcttctctctttgttacctgtatgctaatacccttggcgcttatagcactgggaaatgaagagcagacatgagatgctgtttattcaagtcccgttagctcagtatgctaatgccccatcttagcagtgattttgtagcaattttctcatttgtttcaagaacacgtgactacatttcccttttggaatagcatttctgccaagtctggaaggaggccacataatattcattcaaaaaaacaaaccggaaatccttagttcatagacccagggtccacctggttgagagcttgtgtcctgtgtctgcagagaactataaaggatattctgcattttgcaggttacatttgcaggtaacacagccagctattgcatcaagaatggatattcatgcaacctttgacttatgggtagaggacattttcacaaggaatgaacataatacgaaaggcttctgagactaaaaaattccaacatatgggagaggtgcccttggtggcagccttccattttgtatgtttaaagcaccttcaagtggtattcctttctttagtaacaaagtatagataattaagttaccttaatttaattaaactaccttctagacactgagagcaaatctgttgtttatctggaacccaggatgattttgacattgtttagacccagattttaacatagagaatgcagatataaaaactccatattcatttgattgaatcttttccttaaccagtgctagtgttggactggtaagattataacaacaaatataggttatgtgatgaagagaatagtgtacaaagaaaagaaatatgtgcatttctttattgctatcataattgtcaaaaaacaaaattaggtccttggtttctgtaaaattaacttttgaatcaacagggaggcatttaaagaaatatcttaaattagagacagtagaaatctgatacattcagagtggaaaaagaaattctattacgattatttaagaaggtaaaattatttcctgggttgttcaatattgtcacctagcagatagacactattgttctgcactgttattactggcttgcactttgtggtatcctatgtaaaaatacatatattgcatatgacagacttaagaatttctgttagagcaattaacatctgaactatctaatgcattacctgtttttgtaaggtactttttgtaaggtactaaaaacaataacaacAATAAAtgtactgaatacttaaaggaaactcttccgtgttgtccttagccttacagcgtgcactgaatagttttgtataagaatccagagtgatatttgaaatacgcatgtgcttatattttctatatttgtaactttgcatgtacttgttttgtgttaaaagtttataaatatttaatatctgactaaaattaaacaggagctaaaaggagg

>human 1606 bp
ggaaggtcttcctgttttcaccatctttctaatctttttccagcttgagggaggcggtatccacctgcagcccttttagtggtggtgtctcactctttcttctctctttgtcccggataggctaatcaatacccttggcactgatgggcactggaaaacatagagtagacctgagatgctggtcaagccccctttgattgagttcatcatgagccgttgctaatgccaggccagtaaaagtataacagcaaataaccattggttaatctggacttatttttggacttagtgcaacaggttgaggctaaaacaaatctcagaacagtctgaaatacctttgcctggatacctctggctccttcagcagctagagctcagtatactaatgccctatcttagtagagatttcatagctatttagagatattttccattttaagaaaacccgacaacatttctgccaggtttgttaggaggccacatgatacttattcaaaaaaatcctagagattcttagctcttgggatgcaggctcagcccgctggagcatgagctctgtgtgtaccgagaactggggtgatgttttacttttcacagtatgggctacacagcagctgttcaacaagagtaaatattgtcacaacactgaacctctggctagaggacatattcacagtgaacataactgtaacatatatgaaaggcttctgggacttgaaatcaaatgtttgggaatggtgcccttggaggcaacctcccattttagatgtttaaaggaccctatatgtggcattcctttctttaaactataggtaattaaggcagctgaaaagtaaattgccttctagacactgaaggcaaatctcctttgtccatttacctggaaaccagaatgattttgacatacaggagagctgcagttgtgaaagcaccatcatcatagaggatgatgtaattaaaaaatggtcagtgtgcaaagaaaagaactgcttgcatttctttatttctgtctcataattgtcaaaaaccagaattaggtcaagttcatagtttctgtaattggcttttgaatcaaagaatagggagacaatctaaaaaatatcttaggttggagatgacagaaatatgattgatttgaagtggaaaaagaaattctgttaatgttaattaaagtaaaattattccctgaattgtttgatattgtcacctagcagatatgtattacttttctgcaatgttattattggcttgcactttgtgagtattctatgtaaaaatatatatgtatataaaatatatattgcataggacagacttaggagttttgtttagagcagttaacatctgaagtgtctaatgcattaacttttgtaaggtactgaatacttaatatgtgggaaacccttttgcgtggtccttaggcttacaatgtgcactgaatcgtttcatgtaagaatccaaagtggacaccattaacaggtctttgaaatatgcatgtactttatattttctatatttgtaactttgcatgttcttgttttgttatataaaaaaattgtaaatgtttaatatctgactgaaattaaacgagcgaagatgagcacca

>mink 1632 bp
ggatggccttcccattctctccatcgtcttcaccttttacaggttgggggagggggtgtctacctacagccctgtagtggtggtgtctcattcctgcttctctttatcacccataggctaatccccttggccctgatggccctgggaaatgtagagcagacccaggatgctatttattcaagcccccatgtgttggagtccttcaggggccaatgctagtgcagggctgagaataacagcaaatcatcattggttgacctagggctgcttttttgttgttgttgtctagtgcagctgaccgaggctaaaacaattctcaaaacagttttcaaatacctttgcctggaaacctctggctcctgctgcagctagagctcagtacattaatgtcccatcttagccgtgtcttcatagcaacttggggaagtttttctccccactctaaaagaacgcgattgcacttccctgtgcaaagaacatttctgccaaatttgaaaggaggccacatgatattcattcaaaaagcaaaactagaaaccctttgctcttggacgcaagcccggcctgctaggagcaccaaactggggcgatggtttgcattctgcggcgtgggctatgcggcagccgaggtgtccagcgtaaatattgatgcgacgctagacctaggcagaggatgtttgcacagggaatgaacataatcaacagtgcgaaaatgctacaaaaaatcccacactggggagcagtgtccttggaggcaagtttttttccttttgggacatttaaagcccctatatgtggcattcctttctttcgtaacctaaactatagataattaaggcagttaaaaattgaacttccttccaggccccaagagcaaatctttgttcacttacctggaaaccagaatgattttgacacagaggaaggtgcagctgttaaaataaccctcatcctagaagattgcatcatggagaaaacgatccgtagacaaaaatgatcgcatttcttcattgctgtctcgtaattgacagaaaccagaattatgtcaagtcctagtttctataatcagcttttgaatcaaagaatggaagtccatccaaaaaaaaaaaagaaataccttaggtcacccatgacagaaatacccattcaggttagaaaaaaggaattctgttaactgttatttaagtaaggcaaaattattgtccggattgttcgatatcatcagctagcagataaattagcattctgcaatgttcccggcttgcactgtgcgggtatttgatgttaaaaaaaattattatatatattgtgtatgacaaacttagaagtttttgctagaggagttaacatctgatatatctaatgcaccaccagttttggaaggtactaaatacttaatatgtagaaatccttttgcgtggtcctcaggcttacacgtgcactgaatagttttgtatgatagagcccatgtggtcttcgaaatatgcatgtactttatattttctatatttgtaactgggcatgtacttgtataaaaaatgtataaacattcgaactcttgactagaattaaacaggaactgagtgtgtcccatgtgtttgcagtgacattcaccaccgcaccctgtgttgg

>ancest_rod 1242 bp
gggaggccttcctgcttgttccttctcattctcgtggtctaggctgggggaggggttatccacctgtagctctttcaattgaggtggtgtctcattcttgcttctctttgtcccccataggctaatacccttggcactgatgggcccggggaaatgtacagtagaccagatgctattcgcttcagcgtcctttgattgagtccatcatgggccagggctaacaccaggccaataagaatataacaccaaataactgctggctagtcagggctttgttttggtctagtgagtaaatactggtgtaacccctgacttgtacccagagtacatggtgacagagacacacataacttagtataggcaaagggttctacagccaaagaagccactgtttgggcatggcaccctggataacagcctcccacctgggatatctagagcatccacatgtggaattctttctttactaacaaaccatagctgattgaaggcaacaggaaaaaaaaatcaaattatcctactgacattgaaagcaaacctgtgttcattccctaggcactagaatgatttttagccttgcttggattgaaccaggagattttgactctgaggagagccagcactgtacaaaagcatggtcctcctgtgatgggagagatggttaagggacaaaggcaagacccttgcgtttcttcatttctgtctcataattatcaaaagctagaattaggtcgtgccctaagtttctgtacttgtatttgaactggacaacaaagagacaatctaaaaattctcttaggctgcagatgagagaaataggctccattccaaagtggaaagagaaattctgctagcattgtttaagtaaggcaaactttgttccttaagtcgctttgtatttcccccagcagacataacaaccctgcgatcggttcagcttgcactgcgggtgttctgtgtagaatatataaatataacttcaagcttaggccttctattttaaaacatctgaagtgtggaacgcactggccgttccatccagtactaaatgcttaccgtgtgacccttgggctttcaacgtgcactcggttccgtatgattccaaagtagacccctagctggtcttcgaatctgcatgtacttcacgttttctatatttgtaactttgcatgtatttgttttgtcatataaaaagtttataaatgtttgctatctgactgacattaaatagaagctatgatg

>hamster 1214 bp
aggaagcctccctgcttgtacttcctcgttcttgtggtctaggctgggggaggggttatccaccgtagctcttttaattgaggtggtgtctcattcctgcttctctttgtcccccataggctaatgcccttggcactagtgggccctgggaatgtacagtagaccagatgctattcgatccagagcctttgaattgagtccatcacgggccagcactaacaccaggcctatctgaatataacagcaagtaatggctggctagtcagggctttgttttggtctagtgagtaaatactgatgtgaccctctgacttccacacagagtacgcagtgacagacacacctaactgttaaaataggcgaagggttctacagccaaagaagtcactgtttggcatggtccctaagaaacagcctcccatttgggatatttaaagcatccatatgaggcattcctccttcactaacaaactctagctgagtaaggcaacgggaaaaaaacaaaattaccctactaacatggaaagcaaacctgtgttcatttcctaggaactagaatgatgttttagccttgcttggattgaaccaggagattttggctctgaagagccaacactgtaaaaatgtggtcctcctgcaaagggagagatggttaggacacaaagtcacggcgcttggcgtttcttcatttctgtctcataattgtcaaaagtcacaattaggtcatgcccttagttaatatacttgtatttgaatcggacgacaagagacaatctaaaaattctcctaggttgtagatgaaataggctccattcaaggtgaaaagacagtttgttagcgttgcttatgtaaggcaaactttgttccttaagttgctccgtgtttccctgagcagacataaccactctgcaacagcattgccctgctgtagaatatataaagtgtaactacaagcttagaccttctgttctgatgcatccgaagtacgtaatgcactgaccatttcacccggtatcagatgttttctgtgtggcccctagctttccttcaacatgcattcggttccatatatgaatccaaagtggaccccctaactggtctctgaaatctgcatgtacttcacattttctatatttgtaactttgcatgtccttgttttgtcatataaaaagtttataaatgtttgctatctgactgacattaaataggagcta

>rat 1222 bp
ggaggccttcctgcttgttccttctcattctcgtggtctaggctgggggaggggttacccacctgtagctctttcaattgaggtggtgtctcattcttgcttctctttgtcccccataggctaatacccttggcagtgatgggtctggggaaatgtacagtagaccagatgctattcgcttcagcgtcctttgattgagtccatcatgggccagggttaacaccaggccagtaagaatataacaccaaataactgctggctagtcagggctttgttttggtctactgagtaaatactgtgtaacccctgaattgtacccagaggacatggtgacagagacacacataacttagtataggcaaagggttctatagccaaagaagccactgtgtgggcatggcaccctggataacagcctcccgcctgggatatctagagcatccacatgtggaattctttcttttctaacataaaccatagctgattgaaggcaacaagaaaaagaatcaaattatcctactgacattgaaagcaaactgtgttcattccctaggcgctggaatgatttttagccttggattaaaccaggagattttgactctgaggagaaccagcagtacaaaagcatggtctcctgtgatgggagagatggtgaagggacaaaggcaagacccctgcgtttcttcatttctgtctcataattatcaagagctagaattaggtcgtgccctaagtttctgtactcgtatttgaactggacaacaaagagacaatctacaaattctcttgggctgcagaggagagaaataggctccattccaaagtggaaagagaaattctgctagcattgtctaagtaaggctaacttttccttaaatcgctttgtatttcccccagcagacatcacaaccctgtgatcggttcagcctgcaccgcgggtgttctgtgtagaatatataaatataacttcaagcttaggccttctattttaaaacatctgaagtgtggaacgcactggccgttctgtgcagtactaagtgtgacccttgggctttcaatgtgcactcggttccgtatgattccaaagtagagccctagctggtcttcgaatctgcatgtacttcacgttttctatatttgtaacttcgcatgtatttgttttgtcatataaaaagtttataaatgtttgctatctgactgacattaaatagaagctatgatg

>mouse 1234 bp
gggaggccttcctgcttgttccttcgcattctcgtggtctaggctgggggaggggttatccacctgtagctctttcaattgaggtggttctcattcttgcttctctgtgtcccccataggctaatacccctggcactgatgggccctgggaaatgtacagtagaccagttgctctttgcttcaggtccctttgatggagtctgtcatcagccagtgctaacaccgggccaataagaatataacaccaaataactgctggctagttggggctttgttttggtctagtgAATAAAtactggtgtatcccctgacttgtacccagagtacaaggtgacagtgacacatgtaacttagcataggcaaagggttctacaaccaaagaagccactgtttggggatggcgccctggaaaacagcctcccacctgggatagctagagcgtccacacgtggaattctttctttactaacaaacgatagctgattgaaggcaacaggaaaaaaaaaaatcaaattgtcctactgacgttgaaagcaaacctttgttcattcccagggcactagaatgatctttagccttgcttggattgaactaggagatcttgactctgaggagagccagccctgtaaaaagcttggtcctcctgtgacgggagggatggttaaggtacaaaggctagaaacttgagtttcttcatttctgtctcacaattatcaaaagctagaattagcttctgccctatgtttctgtacttctatttgaactggataacagagagacaatctaaacattctcttaggctgcagataagagaagtaggctccattccaaagtgggaaagaaattctgctagcattgtttaaatcaggcaaaatttgttcctgaagttgctttttaccccagcagacataaactgcgatagcttcagcttgcactgtggattttctgtatagaatatataaaacataacttcaagcttatgtcttctttttaaaacatctgaagtgtgggacgccctggccgttccatccagtactaaatgcttaccgtgtgacccttgggctttcagcgtgcactcagttccgtaggattccaaagcagacccctagctggtctttgaatctgcatgtacttcacgttttctatatttgtaactttgcatgtattttgttttgtcatataaaaagtttataaatgtttgctatcagactgacattaaatagaagctatgatg
>SheepBovB
        1 tagggatgtg agagttggac tgtaaagaaa gctgagtgct gaagagttga tgcttttgaa
       61 ctatagtgtt ggagaaaact cttgagagtc ccttggactg aaaggagatc agtcctgaat
      121 attcattgga aggactgatg ctgaagctga aactccaata ctttggtcac ctgatgggaa
      181 gaactgaagg caggagggat gctaggaaag actgaaggca ggaggagaag gggacgacag
      241 aggatgagat ggctagatgg catcatggac tcaatggaca tgagcttaag taaactccag
      301 gagttggcga tggacaggga gacctggcgt cctgcagtcc atggtgtcgc agagtcggac
      361 acgattgagt gactaaattg aggtgaa 
>CowBovB
tagagatg tgagagttga actgtaaaga
     1741 aagctgagtg ctgaagaatt gatgcttttg aactctagtg ttggagaaaa cttgagagtc
     1801 ccttggactg caaggagatc aaattagtcc atcctaaagg agatcagtcc tgaatattca
     1861 ttggaaggac tgatgctgaa cgtgaaactc caatactttg gccacctgat gggaagaact
     1921 gaaggcagga ggagaagggg atgacagagg atgaagatgg ctggatggca tcatggattc
     1981 aatggacatg agcttgagta aactccagga gttggcaatc gacggagtcc tggcatcctg
     2041 cagtccatgg tgtcgcagag ttggacacga ctgagtgact gaactgaggt gaa

>SheepBovtA3 
        1 ggagatgtgg gtttaatccc taggtcaggt aaatccccta gaggaagaaa tggcaaccca
       61 ctccagtatt cttgccagga aaatccagtg ggcagaggag cctggcaggg tacagtctaa
      121 gcatggggtt gcaaagagtg agacaagact tgagctact
>cowBovtA3 
ggagacg
     2641 tgggtttaat ccctaggtca tgtaaatccc ctggaggagg aaatagcaac ccactccagt
     2701 attcttgcca ggagaatccc atgggcagag gagcctggca gggtgcagtc catgcatagg
     2761 gttgcaaaga gtcagacaag acttgagcta ct

>SheepOamar1 
        1 ctgggtcggc taaaaggttc attaggtttt ttttctgtaa gatggctcta gtagtacttg
       61 tctttatctt cattcgaaac aattttgtta gattgtatgt gacagctctt gtatcagcat
      121 gcatttgaaa aaaacatcaa aattggtaaa tttttgtata gccatcttac tattgaagat
      181 ggaagaaaag aagcaaaatt ttcagcatat catgctgtat tatttcaaga aagataacca
      241 aaatgcaaaa atgtatttgt gaagtgtatg gagaaggggc tgcaactgat caagcttgtc
      301 aaagtagttt gtgaagtttc gtgctggaga tttcttattg gacgatgctc cacagttgga
      361 tataccagtt gaagttgata gtgatcaaat tgagatattg agaataatcg atgttatacc
      421 acgcgggaga tagctgacat actcaaaata tccaaataga accttgaaaa ccatttgcac
      481 catctcagtt atgttaatca ctttgatgtt tgagttccac ataagcaaaa aaacaacaac
      541 aacaaaaaaa aacacaacct tgaccatatt tgcgcatgca gttctctact gaaatgattg
      601 aaaacacttt gtttttaaaa acagattttg attaacagtg ggtacgatac aataacgtag
      661 aatggaagaa attgtagggt gagcaaaatg aaccaccacc accaaaggcc agtcttcctc
      721 taaagaagat gtgtgtatgg tgggattgga aagtaatcct ctattatgaa ttcttctgga
      781 aaacactgct cctaattaga ccaactgaaa gcagcactca acgaaaagca tccagaatta
      841 gtcaatagaa aacataatct tccatcagga taacgcaaga ctacatattt ctttgatgac
      901 ccagcatggc tggagtttct gattcatctg ttgtattcag acgttgcatc tttggatttt
      961 ttccatttat ttcagtctac aaaattatca taatggaaaa aatttccatt ccctggaaga
     1021 ttgtaaagtg catctggaaa atttctttgc tcaaaaagat aaaaagtttt gtgaacacag
     1081 aattatgacg ttgcctgaaa aatggcagaa ggtagtggaa caaaagagtg actatgttgt
     1141 ttggtaaagt tcttagtgaa aatgaaaaat gtgtctttta tttttattta aacaccaaag
     1201 gcacattttg gccaacccaa
>cowOamar1 
ctgggtt
     2821 ggctaaaagg ttcattaggt tttttttctg taagatggct gtctttaact tcattcgaaa
     2881 caattttgtt agattgtatg tgacagctct tgtatcagca tgcatttgaa aaagaaaaca
     2941 acttaccaaa attggtgaat ttttgtatag ccattttact attgaagatg gaagaaaaga
     3001 agcaaaattt tcagcatatc atgctgtatt atttcaagaa agataacaca accaaaatgc
     3061 gaaaatgtat ttgtgcagtg tatggagaag gtgctgcaac tgatcaagct tgtcaaagta
     3121 gtttgtgaag tattgtgctg gagatttctt actggacaat gctccacagt cgggtatacc
     3181 agttgaagtt gatagtgatc aaattgagat attgagaaca atcaatgtta taccacgtgg
     3241 gagatagctg acatactcaa aatatccaaa tagaaccttg aaaaccattt gcaccatctc
     3301 agttatgtta ataactttga tgtttgagtt ccacataaat taagcaaaaa aaaaacaaaa
     3361 acaaaaacac acaaccttga ccatatttgc atatgcagtt ctctactgaa atgaatgaaa
     3421 acacttttgt ttttaaaaac agattttgat gaacagtgga tactatacaa taacgtagaa
     3481 tggaaaagac tgtggggtga gcaaaatgaa ccagcaccac caaaggccag gcttcatcca
     3541 aagaagatgt gtgtatggtg ggattggaaa gtaatcctct attatgggat tcttctggaa
     3601 aaccaaaaaa tcaattccaa caagtactgc tcctaattag accaactgaa agcagcattc
     3661 aatgaaaagc atccagaatt agtcaataga aagcatataa tcttccatca ggataacaca
     3721 agactacatt tctttgatga cccagcatgg ctgagaggtt ctgattcacc tgctgtattc
     3781 agacattgca tctttggatt tccatttatt tcagtctaca gaattatcat catgaaaaaa
     3841 atttccattc cctggaagat tgtaaagtgc atctggaaaa cttctttgct caaaaagata
     3901 aaaagttttg tgaacacaga attatgaagt tgcctgaaaa acagcagaag atagtgacta
     3961 tgttgttcag taaagttctt ggtgcaaatg tgtcttttat ttttatttaa acactaaagg
     4021 cacgttttgg ccaacccaa
exon 1 is 134 bp
exon 2 is 99 bp
exon 3 leader is 10 bp
coding + stop is 738 bp
     mRNA            join(12634..12767,15390..15488,25464..27817) = 2587 bp; 3' UTR starts at 26212
3' UTR is 982-2587 = 1606 bp

So human 27026, corresponding to the sheep 2.1 kb homology, is at position 815 of the 3' UTR, 791 bp from the end. This signal would correspond to a 1796 bp mRNA, approximately the 1765 consensus site mentioned by Goldmann et al. using some unknown human sequence.
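
The coordinate bookkeeping can be checked mechanically. The sketch below converts a genomic position to an mRNA position using the join() coordinates quoted in these notes, then to a 3' UTR position using the stated UTR start at mRNA base 982. The function names are hypothetical, and positions are taken as 1-based and inclusive, as in GenBank.

```python
# Exon coordinates copied from the join(...) feature quoted in the notes.
EXONS = [(12634, 12767), (15390, 15488), (25464, 27817)]
UTR_START_MRNA = 982  # first base of the 3' UTR in mRNA numbering (from the notes)


def exon_lengths(exons):
    """Length of each exon, counting both end points."""
    return [end - start + 1 for start, end in exons]


def genomic_to_mrna(pos, exons):
    """Map a 1-based genomic position to its 1-based position in the spliced mRNA."""
    upstream = 0
    for start, end in exons:
        if start <= pos <= end:
            return upstream + (pos - start + 1)
        upstream += end - start + 1
    raise ValueError(f"{pos} does not fall inside any exon")


def mrna_to_utr(pos, utr_start=UTR_START_MRNA):
    """Map an mRNA position to a 1-based position within the 3' UTR."""
    if pos < utr_start:
        raise ValueError(f"{pos} lies upstream of the 3' UTR")
    return pos - utr_start + 1


print(exon_lengths(EXONS), sum(exon_lengths(EXONS)))
m = genomic_to_mrna(27026, EXONS)
print(m, mrna_to_utr(m))
```

Counting conventions matter here: 0-based versus 1-based numbering, or an exclusive versus inclusive end point, shifts every derived figure by one.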
QUERY    2059 aatagggag-ac-aatctaaaaaata-t-cttaggttggagatga-c-agaaat-at-ga 2110
QUERY    2111 ttgatttgaagtggaaaaag-aaattctgtt-aatg-ttaattaaagtaaaattattccc 2167
AA258260 378  .........g.....
So 2587 is the longest and most common mRNA, using all 1606 bp of the 3' UTR.
2098 has 489 fewer bp, so it stops at 1117 of the 3' UTR.
1978 has 609 fewer bp, so it stops at 997.

human mRNA
 1 ccgcccgcga gcgccgccgc ttcccttccc cgccccgcgt ccctccccct cggccccgcg
       61 cgtcgcctgt cctccgagcc agtcgctgac agccgcggcg ccgcgagctt ctcctctcct
      121 cacgaccgag gcaggactcc tgaatatttt tcaaaactga acaatttcag ccatgtctga
      181 gctttccgtc ttcctggagg cacaaatcta gtttagctga accacaacag attagcagtc
      241 attatggcga accttggctg ctggatgctg gttctctttg tggccacatg gagtgacctg
      301 ggcctctgca agaagcgccc gaagcctgga ggatggaaca ctgggggcag ccgatacccg
      361 gggcagggca gccctggagg caaccgctac ccacctcagg gcggtggtgg ctgggggcag
      421 cctcatggtg gtggctgggg gcagcctcat ggtggtggct gggggcagcc ccatggtggt
      481 ggctggggtc aaggaggtgg cacccacagt cagtggaaca agccgagtaa gccaaaaacc
      541 aacatgaagc acatggctgg tgctgcagca gctggggcag tggtgggggg ccttggcggc
      601 tacatgctgg gaagtgccat gagcaggccc atcatacatt tcggcagtga ctatgaggac
      661 cgttactatc gtgaaaacat gcaccgttac cccaaccaag tgtactacag gcccatggat
      721 gagtacagca accagaacaa ctttgtgcac gactgcgtca atatcacaat caagcagcac
      781 acggtcacca caaccaccaa gggggagaac ttcaccgaga ccgacgttaa gatgatggag
      841 cgcgtggttg agcagatgtg tatcacccag tacgagaggg aatctcaggc ctattaccag
      901 agaggatcga gcatggtcct cttctcctct ccacctgtga tcctcctgat ctctttcctc
      961 atcttcctga tagtgggatg aggaaggtct tcctgttttc accatctttc taatcttttt
     1021 ccagcttgag ggaggcggta tccacctgca gcccttttag tggtggtgtc tcactctttc
     1081 ttctctcttt gtcccggata ggctaatcaa tacccttggc actgatgggc actggaaaac
     1141 atagagtaga cctgagatgc tggtcaagcc ccctttgatt gagttcatca tgagccgttg
     1201 ctaatgccag gccagtaaaa gtataacagc aaataaccat tggttaatct ggacttattt
     1261 ttggacttag tgcaacaggt tgaggctaaa acaaatctca gaacagtctg aaataccttt
     1321 gcctggatac ctctggctcc ttcagcagct agagctcagt atactaatgc cctatcttag
     1381 tagagatttc atagctattt agagatattt tccattttaa gaaaacccga caacatttct
     1441 gccaggtttg ttaggaggcc acatgatact tattcaaaaa aatcctagag attcttagct
     1501 cttgggatgc aggctcagcc cgctggagca tgagctctgt gtgtaccgag aactggggtg
     1561 atgttttact tttcacagta tgggctacac agcagctgtt caacaagagt aaatattgtc
     1621 acaacactga acctctggct agaggacata ttcacagtga acataactgt aacatatatg
     1681 aaaggcttct gggacttgaa atcaaatgtt tgggaatggt gcccttggag gcaacctccc
     1741 attttagatg tttaaaggac cctatatgtg gcattccttt ctttaaacta taggtAATTA  2.1k sheep homologue
     1801 Aggcagctga aaagtaaatt gccttctaga cactgaaggc aaatctcctt tgtccattta
     1861 cctggaaacc agaatgattt tgacatacag gagagctgca gttgtgaaag caccatcatc
     1921 atagaggatg atgtaattaa aaaatggtca gtgtgcaaag aaaagaactg cttgcatttc
     1981 tttatttctg tctcataatt gtcaaaaacc agaattaggt caagttcata gtttctgtaa
     2041 ttggcttttg aatcaaagaa tagggagaca atctaaaaaa tatcttaggt tggagatgac
     2101 agaaatatga ttgatttgaa gtggaaaaag aaattctgtt aatgttaatt aaagtaaaat
     2161 tattccctga attgtttgat attgtcacct agcagatatg tattactttt ctgcaatgtt
     2221 attattggct tgcactttgt gagtattcta tgtaaaaata tatatgtata taaaatatat
     2281 attgcatagg acagacttag gagttttgtt tagagcagtt aacatctgaa gtgtctaatg
     2341 cattaacttt tgtaaggtac tgaatactta atatgtggga aacccttttg cgtggtcctt
     2401 aggcttacaa tgtgcactga atcgtttcat gtaagaatcc aaagtggaca ccattaacag
     2461 gtctttgaaa tatgcatgta ctttatattt tctatatttg taactttgca tgttcttgtt
     2521 ttgttatata aaaaaattgt aaatgtttaa tatctgactg aaattaaacg agcgaagatg
     2581 agcacca

Of the suggested polyA signals, the human prion mRNA has only TATAAA, at positions 2269 and 2527.
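
A scan of this kind is easy to reproduce. The sketch below reports every occurrence of a small set of hexamers in a sequence, numbered from a chosen start coordinate. The hexamer list is an illustrative assumption (the two canonical signals plus the two variants discussed in these notes), and find_signals is a hypothetical helper name.

```python
# Candidate polyadenylation hexamers; this list is illustrative, not exhaustive.
SIGNALS = ("AATAAA", "ATTAAA", "TATAAA", "AATTAA")


def find_signals(seq, start=1, signals=SIGNALS):
    """Return (position, hexamer) for each hit, allowing overlapping matches.

    `start` is the coordinate assigned to the first base of `seq`.
    """
    s = seq.upper()
    hits = []
    for i in range(len(s) - 5):
        hexamer = s[i:i + 6]
        if hexamer in signals:
            hits.append((start + i, hexamer))
    return hits


# Bases 2251..2280 of the human mRNA listed above.
fragment = "tgtaaaaatatatatgtatataaaatatat"
print(find_signals(fragment, start=2251))  # [(2269, 'TATAAA')]
```

Running it over the full 2587 bp mRNA with start=1 lists every candidate, including the AATTAA marked as the 2.1 kb sheep homologue.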

ccgcccgcgagcgccgccgcttcccttccccgccccgcgtccctccccctcggccccgcgcgtcgcctgtcctccgagccagtcgctgacagccgcggcgccgcgagcttctcctctcctcacgaccgaggcaggactcctgaatatttttcaaaactgaacaatttcagccatgtctgagctttccgtcttcctggaggcacaaatctagtttagctgaaccacaacagattagcagtcattatggcgaaccttggctgctggatgctggttctctttgtggccacatggagtgacctgggcctctgcaagaagcgcccgaagcctggaggatggaacactgggggcagccgatacccggggcagggcagccctggaggcaaccgctacccacctcagggcggtggtggctgggggcagcctcatggtggtggctgggggcagcctcatggtggtggctgggggcagccccatggtggtggctggggtcaaggaggtggcacccacagtcagtggaacaagccgagtaagccaaaaaccaacatgaagcacatggctggtgctgcagcagctggggcagtggtggggggccttggcggctacatgctgggaagtgccatgagcaggcccatcatacatttcggcagtgactatgaggaccgttactatcgtgaaaacatgcaccgttaccccaaccaagtgtactacaggcccatggatgagtacagcaaccagaacaactttgtgcacgactgcgtcaatatcacaatcaagcagcacacggtcaccacaaccaccaagggggagaacttcaccgagaccgacgttaagatgatggagcgcgtggttgagcagatgtgtatcacccagtacgagagggaatctcaggcctattaccagagaggatcgagcatggtcctcttctcctctccacctgtgatcctcctgatctctttcctcatcttcctgatagtgggatgaggaaggtcttcctgttttcaccatctttctaatctttttccagcttgagggaggcggtatccacctgcagcccttttagtggtggtgtctcactctttcttctctctttgtcccggataggctaatcaatacccttggcactgatgggcactggaaaacatagagtagacctgagatgctggtcaagccccctttgattgagttcatcatgagccgttgctaatgccaggccagtaaaagtataacagcaaataaccattggttaatctggacttatttttggacttagtgcaacaggttgaggctaaaacaaatctcagaacagtctgaaatacctttgcctggatacctctggctccttcagcagctagagctcagtatactaatgccctatcttagtagagatttcatagctatttagagatattttccattttaagaaaacccgacaacatttctgccaggtttgttaggaggccacatgatacttattcaaaaaaatcctagagattcttagctcttgggatgcaggctcagcccgctggagcatgagctctgtgtgtaccgagaactggggtgatgttttacttttcacagtatgggctacacagcagctgttcaacaagagtaaatattgtcacaacactgaacctctggctagaggacatattcacagtgaacataactgtaacatatatgaaaggcttctgggacttgaaatcaaatgtttgggaatggtgcccttggaggcaacctcccattttagatgtttaaaggaccctatatgtggcattcctttctttaaactataggtaattaaggcagctgaaaagtaaattgccttctagacactgaaggcaaatctcctttgtccatttacctggaaaccagaatgattttgacatacaggagagctgcagttgtgaaagcaccatcatcatagaggatgatgtaattaaaaaatggtcagtgtgcaaagaaaagaactgcttgcatttctttatttctgtctcataatt
gtcaaaaaccagaattaggtcaagttcatagtttctgtaattggcttttgaatcaaagaatagggagacaatctaaaaaatatcttaggttggagatgacagaaatatgattgatttgaagtggaaaaagaaattctgttaatgttaattaaagtaaaattattccctgaattgtttgatattgtcacctagcagatatgtattacttttctgcaatgttattattggcttgcactttgtgagtattctatgtaaaaatatatatgtaTATAAAatatatattgcataggacagacttaggagttttgtttagagcagttaacatctgaagtgtctaatgcattaacttttgtaaggtactgaatacttaatatgtgggaaacccttttgcgtggtccttaggcttacaatgtgcactgaatcgtttcatgtaagaatccaaagtggacaccattaacaggtctttgaaatatgcatgtactttatattttctatatttgtaactttgcatgttcttgttttgttaTATAAAaaaattgtaaatgtttaatatctgactgaaattaaacgagcgaagatgagcacca
dog AF022714
ggggcaacct tcctgttttc attat 
cat AF003087
gggcaacct
      781 tcctgttttc attat

PrP (prion) gene expression in sheep may be modulated by alternative polyadenylation of its messenger RNA

J Gen Virol 1999 80: 2275
W. Goldmann, G. O'Neill, F. Cheung, F. Charleson, P. Ford and N. Hunter 
This paper looks at two lengths of mRNA found in various sheep tissues, the 2.1 kb (from TTAAGGTACCTTAATTAAA.CTACCTTCTAGACACTG- ending at 1634 of U67922 mRNA) and the 4.6 kb, which differ only in polyadenylation site. The 4.6 is highest in brain; the 2.1 is highest in spleen. The 2.1 is found in sheep and goat, marginally in cow, and is not seen in human or mouse. [This statement is inconsistent with homology alignment.]
     1561 gtgatattcc tttctttagt aacataaagt atagataatt aaggtacctt aattaaaCTA
     1621 CCTTCTAGAC ACTGagagca aatctgttgt ttatctggaa cccaggatga ttttgacatt
Position 1634 of U67922 mRNA is 26295 of the whole sequence. The 4.6 does not seem to have proper signal-site separation:
     4081 ctcatatgtc atggggcaga gtcaagtccc cattgtgcct gtccaactct ttggcctaca
     4141 caattcatgg gcatAATAAA atggtggttt ctttagacca ttaagttttg gagtagttgc
682 bp after the 2.1 site comes a signal, possibly that of the 2.6k.
     2281 aaattgacttgaatcaaaagggaggcatttaaagaAATAAAttagagatgatagaaatct +36 2316

1130 bp after the 2.1 site comes a signal, possibly that of the 3.3, 14 bp after the LINE element
     2761 gacAATAAAtgctgggtcggctaaaaggttcattaggttttttttctgtaagatggctct
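The offsets quoted above (682 bp and 1130 bp after the 2.1 site) can be rechecked directly from the three sequence fragments. The following is only a minimal sketch: it assumes the leading numbers are U67922 mRNA coordinates of the first base shown, takes 1634 as the 2.1 site, and searches only for the two hexamers AATAAA and ATTAAA. The names FRAGMENTS, SITE_2_1 and find_signals are made up for this illustration.

```python
import re

SITE_2_1 = 1634                      # end of the 2.1 kb mRNA in U67922 mRNA numbering
SIGNALS = ("AATAAA", "ATTAAA")       # the only hexamers searched for here

# Fragments copied from the page, keyed by the coordinate of their first base.
FRAGMENTS = {
    1561: "gtgatattcc tttctttagt aacataaagt atagataatt aaggtacctt aattaaaCTA"
          "CCTTCTAGAC ACTGagagca aatctgttgt ttatctggaa cccaggatga ttttgacatt",
    2281: "aaattgacttgaatcaaaagggaggcatttaaagaAATAAAttagagatgatagaaatct",
    2761: "gacAATAAAtgctgggtcggctaaaaggttcattaggttttttttctgtaagatggctct",
}


def find_signals(fragments, signals=SIGNALS, site=SITE_2_1):
    """Return sorted (position, hexamer, offset from the 2.1 site) tuples."""
    hits = []
    for start, fragment in fragments.items():
        # Drop spaces and case so GenBank-style layout does not matter.
        seq = re.sub(r"[^A-Za-z]", "", fragment).upper()
        for signal in signals:
            idx = seq.find(signal)
            while idx != -1:
                pos = start + idx            # coordinate of the hexamer's first base
                hits.append((pos, signal, pos - site))
                idx = seq.find(signal, idx + 1)
    return sorted(hits)


if __name__ == "__main__":
    for pos, signal, offset in find_signals(FRAGMENTS):
        print(f"{signal} at {pos}, {offset:+d} relative to the 2.1 site")
```

Under these assumptions the scan finds ATTAAA starting 22 bases upstream of the 2.1 site, and AATAAA at 2316 and 2764, that is, 682 and 1130 bases downstream, in agreement with the figures given above.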

Sheep also show a minor 3.3 kb band (amounting to 1-5% of mRNA) and a 2.6 kb species seen in kidney. No tissues are 2.1 only; some 2.1 is found in brain, heart, and liver, contrary to earlier findings. The 2.1 in brain is possibly from specific regions; this was not tested. Goldmann et al. also assert unpublished alternative 5' splicing sites in cow but not sheep.

M131313 and AJ223072 are said to have 6 allelic differences and 30 differences relative to Lee's heroic sequence U67922, and to differ in several positions from Cheviot (Goldmann, unpublished), showing either a high mutation rate or (very likely, for reasons given by Lee) sequencing error. This could have been tested by alignment with cow and mink 3' UTR but was not.

Sheep 3' UTRs have 3 retrotransposons, poorly described in this paper and claimed to have high GC and high GT content, said to function as RNA polymerase stop signals. The mRNA species are not breed, allele, or scrapie related. Regions ABCDEFG were previously defined by Goldmann [PNAS 87: 2476-2480 1990]; it is unclear what significance these have. Goldmann [Brit Med Bulletin 49: 839-59 1993] says retrotransposons are regions D and F, lacking in 2.1. Sheep 2.1 is found in domains ABC, said to be non-homologous to human.

Protein expression was measured in a heterologous system, mouse neuroblastoma. The best protein expression correlated with shortest 3' UTR in region G; short 3' UTR in region C was distinctly less efficient. Poor translation of full-length mRNA in transfected cells, unlike sheep brain, suggests murine neuroblastoma cells were unsuitable for characterizing relative translation efficiencies in vivo in sheep.

Bovine ovary, uterus, and brain also have low 2.1, but it is still 0.5-2% of total mRNA. Horiuchi 1995 compared ovine kidney and brain: kidney had 20% of the brain mRNA level, yet 2.5% of the protein, a factor of 8x less use. Here ovine kidney was 75% 4.6 and 25% 2.1, or 15% and 5% of brain, consistent with but hardly proving that only 2.1 is translated in kidney. Sheep express 4.6 kb mRNA two-thirds through gestation in brain, as does mouse. Various fetal tissues and young lambs and placenta also express the prion gene. Fetal only tonsil had both mRNA forms. Prion mRNA is easily detected at 98 days gestation, is 100x higher at day 134, and 200x higher in early lamb. The 4.6/2.1 ratio was 1 at fetal 98, 3 at fetal 138 and early lamb, and held steady. Thymus had a ratio of 4, which drops during early lamb by a factor of 8.
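The kidney versus brain arithmetic above can be written out explicitly. This is a rough sketch, not a result from either paper. It assumes kidney mRNA is 20% of the brain level, which is the reading under which the 15%/5% split and the 8x figure agree with each other. All variable names are made up for this illustration.

```python
from fractions import Fraction

# All values are kidney levels expressed relative to brain (brain = 1).
kidney_mrna = Fraction(20, 100)      # assumed reading: kidney mRNA is 20% of brain
kidney_protein = Fraction(25, 1000)  # kidney protein is 2.5% of brain

share_46 = Fraction(75, 100)         # fraction of kidney mRNA that is the 4.6 form
share_21 = Fraction(25, 100)         # fraction of kidney mRNA that is the 2.1 form

# Protein made per unit of mRNA, relative to brain.
use_all_mrna = kidney_protein / kidney_mrna   # 1/8, i.e. 8x less use

# Kidney mRNA forms as a fraction of total brain mRNA.
mrna_46_vs_brain = kidney_mrna * share_46     # 3/20 = 15%
mrna_21_vs_brain = kidney_mrna * share_21     # 1/20 = 5%

# Hypothetical case in which only the 2.1 form is translated in kidney.
use_21_only = kidney_protein / mrna_21_vs_brain  # 1/2

if __name__ == "__main__":
    print("protein per mRNA, all kidney mRNA:", use_all_mrna)
    print("4.6 and 2.1 as fraction of brain mRNA:", mrna_46_vs_brain, mrna_21_vs_brain)
    print("protein per mRNA if only 2.1 is translated:", use_21_only)
```

Exact fractions are used so that the 8x factor comes out without floating-point rounding.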

Calf had 4.6 in brain, kidney, spleen, liver, ovary, and uterus (the last two also had some 2.1). Goat spleens had both; brains were 4.6. Humans have 2.5 kb mRNA, in brain 4x as abundant as in liver and heart and 8x that of lung, placenta, muscle, kidney, and pancreas. Humans are said to have 3 additional consensus sites at 1765, 1978, and 2098 that could give shorter mRNAs. This 2.5 kb must be adjusted for exon 1, exon 2, and the leader and coding portion of exon 3 to give numbering in terms of the 3' UTR: exon 1 is 134 bp, exon 2 is 97 bp, exon 3 leader is 11 bp, and coding + stop is 738 bp, totalling 980 bp. Mouse and hamster have 2.5 and a 1.2 in peripheral tissues close to 1152 [Locht PNAS 83: 6372-6376 1986]. No 1.2 seen here.
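The numbering adjustment described for the human mRNA is simple subtraction, sketched below. It assumes the three consensus sites are given in full-length mRNA coordinates and uses the 1606 bp human 3' UTR length quoted elsewhere on this page. The names UPSTREAM_SEGMENTS and mrna_to_utr are made up for this illustration.

```python
# Lengths (bp) of everything upstream of the 3' UTR, as listed in the text.
UPSTREAM_SEGMENTS = {
    "exon 1": 134,
    "exon 2": 97,
    "exon 3 leader": 11,
    "coding + stop": 738,
}
UPSTREAM_TOTAL = sum(UPSTREAM_SEGMENTS.values())   # 980 bp

UTR_LENGTH = 1606                                  # human 3' UTR length from this page
CONSENSUS_SITES = (1765, 1978, 2098)               # assumed to be mRNA coordinates


def mrna_to_utr(position, offset=UPSTREAM_TOTAL):
    """Convert a full-length mRNA coordinate to a 3' UTR coordinate."""
    return position - offset


if __name__ == "__main__":
    print("upstream total:", UPSTREAM_TOTAL)
    print("full-length mRNA:", UPSTREAM_TOTAL + UTR_LENGTH)
    for site in CONSENSUS_SITES:
        print(site, "->", mrna_to_utr(site))
```

On these figures the three sites fall at 785, 998, and 1118 of the 3' UTR, and 980 + 1606 = 2586 bp before the polyA tail, in line with the observed 2.5 kb band.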

Sheep said to have 3220 polyA signal versus 3246 given at GenBank annotation of Lee's UTR. Human 3' UTR consists of 1606 bp. 9 Suffolk polyA signals said to be at 1a AATAAA 1523, 1b TATAAA 1523, AATAAA 2222, AATAAA 2285, AATAAA 2667, ATTAAA 4063. Not in table: 1253, 4038, 4678. Cheviot unpublished have 1253, 1523, 4038, and 4063.

The majority of mRNA is polyadenylated about 20 bp downstream of the polyA signal at 4063 in Cheviot. 2.1 is polyadenylated 23 bp downstream of ATTAAA 1523.

References/Abstracts

Medline search 13 Aug 99

Genomic structure of the bovine PrP gene and complete nucleotide sequence of bovine PrP cDNA.

Anim Genet 1998 Feb;29(1):37-40
Horiuchi M, Ishiguro N, Nagasawa H, Toyoda Y, Shinagawa M 
The extent of intron 2 of the bovine PrP gene and the nucleotide sequence of the 3' half of bovine PrP cDNA is given. This newly sequenced 3' half of the bovine PrP cDNA consisted of 2149 bp. The entire 3'-untranslated region (3'-UTR) was found to be encoded by a single exon, exon 3. One nucleotide polymorphism was found in the 3'-UTR.

Complete genomic sequence and analysis of the prion protein gene region from three mammalian species.

Genome Res 1998 Oct;8(10):1022-37 
Lee IY, Westaway D, Smit AF, ... Cooper C, Yao H, Prusiner SB, Hood LE
A major paper that determined and analyzed entire prion genes and flanking regions from sheep, mouse, and human.

Alternative usage of exon 1 of bovine PrP mRNA.

Biochem Biophys Res Commun 1997 Apr 28;233(3):650-4
Horiuchi M, Ishiguro N, Nagasawa H, Toyoda Y, Shinagawa M
Here we report two types of bovine prion protein mRNA that possessed different lengths of the 5'-untranslated region and were expressed in various bovine tissues. The two mRNA species were transcribed from identical positions but differed in the usage of the splice site for exon 1/intron. One mRNA possessed exon 1 consisting of 53 nucleotides and the other possessed exon 1 consisting of 168 nucleotides. Usage of exons 2 and 3 was identical for the two mRNA species. The two mRNA species were detected in all but spleen tissue; the mRNA possessing 168-nt exon 1 was not detected in bovine spleen. This is the first report on the tissue-specific alternative splicing of PrPc mRNA in any other species. Only a low level of PrPc appeared to be present in bovine spleen. These results suggested the possibility that the mRNA possessing 53-nt exon 1 was inefficiently translated into PrP; however, in vitro translation analysis showed no marked difference in translational efficiency between the two mRNA species.

Polymorphisms in the 3' untranslated region of the prp messenger RNA are linked to scrapie incubation period

Unpublished, Genbank X83613 1237 bp and X83612 17 Feb 97
Baybutt,H. and Hope,J.
variation       444 a in VM-S7 mice, g here  
variation       1010 a in VM-S7 mice, g here

Comparison of expression patterns of PrP mRNA in the developing sheep and mouse.

Ann N Y Acad Sci 1994 Jun 6;724:353-4 
Hunter N, Manson JC, Charleson FC, Hope J

A cellular form of prion protein (PrPC) exists in many non-neuronal tissues of sheep.

J Gen Virol 1995 Oct;76 ( Pt 10):2583-7
Horiuchi M, Yamazaki N, Ikeda T, Ishiguro N, Shinagawa M
A cellular form of the prion protein (PrPC) is thought to be a substrate for an abnormal isoform of the prion protein (PrPSc) in scrapie. PrPC is abundant in tissues of the central nervous system, but little is known about the distribution of PrPC in non-neuronal tissues of sheep, the natural host of scrapie. This study investigated the tissue distribution of PrPC in sheep. Although PrPC was abundant in neuronal tissues, it was detected in non-neuronal tissues such as spleen, lymph node, lung, heart, kidney, skeletal muscle, uterus, adrenal gland, parotid gland, intestine, proventriculus, abomasum and mammary gland. Neither PrPC nor PrP mRNA was detected in the liver. The tissue distribution of PrPC appears to be inconsistent with the tissues which possess scrapie infectivity, suggesting that factor(s) specific to certain cell types may be required to support multiplication of the scrapie agent.

PrP gene and its association with spongiform encephalopathies.

Br Med Bull 1993 Oct;49(4):839-59 [Review, nothing relevant in abstract]
Goldmann W

Two alleles of a neural protein gene linked to scrapie in sheep.

Proc Natl Acad Sci U S A 1990 Apr;87(7):2476-80 [Nothing relevant in abstract]
Goldmann W, Hunter N, Foster JD, Salbaum JM, Beyreuther K, Hope J
Sheep are the natural hosts of the pathogens that cause scrapie, an infectious degenerative disease of the central nervous system. Scrapie-associated fibrils [and their major protein, prion protein (PrP)] accumulate in the brains of all species affected by scrapie and related diseases. PrP is encoded by a single gene that is linked to (and may be) the major gene controlling the incubation period of the various strains of scrapie pathogens. To investigate the role of PrP in natural scrapie, we have determined its gene structure and expression in the natural host. We have isolated two sheep genomic DNA clones that encode proteins of 256 amino acids with high homology to the PrPs of other species. Sheep PrPs have an arginine/glutamine polymorphism at position 171 that may be related to the alleles of the scrapie incubation-control gene in this species.

Characterization of the bovine prion protein gene: the expression requires interaction between the promoter and intron.

J Vet Med Sci 1997 Mar;59(3):175-83
Inoue S, Tanaka M, Horiuchi M, Ishiguro N, Shinagawa M
We cloned the part of the bovine PrP gene which contains the 5'-flanking region, exon 1, exon 2 and intron 1 to analyze its promoter region. The 5' non-coding region of the bovine PrP gene consisted of three exons and two introns, and its organization was similar to that of the mouse, rat and sheep PrP genes...

Molecular cloning and complete sequence of prion protein cDNA from mouse brain infected with the scrapie agent.

Proc Natl Acad Sci U S A 1986 Sep;83(17):6372-6
Locht C, Chesebro B, Race R, Keith JM
The prion protein (PrP) is a scrapie-associated fibril protein that accumulates in the brains of hamsters and mice infected with the scrapie agent, and also in the brains of persons affected with kuru or Creutzfeldt-Jakob disease. It has been previously proposed that PrP could be either the primary transmissible agent of scrapie or a secondary component involved in the pathogenesis of scrapie. At present, the second possibility seems more likely, for the PrP-specific mRNA is present in both infected and uninfected brains. We have isolated and sequenced the complete PrP-specific cDNA from mRNA isolated from infected mouse brains. Comparison of the mouse PrP with the hamster PrP reveals a high homology in the amino acid sequence and the presence of a conserved octapeptide repeated four times, whose function is unknown at present. Structural features are discussed and compared with other proteins. Except for its homology with the hamster PrP, mouse PrP has no significant homology to any known protein sequence, including neurofilaments, neuropeptides, and amyloid proteins of Alzheimer disease. Some features of the PrP, however, are similar to structures found in aggregating proteins, such as the wheat glutenin, keratin, and collagen.

Molecular cloning of a mink prion protein gene

J. Gen. Virol. 73 (Pt 10), 2757-2761 (1992)
Kretzschmar,H.A., Neumann,M., Riethmuller,G. and Prusiner,S.B.

Comparative sequence analysis and expression of bovine PrP gene in mouse L-929 cells

Virus Genes 6 (4), 343-356 (1992)
Yoshimoto,J., Iinuma,T., Ishiguro,N., Horiuchi,M., Imamura,M. and Shinagawa,M.
They sequenced well into the 3' UTR of the bovine gene.

Medline shows 101 reviews of polyadenylation. Highlights: 3'-Ends of almost all eukaryotic mRNAs are generated by endonucleolytic cleavage and addition of a poly(A) tail. In mammalian cells, the reaction depends on the sequence AAUAAA upstream of the cleavage site, a degenerate GU-rich sequence element downstream of the cleavage site and stimulatory sequences upstream of AAUAAA. Wahle E Bioessays 1992 Feb;14(2):113-8

I came across this goat prion mRNA sequence by accident today and am just passing it along for what it is worth. It ends 12 bp further 3' than sheep U67922's annotation. The sequence extends well into the mariner insert and seems to be of good quality for an EST. Somewhat oddly, there do not seem to be cow or sheep ESTs yet.

LOCUS       Z71825        366 bp    mRNA            EST       13-NOV-1996
  Goat mammary gland Capra hircus cDNA clone EST15-34, mRNA sequence.
 
  AUTHORS   Le Provost,F., Lepingle,A. and Martin,P.
  TITLE     A survey of the goat genome transcribed in the lactating mammary
            gland
  JOURNAL   Mamm. Genome 7 (9), 657-666 (1996)
   
        1 aaagataaaa agttttgtga acacagaatt atgacgttgc ctgaaaaatg gcagaaggta
       61 gtgtaacaaa agagtgacta tgttgtttgg taaagttctt agtgaaaatg aaaaatgtgt
      121 cttttatttt tatttaaaca ccaaaggcac attttggcca acccaatact gaatacttaa
      181 aggaaactct tctgtgttgt ccttagcctt acagtgtgca ctgaatagtt ttgtataaga
      241 ntccagagtg atatttgaaa tacgcatgtn cttatatttn tnatattngt aactttgcat
      301 gtacttgttt tgtgttaaaa gttttataaa tatttaatat ctgactaaaa ttaaacagga
      361 gttaaa


Masked Sequence: DNA/Mariner   positions 1055 1220

>goat
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNTACTGAATACTTAAAGGAAACTCTTCTGTGTTGT
CCTTAGCCTTACAGTGTGCACTGAATAGTTTTGTATAAGANTCCAGAGTG
ATATTTGAAATACGCATGTNCTTATATTTNTNATATTNGTAACTTTGCAT
GTACTTGTTTTGTGTTAAAAGTTTTATAAATATTTAATATCTGACTAAAA
TTAAACAGGAGTTAAA

     repeat_region   24889..26108  sheep Oamar1



 gb|U67922|OAPRP Ovis aries prion protein gene, alignment

                                                                         
Goat : 1     aaagataaaaagttttgtgaacacagaattatgacgttgcctgaaaaatggcagaaggta 60
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sheep: 25943 aaagataaaaagttttgtgaacacagaattatgacgttgcctgaaaaatggcagaaggta 26002

                                                                         
Goat : 61    gtgtaacaaaagagtgactatgttgtttggtaaagttcttagtgaaaatgaaaaatgtgt 120
             ||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sheep: 26003 gtggaacaaaagagtgactatgttgtttggtaaagttcttagtgaaaatgaaaaatgtgt 26062

                                                                         
Goat : 121   cttttatttttatttaaacaccaaaggcacattttggccaacccaatactgaatacttaa 180
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sheep: 26063 cttttatttttatttaaacaccaaaggcacattttggccaacccaatactgaatacttaa 26122

                                                                         
Goat : 181   aggaaactcttctgtgttgtccttagccttacagtgtgcactgaatagttttgtataaga 240
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sheep: 26123 aggaaactcttctgtgttgtccttagccttacagtgtgcactgaatagttttgtataaga 26182

                                                                         
Goat : 241   ntccagagtgatatttgaaatacgcatgtncttatatttntnatattngtaactttgcat 300
              |||||||||||||||||||||||||||| ||||||||| | ||||| ||||||||||||
Sheep: 26183 atccagagtgatatttgaaatacgcatgtgcttatattttttatatttgtaactttgcat 26242

                                                                         
Goat : 301   gtacttgttttgtgttaaaagttttataaatatttaatatctgactaaaattaaacagga 360
             ||||||||||||||||||||| ||||||||||||||||||||||||||||||||||||||
Sheep: 26243 gtacttgttttgtgttaaaag-tttataaatatttaatatctgactaaaattaaacagga 26301

                   
Goat : 361   gttaaa 366
             | ||||
Sheep: 26302 gctaaa 26307
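The quality of the goat EST relative to sheep can be put in numbers by tallying alignment columns. The sketch below uses only the last three blocks of the alignment above (goat 241-366), copied by hand. It treats any column containing n as an ambiguous base call, not as a mismatch. The names BLOCKS and tally_columns are made up for this illustration.

```python
# (goat row, sheep row) for the last three alignment blocks, goat 241-366.
BLOCKS = [
    ("ntccagagtgatatttgaaatacgcatgtncttatatttntnatattngtaactttgcat",
     "atccagagtgatatttgaaatacgcatgtgcttatattttttatatttgtaactttgcat"),
    ("gtacttgttttgtgttaaaagttttataaatatttaatatctgactaaaattaaacagga",
     "gtacttgttttgtgttaaaag-tttataaatatttaatatctgactaaaattaaacagga"),
    ("gttaaa",
     "gctaaa"),
]


def tally_columns(blocks):
    """Count matches, mismatches, gaps and ambiguous (n) columns."""
    counts = {"match": 0, "mismatch": 0, "gap": 0, "ambiguous": 0}
    for goat, sheep in blocks:
        if len(goat) != len(sheep):
            raise ValueError("aligned rows must have equal length")
        for g, s in zip(goat.lower(), sheep.lower()):
            if g == "-" or s == "-":
                counts["gap"] += 1          # insertion or deletion
            elif g == "n" or s == "n":
                counts["ambiguous"] += 1    # uncalled base, not scored
            elif g == s:
                counts["match"] += 1
            else:
                counts["mismatch"] += 1
    return counts


if __name__ == "__main__":
    counts = tally_columns(BLOCKS)
    scored = counts["match"] + counts["mismatch"]
    print(counts)
    print(f"identity over scored columns: {counts['match']}/{scored}")
```

For these 126 columns the tally is 119 matches, 1 mismatch, 1 gap, and 5 ambiguous calls, so nearly all of the apparent differences in this stretch are uncalled bases in the EST and not substitutions.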
