What are prion ESTs good for? Introduction
Human and mouse ESTs
Alignment of sheep and cow 3' UTR
Prion gene 3' UTR resources at GenBank
Prion gene expression in sheep modulated by alternative polyadenylation
References/Abstracts
Terminal alignment of the major mRNA
Tissues where mouse and human prion ESTs are found
Anti-prion 4.5 kb mRNA: harmless prion pseudogene?
22 Aug 1999 webmaster
Prion research has many bizarre aspects, most notably a culture of denial vis-a-vis related amyloidoses, the foregoing of easy and instructive experiments in favor of yet another pathology phenotyping or flock ORF genotyping, widespread unawareness of the protein chemistry central to the subject, and missing skills pertaining to basic online bioinformatic resources used so widely elsewhere in molecular medicine.
For example, there are 443 prion sequences at GenBank. Some 265 of these are expressed sequence tags, ie, various factory labs perform RT PCR with oligo dT on mRNA from a bizillion tissues of mouse and human to see which genes were expressed in which tissues. There is not a single use nor mention of this resource in the 6,000 paper prion literature.
For historical reasons, many people take a dim view of the reliability of EST sequences, so it is important to recognize that sequencing accuracy is actually excellent, often at the level of 0-1 errors per 400 base pairs. (Compare this to the guinea pig or kudu prion sequences.)
Note that EST sequences are not full length for an mRNA as long as that of the human prion gene, which typically has a 1606 bp 3' UTR, so they are not suitable for finding a tissue where exon 2 is utilized. EST sequences, despite their large numbers, do not represent large numbers of individuals at this time and so do not often detect polymorphisms.
What then are these prion EST sequences good for? Because they start at the 3' end of an in vivo processed transcript and work their way upstream, the ESTs have direct information on alternative polyadenylation, acceptable polyA signals and sites, and tissue use, and thus implications for tissue-specific regulation of mRNA types, stability, and utilization.
This in turn has possible applications to sporadic CJD and nvCJD or scrapie susceptibility via polymorphisms and mutations (possibly in a secondary gene) resulting in prion protein overproduction. After all these years, we still have no idea how much sporadic CJD is non-ORF familial nor what else beyond met/met distinguishes the nvCJD victims genetically.
There are two motivating precedents for looking at the 3' UTR:
(1) a 1996 paper [Hum Mut 7:280]: the -A21G nucleotide polymorphism upstream of the met initiator codon is found exclusively in A117V. In other words, these two polymorphisms are too tightly linked to have ever been separated by recombination. More bluntly, does A117V cause CJD or does -A21G?
(2) A meeting abstract [#869 Am Soc Hum Gen 46, Mahal SP et al] from 1996 also found two polymorphisms within 600 bp upstream of exon 1 in a screen of sporadic CJD and apparently nvCJD. Nothing further has been released by August 1999 -- a long delay that raises questions.
Some specific mouse and human ESTs are discussed below along with a graphic comparing alternate prion mRNA organization across 8 species and sequence alignments around the various alternative polyA sites, which have held up well over evolutionary time scales.
20 Aug 99 webmaster
The seven species studied so far have predominant mRNA of similar size, after discounting retrotransposons, alternative exon 1 and 2 use, and secondary mRNA polyadenylation sites. Lee et al aligned 4 of 7 species' terminal 3' UTR mRNA in Fig. 6C of their 1998 Genome Research paper, noting a common polyA signal and site, as well as a GT-rich post-adenylation stretch and a conserved 40 bp box upstream of the signal.
Here this alignment is extended to all available species (some with multiple independent sequences) and continued in both directions. In effect, the alignability suggests that this site has been the major polyA site since these species last had a common ancestor some 100 million years ago. This is not a universal terminus -- no homology exists to any other known human or mouse gene. The low rate at which point mutations and small indels become fixed (relative to intermediate mRNA sequences) suggests that considerable selective pressure constrains this region and that polymorphisms here may affect mature prion mRNA formation, stability, and levels of protein produced, with possible implications for sporadic TSE and susceptibility to infection.
Note that a search for secondary structure (hairpins) in full length mRNA [R Luck, JMB 258 813 1996] found no significant structures either 5' or 3' of the open reading frame (Fig 5, pg 820). The only mRNA feature conserved across species was hairpin C in the repeat region, which found support in the use of minor codons at 3rd position. Therefore it is unlikely that conserved 3' UTR stretches are related to hairpins. More likely, they have to do with polyadenylation sites or other unknown sequence features important to mRNA stability and translation or to chromosomal structure.
Using the alignment clamped to the known phylogenetic tree, an ancestral mammal 3' UTR sequence is developed (which serves as a noise-reduced query probe and a baseline for mutational rate), as well as a predicted cervid terminus (the species most likely to be next sequenced). As noted by Lee, rodents have experienced a more rapid rate of mutational fixation than the other lineages.
pre-signal alignment is good:
human 1 gaatccaaagtggacaccattaacaggtctttgaaatatgcatgtactttatattttcta tatttgtaactttgcatgt-tcttgttttgttatataaaaaaattgtaaatgtttaatatctgactgaa 128
hamster 2321 ...............c..c.-...t.....c......c...........c.c........ ...................-c..........c.........gt..a.........gc... ....... 2446
mouse 30565 ..........-.c...........c.cg....... ...................a.t.........c.........gt..a.........gc... a...... 30666
rat 2504 ......c...-.c...........c.cg....... ............c......a.-.........c.........gt..a.........gc... 2596
bovine 4127 .........c......g...-.......... ...................-a.........-.g.-.-....gt..a.....a.............. 4218
sheep 26196 .........c......g...-.......t.. ...................-a.........-.g.-.-....gt..a.....a.............. 26287
mink 2286 ......c............................ ...........gg......-a..... ........ 2604
>bovine (no post-mRNA sequence)
4021 cacgttttggccaacccaatactgaatacttaaaggaaactcttccgtgttgtccttagc
4081 cttacagcgtgcactgaatagttttgtataagaatccagagtgatatttgaaatacgcat
4141 gtgcttatattttctatatttgtaactttgcatgtacttgttttgtgttaaaagtttata
4201 aatatttaatatctgactaaaattaaacaggagctaaaaggagg
>sheep
26101 caacccaatactgaatacttaaaggaaactcttctgtgttgtccttagccttacagtgtg
26161 cactgaatagttttgtataagaatccagagtgatatttgaaatacgcatgtgcttatatt
26221 ttttatatttgtaactttgcatgtacttgttttgtgttaaaagtttataaatatttaata
26281 tctgactaaaattaaacaggagctaaaaggagtatcttccacggagtgtctggctgtgtt
26341 caccagtgtgcacactatgttggcagcttcatttggggggttaatatgagaaaagtgaca
>mink
ctaaatacttaatatgtaga
2221 aatccttttgcgtggtcctcaggcttacacgtgcactgaatagttttgtatgatagagcc
2281 catgtggtcttcgaaatatgcatgtactttatattttctatatttgtaactgggcatgta
2341 cttgtataaaaaatgtataaacattcgaactcttgactagaattaaacaggaactgagtg
2401 tgtcccatgtgtttgcagtgacattcaccaccgcaccctgtgttgg
>human
27601 atatgtgggaaacccttttgcgtggtccttaggcttacaatgtgcactgaatcgtttcat
27661 gtaa gaatccaaagtggacaccattaacaggtctttgaaatatgcatgtactttatattt
27721 tctatatttgtaactttgcatgttcttgttttgttatataaaaaaattgtaaatgtttaa
27781 tatctgactgaaATTAAAcgagcgaagatgagcaccacctcccgtgtctgcagttgtatt
27841 tcctggtgcttgccctgtgttggggactgttttgggggttaatctgagccaagtggcgct
27901 ttctgtcctc ccttctcaag tgatggccga tggttcacgc acttccccct gttcctgccc
27961 ttgtcctcac ttcccagtca cccactagtt catctctgcg gcttttgcat tttctccaca
>mouse
30481 atccagtactaaatgcttaccgtgtgacccttgggctttcagcgtgcactcagttccgta
30541 ggattccaaagcagacccctagctggtctttgaatctgcatgtacttcacgttttctata
30601 tttgtaactttgcatgtattttgttttgtcatataaaaagtttataaatgtttgctatca
30661 gactgacattaaatagaagctatgatgaacacctggcggggtttgttctctctccaatgc
30721 tccgagtccactgtttatcgccagggtggcttgggctcatttcacatccctgtccctgag
>rat
2401 ctgaagtgtggaacgcactggccgttctgtgcagtactaagtgtgacccttgggctttca
2461 atgtgcactcggttccgtatgattccaaagtagagccctagctggtcttcgaatctgcat
2521 gtacttcacgttttctatatttgtaacttcgcatgtatttgttttgtcatataaaaagtt
2581 tataaatgtttgctatctgactgacattaaatagaagctatgatgagcacgtgtgggggt
2641 ttttctccttcaatgctcctggccctgtgtttgtcacgagggtggcttgggctcatctga
>hamster
2221 atgcatccgaagtacgtaatgcactgaccatttcacccggtatcagatgttttctgtgtg
2281 gcccctagctttccttcaacatgcattcggttccatatatgaatccaaagtggaccccct
2341 aactggtctctgaaatctgcatgtacttcacattttctatatttgtaactttgcatgtcc
2401 ttgttttgtcatataaaaagtttataaatgtttgctatctgactgacattaaataggagc
2461 tatgatgagcacccctgcagggtttgttctctgttctctgcttctggcccttgtgtttgt
2521 tgccagggta acttgggctc acacaaggta ggtaatggct aatttcacat gccttcccct
>rodent
atccagtactaaatgcttaccgtgtgacccttgggctttcagcgtgcactcagttccgta
ggattccaaagTagacccctagctggtctttgaatctgcatgtacttcacgttttcta
tatttgtaactttgcatgtattttgttttgtcatataaaaagtttataaatgtttgctat
cTgactgacattaaatagaagctatgatgaacacctggcggggtttgttctctctcc
aatgctccgagtccactgtttatcgccagggtggcttgggctcatttcacatccc
tgtccctgaTgggcctcgggtcttacctctggtcctgtcttgtttccactggc
tttgcatTttcccctaagttGtacttagccctgctgaaacacaaaagcactcctggg
gaggaggggtggggagagga
CLUSTAL W (1.74) multiple sequence alignment
bovine CACGTTTTGGCCAACCCAATACTGAATACTTAAAGGAA--ACTCTTCC------------ 46
sheep -----------CAACCCAATACTGAATACTTAAAGGAA--ACTCTTCT------------ 35
mink --------------CTAAATACTTAATATGTA--G-AA--ATCCTTTT------------ 29
human -------------------------ATATGTG--GGAA--ACCCTTTT------------ 19
mouse ------------------ATCC--AGT-----ACTAAATG------CT------TACC-- 21
rat ------------------CTGA--AGTGTGGAACGCACTGGCCGTTCTGTGCAGTACTAA 40
hamster --------------ATGCATCCGAAGTACGTAATGCACTGACCATTTCACCCGGTATCAG 46
* *
bovine ---------GTGTTGTCCTTAGCCTT---ACAGCGTGCACTGAATAGTTTTGTATAAGA- 93
sheep ---------GTGTTGTCCTTAGCCTT---ACAGTGTGCACTGAATAGTTTTGTATAAGA- 82
mink ---------GCGTGGTCCTCAGGCTT---ACA-CGTGCACTGAATAGTTTTGTATGATAG 76
human ---------GCGTGGTCCTTAGGCTT---ACAATGTGCACTGAATCGTTTCATGTAAGA- 66
mouse ---------GTGTGACCCTTGGGCTT---TCAGCGTGCACTCA---GTTCCGTAG--GA- 63
rat ---------GTGTGACCCTTGGGCTT---TCAATGTGCACTCG---GTTCCGTAT--GA- 82
hamster ATGTTTTCTGTGTGGCCCCTAGCTTTCCTTCAACATGCATTCG---GTTCCATATATGA- 102
* ** ** * ** ** **** * *** * *
bovine ATCCAGAGTGA--------------TATTTGAAATACGCATGTGCTT-ATATTTTCTATA 138
sheep ATCCAGAGTGA--------------TATTTGAAATACGCATGTGCTT-ATATTTTTTATA 127
mink AGCCCATGTGG--------------TCTTCGAAATATGCATGTACTTTATATTTTCTATA 122
human ATCCAAAGTGGACACCATTAACAGGTCTTTGAAATATGCATGTACTTTATATTTTCTATA 126
mouse TTCCAAAGCAGACCCC--TAGCTGGTCTTTGAA-TCTGCATGTACTTCACGTTTTCTATA 120
rat TTCCAAAGTAGAGCCC--TAGCTGGTCTTCGAA-TCTGCATGTACTTCACGTTTTCTATA 139
hamster ATCCAAAGTGGACCCCC-TAACTGGTCTCTGAAATCTGCATGTACTTCACATTTTCTATA 161
** * * * *** * ****** *** * **** ****
bovine TTTGTAACTTTGCATGTACTT-GTTTTGTGTT---AAAAGTTTATAAATATTTAATATCT 194
sheep TTTGTAACTTTGCATGTACTT-GTTTTGTGTT---AAAAGTTTATAAATATTTAATATCT 183
mink TTTGTAACTGGGCATGTACTT-GTATA--------AAAAATGTATAAACATTCGAACTCT 173
human TTTGTAACTTTGCATGTTCTT-GTTTTGTTATATAAAAAAATTGTAAATGTTTAATATCT 185
mouse TTTGTAACTTTGCATGTATTTTGTTTTGTCATATAAAAAGTTTATAAATGTTTGCTATCA 180
rat TTTGTAACTTCGCATGTATTT-GTTTTGTCATATAAAAAGTTTATAAATGTTTGCTATCT 198
hamster TTTGTAACTTTGCATGTCCTT-GTTTTGTCATATAAAAAGTTTATAAATGTTTGCTATCT 220
********* ****** ** ** * **** * **** ** **
bovine -GACTAAAATTAAACAGGAGCTAAAAGGAGG----------------------------- 224
sheep -GACTAAAATTAAACAGGAGCTAAAAGGAGTATCTTC--CACGGAGTGTCTGGCTGTG-- 238
mink TGACTAGAATTAAACAGGAACT----G-AGTGTGTCC--CATG---TGTTTG-CAGTGAC 222
human -GACTGAAATTAAAC--GAGCGAAGATGAGCACCACCT-CCCG---TGTCTG-CAGTTGT 237
mouse -GACTGACATTAAATAGAAGCTATGATGAACACC--TGGCGGG----GTTTG----TTCT 229
rat -GACTGACATTAAATAGAAGCTATGATGAGCACG--TGTGGGG----GTTT-----TTCT 246
hamster -GACTGACATTAAATAGGAGCTATGATGAGCACCCCTG-CAGG----GTTTG----TTCT 270
**** ****** * * *
bovine ------------------------------------------------------------
sheep ---TTCACCAGTGTGCACACT-ATGTTGGCAGCTTC-ATTTGGGGGGTTAATATGAGAAA 293
mink A--TTCACCACCG--CACCCT-GTGTTGG------------------------------- 246
human A--TTTCCTGGTGCTTGCCCT-GTGTTGGGGACT---GTTTTGGGGGTTAATCTGAGCCA 291
mouse CTCTCCAATGCTCCGAGTCCA-CTGTTTATCGCCAGGGTGGCTTGGGCTCATTTCACATC 288
rat C-CTTCAATGCTCCTGGCCCT-GTGTTTGTCACGAGGGTGGCTTGGGCTCAT-------- 296
hamster CTGTTCTCTGCTTCTGGCCCTTGTGTTTGT------------------------------ 300
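The row of asterisks under each CLUSTAL W block marks columns where every species carries the same base. As a minimal sketch of how that line can be recomputed for any block, the helper below (`conservation_line` is a hypothetical name, not part of CLUSTAL W) is applied to the first 12 columns of the block beginning TTTGTAACTT above. It assumes the rows are already aligned and of equal width.

```python
def conservation_line(rows):
    """Return '*' for columns where all rows share one non-gap base, else ' '."""
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("aligned rows must have equal length")
    marks = []
    for column in zip(*rows):
        first = column[0]
        identical = first != "-" and all(base == first for base in column)
        marks.append("*" if identical else " ")
    return "".join(marks)

# First 12 columns of one alignment block, copied from the rows above.
rows = [
    "TTTGTAACTTTG",  # bovine
    "TTTGTAACTTTG",  # sheep
    "TTTGTAACTGGG",  # mink
    "TTTGTAACTTTG",  # human
    "TTTGTAACTTTG",  # mouse
    "TTTGTAACTTCG",  # rat
    "TTTGTAACTTTG",  # hamster
]
line = conservation_line(rows)
print(repr(line))
```

Columns 10 and 11 drop out because mink and rat differ there, which is the gap visible in the asterisk row of that block.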
20 Aug 99 webmaster
Of the 204 Soares ESTs, 14 are from brain, 16 from mammary, 18 from embryo, 60 from tumors, uterus, lung, kidney, skin, T cell, colon, liver, myotubes, heart, melanocyte, spleen, ovary,
(((((prion[All Fields] AND soares[All Fields]) NOT brain[All Fields]) NOT mammary[All Fields]) NOT embryo[All Fields]) NOT tumor[All Fields]) --
16 Aug 99 webmaster
A great many of the 443 prion sequences at GenBank are actually expressed sequence tags, produced by RT PCR using polyT primers on bulk mRNA from various tissues of mouse and human. These thus give information about the 3' UTR and possibly about tissue use of various alternative polyadenylation sites. The GenBank search term ((((((prion[All Fields] NOT NCI_CGAP[All Fields]) NOT Soares[All Fields]) NOT Sugano[All Fields]) NOT Stratagene[All Fields]) NOT patent[All Fields]) NOT primer[All Fields]) leaves 174 sequences of actual prion genes from various species.
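Nested search terms of this kind are easy to mistype by hand. The following is a minimal sketch of building one programmatically; `build_query` is a hypothetical helper, and the sketch only assembles the string and does not contact GenBank.

```python
def build_query(base, excluded, field="All Fields"):
    """Wrap the base term in one parenthesised NOT clause per excluded term."""
    query = f"{base}[{field}]"
    for term in excluded:
        query = f"({query} NOT {term}[{field}])"
    return query

# Exclusion terms taken from the search described above.
EXCLUDED = ["NCI_CGAP", "Soares", "Sugano", "Stratagene", "patent", "primer"]
QUERY = build_query("prion", EXCLUDED)
print(QUERY)
```

Dropping or adding a library name in `EXCLUDED` gives the per-library variants without rebalancing parentheses by hand.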
Blastn, set to human ESTs, with the 3' UTR of human prion plus 100 bp past the mRNA as query (ie, 26212 to 27817, extended to 27917), pulls up 168 ESTs, with somewhat ragged 3' ends but mostly corresponding to the expected main mRNA. However, there were some that began farther upstream. Does this reliably mean that in some tissues, mRNA is made and polyadenylated at an earlier site?
The NCI_CGAP series has 49 sequences (all human), the Soares 203 (122 human, 81 mouse), Stratagene has 33 (27 mouse, 4 rat) and the Sugano 13 (all mouse), accounting for 265 of the prion entries. The entries give the primer used. Some of the more recent NCI_CGAP entries simply use oligo dT as primer and recover mRNA sequences of excellent quality. For example, AI801189 has 381 bp in 100% agreement with the main human sequence, U29185. The EST covers 27387 to 27816 of that sequence (plus 1 additional A), ie, it perfectly reflects the most common in vivo mRNA, of length 2587 bp, the so-called 2.5 kb mRNA, with probable polyA signal aaattaaacg agcgaagatg agcacc. 123 ESTs have similar terminations.
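Because the query is a slice of the genomic entry, BLAST reports hit positions in query coordinates (1 to 1706) rather than U29185 coordinates. A small sketch of the conversion follows, assuming 1-based inclusive coordinates as in GenBank flat files; the helper names are hypothetical.

```python
UTR_START, UTR_END = 26212, 27817  # human 3' UTR in U29185, from the text
FLANK = 100                        # genomic bases added past the mRNA end

def span_length(start, end):
    """Length in bp of a 1-based inclusive interval."""
    return end - start + 1

def to_genomic(query_pos):
    """Convert a 1-based query position to a U29185 coordinate."""
    return UTR_START + query_pos - 1

utr_len = span_length(UTR_START, UTR_END)
query_end = UTR_END + FLANK
query_len = span_length(UTR_START, query_end)

print(utr_len, query_end, query_len)
print(to_genomic(1), to_genomic(utr_len))
```

A hit that begins at query position 1606 therefore starts exactly at the usual mRNA end, 27817.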
By including flanking sequence past the end of the main human mRNA, it can be seen that 111 EST sequences begin as expected and none continue on downstream (though minor ragged ends are seen). Thus the 2586 bp mRNA (called the 2.5k species) is the longest seen.
The NCI_CGAP series almost all start at the normal mRNA end. However, there were 3 unusual ones: one starting at 501, one at 1239 upstream, and one 312 bp downstream of the usual end. The one shown runs in U29185 from 26325-26705, whereas the 3' UTR runs 26212-27817, ie, this mRNA is 1113 bases shorter than the full 1606 bp 3' UTR.
26281 gcccttttag tggtggtgtc tcactctttc ttctctcttt gtcccggata ggctaatcaa
26341 tacccttggc actgatgggc actggaaaac atagagtaga cctgagatgc tggtcaagcc
26401 ccctttgatt gagttcatca tgagccgttg ctaatgccag gccagtaaaa gtataacagc
26461 AAATAAccat tggttaatct ggacttattt ttggacttag tgcaacaggt tgaggctaaa
26521 acaaatctca gaacagtctg aaataccttt gcctggatac ctctggctcc ttcagcagct
26581 agagctcagt atactaatgc cctatcttag tagagatttc atagctattt agagatattt
26641 tccattttaa gaaaacccga caacatttct gccaggtttg ttaggaggcc acatgatact
26701 tattc aaaaa aatcctagag
Query: 1598 ATCTTCGCTCGTTTAATTTCAGTCAGATATTAAACATTTACAATTTTTTTATATAACAAA 1539
Query: 1585 TAATTTCAGTCAGATATTAAACATTTACAATTTTTTTATATAACAAAACAAGAACATGCA 1526
Query: 1539 AACAAGAACATGCAAAGTTACAAATATAGAAAATATAAAGTACATGCATATTTCAAAGAC 1480
Query: 501 TTTTTTTGAATAAGTATCATGTGGCCTCCTAACAAACCTGGCAGAAATGTTGTCGGGTTT 442 >gb|AI354282.1|AI354282
Query: 1239 AACATTGCAGAAAAGTAATACATATCTGCTAGGTGACAATATCAAACAATTCAGGGAATA 1180 >gb|AI828378.1|AI828378
>gb|AA906777.1|AA906 Length = 292
Query: 1606 TGGTGCTCATCTTCGCTCGTTTAATTTCAGTCAGATATTAAACATTTACAATTTTTTTAT 1547
TGGTGCTCATCTTCGCTCGTTTAATTTCAGTCAGATATTAAACATTTACAATTTTTTTAT
Sbjct: 312 TGGTGCTCATCTTCGCTCGTTTAATTTCAGTCAGATATTAAACATTTACAATTTTTTTAT 371
>gi|4094435
TTTTTTTTTTTTTTTTTTTTTTTTTTTGAATAAGTATCATGGGGCCTCCTAACAAACCTGGCAAAAATGT TGTCGGGTTTTCTTAAAAGGGAAAATATCTTTAAATAGCTATGAAATCTCTACTAAAATAGGGCATTAGT ATACTGAGCTCTAGCTGCTGAAGGAGCCAGAGGTATCCAGGCAAAGGTATTTCAAACTGTTCTGAGATTT GTTTTAGCCTCAACCTGTTGCACTAAGTCCAAAAATAAGTCCAGATTAACCAATGGTTATTTGCTGTTAT ACTTTTACTGGCCTGGCATTAGCAACGGCTCATGATGAACTCAATCAAAGGGGGCTTGACCAGCATCTCA GGTCTACTCTATGTTTTCCAGGGCCCATCAGGGCCAAGGGTATTGATTAGCCTATCCG
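ESTs read from the oligo dT primer are reported on the strand opposite to the mRNA, which is why this entry opens with a run of T and why the BLAST query coordinates above run downward. A minimal sketch of flipping such a read back to the mRNA sense follows. The `est_head` string is an abbreviated stand-in (ten T plus the first 14 bases after the T run), not the full entry.

```python
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement(seq):
    """Return the reverse complement, preserving letter case."""
    return seq.translate(COMPLEMENT)[::-1]

# Abbreviated stand-in for the start of the EST shown above.
est_head = "T" * 10 + "GAATAAGTATCATG"
mrna_tail = reverse_complement(est_head)
print(mrna_tail)
```

The result ends in a polyA run preceded by catgatacttattc, the same bases that close the genomic excerpt at 26701 above.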
Probing the mouse EST database with the 3' UTR of mouse mRNA turns up a number of sequences shorter than full length mRNA. For example, gi|4617056 matches U29186 genomic mouse prion from 29425-29950 whereas the normal exon 3 UTR runs from 29455-30687, ie, there must be a polyA site at 29950 and a polyA signal preceding this. This gives rise to a 3' UTR of 496 bp and an mRNA of 1415 bp. At least 10 other ESTs begin in the same region, often up to 50 bp shorter, from embryo, T cell, mammary, and myotubes but not mouse brain [AA56217984, AI11755268, AA64593576, AA16379486, AA9606664, AI6078895, AA26019379, AA47607827, W99102, AA72698185, AI89319821]
29701 AAATAActgc tggctagttg gggctttgtt ttggtctagt gAATAAAtac tggtgtatcc
29761 cctgacttgt acccagagta caaggtgaca gtgacacatg taacttagca taggcaaagg
29821 gttctacaac caaagaagcc actgtttggg gatggcgccc tggaaaacag cctcccacct
29881 gggatagcta gagcatccac acgtggaatt ctttctttac taacaaacga tagctgattg
29941 aaggcaacag gaaaaaaaaa atcaaattgt
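Candidate signals in an excerpt like this can also be located mechanically. The sketch below scans for the two hexamers most often used as polyA signals, AATAAA and ATTAAA. Restricting the search to those two is an assumption of this sketch, so the capitalised AAATAA at 29701, which is neither, is not reported. `find_signals` is a hypothetical helper.

```python
import re

SIGNALS = ("AATAAA", "ATTAAA")  # assumed set of hexamers to look for

def find_signals(seq, offset=1):
    """Return sorted (coordinate, hexamer) pairs; offset is the coordinate of the first base."""
    clean = seq.upper().replace(" ", "")
    hits = []
    for signal in SIGNALS:
        # Lookahead so that overlapping occurrences are all reported.
        for match in re.finditer(f"(?={signal})", clean):
            hits.append((offset + match.start(), signal))
    return sorted(hits)

# First line of the mouse excerpt above, which starts at 29701.
fragment = "AAATAActgc tggctagttg gggctttgtt ttggtctagt gAATAAAtac tggtgtatcc"
hits = find_signals(fragment, offset=29701)
print(hits)
```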
>AI607889 Soares mouse mammary gland oligoT
ctgttgccttcaatcagctatcgtttgttagtaaagaaagaattccacgtgtggatgctctagctatc
ccaggtgggaggctgttttccagggcgccatccccaaacagtggcttctttggttgtagaaccctttgcc
tatgctaagttacatgtgtcactgtcaccttgtactctgggtacaagtcaggggatacaccagtatttat
tcactagaccaaaacaaagccccaactagccagcagttatttggtgttatattcttattggcccggtgtt
agcactggctgatgacagactccatcaaagggacctgaagcaaagagcaactggtctactgtacatttcc
cagggcccatcagtgccaggggtattagcctatgggggacacagagaagcaagaatgagaaccacctcaa
ttgaaagagctacaggtggataacccctcccccagcctagaccacgagaatgcgaaggaacaagcaggaa
agcctccctcatcccacgatcagangatgaggaaagga
A later site corresponding to 29791-30434 (ie, 253 bp less than full mRNA) is represented by several sequences as well:
30181 tagcttctgc cctatgtttc tgtacttcta tttgaactgg ataacagaga gacaatctaa
30241 acattctctt aggctgcaga taagagaagt aggctccatt ccaaagtggg aaagaaattc
30301 tgctagcatt gtttaaatca ggcaaaattt gttcctgaag ttgcttttta ccccagcaga
30361 cataaactgc gatagcttca gcttgcactg tggattttct gtatagAATA TATAAAacat
30421 aacttcaagc ttat gtcttc tttttaaaac atctgaagta tgggacgccc tggccgttcc
>AA522175.1 Soares mouse mammary GTGACACATGTAACTTAGCATAGGCAAAGGGTTCTACAACCAAAGAAGCCACTGTTTGGGGATGGCGCCC TGGAAAACAGCCTCCCACCTGGGATAGCTAGAGCATCCACACGTGGAATTCTTTCTTTACTAACAAACGA TAGCTGATTGAAGGCAACAGGAAAAAAAAAATCAAATTGTCCTACTGACGTTGAAAGCAAACCTTTGTTC ATTCCCAGGGCACTAGAATGATCTTTAGCCTTGCTTGGATTGAACTAGGAGATCTTGACTCTGAGGAGAG CCAGCCCTGTAAAAAGCTTGGTCCTCCTGTGACGCGAGGGATGGTTAAGGTACAAAGGCTAGAAACTTGA GTTTCTTCATTTCTGTCTCACAATTATCAAAAGCTAGAATTAGCTTCTGCCCTATGTTTCTGTACTTCTA TTTGAACTGGATAACAGAGAGACAATCTAAACATTCTCTTAGGCTGCAGATAAGAGAAGTAGGCTCCATT CCAAAGTGGGAAAGAAATTCTGCTAGCATTGTTTAAATCAGGCAAAATTTGTTCCTGAAGTTGCTTTTTA CCCCAGCAGACATAAATTGCGATACGTCAGCTTGCACTGTGGATTTTCTGTATAGAATATATAAAACATA ACTTCAAGCTTAT
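The lengths quoted for the two shorter mouse transcripts follow directly from the U29186 coordinates. A short worked check, assuming 1-based inclusive coordinates, is sketched below. The labels "early" and "late" are only names used here for the sites at 29950 and 30434.

```python
UTR_START, UTR_END = 29455, 30687            # mouse 3' UTR in U29186, from the text
ALT_SITES = {"early": 29950, "late": 30434}  # apparent alternative polyA sites, from the text

def utr_length(site):
    """3' UTR length in bp when the transcript ends at the given coordinate."""
    return site - UTR_START + 1

full = utr_length(UTR_END)
# For each site: (3' UTR length, bases missing relative to the full-length 3' UTR).
report = {name: (utr_length(site), UTR_END - site) for name, site in ALT_SITES.items()}

print(full)
print(report)
```

The early site gives the 496 bp 3' UTR mentioned above, and the late site falls 253 bp short of the full-length end.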
7 Aug 99 webmaster
The Blast server can be used to provide an excellent alignment. Below, Lee's sheep sequence (with retrotransposons masked out) is aligned against other ruminant prion sequences. After small adjustments by hand near indels, outgroup arbitration was used to construct the ancestral ruminant 3' UTR. (This parsimony method lets the bovine sequence rule at positions where the sheep sequences differ. This can resolve some point mutations and indels, as well as sequencing errors and other noise. Here 22 improvements can be made in the ruminant sequence; these are shown in caps. Use of mink 3' UTR can refine this further.)
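The arbitration rule can be sketched in a few lines. This is an illustration under stated assumptions rather than the procedure actually used here: `arbitrate` is a hypothetical function, the rows must already be aligned, and writing "n" where the outgroup matches none of the ingroup variants is a convention of this sketch only.

```python
def arbitrate(ingroup, outgroup):
    """Column-wise consensus of aligned ingroup rows; the outgroup settles disagreements."""
    result = []
    for i, column in enumerate(zip(*ingroup)):
        variants = set(column)
        if len(variants) == 1:
            result.append(column[0].lower())    # ingroup unanimous
        elif outgroup[i] in variants:
            result.append(outgroup[i].upper())  # outgroup decides; caps mark the call
        else:
            result.append("n")                  # unresolved (sketch convention)
    return "".join(result)

# Sheep entries disagree at one position (a base versus a gap); bovine carries the base.
sheep_rows = ["atacgaaagg", "atac-aaagg"]
bovine_row = "atacgaaagg"
ancestral = arbitrate(sheep_rows, bovine_row)
print(ancestral)
```

The toy rows are modelled on the stretch beginning atacgaaagg in the alignment, where the reconstructed ruminant sequence shows a capital G.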
In Figure 3 of Goldmann et al 1999, clear homology is established about the 2.1k polyA signal at 680 3' UTR for cow and sheep. Indeed, 84 consecutive nucleotides are identical, mostly distal of the ATTAAA signal. The signal is found 73 bp upstream of the first retrotransposon. This signal region was not tested for longer range homology in, say, mink or human. Blastn of the 84 nucleotides pulls up mink and human prion but not rodent, with the first AATTAA putative signal conserved in mink and human (27052 region homology). [Probe: gtgatattcctttctttagtaacataaagtatagatAATTAAggtaccttAATTAAactaccttctagacactgagagcaaatctgttgtttatctggaacccaggatgattttgacattgct]
QUERY 1 gtgatattcctttctttagtaacataaagtatagatAATTAAggtacctt---AATT--- 54 AAacta M31313 1472 ..................................................---....--- 1525 M31313 1526 ...............---....--- 1508 AJ223072 1400 ..................................................---....--- 1453 AJ223072 1454 ...............---....--- 1436 U67922 23678 ..................................................---....--- 23731 U67922 23732 ...............---....--- 23714 AB001468 1591 ....................--.................t......---....taa 1641 D10612 1591 ....................--.................t......---....taa 1641 D38179 2147 ................................................ 2194 S46825 1614 ............c.....c....c...............c.-g..aaa....--- 1664 mink QUERY 6 attcctttctttagtaacataaagtatagatAATTAAggtaccttAATTAAacta--ccttc 65 sheep U29185 27052 taaactataggtAATTAAggcagctgaaaagtaaattgccttc human QUERY 66 tagacactgagag-caaatctg--ttgt----ttatctggaacccaggatgattttgaca 118 U29185 27057 ..........-..g.......cct....ccat...c......a....a............ 27115 ..........27001 gcattccttt ctttaaacta taggtAATTA Aggcagctga aaagtaaatt gccttctaga non-aligning human
Ancestral ruminant 3' UTR sequence (unresolvable alternative bases shown above)
....................................................a...................a.. RUMIN 1 gggcaaccttcctgttttcattatcttcttaatctttgccaggttgggggagggagtgtc 60 ...........................c.....................g............t...........g RUMIN 61 tacctgcagccctgtagtggtggtgtctcatttcttgcttctctctt-gttacctgtata 119 ...............c..................t...............-g...................... RUMIN 120 ataatacccttggcgcttacagcactgggaaatgacaagcagacatgagatgctGttta 178 1 .........................g............g......c............................. RUMIN 179 ttcaagtcccattagctcagtattctaatgtcccatcttagcagtgattttgtagcaatt 238 .....................................g.................t................... RUMIN 239 ttctcatttgtttcaagaacacctgactacatttccctttgggaatagcatttctgccaa 298 ........................................................c.-................ RUMIN 299 gtctggaaggaggccacataatattcattcaaaaaaacaaaactggaaatccttagttca 358 ................................tg..........t.............................. RUMIN 359 tagacccagggtccaccctgttgagagcatgtgtcctgtgtctgcagagaactataaag 417 ...........................................................g............... RUMIN 418 gatattctgcattttgcaggttacatttgcaggtaacacagccatctattgcatcaagaa 477 .............................................t............................. RUMIN 478 tggatattcatgcaacctttgacttatgggcagaggacatTttcacaaggaatgaacata 537 1 .......................................................g................... RUMIN 538 atacGaaaggcttctgagactaaaaaattccaacatatggaagaggtgcccttggtggca 597 1 ...................................................g...................--.. RUMIN 598 gccttccattttgtatgtttAaagcaccttcaagtgatattcctttctttagtaacataa 656 1 ...............................t....ttaat.................................. RUMIN 657 agtatagataattaaggtacc-----ttaATTAAActaccttctagacactgagagcaaa 711 2.1k site .................................................................-......... 
RUMIN 712 tctgttgtttatctggaacccaggatgattttgacattgctcccagattttaacatagag 771 .........................t...................................t............. RUMIN 772 aatgcagatacaaaaActcCatattcatttgattgaatcttttcctgaaccagtgctagt 831 2 ..............................t.......a..a................................a RUMIN 832 gttggactggtaagAgtataacagcatatataggttatgtgatgaagagaAtagtgtac- 890 2 ...............aagaa.a.........................a..--...............c....... RUMIN 891 -----atgaaatatgtgcatttctttattgctgtcttataattgtcaaaaaagaaaAtta 945 1 .....................................a..tt.........c...................tatc RUMIN 946 ggtccttggtttctgtaaaattgac--ttgaatcaaaagggaggcatttaaagaaa---- 999 ...........................ca.-............a..........g.............t.....- RUMIN 1000 -taaattagaga-tgatagaaatctgatccattcagagtagaaaaagaaattccattact 1057 ................a............................gg............................ RUMIN 1058 g-ttattTaagaaggtaaaattattTcctgaattgttcaatattgtcacctagcagatag 1116 2 .......................g.....c......a.....g........t....................at.. RUMIN 1117 acacTATtattctgtactgtttttactagcttgcaccttgtggtatcctatgtaaaa--a 1174 3i ....................a............g....aagaa............................t.t. RUMIN 1175 cAtat-ttgcatatgacaaactt-----tttctgttagagcaattaacatctgaaccacc 1228 1 ........................................................................... RUMIN 1229 taatgcattacctgtttttgtaaggtactttttgtaaggtactaagaacaataaggacaa 1288 ...............................................c.....................c.... RUMIN 1289 taaatgtactgaatActTaaaggaaactcttctgtgtTgtCCTtAgccttacagtgtgc 1347 7d ........................................................................... RUMIN 1348 actgaAtagtttTgtataagaatccagagtgatatttgaaatacgcatgtgcttatattt 1407 2d ................c.......................................................... RUMIN 1408 tttatatttgtaactttgcatgtacttgttttgtgttaaaagtttataaatatttaatat 1467 ........................................................................... 
RUMIN 1468 ctgactaaaattaa 1481
best corrected reduced sheep sequence
gggcaaccttcctgttttcattatcttcttaatctttgccaggttgggggagggagtgtc tacctgcagccctgtagtggtggtgtctcatttcttgcttctctctt-gttacctgtata ataatacccttggcgcttacagcactgggaaatgacaagcagacatgagatgctGttta ttcaagtcccattagctcagtattctaatgtcccatcttagcagtgattttgtagcaatt ttctcatttgtttcaagaacacctgactacatttccctttgggaatagcatttctgccaa gtctggaaggaggccacataatattcattcaaaaaaacaaaactggaaatccttagttca tagacccagggtccaccctgttgagagcatgtgtcctgtgtctgcagagaactataaag gatattctgcattttgcaggttacatttgcaggtaacacagccatctattgcatcaagaa tggatattcatgcaacctttgacttatgggcagaggacatTttcacaaggaatgaacata atacGaaaggcttctgagactaaaaaattccaacatatggaagaggtgcccttggtggca gccttccattttgtatgtttAaagcaccttcaagtgatattcctttctttagtaacataa agtatagataattaaggtacc-----ttaattaaactaccttctagacactgagagcaaa tctgttgtttatctggaacccaggatgattttgacattgctcccagattttaacatagag aatgcagatacaaaaActcCatattcatttgattgaatcttttcctgaaccagtgctagt gttggactggtaagAgtataacagcatatataggttatgtgatgaagagaAtagtgtac- -----atgaaatatgtgcatttctttattgctgtcttataattgtcaaaaaagaaaAtta ggtccttggtttctgtaaaattgac--ttgaatcaaaagggaggcatttaaagaaa---- -taaattagaga-tgatagaaatctgatccattcagagtagaaaaagaaattccattact g-ttattTaagaaggtaaaattattTcctgaattgttcaatattgtcacctagcagatag acacTATtattctgtactgtttttactagcttgcaccttgtggtatcctatgtaaaa--a cAtat-ttgcatatgacaaactt-----tttctgttagagcaattaacatctgaaccacc taatgcattacctgtttttgtaaggtactttttgtaaggtactaagaacaataaggacaa taaatgtactgaatActTaaaggaaactcttctgtgtTgtCCTtAgccttacagtgtgcd actgaAtagtttTgtataagaatccagagtgatatttgaaatacgcatgtgcttatattt tttatatttgtaactttgcatgtacttgttttgtgttaaaagtttataaatatttaatat ctgactaaaattaa
best corrected reduced sheep sequence (gaps removed) gggcaaccttcctgttttcattatcttcttaatctttgccaggttgggggagggagtgtc tacctgcagccctgtagtggtggtgtctcatttcttgcttctctcttgttacctgtata ataatacccttggcgcttacagcactgggaaatgacaagcagacatgagatgctGttta ttcaagtcccattagctcagtattctaatgtcccatcttagcagtgattttgtagcaatt ttctcatttgtttcaagaacacctgactacatttccctttgggaatagcatttctgccaa gtctggaaggaggccacataatattcattcaaaaaaacaaaactggaaatccttagttca tagacccagggtccaccctgttgagagcatgtgtcctgtgtctgcagagaactataaag gatattctgcattttgcaggttacatttgcaggtaacacagccatctattgcatcaagaa tggatattcatgcaacctttgacttatgggcagaggacatTttcacaaggaatgaacata atacGaaaggcttctgagactaaaaaattccaacatatggaagaggtgcccttggtggca gccttccattttgtatgtttAaagcaccttcaagtgatattcctttctttagtaacataa agtatagataattaaggtaccttaattaaactaccttctagacactgagagcaaa tctgttgtttatctggaacccaggatgattttgacattgctcccagattttaacatagag aatgcagatacaaaaActcCatattcatttgattgaatcttttcctgaaccagtgctagt gttggactggtaagAgtataacagcatatataggttatgtgatgaagagaAtagtgtac atgaaatatgtgcatttctttattgctgtcttataattgtcaaaaaagaaaAtta ggtccttggtttctgtaaaattgacttgaatcaaaagggaggcatttaaagaaa taaattagagatgatagaaatctgatccattcagagtagaaaaagaaattccattact gttattTaagaaggtaaaattattTcctgaattgttcaatattgtcacctagcagatag acacTATtattctgtactgtttttactagcttgcaccttgtggtatcctatgtaaaaa cAtatttgcatatgacaaactttttctgttagagcaattaacatctgaaccacc taatgcattacctgtttttgtaaggtactttttgtaaggtactaagaacaataaggacaa taaatgtactgaatActTaaaggaaactcttctgtgtTgtCCTtAgccttacagtgtgcd actgaAtagtttTgtataagaatccagagtgatatttgaaatacgcatgtgcttatattt tttatatttgtaactttgcatgtacttgttttgtgttaaaagtttataaatatttaatat ctgactaaaattaa
gb|U67922|OAPRP      Ovis aries prion protein gene, complete cds     1491  0.0
emb|AJ223072|OAPRION Ovis aries PrP gene, complete cds               1477  0.0
gb|M31313|SHPPRP     Ovis aries prion protein (PrP) gene, compl...   1465  0.0
dbj|D38179|SHPPRPA   Sheep gene for prion protein PrP, comple...     1342  0.0
dbj|D10612|BOVPRP1   Bovine mRNA for prion protein                   1271  0.0

QUERY        1 gggcaaccttcctgttttcattatcttcttaatctttgccaggttgggggagggagtgtc 60
U67922   23049 ............................................................ 23108
AJ223072   772 ............................................................ 831
M31313     843 ............................................................ 902
D38179    1518 ............................................................ 1577
D10612     957 .....................................a...................a.. 1016

QUERY       61 tacctgcagccctgtagtggtggtgtctcatttcttgcttctctctt-gttacctgtata 119
U67922   23109 ...............................................-............ 23167
AJ223072   832 ...............................................-............ 890
M31313     903 ...............................................-............ 961
D38179    1578 ...............................................-............ 1636
D10612    1017 ............c.....................g............t...........g 1076

QUERY      120 ataatacccttggcgcttacagcactgggaaatgaca-agcagacatgagatgctgttta 178
U67922   23168 .....................................-...................... 23226
AJ223072   891 .....................................-...................... 949
M31313     962 .....................................-.................a.... 1020
D38179    1637 .....................................-...................... 1695
D10612    1077 c..................t...............-.g...................... 1135

QUERY      179 ttcaagtcccattagctcagtattctaatgtcccatcttagcagtgattttgtagcaatt 238
U67922   23227 ............................................................ 23286
AJ223072   950 ............................................................ 1009
M31313    1021 ............................................................ 1080
D38179    1696 ............................................................ 1755
D10612    1136 ..........g............g......c............................. 1195

QUERY      239 ttctcatttgtttcaagaacacctgactacatttccctttgggaatagcatttctgccaa 298
U67922   23287 ............................................................ 23346
AJ223072  1010 ............................................................ 1069
M31313    1081 ............................................................ 1140
D38179    1756 ............................................................ 1815
D10612    1196 ......................g.................t................... 1255

QUERY      299 gtctggaaggaggccacataatattcattcaaaaaaacaaaactggaaatccttagttca 358
U67922   23347 ............................................................ 23406
AJ223072  1070 ............................................................ 1129
M31313    1141 ............................................................ 1200
D38179    1816 ............................................................ 1875
D10612    1256 .........................................c.-................ 1314

QUERY      359 tagacccagggtccaccct-gttgagagcatgtgtcctgtgtctgcagagaactataaag 417
U67922   23407 ...................-........................................ 23465
AJ223072  1130 ...................-........................................ 1188
M31313    1201 ...................-........................................ 1259
D38179    1876 ...................-........................................ 1934
D10612    1315 ...............-...g.........t.............................. 1373

QUERY      418 gatattctgcattttgcaggttacatttgcaggtaacacagccatctattgcatcaagaa 477
U67922   23466 ............................................................ 23525
AJ223072  1189 ............................................................ 1248
M31313    1260 ............................................................ 1319
D38179    1935 ............................................................ 1994
D10612    1374 ............................................g............... 1433

QUERY      478 tggatattcatgcaacctttgacttatgggcagaggacattttcacaaggaatgaacata 537
U67922   23526 ............................................................ 23585
AJ223072  1249 ........................................c................... 1308
M31313    1320 ........................................c................... 1379
D38179    1995 ............................................................ 2054
D10612    1434 ..............................t............................. 1493

QUERY      538 atacgaaaggcttctgagactaaaaaattccaacatatggaagaggtgcccttggtggca 597
U67922   23586 ............................................................ 23645
AJ223072  1309 ....-....................................................... 1367
M31313    1380 ....-....................................................... 1438
D38179    2055 ............................................................ 2114
D10612    1494 ........................................g................... 1553

QUERY      598 gccttccattttgtatgttt-aagcaccttcaagtgatattcctttctttagtaacataa 656
U67922   23646 ....................-....................................... 23704
AJ223072  1368 ....................-....................................... 1426
M31313    1439 ....................a....................................... 1498
D38179    2115 ....................-....................................... 2173
D10612    1554 ....................a...............g...................--.. 1611

QUERY      657 agtatagataattaaggtacc-----ttaattaaactaccttctagacactgagagcaaa 711
U67922   23705 .....................-----.................................. 23759
AJ223072  1427 .....................-----.................................. 1481
M31313    1499 .....................-----.................................. 1553
D38179    2174 ..................... 2194
D10612    1612 ................t....ttaat.................................. 1671

QUERY      712 tctgttgtttatctggaacccaggatgattttgacattgctcccagattttaacatagag 771
U67922   23760 ......................................... 23800
U67922   24188                                          ................... 24206
AJ223072  1482 ......................................... 1522
AJ223072  1909                                          .........-......... 1926
M31313    1554 ......................................... 1594
M31313    1981                                          .........-......... 1998
D10612    1672 ....................................... 1710

QUERY      772 aatgcagatacaaaaactccatattcatttgattgaatcttttcctgaaccagtgctagt 831
U67922   24207 ............................................................ 24266
AJ223072  1927 ...............-...-........................................ 1984
M31313    1999 ...............-...-........................................ 2056
AB001468  2113 ..........t...................................t............. 2172

QUERY      832 gttggactggtaagagtataacagcatatataggttatgtgatgaagagaatagtgtac- 890
U67922   24267 ...........................................................- 24325
AJ223072  1985 ..............g...................................-........- 2042
M31313    2057 ..............g...................................-........- 2114
AB001468  2173 ...............t.......a..a................................a 2232

QUERY      891 -----atgaaatatgtgcatttctttattgctgtcttataattgtcaaaaaagaaagtta 945
U67922   24326 -----....................................................... 24380
AJ223072  2043 -----...................................................a... 2097
M31313    2115 -----...................................................a... 2169
AB001468  2233 aagaa.a.........................a..--...............c...a... 2290

QUERY      946 ggtccttggtttctgtaaaattgac--ttgaatcaaaagggaggcatttaaagaaa---- 999
U67922   24381 .........................--.............................---- 24434
AJ223072  2098 .........................--.............................---- 2151
M31313    2170 .........................--.............................---- 2223
AB001468  2291 ......................a..tt.........c...................tatc 2350

QUERY     1000 -taaattagaga-tgatagaaatctgatccattcagagtagaaaaagaaattccattact 1057
U67922   24435 -...........-............................................... 24492
AJ223072  2152 -...........-............................................... 2209
M31313    2224 -...........-............................................... 2281
AB001468  2351 t...........ca.-............a..........g.............t.....- 2408

QUERY     1058 g-ttatttaagaaggtaaaattatttcctgaattgttcaatattgtcacctagcagatag 1116
U67922   24493 .-.......................................................... 24551
AJ223072  2210 .-.....a.................c.................................. 2268
M31313    2282 .-.....a.................c.................................. 2340
AB001468  2409 .a............................gg............................ 2468

QUERY     1117 acactattattctgtactgtttttactagcttgcaccttgtggtatcctatgtaaaa--a 1174
U67922   24552 .........................................................--. 24609
AJ223072  2269 ....---..................................................--. 2323
M31313    2341 ....---..................................................--. 2395
AB001468  2469 ........g.....c......a.....g........t....................at. 2528

QUERY     1175 cgtat-ttgcatatgacaaactt-----tttctgttagagcaattaacatctgaaccacc 1228
U67922   24610 .....-.................-----................................ 24663
AJ223072  2324 .a...-.................-----................................ 2377
M31313    2396 .a...-.................-----................................ 2449
AB001468  2529 .a...a............g....aagaa............................t.t. 2588

QUERY     1229 taatgcattacctgtttttgtaaggtactttttgtaaggtactaagaacaataaggacaa 1288
U67922   24664 .............................................. 24709
AJ223072  2378 .............................................. 2423
M31313    2450 .............................................. 2495
AB001468  2589 .............................................. 2634

QUERY     1289 taaatgtactgaatacttaaaggaaactcttctgtgttgtccttag-ccttacagtgtgc 1347
U67922   26109       ........................................-............. 26161
AJ223072  3819       ........-..-...................-..---.-.t............. 3865
M31313    3889       ........-..-...................-..---.-.t............. 3935
AB001468  4040       ..........................c.............-........c.... 4092

QUERY     1348 actgaatagttttgtataagaatccagagtgatatttgaaatacgcatgtgcttatattt 1407
U67922   26162 ............................................................ 26221
AJ223072  3866 .....-......-............................................... 3923
M31313    3936 .....-......-............................................... 3993
AB001468  4093 ............................................................ 4152

QUERY     1408 tttatatttgtaactttgcatgtacttgttttgtgttaaaagtttataaatatttaatat 1467
U67922   26222 ............................................................ 26281
AJ223072  3924 ............................................................ 3983
M31313    3994 ............................................................ 4053
AB001468  4153 .c.......................................................... 4212

QUERY     1468 ctgactaaaattaa 1481
U67922   26282 .............. 26295
AJ223072  3984 .............. 3997
M31313    4054 .............. 4067
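In the master-slave display above, a dot means the subject base is identical to the query base in that column, a letter is a substitution, and a dash is a gap. A minimal Python sketch for turning one dotted row back into explicit bases follows; the function name `expand_row` and the short example row are illustrative only and not part of Blast.

```python
def expand_row(query: str, subject: str) -> str:
    """Replace each '.' in a master-slave subject row with the query base in that column."""
    if len(query) != len(subject):
        raise ValueError("rows must cover the same alignment columns")
    return "".join(q if s == "." else s for q, s in zip(query, subject))


# First 20 columns of the query row, with a made-up subject row
# carrying a single substitution (illustrative, not a real hit).
query_row = "gggcaaccttcctgttttca"
subject_row = "...............a...."
print(expand_row(query_row, subject_row))
```

Rows that cover only part of a block (such as the segments flanking a retrotransposon) need to be paired with the matching slice of the query before expansion.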
sheep U67922
... AJ223072
... M31313
... D38179
goat Z71825 mRNA EST
cattle AB001468
... D10612
mink S46825
human U29185
mouse U29187
rat D50093
... M20313
golden hamster M14054
... M37381
... K02234
Other ruminant sequences for this region extracted by Blastn include 2 full-length sheep sequences and 1 full-length bovine sequence, all of which contain a slight mismatch about 220 bp before the terminus, suggesting that the Lee sequence has a slight glitch (or a breed difference). The overall alignment shows that the retrotransposon events occurred prior to the divergence of sheep and cattle.
None of these sequences has internal repeats of any significant length. Blastn with sheep against human pulls up only the prion gene, but oddly only as a poor alignment from 1400 on, terminally 27,694 to 27,797. Using reduced sheep improves this slightly, aligning poorly over 550 terminal base pairs. Lowering gap penalties in the advanced Blastn options to -G2 -E1 is appropriate, given that small deletions and insertions are about as common as point mutations elsewhere in non-coding regions of this gene. Another good Blast trick is to use 'flat master-slave with identities' and adjust the expect value (e equal to any convenient number in custom settings) to eliminate undesirable sequences from the alignment.
This very much improves the alignment. With the statistical significance threshold for reporting matches set at 0.001, the best alignment extends from 506-1481 of reduced sheep to 26867-27797 of human; with the threshold set at 0.01, the alignment is full-length (with 4 areas of mismatch).
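To see why lowered gap penalties help here, a rough sketch under a stated assumption: that Blastn charges an affine cost of G + k x E for a gap of k bases (opening cost G, extension cost E). The comparison values 5 and 2 are assumed for illustration and are not taken from this page.

```python
def gap_cost(length: int, gap_open: int, gap_extend: int) -> int:
    """Affine gap cost: one opening charge plus an extension charge per gapped base."""
    if length < 1:
        raise ValueError("gap length must be at least 1")
    return gap_open + gap_extend * length


# Compare the lowered setting (-G2 -E1) with an assumed stricter setting (5, 2).
for k in (1, 3, 5):
    print(k, gap_cost(k, 2, 1), gap_cost(k, 5, 2))
```

Under this model a 5-base indel costs 7 at -G2 -E1, so an alignment can pass through the small insertions and deletions typical of this 3' UTR instead of breaking off.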

The sheep sequence, either full or reduced, with advanced gapping or not, recovers 9 rodent prion sequences with a very poor but extensive alignment, except for about 60 bp terminally. Thus it will be possible to align sheep and cattle along their entire length, with some help distally from human outgroup arbitration, to determine variable positions and an ancestral sequence. Again, a deer or elk sequence would be a great help.
A dozen rodent 3' UTR sequences have been determined, 9 of which are full length. Mouse, rat, and golden hamster are available in multiple entries, allowing good reconstruction of an ancestral rodent 3' UTR. These align fairly well with human over the first 250 bp and last 108 bp, but poorly (except for length) otherwise. The 7 mouse sequences differ from each other only at 5 scattered point mutations. However, mouse and rat differ at 130 sites plus 37 small deletions out of 1,234 bp considered, or about 13.5%, a rather rapid rate of evolution over the 12 million years since divergence. Mouse and hamster differ at 143 point changes and 44 deletions.
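The 13.5% figure counts each small deletion as one event alongside the point differences. A short Python check of that arithmetic follows; applying the same 1,234 bp denominator to the mouse/hamster comparison is an assumption, since the text does not state the length considered there.

```python
def percent_divergence(substitutions: int, indels: int, length: int) -> float:
    """Differences (point changes plus indel events) per 100 bp compared."""
    return 100.0 * (substitutions + indels) / length


UTR_LENGTH = 1234  # mouse 3' UTR length quoted in the text

mouse_rat = percent_divergence(130, 37, UTR_LENGTH)
# Assumption: same 1234 bp denominator for the hamster comparison.
mouse_hamster = percent_divergence(143, 44, UTR_LENGTH)

print(f"mouse vs rat:     {mouse_rat:.1f}%")
print(f"mouse vs hamster: {mouse_hamster:.1f}%")
```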
Exon 3 (post stop codon) is given as:
sheep Lee sequence 23,049-26,295 = 3,247 bp of which 1766 bp is found in 3 retrotransposons, leaving residual length 1,481 comparable to other mammals. See also Goldmann sequence AJ223072, M31313 [with 82 differences probably mostly sequence error], D38179.
bovine AB001468 and D10612 (from 957 to 2096 = 1140 bp of 3' UTR), D26150
Sequence D00015 from a 1986 Science article is said to have a polyA site at 2422 with 8 A's. ( 2341 tgcatgttct tgttttgtta tataaaaaaa ttgtaaatgt ttaatatctg actgaaatta 2401 aacgagccaa gatgagcacc aa).
X83416, a 1991 Hood sequence, is annotated for polyA signals at 2242..2247 ataaaa and 2277..2282 attaaa:
mouse Lee sequence position 20,442-21,675 = 1,234 bp with no retrotransposons or poly A annotated
repeat_region 23,801..24,187 Bov-B = 387 bp [alt: BOV2; LINE-like element]
repeat_region 24,709..24,867 Bov-tA3 = 159 bp [alt: 24,708..24,872 BOVTA; art-2 SINE]
repeat_region 24,889..26,108 OaMAR1 = 1,220 bp [alt: none, off Medline, Lee entry only, not Jurka]
poly A sites predicted at 998 and 1287 of type AATAAA
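The length bookkeeping for the sheep entry (3,247 bp total, 1,766 bp in three repeats, 1,481 bp residual) can be re-derived from the inclusive coordinates listed above. A small Python sketch, with the helper name `span_length` chosen only for illustration:

```python
def span_length(start: int, end: int) -> int:
    """Length of an inclusive 1-based feature span such as 23801..24187."""
    return end - start + 1


sheep_utr = span_length(23049, 26295)
sheep_repeats = {
    "Bov-B": span_length(23801, 24187),
    "Bov-tA3": span_length(24709, 24867),
    "OaMAR1": span_length(24889, 26108),
}
repeat_total = sum(sheep_repeats.values())

print(sheep_utr, repeat_total, sheep_utr - repeat_total)
```

The same helper applies to any other feature table given as inclusive start..end pairs.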
..................gg gcaaccttcc tgttttcatt atcttcttaa tctttgccag gttgggggag
23101 ggagtgtcta cctgcagccc tgtagtggtg gtgtctcatt tcttgcttct ctcttgttac
23161 ctgtataata atacccttgg cgcttacagc actgggaaat gacaagcaga catgagatgc
23221 tgtttattca agtcccatta gctcagtatt ctaatgtccc atcttagcag tgattttgta
23281 gcaattttct catttgtttc aagaacacct gactacattt ccctttggga atagcatttc
23341 tgccaagtct ggaaggaggc cacataatat tcattcaaaa aaacaaaact ggaaatcctt
23401 agttcataga cccagggtcc accctgttga gagcatgtgt cctgtgtctg cagagaacta
23461 taaaggatat tctgcatttt gcaggttaca tttgcaggta acacagccat ctattgcatc
23521 aagaatggat attcatgcaa cctttgactt atgggcagag gacattttca caaggaatga
23581 acataatacg aaaggcttct gagactaaaa aattccaaca tatggaagag gtgcccttgg
23641 tggcagcctt ccattttgta tgtttaagca ccttcaagtg atattccttt ctttagtaac
23701 ataaagtata gataattaag gtaccttAAT TAAActacct tctagacact gagagcaaat signal for 2.1k polyA
23761 ctgttgttta tctggaaccc aggatgattt tgacattgct tagggatgtg agagttggac
23821 tgtaaagaaa gctgagtgct gaagagttga tgcttttgaa ctatagtgtt ggagaaaact
23881 cttgagagtc ccttggactg aaaggagatc agtcctgaat attcattgga aggactgatg
23941 ctgaagctga aactccaata ctttggtcac ctgatgggaa gaactgaagg caggagggat
24001 gctaggaaag actgaaggca ggaggagaag gggacgacag aggatgagat ggctagatgg
24061 catcatggac tcaatggaca tgagcttaag taaactccag gagttggcga tggacaggga
24121 gacctggcgt cctgcagtcc atggtgtcgc agagtcggac acgattgagt gactaaattg
24181 aggtgaaccc agattttaac atagagaatg cagatacaaa aactccatat tcatttgatt
24241 gaatcttttc ctgaaccagt gctagtgttg gactggtaag agtataacag catatatagg
24301 ttatgtgatg aagagaatag tgtacatgaa atatgtgcat ttctttattg ctgtcttata
24361 attgtcaaaa aagaaagtta ggtccttggt ttctgtaaaa ttgacttgaa tcaaaaggga
24421 ggcatttaaa gaAATAAAtt agagatgata gaaatctgat ccattcagag tagaaaaaga
24481 aattccatta ctgttattta agaaggtaaa attatttcct gaattgttca atattgtcac
24541 ctagcagata gacactatta ttctgtactg tttttactag cttgcacctt gtggtatcct
24601 atgtaaaaac gtatttgcat atgacaaact ttttctgtta gagcaattaa catctgaacc
24661 acctaatgca ttacctgttt ttgtaaggta ctttttgtaa ggtactaagg agatgtgggt
24721 ttaatcccta ggtcaggtaa atcccctaga ggaagaaatg gcaacccact ccagtattct
24781 tgccaggaaa atccagtggg cagaggagcc tggcagggta cagtctaagc atggggttgc
24841 aaagagtgag acaagacttg agctactgaa caataaggac AATAAAtgct gggtcggcta
24901 aaaggttcat taggtttttt ttctgtaaga tggctctagt agtacttgtc tttatcttca
24961 ttcgaaacaa ttttgttaga ttgtatgtga cagctcttgt atcagcatgc atttgaaaaa
25021 aacatcaaaa ttggtaaatt tttgtatagc catcttacta ttgaagatgg aagaaaagaa
25081 gcaaaatttt cagcatatca tgctgtatta tttcaagaaa gataaccaaa atgcaaaaat
25141 gtatttgtga agtgtatgga gaaggggctg caactgatca agcttgtcaa agtagtttgt
25201 gaagtttcgt gctggagatt tcttattgga cgatgctcca cagttggata taccagttga
25261 agttgatagt gatcaaattg agatattgag aataatcgat gttataccac gcgggagata
25321 gctgacatac tcaaaatatc caaatagaac cttgaaaacc atttgcacca tctcagttat
25381 gttaatcact ttgatgtttg agttccacat aagcaaaaaa acaacaacaa caaaaaaaaa
25441 cacaaccttg accatatttg cgcatgcagt tctctactga aatgattgaa aacactttgt
25501 ttttaaaaac agattttgat taacagtggg tacgatacaa taacgtagaa tggaagaaat
25561 tgtagggtga gcaaaatgaa ccaccaccac caaaggccag tcttcctcta aagaagatgt
25621 gtgtatggtg ggattggaaa gtaatcctct attatgaatt cttctggaaa acactgctcc
25681 taattagacc aactgaaagc agcactcaac gaaaagcatc cagaattagt caatagaaaa
25741 cataatcttc catcaggata acgcaagact acatatttct ttgatgaccc agcatggctg
25801 gagtttctga ttcatctgtt gtattcagac gttgcatctt tggatttttt ccatttattt
25861 cagtctacaa aattatcata atggaaaaaa tttccattcc ctggaagatt gtaaagtgca
25921 tctggaaaat ttctttgctc aaaaagataa aaagttttgt gaacacagaa ttatgacgtt
25981 gcctgaaaaa tggcagaagg tagtggaaca aaagagtgac tatgttgttt ggtaaagttc
26041 ttagtgaaaa tgaaaaatgt gtcttttatt tttatttaaa caccaaaggc acattttggc
26101 caacccaata ctgaatactt aaaggaaact cttctgtgtt gtccttagcc ttacagtgtg
26161 cactgaatag ttttgtataa gaatccagag tgatatttga aatacgcatg tgcttatatt
26221 ttttatattt gtaactttgc atgtacttgt tttgtgttaa aagtttataa atatttaata
26281 tctgactaaa attaa
Reduced sheep 3' UTR (deleted retrotransposons, suitable for comparison with human and mouse): net length 1,481
..................gg gcaaccttcc tgttttcatt atcttcttaa tctttgccag gttgggggag
23101 ggagtgtcta cctgcagccc tgtagtggtg gtgtctcatt tcttgcttct ctcttgttac
23161 ctgtataata atacccttgg cgcttacagc actgggaaat gacaagcaga catgagatgc
23221 tgtttattca agtcccatta gctcagtatt ctaatgtccc atcttagcag tgattttgta
23281 gcaattttct catttgtttc aagaacacct gactacattt ccctttggga atagcatttc
23341 tgccaagtct ggaaggaggc cacataatat tcattcaaaa aaacaaaact ggaaatcctt
23401 agttcataga cccagggtcc accctgttga gagcatgtgt cctgtgtctg cagagaacta
23461 taaaggatat tctgcatttt gcaggttaca tttgcaggta acacagccat ctattgcatc
23521 aagaatggat attcatgcaa cctttgactt atgggcagag gacattttca caaggaatga
23581 acataatacg aaaggcttct gagactaaaa aattccaaca tatggaagag gtgcccttgg
23641 tggcagcctt ccattttgta tgtttaagca ccttcaagtg atattccttt ctttagtaac
23701 ataaagtata gataattaag gtaccttaat taaactacct tctagacact gagagcaaat
23761 ctgttgttta tctggaaccc aggatgattt tgacattgct ccc........agattttaac
atagagaatg cagatacaaa aactccatat tcatttgatt
24241 gaatcttttc ctgaaccagt gctagtgttg gactggtaag agtataacag catatatagg
24301 ttatgtgatg aagagaatag tgtacatgaa atatgtgcat ttctttattg ctgtcttata
24361 attgtcaaaa aagaaagtta ggtccttggt ttctgtaaaa ttgacttgaa tcaaaaggga
24421 ggcatttaaa gaAATAAAtt agagatgata gaaatctgat ccattcagag tagaaaaaga
24481 aattccatta ctgttattta agaaggtaaa attatttcct gaattgttca atattgtcac
24541 ctagcagata gacactatta ttctgtactg tttttactag cttgcacctt gtggtatcct
24601 atgtaaaaac gtatttgcat atgacaaact ttttctgtta gagcaattaa catctgaacc
24661 acctaatgca ttacctgttt ttgtaaggta ctttttgtaa ggtactaagaa caataagga
cAATAAAtgtactgaatactt aaaggaaact cttctgtgtt gtccttagcc ttacagtgtg
26161 cactgaatag ttttgtataa gaatccagag tgatatttga aatacgcatg tgcttatatt
26221 ttttatattt gtaactttgc atgtacttgt tttgtgttaa aagtttataa atatttaata
26281 tctgactaaa attaa
reduced sheep with poly A sites identified: 998 LDF- 4.97 1287 LDF- 2.52
gggcaaccttcctgttttcattatcttcttaatctttgccaggttgggggagggagtgtctacctgcagccctgtagtggtggtgtctcatttcttgcttctctcttgttacctgtataataatacccttggcgcttacagcactgggaaatgacaagcagacatgagatgctgtttattcaagtcccattagctcagtattctaatgtcccatcttagcagtgattttgtagcaattttctcatttgtttcaagaacacctgactacatttccctttgggaatagcatttctgccaagtctggaaggaggccacataatattcattcaaaaaaacaaaactggaaatccttagttcatagacccagggtccaccctgttgagagcatgtgtcctgtgtctgcagagaactataaaggatattctgcattttgcaggttacatttgcaggtaacacagccatctattgcatcaagaatggatattcatgcaacctttgacttatgggcagaggacattttcacaaggaatgaacataatacgaaaggcttctgagactaaaaaattccaacatatggaagaggtgcccttggtggcagccttccattttgtatgtttaagcaccttcaagtgatattcctttctttagtaacataaagtatagataattaaggtaccttaattaaactaccttctagacactgagagcaaatctgttgtttatctggaacccaggatgattttgacattgctcccagattttaacatagagaatgcagatacaaaaactccatattcatttgattgaatcttttcctgaaccagtgctagtgttggactggtaagagtataacagcatatataggttatgtgatgaagagaatagtgtacatgaaatatgtgcatttctttattgctgtcttataattgtcaaaaaagaaagttaggtccttggtttctgtaaaattgacttgaatcaaaagggaggcatttaaaga.AATAAA.ggtatcctatgtaaaaacgtatttgcatatgacaaactttttctgttagagcaattaacatctgaaccacctaatgcattacctgtttttgtaaggtactttttgtaaggtactaagaacaataaggac.AATAAA.tgtactgaatacttaaaggaaactcttctgtgttgtccttagccttacagtgtgcactgaatagttttgtataagaatccagagtgatatttgaaatacgcatgtgcttatattttttatatttgtaactttgcatgtacttgttttgtgttaaaagtttataaatatttaatatctgactaaaattaa
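Candidate polyA signals can also be located by a plain hexamer scan. The sketch below is a simple motif search in Python, not the discriminant-function predictor that produced the LDF scores above; the function name `find_signals` is illustrative, and the test fragment is copied from the sheep sequence at 24421. Positions are reported 1-based, like the 998 and 1287 quoted above.

```python
import re


def find_signals(seq: str, motifs=("AATAAA", "ATTAAA")):
    """Return sorted (1-based position, motif) pairs, overlapping hits included."""
    seq = seq.upper()
    hits = []
    for motif in motifs:
        # A lookahead matches at every start position, so overlaps are not skipped.
        for m in re.finditer(f"(?={motif})", seq):
            hits.append((m.start() + 1, motif))
    return sorted(hits)


fragment = "ggcatttaaagaaataaattagagatgatag"  # sheep 24421 onward
print(find_signals(fragment))
```

A hexamer hit is only a candidate: whether it is used in vivo is what the ESTs ending just downstream of it can confirm.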
Bov-B 23801..24187 (378 Blast hits, including muntjak, goat, deer, and viper)
1 tagggatgtg agagttggac tgtaaagaaa gctgagtgct gaagagttga tgcttttgaa
61 ctatagtgtt ggagaaaact cttgagagtc ccttggactg aaaggagatc agtcctgaat
121 attcattgga aggactgatg ctgaagctga aactccaata ctttggtcac ctgatgggaa
181 gaactgaagg caggagggat gctaggaaag actgaaggca ggaggagaag gggacgacag
241 aggatgagat ggctagatgg catcatggac tcaatggaca tgagcttaag taaactccag
301 gagttggcga tggacaggga gacctggcgt cctgcagtcc atggtgtcgc agagtcggac
361 acgattgagt gactaaattg aggtgaa
Bov-tA3 24709..24867 (527 Blast hits, including deer, bison, and goat)
1 ggagatgtgg gtttaatccc taggtcaggt aaatccccta gaggaagaaa tggcaaccca
61 ctccagtatt cttgccagga aaatccagtg ggcagaggag cctggcaggg tacagtctaa
121 gcatggggtt gcaaagagtg agacaagact tgagctact
Oamar1 24889 to 26108 (few Blast hits, retrotransposon basis unknown)
1 ctgggtcggc taaaaggttc attaggtttt ttttctgtaa gatggctcta gtagtacttg
61 tctttatctt cattcgaaac aattttgtta gattgtatgt gacagctctt gtatcagcat
121 gcatttgaaa aaaacatcaa aattggtaaa tttttgtata gccatcttac tattgaagat
181 ggaagaaaag aagcaaaatt ttcagcatat catgctgtat tatttcaaga aagataacca
241 aaatgcaaaa atgtatttgt gaagtgtatg gagaaggggc tgcaactgat caagcttgtc
301 aaagtagttt gtgaagtttc gtgctggaga tttcttattg gacgatgctc cacagttgga
361 tataccagtt gaagttgata gtgatcaaat tgagatattg agaataatcg atgttatacc
421 acgcgggaga tagctgacat actcaaaata tccaaataga accttgaaaa ccatttgcac
481 catctcagtt atgttaatca ctttgatgtt tgagttccac ataagcaaaa aaacaacaac
541 aacaaaaaaa aacacaacct tgaccatatt tgcgcatgca gttctctact gaaatgattg
601 aaaacacttt gtttttaaaa acagattttg attaacagtg ggtacgatac aataacgtag
661 aatggaagaa attgtagggt gagcaaaatg aaccaccacc accaaaggcc agtcttcctc
721 taaagaagat gtgtgtatgg tgggattgga aagtaatcct ctattatgaa ttcttctgga
781 aaacactgct cctaattaga ccaactgaaa gcagcactca acgaaaagca tccagaatta
841 gtcaatagaa aacataatct tccatcagga taacgcaaga ctacatattt ctttgatgac
901 ccagcatggc tggagtttct gattcatctg ttgtattcag acgttgcatc tttggatttt
961 ttccatttat ttcagtctac aaaattatca taatggaaaa aatttccatt ccctggaaga
1021 ttgtaaagtg catctggaaa atttctttgc tcaaaaagat aaaaagtttt gtgaacacag
1081 aattatgacg ttgcctgaaa aatggcagaa ggtagtggaa caaaagagtg actatgttgt
1141 ttggtaaagt tcttagtgaa aatgaaaaat gtgtctttta tttttattta aacaccaaag
1201 gcacattttg gccaacccaa
3'UTR 957..4244 = 3,288 bp
repeat_region 1713..2093 Bov-B = 381 bp [alt: BOV2; LINE-like element 1717-2093]
repeat_region 2634..2792 Bov-tA3 = 159 bp [alt: 2633-2797 24,708..24,872 BOVTA; art-2 SINE]
repeat_region 2814..4039 OaMAR1 = 1,226 bp [mariner element]
total bovine retrotransposons = 1,766 bp
net = 1,522 bp
old_sequence 2096 D10612 replace g by c
variation 2182 replace a by g
misc_signal 4003..4011 ttatttaaa AU-rich element that is thought to mediate mRNA degradation
polyA_signal 4207..4212 taatat
polyA_signal 4222..4227 attaaa
PolyA predicted at position 1850 of the 3' UTR (position 2806 in the numbering below = 1850 + 956, since the UTR begins at 957)
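Because the first base of the UTR is itself position 957, a UTR-relative position p corresponds to 956 + p in the entry's own numbering. A minimal Python sketch of the conversion in both directions (function names are illustrative):

```python
UTR_START = 957  # first base of the bovine 3' UTR, from "3'UTR 957..4244"


def utr_to_entry(utr_pos: int, utr_start: int = UTR_START) -> int:
    """Convert a 1-based position within the UTR to the entry's numbering."""
    return utr_start + utr_pos - 1


def entry_to_utr(entry_pos: int, utr_start: int = UTR_START) -> int:
    """Convert a position in the entry's numbering to a 1-based UTR position."""
    return entry_pos - utr_start + 1


print(utr_to_entry(1))      # 957
print(utr_to_entry(1850))   # 2806
print(entry_to_utr(4244))   # 3288
```

Position 2806 is where the aataaa hexamer begins in the numbered bovine sequence (line 2761).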
957 .............................................................gggc
961 aaccttcctg ttttcattat cttcttaatc tttaccaggt tgggggaggg agtatctacc
1021 tgcagccccg tagtggtggt gtctcatttc gtgcttctct ctttgttacc tgtatgctaa
1081 tacccttggc gcttatagca ctgggaaatg aagagcagac atgagatgct gtttattcaa
1141 gtcccgttag ctcagtatgc taatgcccca tcttagcagt gattttgtag caattttctc
1201 atttgtttca agaacacgtg actacatttc ccttttggaa tagcatttct gccaagtctg
1261 gaaggaggcc acataatatt cattcaaaaa aacaaaccgg aaatccttag ttcatagacc
1321 cagggtccac ctggttgaga gcttgtgtcc tgtgtctgca gagaactata aaggatattc
1381 tgcattttgc aggttacatt tgcaggtaac acagccagct attgcatcaa gaatggatat
1441 tcatgcaacc tttgacttat gggtagagga cattttcaca aggaatgaac ataatacgaa
1501 aggcttctga gactaaaaaa ttccaacata tgggagaggt gcccttggtg gcagccttcc
1561 attttgtatg tttaaagcac cttcaagtgg tattcctttc tttagtaaca aagtatagat
1621 aattaagtta ccttaattta attaaactac cttctagaca ctgagagcaa atctgttgtt
1681 tatctggaac ccaggatgat tttgacattg tttagagatg tgagagttga actgtaaaga
1741 aagctgagtg ctgaagaatt gatgcttttg aactctagtg ttggagaaaa cttgagagtc
1801 ccttggactg caaggagatc aaattagtcc atcctaaagg agatcagtcc tgaatattca
1861 ttggaaggac tgatgctgaa cgtgaaactc caatactttg gccacctgat gggaagaact
1921 gaaggcagga ggagaagggg atgacagagg atgaagatgg ctggatggca tcatggattc
1981 aatggacatg agcttgagta aactccagga gttggcaatc gacggagtcc tggcatcctg
2041 cagtccatgg tgtcgcagag ttggacacga ctgagtgact gaactgaggt gaacccagat
2101 tttaacatag agaatgcaga tataaaaact ccatattcat ttgattgaat cttttcctta
2161 accagtgcta gtgttggact ggtaagatta taacaacaaa tataggttat gtgatgaaga
2221 gaatagtgta caaagaaaag aaatatgtgc atttctttat tgctatcata attgtcaaaa
2281 aacaaaatta ggtccttggt ttctgtaaaa ttaacttttg aatcaacagg gaggcattta
2341 aagaaatatc ttaaattaga gacagtagaa atctgataca ttcagagtgg aaaaagaaat
2401 tctattacga ttatttaaga aggtaaaatt atttcctggg ttgttcaata ttgtcaccta
2461 gcagatagac actattgttc tgcactgtta ttactggctt gcactttgtg gtatcctatg
2521 taaaaataca tatattgcat atgacagact taagaatttc tgttagagca attaacatct
2581 gaactatcta atgcattacc tgtttttgta aggtactttt tgtaaggtac taaggagacg
2641 tgggtttaat ccctaggtca tgtaaatccc ctggaggagg aaatagcaac ccactccagt
2701 attcttgcca ggagaatccc atgggcagag gagcctggca gggtgcagtc catgcatagg
2761 gttgcaaaga gtcagacaag acttgagcta ctaaacaata acaacaataa atgctgggtt
2821 ggctaaaagg ttcattaggt tttttttctg taagatggct gtctttaact tcattcgaaa
2881 caattttgtt agattgtatg tgacagctct tgtatcagca tgcatttgaa aaagaaaaca
2941 acttaccaaa attggtgaat ttttgtatag ccattttact attgaagatg gaagaaaaga
3001 agcaaaattt tcagcatatc atgctgtatt atttcaagaa agataacaca accaaaatgc
3061 gaaaatgtat ttgtgcagtg tatggagaag gtgctgcaac tgatcaagct tgtcaaagta
3121 gtttgtgaag tattgtgctg gagatttctt actggacaat gctccacagt cgggtatacc
3181 agttgaagtt gatagtgatc aaattgagat attgagaaca atcaatgtta taccacgtgg
3241 gagatagctg acatactcaa aatatccaaa tagaaccttg aaaaccattt gcaccatctc
3301 agttatgtta ataactttga tgtttgagtt ccacataaat taagcaaaaa aaaaacaaaa
3361 acaaaaacac acaaccttga ccatatttgc atatgcagtt ctctactgaa atgaatgaaa
3421 acacttttgt ttttaaaaac agattttgat gaacagtgga tactatacaa taacgtagaa
3481 tggaaaagac tgtggggtga gcaaaatgaa ccagcaccac caaaggccag gcttcatcca
3541 aagaagatgt gtgtatggtg ggattggaaa gtaatcctct attatgggat tcttctggaa
3601 aaccaaaaaa tcaattccaa caagtactgc tcctaattag accaactgaa agcagcattc
3661 aatgaaaagc atccagaatt agtcaataga aagcatataa tcttccatca ggataacaca
3721 agactacatt tctttgatga cccagcatgg ctgagaggtt ctgattcacc tgctgtattc
3781 agacattgca tctttggatt tccatttatt tcagtctaca gaattatcat catgaaaaaa
3841 atttccattc cctggaagat tgtaaagtgc atctggaaaa cttctttgct caaaaagata
3901 aaaagttttg tgaacacaga attatgaagt tgcctgaaaa acagcagaag atagtgacta
3961 tgttgttcag taaagttctt ggtgcaaatg tgtcttttat ttttatttaa acactaaagg
4021 cacgttttgg ccaacccaat actgaatact taaaggaaac tcttccgtgt tgtccttagc
4081 cttacagcgt gcactgaata gttttgtata agaatccaga gtgatatttg aaatacgcat
4141 gtgcttatat tttctatatt tgtaactttg catgtacttg ttttgtgtta aaagtttata
4201 aatatttaat atctgactaa aattaaacag gagctaaaag gagg
Bov-B 1713..2093 (bovine)
tagagatg tgagagttga actgtaaaga
1741 aagctgagtg ctgaagaatt gatgcttttg aactctagtg ttggagaaaa cttgagagtc
1801 ccttggactg caaggagatc aaattagtcc atcctaaagg agatcagtcc tgaatattca
1861 ttggaaggac tgatgctgaa cgtgaaactc caatactttg gccacctgat gggaagaact
1921 gaaggcagga ggagaagggg atgacagagg atgaagatgg ctggatggca tcatggattc
1981 aatggacatg agcttgagta aactccagga gttggcaatc gacggagtcc tggcatcctg
2041 cagtccatgg tgtcgcagag ttggacacga ctgagtgact gaactgaggt gaa
Sequence numbered from beginning of 3' UTR: 3288 bp
PolyA predicted at position 1850
1 gggcaacctt cctgttttca ttatcttctt aatctttacc aggttggggg agggagtatc
61 tacctgcagc cccgtagtgg tggtgtctca tttcgtgctt ctctctttgt tacctgtatg
121 ctaataccct tggcgcttat agcactggga aatgaagagc agacatgaga tgctgtttat
181 tcaagtcccg ttagctcagt atgctaatgc cccatcttag cagtgatttt gtagcaattt
241 tctcatttgt ttcaagaaca cgtgactaca tttccctttt ggaatagcat ttctgccaag
301 tctggaagga ggccacataa tattcattca aaaaaacaaa ccggaaatcc ttagttcata
361 gacccagggt ccacctggtt gagagcttgt gtcctgtgtc tgcagagaac tataaaggat
421 attctgcatt ttgcaggtta catttgcagg taacacagcc agctattgca tcaagaatgg
481 atattcatgc aacctttgac ttatgggtag aggacatttt cacaaggaat gaacataata
541 cgaaaggctt ctgagactaa aaaattccaa catatgggag aggtgccctt ggtggcagcc
601 ttccattttg tatgtttaaa gcaccttcaa gtggtattcc tttctttagt aacaaagtat
661 agataattaa gttaccttaa tttaattaaa ctaccttcta gacactgaga gcaaatctgt
721 tgtttatctg gaacccagga tgattttgac attgtttaga gatgtgagag ttgaactgta
781 aagaaagctg agtgctgaag aattgatgct tttgaactct agtgttggag aaaacttgag
841 agtcccttgg actgcaagga gatcaaatta gtccatccta aaggagatca gtcctgaata
901 ttcattggaa ggactgatgc tgaacgtgaa actccaatac tttggccacc tgatgggaag
961 aactgaaggc aggaggagaa ggggatgaca gaggatgaag atggctggat ggcatcatgg
1021 attcaatgga catgagcttg agtaaactcc aggagttggc aatcgacgga gtcctggcat
1081 cctgcagtcc atggtgtcgc agagttggac acgactgagt gactgaactg aggtgaaccc
1141 agattttaac atagagaatg cagatataaa aactccatat tcatttgatt gaatcttttc
1201 cttaaccagt gctagtgttg gactggtaag attataacaa caaatatagg ttatgtgatg
1261 aagagaatag tgtacaaaga aaagaaatat gtgcatttct ttattgctat cataattgtc
1321 aaaaaacaaa attaggtcct tggtttctgt aaaattaact tttgaatcaa cagggaggca
1381 tttaaagaaa tatcttaaat tagagacagt agaaatctga tacattcaga gtggaaaaag
1441 aaattctatt acgattattt aagaaggtaa aattatttcc tgggttgttc aatattgtca
1501 cctagcagat agacactatt gttctgcact gttattactg gcttgcactt tgtggtatcc
1561 tatgtaaaaa tacatatatt gcatatgaca gacttaagaa tttctgttag agcaattaac
1621 atctgaacta tctaatgcat tacctgtttt tgtaaggtac tttttgtaag gtactaagga
1681 gacgtgggtt taatccctag gtcatgtaaa tcccctggag gaggaaatag caacccactc
1741 cagtattctt gccaggagaa tcccatgggc agaggagcct ggcagggtgc agtccatgca
1801 tagggttgca aagagtcaga caagacttga gctactaaac aataacaaca ataaatgctg
1861 ggttggctaa aaggttcatt aggttttttt tctgtaagat ggctgtcttt aacttcattc
1921 gaaacaattt tgttagattg tatgtgacag ctcttgtatc agcatgcatt tgaaaaagaa
1981 aacaacttac caaaattggt gaatttttgt atagccattt tactattgaa gatggaagaa
2041 aagaagcaaa attttcagca tatcatgctg tattatttca agaaagataa cacaaccaaa
2101 atgcgaaaat gtatttgtgc agtgtatgga gaaggtgctg caactgatca agcttgtcaa
2161 agtagtttgt gaagtattgt gctggagatt tcttactgga caatgctcca cagtcgggta
2221 taccagttga agttgatagt gatcaaattg agatattgag aacaatcaat gttataccac
2281 gtgggagata gctgacatac tcaaaatatc caaatagaac cttgaaaacc atttgcacca
2341 tctcagttat gttaataact ttgatgtttg agttccacat aaattaagca aaaaaaaaac
2401 aaaaacaaaa acacacaacc ttgaccatat ttgcatatgc agttctctac tgaaatgaat
2461 gaaaacactt ttgtttttaa aaacagattt tgatgaacag tggatactat acaataacgt
2521 agaatggaaa agactgtggg gtgagcaaaa tgaaccagca ccaccaaagg ccaggcttca
2581 tccaaagaag atgtgtgtat ggtgggattg gaaagtaatc ctctattatg ggattcttct
2641 ggaaaaccaa aaaatcaatt ccaacaagta ctgctcctaa ttagaccaac tgaaagcagc
2701 attcaatgaa aagcatccag aattagtcaa tagaaagcat ataatcttcc atcaggataa
2761 cacaagacta catttctttg atgacccagc atggctgaga ggttctgatt cacctgctgt
2821 attcagacat tgcatctttg gatttccatt tatttcagtc tacagaatta tcatcatgaa
2881 aaaaatttcc attccctgga agattgtaaa gtgcatctgg aaaacttctt tgctcaaaaa
2941 gataaaaagt tttgtgaaca cagaattatg aagttgcctg aaaaacagca gaagatagtg
3001 actatgttgt tcagtaaagt tcttggtgca aatgtgtctt ttatttttat ttaaacacta
3061 aaggcacgtt ttggccaacc caatactgaa tacttaaagg aaactcttcc gtgttgtcct
3121 tagccttaca gcgtgcactg aatagttttg tataagaatc cagagtgata tttgaaatac
3181 gcatgtgctt atattttcta tatttgtaac tttgcatgta cttgttttgt gttaaaagtt
3241 tataaatatt taatatctga ctaaaattaa acaggagcta aaaggagg
Reduced bovine sequence
957 .............................................................gggc
961 aaccttcctg ttttcattat cttcttaatc tttaccaggt tgggggaggg agtatctacc
1021 tgcagccccg tagtggtggt gtctcatttc gtgcttctct ctttgttacc tgtatgctaa
1081 tacccttggc gcttatagca ctgggaaatg aagagcagac atgagatgct gtttattcaa
1141 gtcccgttag ctcagtatgc taatgcccca tcttagcagt gattttgtag caattttctc
1201 atttgtttca agaacacgtg actacatttc ccttttggaa tagcatttct gccaagtctg
1261 gaaggaggcc acataatatt cattcaaaaa aacaaaccgg aaatccttag ttcatagacc
1321 cagggtccac ctggttgaga gcttgtgtcc tgtgtctgca gagaactata aaggatattc
1381 tgcattttgc aggttacatt tgcaggtaac acagccagct attgcatcaa gaatggatat
1441 tcatgcaacc tttgacttat gggtagagga cattttcaca aggaatgaac ataatacgaa
1501 aggcttctga gactaaaaaa ttccaacata tgggagaggt gcccttggtg gcagccttcc
1561 attttgtatg tttaaagcac cttcaagtgg tattcctttc tttagtaaca aagtatagat
1621 aattaagtta ccttaattta attaaactac cttctagaca ctgagagcaa atctgttgtt
1681 tatctggaac ccaggatgat tttgacattg tttagacccagat
2101 tttaacatag agaatgcaga tataaaaact ccatattcat ttgattgaat cttttcctta
2161 accagtgcta gtgttggact ggtaagatta taacaacaaa tataggttat gtgatgaaga
2221 gaatagtgta caaagaaaag aaatatgtgc atttctttat tgctatcata attgtcaaaa
2281 aacaaaatta ggtccttggt ttctgtaaaa ttaacttttg aatcaacagg gaggcattta
2341 aagaaatatc ttaaattaga gacagtagaa atctgataca ttcagagtgg aaaaagaaat
2401 tctattacga ttatttaaga aggtaaaatt atttcctggg ttgttcaata ttgtcaccta
2461 gcagatagac actattgttc tgcactgtta ttactggctt gcactttgtg gtatcctatg
2521 taaaaataca tatattgcat atgacagact taagaatttc tgttagagca attaacatct
2581 gaactatcta atgcattacc tgtttttgta aggtactttt tgtaaggtac taaaaacaata acaacaataa atgt actgaatact taaaggaaac tcttccgtgt tgtccttagc
4081 cttacagcgt gcactgaata gttttgtata agaatccaga gtgatatttg aaatacgcat
4141 gtgcttatat tttctatatt tgtaactttg catgtacttg ttttgtgtta aaagtttata
4201 aatatttaat atctgactaa aattaaacag gagctaaaag gagg
Alignment of sheep and cow Bov-B:
Sheep: 1    tagggatgtgagagttggactgtaaagaaagctgagtgctgaagagttgatgcttttgaa 60
            ||| ||||||||||||| ||||||||||||||||||||||||||| ||||||||||||||
Bovin: 1713 tagagatgtgagagttgaactgtaaagaaagctgagtgctgaagaattgatgcttttgaa 1772

Sheep: 61   ctatagtgttggagaaaactcttgagagtcccttggactg----------aaa------- 103
            || |||||||||||||||  ||||||||||||||||||||          |||
Bovin: 1773 ctctagtgttggagaaaa--cttgagagtcccttggactgcaaggagatcaaattagtcc 1830

Sheep: 104  --------ggagatcagtcctgaatattcattggaaggactgatgctgaagctgaaactc 155
                    ||||||||||||||||||||||||||||||||||||||||||  ||||||||
Bovin: 1831 atcctaaaggagatcagtcctgaatattcattggaaggactgatgctgaacgtgaaactc 1890

Sheep: 156  caatactttggtcacctgatgggaagaactgaaggcaggagggatgctaggaaagactga 215
            ||||||||||| |||||||||||||||||||||||||||| |||              ||
Bovin: 1891 caatactttggccacctgatgggaagaactgaaggcagga-gga--------------ga 1935

Sheep: 216  aggcaggaggagaaggggacgacagaggatgagatggctagatggcatcatggactcaat 275
            |||  ||| ||  || ||| || |||        ||||| |||||||||||||| |||||
Bovin: 1936 agg--ggatga-cagaggatga-aga--------tggctggatggcatcatggattcaat 1983

Sheep: 276  ggacatgagcttaagtaaactccaggagttggcgatggacagggagacctggcgtcctgc 335
            |||||||||||| |||||||||||||||||||| || |||  |||| |||||| ||||||
Bovin: 1984 ggacatgagcttgagtaaactccaggagttggcaatcgac--ggagtcctggcatcctgc 2041

Sheep: 336  agtccatggtgtcgcagagtcggacacgattgagtgactaaattgaggtgaa 387
            |||||||||||||||||||| |||||||| ||||||||| || |||||||||
Bovin: 2042 agtccatggtgtcgcagagttggacacgactgagtgactgaactgaggtgaa 2093
Alignment of sheep and cow Bov-tA3:
Sheep: 1 ggagatgtgggtttaatccctaggtcaggtaaatcccctagaggaagaaatggcaaccca 60
||||| ||||||||||||||||||||| ||||||||||| ||||| ||||| ||||||||
Bovin: 2634 ggagacgtgggtttaatccctaggtcatgtaaatcccctggaggaggaaatagcaaccca 2693
Sheep: 61 ctccagtattcttgccaggaaaatccagtgggcagaggagcctggcagggtacagtctaa 120
|||||||||||||||||||| ||||| ||||||||||||||||||||||| ||||| |
Bovin: 2694 ctccagtattcttgccaggagaatcccatgggcagaggagcctggcagggtgcagtccat 2753
Sheep: 121 gcatggggttgcaaagagtgagacaagacttgagctact 159
|||| |||||||||||||| |||||||||||||||||||
Bovin: 2754 gcatagggttgcaaagagtcagacaagacttgagctact 2792
Alignment of mariner pseudogene of sheep and cow:
SHEEP 1 ctgggtcggctaaaaggttcattaggttttttttctgtaagatggctctagtagtacttg 60
AB001468 2814 ......t........................................------------. 2861
SHEEP 61 tctttatcttcattcgaaacaattttgttagattgtatgtgacagctcttgtatcagcat 120
AB001468 2862 ......a..................................................... 2921
SHEEP 121 gcatttgaaaaa---aaca---t--caaaattggtaaatttttgtatagccatcttacta 172
AB001468 2922 ............gaa....act.ac..........g.................t...... 2981
SHEEP 173 ttgaagatggaagaaaagaagcaaaattttcagcatatcatgctgtattatttcaagaaa 232
AB001468 2982 ............................................................ 3041
SHEEP 233 gataac-----caaaatgcaaaaatgtatttgtgaagtgtatggagaaggggctgcaact 287
AB001468 3042 ......acaac........g..............c...............t......... 3101
SHEEP 288 gatcaagcttgtcaaagtagtttgtgaagt-ttcgtgctggagatttcttattggacgat 346
AB001468 3102 ..............................a..-.................c.....a.. 3160
SHEEP 347 gctccacagttggatataccagttgaagttgatagtgatcaaattgagatattgagaata 406
AB001468 3161 ..........c..g............................................c. 3220
SHEEP 407 atcgatgttataccacgcgggagatagctgacatactcaaaatatccaaatagaaccttg 466
AB001468 3221 ...a.............t.......................................... 3280
SHEEP 467 aaaaccatttgcaccatctcagttatgttaatcactttgatgtttgagttccaca----- 521
AB001468 3281 ................................a......................taaat 3340
SHEEP 522 taagcaaaaaaacaacaacaacaaa-aaaaa-acacaaccttgaccatatttgcgcatgc 579
AB001468 3341 ............-.--.....-...c.....c......................at.... 3396
SHEEP 580 agttctctactgaaatgattgaaaacac-tttgtttttaaaaacagattttgattaacag 638
AB001468 3397 ..................a.........t.........................g..... 3456
SHEEP 639 tgggtacgatacaataacgtagaatggaagaa-attgtagggtgagcaaaatgaaccacc 697
AB001468 3457 ...a...t.....................-..g.c...g...................g. 3515
SHEEP 698 accaccaaaggccagtcttcctctaaagaagatgtgtgtatggtgggattggaaagtaat 757
AB001468 3516 ...............g....a..c.................................... 3575
SHEEP 758 cctctattat-gaattcttctggaaaacactgctcctaattagaccaactgaaagcagca 816
AB001468 3576 ..........g.g............... 3603
AB001468 3626 ................................ 3657
SHEEP 817 ctcaacgaaaagcatccagaattagtcaatagaaaacata--atcttccatcaggataac 874
AB001468 3658 t....t.............................g....ta.................. 3717
SHEEP 875 gcaagactacatatttctttgatgacccagcatggctg-gagtttctgattcatctgttg 933
AB001468 3718 a...........--........................a...g..........c...c.. 3775
SHEEP 934 tattcagacgttgcatctttggattttttccatttatttcagtctacaaaattatcataa 993
AB001468 3776 .........a................---...................g.........c. 3832
SHEEP 994 tggaaaaaatttccattccctggaagattgtaaagtgcatctggaaaatttctttgctca 1053
AB001468 3833 ..a.............................................c........... 3892
SHEEP 1054 aaaagataaaaagttttgtgaacacagaattatgacgttgcctgaaaaatggcagaaggt 1113
AB001468 3893 ...................................a.............ca.......-- 3950
SHEEP 1114 agtggaacaaaagagtgactatgttgtttggtaaagttcttagtgaaaatgaaaaatgtg 1173
AB001468 3951 .-.--.--------..............ca...........-.-.----..-c....... 3992
SHEEP 1174 tcttttatttttatttaaacaccaaaggcacattttggccaacccaa 1220
AB001468 3993 ......................t........g............... 4039
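The mariner alignment above uses a compact notation in which, as far as can be told from the rows, a dot in the AB001468 (cow) row stands for "same base as the sheep row" and a dash for a gap. A minimal sketch for turning such a dotted row back into explicit bases follows; the function name `expand_dots` is hypothetical and the reading of the dot convention is an assumption, not something the page states.

```python
def expand_dots(reference: str, dotted: str) -> str:
    """Replace each '.' in `dotted` with the character in the same column of `reference`.

    Assumes both rows come from the same alignment block and have equal width.
    Gap characters ('-') and explicit bases in `dotted` are passed through unchanged.
    """
    if len(reference) != len(dotted):
        raise ValueError("rows must have the same number of columns")
    return "".join(r if d == "." else d for r, d in zip(reference, dotted))


# First ten columns of the first SHEEP / AB001468 block above.
sheep = "ctgggtcggc"
cow = "......t..."
print(expand_dots(sheep, cow))
```

Removing the dashes from the expanded row afterwards gives the ungapped cow sequence for that block.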
Human: positions 26,212-27,817 = 1,606 bp, with no retrotransposons or poly A annotated. (The 4 other full-length human 3' UTRs have 6 distal sites with point changes.)
2221 tgcatgttct tgttttgtta tataaaaaaa ttgtaaatgt ttaatatctg actgaaatta
2281 aacgagcgaa gatgagcacc a
..................................................................ggaaggtct
26221 tcctgttttc accatctttc taatcttttt ccagcttgag ggaggcggta tccacctgca
26281 gcccttttag tggtggtgtc tcactctttc ttctctcttt gtcccggata ggctaatcaa
26341 tacccttggc actgatgggc actggaaaac atagagtaga cctgagatgc tggtcaagcc
26401 ccctttgatt gagttcatca tgagccgttg ctaatgccag gccagtaaaa gtataacagc
26461 aaataaccat tggttaatct ggacttattt ttggacttag tgcaacaggt tgaggctaaa
26521 acaaatctca gaacagtctg aaataccttt gcctggatac ctctggctcc ttcagcagct
26581 agagctcagt atactaatgc cctatcttag tagagatttc atagctattt agagatattt
26641 tccattttaa gaaaacccga caacatttct gccaggtttg ttaggaggcc acatgatact
26701 tattcaaaaa aatcctagag attcttagct cttgggatgc aggctcagcc cgctggagca
26761 tgagctctgt gtgtaccgag aactggggtg atgttttact tttcacagta tgggctacac
26821 agcagctgtt caacaagagt aaatattgtc acaacactga acctctggct agaggacata
26881 ttcacagtga acataactgt aacatatatg aaaggcttct gggacttgaa atcaaatgtt
26941 tgggaatggt gcccttggag gcaacctccc attttagatg tttaaaggac cctatatgtg
27001 gcattccttt ctttaaacta taggtAATTA Aggcagctga aaagtaaatt gccttctaga
27061 cactgaaggc aaatctcctt tgtccattta cctggaaacc agaatgattt tgacatacag
27121 gagagctgca gttgtgaaag caccatcatc atagaggatg atgtaattaa aaaatggtca
27181 gtgtgcaaag aaaagaactg cttgcatttc tttatttctg tctcataatt gtcaaaaacc
27241 agaattaggt caagttcata gtttctgtaa ttggcttttg aatcaaagaa tagggagaca
27301 atctaaaaaa tatcttaggt tggagatgac agaaatatga ttgatttgaa gtggaaaaag
27361 aaattctgtt aatgttaatt aaagtaaaat tattccctga attgtttgat attgtcacct
27421 agcagatatg tattactttt ctgcaatgtt attattggct tgcactttgt gagtattcta
27481 tgtaaaaata tatatgtata taaaatatat attgcatagg acagacttag gagttttgtt
27541 tagagcagtt aacatctgaa gtgtctaatg cattaacttt tgtaaggtac tgaatactta
27601 atatgtggga aacccttttg cgtggtcctt aggcttacaa tgtgcactga atcgtttcat
27661 gtaagaatcc aaagtggaca ccattaacag gtctttgaaa tatgcatgta ctttatattt
27721 tctatatttg taactttgca tgttcttgtt ttgttatata aaaaaattgt aaatgtttaa
27781 tatctgactg aaattaaacg agcgaagatg agcacca
cct cccgtgtctg cagttgtatt
27841 tcctggtgct tgccctgtgt tggggactgt tttgggggtt aatctgagcc aagtggcgct
27901 ttctgtcctc ccttctcaag tgatggccga tggttcacgc acttccccct gttcctgccc
27961 ttgtcctcac ttcccagtca cccactagtt catctctgcg gcttttgcat tttctccaca
28021 agcatctaag tgggcttagc actggtaaac tgcaaaggca ctattgcagc aggaggaaca
28081 gtctgggagc ttttttcagt cctggattta gaaatagatt ttcttgatta aaatgaaaat
28141 taacaagctc taaagaactg ttgacccttg aactacacag ggattagagg cactgacctg
28201 ccgcacagtc gaaaatctgc agagaagttt tttttgtttt gttttgtttt ttttgagacg
28261 gagtctcgct ctgtcgccca ggctggagtg cagtggcggg atctcggctc actgcaacct
28321 ccgcctcccg ggttcaggcg attctcctgc ctcagcctcc tgagtagctg ggactacagg
28381 catatgccac catgcccggc taatttttgt atttttagta gagatggagt ttcaccatat
28441 tggccaggct gttctcaaac tcggcctcaa gtgatctgct cgcctcagcc acccaaagtg
28501 ctaggattac aagcatgagc caccgcgccc ggcctgcata gaacttttaa ctcccccaaa
Extra human sequence after the break: nothing found in human ESTs extends significantly beyond this (111 hits).
.......................................................gggaggcct tcctgcttgt
20461 tccttcgcat tctcgtggtc taggctgggg gaggggttat ccacctgtag ctctttcaat
20521 tgaggtggtt ctcattcttg cttctctgtg tcccccatag gctaataccc ctggcactga
20581 tgggccctgg gaaatgtaca gtagaccagt tgctctttgc ttcaggtccc tttgatggag
20641 tctgtcatca gccagtgcta acaccgggcc aataagaata taacaccaaa taactgctgg
20701 ctagttgggg ctttgttttg gtctagtgaa taaatactgg tgtatcccct gacttgtacc
20761 cagagtacaa ggtgacagtg acacatgtaa cttagcatag gcaaagggtt ctacaaccaa
20821 agaagccact gtttggggat ggcgccctgg aaaacagcct cccacctggg atagctagag
20881 cgtccacacg tggaattctt tctttactaa caaacgatag ctgattgaag gcaacaggaa
20941 aaaaaaaaat caaattgtcc tactgacgtt gaaagcaaac ctttgttcat tcccagggca
21001 ctagaatgat ctttagcctt gcttggattg aactaggaga tcttgactct gaggagagcc
21061 agccctgtaa aaagcttggt cctcctgtga cgggagggat ggttaaggta caaaggctag
21121 aaacttgagt ttcttcattt ctgtctcaca attatcaaaa gctagaatta gcttctgccc
21181 tatgtttctg tacttctatt tgaactggat aacagagaga caatctaaac attctcttag
21241 gctgcagata agagaagtag gctccattcc aaagtgggaa agaaattctg ctagcattgt
21301 ttaaatcagg caaaatttgt tcctgaagtt gctttttacc ccagcagaca taaactgcga
21361 tagcttcagc ttgcactgtg gattttctgt atagaatata taaaacataa cttcaagctt
21421 atgtcttctt tttaaaacat ctgaagtgtg ggacgccctg gccgttccat ccagtactaa
21481 atgcttaccg tgtgaccctt gggctttcag cgtgcactca gttccgtagg attccaaagc
21541 agacccctag ctggtctttg aatctgcatg tacttcacgt tttctatatt tgtaactttg
21601 catgtatttt gttttgtcat ataaaaagtt tataaatgtt tgctatcaga ctgacattaa
21661 atagaagcta tgatg
rat D50093 and M20313 Positions 1404-2625 of D50093 shown 1222 bp (seq continues to 3090):
polyA_signal 2570..2575 tataaa
polyA_signal 2581..2586 tataaa
polyA_signal 2606..2611 attaaa
polyA_site 2625 (or 2627, 2628) g
1404..........................ggaggcc ttcctgcttg ttccttctca ttctcgtggt
1441 ctaggctggg ggaggggtta cccacctgta gctctttcaa ttgaggtggt gtctcattct
1501 tgcttctctt tgtcccccat aggctaatac ccttggcagt gatgggtctg gggaaatgta
1561 cagtagacca gatgctattc gcttcagcgt cctttgattg agtccatcat gggccagggt
1621 taacaccagg ccagtaagaa tataacacca aataactgct ggctagtcag ggctttgttt
1681 tggtctactg agtaaatact gtgtaacccc tgaattgtac ccagaggaca tggtgacaga
1741 gacacacata acttagtata ggcaaagggt tctatagcca aagaagccac tgtgtgggca
1801 tggcaccctg gataacagcc tcccgcctgg gatatctaga gcatccacat gtggaattct
1861 ttcttttcta acataaacca tagctgattg aaggcaacaa gaaaaagaat caaattatcc
1921 tactgacatt gaaagcaaac tgtgttcatt ccctaggcgc tggaatgatt tttagccttg
1981 gattaaacca ggagattttg actctgagga gaaccagcag tacaaaagca tggtctcctg
2041 tgatgggaga gatggtgaag ggacaaaggc aagacccctg cgtttcttca tttctgtctc
2101 ataattatca agagctagaa ttaggtcgtg ccctaagttt ctgtactcgt atttgaactg
2161 gacaacaaag agacaatcta caaattctct tgggctgcag aggagagaaa taggctccat
2221 tccaaagtgg aaagagaaat tctgctagca ttgtctaagt aaggctaact tttccttaaa
2281 tcgctttgta tttcccccag cagacatcac aaccctgtga tcggttcagc ctgcaccgcg
2341 ggtgttctgt gtagaatata taaatataac ttcaagctta ggccttctat tttaaaacat
2401 ctgaagtgtg gaacgcactg gccgttctgt gcagtactaa gtgtgaccct tgggctttca
2461 atgtgcactc ggttccgtat gattccaaag tagagcccta gctggtcttc gaatctgcat
2521 gtacttcacg ttttctatat ttgtaacttc gcatgtattt gttttgtcat ataaaaagtt
2581 tataaatgtt tgctatctga ctgacattaa atagaagcta tgatg
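The rat feature annotations above (polyA_signal at 2570..2575, 2581..2586 and 2606..2611) can be checked directly against the numbered listing. The sketch below assumes GenBank-style lines of the form "position, then blocks of up to ten bases"; the helper names `parse_numbered` and `find_signals` are hypothetical, and adding `aataaa` to the hexamer list is an assumption beyond the two motifs annotated for rat.

```python
SIGNALS = ("aataaa", "attaaa", "tataaa")  # tataaa and attaaa are the ones annotated for rat


def parse_numbered(lines):
    """Return (start_position, sequence) from lines like '2521 gtacttcacg ttttctatat ...'."""
    start, chunks = None, []
    for line in lines:
        fields = line.split()
        if not fields or not fields[0].isdigit():
            continue  # skip captions and feature lines
        if start is None:
            start = int(fields[0])
        chunks.extend(fields[1:])
    return start, "".join(chunks).lower()


def find_signals(seq, start, signals=SIGNALS):
    """Return sorted (first, last, hexamer) tuples in 1-based coordinates, overlaps included."""
    hits = []
    for sig in signals:
        i = seq.find(sig)
        while i != -1:
            hits.append((start + i, start + i + len(sig) - 1, sig))
            i = seq.find(sig, i + 1)
    return sorted(hits)


# Last two lines of the rat listing above.
rat_tail = [
    "2521 gtacttcacg ttttctatat ttgtaacttc gcatgtattt gttttgtcat ataaaaagtt",
    "2581 tataaatgtt tgctatctga ctgacattaa atagaagcta tgatg",
]
start, seq = parse_numbered(rat_tail)
for first, last, sig in find_signals(seq, start):
    print(f"polyA_signal {first}..{last} {sig}")
# Reproduces the annotated 2570..2575, 2581..2586 and 2606..2611.
```

A hexamer match is only a candidate signal; whether it is used in vivo is what the 3' ends of the ESTs are needed for.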
golden hamster M14054, M37381, K02234. Positions 1249-2462 of M14054 shown, 1214 bp (seq continues to 3002). Rat/mouse gives 87% identity, rat/hamster and mouse/hamster 78% identity.
1249.....................................................ag gaagcctccc
1261 tgcttgtact tcctcgttct tgtggtctag gctgggggag gggttatcca ccgtagctct
1321 tttaattgag gtggtgtctc attcctgctt ctctttgtcc cccataggct aatgcccttg
1381 gcactagtgg gccctgggaa tgtacagtag accagatgct attcgatcca gagcctttga
1441 attgagtcca tcacgggcca gcactaacac caggcctatc tgaatataac agcaagtaat
1501 ggctggctag tcagggcttt gttttggtct agtgagtaaa tactgatgtg accctctgac
1561 ttccacacag agtacgcagt gacagacaca cctaactgtt aaaataggcg aagggttcta
1621 cagccaaaga agtcactgtt tggcatggtc cctaagaaac agcctcccat ttgggatatt
1681 taaagcatcc atatgaggca ttcctccttc actaacaaac tctagctgag taaggcaacg
1741 ggaaaaaaac aaaattaccc tactaacatg gaaagcaaac ctgtgttcat ttcctaggaa
1801 ctagaatgat gttttagcct tgcttggatt gaaccaggag attttggctc tgaagagcca
1861 acactgtaaa aatgtggtcc tcctgcaaag ggagagatgg ttaggacaca aagtcacggc
1921 gcttggcgtt tcttcatttc tgtctcataa ttgtcaaaag tcacaattag gtcatgccct
1981 tagttaatat acttgtattt gaatcggacg acaagagaca atctaaaaat tctcctaggt
2041 tgtagatgaa ataggctcca ttcaaggtga aaagacagtt tgttagcgtt gcttatgtaa
2101 ggcaaacttt gttccttaag ttgctccgtg tttccctgag cagacataac cactctgcaa
2161 cagcattgcc ctgctgtaga atatataaag tgtaactaca agcttagacc ttctgttctg
2221 atgcatccga agtacgtaat gcactgacca tttcacccgg tatcagatgt tttctgtgtg
2281 gcccctagct ttccttcaac atgcattcgg ttccatatat gaatccaaag tggaccccct
2341 aactggtctc tgaaatctgc atgtacttca cattttctat atttgtaact ttgcatgtcc
2401 ttgttttgtc atataaaaag tttataaatg tttgctatct gactgacatt aaataggagc
2461 ta
Ancestral rodent 3' UTR: outgroup arbitration on (hamster, (rat,mouse)) gives 50% identity with the human sequence (which is 1606 bp vs 1242 bp; large gaps occur centrally).
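The arbitration rule itself is not spelled out here. One plausible reading, sketched below under that assumption, is a column-by-column vote over a three-way alignment: where rat and mouse agree their base is taken, and where they disagree the hamster outgroup breaks the tie. The function name `arbitrate`, the `n` placeholder for three-way disagreements, and the handling of gaps are all hypothetical choices, not the method actually used to build the sequence that follows.

```python
def arbitrate(outgroup: str, ingroup_a: str, ingroup_b: str, unresolved: str = "n") -> str:
    """Column-wise ancestral call for the tree (outgroup, (ingroup_a, ingroup_b)).

    All three rows must come from one multiple alignment (equal width, '-' for gaps).
    Columns whose call is a gap are dropped from the returned sequence.
    """
    if not (len(outgroup) == len(ingroup_a) == len(ingroup_b)):
        raise ValueError("aligned rows must have equal length")
    calls = []
    for o, a, b in zip(outgroup, ingroup_a, ingroup_b):
        if a == b:
            calls.append(a)           # the two ingroup species agree
        elif a == o:
            calls.append(a)           # outgroup sides with ingroup_a
        elif b == o:
            calls.append(b)           # outgroup sides with ingroup_b
        else:
            calls.append(unresolved)  # three different states
    return "".join(calls).replace("-", "")


# Toy rows (not taken from the listings above): hamster, rat, mouse.
print(arbitrate("acgt-a", "aagtca", "acgtcc"))
```

A stricter parsimony treatment would also flag columns where rat and mouse agree with each other but not with hamster, since the deeper ancestor is then ambiguous.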
>ancest_rod
gggaggccttcctgcttgttccttctcattctcgtggtctaggctgggggaggggttatccacctgtagctctttcaattgaggtggtgtctcattcttgcttctctttgtcccccataggctaatacccttggcactgatgggcccggggaaatgtacagtagaccagatgctattcgcttcagcgtcctttgattgagtccatcatgggccagggctaacaccaggccaataagaatataacaccaaataactgctggctagtcagggctttgttttggtctagtgagtaaatactggtgtaacccctgacttgtacccagagtacatggtgacagagacacacataacttagtataggcaaagggttctacagccaaagaagccactgtttgggcatggcaccctggataacagcctcccacctgggatatctagagcatccacatgtggaattctttctttactaacaaaccatagctgattgaaggcaacaggaaaaaaaaatcaaattatcctactgacattgaaagcaaacctgtgttcattccctaggcactagaatgatttttagccttgcttggattgaaccaggagattttgactctgaggagagccagcactgtacaaaagcatggtcctcctgtgatgggagagatggttaagggacaaaggcaagacccttgcgtttcttcatttctgtctcataattatcaaaagctagaattaggtcgtgccctaagtttctgtacttgtatttgaactggacaacaaagagacaatctaaaaattctcttaggctgcagatgagagaaataggctccattccaaagtggaaagagaaattctgctagcattgtttaagtaaggcaaactttgttccttaagtcgctttgtatttcccccagcagacataacaaccctgcgatcggttcagcttgcactgcgggtgttctgtgtagaatatataaatataacttcaagcttaggccttctattttaaaacatctgaagtgtggaacgcactggccgttccatccagtactaaatgcttaccgtgtgacccttgggctttcaacgtgcactcggttccgtatgattccaaagtagacccctagctggtcttcgaatctgcatgtacttcacgttttctatatttgtaactttgcatgtatttgttttgtcatataaaaagtttataaatgtttgctatctgactgacattaaatagaagctatgatg
mink S46825: 1632 bp post gene; the 4.6k poly A site is at 1593; only exon 3 is known. Mink is an extremely valuable 3' UTR because it forms an outgroup to cow and sheep more recent than the human or rodent sequence. It aligns fairly well with reduced sheep, averaging 75% identity in the blocks of alignment. Mink lacks any sequence similarity to the 3 ruminant insertions Bov-B, Bov-tA3, and OaMAR1, indicating that these appeared subsequently in the artiodactyl lineage.
ggatgg ccttcccatt ctctccatcg
841 tcttcacctt ttacaggttg ggggaggggg tgtctaccta cagccctgta gtggtggtgt
901 ctcattcctg cttctcttta tcacccatag gctaatcccc ttggccctga tggccctggg
961 aaatgtagag cagacccagg atgctattta ttcaagcccc catgtgttgg agtccttcag
1021 gggccaatgc tagtgcaggg ctgagaataa cagcaaatca tcattggttg acctagggct
1081 gcttttttgt tgttgttgtc tagtgcagct gaccgaggct aaaacaattc tcaaaacagt
1141 tttcaaatac ctttgcctgg aaacctctgg ctcctgctgc agctagagct cagtacatta
1201 atgtcccatc ttagccgtgt cttcatagca acttggggaa gtttttctcc ccactctaaa
1261 agaacgcgat tgcacttccc tgtgcaaaga acatttctgc caaatttgaa aggaggccac
1321 atgatattca ttcaaaaagc aaaactagaa accctttgct cttggacgca agcccggcct
1381 gctaggagca ccaaactggg gcgatggttt gcattctgcg gcgtgggcta tgcggcagcc
1441 gaggtgtcca gcgtaaatat tgatgcgacg ctagacctag gcagaggatg tttgcacagg
1501 gaatgaacat aatcaacagt gcgaaaatgc tacaaaaaat cccacactgg ggagcagtgt
1561 ccttggaggc aagttttttt ccttttggga catttaaagc ccctatatgt ggcattcctt
1621 tctttcgtaa cctaaactat agatAATTAA ggcagttaaa aattgaactt ccttccaggc 2.1k sheep homologue
1681 cccaagagca aatctttgtt cacttacctg gaaaccagaa tgattttgac acagaggaag
1741 gtgcagctgt taaaataacc ctcatcctag aagattgcat catggagaaa acgatccgta
1801 gacaaaaatg atcgcatttc ttcattgctg tctcgtaatt gacagaaacc agaattatgt
1861 caagtcctag tttctataat cagcttttga atcaaagaat ggaagtccat ccaaaaaaaa
1921 aaaagaaata ccttaggtca cccatgacag aaatacccat tcaggttaga aaaaaggaat
1981 tctgttaact gttatttaag taaggcaaaa ttattgtccg gattgttcga tatcatcagc
2041 tagcagataa attagcattc tgcaatgttc ccggcttgca ctgtgcgggt atttgatgtt
2101 aaaaaaaatt attatatata ttgtgtatga caaacttaga agtttttgct agaggagtta
2161 acatctgata tatctaatgc accaccagtt ttggaaggta ctaaatactt aatatgtaga
2221 aatccttttg cgtggtcctc aggcttacac gtgcactgaa tagttttgta tgatagagcc
2281 catgtggtct tcgaaatatg catgtacttt atattttcta tatttgtaac tgggcatgta
2341 cttgtataaa aaatgtataa acattcgaac tcttgactag aATTAAAcag gaactgagtg poly A signal
2401 TGTCCCA.tgt gtttgcagtg acattcacca ccgcaccctg tgttgg poly A site
Mink aligns well with the human 3' UTR, at 71% identity by either Blast or ClustalW.
Human: 1 ggaaggtcttcctgttttcaccatc-t-ttctaatctttttccagcttgagggaggcggt 58
||| || ||||| || || ||||| | ||| | ||||| | || ||| |||||| |||
Mink : 1 ggatggccttcccattctctccatcgtcttc--accttttac-aggttgggggagggggt 57
Human: 59 atccacctgcagcccttttagtggtggtgtctcactctttcttctctctttgtc--ccgg 116
|| |||| ||||||| | ||||||||||||||| || | ||||||| || || ||
Mink : 58 gtctacctacagccctgt-agtggtggtgtctcattcctgcttctct--ttatcaccc-- 112
Human: 117 ataggctaatcaatacccttggcactgatgggcactggaaaacatagagtagacctgag- 175
||||||||||| |||||||| ||||||| | |||| ||| ||||| ||||| ||
Mink : 113 ataggctaatc----cccttggccctgatggcc-ctgggaaatgtagagcagaccc-agg 166
Human: 176 atgctggt----caagccccctt-tgattg-agttcatcatgagccgttgctaatgccag 229
||||| | ||||||||| | || ||| ||| | ||| | ||| ||||| ||| ||
Mink : 167 atgctatttattcaagcccccatgtg-ttggagtccttcaggggccaatgctagtgc-ag 224
Human: 230 gccagtaaaagtataacagcaaataaccattggttaatct--gga--cttattt----tt 281
| | | || |||||||||||| | |||||||| | || || ||| ||| ||
Mink : 225 ggctg----agaataacagcaaatcatcattggttgacctagggctgcttttttgttgtt 280
Human: 282 g--gacttagtgcaacaggttgaggctaaaacaaatctcagaacagtctg-aaatacctt 338
| | || |||||| | | ||||||||||||| ||||| |||||| | |||||||||
Mink : 281 gttgtct-agtgcagctgaccgaggctaaaacaattctcaaaacagttttcaaatacctt 339
Human: 339 tgcctggatacctctggctccttcagcagctagagctcagtatactaatg-ccctatctt 397
|||||||| ||||||||||||| | ||||||||||||||||| | ||||| ||| |||||
Mink : 340 tgcctggaaacctctggctcctgctgcagctagagctcagtacattaatgtccc-atctt 398
Human: 398 agtagagat-ttcatagctatttagagata-tttt-----ccatttt--aagaa----a- 443
|| | | | |||||||| | || | || | |||| ||| | | ||||| |
Mink : 399 agccgtg-tcttcatagcaacttgggga-agtttttctccccactctaaaagaacgcgat 456
Human: 444 ---ac--cc--g-----acaacatttctgccaggtttgttaggaggccacatgatactt- 490
|| || | | ||||||||||||| |||| |||||||||||||||| ||
Mink : 457 tgcacttccctgtgcaaagaacatttctgccaaatttgaaaggaggccacatgata-ttc 515
Human: 491 attcaaaaa--aatcctagagattcttagctcttgggatgcaggctcagcccgctggagc 548
||||||||| || ||||| | ||| |||||| ||| ||| || | ||| ||| ||
Mink : 516 attcaaaaagcaaaactagaaaccctttgctctt-ggacgcaagcccggcctgct--ag- 571
Human: 549 atgagctctgtgtgtaccgagaactggggtgatgttttac-ttttcacagtatgggcta- 606
|||| ||| | |||||||| |||| ||| | || | | | |||||||
Mink : 572 --gagc---------acc-a-aactggggcgatggtttgcattct-gcggcgtgggctat 617
Human: 607 -c-acagc--agctgttcaacaagagtaaatattg-tcacaacact-gaacctctggcta 660
| |||| || || || | || |||||||||| | | || || | |||| ||| |
Mink : 618 gcggcagccgaggtg-tc--c-agcgtaaatattgat-gcgacgctag-acct-aggc-a 669
Human: 661 gaggacatatt--cacag----tgaacataactgtaacatatatg---aaaggcttctgg 711
|||| || || ||||| ||||||||| | |||| || ||| |||
Mink : 670 gagg--atgtttgcacagggaatgaacataa-t-caaca---gtgcgaaaatgct----- 717
Human: 712 gacttgaaat--ca-aatgtttggga--atggtgcccttggaggcaa-----cctcccat 761
|| |||| || | || |||| | ||| |||||||||||| ||| |
Mink : 718 -acaaaaaatcccacactg---gggagca--gtgtccttggaggcaagtttttttcc--t 769
Human: 762 ttt-agatgtttaaaggaccctatatgtggcattcctttctt--------taaactatag 812
||| || |||||| | |||||||||||||||||||||||| ||||||||||
Mink : 770 tttgggacatttaaa-gcccctatatgtggcattcctttctttcgtaacctaaactatag 828
Human: 813 gtAATTAAggcagctgaaaagt-aaattgccttctagacactgaag-gcaaatctccttt 870 2.1k sheep homologue
|||||||||||| | |||| | || || ||||| || | | ||| |||||| ||||
Mink : 829 atAATTAAggcagttaaaaattgaactt-ccttccaggc-cccaagagcaaat---cttt 883 2.1k sheep homologue
Human: 871 gtccatttacctggaaaccagaatgattttgacatacaggagagctgcagttg-tgaaag 929
|| || |||||||||||||||||||||||||||| | |||| || ||||| || | |||
Mink : 884 gttcacttacctggaaaccagaatgattttgacacagagga-aggtgcagctgttaaaat 942
Human: 930 caccatcatcatagaggatgatg--taat-taaaaaatggtcagtgtgcaaaga-aaaga 985
||| ||||| |||| ||| || | || | |||| | || || ||| ||| |
Mink : 943 aaccctcatcctagaagat--tgcatcatggagaaaacgatccgt------agacaaa-a 993
Human: 986 actgcttgcatttctttatttctgtctcataattgtcaaaaaccagaattaggtcaagtt 1045
| || | ||||||||| ||| ||||||| |||||| || |||||||||||| |||||| |
Mink : 994 a-tgatcgcatttcttcattgctgtctcgtaattgacagaaaccagaattatgtcaag-t 1051
Human: 1046 catagtttctgtaattggcttttgaatcaaagaatagggagacaatctaaa--------- 1096
| |||||||| |||| |||||||||||||||||| || || | ||| |||
Mink : 1052 cctagtttctataatcagcttttgaatcaaagaat-ggaagtccatccaaaaaaaaaaaa 1110
Human: 1097 -aaatatcttaggttgga--gatgacagaaata-tgatt--gatttgaagtggaaaaaga 1150
||||| ||||||| | |||||||||||| ||| | || || ||||||
Mink : 1111 gaaataccttaggt--cacccatgacagaaatacccattcaggttaga-----aaaaagg 1163
Human: 1151 aattctgttaa-tgttaattaa--a--gtaaaattat--tccctgaattgtttgatattg 1203
||||||||||| ||||| |||| | | |||||||| ||| | |||||| |||||
Mink : 1164 aattctgttaactgttatttaagtaaggcaaaattattgtcc--ggattgttcgatatca 1221
Human: 1204 tcacctagcagatatgtatta-cttttctgcaatgttattattggcttgcactttgtgag 1262
||| |||||||||| |||| | |||||||||| || |||||||||| || | |
Mink : 1222 tcagctagcagata--aattagc-attctgcaatg---ttcccggcttgcactgtgcggg 1275
Human: 1263 tattct-atg-taaaaatatatatgtatataaaatatatattgcataggacagacttagg 1320
|||| | ||| |||||| | | || ||| | |||||||||| || |||| ||||||
Mink : 1276 tatt-tgatgttaaaaa-a-a-at-tat-t---atatatattgtgtatgacaaacttaga 1326
Human: 1321 ag-ttttgtttagagcagttaacatctga-agtgtctaatgcatta--acttttgtaagg 1376
|| ||||| ||||| ||||||||||||| | | ||||||||| | | ||||| ||||
Mink : 1327 agtttttg-ctagaggagttaacatctgata-tatctaatgcaccaccagttttggaagg 1384
Human: 1377 tactgaatacttaatatgtgggaaacccttttgcgtggtccttaggcttacaatgtgcac 1436
|||| |||||||||||||| |||| |||||||||||||||| |||||||| | ||||||
Mink : 1385 tactaaatacttaatatgt-agaaatccttttgcgtggtcctcaggcttac-acgtgcac 1442
Human: 1437 tgaatcgtttcatgtaagaatccaaagtggacaccat-taacaggtctttgaaatatgca 1495
||||| |||| |||| | || | || | |||| | |||||| ||||||||||
Mink : 1443 tgaatagttt--tgtatg-at--agag----c-ccatgt----ggtcttcgaaatatgca 1488
Human: 1496 tgtactttatattttctatatttgtaactttgcatgttcttgttttgttatataaaaaaa 1555
||||||||||||||||||||||||||||| |||||| |||| ||| ||||||
Mink : 1489 tgtactttatattttctatatttgtaactgggcatgtacttg------tat---aaaaaa 1539
Human: 1556 t-tgtaaatgtt-taa-tatctgact-gaaattaaacgagcgaa 1595
| | |||| || || | | ||||| | |||||||| || |||
Mink : 1540 tgtataaacattcgaactct-tgactag-aattaaac-ag-gaa 1579
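Identity figures such as the 71% quoted for mink versus human can be recomputed from the gapped rows of a pairwise alignment. The sketch below assumes the usual BLAST-style definition, identical columns divided by total alignment columns with gap columns counted as non-identical; the function name `percent_identity` is hypothetical, and other definitions (for example, ignoring gap columns) give different numbers.

```python
def percent_identity(row_a: str, row_b: str) -> float:
    """Identical columns as a percentage of all alignment columns.

    Both rows must be the gapped strings of one pairwise alignment ('-' for gaps).
    A column counts as identical only if both rows carry the same base.
    """
    if not row_a or len(row_a) != len(row_b):
        raise ValueError("rows must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(row_a, row_b) if a == b and a != "-")
    return 100.0 * matches / len(row_a)


# First block of the human / mink alignment above (60 columns each).
human = "ggaaggtcttcctgttttcaccatc-t-ttctaatctttttccagcttgagggaggcggt"
mink = "ggatggccttcccattctctccatcgtcttc--accttttac-aggttgggggagggggt"
print(f"{percent_identity(human, mink):.1f}% identity over {len(human)} columns")
```

To get a figure for the whole alignment, concatenate the rows of every block for each species before calling the function.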
CLUSTAL W (1.74) multiple sequence alignment
human GGAAGGTCTTCCTGTTTTCACCATCTTTCTAATCTTTTTCCAGCTTGAGGGAGGCGGTAT 60
mink GGATGGCCTTCCCATTCTCTCCATCGTCTTCACCTTTTAC-AGGTTGGGGGAGGGGGTGT 59
*** ** ***** ** ** ***** * * * ***** * ** *** ****** *** *
human CCACCTGCAGCCCTTTTAGTGGTGGTGTCTCACTCTTTCTTCTCTCTTTGTCCCGGATAG 120
mink CTACCTACAGCCCTGT-AGTGGTGGTGTCTCATTCCTGCTTCTC--TTTATCACCCATAG 116
* **** ******* * *************** ** * ****** *** ** * ****
human GCTAATCAATACCCTTGGCACTGATGGGCACTGGAAAACATAGAGTAGACCTGAGATGCT 180
mink GCTAATC----CCCTTGGCCCTGATGG-CCCTGGGAAATGTAGAGCAGACCCAGGATGCT 171
******* ******** ******* * **** *** ***** ***** ******
human GGT----CAAGCCCCCTT-TGATTGAGTTCATCATGAGCCGTTGCTAATGCCAGGCCAGT 235
mink ATTTATTCAAGCCCCCATGTGTTGGAGTCCTTCAGGGGCCAATGCTAGTGC-AGGGCTG- 229
* ********* * ** * **** * *** * *** ***** *** *** * *
human AAAAGTATAACAGCAAATAACCATTGGTTAATCT-GGACTTATTTTTGGACTT------- 287
mink ---AGAATAACAGCAAATCATCATTGGTTGACCTAGGGCTGCTTTTTTGTTGTTGTTGTC 286
** ************ * ******** * ** ** ** ***** * *
human -AGTGCAACAGGTTGAGGCTAAAACAAATCTCAGAACAGTCTG-AAATACCTTTGCCTGG 345
mink TAGTGCAGCTGACCGAGGCTAAAACAATTCTCAAAACAGTTTTCAAATACCTTTGCCTGG 346
****** * * ************* ***** ****** * ****************
human ATACCTCTGGCTCCTTCAGCAGCTAGAGCTCAGTATACTAATGCCCTATCTTAGTAGAGA 405
mink AAACCTCTGGCTCCTGCTGCAGCTAGAGCTCAGTACATTAATGTCCCATCTTAGCCGTGT 406
* ************* * ***************** * ***** ** ******* * *
human TTTCATAGCTATTTAGAGATATTTT-----CCATTTTAAGAAAACCCGAC---------- 450
mink CTTCATAGCAACTTGGGGAAGTTTTTCTCCCCACTCTAAAAGAACGCGATTGCACTTCCC 466
******** * ** * ** **** *** * *** * *** ***
human ---------AACATTTCTGCCAGGTTTGTTAGGAGGCCACATGATACTTATTCAAAAA-- 499
mink TGTGCAAAGAACATTTCTGCCAAATTTGAAAGGAGGCCACATGATATTCATTCAAAAAGC 526
************* **** **************** * *********
human AATCCTAGAGATTCTTAGCTCTTGGGATGCAGGCTCAGCCCGCT-GGAGCATGAGCTCTG 558
mink AAAACTAGAAACCCTTTGCTCTTGG-ACGCAAGCCCGGCCTGCTAGGAGCACCA------ 579
** ***** * *** ******** * *** ** * *** *** ****** *
human TGTGTACCGAGAACTGGGGTGATGTTTTACTTTTCACAGTATGGGCTACACAGCAGC--- 615
mink -----------AACTGGGGCGATGGTTTGCATTCTGCGGCGTGGGCTATGCGGCAGCCGA 628
******** **** *** * ** * * ******* * *****
human --TGTTCAACAAGAGTAAATATTGTCACAACACTGAACCTCTGGCTAGAGGACATATTCA 673
mink GGTGTCCAGC----GTAAATATTGATGCGACGCTAGACCTA-GGC-AGAGGATGTTTGCA 682
*** ** * ********** * ** ** **** *** ****** * * **
human CAG----TGAACATAACTGTAACATATATGAAAGGCTTCTGGGACTTGAAATCAAATGTT 729
mink CAGGGAATGAACATAATC--AACAGTGCGAAAATGCT------ACAAAAAATCCCACACT 734
*** ********* **** *** *** ** ***** * *
human TGGGAATGGTGCCCTTGGAGGCAACCTC----CCATTTTAGATGTTTAAAGGACCCTATA 785
mink GGGGAGCAGTGTCCTTGGAGGCAAGTTTTTTTCCTTTTGGGACATTTAAAGC-CCCTATA 793
**** *** ************ * ** *** ** ******* *******
human TGTGGCATTCCTTTCTTT--------AAACTATAGGTAATTAAGGCAGCTGAAAAGTAAA 837
mink TGTGGCATTCCTTTCTTTCGTAACCTAAACTATAGATAATTAAGGCAGTTAAAAATTGAA 853
****************** ********* ************ * **** * **
human TTGCCTTCTAGACACTGAAGGCAAATCTCCTTTGTCCATTTACCTGGAAACCAGAATGAT 897
mink CTTCCTTCCAGGCCCCAAGAGCAAATCT---TTGTTCACTTACCTGGAAACCAGAATGAT 910
* ***** ** * * * ******** **** ** *********************
human TTTGACATACAGGAGAGCTGCAGTTGTGAAAGCA-CCATCATCATAGAGGATGATGTAAT 956
mink TTTGACACAGAGGA-AGGTGCAGCTGTTAAAATAACCCTCATCCTAGAAGATTGCATCAT 969
******* * **** ** ***** *** *** * ** ***** **** *** * **
human TAA-AAAATGGTCAGTGTGCAAAGAAAAGAACTGCTTGCATTTCTTTATTTCTGTCTCAT 1015
mink GGAGAAAACGATCCGT------AGACAA-AAATGATCGCATTTCTTCATTGCTGTCTCGT 1022
* **** * ** ** *** ** ** ** * ********* *** ******* *
human AATTGTCAAAAACCAGAATTAGGTCAAGTTCATAGTTTCTGTAATTGGCTTTTGAATCAA 1075
mink AATTGACAGAAACCAGAATTATGTCAAGTCC-TAGTTTCTATAATCAGCTTTTGAATCAA 1081
***** ** ************ ******* * ******** **** *************
human AGAATAGGGAGACAATCTAAAAAA----------TATCTTAGGTTGGAGATGACAGAAAT 1125
mink AGAAT-GGAAGTCCATCCAAAAAAAAAAAAGAAATACCTTAGGTCACCCATGACAGAAAT 1140
***** ** ** * *** ****** ** ******* ***********
human ATG-ATTGATTTGAAGTGGAAAAAGAAATTCTGTTAA-TGTTAATTAAA----GTAAAAT 1179
mink ACCCATTCAGGTTA---GAAAAAAGGAATTCTGTTAACTGTTATTTAAGTAAGGCAAAAT 1197
* *** * * * * ****** *********** ***** **** * *****
human TATTCCCTGAATTGTTTGATATTGTCACCTAGCAGATATGTATTACTTTTCTGCAATGTT 1239
mink TATTGTCCGGATTGTTCGATATCATCAGCTAGCAGATA--AATTAGCATTCTGCAATGTT 1255
**** * * ****** ***** *** ********** **** ************
human ATTATTGGCTTGCACTTTGTGAGTATTCTATGTAAAAATATATATGTATATAAAATATAT 1299
mink CC---CGGCTTGCACTGTGCGGGTATTTGATGTTAAAAAAAAT-------TATTATATAT 1305
********** ** * ***** **** **** * ** ** ******
human ATTGCATAGGACAGACTTAGGAGTTTTGTTTAGAGCAGTTAACATCTGAAGTGTCTAATG 1359
mink ATTGTGTATGACAAACTTAGAAGTTTTTGCTAGAGGAGTTAACATCTGATATATCTAATG 1365
**** ** **** ****** ****** ***** ************* * *******
human CATTAAC--TTTTGTAAGGTACTGAATACTTAATATGTGGGAAACCCTTTTGCGTGGTCC 1417
mink CACCACCAGTTTTGGAAGGTACTAAATACTTAATATGTAG-AAATCCTTTTGCGTGGTCC 1424
** * * ***** ******** ************** * *** ***************
human TTAGGCTTACAATGTGCACTGAATCGTTTCATGTAAGAATCCAAAGTGGACACCATTAAC 1477
mink TCAGGCTTACAC-GTGCACTGAATAGTTT--TGTATGA--------TAGAGCCCAT---G 1470
* ********* *********** **** **** ** * ** ****
human AGGTCTTTGAAATATGCATGTACTTTATATTTTCTATATTTGTAACTTTGCATGTTCTTG 1537
mink TGGTCTTCGAAATATGCATGTACTTTATATTTTCTATATTTGTAACTGGGCATGTACTTG 1530
****** *************************************** ****** ****
human TTTTGTTATATAAAAAAATTGTAAATGTTTAATATCT-GACTGAAATTAAAC--GAGCGA 1594
mink T--------ATAAAAAATGTATAAACATTCGAACTCTTGACTAGAATTAAACAGGAACTG 1582
* ******** * **** ** * *** **** ******** ** *
human AGATGAGCACCA---------------- 1606
mink AG-TGTGTCCCAtgtgtttgcagtgacattcaccaccgcaccctgtgttgg 1632
** ** * ***
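The ClustalW output above is interleaved: each species appears once per 60-column block, followed by a running residue count. A small sketch for stitching the blocks back into one gapped row per species is given below; the function name `read_clustal` is hypothetical, and it assumes every data line starts with the sequence name followed by whitespace and the aligned residues, as in the listing above.

```python
def read_clustal(lines, names):
    """Concatenate interleaved ClustalW rows into one gapped string per sequence name.

    Lines whose first field is not in `names` (the CLUSTAL header, the '*'
    conservation rows, blank lines) are ignored, as is the trailing residue count.
    """
    rows = {name: [] for name in names}
    for line in lines:
        fields = line.split()
        if len(fields) >= 2 and fields[0] in rows:
            rows[fields[0]].append(fields[1])
    return {name: "".join(parts) for name, parts in rows.items()}


# Shortened toy input in the same layout as the alignment above.
example = [
    "CLUSTAL W (1.74) multiple sequence alignment",
    "human GGAAGG 6",
    "mink GGATGG 6",
    "*** **",
    "human TCTT-C 11",
    "mink CCTTCC 12",
]
aligned = read_clustal(example, ["human", "mink"])
print(aligned["human"])
print(aligned["mink"])
```

Because the conservation rows have lost their column positions in this plain-text copy, it is safer to recompute matches from the stitched rows than to rely on the `*` lines.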
Gallus gallus (chicken) M61145
3' UTR [1210 bp has no homology to any known sequence]
ga tgccgtgccc cggccctgtg gcagtgagat gacatcgtgt
1021 ccccgtgccc acccatgggg tgttccttgt cctcgctttt gtccatcttt ggtgaagatg
1081 tccccccgct gcctccccgc aggctctgat ttgggcaaat gggaggggat tttgtcctgt
1141 cctggtcgtg gcaggacggc tgctggtggt ggagtgggat gcccaaaaaa tggccttcac
1201 cacttcctcc tcctcttcct ttctggggcg gagatatggg ctcgtccagc ccttattgtc
1261 cctgcaagag cgtatctgaa aatcctcttt gctaacaagc agggttttac ctaatctgct
1321 tagccccagt gacagcagag cgcctttccc cagggcacac caaccccaag ctgaggtgct
1381 tggcagccac acgtcccatg gaggctgatg ggttttgggg cgtcccaagc aacaccctgg
1441 gctactgagg tgcaattgta gctctttaat ctgccaatcc caaccctacc gtgtagatag
1501 gaactgcctg ctctgcattt tgcatgctgc aaacacctcc tgccgcagcg cccccaaaat
1561 agagtgattt gggaatagtg aggctgaagc cacagcagct tgggattggg ctcatcatat
1621 caatccatga tgctttgctt ccagctgagc ctcactgccc ttttatagcc tgcccagagg
1681 aagggagcgc tgctaaatgc ccaaaaaggt aacactgagc aaaagcttat ttcaatgtat
1741 gatagagaac gagtgcatct cgcacagatc agccatggga gcatcgtttg ccatcagccc
1801 caaaacccaa aggatgctaa aatgcagcca aaggggaatc aagcacgcag ggaaggactt
1861 gaatcagctc aactggattg aaatggcaaa aggcatgagt agaacgaacg gcaaggggat
1921 gctggagatc cacctcctgt gagcaaattg ttcgatgcag ccaatggaac tattgcttct
1981 tgtgcttcag ttgctgctga tgtgtacata ggctgtagca tatgtaaagt tacacgtgtc
2041 aagctgctcg caccgcgtag agctaatatg tatcatgtat gtgggcactg aatgccaccg
2101 ttggccatac ccaaccgtcc taaacgattt tcacgtcgct gtaacttaag tggagataca
2161 ctttcagtat attcagcaaa aggaattc
set of fasta sequences
Sequence 1: sheep_reduced 1481 bp
Sequence 2: bovine_reduced 1526 bp
Sequence 3: human 1606 bp
Sequence 4: mink 1632 bp
Start of Pairwise alignments
Aligning...
Sequences (2:3) Aligned. Score: 58
Sequences (1:2) Aligned. Score: 94
Sequences (3:4) Aligned. Score: 70
Sequences (2:4) Aligned. Score: 54
Sequences (1:3) Aligned. Score: 57
Sequences (1:4) Aligned. Score: 54
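The pairwise scores in the ClustalW log above are easier to compare once pulled into a table. The following is a minimal sketch that assumes log lines of exactly the form "Sequences (i:j) Aligned. Score: n"; the names `PAIR`, `parse_scores` and `NAMES` are hypothetical.

```python
import re

# Matches e.g. "Sequences (1:2) Aligned. Score: 94"
PAIR = re.compile(r"Sequences \((\d+):(\d+)\) Aligned\. Score:\s*(\d+)")

NAMES = {1: "sheep_reduced", 2: "bovine_reduced", 3: "human", 4: "mink"}


def parse_scores(log: str):
    """Return {(i, j): score} with i < j for every pair reported in the log text."""
    return {
        tuple(sorted((int(i), int(j)))): int(score)
        for i, j, score in PAIR.findall(log)
    }


log = (
    "Sequences (2:3) Aligned. Score: 58 Sequences (1:2) Aligned. Score: 94 "
    "Sequences (3:4) Aligned. Score: 70 Sequences (2:4) Aligned. Score: 54 "
    "Sequences (1:3) Aligned. Score: 57 Sequences (1:4) Aligned. Score: 54"
)
scores = parse_scores(log)
for (i, j), score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{NAMES[i]:>15} vs {NAMES[j]:<15} {score}")
```

Sorted this way, sheep versus bovine (94) and human versus mink (70) stand out as the two closest pairs.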
>sheep_reduced 1481 bp gggcaaccttcctgttttcattatcttcttaatctttgccaggttgggggagggagtgtctacctgcagccctgtagtggtggtgtctcatttcttgcttctctcttgttacctgtataataatacccttggcgcttacagcactgggaaatgacaagcagacatgagatgctgtttattcaagtcccattagctcagtattctaatgtcccatcttagcagtgattttgtagcaattttctcatttgtttcaagaacacctgactacatttccctttgggaatagcatttctgccaagtctggaaggaggccacataatattcattcaaaaaaacaaaactggaaatccttagttcatagacccagggtccaccctgttgagagcatgtgtcctgtgtctgcagagaactataaaggatattctgcattttgcaggttacatttgcaggtaacacagccatctattgcatcaagaatggatattcatgcaacctttgacttatgggcagaggacattttcacaaggaatgaacataatacgaaaggcttctgagactaaaaaattccaacatatggaagaggtgcccttggtggcagccttccattttgtatgtttaagcaccttcaagtgatattcctttctttagtaacataaagtatagataattaaggtaccttaattaaactaccttctagacactgagagcaaatctgttgtttatctggaacccaggatgattttgacattgctcccagattttaacatagagaatgcagatacaaaaactccatattcatttgattgaatcttttcctgaaccagtgctagtgttggactggtaagagtataacagcatatataggttatgtgatgaagagaatagtgtacatgaaatatgtgcatttctttattgctgtcttataattgtcaaaaaagaaagttaggtccttggtttctgtaaaattgacttgaatcaaaagggaggcatttaaagaAATAAAttagagatgatagaaatctgatccattcagagtagaaaaagaaattccattactgttatttaagaaggtaaaattatttcctgaattgttcaatattgtcacctagcagatagacactattattctgtactgtttttactagcttgcaccttgtggtatcctatgtaaaaacgtatttgcatatgacaaactttttctgttagagcaattaacatctgaaccacctaatgcattacctgtttttgtaaggtactttttgtaaggtactaagaacaataaggacAATAAAtgtactgaatacttaaaggaaactcttctgtgttgtccttagccttacagtgtgcactgaatagttttgtataagaatccagagtgatatttgaaatacgcatgtgcttatattttttatatttgtaactttgcatgtacttgttttgtgttaaaagtttataaatatttaatatctgactaaaattaa >bovine_reduced 1526 bp 
gggcaaccttcctgttttcattatcttcttaatctttaccaggttgggggagggagtatctacctgcagccccgtagtggtggtgtctcatttcgtgcttctctctttgttacctgtatgctaatacccttggcgcttatagcactgggaaatgaagagcagacatgagatgctgtttattcaagtcccgttagctcagtatgctaatgccccatcttagcagtgattttgtagcaattttctcatttgtttcaagaacacgtgactacatttcccttttggaatagcatttctgccaagtctggaaggaggccacataatattcattcaaaaaaacaaaccggaaatccttagttcatagacccagggtccacctggttgagagcttgtgtcctgtgtctgcagagaactataaaggatattctgcattttgcaggttacatttgcaggtaacacagccagctattgcatcaagaatggatattcatgcaacctttgacttatgggtagaggacattttcacaaggaatgaacataatacgaaaggcttctgagactaaaaaattccaacatatgggagaggtgcccttggtggcagccttccattttgtatgtttaaagcaccttcaagtggtattcctttctttagtaacaaagtatagataattaagttaccttaatttaattaaactaccttctagacactgagagcaaatctgttgtttatctggaacccaggatgattttgacattgtttagacccagattttaacatagagaatgcagatataaaaactccatattcatttgattgaatcttttccttaaccagtgctagtgttggactggtaagattataacaacaaatataggttatgtgatgaagagaatagtgtacaaagaaaagaaatatgtgcatttctttattgctatcataattgtcaaaaaacaaaattaggtccttggtttctgtaaaattaacttttgaatcaacagggaggcatttaaagaaatatcttaaattagagacagtagaaatctgatacattcagagtggaaaaagaaattctattacgattatttaagaaggtaaaattatttcctgggttgttcaatattgtcacctagcagatagacactattgttctgcactgttattactggcttgcactttgtggtatcctatgtaaaaatacatatattgcatatgacagacttaagaatttctgttagagcaattaacatctgaactatctaatgcattacctgtttttgtaaggtactttttgtaaggtactaaaaacaataacaacAATAAAtgtactgaatacttaaaggaaactcttccgtgttgtccttagccttacagcgtgcactgaatagttttgtataagaatccagagtgatatttgaaatacgcatgtgcttatattttctatatttgtaactttgcatgtacttgttttgtgttaaaagtttataaatatttaatatctgactaaaattaaacaggagctaaaaggagg >human 1606 bp 
ggaaggtcttcctgttttcaccatctttctaatctttttccagcttgagggaggcggtatccacctgcagcccttttagtggtggtgtctcactctttcttctctctttgtcccggataggctaatcaatacccttggcactgatgggcactggaaaacatagagtagacctgagatgctggtcaagccccctttgattgagttcatcatgagccgttgctaatgccaggccagtaaaagtataacagcaaataaccattggttaatctggacttatttttggacttagtgcaacaggttgaggctaaaacaaatctcagaacagtctgaaatacctttgcctggatacctctggctccttcagcagctagagctcagtatactaatgccctatcttagtagagatttcatagctatttagagatattttccattttaagaaaacccgacaacatttctgccaggtttgttaggaggccacatgatacttattcaaaaaaatcctagagattcttagctcttgggatgcaggctcagcccgctggagcatgagctctgtgtgtaccgagaactggggtgatgttttacttttcacagtatgggctacacagcagctgttcaacaagagtaaatattgtcacaacactgaacctctggctagaggacatattcacagtgaacataactgtaacatatatgaaaggcttctgggacttgaaatcaaatgtttgggaatggtgcccttggaggcaacctcccattttagatgtttaaaggaccctatatgtggcattcctttctttaaactataggtaattaaggcagctgaaaagtaaattgccttctagacactgaaggcaaatctcctttgtccatttacctggaaaccagaatgattttgacatacaggagagctgcagttgtgaaagcaccatcatcatagaggatgatgtaattaaaaaatggtcagtgtgcaaagaaaagaactgcttgcatttctttatttctgtctcataattgtcaaaaaccagaattaggtcaagttcatagtttctgtaattggcttttgaatcaaagaatagggagacaatctaaaaaatatcttaggttggagatgacagaaatatgattgatttgaagtggaaaaagaaattctgttaatgttaattaaagtaaaattattccctgaattgtttgatattgtcacctagcagatatgtattacttttctgcaatgttattattggcttgcactttgtgagtattctatgtaaaaatatatatgtatataaaatatatattgcataggacagacttaggagttttgtttagagcagttaacatctgaagtgtctaatgcattaacttttgtaaggtactgaatacttaatatgtgggaaacccttttgcgtggtccttaggcttacaatgtgcactgaatcgtttcatgtaagaatccaaagtggacaccattaacaggtctttgaaatatgcatgtactttatattttctatatttgtaactttgcatgttcttgttttgttatataaaaaaattgtaaatgtttaatatctgactgaaattaaacgagcgaagatgagcacca >mink 1632 bp 
ggatggccttcccattctctccatcgtcttcaccttttacaggttgggggagggggtgtctacctacagccctgtagtggtggtgtctcattcctgcttctctttatcacccataggctaatccccttggccctgatggccctgggaaatgtagagcagacccaggatgctatttattcaagcccccatgtgttggagtccttcaggggccaatgctagtgcagggctgagaataacagcaaatcatcattggttgacctagggctgcttttttgttgttgttgtctagtgcagctgaccgaggctaaaacaattctcaaaacagttttcaaatacctttgcctggaaacctctggctcctgctgcagctagagctcagtacattaatgtcccatcttagccgtgtcttcatagcaacttggggaagtttttctccccactctaaaagaacgcgattgcacttccctgtgcaaagaacatttctgccaaatttgaaaggaggccacatgatattcattcaaaaagcaaaactagaaaccctttgctcttggacgcaagcccggcctgctaggagcaccaaactggggcgatggtttgcattctgcggcgtgggctatgcggcagccgaggtgtccagcgtaaatattgatgcgacgctagacctaggcagaggatgtttgcacagggaatgaacataatcaacagtgcgaaaatgctacaaaaaatcccacactggggagcagtgtccttggaggcaagtttttttccttttgggacatttaaagcccctatatgtggcattcctttctttcgtaacctaaactatagataattaaggcagttaaaaattgaacttccttccaggccccaagagcaaatctttgttcacttacctggaaaccagaatgattttgacacagaggaaggtgcagctgttaaaataaccctcatcctagaagattgcatcatggagaaaacgatccgtagacaaaaatgatcgcatttcttcattgctgtctcgtaattgacagaaaccagaattatgtcaagtcctagtttctataatcagcttttgaatcaaagaatggaagtccatccaaaaaaaaaaaagaaataccttaggtcacccatgacagaaatacccattcaggttagaaaaaaggaattctgttaactgttatttaagtaaggcaaaattattgtccggattgttcgatatcatcagctagcagataaattagcattctgcaatgttcccggcttgcactgtgcgggtatttgatgttaaaaaaaattattatatatattgtgtatgacaaacttagaagtttttgctagaggagttaacatctgatatatctaatgcaccaccagttttggaaggtactaaatacttaatatgtagaaatccttttgcgtggtcctcaggcttacacgtgcactgaatagttttgtatgatagagcccatgtggtcttcgaaatatgcatgtactttatattttctatatttgtaactgggcatgtacttgtataaaaaatgtataaacattcgaactcttgactagaattaaacaggaactgagtgtgtcccatgtgtttgcagtgacattcaccaccgcaccctgtgttgg >ancest_rod 1242 bp 
gggaggccttcctgcttgttccttctcattctcgtggtctaggctgggggaggggttatccacctgtagctctttcaattgaggtggtgtctcattcttgcttctctttgtcccccataggctaatacccttggcactgatgggcccggggaaatgtacagtagaccagatgctattcgcttcagcgtcctttgattgagtccatcatgggccagggctaacaccaggccaataagaatataacaccaaataactgctggctagtcagggctttgttttggtctagtgagtaaatactggtgtaacccctgacttgtacccagagtacatggtgacagagacacacataacttagtataggcaaagggttctacagccaaagaagccactgtttgggcatggcaccctggataacagcctcccacctgggatatctagagcatccacatgtggaattctttctttactaacaaaccatagctgattgaaggcaacaggaaaaaaaaatcaaattatcctactgacattgaaagcaaacctgtgttcattccctaggcactagaatgatttttagccttgcttggattgaaccaggagattttgactctgaggagagccagcactgtacaaaagcatggtcctcctgtgatgggagagatggttaagggacaaaggcaagacccttgcgtttcttcatttctgtctcataattatcaaaagctagaattaggtcgtgccctaagtttctgtacttgtatttgaactggacaacaaagagacaatctaaaaattctcttaggctgcagatgagagaaataggctccattccaaagtggaaagagaaattctgctagcattgtttaagtaaggcaaactttgttccttaagtcgctttgtatttcccccagcagacataacaaccctgcgatcggttcagcttgcactgcgggtgttctgtgtagaatatataaatataacttcaagcttaggccttctattttaaaacatctgaagtgtggaacgcactggccgttccatccagtactaaatgcttaccgtgtgacccttgggctttcaacgtgcactcggttccgtatgattccaaagtagacccctagctggtcttcgaatctgcatgtacttcacgttttctatatttgtaactttgcatgtatttgttttgtcatataaaaagtttataaatgtttgctatctgactgacattaaatagaagctatgatg >hamster 1214 bp 
aggaagcctccctgcttgtacttcctcgttcttgtggtctaggctgggggaggggttatccaccgtagctcttttaattgaggtggtgtctcattcctgcttctctttgtcccccataggctaatgcccttggcactagtgggccctgggaatgtacagtagaccagatgctattcgatccagagcctttgaattgagtccatcacgggccagcactaacaccaggcctatctgaatataacagcaagtaatggctggctagtcagggctttgttttggtctagtgagtaaatactgatgtgaccctctgacttccacacagagtacgcagtgacagacacacctaactgttaaaataggcgaagggttctacagccaaagaagtcactgtttggcatggtccctaagaaacagcctcccatttgggatatttaaagcatccatatgaggcattcctccttcactaacaaactctagctgagtaaggcaacgggaaaaaaacaaaattaccctactaacatggaaagcaaacctgtgttcatttcctaggaactagaatgatgttttagccttgcttggattgaaccaggagattttggctctgaagagccaacactgtaaaaatgtggtcctcctgcaaagggagagatggttaggacacaaagtcacggcgcttggcgtttcttcatttctgtctcataattgtcaaaagtcacaattaggtcatgcccttagttaatatacttgtatttgaatcggacgacaagagacaatctaaaaattctcctaggttgtagatgaaataggctccattcaaggtgaaaagacagtttgttagcgttgcttatgtaaggcaaactttgttccttaagttgctccgtgtttccctgagcagacataaccactctgcaacagcattgccctgctgtagaatatataaagtgtaactacaagcttagaccttctgttctgatgcatccgaagtacgtaatgcactgaccatttcacccggtatcagatgttttctgtgtggcccctagctttccttcaacatgcattcggttccatatatgaatccaaagtggaccccctaactggtctctgaaatctgcatgtacttcacattttctatatttgtaactttgcatgtccttgttttgtcatataaaaagtttataaatgtttgctatctgactgacattaaataggagcta >rat 1222 bp 
ggaggccttcctgcttgttccttctcattctcgtggtctaggctgggggaggggttacccacctgtagctctttcaattgaggtggtgtctcattcttgcttctctttgtcccccataggctaatacccttggcagtgatgggtctggggaaatgtacagtagaccagatgctattcgcttcagcgtcctttgattgagtccatcatgggccagggttaacaccaggccagtaagaatataacaccaaataactgctggctagtcagggctttgttttggtctactgagtaaatactgtgtaacccctgaattgtacccagaggacatggtgacagagacacacataacttagtataggcaaagggttctatagccaaagaagccactgtgtgggcatggcaccctggataacagcctcccgcctgggatatctagagcatccacatgtggaattctttcttttctaacataaaccatagctgattgaaggcaacaagaaaaagaatcaaattatcctactgacattgaaagcaaactgtgttcattccctaggcgctggaatgatttttagccttggattaaaccaggagattttgactctgaggagaaccagcagtacaaaagcatggtctcctgtgatgggagagatggtgaagggacaaaggcaagacccctgcgtttcttcatttctgtctcataattatcaagagctagaattaggtcgtgccctaagtttctgtactcgtatttgaactggacaacaaagagacaatctacaaattctcttgggctgcagaggagagaaataggctccattccaaagtggaaagagaaattctgctagcattgtctaagtaaggctaacttttccttaaatcgctttgtatttcccccagcagacatcacaaccctgtgatcggttcagcctgcaccgcgggtgttctgtgtagaatatataaatataacttcaagcttaggccttctattttaaaacatctgaagtgtggaacgcactggccgttctgtgcagtactaagtgtgacccttgggctttcaatgtgcactcggttccgtatgattccaaagtagagccctagctggtcttcgaatctgcatgtacttcacgttttctatatttgtaacttcgcatgtatttgttttgtcatataaaaagtttataaatgtttgctatctgactgacattaaatagaagctatgatg >mouse 1234 bp 
gggaggccttcctgcttgttccttcgcattctcgtggtctaggctgggggaggggttatccacctgtagctctttcaattgaggtggttctcattcttgcttctctgtgtcccccataggctaatacccctggcactgatgggccctgggaaatgtacagtagaccagttgctctttgcttcaggtccctttgatggagtctgtcatcagccagtgctaacaccgggccaataagaatataacaccaaataactgctggctagttggggctttgttttggtctagtgAATAAAtactggtgtatcccctgacttgtacccagagtacaaggtgacagtgacacatgtaacttagcataggcaaagggttctacaaccaaagaagccactgtttggggatggcgccctggaaaacagcctcccacctgggatagctagagcgtccacacgtggaattctttctttactaacaaacgatagctgattgaaggcaacaggaaaaaaaaaaatcaaattgtcctactgacgttgaaagcaaacctttgttcattcccagggcactagaatgatctttagccttgcttggattgaactaggagatcttgactctgaggagagccagccctgtaaaaagcttggtcctcctgtgacgggagggatggttaaggtacaaaggctagaaacttgagtttcttcatttctgtctcacaattatcaaaagctagaattagcttctgccctatgtttctgtacttctatttgaactggataacagagagacaatctaaacattctcttaggctgcagataagagaagtaggctccattccaaagtgggaaagaaattctgctagcattgtttaaatcaggcaaaatttgttcctgaagttgctttttaccccagcagacataaactgcgatagcttcagcttgcactgtggattttctgtatagaatatataaaacataacttcaagcttatgtcttctttttaaaacatctgaagtgtgggacgccctggccgttccatccagtactaaatgcttaccgtgtgacccttgggctttcagcgtgcactcagttccgtaggattccaaagcagacccctagctggtctttgaatctgcatgtacttcacgttttctatatttgtaactttgcatgtattttgttttgtcatataaaaagtttataaatgtttgctatcagactgacattaaatagaagctatgatg
>SheepBovB
1 tagggatgtg agagttggac tgtaaagaaa gctgagtgct gaagagttga tgcttttgaa
61 ctatagtgtt ggagaaaact cttgagagtc ccttggactg aaaggagatc agtcctgaat
121 attcattgga aggactgatg ctgaagctga aactccaata ctttggtcac ctgatgggaa
181 gaactgaagg caggagggat gctaggaaag actgaaggca ggaggagaag gggacgacag
241 aggatgagat ggctagatgg catcatggac tcaatggaca tgagcttaag taaactccag
301 gagttggcga tggacaggga gacctggcgt cctgcagtcc atggtgtcgc agagtcggac
361 acgattgagt gactaaattg aggtgaa
>CowBovB
tagagatg tgagagttga actgtaaaga
1741 aagctgagtg ctgaagaatt gatgcttttg aactctagtg ttggagaaaa cttgagagtc
1801 ccttggactg caaggagatc aaattagtcc atcctaaagg agatcagtcc tgaatattca
1861 ttggaaggac tgatgctgaa cgtgaaactc caatactttg gccacctgat gggaagaact
1921 gaaggcagga ggagaagggg atgacagagg atgaagatgg ctggatggca tcatggattc
1981 aatggacatg agcttgagta aactccagga gttggcaatc gacggagtcc tggcatcctg
2041 cagtccatgg tgtcgcagag ttggacacga ctgagtgact gaactgaggt gaa
>SheepBovtA3
1 ggagatgtgg gtttaatccc taggtcaggt aaatccccta gaggaagaaa tggcaaccca
61 ctccagtatt cttgccagga aaatccagtg ggcagaggag cctggcaggg tacagtctaa
121 gcatggggtt gcaaagagtg agacaagact tgagctact
>cowBovtA3
ggagacg
2641 tgggtttaat ccctaggtca tgtaaatccc ctggaggagg aaatagcaac ccactccagt
2701 attcttgcca ggagaatccc atgggcagag gagcctggca gggtgcagtc catgcatagg
2761 gttgcaaaga gtcagacaag acttgagcta ct
>SheepOamar1
1 ctgggtcggc taaaaggttc attaggtttt ttttctgtaa gatggctcta gtagtacttg
61 tctttatctt cattcgaaac aattttgtta gattgtatgt gacagctctt gtatcagcat
121 gcatttgaaa aaaacatcaa aattggtaaa tttttgtata gccatcttac tattgaagat
181 ggaagaaaag aagcaaaatt ttcagcatat catgctgtat tatttcaaga aagataacca
241 aaatgcaaaa atgtatttgt gaagtgtatg gagaaggggc tgcaactgat caagcttgtc
301 aaagtagttt gtgaagtttc gtgctggaga tttcttattg gacgatgctc cacagttgga
361 tataccagtt gaagttgata gtgatcaaat tgagatattg agaataatcg atgttatacc
421 acgcgggaga tagctgacat actcaaaata tccaaataga accttgaaaa ccatttgcac
481 catctcagtt atgttaatca ctttgatgtt tgagttccac ataagcaaaa aaacaacaac
541 aacaaaaaaa aacacaacct tgaccatatt tgcgcatgca gttctctact gaaatgattg
601 aaaacacttt gtttttaaaa acagattttg attaacagtg ggtacgatac aataacgtag
661 aatggaagaa attgtagggt gagcaaaatg aaccaccacc accaaaggcc agtcttcctc
721 taaagaagat gtgtgtatgg tgggattgga aagtaatcct ctattatgaa ttcttctgga
781 aaacactgct cctaattaga ccaactgaaa gcagcactca acgaaaagca tccagaatta
841 gtcaatagaa aacataatct tccatcagga taacgcaaga ctacatattt ctttgatgac
901 ccagcatggc tggagtttct gattcatctg ttgtattcag acgttgcatc tttggatttt
961 ttccatttat ttcagtctac aaaattatca taatggaaaa aatttccatt ccctggaaga
1021 ttgtaaagtg catctggaaa atttctttgc tcaaaaagat aaaaagtttt gtgaacacag
1081 aattatgacg ttgcctgaaa aatggcagaa ggtagtggaa caaaagagtg actatgttgt
1141 ttggtaaagt tcttagtgaa aatgaaaaat gtgtctttta tttttattta aacaccaaag
1201 gcacattttg gccaacccaa
>cowOamar1
ctgggtt
2821 ggctaaaagg ttcattaggt tttttttctg taagatggct gtctttaact tcattcgaaa
2881 caattttgtt agattgtatg tgacagctct tgtatcagca tgcatttgaa aaagaaaaca
2941 acttaccaaa attggtgaat ttttgtatag ccattttact attgaagatg gaagaaaaga
3001 agcaaaattt tcagcatatc atgctgtatt atttcaagaa agataacaca accaaaatgc
3061 gaaaatgtat ttgtgcagtg tatggagaag gtgctgcaac tgatcaagct tgtcaaagta
3121 gtttgtgaag tattgtgctg gagatttctt actggacaat gctccacagt cgggtatacc
3181 agttgaagtt gatagtgatc aaattgagat attgagaaca atcaatgtta taccacgtgg
3241 gagatagctg acatactcaa aatatccaaa tagaaccttg aaaaccattt gcaccatctc
3301 agttatgtta ataactttga tgtttgagtt ccacataaat taagcaaaaa aaaaacaaaa
3361 acaaaaacac acaaccttga ccatatttgc atatgcagtt ctctactgaa atgaatgaaa
3421 acacttttgt ttttaaaaac agattttgat gaacagtgga tactatacaa taacgtagaa
3481 tggaaaagac tgtggggtga gcaaaatgaa ccagcaccac caaaggccag gcttcatcca
3541 aagaagatgt gtgtatggtg ggattggaaa gtaatcctct attatgggat tcttctggaa
3601 aaccaaaaaa tcaattccaa caagtactgc tcctaattag accaactgaa agcagcattc
3661 aatgaaaagc atccagaatt agtcaataga aagcatataa tcttccatca ggataacaca
3721 agactacatt tctttgatga cccagcatgg ctgagaggtt ctgattcacc tgctgtattc
3781 agacattgca tctttggatt tccatttatt tcagtctaca gaattatcat catgaaaaaa
3841 atttccattc cctggaagat tgtaaagtgc atctggaaaa cttctttgct caaaaagata
3901 aaaagttttg tgaacacaga attatgaagt tgcctgaaaa acagcagaag atagtgacta
3961 tgttgttcag taaagttctt ggtgcaaatg tgtcttttat ttttatttaa acactaaagg
4021 cacgttttgg ccaacccaa
exon 1 is 134 bp
exon 2 is 97bp
exon 3 leader is 11 bp
coding + stop is 738
mRNA join(12634..12767,15390..15488,25464..27817) = 2587 bp: 3' UTR starts at 26213
3' UTR is mRNA positions 982-2587 = 1606 bp
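The segment arithmetic above can be rechecked with a few lines of Python. This is only a sketch: it assumes the join() coordinates are 1-based and inclusive, as in GenBank feature tables, and the names `segment_lengths` and `human_mrna_join` are made up for illustration. Note that the second segment as quoted spans 99 bp, two more than the 97 bp listed for exon 2; which figure is right is not settled here.

```python
# Lengths of the exon segments in a GenBank-style join(), using 1-based
# inclusive coordinates as quoted in the text.
def segment_lengths(segments):
    """Return the length in bp of each (start, end) segment."""
    return [end - start + 1 for start, end in segments]

# Coordinates copied from the join() statement above.
human_mrna_join = [(12634, 12767), (15390, 15488), (25464, 27817)]
lengths = segment_lengths(human_mrna_join)
mrna_length = sum(lengths)

# The text places the first 3' UTR base at mRNA position 982.
utr_start_in_mrna = 982
utr_length = mrna_length - utr_start_in_mrna + 1

print("segment lengths:", lengths)
print("mRNA length:", mrna_length)
print("3' UTR length:", utr_length)
```

The total and the UTR length agree with the 2587 bp and 1606 bp figures used throughout these notes.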
So human genomic position 27026, which corresponds by homology to the sheep 2.1 signal, is at position 813 of the 3' UTR (27026 - 26213), or 793 bp from its end. This signal would correspond to a 1796 bp mRNA, approximately the 1765 consensus site mentioned by Goldmann et al., who used some unknown human sequence.
QUERY 2059 aatagggag-ac-aatctaaaaaata-t-cttaggttggagatga-c-agaaat-at-ga 2110
QUERY 2111 ttgatttgaagtggaaaaag-aaattctgtt-aatg-ttaattaaagtaaaattattccc 2167
AA258260 378 .........g.....
So 2587 is the longest and most common mRNA, using all 1606 bp of the 3' UTR.
2098 has 489 fewer bp, so it stops at 1117 of the 3' UTR.
1978 has 609 fewer bp, so it stops at 997.
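Converting an mRNA end position into a 3' UTR coordinate is plain subtraction, sketched below. It assumes only the 2587 bp mRNA and 1606 bp UTR lengths stated in these notes; the function name `utr_end_position` is hypothetical. Direct subtraction gives shortfalls of 489 bp for the 2098 site and 609 bp for the 1978 site.

```python
MRNA_LENGTH = 2587   # full-length human mRNA, from the text
UTR_LENGTH = 1606    # human 3' UTR, from the text

def utr_end_position(mrna_end):
    """Map an mRNA end position to (bp missing, last 3' UTR position used)."""
    missing = MRNA_LENGTH - mrna_end
    return missing, UTR_LENGTH - missing

# The full-length message and the two shorter candidate sites.
for site in (2587, 2098, 1978):
    missing, utr_pos = utr_end_position(site)
    print(site, missing, utr_pos)
```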
human mRNA
1 ccgcccgcga gcgccgccgc ttcccttccc cgccccgcgt ccctccccct cggccccgcg
61 cgtcgcctgt cctccgagcc agtcgctgac agccgcggcg ccgcgagctt ctcctctcct
121 cacgaccgag gcaggactcc tgaatatttt tcaaaactga acaatttcag ccatgtctga
181 gctttccgtc ttcctggagg cacaaatcta gtttagctga accacaacag attagcagtc
241 attatggcga accttggctg ctggatgctg gttctctttg tggccacatg gagtgacctg
301 ggcctctgca agaagcgccc gaagcctgga ggatggaaca ctgggggcag ccgatacccg
361 gggcagggca gccctggagg caaccgctac ccacctcagg gcggtggtgg ctgggggcag
421 cctcatggtg gtggctgggg gcagcctcat ggtggtggct gggggcagcc ccatggtggt
481 ggctggggtc aaggaggtgg cacccacagt cagtggaaca agccgagtaa gccaaaaacc
541 aacatgaagc acatggctgg tgctgcagca gctggggcag tggtgggggg ccttggcggc
601 tacatgctgg gaagtgccat gagcaggccc atcatacatt tcggcagtga ctatgaggac
661 cgttactatc gtgaaaacat gcaccgttac cccaaccaag tgtactacag gcccatggat
721 gagtacagca accagaacaa ctttgtgcac gactgcgtca atatcacaat caagcagcac
781 acggtcacca caaccaccaa gggggagaac ttcaccgaga ccgacgttaa gatgatggag
841 cgcgtggttg agcagatgtg tatcacccag tacgagaggg aatctcaggc ctattaccag
901 agaggatcga gcatggtcct cttctcctct ccacctgtga tcctcctgat ctctttcctc
961 atcttcctga tagtgggatg aggaaggtct tcctgttttc accatctttc taatcttttt
1021 ccagcttgag ggaggcggta tccacctgca gcccttttag tggtggtgtc tcactctttc
1081 ttctctcttt gtcccggata ggctaatcaa tacccttggc actgatgggc actggaaaac
1141 atagagtaga cctgagatgc tggtcaagcc ccctttgatt gagttcatca tgagccgttg
1201 ctaatgccag gccagtaaaa gtataacagc aaataaccat tggttaatct ggacttattt
1261 ttggacttag tgcaacaggt tgaggctaaa acaaatctca gaacagtctg aaataccttt
1321 gcctggatac ctctggctcc ttcagcagct agagctcagt atactaatgc cctatcttag
1381 tagagatttc atagctattt agagatattt tccattttaa gaaaacccga caacatttct
1441 gccaggtttg ttaggaggcc acatgatact tattcaaaaa aatcctagag attcttagct
1501 cttgggatgc aggctcagcc cgctggagca tgagctctgt gtgtaccgag aactggggtg
1561 atgttttact tttcacagta tgggctacac agcagctgtt caacaagagt aaatattgtc
1621 acaacactga acctctggct agaggacata ttcacagtga acataactgt aacatatatg
1681 aaaggcttct gggacttgaa atcaaatgtt tgggaatggt gcccttggag gcaacctccc
1741 attttagatg tttaaaggac cctatatgtg gcattccttt ctttaaacta taggtAATTA 2.1k sheep homologue
1801 Aggcagctga aaagtaaatt gccttctaga cactgaaggc aaatctcctt tgtccattta
1861 cctggaaacc agaatgattt tgacatacag gagagctgca gttgtgaaag caccatcatc
1921 atagaggatg atgtaattaa aaaatggtca gtgtgcaaag aaaagaactg cttgcatttc
1981 tttatttctg tctcataatt gtcaaaaacc agaattaggt caagttcata gtttctgtaa
2041 ttggcttttg aatcaaagaa tagggagaca atctaaaaaa tatcttaggt tggagatgac
2101 agaaatatga ttgatttgaa gtggaaaaag aaattctgtt aatgttaatt aaagtaaaat
2161 tattccctga attgtttgat attgtcacct agcagatatg tattactttt ctgcaatgtt
2221 attattggct tgcactttgt gagtattcta tgtaaaaata tatatgtata taaaatatat
2281 attgcatagg acagacttag gagttttgtt tagagcagtt aacatctgaa gtgtctaatg
2341 cattaacttt tgtaaggtac tgaatactta atatgtggga aacccttttg cgtggtcctt
2401 aggcttacaa tgtgcactga atcgtttcat gtaagaatcc aaagtggaca ccattaacag
2461 gtctttgaaa tatgcatgta ctttatattt tctatatttg taactttgca tgttcttgtt
2521 ttgttatata aaaaaattgt aaatgtttaa tatctgactg aaattaaacg agcgaagatg
2581 agcacca
Of the suggested polyA signals, the human prion mRNA has only TATAAA, at positions 2269 and 2527 (capitalized in the sequence below).
ccgcccgcgagcgccgccgcttcccttccccgccccgcgtccctccccctcggccccgcgcgtcgcctgtcctccgagccagtcgctgacagccgcggcgccgcgagcttctcctctcctcacgaccgaggcaggactcctgaatatttttcaaaactgaacaatttcagccatgtctgagctttccgtcttcctggaggcacaaatctagtttagctgaaccacaacagattagcagtcattatggcgaaccttggctgctggatgctggttctctttgtggccacatggagtgacctgggcctctgcaagaagcgcccgaagcctggaggatggaacactgggggcagccgatacccggggcagggcagccctggaggcaaccgctacccacctcagggcggtggtggctgggggcagcctcatggtggtggctgggggcagcctcatggtggtggctgggggcagccccatggtggtggctggggtcaaggaggtggcacccacagtcagtggaacaagccgagtaagccaaaaaccaacatgaagcacatggctggtgctgcagcagctggggcagtggtggggggccttggcggctacatgctgggaagtgccatgagcaggcccatcatacatttcggcagtgactatgaggaccgttactatcgtgaaaacatgcaccgttaccccaaccaagtgtactacaggcccatggatgagtacagcaaccagaacaactttgtgcacgactgcgtcaatatcacaatcaagcagcacacggtcaccacaaccaccaagggggagaacttcaccgagaccgacgttaagatgatggagcgcgtggttgagcagatgtgtatcacccagtacgagagggaatctcaggcctattaccagagaggatcgagcatggtcctcttctcctctccacctgtgatcctcctgatctctttcctcatcttcctgatagtgggatgaggaaggtcttcctgttttcaccatctttctaatctttttccagcttgagggaggcggtatccacctgcagcccttttagtggtggtgtctcactctttcttctctctttgtcccggataggctaatcaatacccttggcactgatgggcactggaaaacatagagtagacctgagatgctggtcaagccccctttgattgagttcatcatgagccgttgctaatgccaggccagtaaaagtataacagcaaataaccattggttaatctggacttatttttggacttagtgcaacaggttgaggctaaaacaaatctcagaacagtctgaaatacctttgcctggatacctctggctccttcagcagctagagctcagtatactaatgccctatcttagtagagatttcatagctatttagagatattttccattttaagaaaacccgacaacatttctgccaggtttgttaggaggccacatgatacttattcaaaaaaatcctagagattcttagctcttgggatgcaggctcagcccgctggagcatgagctctgtgtgtaccgagaactggggtgatgttttacttttcacagtatgggctacacagcagctgttcaacaagagtaaatattgtcacaacactgaacctctggctagaggacatattcacagtgaacataactgtaacatatatgaaaggcttctgggacttgaaatcaaatgtttgggaatggtgcccttggaggcaacctcccattttagatgtttaaaggaccctatatgtggcattcctttctttaaactataggtaattaaggcagctgaaaagtaaattgccttctagacactgaaggcaaatctcctttgtccatttacctggaaaccagaatgattttgacatacaggagagctgcagttgtgaaagcaccatcatcatagaggatgatgtaattaaaaaatggtcagtgtgcaaagaaaagaactgcttgcatttctttatttctgtctcataatt
gtcaaaaaccagaattaggtcaagttcatagtttctgtaattggcttttgaatcaaagaatagggagacaatctaaaaaatatcttaggttggagatgacagaaatatgattgatttgaagtggaaaaagaaattctgttaatgttaattaaagtaaaattattccctgaattgtttgatattgtcacctagcagatatgtattacttttctgcaatgttattattggcttgcactttgtgagtattctatgtaaaaatatatatgtaTATAAAatatatattgcataggacagacttaggagttttgtttagagcagttaacatctgaagtgtctaatgcattaacttttgtaaggtactgaatacttaatatgtgggaaacccttttgcgtggtccttaggcttacaatgtgcactgaatcgtttcatgtaagaatccaaagtggacaccattaacaggtctttgaaatatgcatgtactttatattttctatatttgtaactttgcatgttcttgttttgttaTATAAAaaaattgtaaatgtttaatatctgactgaaattaaacgagcgaagatgagcacca
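Hexamer positions like these can be found by a simple string scan. The sketch below is illustrative only: the function `find_hexamers` and its default motif list (AATAAA, ATTAAA, TATAAA, the three hexamers named elsewhere in these notes) are assumptions, and the test string is just the last 67 bases of the human mRNA listing, numbered from 2521.

```python
def find_hexamers(seq, motifs=("AATAAA", "ATTAAA", "TATAAA"), first_pos=1):
    """Return sorted (position, motif) pairs; first_pos numbers the first base."""
    seq = seq.upper()
    hits = []
    for motif in motifs:
        start = seq.find(motif)
        while start != -1:
            hits.append((first_pos + start, motif))
            start = seq.find(motif, start + 1)  # allow overlapping matches
    return sorted(hits)

# Bases 2521-2587 of the human mRNA as listed above.
tail = ("ttgttatataaaaaaattgtaaatgtttaatatctgactg"
        "aaattaaacgagcgaagatgagcacca")

hits = find_hexamers(tail, first_pos=2521)
for pos, motif in hits:
    print(pos, motif)
```

On this tail the scan reports TATAAA at 2527, matching the position given above. It also reports an ATTAAA at 2563, close to the 3' end; whether that hexamer counts among the "suggested" sites is not clear from the notes.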
dog AF022714
ggggcaacct tcctgttttc attatcat
AF003087
gggcaacct
781 tcctgttttc attat
J Gen Virol 1999 80: 2275 W. Goldmann, G. O'Neill, F. Cheung, F. Charleson, P. Ford and N. Hunter
This paper looks at two lengths of mRNA found in various sheep tissues, the 2.1 kb (from TTAAGGTACCTTAATTAAA.CTACCTTCTAGACACTG- ending at 1634 of U67922 mRNA) and the 4.6 kb, which differ only in polyadenylation site. The 4.6 is highest in brain; the 2.1 is highest in spleen. The 2.1 is found in sheep and goat, marginally in cow, and was not seen in human or mouse. [This statement is inconsistent with homology alignment.]
1561 gtgatattcc tttctttagt aacataaagt atagataatt aaggtacctt aattaaaCTA
1621 CCTTCTAGAC ACTGagagca aatctgttgt ttatctggaa cccaggatga ttttgacatt
1634 of U67922 mRNA is 26295 of whole sequence.
4.6 does not seem to have proper signal-site separation
4081 ctcatatgtc atggggcaga gtcaagtccc cattgtgcct gtccaactct ttggcctaca
4141 caattcatgg gcatAATAAA atggtggttt ctttagacca ttaagttttg gagtagttgc
682 bp after the 2.1 site comes a signal, possibly that of the 2.6k.
2281 aaattgacttgaatcaaaagggaggcatttaaagaAATAAAttagagatgatagaaatct +36 2316
1130 bp after the 2.1 site comes a signal, possibly that of the 3.3, 14 bp after the LINE element.
2761 gacAATAAAtgctgggtcggctaaaaggttcattaggttttttttctgtaagatggctct
Sheep also show a minor 3.3 kb band (amounting to 1-5% of mRNA) and a 2.6 kb species seen in kidney. No tissue is 2.1 only; some 2.1 is in brain, heart, and liver, contrary to earlier findings. The 2.1 in brain is possibly from specific regions; this was not tested. Goldmann et al. also assert unpublished alternative 5' splicing sites in cow but not sheep.
M131313 and AJ223072 are said to have 6 allelic differences and 30 differences relative to Lee's heroic sequence U67922, and to differ in several positions from Cheviot (Goldmann, unpublished), showing a high mutation rate or (very likely, for reasons given by Lee) sequencing error. This could have been tested by alignment with cow and mink 3' UTR but was not.
Sheep 3' UTR has 3 retrotransposons, poorly described in this paper and claimed to have high GC and high GT content, said to function as RNA polymerase stop signals. The mRNA species are not breed, allele, or scrapie related. Regions ABCDEFG were previously defined by Goldmann [PNAS 87: 2476-2480, 1990]; it is unclear what significance these have. Goldmann [Br Med Bull 49: 839-859, 1993] says retrotransposons are regions D and F, lacking in 2.1. Sheep 2.1 is found in domains ABC, said to be non-homologous to human.
Protein expression was measured in a heterologous system, mouse neuroblastoma. The best protein expression correlated with the shortest 3' UTR in region G; the short 3' UTR in region C was distinctly less efficient. Poor translation of full-length mRNA in transfected cells, unlike sheep brain, suggests murine neuroblastoma cells were unsuitable for characterizing relative translation efficiencies in vivo in sheep.
Bovine ovary, uterus, and brain also had low 2.1, but it was still 0.5-2% of total mRNA. Horiuchi 1995 compared ovine kidney and brain: kidney had 20% as much mRNA yet only 2.5% of the protein, an 8-fold lower use of the mRNA. Here ovine kidney was 75% 4.6 and 25% 2.1, or 15% and 5% of brain, consistent with but hardly proving that only 2.1 is translated in kidney. Sheep express 4.6 kb mRNA two-thirds through gestation in brain, as does mouse. Various fetal tissues, young lambs, and placenta also express the prion gene. Of the fetal tissues, only tonsil had both mRNA forms. Prion mRNA is easily detected at 98 days gestation, is 100x higher at day 134, and 200x higher in early lamb. The 4.6/2.1 ratio was 1 at fetal day 98 and 3 at fetal day 138 and in early lamb, and then held steady. Thymus had a ratio of 4, which drops during early lamb by a factor of 8.
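The kidney-versus-brain figures can be worked through explicitly. The sketch below assumes the Horiuchi 1995 numbers mean that kidney has 20% of the brain mRNA level and 2.5% of the brain protein level, which is the reading under which the 15% and 5% figures come out; the variable names are made up for illustration, and exact fractions are used to avoid floating-point noise.

```python
from fractions import Fraction

# Kidney levels relative to brain (assumed reading of the figures quoted above).
kidney_mrna = Fraction(20, 100)
kidney_protein = Fraction(25, 1000)

# How many times less protein kidney makes per unit of mRNA than brain does.
fold_lower_use = kidney_mrna / kidney_protein

# Split of kidney mRNA between the two species, as a fraction of brain mRNA.
share_46, share_21 = Fraction(75, 100), Fraction(25, 100)
kidney_46 = kidney_mrna * share_46
kidney_21 = kidney_mrna * share_21

print(fold_lower_use, float(kidney_46), float(kidney_21))
```

Under this reading the 2.1 species alone is 5% of the brain mRNA level against 2.5% of the brain protein level, which is why the data fit, without proving, translation from 2.1 only.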
Calf had 4.6 in brain, kidney, spleen, liver, ovary, and uterus (the last two also had some 2.1). Goat spleens had both; brains were 4.6. Humans have 2.5 kb mRNA, 4x as abundant in brain as in liver and heart and 8x as abundant as in lung, placenta, muscle, kidney, and pancreas. Humans are said to have 3 additional consensus sites at 1765, 1978, and 2098 that could give shorter mRNAs. This 2.5k must be adjusted for exon 1 and the leader and coding portion of exon 3 to give numbering in terms of the 3' UTR: exon 1 is 134 bp, exon 2 is 97 bp, exon 3 leader is 11 bp, and coding + stop is 738, totalling 980 bp. Mouse and hamster have a 2.5 and a 1.2 in peripheral tissues, close to 1152 [Locht PNAS 83: 6372-6376 1986]. No 1.2 was seen here.
Sheep are said to have a polyA signal at 3220, versus 3246 in the GenBank annotation of Lee's UTR. Human 3' UTR consists of 1606 bp. 9 Suffolk polyA signals are said to be at 1a AATAAA 1523, 1b TATAAA 1523, AATAAA 2222, AATAAA 2285, AATAAA 2667, ATTAAA 4063. Not in table: 1253, 4038, 4678. Cheviot (unpublished) have 1253, 1523, 4038, and 4063.
The majority of mRNA is polyadenylated about 20 bp downstream of the polyA signal at 4063 in Cheviot. 2.1 is polyadenylated 23 bp downstream of ATTAAA 1523.
Medline search 13 Aug 99
Anim Genet 1998 Feb;29(1):37-40 Horiuchi M, Ishiguro N, Nagasawa H, Toyoda Y, Shinagawa M
The extent of intron 2 of the bovine PrP gene and the nucleotide sequence of the 3' half of bovine PrP cDNA is given. This newly sequenced 3' half of the bovine PrP cDNA consisted of 2149 bp. The entire 3'-untranslated region (3'-UTR) was found to be encoded by a single exon, exon 3. One nucleotide polymorphism was found in the 3'-UTR.
Genome Res 1998 Oct;8(10):1022-37 Lee IY, Westaway D, Smit AF, ... Cooper C, Yao H, Prusiner SB, Hood LE
A major paper that determined and analyzed entire prion genes and flanking regions from sheep, mouse, and human.
Biochem Biophys Res Commun 1997 Apr 28;233(3):650-4 Horiuchi M, Ishiguro N, Nagasawa H, Toyoda Y, Shinagawa M
Here we report two types of bovine prion protein mRNA that possessed different lengths of the 5'-untranslated region and were expressed in various bovine tissues. The two mRNA species were transcribed from identical positions but differed in the usage of the splice site for exon 1/intron. One mRNA possessed exon 1 consisting of 53 nucleotides and the other possessed exon 1 consisting of 168 nucleotides. Usage of exons 2 and 3 was identical for the two mRNA species. The two mRNA species were detected in all but spleen tissue; the mRNA possessing 168-nt exon 1 was not detected in bovine spleen. This is the first report on the tissue-specific alternative splicing of PrPc mRNA in any other species. Only a low level of PrPc appeared to be present in bovine spleen. These results suggested the possibility that the mRNA possessing 53-nt exon 1 was inefficiently translated into Prp; however, in vitro translation analysis showed no marked difference in translational efficiency between the two mRNA species.
Unpublished, Genbank X83613 1237 bp and X83612 17 Feb 97 Baybutt,H. and Hope,J.
variation 444 a in VM-S7 mice, g here
variation 1010 a in VM-S7 mice, g here
Ann N Y Acad Sci 1994 Jun 6;724:353-4 Hunter N, Manson JC, Charleson FC, Hope J
J Gen Virol 1995 Oct;76 ( Pt 10):2583-7 Horiuchi M, Yamazaki N, Ikeda T, Ishiguro N, Shinagawa M
A cellular form of the prion protein (PrPC) is thought to be a substrate for an abnormal isoform of the prion protein (PrPSc) in scrapie. PrPC is abundant in tissues of the central nervous system, but little is known about the distribution of PrPC in non-neuronal tissues of sheep, the natural host of scrapie. This study investigated the tissue distribution of PrPC in sheep. Although PrPC was abundant in neuronal tissues, it was detected in non-neuronal tissues such as spleen, lymph node, lung, heart, kidney, skeletal muscle, uterus, adrenal gland, parotid gland, intestine, proventriculus, abomasum and mammary gland. Neither PrPC nor PrP mRNA was detected in the liver. The tissue distribution of PrPC appears to be inconsistent with the tissues which possess scrapie infectivity, suggesting that factor(s) specific to certain cell types may be required to support multiplication of the scrapie agent.
Br Med Bull 1993 Oct;49(4):839-59 [Review, nothing relevant in abstract] Goldmann W
Proc Natl Acad Sci U S A 1990 Apr;87(7):2476-80 [Nothing relevant in abstract] Goldmann W, Hunter N, Foster JD, Salbaum JM, Beyreuther K, Hope J
Sheep are the natural hosts of the pathogens that cause scrapie, an infectious degenerative disease of the central nervous system. Scrapie-associated fibrils [and their major protein, prion protein (PrP)] accumulate in the brains of all species affected by scrapie and related diseases. PrP is encoded by a single gene that is linked to (and may be) the major gene controlling the incubation period of the various strains of scrapie pathogens. To investigate the role of PrP in natural scrapie, we have determined its gene structure and expression in the natural host. We have isolated two sheep genomic DNA clones that encode proteins of 256 amino acids with high homology to the PrPs of other species. Sheep PrPs have an arginine/glutamine polymorphism at position 171 that may be related to the alleles of the scrapie incubation-control gene in this species.
J Vet Med Sci 1997 Mar;59(3):175-83 Inoue S, Tanaka M, Horiuchi M, Ishiguro N, Shinagawa M
We cloned the part of the bovine PrP gene which contains the 5'-flanking region, exon 1, exon 2 and intron 1 to analyze its promoter region. The 5' non-coding region of the bovine PrP gene consisted of three exons and two introns, and its organization was similar to that of the mouse, rat and sheep PrP genes...
Proc Natl Acad Sci U S A 1986 Sep;83(17):6372-6 Locht C, Chesebro B, Race R, Keith JM
The prion protein (PrP) is a scrapie-associated fibril protein that accumulates in the brains of hamsters and mice infected with the scrapie agent, and also in the brains of persons affected with kuru or Creutzfeldt-Jakob disease. It has been previously proposed that PrP could be either the primary transmissible agent of scrapie or a secondary component involved in the pathogenesis of scrapie. At present, the second possibility seems more likely, for the PrP-specific mRNA is present in both infected and uninfected brains. We have isolated and sequenced the complete PrP-specific cDNA from mRNA isolated from infected mouse brains. Comparison of the mouse PrP with the hamster PrP reveals a high homology in the amino acid sequence and the presence of a conserved octapeptide repeated four times, whose function is unknown at present. Structural features are discussed and compared with other proteins. Except for its homology with the hamster PrP, mouse PrP has no significant homology to any known protein sequence, including neurofilaments, neuropeptides, and amyloid proteins of Alzheimer disease. Some features of the PrP, however, are similar to structures found in aggregating proteins, such as the wheat glutenin, keratin, and collagen.
J. Gen. Virol. 73 (Pt 10), 2757-2761 (1992) Kretzschmar,H.A., Neumann,M., Riethmuller,G. and Prusiner,S.B.
Virus Genes 6 (4), 343-356 (1992) Yoshimoto,J., Iinuma,T., Ishiguro,N., Horiuchi,M., Imamura,M. and Shinagawa,M.
They sequenced well into the 3' UTR of the bovine gene.
Medline shows 101 reviews of polyadenylation. Highlights: 3'-Ends of almost all eukaryotic mRNAs are generated by endonucleolytic cleavage and addition of a poly(A) tail. In mammalian cells, the reaction depends on the sequence AAUAAA upstream of the cleavage site, a degenerate GU-rich sequence element downstream of the cleavage site and stimulatory sequences upstream of AAUAAA. Wahle E Bioessays 1992 Feb;14(2):113-8
I came across this goat prion mRNA sequence by accident today and am just passing it along for what it is worth. It ends 12 bp further 3' than sheep U67922's annotation. The sequence extends well into the mariner insert and seems to be of good quality for an EST. Somewhat oddly, there do not seem to be cow or sheep ESTs yet.
LOCUS Z71825 366 bp mRNA EST 13-NOV-1996
Goat mammary gland Capra hircus cDNA clone EST15-34, mRNA sequence.
AUTHORS Le Provost,F., Lepingle,A. and Martin,P.
TITLE A survey of the goat genome transcribed in the lactating mammary
gland
JOURNAL Mamm. Genome 7 (9), 657-666 (1996)
1 aaagataaaa agttttgtga acacagaatt atgacgttgc ctgaaaaatg gcagaaggta
61 gtgtaacaaa agagtgacta tgttgtttgg taaagttctt agtgaaaatg aaaaatgtgt
121 cttttatttt tatttaaaca ccaaaggcac attttggcca acccaatact gaatacttaa
181 aggaaactct tctgtgttgt ccttagcctt acagtgtgca ctgaatagtt ttgtataaga
241 ntccagagtg atatttgaaa tacgcatgtn cttatatttn tnatattngt aactttgcat
301 gtacttgttt tgtgttaaaa gttttataaa tatttaatat ctgactaaaa ttaaacagga
361 gttaaa
Masked Sequence: DNA/Mariner positions 1055-1220
>goat
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNTACTGAATACTTAAAGGAAACTCTTCTGTGTTGT
CCTTAGCCTTACAGTGTGCACTGAATAGTTTTGTATAAGANTCCAGAGTG
ATATTTGAAATACGCATGTNCTTATATTTNTNATATTNGTAACTTTGCAT
GTACTTGTTTTGTGTTAAAAGTTTTATAAATATTTAATATCTGACTAAAA
TTAAACAGGAGTTAAA
repeat_region 24889..26108 sheep Oamar1
gb|U67922|OAPRP Ovis aries prion protein gene, alignment
Goat : 1 aaagataaaaagttttgtgaacacagaattatgacgttgcctgaaaaatggcagaaggta 60
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sheep: 25943 aaagataaaaagttttgtgaacacagaattatgacgttgcctgaaaaatggcagaaggta 26002
Goat : 61 gtgtaacaaaagagtgactatgttgtttggtaaagttcttagtgaaaatgaaaaatgtgt 120
||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sheep: 26003 gtggaacaaaagagtgactatgttgtttggtaaagttcttagtgaaaatgaaaaatgtgt 26062
Goat : 121 cttttatttttatttaaacaccaaaggcacattttggccaacccaatactgaatacttaa 180
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sheep: 26063 cttttatttttatttaaacaccaaaggcacattttggccaacccaatactgaatacttaa 26122
Goat : 181 aggaaactcttctgtgttgtccttagccttacagtgtgcactgaatagttttgtataaga 240
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sheep: 26123 aggaaactcttctgtgttgtccttagccttacagtgtgcactgaatagttttgtataaga 26182
Goat : 241 ntccagagtgatatttgaaatacgcatgtncttatatttntnatattngtaactttgcat 300
|||||||||||||||||||||||||||| ||||||||| | ||||| ||||||||||||
Sheep: 26183 atccagagtgatatttgaaatacgcatgtgcttatattttttatatttgtaactttgcat 26242
Goat : 301 gtacttgttttgtgttaaaagttttataaatatttaatatctgactaaaattaaacagga 360
||||||||||||||||||||| ||||||||||||||||||||||||||||||||||||||
Sheep: 26243 gtacttgttttgtgttaaaag-tttataaatatttaatatctgactaaaattaaacagga 26301
Goat : 361 gttaaa 366
| ||||
Sheep: 26302 gctaaa 26307
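The quality of an EST against a genomic reference can be summarized by counting identities over the aligned columns. The sketch below is one way to do that, under stated assumptions: the function `alignment_identity` is hypothetical, both rows must already be aligned to equal length, and columns holding a gap or an uncalled base (n) are skipped rather than counted as mismatches, which is a choice and not the only reasonable one.

```python
def alignment_identity(row_a, row_b):
    """Return (matches, compared) for two equal-length aligned rows.

    Columns with a gap ('-') or an uncalled base ('n') in either row
    are left out of the comparison.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    matches = compared = 0
    for a, b in zip(row_a.lower(), row_b.lower()):
        if a in "-n" or b in "-n":
            continue
        compared += 1
        if a == b:
            matches += 1
    return matches, compared

# Final alignment block above: goat 361-366 against sheep 26302-26307.
last_block = alignment_identity("gttaaa", "gctaaa")
# First ten columns of the goat 61-120 block.
early_block = alignment_identity("gtgtaacaaa", "gtggaacaaa")
print(last_block, early_block)
```

Summing the pairs over every block of the alignment would give an overall identity for the goat EST against sheep U67922.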