GKYLNEYGAPDAGGLEHVPLGWSYWYALEKNSKYYNYTLSINGKARKHGENYSVDYLTDVLANVSLDFLDYKSNFEPFFMMIATPAPH
GKYLNEYGAPDAGGLEHIPLGWSYWYALEKNSKYYNYTLSINGKARKHGENYSVDYLTDVLANLSLDFLDYKSNSEPFFMMISTPAPH
GKYLNEYGAPDAGGLEHVPLGWSYWYALEKNSKYYNYTLSINGKARRHGENYSVDYLTDVLANLSLDFLDYKSNSEPFFMMISTPAPH
GKYLNEYGAPDAGGLGHVPLGWSYWYALEKNSKYYNYTLSINGKARKHGENYSVDYLTDVLANVSLDFLDYKSNSEPFFMMISTPAPH
GKYLNEYGAEDAGGVSHVPPGWSFWYALEKNSKYYNYTLSVNGKARRHGENYSVDYLTDVLANMSLDFLEYKSWNLFFIDGSQTPAPh
GKYLNQYGSEEAGGINHVPPGWSYWFALEKNSKYYNYTLSENGRPKTHGQNYSQDYLTDVLSNVSLDFLNYKSNHEPFFMMIATPAPH
GKYLNQYGSKDAGGVAHVPPGWDQWHALVGNSKYYNYTLSVNGKEEKHGDSYEKDYLTDLVLNRSLHFLEERSPSHPFFMMLCPPAPH
GKYLNQYGHAQAGGVEHIPPGWSFWVGLEKNSKYYNYTLSVNGKAQKHGSDYSKDYLTDVLANMSLEFLQYKSSYQPFFMMVSTPAPH
GKYLNQYGQKDAGHVGHIPPGWDHWHALVGNSQYYNYSLSVNGKEEKHGDNYGDDYLTDLITNRSLTFLDNRSPQLPFFLLLSPPAPH
GKYLNQYGGKSVGGPQHVPVGWNQWFGLVGNSKYYNYTISDNGVPVQHGANYHEDYLTDLLANRSVDFIHNHKMTQPFFMMISTPAPH
GKYLNQYWGADVPKGWNHFYGWNQWFGLHGNSRYYNYTLRENSGNVQHGAHYESTYLTDLLRDRAADFLRNATQSEPFFAMVAPPAAH
........................e.hhh.......eeeeee.............hhhhhhhhhh.h.h.........eeeeee....
........................eehhhh......eeeeee.............hhhhhhhhhh.hhhh........eeeee.....
........................e.hhh.......eeeeee.............hhhhhhhhhh.hhhh........eeeee.....
........................eehhh.......eeeeee.............hhhhhhhhhh.h.h.........eeeee.....
.....hh.................eeee........eeeeee.............hhhhhhhhhhhhhhhhh...eeeee........
........................eeee.......eeeee................hhhhhhh...hhhh........eeeeee....
........................hhhhhe.....eeeeeee...........hhhhhhhhhhhhhhhh.........eeeee.....
........................eeee.......eeeeeee..............hhhhhhhhhhhhhhhh......eeeee.....
.....hh...................hee......eeeeeee..............hhhhhh....eeee........eeeee.....
....e....................eeee......eeeee..............hhhhhhhhhh...hh.h.......eeeee.....
.......e..................e........eeeeee...............hhhhhhhhhhhhhhhh.......eeee.....
>GNS_hsa
GKYLNEYGAP...DAGGLEHVPLGWSYWYALEKNSK..YYNYTLSI.NGKARKHGENYSVDYLTDVLANVSL.......................................DFLDYKSN...FEPFFMMIATPAPH
>GNS_mmu
GKYLNEYGAP...DAGGLEHIPLGWSYWYALEKNSK..YYNYTLSI.NGKARKHGENYSVDYLTDVLANLSL.......................................DFLDYKSN...SEPFFMMISTPAPH
>GNS_rno
GKYLNEYGAP...DAGGLEHVPLGWSYWYALEKNSK..YYNYTLSI.NGKARRHGENYSVDYLTDVLANLSL.......................................DFLDYKSN...SEPFFMMISTPAPH
>GNS_chi
GKYLNEYGAP...DAGGLGHVPLGWSYWYALEKNSK..YYNYTLSI.NGKARKHGENYSVDYLTDVLANVSL.......................................DFLDYKSN...SEPFFMMISTPAPH
>GNS__gga
GKYLNEYGAE...DAGGVSHVPPGWSFWYALEKNSK..YYNYTLSV.NGKARRHGENYSVDYLTDVLANMSL.......................................DFLEYKSW...NLFFIDGSQTPAPh
>GNS_xla
GKYLNQYGSE...EAGGINHVPPGWSYWFALEKNSK..YYNYTLSE.NGRPKTHGQNYSQDYLTDVLSNVSL.......................................DFLNYKSN...HEPFFMMIATPAPH
>GNS_dre
GKYLNQYGSK...DAGGVAHVPPGWDQWHALVGNSK..YYNYTLSV.NGKEEKHGDSYEKDYLTDLVLNRSL.......................................HFLEERSP...SHPFFMMLCPPAPH
>GNS1_fru
GKYLNQYGHA...QAGGVEHIPPGWSFWVGLEKNSK..YYNYTLSV.NGKAQKHGSDYSKDYLTDVLANMSL.......................................EFLQYKSS...YQPFFMMVSTPAPH
>GNS2_fru
GKYLNQYGQK...DAGHVGHIPPGWDHWHALVGNSQ..YYNYSLSV.NGKEEKHGDNYGDDYLTDLITNRSL.......................................TFLDNRSP...QLPFFLLLSPPAPH
>GNS_cii
GKYLNQYGGK...SVGGPQHVPVGWNQWFGLVGNSK..YYNYTISD.NGVPVQHGANYHEDYLTDLLANRSV.......................................DFIHNHKM...TQPFFMMISTPAPH
>GNS_dme
GKYLNQYWGA...DVPKGWNHFYGWNQWFGLHGNSR..YYNYTLRE.NSGNVQHGAHYESTYLTDLLRDRAA.......................................DFLRNATQ...SEPFFAMVAPPAAH
GKLFNAHTVDNYDSPYIAGWNGSDFLLDPYTYSYLNATFQRNRDPPISYEGQYSVDVLAEKAYGFLDEAAKHNRPFFLGIAPIAPH
GKLFNAHTVENYNSPYPAGWNGSDFLLDPYTYNYLNSSFQRNQDPPKSYEGFHSVDVLAEKSLGFVDEAVRADGPFFLGIAPVAPH
GKLFNAQTVDNYDSPHAAGWTGSDFLlDPYTYSYLNATFQRNKDAPVSHEGEYSTGVLAGKALGFLDDVVAEDKPFFLGIAPIAPH
GKFLVDYSVSNYQNVPaAGWTDIDALVTPYTFDYlNNPFSRNGATPNIYPGFYSTDVIADKAVAQIKTAVAAGKPFYAQISPIAPH
GKFLVDYSVSNYQQVPRAGwTISMPlVTPYTFDYlNnTLQRNGATPNIYPGEYSTDVIRDKGVAQIKSAVAAGKPFYAQISPIAPH
..ee...........eee....................................hhhhhhhhhhhhhhhhh....eeeee......
......................................................hhhhhhh....hhhhh......eeee......
....................................hhh..................hhhhhh..hhhh......eeeee......
..eeeeee................h..............................hhhhhhhhhhhhhhhhh.....eee......
..eeeee.............eee................................hhhhh..hhhhhhhhh......ee.......
>Sulf_ncr
GKLFNAHT.....VDNYDSPYIAGWNGSDFLLDPYTYSYLNAT.FQRNRDPPISYEGQYSV..DVLAEKAY........................................GFLDEAAK..HNRPFFLGIAPIAPH
>Sulf_pan
GKLFNAHT.....VENYNSPYPAGWNGSDFLLDPYTYNYLNSS.FQRNQDPPKSYEGFHSV..DVLAEKSL........................................GFVDEAVR..ADGPFFLGIAPVAPH
>Sulf_cgl
GKLFNAQT.....VDNYDSPHAAGWTGSDFLlDPYTYSYLNAT.FQRNKDAPVSHEGEYST..GVLAGKAL........................................GFLDDVVA..EDKPFFLGIAPIAPH
>Sulf_vca
GKFLVDYS.....VSNYQNVPaAGWTDIDALVTPYTFDYlNNP.FSRNGATPNIYPGFYST..DVIADKAV........................................AQIKTAVA..AGKPFYAQISPIAPH
>Sulf_cre
GKFLVDYS.....VSNYQQVPRAG.TISMPlVTPYTFDYlNnT.LQRNGATPNIYPGEYST..DVIRDKGV........................................AQIKSAVA..AGKPFYAQISPIAPH
GKYLNEYNGSYIPQGWQYWMGLVRNSRYYNYSLRHNDVKESHRDNYRDDYFTDLIVNRSMTYFRRKKHEEPDSPILSVLSFPAPH
GKYLNEYNGSYIPAGWKYWMGLIKNSKYYNYAVNHNSQKELHGDDYAKDYLTDLVTNRSMEFFRDSKTERPEDPVLVAlsfpaph
GKYLNKYNGSYIPPGWREWGGLIMNSKYYNYSINLNGQKIKHGFDYAKDYYPDLIANDSIAFLRSSKQQNQRKPVLLTMSFPAPH
GKYLNEYDGSYIPPGWDEWHAIVKNSKFYNYTMNSNGEREKFGSEYEKDYFTDLVTNRSLSKFIDKIKIRAWQPFALIISYPAPH
GKYLNEYNGSYIPPGWREWLGLIKNSRFYNYTVCRNGIKEKHGFDYAKDYFTDLITNESINYFKMSKRMYPHRPVMMVISHAAPH
GKYLNEYNGSYIPPGWREWLGLIKNSRFYNYTVCRNGIKEKHGFDYAKDYFTDLITNESINYFKMSKRMYPHRPIMMVISHAAPH
GKYLNEYNGSYIPPGWREWLGLIKNSRFYNYTVCRNGIKEKHGFDYAKDYFTDLITNESINYFKMSKRMYPHRPVMMVISHAAPH
GKYLNEYNGSYIPPGWREWVGLVKNSRFYNYTISRNGNKEKHGFDYAKDYFTDLITNESINYFRMSKRIYPHRPIMMVISHAAPH
GKYLNEYNGSYIPPGWREWVGLVKNSRFYNYTISRNGNKEKHGFDYAKDYFTDLITNESINYFRMSKRIYPHRPIMMVISHAAPH
GKYLNEYNGSYIPPGWREWLGLVKNSRFYNYTMCRNGFKEKHGFEYEKDYFTDLITNDSISYFKSSKKMYPHRPIMMVISHAAPH
GKYLNEYNGSYIPPGWREWLGLVKNSRFYNYTMCRNGFKEKHGFEYEKDYFTDLITNDSISYFKLSKKLYPHRPIMMVISHAaph
GKYLNEYNGSYIPPGWREWVGLIKNSRFYNYTVCRNGYKEKHGGEYAKDYFTDLITNDSINFFRISKRMFPHRPVMMVISHAAPH
................hhhhhhh...........................hhhhhhhhhhhhhhhh........eeeee......
................hhhhhhh......eeee............hhhhhhhhhh...hhhhh...........eeeee......
.....................eee......eeee........................hhhhhhh.........eeeee......
.................hhhhhh....eeeeee..............h..hhh......hhhhhhhhhh......eeee......
................hhhhhhh....eeeeeee...........hhhhhhhhhh....hhhhhhhhh......eeeeee.....
................hhhhhhh....eeeeeee...........hhhhhhhhhh....hhhhhhhh.......eeeeee.....
................hhhhhhh....eeeeeee...........hhhhhhhhhh....hhhhhhhhh......eeeeee.....
.................h.eee.....eeeeeee...........hhhhhhhhhh......hhhh.h.......eeeeee.....
.................h.eee.....eeeeeee...........hhhhhhhhhh......hhhh.h.......eeeeee.....
................hhhhhhh....eeeeee..............h..hhhh.....hh..h..........eeeeee.....
................hhhhhhh....eeeeee..............h..hhhh.....hhh.hhhh.......eeeeee.....
................hhh.h......eeeeeee............hhhhhhhh.......hhhhhhh......eeeeee.....
>KIAA47/77_cii
GKYLNEYNGS...YIPQGWQYWMGL......VRNSR..YYNYSLRH.NDVKESHRDNYRDDYFTDLIVNRSM......................................TYFRRK.KHEEPDSPILSVLSFPAPH
>KIAA1077__hro
GKYLNEYNGS...YIPAGWKYWMGL......IKNSK..YYNYAVNH.NSQKELHGDDYAKDYLTDLVTNRSM......................................EFFRDS.KTERPEDPVLVAlsfpaph
>KIAA47/77_dme
GKYLNKYNGS...YIPPGWREWGGL......IMNSK..YYNYSINL.NGQKIKHGFDYAKDYYPDLIANDSI......................................AFLRSS.KQQNQRKPVLLTMSFPAPH
>KIAA47/77_cel
GKYLNEYDGS...YIPPGWDEWHAI......VKNSK..FYNYTMNS.NGEREKFGSEYEKDYFTDLVTNRSL......................................SKFIDK.IKIRAWQPFALIISYPAPH
>KIAA1077_hsa
GKYLNEYNGS...YIPPGWREWLGL......IKNSR..FYNYTVCR.NGIKEKHGFDYAKDYFTDLITNESI......................................NYFKMS.KRMYPHRPVMMVISHAAPH
>KIAA1077_mmu
GKYLNEYNGS...YIPPGWREWLGL......IKNSR..FYNYTVCR.NGIKEKHGFDYAKDYFTDLITNESI......................................NYFKMS.KRMYPHRPIMMVISHAAPH
>KIAA1077_rno
GKYLNEYNGS...YIPPGWREWLGL......IKNSR..FYNYTVCR.NGIKEKHGFDYAKDYFTDLITNESI......................................NYFKMS.KRMYPHRPVMMVISHAAPH
>KIAA1077_cco
GKYLNEYNGS...YIPPGWREWVGL......VKNSR..FYNYTISR.NGNKEKHGFDYAKDYFTDLITNESI......................................NYFRMS.KRIYPHRPIMMVISHAAPH
>KIAA1077__gga
GKYLNEYNGS...YIPPGWREWVGL......VKNSR..FYNYTISR.NGNKEKHGFDYAKDYFTDLITNESI......................................NYFRMS.KRIYPHRPIMMVISHAAPH
>KIAA1077_xla
GKYLNEYNGS...YIPPGWREWLGL......VKNSR..FYNYTMCR.NGFKEKHGFEYEKDYFTDLITNDSI......................................SYFKSS.KKMYPHRPIMMVISHAAPH
>KIAA1077__str
GKYLNEYNGS...YIPPGWREWLGL......VKNSR..FYNYTMCR.NGFKEKHGFEYEKDYFTDLITNDSI......................................SYFKLS.KKLYPHRPIMMVISHAaph
>KIAA1077_fru
GKYLNEYNGS...YIPPGWREWVGL......IKNSR..FYNYTVCR.NGYKEKHGGEYAKDYFTDLITNDSI......................................NFFRIS.KRMFPHRPVMMVISHAAPH
GKFLNNYDGSWVPPGWTKWAALVRNSRYYNYSLNKNGRNEWHGNRYENDYLTNLVANLSLQFIDESLLNPHGQPFLVVLSFPAPH
GKYLNEYNGSYVPPGWREWVALVKNSRFYNYTLCRNGInGwHGTQYPKDYLTnRITNDSINFLRMSKRMYPHRPVMMGLSHAAPH
GKYLNEYNGSYVPPGWKEWLGLVKNSRFYNYTLSRNGFREKHGAEYPQDYLTDLITAESMRYFRYSKRVYPHRPVLMVLSHAAPH
GKYLNEYNGSYVPPGWKEWVALVKNSRFYNYTLCRNGVREKHSSDYPKDYLTDIITNESINYFRTSKRTYPNRPVMMVLSHVAPH
GKYLNEYNGSYVPPGWKEWVGLLKNSRFYNYTLCRNGVKEKHGFDYSRDYLTDLITNDSITFFRISKKMYPHRPVLMVISHAAPH
GKYLNEYNGSYVPPGWKEWVGLLKNSRFYNYTLCRNGVKEKHGSDYSKDYLTDLITNDSVSFFRTSKKMYPHRPVLMVISHAAPH
GKYLNEYNGSYVPPGWKEWVGLLKNSRFYNYTLCRNGVKEKHGSDYSTDYLTDLITNDSVSFFRTSKKMYPHRPVLMVISHAAPH
...e.............hhhhhhh.....eee................hhhhhhhhhhhhhhhhhh........eeeeee.....
................hhhhhhh....e..eeee.........................hhhhhhhh.......eeee.......
................hhhhhhh....eeeeee...............hhhhhhhhhhhhhhhhh.........eeeeee.....
................hhhhhhh....e.eeee.................hhhee......ee...........eeeeeee....
................hhhhhhh......eeee................hhhhh......ee...hh.......eeeeee.....
................hhhhhhh......eeee................hhhhh......eeee..........eeeeee.....
................hhhhhhh......eeee................hhhhh......eeee..........eeeeee.....
>KIAA1247__hgl
GKFLNNYDGS...WVPPGWTKWAAL......VRNSR..YYNYSLNK.NGRNEWHGNRYENDYLTNLVANLSL......................................QFIDES.LLNPHGQPFLVVLSFPAPH
>KIAA1247__dre
GKYLNEYNGS...YVPPGWREWVAL......VKNSR..FYNYTLCR.NGInGwHGTQYPKDYLTnRITNDSI......................................NFLRMS.KRMYPHRPVMMGLSHAAPH
>KIAA1247b_fru
GKYLNEYNGS...YVPPGWKEWLGL......VKNSR..FYNYTLSR.NGFREKHGAEYPQDYLTDLITAESM......................................RYFRYS.KRVYPHRPVLMVLSHAAPH
>KIAA1247a_fru
GKYLNEYNGS...YVPPGWKEWVAL......VKNSR..FYNYTLCR.NGVREKHSSDYPKDYLTDIITNESI......................................NYFRTS.KRTYPNRPVMMVLSHVAPH
>KIAA1247__gga
GKYLNEYNGS...YVPPGWKEWVGL......LKNSR..FYNYTLCR.NGVKEKHGFDYSRDYLTDLITNDSI......................................TFFRIS.KKMYPHRPVLMVISHAAPH
>KIAA1247_hsa
GKYLNEYNGS...YVPPGWKEWVGL......LKNSR..FYNYTLCR.NGVKEKHGSDYSKDYLTDLITNDSV......................................SFFRTS.KKMYPHRPVLMVISHAAPH
>KIAA1247_mmu
GKYLNEYNGS...YVPPGWKEWVGL......LKNSR..FYNYTLCR.NGVKEKHGSDYSTDYLTDLITNDSV......................................SFFRTS.KKMYPHRPVLMVISHAAPH
>ARSB_spo
GKWHLGLTPDRY.PSKRGFKESFALLPGGGNHFA......YEPGTRE.....................................................................NPAVPFLPPLYTHNHDPVDH
>ARSB_dm1
GKWHLGHWKLKYTPLYRGFSSHWGLDMRNGTQVA......YDLHGH................................YTT........................DVITDHSVKVIANHNATKGPLFLYVAHAACH
>ARSB_cal
GKWHLGLKKPYW.PNKRGFNKSFTLLPGAGNH........YKYITRDSQGNQIPFLPAIYVEDDKELLQPEIELPDDFYST.........................NYFTDKAIEFIKETPQGKPFFGMITYTAPH
>SulfY_ptr
GKWHVGHSRWTQTPTFRGFQSFFGFYLGAQD.........YNTHIKQGERGNAYEMHWDARGKC.GRDCSRLVDERGNYST.......................HVFTREAIRVIENHPQRPHEPLFLYLAHQAVH
>SulfZYB_cii
GKWHLGFSSSKYAPWNRGFHGFYGFLAGSEN.........YWSKWLPMARHSNIG.....GVDFTDSTTGPTNETWGQYSA......................HVYAS..RARYVIQHH.DQSKPLFLYLPLQTPH
>ARSB_cii
GKWHVGYCDEAYTPTRRGFDSHYGFYNSGIS.........YSNYSSTEGTDV........GYDYR.DDLALNLAAEGKYTT......................TDFTD..QAKTLIDNH.DQTNPMFLYMAYNAPH
>ARSB_dm3
GKWHLGFSRPEYTPTRRGFDYHFGYWGAYID.........YFQRRSKMPVANYSL.....GYDFRR.NMELECRDRGVYVT......................DLLT..AEAERLIKDHADKEQPLFLMLSHLAAH
>ARSB_dm4
GKWHLGLSQRNFTPTERGFDRHLGYLGAYVD.........YYTQSYEQQNKGYN......GHDFR.DSLKSTHDHVGHYVT......................DLLTDAAVKEIEDHGSKNSSQPLFLLLNHLAPH
>ARSB_dm2
GKWHLGFWRKDLTPTMRGFDHHFGYYNGYID.........YYDHQVRMLDRNYSA.....GLDFRRD.LEPCPEANGTYAT......................EAFTS...EAKRIIEQHDKSKPLFMVLSHLAVH
>ARSB2_cii
GKWHLGFYKKECLPTSRGFDTFYGYYCGAED.........YYTKQVHANFHFGNKTRRVSGFDFHDN.SRTEWEANGTYSS......................YLYRD...RAVRIIKSHNSSIPLFMYLPFQSVH
>ARSB_hro
GKWHLGFYRKECLPTIRGFDTHYGYYCGNQD.........YYTK
>SulfY_ame
GKWHLGYPP.AFGPLRSGYEEFFGPMSGGVD.........YFT..HCSSNGTHD............LYLGEEEKQQDGYLT.....................DLITDHALDYVQRMAEGAKDGKPFFLSLHYTAPH
>SulfY_ava
GKWHLGYSSLNYTPTHRGFDSFYGFYNGPID.........YYRGIMEQEGH.........KGLDFWNGTHTVPLEERIYST......................TRFRD..QAESIIANRNSS.KPLFLYLAHQGVH
>sulfZ/Y_hpo
GKWHLGFYKQEYLPWNRGFDTYFGYLNAAED.........YFNHNVPWRQV.........RYLDLRDNNGPVRNETGQYSA........................HLFTGKAIDVVQSHNTS.KPLFLYLAYQSVH
GKWHLGMYRKECLPTRRGFDTYFGYLLGSEDYYSHERCTLIDALNVTRCALDFRDGEEVATGYKNMYSTNIFTKRAIALITNHPPEKPLFLYLALQSVH
GKWHLGMYRKECLPTRRGFDTYFGYLLGSEDYYTHEACAPIESLNGTRCALDLRDGEEPAKEYTNIYSTNIFTKRATPVIATHPPEKPLFLYLAFQSVH
GKWHLGMYRKECLPTRRGFDTYFGYLLGSEDYYTHEACAPIECLNGTRCALDLRDGEEPAKEYTDIYSTNIFTKRATTLIANHPPEKPLFLYLAFQSVH
GKWHLGMYRKECLPTRRGFDTYFGYLLGSEDYYSHERCALIDSLNVTRCALDFRDGEQVATGYKNMYSTNIFTERATALITSHPPEKPLFLYLALQSVH
.....................eeeee............ehhhhh..hhhh.......hhh...........hhhhhhhh.........hhhhhhhhhh.
...ee................eeeee......................eeee.............ee...ee......eee.......hhee.hhhh..
...ee................eeeee......................eeee..............e......hh.eeee........hhee.hhhh..
...ee................eeeee.........hhhhe......hhh......................hhhhhhhh.........hh..hhhh...
>ARSB_hsaXR
GKWHLGMYRKECLPTRRGFDTYFGYLLGSED.........YYSHERCTLIDALNVTRC..ALDFRDG.EEVATGYKNMYST......................NIFTK..RAIALITNHPP.EKPLFLYLALQSVH
>ARSB_mmu
GKWHLGMYRKECLPTRRGFDTYFGYLLGSED.........YYTHEACAPIESLNGTRC..ALDLRDG.EEPAKEYTNIYST......................NIFTK..RATPVIATHPP.EKPLFLYLAFQSVH
>ARSB_rno
GKWHLGMYRKECLPTRRGFDTYFGYLLGSED.........YYTHEACAPIECLNGTRC..ALDLRDG.EEPAKEYTDIYST......................NIFTK..RATTLIANHPP.EKPLFLYLAFQSVH
>ARSB_fca
GKWHLGMYRKECLPTRRGFDTYFGYLLGSED.........YYSHERCALIDSLNVTRC..ALDFRDG.EQVATGYKNMYST......................NIFTE..RATALITSHPP.EKPLFLYLALQSVH
GKWHLGFYRKECMPTRRGFDTFFGSLLGSGDYYTHYKCDSPGMCGYDLYENDNAAWDYDNGIYSTQMYTQRVQQILASHNPTKPIFLYIAYQAVH
GKWHLGFYRKDCMPTKRGFDTFFGSLLGSGDYYTHYKCDSPGVCGYDLYENDNAAWDYDNGIYSTQMYTQRVQQILATHDPTKPLFLYiAYQAVH
GKWHLGFYRRECMPTQRGFDTFFGSLLGSGDYYTHFKCDSPGICGYDLYENDNAAWDHDNGIYSTQMYTQKVQQILASHNPRKPIFLYXAYQAVH
GKWHLGFYKRGCLPTQRGFDTFFGSLLGSGDHYSHYKCEAPGMCGYDLYEGEEAAWEQDRGLYSTVMFTQKAISILAKHDPRKPLFLYLAYQAVH
GKWHLGLFTSNFLPHNRGFDHWVGTVGAGDHRYHRQCFNSMaCAYDLREGTNKDGVYEDKTRYDQKTEINEFQKIVDKHNTTNPLFAYLSFHAVH
GKWHLGLYKKEYTPLYRGFDSYYGYLEGGEDYYTYYNCDTFHWCGYDLRDMNEPVTDMNGTYSTHLYTKKAIDIINGASTGKaPFLLYLAYQAVH
..eeeeee.............ee.........eeee............................hhhhhhhhhhhhh.......eeeee.hhhh.
...ee................ee.........eeee........e.e................ehhhhhhhhhhhhh.......eeee.hhhh..
....eeee.............ee............................hhhhhhhh....eeeeehhhhhhhhh.......hhh.hhhhh..
...eeeee.............ee.ee........hh.h.hhhhh.....................hhhhhhhhhhhh.......hh...e.hh..
.....ee................e.......eeeee............................eee...eeeee.........hhhhhhhhh..
>SulfY_hsa
GKWHLGFYRKECMPTRRGFDTFFGSLLGSGD.........YYTHYKCDSPG.....MC..GYDLYENDNAAWDYDNGIYST......................QMYTQ..RVQQILASHNP.TKPIFLYIAYQAVH
>SulfY_mmu
GKWHLGFYRKDCMPTKRGFDTFFGSLLGSGD.........YYTHYKCDSPG.....VC..GYDLYENDNAAWDYDNGIYST......................QMYTQ..RVQQILATHDP.TKPLFLYVAYQAVH
>SulfY_gga
GKWHLGFYRRECMPTQRGFDTFFGSLLGSGD.........YYTHFKCDSPG.....IC..GYDLYENDNAAWDHDNGIYST......................QMYTQ..KVQQILASHNP.RKPIFLYXAYQAVH
>SulfY_fru
GKWHLGFYKRGCLPTQRGFDTFFGSLLGSGD.........HYSHYKCEAPG.....MC..GYDLYEGEEAAWEQDRGLYST......................VMFTQ..KAISILAKHDPHRKPLFLYLAYQAVH
>SulfY_odi
GKWHLGLFTSNFLPHNRGFDHWVGTVQGAGD.........HRYHRQCFNSPIKG*.MC..AYDLREGTNKDGVYEDKTRYD..................LNGTQKTEILTNEFQKIVDKHNTTNPLFAYLSFHAVH
>sulfY/Z_hpo
GKWHLGLYKKEYTPLYRGFDSYYGYLEGGED.........YYTYYNCDTFHNR*..WC..GYDLRDMNE.PVTDMNGTYST......................HLYTK..KAIDIINGASTGGKPFLLYLAYQAVH
GKWHLGFYKKECLPTRRGFDTYFGSLTGSVNYYTYDSCDGPGMCGFDLHEGESVAWSQKGKYSTHLYTQRVRKILATHDPSQPLFIFLSFQAVH
GKWHLGFYRKECLPTRRGFDTFLGSLTGNVDYYTYDNCDGPGVCGFDLHEGENVAWGLSGQYSTMLYAQRASHILASHSPQRPLFLYVAFQAVH
GKWHLGFYRKECLPTRRGFDTFLGSLTGNVDYYTYDNCDGPGVCGFDLHEGESVACGLSGQYSTMLYAQRASHILARHNPQNPLFLYVAFQAVH
GKWHLGFYKDEYLPWKRGFNSYFGYLTGGEDYYTKWRCDGKgLCGYDMhTSEKGPTNATGQYSANLFANKANEAIDKHDKTKPLFLYVAFQSVH
GKWHLGFFREEYLPWNRGFQNFFGFLNGGVNHFTRYHCEPKgFCGYDMIDSRYGPTNATGEYSTNLFIRKSKEMIDKHNKQKPMFLYLSLQAVH
...eee..............eee......eeeeee........ee.......eeeee........ehhhhhhhhhh.......eeeeeee....
...eeee.............hhe......eeeeee.......eee........eee......hhhhhhhhhhh.hh........eeeehhhhh.
..eeeee.............hhe......eeeeee.......eeee......eeee......hhhhhhhhhhhhhh........eeee..hh..
...eeee..............eeeee......eeeee......ee...............hhhhhhhhhhhhhhhhh......eeeeeee....
.....................hh.........eee............ee...............hh.hhhhhhhhhh......eeeeeeh....
>sulfZ_fru
GKWHLGFYKKECLPTRRGFDTYFGSLTGSVN.........YYTYDSCDGPG......MC..GFDLHEGESVAWSQK.GKYST......................HLYTQ..RVRKILATHDPSQPLFIFLSFQAVH
>SulfZ_hsa
GKWHLGFYRKECLPTRRGFDTFLGSLTGNVD.........YYTYDNCDGPG......VC..GFDLHEGENVAWGLS.GQYST......................MLYAQ..RASHILASHSPQRPLFLYVAFQAVH
>SulfZ_mmu
GKWHLGFYRKECLPTRRGFDTFLGSLTGNVD.........YYTYDNCDGPG......VC..GFDLHEGESVACGLS.GQYST......................MLYAQ..RASHILARHNPQNPLFLYVAFQAVH
>sulfZ/Y_cii
GKWHLGFYKDEYLPWKRGFNSYFGYLTGGED.........YYTKWRCDGKg......LC..GYDM.TSEKGPTNAT.GQYSA......................NLFAN..KANEAIDKHDKTKPLFLYVAFQSVH
>sulfY/Z_cii
GKWHLGFFREEYLPWNRGFQNFFGFLNGGVN.........HFTRYHCEPKg......FC..GYDMIDSRYGPTNAT.GEYST......................NLFIR..KSKEMIDKHNKQKPMFLYLSLQAVH
1AUK
GKWHLGVGPEGAFLPPHQGFHRFLGIPYSHDQGPCQNLTCFPPATPCDGGCDQGLVPIPLLANLSVEAQPPWLPGLEARYMAFAHDLMADAQRQDRPFFLYYASHHTH
e sb gggtttggggt seeeeess ttssb ttsbsbtttee ss bs ss eeetteesees hhhhthhhhhhhhhhhhhhhhtt eeeeee tts
e....................eeeee..................ee.............eee..ee.ee...hhhh.hhhhhhhhhhhhhhhh....eeeeee.....
GKWHLGVGPEGAFLPPHQGFHRFLGIPYSHDQGPCQNLTCFPPATPCDGGCDQGLVPIPLLANLSVEAQPPWLPGLEARYMAFAHDLMADAQRQDRPFFLYYASHHTH
GKWHLGVGPEGAFLPPHQGFHRFLGIPYSHDQGPCQNLTCFPPDIPCKGGCDQGLVPIPLLANLTVEAQPPWLPGLEARYVSFSRDLMADAQRQGRPFFLYYASHHTH
GKWHLGVGPEGAFLPPHHGFHRFLGIPYSHDQGPCQNLTCFPPATPCEGICDQGLVPIPLLANLSVEAQPPWLPGLEARYVAFARDLMTDAQHQGRPFFLYYASHHTH
GKWHLGVGPEGAFLPPHQGFHRFLGIPYSHDQGPCQNLTCFPPSTPCDGSCDQGLVPVPLLANLSVEAQPPWLPGLEARYVAFARDLMADAQRQGRPFFLYYASHHTH
GKWHLGLGARGSFLPIHQGFDHFLGVPYSHDQGPCQNLTCFPPDIKCFGTCDQGLVPVPLFWNQSIVQQPVSFLIWCRLQQICTGLHLPTAPGEArpfLLYYASHHTH
GKWHLGIGANGTFLPTRQGFDQYLGIPYSHEMGPCQNLTCFPPDVKCFGLCDVGTVTVPLMYNEVIKQQPVNFLDLENAYRDFASDFISTSAKKRQPFFLYFPSHHTH
..eee................ee....................................................hhhhhhhhhhhhhhhhhh....eeeeee.....
..eee................ee.......................................................h..hhhhhhhhhhhh....eeeeee.....
..eee.................e....................................................hhhhhhhhhhhhhhhhhh....eeeeee.....
..eee................ee....................................................hhhhhhhhhhhhhhhhhh....eeeeee.....
..eee.................e....................................e............hhhhhhhhhh...............eeeee......
..eeee........................................ee.......eeeee..hhhh......h.hhhhhhhhhhhhhhhh.......eeeee......
>ARSA_hsa
GKWHLGVGP.....EGAFLPPHQGFHR.FLGIPYSHDQGPCQN......LTCFPPATPCDGGCDQGLVP.............IPLLANLSVEAQPPWLPGLEARY...MAFAHDLMADAQRQDRPFFLYYASHHTH
>ARSA_mmu
GKWHLGVGP.....EGAFLPPHQGFHR.FLGIPYSHDQGPCQN......LTCFPPDIPCKGGCDQGLVP.............IPLLANLTVEAQPPWLPGLEARY...VSFSRDLMADAQRQGRPFFLYYASHHTH
>ARSA_bta
GKWHLGVGP.....EGAFLPPHHGFHR.FLGIPYSHDQGPCQN......LTCFPPATPCEGICDQGLVP.............IPLLANLSVEAQPPWLPGLEARY...VAFARDLMTDAQHQGRPFFLYYASHHTH
>ARSA_ssc
GKWHLGVGP.....EGAFLPPHQGFHR.FLGIPYSHDQGPCQN......LTCFPPSTPCDGSCDQGLVP.............VPLLANLSVEAQPPWLPGLEARY...VAFARDLMADAQRQGRPFFLYYASHHTH
>ARSA_gga
GKWHLGLGA.....RGSFLPIHQGFDH.FLGVPYSHDQGPCQN......LTCFPPDIKCFGTCDQGLVP.............VPLFWNQSIVQQPVSFLIWCRLQ...QICTGLHLPTAPGEArpfLLYYASHHTH
>ARSA_fru
GKWHLGIGA.....NGTFLPTRQGFDQ.YLGIPYSHEMGPCQN......LTCFPPDVKCFGLCDVGTVT.............VPLMYNEVIKQQPVNFLDLENAY...RDFASDFISTSAKKRQPFFLYFPSHHTH
GKWHLGHHGSYHPNFRGFDYYFGIPYSHDMGCTDTPGYNHPPCPACPQGDGPSRNLQRDCYTDVALPLYENLNIVEQPVNLSSLAQKYAEKATQFIQRASTSGRPFLLYVALAHMH
GKWHLGHHGSYHPNFRGFDYYFGIPYSNDMGCTDAPGYNYPPCPACPQRDGLWRNPGRDCYTDVALPLYENLNIVEQPVNLSGLAQKYAERAVEFIEQASTSGRPFLLYVGQAHMH
GKWHLGHHGSYHPSFRFDYYYFGIPYSNDMGCTDNPGYNYPPCPACPQSDGRWRNPDRDCYTDVALPLYENLNIVEQPVNLSGLAQKYAERAVEFIEQASTSGRPFLLYVGLAHMH
GKWHLGHHGSYHPSFRFDYYYFGIPYSHDMGCTDTPGYNYPPCPACPRRHQPSRNLERDCYSDVALPLYENLNIVEQPVNLSGLARKYAEKATQFIQQARASGRPFLLYVGLAHMH
GKWHLGHNGPYRPNRRGFDYYYGVPYSNDMGCTDVPGYNLPQCPPCDPPSGPsrSRHDGCYSKVALPLIENTTIVQQPLNLWRLTEQYKSAATRIIQNARAQGQPYFLYIALAHMH
GKWHLGITKAYHPCSRGFNYYYGLPYSNDMGCVDCDAYNHPQCKKCPKQSGITNDQAIECGYDTALPLYENYDIIEQPANLVELGDRYVEKATLFIQQAKNKTQPFFLYVATAHTH
...................eee...................................................ee.....hhhhhhhhhhhhhhhhhhh.......eeehhhhhh.
...................eee...................................................ee......hhhhhhhhhhhhhhhhhh......eeeee......
..e.............eeeeee...................................................ee......hhhhhhhhhhhhhhhhhh......eeeeee.....
..e.............eeeeee...................................................ee.....hhhhhhhhhhhhhhhhhhhhh....eeeeee.....
..ee................ee.........................................e.........ee...hhhhhhhhhhhhhhhhhhhhhh.....h.eeehhhhh.
...ee.e.............e....................................eh...................hhhhhh.hhhhhhhhhhhhhh......eeeeee.....
>KIAA1001_hsa
GKWHLGHHG.......SYHPNFRGFDY.YFGIPYSHDMG.CTD......TPGYNHPPCPACPQGDGPSRNLQRDCYTDVA..LPLYENLNIVEQPVNLSSLAQKY...AEKATQFIQRASTSGRPFLLYVALAHMH
>KIAA1001_mmu
GKWHLGHHG.......SYHPNFRGFDY.YFGIPYSNDMG.CTD......APGYNYPPCPACPQRDGLWRNPGRDCYTDVA..LPLYENLNIVEQPVNLSGLAQKY...AERAVEFIEQASTSGRPFLLYVGQAHMH
>KIAA1001_rno
GKWHLGHHG.......SYHPSFRFDYY.YFGIPYSNDMG.CTD......NPGYNYPPCPACPQSDGRWRNPDRDCYTDVA..LPLYENLNIVEQPVNLSGLAQKY...AERAVEFIEQASTSGRPFLLYVGLAHMH
>KIAA1001_ssc
GKWHLGHHG.......SYHPSFRFDYY.YFGIPYSHDMG.CTD......TPGYNYPPCPACPRRHQPSRNLERDCYSDVA..LPLYENLNIVEQPVNLSGLARKY...AEKATQFIQQARASGRPFLLYVGLAHMH
>KIAA1001_fru
GKWHLGHNG.......PYRPNRRGFDY.YYGVPYSNDMG.CTD......VPGYNLPQCPPCDPPSGPsrSRHDGCYSKVA..LPLIENTTIVQQPLNLWRLTEQY...KSAATRIIQNARAQGQPYFLYIALAHMH
>KIAA1001_cii
GKWHLGITK.......AYHPCSRGFNY.YYGLPYSNDMG.CVD......CDAYNHPQCKKCPKQSGITNDQAIECGYDTA..LPLYENYDIIEQPANLVELGDRY...VEKATLFIQQAKNKTQPFFLYVATAHTH
GKWHLGHRPQFHPLKHGFDEWFGSPNCHFGPYDNKARPNIPVYRDWEMVGRYYEEFPINLKTGEANLTQIYLQEALDFIKRQARHHgPFFLYWAVDATH
GKWHLGHRPQFHPLKHGFDEWFGSPNCHFGPYDNKAKPNIPVYRDWEMVGRFYEEFPINRKTGEANLTQLYLQEALDFIRTQHARQGPFFLYWAIDATH
GKWHLGHRPQFHPLKHGFDEWFGSPNCHFGPYDNKVKPNIPVYRDWEMVGRFYEEFPINLKTGEANLTQLYLQEALDFIRTQHARQSPFFLYWAIDATH
gkwHLGHRPQFHPLKHGFDEWFGSPNCHFGPYDNKARPNIPVYRDWEMVGRYYEEFPINLKTGEANLTQIYLQEALDFIKRQhARHHPFFLYWAVDATH
GKWHLGHRPQFHPLKHGFDEWFGSPNCHFGPYDNKARPNIPVYRDQEMVGRFYEEFPINLKTGEANLTQIYLQEALEFIQRQQAAHRPFFLYWAVDATH
GKWHLGHRPQFHPLKHGFDEWFGSPNCHFGPYDNRARPNIPVYRDWEMVGRFYEEFPINLKTGESNLTQIYLQEALDFIKRQQATHHPFFLYWAIDATH
GKWHLGHRPQFHPLKHGFDEWFGSPNCHFGPYDNRALPNIPVYRDWEMIGRYYEDFKIDLRTGEANLTQIYLQEMYGPAQAALVFHHPFFLYWAIDATH
GKWHLGHRPSYHPLRHGFDEWFGSXNCHFGPYDNKQIPNIPLDRDWNMGGRYYEEFNIIIKTGESNLTRSSLQeGLDFIQSQAEAQKPFFLYWAPDATH
GKWHLGHRPQYLPLEHGFDEWFGAPNCHFGPYNNSVRPNIPVYRNSWMLGRYYEEFKIDKKTGESNLTQMYLLEGLDFIQSQAEAQKPFFLYWAPDATH
GKWHLGHRAHHLPLEHGFDEWFGAPNCHFGPYNSSDRPNVPVYNNSQMVGRYYEEFGIDKKTGESNLTQIYLEEGLDFIFHQNMAQRPFFLYWAADATH
GKWHLGHRAHHLPLEHGFDEWFGAPNCHFGPYNDSSRPNIPVYNNSEMKGRYYEEFEINVKTGESNLTQLYLKEGLDFISQQAMAQRPFFLYWAPDATH
GKWHLGQQEQYLPLKHGFHEWFGSPNCHFGPYDDKTTPNIPVYNNTEMVGRYYEEFAIESHKYLSNMTQYYIQEALDFIERMERNEKPFFLYWAPDATH
.............................................hhhhhhhhh...........hhhhhhhhhhhhhhhhhh......eeeeee....
.............................................hhhhhhhhh..........hhhhhhhhhhhhhhhhhhhh.....eeeeee....
.............................................hhhhhhhhhh.........hhhhhhhhhhhhhhhhhhh......eeeeee....
.............................................hhhhhhhhh...........hhhhhhhhhhhhhhhhhh......eeeeee....
.............................................hhhhhhhhhh..........hhhhhhhhhhhhhhhhhhh....eeeeeee....
.............................................hhhhhhhhh............hhhhhhhhhhhhhhhhhh.....eeeeee....
.............................................hhhhhh..hhhee.......hhhhhhhhhh...hhhhhhh....eeeeee....
.......................................................eeeeee.......hhhhhhhhhhhhhhhhh...eeeee......
....................ee........................hhhhhhhhhhe.........hhhhhhhhhhhhhhhhhhh...eeeee......
....................e..........................hhhhhhhh...........hhhhhhhh..hhe..hhh.....eeeeeh....
....................ee...................e.........hhhheeeee......hhhhhhhh..hhhhhhhhhh...eeee......
........h..........e..........................hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh....eeeee......
>GALNS_hsa
GKWHLGHRP.......QFHPLKHGFDEWFGSPN...CHFGPYDNKARPNIPVYRDWEMVG........................RYYEEFPINLKTGEANLTQIYLQE...ALDFIKRQARHHgPFFLYWAVDATH
>GALNS_mmu
GKWHLGHRP.......QFHPLKHGFDEWFGSPN...CHFGPYDNKAKPNIPVYRDWEMVG........................RFYEEFPINRKTGEANLTQLYLQE...ALDFIRTQHARQGPFFLYWAIDATH
>GALNS_rno
GKWHLGHRP.......QFHPLKHGFDEWFGSPN...CHFGPYDNKVKPNIPVYRDWEMVG........................RFYEEFPINLKTGEANLTQLYLQE...ALDFIRTQHARQSPFFLYWAIDATH
>GALNS_mac
...HLGHRP.......QFHPLKHGFDEWFGSPN...CHFGPYDNKARPNIPVYRDWEMVG........................RYYEEFPINLKTGEANLTQIYLQE...ALDFIKRQ.ARHHPFFLYWAVDATH
>GALNS_bta
GKWHLGHRP.......QFHPLKHGFDEWFGSPN...CHFGPYDNKARPNIPVYRDQEMVG........................RFYEEFPINLKTGEANLTQIYLQE...ALEFIQRQQAAHRPFFLYWAVDATH
>GALNS_ssc
GKWHLGHRP.......QFHPLKHGFDEWFGSPN...CHFGPYDNRARPNIPVYRDWEMVG........................RFYEEFPINLKTGESNLTQIYLQE...ALDFIKRQQATHHPFFLYWAIDATH
>GALNS_gga
GKWHLGHRP.......QFHPLKHGFDEWFGSPN...CHFGPYDNRALPNIPVYRDWEMIG........................RYYEDFKIDLRTGEANLTQIYLQE...MYGPAQAALVFHHPFFLYWAIDATH
>GALNS_xla
GKWHLGHRP.......SYHPLRHGFDEWFGSXN...CHFGPYDNKQIPNIPLDRDWNMGG........................RYYEEFNIIIKTGESNLTRSSLQ....GLDFIQSQAEAQKPFFLYWAPDATH
>GALNS_fru
GKWHLGHRP.......QYLPLEHGFDEWFGAPN...CHFGPYNNSVRPNIPVYRNSWMLG........................RYYEEFKIDKKTGESNLTQMYLLE...GLDFIQSQAEAQKPFFLYWAPDATH
>GALNS_omy
GKWHLGHRA.......HHLPLEHGFDEWFGAPN...CHFGPYNSSDRPNVPVYNNSQMVG........................RYYEEFGIDKKTGESNLTQIYLEE...GLDFIFHQNMAQRPFFLYWAADATH
>GALNS_dre
GKWHLGHRA.......HHLPLEHGFDEWFGAPN...CHFGPYNDSSRPNIPVYNNSEMKG........................RYYEEFEINVKTGESNLTQLYLKE...GLDFISQQAMAQRPFFLYWAPDATH
>GALNS_cii
GKWHLGQQE.......QYLPLKHGFHEWFGSPN...CHFGPYDDKTTPNIPVYNNTEMVG........................RYYEEFAIESHKYLSNMTQYYIQE...ALDFIERMERNEKPFFLYWAPDATH
GKVFHPGISSNHTDDSPYSWSFPPYHPSSEKYENTKTCRGPDGELHANLLCPVDVLDVPEGTLPDKQSTEQAIQLLEKMKTSASPFFLAVGYHKPH
GKVFHPGISSNHSDDYPYSWSFPPYHPSSEKYENTKTCKGQDGKLHANLLCPVDVADVPEGTLPDKQSTEEAIRLLEKMKTSGSPFFLAVGYHKPH
GKVFHPGISSNHSDDYPYSWSFPPYHPSSEKYENTKTCKGQDGKLHTNLLCPVDVADVPEGTLPDKQSTEEAITLLEKMKTSVSPFFLAVGFHKPH
GKVFHPGISSNYSDDYPYSWSIPPFHPSTEKYENDKTCRGKDGRLYANLVCPIDVTEMPGGTLPDIETTEEAIRLLNVMKTKKQKFFLAVGYHKPH
GKIFHPGISSNHSDDYPYSWSVYPYHPSAEKYENSQTCKGKDGKLHANLVCPVDVSEVPEGTLPDIQSTEEAIRLLKTVKQQNASFFLAVGYHKPH
GKVFHPGIASNHTDDYPYSIWSPPYHPASLHFEKQKMCKGDDGQLHANLLCAVNVTEQPGGTLPDLESTEEAIGLLKGRVQNTQPFFLAVGFHKPH
GKVFHPGIASNHSDDYPYSWSVPPYHPPSFEYEKRKVCKDKDGTLHSNLLCPVNVSEMPLGTLPDMENTEEAIRLLRSMKGSQKPfFLSVGFYKPH
GKVFHPGIASNHSDDYPYSWSVPPYHPPSFKYENMKVCKGSDGKLHANLLCSVNVSETPLGTLPDMESTEEAIRLLKSTRNSGKNFFLAVGFHKPH
GKVFHPGICSNYNDDFPLSWSLPACHPPTQKYKMKQVCPGPDGKLHMNLLCPVNVSTQPEHSLPDIQSAGHAIMIRKFSNNKSQPFFLAVGFHKPH
GKVFHPGLSSNNTDDYPLSWSAPAFRPRTEQFMNSPVCPDKEGILRKNLICPVELQTQPYKTLPDIESVAEAIFVGSRSRHSQEPFFLAMGFHKPH
GKVFHPGASSNFTDDFPLSWSEPAFHPLTDEYSNAAVCIDPDGRLKRNLLCPVRLETQPLHTLPDIESTEEAiKRFLSTVGLSQPYFLAVGYRKPH
GKVFHPGKSSNFTDDYPYSWSEYPYHPPTEMYKDAKVCRNKTKKLERNLICPVSVKRQPGQSLPDLQSLDYAiIDFLNTVGLSQPYFLAVGYRKPH
......................................................e.............hhhhhhhhhhhh.....eeeee......
....................................................................hhhhhhhhhhhh.....eeeee......
....................................................................hhhhhhhhhhhhh....eeeee......
...e........................................eeeee.................hhhhhhhhhhhhhhhhh..eeeee......
..ee...............eee......................ee..ee................h.hhhhhhhhhhhhhh..eeeeee......
..............................hhhhhh.........hhh.eeeee............hhhhhhhhhhh........eeeee......
...............................hhhhhhh............................hhhhhhhhhhhhh......eeeeee.....
..................................eeee........hheeeee.............hhhhhhhhhhhhh......eeeeee.....
..ee.............................eeee........e..e..................hhhhheeehh........eeeeee.....
...........................................hhhh..................hhhhhhhhee..........hhh.h......
..ee.........................hhhhhhhhhhhhhhhhhh..................hhhhhhhhhhhhh.......eeeee......
>IDS_hsa
GKVFHPGISSNHTDDSPYS.................WSFPPYHPSSEKYENTKTCRGPDGELHANLLCPVDVLDVPEGTLPDKQSTEQAI.......................QLLEKMKTSASPFFLAVGYHKPH
>IDS_mmu
GKVFHPGISSNHSDDYPYS.................WSFPPYHPSSEKYENTKTCKGQDGKLHANLLCPVDVADVPEGTLPDKQSTEEAI.......................RLLEKMKTSGSPFFLAVGYHKPH
>IDS_rno
GKVFHPGISSNHSDDYPYS.................WSFPPYHPSSEKYENTKTCKGQDGKLHTNLLCPVDVADVPEGTLPDKQSTEEAI.......................TLLEKMKTSVSPFFLAVGFHKPH
>IDS_gga
GKVFHPGISSNYSDDYPYS.................WSIPPFHPSTEKYENDKTCRGKDGRLYANLVCPIDVTEMPGGTLPDIETTEEAI.......................RLLNVMKTKKQKFFLAVGYHKPH
>IDS_str
GKIFHPGISSNHSDDYPYS.................WSVYPYHPSAEKYENSQTCKGKDGKLHANLVCPVDVSEVPEGTLPDIQSTEEAI.......................RLLKTVKQQNASFFLAVGYHKPH
>IDS_fru
GKVFHPGIASNHTDDYPYS.................IWSPPYHPASLHFEKQKMCKGDDGQLHANLLCAVNVTEQPGGTLPDLESTEEAI.......................GLLKGRVQNTQPFFLAVGFHKPH
>IDS_dre
GKVFHPGIASNHSDDYPYS.................WSVPPYHPPSFEYEKRKVCKDKDGTLHSNLLCPVNVSEMPLGTLPDMENTEEAI.......................RLLRSMKGSQKPfFLSVGFYKPH
>IDS_omy
GKVFHPGIASNHSDDYPYS.................WSVPPYHPPSFKYENMKVCKGSDGKLHANLLCSVNVSETPLGTLPDMESTEEAI.......................RLLKSTRNSGKNFFLAVGFHKPH
>IDS_cii
GKVFHPGICSNYNDDFPLS.................WSLPACHPPTQKYKMKQVCPGPDGKLHMNLLCPVNVSTQPEHSLPDIQSAGHAI.......................MIRKFSNNKSQPFFLAVGFHKPH
>IDS_dme
GKVFHPGLSSNNTDDYPLS.................WSAPAFRPRTEQFMNSPVCPDKEGILRKNLICPVELQTQPYKTLPDIESVAEAI.......................FVGSRSRHSQEPFFLAMGFHKPH
>IDS_aga
GKVFHPGASSNFTDDFPLS.................WSEPAFHPLTDEYSNAAVCIDPDGRLKRNLLCPVRLETQPLHTLPDIESTEEAi.......................KRFLSTVGLSQPYFLAVGYRKPH
>IDS_bmo
GKVFHPGKSSNFTDDYPYS.................WSEYPYHPPTEMYKDAKVCRNKTKKLERNLICPVSVKRQPGQSLPDLQSLDYAi.......................IDFLNTVGLSQPYFLAVGYRKPH
WMDIMEKHGYQTQKFGKVDYTSGHHSISNRVEAWTRDVAFLLRQEGRPIINLIPDKNRRRVMTKDWQNTDKAIEWLRQeaVNYTKPFVLYLGLNLPH
WMDVMEKHGYQTQKFGKLDYSSGHHSISNRVEAWTRDVAFLLRQEGRPIINLIPDKNRRRVMDKDWQNTDKAIAWLRQeaVNSTKPFVLYLGLNLPH
WMDVMERHGYRTQKFGKLDYTSGHHSISNRVEAWTRDVAFLLRQEGRPMVNLIRNRTKVRVMERDWQNTDKAVNWLRKEAINYTEPFVIYLGLNLPH
WMDVMEKHGYRTQKFGKLDYTSGHHSISNRVEAWTRDVAFLLRQEGRPMVNLAPKKTKVRVMQVDWKNTDRAVNWLRKEASNSTQPFVLYLGLNLPH
WMDVMEKHGYRTQKFGKLDYTSGHHSISNRVEAWTRDVAFLLRQEGRPMVNLISKKTKVRVMEGDWKNTDKAVKWLRKEAMNYTQPFVLYLGLNLPH
WMDLMQKHGYYTQKYGKLDYTSGHHSVSNRVEAWTRDVEFLLRQEGRPKVNLTGDRRHVRVMKTDWQVTDKAVTWIKKEAVNLTQPFALYLGLNLPH
WMDLMQKHGYYTQKYGKLDYTSGSHSLSNRVEAWTRDVPFLLRQEGRPCANLTGNKTQTRVMALDWKNADTATAWIQKAAQNHSQPFFLYLGLNLPH
WMDLLEVNGYLTKMMGKLDYTSGSHSVSNRVEAWTRDVQFLLRQEGRPVTQLVGNMSTVRIMGKDWENIDKATQWIQQRAESSQQPFALYLGLNLPH
WMDMLQQNGYNTLSVGKLDYTSGSHSVSNRVEAWTRDVPFLLRQEGRPVTDLVGDASTTRVMTKDWRTTDIATQWIRHKAAALSQPFALYLGLNLPH
WMDLLEENGYLTKMMGKLDFTSGSHSVSNRVEAWTRDVPFLLTQEGRPVSQLVGNTSTIKVMKKDWQNTDQASQWIRHRAAFSNQPFALYLGLNLPH
..hhhhh.........eee........hhhhhhhhhhhhhhhhh....eeee.......eeee......hhhhhhhhhhhh.....eeeee......
hhhhhhh..........e..........hhhhhhhhhhhhhhhh....eeee.......e.........hhhhhhhhhhhh.....eeeee......
hhhhhhh..........e.........hhhhhhhhhhhhhhhhh.....eee......eeeeh......hhhhhhhhhhh......eeeee......
hhhhhhh..........e.........hhhhhhhhhhhhhhhhh....eeee.....eeeeeee.....hhhhhhhhh........eeeee......
hhhhhhh..........e.........hhhhhhhhhhhhhhhhh.....eeee....eeeeee......hhhhhhhhhhh......eeeee.....h
hhhhhhh...ee.....ee.........hhhhhhhhhhhhhhhh.....eee.....eeeee.......hhhhhhhhhh.......hheee.....h
hhhhhhh...ee.....e..........hhhhhhh.....hh................eeeeee......hhhhhhhhhhh.....eeeee......
hhhhhhh..hhhhhh.............hhhhhhhhhhhhhhhh.....eeee....eeeee....hhhhhhhhhhhhhh......h.eee.....h
hhhhhh.....eeee..e..........hhhhhhh....hhhh......eeee......eeee.......hhhhhhhhhhhhh...hheee......
hhhhhh....hhhh..............hhhhhh.....eeee......eeee....eeeeee.......hhhhhhhhhhh......eeee......
>SulfX_mmu
WMDIMEKHGYQTQKFGKVDYTSGHHSISNRVEAWTRDV................AFLLRQEGRPIINLIPDKNRRRVMTKDWQNTDKAIEWLRQ.......................eaVNYTKPFVLYLGLNLPH
>SulfX_rno
WMDVMEKHGYQTQKFGKLDYSSGHHSISNRVEAWTRDV................AFLLRQEGRPIINLIPDKNRRRVMDKDWQNTDKAIAWLRQ.......................eaVNSTKPFVLYLGLNLPH
>SulfX_hsa
WMDVMERHGYRTQKFGKLDYTSGHHSISNRVEAWTRDV................AFLLRQEGRPMVNLIRNRTKVRVMERDWQNTDKAVNWLRK.......................EAINYTEPFVIYLGLNLPH
>SulfX_bta
WMDVMEKHGYRTQKFGKLDYTSGHHSISNRVEAWTRDV................AFLLRQEGRPMVNLAPKKTKVRVMQVDWKNTDRAVNWLRK.......................EASNSTQPFVLYLGLNLPH
>SulfX_ssc
WMDVMEKHGYRTQKFGKLDYTSGHHSISNRVEAWTRDV................AFLLRQEGRPMVNLISKKTKVRVMEGDWKNTDKAVKWLRK.......................EAMNYTQPFVLYLGLNLPH
>SulfX_gga
WMDLMQKHGYYTQKYGKLDYTSGHHSVSNRVEAWTRDV................EFLLRQEGRPKVNLTGDRRHVRVMKTDWQVTDKAVTWIKK.......................EAVNLTQPFALYLGLNLPH
>SulfX_xla
WMDLMQKHGYYTQKYGKLDYTSGSHSLSNRVEAWTRDV................PFLLRQEGRPCANLTGNKTQTRVMALDWKNADTATAWIQK.......................AAQNHSQPFFLYLGLNLPH
>SulfX_fru
WMDLLEVNGYLTKMMGKLDYTSGSHSVSNRVEAWTRDV................QFLLRQEGRPVTQLVGNMSTVRIMGKDWENIDKATQWIQQ.......................RAESSQQPFALYLGLNLPH
>SulfX_omy
WMDMLQQNGYNTLSVGKLDYTSGSHSVSNRVEAWTRDV................PFLLRQEGRPVTDLVGDASTTRVMTKDWRTTDIATQWIRH.......................KAAALSQPFALYLGLNLPH
>SulfX_ola
WMDLLEENGYLTKMMGKLDFTSGSHSVSNRVEAWTRDV................PFLLTQEGRPVSQLVGNTSTIKVMKKDWQNTDQASQWIRH.......................RAAFSNQPFALYLGLNLPH
GKKHVGPETVYPFDFAYTEENGSVLQVGRNITRIKLLVRKFLQTQDDRPFFLYVAFHDPH
GKKHVGPEMVYPFDFAYTEENGSVLQVGRNITRIKLLVRKFLQTRGDRPFFLYVAFHDPH
GKKHVGPEAVYPFDFAHTEENDSILQVGRNITRMKLLVRKFLQTQDDRPFFLYVAFHDPH
GKKHVGPETVYPFDFAhTEENSSVMQVGRNITRIKQLVQKFLQTQDDRPFFLYVAFHDPH
GKKHVGPESVYPFEFAHTEENSSVLQVGRNITRIKLLVRKFLQTQDDRPFFLYVAFHDPH
GKKHVGPGSVYPFDFAYTEENGSVLQVGRNITRIKLLVRKFFQSQEERPFFLYVAFHDTH
GKKHVAPEAVYPFDFAETEENNSILQVGRNITRMKELAKQFFSMQLKESFLLYIGFHDPH
GKKHVGAANNFRFDFEQTEEQHSINQIGRNITRMKEYARQFLQAKDEKPFFLMVGFHDPH
..............eee......eeee..hhhhhhhhhhhhhh......eeeeeee....
.........e....eee......eeee...hhhhhhhhhhhhh......eeeeeee....
......................hhhhhhhhhhhhhhhhhhhhh......eeeeeee....
.......................ee.h.hhhhhhhhhhhhhhh......eeeeeee....
............eee........eeee..hhhhhhhhhhhhhh......eeeeeee....
..............eee......eeee...hhhhhhhhhhhh.......eeeeeee....
.........................hh..hhhhhhhhhhhhhhhhhhhh.eeeee.....
...........ee....hhhhhhhhhh.hhhhhhhhhhhhhhhh.....eeeeee.....
>SGSH_hsa
GKKHVGPETVYPFDFA..YTEENGSVLQVGRNITRIKLLV..............................................................RKFLQ...........TQDD.RPFFLYVAFHDPH
>SGSH_bta
GKKHVGPEMVYPFDFA..YTEENGSVLQVGRNITRIKLLV..............................................................RKFLQ...........TRGD.RPFFLYVAFHDPH
>SGSH_ssc
GKKHVGPEAVYPFDFA..HTEENDSILQVGRNITRMKLLV..............................................................RKFLQ...........TQDD.RPFFLYVAFHDPH
>SGSH_mmu
GKKHVGPETVYPFDFA..hTEENSSVMQVGRNITRIKQLV..............................................................QKFLQ...........TQDD.RPFFLYVAFHDPH
>SGSH_clu
GKKHVGPESVYPFEFA..HTEENSSVLQVGRNITRIKLLV..............................................................RKFLQ...........TQDD.RPFFLYVAFHDPH
>SGSH_fru
GKKHVGPGSVYPFDFA..YTEENGSVLQVGRNITRIKLLV..............................................................RKFFQ...........SQEE.RPFFLYVAFHDTH
>SGSH_cii
GKKHVAPEAVYPFDFA..ETEENNSILQVGRNITRMKELA..............................................................KQFFS...........MQLK.ESFLLYIGFHDPH
>SGSH_dme
GKKHVGAANNFRFDFE..QTEEQHSINQIGRNITRMKEYA..............................................................RQFLKQ..........AKDE.KPFFLMVGFHDPH
GKWHLGMSCHSKTDFCHHPLHHGFNYFYGISLTNLRDCKPGEGSVFTTGFKRLVFLPLQIVGVTLLTLAALNCLGLLHVPLGVFFSLLFLAALILTLFLGFLHYFRPLNCFMMRNYEIIQQPMSYDNLTQRLTVEAAQFIQRNTETPFLLVLSYLHVH
GKWHLGLSCRGATDFCHHPLRHGFDRFLGVPTTNLRDCRPGAGTVFGPALRVFAAGPLAALGASLAAMAAARWAGLARVPGWALAGTAAAMLAVGGPRSASCLGFRPANCFLMDDLAVAQRPTDYGGLTRRLADEAALFLRRNRARPFLLFLSFLHVH
GKWHLGLSCQAASDFCHHPGRHGFDRFLGTPTTNLRDCKPGGGTVFGSAQQVFVVLPMNILGAVLLAMALARWAGLARPPGWVFGVTVAAMAAVGGAYVAFLYHFRPANCFLMADFTITQQPTDYKGLTQRLASEAGDFLRRNRDTPFLLFLSFMHVH
GKWHLGLNCESRDDHCHHPNAHGFNYFFGIPLTNLRDCQPGHGTVFQIHKYLPYRTLGTVLASTVLLYFSGLISSAEKGPFAFWLQRFWSCSFIVGAYVAFIMIIPLFNCVLMKDHSIVEQPFVSENLTQRMTREAVDFIERNSARPFLLFFSFLQVH
GKWHLGVNCERRGDHCHHPNQHGFSYFYGLPFTLFNDCVPGEGSDILADLQKTLRSFTIFLGIGLATLVRLIVVFQASFYSLRLLVVLFFISILATVLWLPFKFIPTWNCILMRNQEVVEQPMVVETLPRRLLTEAQQFIKRNVDRPFLLFFSMAHIH
GKWHQGVNCASRGDHCHHPLNHGFDYFYGMPFTLTNDCDPGRPPEVDAALRAQLWGYTQFLALGILTLAAGQTCGFFSVSARAVTGMAGVGCLFFISWYSSFGFVRRWNCILMRNHDVTEQPMVLEKTASLMLKEAVSYIERHKHGPFLLFLSLLHVH
GKWHLGLNCESASDHCHHPLHHGFEHFYGMPFSLMGDCARWELSEKRVNLEQKLNFLFQVLALVALTLVAGKLTHLIPVSWMPVIWSALSAVLLLASSYFVGALIVHADCFLMRNHTITEQPMCFQRTTPLILQEVASFLKRNKHGPFLLFVSFLHVH
GKWHLGLSCASPDDHCHHPLNHGFDHFYGMPFSMMADCERWELSEKRAVLESRLDVCFQLVALATLTLTIGKLTHLIPGASWTLVIWSAVVCLLLFATSCVGALIMHADCFLMRNHSIAEQPMRSQRTTPLMLQEVSSFVKRHKQGPFLLFVSFLHVH
GKWHQGLNCDSRSDQCHHPYNYGFDYYYGMPFTLVDSCWPDPSRNTELAFESQLWLCVQLVAIAILTLTFGKLSGWVSVPWLLIFSMILFIFLLGYAWFSSHTSPLYWDCLLMRGHEITEQPMKAERAGSIMVKEAISFLERHSKETFLLFFSFLHVH
GKWHLGLSCASRNDHCYHPLNHGFHYFYGVPFGLLSDCQASKTPELHRWLRIKLWISTVALALVPFLLLIPKFARWFSVPWKVIFVFALLAFLFFTSWYSSYGFTRRWNCILMRNHEIIQQPMKEEKVASLMLKEALAFIERYKREPFLLFFSFLHVH
.....e...................eeeeeee............eeee...eeee.......hhhhhhhhhhhh........hhhhhhhhhhhhhhhhhhhhhh...hhhhhhhhhh.h.......hhhhhhhhhhhhhhhh.....eeeeeeee...
..eee.e.........................................hhhhhh...hhhhhhhhhhhhhhhhhhh.....hhhhhhhhhhhh................h.ehhhhhhh........hhhhhhhhhhhhhhhh....hhhhhhh.e..
...ee......................................eee....eeeeee..hhhhhhhhhhhhhhhh........eeee.hhhhhhh...eeeeee.......eee..eee........hhhhhhhhhhhhhhhh.....eeeeeehh...
..eee....................eee...............eeeee..........hhhhhhh.ee............hhhhhhhhh...eehhhhhhh.ee..hh..eee.....e.....hhhhhhhhhhhhhhhhhh.....hhh..eeee..
...ee.e..................ee....eee..........hhhhhhhhhhhhhheee...hhhhhhhh..hh..hhhhhhhhhhhhhhhhhhhh............ee...h.hh...eeehh.hhhhhhhhhhhhhh.....ee.ehhhhhh.
.........................ee....ee............hhhhhhhhhh.hhhhhhhhhhhhh......eeeeehhhh.......eeeeeee.....ee.hheeeee.........hhhhhhhhhhhhhhhhhhhh.....ee..hhhhh..
..eee.....................e..........hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh....eee......hhhhhhhhhhhhhhhhh..eeeee..hhhh................hhhhhhhhhhhhh.....eeeeeeeeee.
...ee...........................hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh..h...eee.....eeeeehhhhh....h.....hh.hhhhhhhhhh................hhhhhhhhhhhhh....eeeeeeeeee.
.........................ee....eeee..........hhhhhhhhhhhhhhhhhhhhhhh........e...hhhhhhhhhhhhhhhh..........ehhhhhh......hhhhhhhhh.hhhhhhhhhhhhhh.hhhhhhhhhh.e..
...ee.e.................eeee................hhhhhhhhhhhhhhhhhhh.hhhhh......ee....h...hhhhhhhhhh...........hhhhhhhhhhhhhh..hhhhhhhhhhhhhhhhhhhhhh...e.eeehh....
>STS_hsaXR
GKWHLGMSCHSKTDFCHHPLHHGFNYFYGI..SLTNLRDCKPGEGSVFTTGFKRLVFLPLQIVGVTLLTLAALNCLGLLHVPLGVFFSLLFLAALILTLFLG.FLHYFRPLNCFMMRNYEIIQQPMSYDNLTQRLTVEAAQFIQRNT.....ETPFLLVLSYLHVH
>STS_mmu
GKWHLGLSCRGATDFCHHPLRHGFDRFLGV..PTTNLRDCRPGAGTVFGPALRVFAAGPLAALGASLAAMAAARWAGLARVPGWALAGTAAAMLAVGGPRSA.SCLGFRPANCFLMDDLAVAQRPTDYGGLTRRLADEAALFLRRNR.....ARPFLLFLSFLHVH
>STS_rno
GKWHLGLSCQAASDFCHHPGRHGFDRFLGT..PTTNLRDCKPGGGTVFGSAQQVFVVLPMNILGAVLLAMALARWAGLARPPGWVFGVTVAAMAAVGGAYVA.FLYHFRPANCFLMADFTITQQPTDYKGLTQRLASEAGDFLRRNR.....DTPFLLFLSFMHVH
>STS_fru
GKWHLGLNCESRDDHCHHPNAHGFNYFFGI..PLTNLRDCQPGHGTVFQIHKYLPYRTLGTVLASTVLLYFSGLISSAEKGPFAFWLQRFWSCSFIVGAYVA.FIMIIPLFNCVLMKDHSIVEQPFVSENLTQRMTREAVDFIERNS.....ARPFLLFFSFLQVH
>ARSD_fru
GKWHLGVNCERRGDHCHHPNQHGFSYFYGL..PFTLFNDCVPGEGSDILADLQKTLRSFTIFLGIGLATLVRLIVVFQASFYSLRLLVVLFFISILATVLWL.PFKFIPTWNCILMRNQEVVEQPMVVETLPRRLLTEAQQFIKRNV.....DRPFLLFFSMAHIH
>ARSD_hsa
GKWHQGVNCASRGDHCHHPLNHGFDYFYGM..PFTLTNDCDPGRPPEVDAALRAQLWGYTQFLALGILTLAAGQTCGFFSVSARAVTGMAGVGCLFFISWYS.SFGFVRRWNCILMRNHDVTEQPMVLEKTASLMLKEAVSYIERHK.....HGPFLLFLSLLHVH
>ARSE_hsa
GKWHLGLNCESASDHCHHPLHHGFEHFYGM..PFSLMGDCARWELSEKRVNLEQKLNFLFQVLALVALTLVAGKLTHLIPVSWMPVIWSALSAVLLLASSYF.VGALIVHADCFLMRNHTITEQPMCFQRTTPLILQEVASFLKRNK.....HGPFLLFVSFLHVH
>ARSE_bta
GKWHLGLSCASPDDHCHHPLNHGFDHFYGM..PFSMMADCERWELSEKRAVLESRLDVCFQLVALATLTLTIGKLTHLIPGASWTLVIWSAVVCLLLFATSC.VGALIMHADCFLMRNHSIAEQPMRSQRTTPLMLQEVSSFVKRHK.....QGPFLLFVSFLHVH
>ARSF_hsa
GKWHQGLNCDSRSDQCHHPYNYGFDYYYGM..PFTLVDSCWPDPSRNTELAFESQLWLCVQLVAIAILTLTFGKLSGWVSVPWLLIFSMILFIFLLGYAWFS.SHTSPLYWDCLLMRGHEITEQPMKAERAGSIMVKEAISFLERHS.....KETFLLFFSFLHVH
>ARSG_hsa
GKWHLGLSCASRNDHCYHPLNHGFHYFYGV..PFGLLSDCQASKTPELHRWLRIKLWISTVALALVPFLLLIPKFARWFSVPWKVIFVFALLAFLFFTSWYS.SYGFTRRWNCILMRNHEIIQQPMKEEKVASLMLKEALAFIERYK.....REPFLLFFSFLHVH
GKWHLGINELKQNDGRHLPKHHGFDFVGTNLPFTFHLFCSPSEYPVDKMKIKCFLSNKDEIIEQPIIPEKLTDKIVEGAKQFITENQKNPFFLYLSLPQTH
GKWHLGINRNTSTDGYHLPHNHGFDFVGTNLPLSHSEMCNPAEFTVEELSTMCFLYNGSTIVEQPVNLSTLTDRITSDAKNFISNNRLNSFFLYFSPPQAH
GKWHLGINENSSTDGAHLPFNHGFDFVGHNLPFTNSWSCDDTGLHKDFPDSQCYLYVNATLVSQPYQHKGLTQLFTDDALGFIEDNHADPFFLYVAFAHMH
GKWHLGINEQTSTDGAHLPFNHGFEYVGYNLPFTNSWNCDDTGLHVDFPNTECYLYKNATLVSQPYQHRNLTKLFTDDAIEFIDNNADNPFFLYVAFAHMH
GKWHLGINENSSSDGAHLPANRGFDFVGHNLPFGNSWRCDDTGLHQDFPDTNCFLYYNSTSVAQPFQHKGLTQLLRDDTVGFIEDNVNKPFFMYVSFAHMH
GKWHLGINENNATDGAHLPSKRGFEYVGVNLPFTNVWQCDTTREFYDKGPDPCFLYDGDDIVQQPMKFEHMTENLVGDWKRFLMTRLnRPFFFYFSFPQVH
........h................e......eeeeee..........eeeeeee.............hhhhhhhhhhhhhhhhh.....eeeee......
..eeeee......................................hhhh..eeeee....eee...........h......h........eeeee......
..eeeee.............................................eeeeeeeeee.........eeee....hhhh........eeeeeh....
..eeeee.................eeee...............eee......eeee...eee.......hhhhhh...hhhhh.......eeeee.hhh..
..eeeee.............................................eeeee.............hhhhhh....eeee.......eeeeee....
..eeeee.................eeeee..............hh........ee.....ee.....hhhhhhhhhhhhhhhhhhh.....eeeee.....
>STS_cii
GKWHLGINELKQNDGRHLPKHHGFD.FVGTNLPFTFHLFCSPSEYPVDKMKIK...........................................................CFLSNKDEIIEQPIIPEKLTDKIVEGAKQFITENQ.....KNPFFLYLSLPQTH
>STS2_cii
GKWHLGINRNTSTDGYHLPHNHGFD.FVGTNLPLSHSEMCNPAEFTVEELSTM...........................................................CFLYNGSTIVEQPVNLSTLTDRITSDAKNFISNNR.....LNSFFLYFSPPQAH
>ARSE_hpu
GKWHLGINENSSTDGAHLPFNHGFD.FVGHNLPFTNSWSCDDTGLHKDFPDSQR..........................................................CYLYVNATLVSQPYQHKGLTQLFTDDALGFIEDNH.....ADPFFLYVAFAHMH
>ARSE_her
GKWHLGINEQTSTDGAHLPFNHGFE.YVGYNLPFTNSWNCDDTGLHVDFPNTEK..........................................................CYLYKNATLVSQPYQHRNLTKLFTDDAIEFIDNNA.....DNPFFLYVAFAHMH
>ARSE_spu
GKWHLGINENSSSDGAHLPANRGFD.FVGHNLPFGNSWRCDDTGLHQDFPDTNA..........................................................CFLYYNSTSVAQPFQHKGLTQLLRDDTVGFIEDNV.....NKPFFMYVSFAHMH
>ARSE_cel
GKWHLGINENNATDGAHLPSKRGFE.YVGVNLPFTNVWQCDTTREFYDKGPDPSL.........................................................CFLYDGDDIVQQPMKFEHMTENLVGDWKRFLMTRL.....nRPFFFYFSFPQVH
>Sulf_pmari cys CAPTR Prochlorococcus marinus
GKWHETP................................................GRETTAAGPQTRWPT.RQGFEKFYGFIGAEENMYEPSLHD........GVTIIDYPDREDYHFLEDMTDQAIAWMRQQQGLRPDKPFFIYYASAGSH
>Sulf_mtube4 cys CSPTR Mycobacterium tuberculosis
GKCHEVP................................................VWQTSPVGPFDAWPSGGGGFEYFYGFIGGEANQWYPSLYE........GTTPVEVNRTPEYHFMADMTDKALGWIGQQKALAPDRPFFVYFAPGATH
>Sulf_npunc cys CSPTR Nostoc punctiforme
GKWHNTP.................................................DYETSAVGPFDRWPTGLGFEYFYGFMGGDTNQWSPALVE.....NTKRVAPPIEGNNPDYHLTPDLVDHAIAWIRSQQSIAPEKPFFTYLAIGATH
>Sulf_mbark2 cys CSPSR Methanosarcina barkeri
GKWHLTP.................................................ADQISAAGPYDRWPLGRGFECFYGFLGGETHQYYPELTY......DNHSVNPPKTPEEGYTLNEDLADRAIQFIADAKQVAPNKPFFMYFCTGAMH
>Sulf_mtube1 cys CSPTR Mycobacterium tuberculosis
GKWHLTP.................................................DNVQGAAGPFDNWPLGWGFDHFWGFPSGAAGQYDPIISQDNSVIGIPEGSGEDGRPYYFPDDLTDKAIEWLHTVRAQNATKPWMLYYATGATHAPH
>Sulf_sarom1 cys CSPTR Sphingomonas aromaticivorans
GKWHQVP.................................................DWEASPSGPFDRWPTGEGFERFYGFIGGETDQFDPSLFE........GTTPVMRPDVPNYHLTEDLADKSIAWLRTQHSVTPDKPFFLYFAPGATH
>Sulf_npunc cys CSPTR Nostoc punctiforme
GKWHNTP.................................................DYETSAVGPFDRWPTGLGFEYFYGFMGGDTNQWSPALVE.......NTKRVAPPINNPDYHLTPDLVDHAIAWIRSQQSIAPEKPFFTYLAIGATH
>Sulf_sarom3 cys CSPTR Sphingomonas aromaticivorans
GKSHLTP.................................................EWQTSAAGPFDQWPTGLGFEYFYGFLSADTSMWQPSIVE.......NTLPVEPPHDDPNYFFEKDMADHAIKWMRTQQAAAPDKPFFMYYAPGIAH
>Sulf_sarom2 cys CSPTR Sphingomonas aromaticivorans
GKHHNTP.................................................EPFVSPAGPFDLWPTGLGFEYFYGFMAASTNQFSPALYR..........NTSPIPTLRDGVLDKALADDAIGWIHAQKAAAPDKPFFLYYATGSAH
>Sulf_narom cys CSPTR Novosphingobium aromaticivorans
GWNTAMI..........................................GKHHNTPEPFVSPAGPFDLWPTGLGFEYFYGFMAASTNQFSPALYR..........NTSPIPTLRDGVLDKALADDAIGWIHAQKAAAPDKPFFLYYATGSAH
>Sulf_mtube5 cys CSPTR Mycobacterium tuberculosis
GKWHLTP.................................................LEESNMASTKRHWPTSRGFERFYGFLGGETDQWYPDLVYDNH..PVSPPGTPEGGYHLSKDIADKTIEFIRDAKVIAPDKPWFSYVCPGAGHAPHH
>Sulf_prevoS ser STPAR +SulMod_pre Prevotella sp.
GKWHMQC...............................................................RPKGFDFFRIFEGQGDYYNPLVLSHD.............SNGKYEREQGYATDIVTEHAVEFLNQRDEQKPFFLLVEHKAPH
>Sulf_bfrag4S ser SGPSR Bacteroides fragilis
GKWHLES...............................................................LPSGFNYWEIVPGQGDYYNPDFITQ...............NDTVQKHGYITNLITDDAIDWMENKRDESKPFCLLIHHKAIH
>Sulf_ecoli2S ser SGPSR +SulfMod_ecoli2 Escherichia coli
GKWHLSK..................................ISNVPVPEDKQTRDYHDNFTTFSAEEWQPQNRGFDYFMGFHAAGTAYYNSPSLFKN.............RERVPAKGYISDQLTDEAIGVVDRAKTLDQPFMLYLAYNAPH
>Sulf_bbron cys CTAGR Bordetella bronchiseptica
GKNHLGD...........................RDEYLPTKHGFDEFFGNLYHLNAEEEPERPYWPKDPKD..............PVVKAYKPRGVIKASADGKIEDTGALTSKRMETIDDETVGAALDYIDKHGKGDKPFFVWMNTTRMH
>Sulf_mbark1 cys CTAGR Methanosarcina barkeri
GKNHFGD...........................LDEYLPTNHGFDEFYGNLYHLNAEEEPENPDYPAEKDF.............PNFRKNYGPRGVIHSYANGRVEDTGPLTRKRMETVDLEFLDAAIDFIKRHHAAEKPFFVWLNTTWMH
>Sulf_mmagn cys CTAGR Magnetospirillum magnetotacticum
GKNHLGD...........................RDEYLPTNHGFDEFFGNLYHLNAEEEPEQRTYPRDPEF................RKRFGPRGVIKSSADGKIEDTGPLTKKRMETIDDETSAAAMDFIERQTRADKPFFCWFNSTRMH
>Sulf_bbron3 cys CVPTR Bordetella bronchiseptica
GKWHLGD...........................VPGRYPSDRGFDEWYGIPRTTDESQFTASIGFDP....................AVADLPYIMQGRAGEPSENVKLYDLESRRRIDE.ELVERSLGFMRGNAAAGRPFFLYLPLVHLH
>Sulf_bbron5 cys CRAVR Bordetella bronchiseptica
GKWHLGD...........................KEGRYPKDRGFDEWYGIPRTTNESMFMEAVGFDP....................DVVEVPYVMEGRKGSPAERRERYDLEMRRRIDE..VLTQRSCEFIGRHAGKAPFFLYVPLTQLH
>Sulf_mloti1 cys CTAGR Mesorhizobium loti
GKNHVGD...........................RNEFLPTVHGFDEFFGNLYHLNAEEEPENVDYPKNPEFHAKFGPRGVLKCTATETDDPTEDPRFGRVGKQKIEDTGPLTKKRMETVDEEFLGAAKDFIDRQHKASKPFFCWFNSTRMH
>Sulf_ecoli3S ser SSPTR +SulfMod_ecoli3 Escherichia coli
GKWHMGE...........................NKESQPQNVGFDDFRGFNSVSDMYTEWRDVHVNPEVALSPDRS........EYIKQLPFSKDDVHAVRGGEQQAIADITPKYMEDLDQRWMDYGVKFLDKMAKSDKPFFLYYGTRGCH
>Sulf_bfrag5S ser SAPAR Bacteroides fragilis
GKWGLGA..........................PGTEGTPNKQGFDSFYGYNCQRQAHSYYPAFLYKNEDRVYLANK...............VLDPHTTKLDAGADPRDEAAYAKFSQKEYANDLIFDELISFVGQNRKKPFFLMWTTPLPH
>Sulf_bfrag3S ser SAPSR Bacteroides fragilis
GKWGLGF..........................IGSTGDPKKQGIDEFYGYNCQLLAHSYYPDHLWDNDKRVELKDNTLD...............................VQYGKGTYSQDLIHSKALDFLDRMGKSGESFCMWYPTIIPH
>Sulf_bfrag10S ser SAPSR Bacteroides fragilis
GKWAGGY..........................EGSASTPDKRGIDEYYGYICQFQAHLYYPNFLNRYSPSLGDTGVVRVVMEENIKYPMYGP..................DYHKRTQYSADLIHQKAMEWIEKQD.GEQPFFGIFTYTLPH
>Sulf_bcepa1 cys CTAGR Burkholderia cepacia
GKNHLGD...........................KNEYLPTNHGFDEFYGNLYHLNAEEEPERPYWPKDKNDPYVKNFSPRGVIHSTSDGKVQDTGPLTAKRME..............TIDDETGAHAEEFIRKQVKNGTPFFVWMNFTRMH
>Sulf_bfrag6S ser STPSR Bacteroides fragilis
GKWHLGL....GDKSGEQDWNAPLPAALGDLGFDYSYIMAATADRVPCVFIENGKVANYDPSAPIEVSYRKPFEGEPLGKDHPELLYNQKHSHGHDMAIVNGIGRIGYMKGGGKALWKDENIADSITTHAINFIREHKDEPFFMYFATNDVH
>Sulf_bfrag7S ser STPSR Bacteroides fragilis
GKWHIGL......GDGHVDWNKEVHPGAAEIGYDYSFIQAATNDRVPCVFLENGRVVGLDPNDPLYVDYRKNFPGEPTGKENPELLRM.HPSVGHAGSIVNGVPRIGFQKGGKAAQWKDEEMAGLFLDKARQFVDDNKDKPFFLYYGLHQPH
>Sulf_styph2 cys CSPSR +SulfMod_sty2 Salmonella typhimurium
GKLHLNA...................................................GGDRTDQPQAKDMGFDYTLVNPAGFVTDATLDNAKERPRYGVVHPTGWIRNGQHIGRADKMSGEFVSSEVVNWLDNKK.DDNPFFLYVAFTEVH
>Sulf_dvulgT thr TIPVR Desulfovibrio vulgaris
GYSRGFD........................................YVRFCNGHELDHETFCNVPLDEEFKAEDYLSPNWLKKDENGE..YDSSSKSLIRETECYLRQRQFWASDADNYASVVISEADNWLKMKRNPQRPFFLWLDSFDPH
>Sulf_ypest2S ser SGPSR +Sulf_ypest2 Yersinia pestis
GKWHLSK........................................ISNVPVPEAEQTRDYHDNFTTYSADEWQPQNRGFQYFMGYHAAGTAYYNSPSLFHNKERVKAKGYISDQLTDEA.......IGVANRAKSLDEPFMMYLAYSAPH
>Sulf_ypest1a cys CGPSR +SulfMod_ypest1 Yersinia pestis
GKWHNAR.......................................IEKKAFVADEVKSRDYHDNMISVSAPGYAPEKRGFDYSYSYYASGAALWHSPAIWQNSKNIAAPGYLTHNLTDE.........TLKFIDDSGKKPFFISLAYSVPH
>Sulf_bfrag11S ser SSPSR Bacteroides fragilis
GKAHFGC..................LKSEGENPTNLGFDVNIAGSAIGHPGSYHGENGYGWIKGQRARAVPDLEQYHK............THTFLSDALTLEA.........................GKEIEKAVAEKKPFYLNMAHYAVH
>Sulf_bfrag2S ser SSPTR Bacteroides fragilis
GKAHFGA..................VNTPGESPYHMGFEVNIAGHAGGGLASYLGENNYGNRTDGKPNPWFAVPGLEKYW.........GTDTFVSEALTLEA.........................IKALDHAKEYNQPFFLYMAHYAIH
>Sulf_bfrag9S ser SSPSR Bacteroides fragilis
GKWHLAE....................SAEYYPEQNGFDINIGGNNTGHPSKGYFSPYGNPQLKDGPEG.......................EYLTDRLTDEV...........................IRYISEPKEKPFFVYLSYYTVH
>Sulf_ccres cys CAPSR Caulobacter crescentus
GKWHLGG....................VKGSRPEDQGFDESLGFMAGAALFAPVGDPGVVESRQDWDPIDKFLWGAAPFAVQFNGGKLFNPSHYMTDYLTDEA...........................VKAIDANKNRPFFMYLAYNAVH
>Sulf_tfusc cys CSSTR Thermobifida fusca
GKWHCGW....................LPWYSPLRIGFETFFGNFDGALDYFEHVDTLGKADLYEGETPVEEVGYYTEIISERA..............................................AEYITAHRNRPFYVQLNYTAPH
>Sulf_sarom4 cys CSPTR Sphingomonas aromaticivorans
GKWHLGE....................PPAHGPLKHGYDHFLGIVEGGADYFVHRMVMSGKPAGVGLAEDDAQTDRTGY...................LTDIFGDEA.....................VRVIEEG..GNQPFFLSLHFTAPH
>Sulf_sarom5 cys CTATR Sphingomonas aromaticivorans
GKWHLGS....................LPDFDPLKSGYQTFWGIRSGGVDYYTHATSNGQPDLWDGPTP....VERAGY...................LTDLLADRA.....................VSEIREASSGEAPWFMSLHFTAPH
>Sulf_bfrag1S ser SSPAR Bacteroides fragilis
GKWHLDA..............................................PYKPYVDTYNNRGKVAWNEWCPPERRHGFDHWIAYGTYDYHLKPMYWNTTAPRDSFYYVNQWGPEYE..ASKAIEYINGQK..DQKQPFALVVSMNPPH
>Sulf_ypest1b cys CTPFR + adjSulfMod_ypest1 Yersinia pestis
GKWHLDA............................................PEAPFVPSYNNPMEGRY.WNDWTPPEKRHGFDFWYSYGTYDLHLNPMYWTNDTPRDKPLKINQWSPEHEADIAIKYLRNEGGKYRDNDQPFALVVSMNPPH
>Sulf_bbron2 cys CAPSR Bordetella bronchiseptica
GKMHFVG...................................................PEQHHGFQERLTTDIYPSDFGWTPDWREEIPIAPTGMNMRSVIEAGECRRSMQIDYDDDVVYRGVQKIYDLGRLHR....DRPFFLAVSMTHPH
>Sulf_paer2 cys CAPSR Pseudomonas aeruginosa
GKMHFCG...................................................PDQLHGYEERLTSDIYPADYGWAVNWDEPEVRPSWYHNMSSVLQAGPCVRTNQLDFDEEVVFKARQYLYDHVRQ.HAG...QPFCLTVSMTHPH
>Sulf_bcepa2 cys CAPSR Burkholderia cepacia
GKMHFCG...................................................ADQLHGFEERLTTDIYPADFGWTPDWEHFETRPTWYHNMSSVIDAGPCVRTNQLDFDDEVTFTTRQKLFDIARERHAGKDARPFCLVASLTHPH
>Sulf_dhafn cys CAVTR Desulfitobacterium hafniense
GKWHLGS..............................................GLPNATNQGAAVGTSAPGQTPVSWGFEKSYALLGGGGDHFGRNGATAYVEDDHYVTPNTTSFFSSDFYTSTIIKYIDSSTGKNTDGKPFFAYLTYQAPH
>Sulf_rmeta1 cys CAVTR Ralstonia metallidurans
GKWHLGS..............................................GLPNATNQGAAVGTSAPGQTPVSWGFEKSYALLGGGGDHFGRNGATAYVEDDHYVTPNTTSFFSSDFYTSTIIKYIDSSTGKNTDGKPFFAYLTYQAPH
>Sulf_atume cys CAPAR Agrobacterium tumefaciens
GKMHFVG...................................................PDQLHGFEERLTTDIYPADFGWTPDYTKPGERIDWWYHNLGSVTGAGVAEITNQMEYDDEVAYHATRKLYDLSR...RLDDRPWCLTVSFTHPH
>Sulf_rspha cys CAPAR Rhodobacter sphaeroides
GKMHFVG...................................................PDQLHGFEARLTTDIYPADFGWTPDYRKPGERIDWWYHNLGSVTGAGVAEITNQLEYDDDVAHQAIQKLYDLSR...GADPRPWCLTVSFTHPH
>Sulf_mloti3 cys CAPGR Mesorhizobium loti
GKMHFVG...................................................PDQLHGFEERLTTDIYPADFGWTPDYRKPGERIDWWYHNLGSVSGAGVAEISNQMEYDDEVAFHAVQKLYDFARVSDDAAHRPWCLTVSFTHPH
>Sulf_smeli cys CAPAR Sinorhizobium meliloti
GKMHFVG...................................................PDQLHGFEERLTTDIYPADFGWTPDYRKPGERIDWWYHNLGSVTGAGVAEITNQMEYDDEVAFLANQKLYQLSRENDDESRRPWCLTVSFTHPH
>Sulf_bbron6 cys CMPNR Bordetella bronchiseptica
.................................................................................................................TRVPESCSTTAYLGRRTMQALDGYAQRGQPFFIQCSFPDPH
>Sulf_rpalu2 cys CGPSR Rhodopseudomonas palustris
GFEPYER...............................................DDGLHPDGPYDPAPDYDAYLRSQGFDADNPWEVWANSAEGGDGELLSGWLLSHADKPARVPDEHSETPYITRRAIEFIGEAEADGRPWCLHLSYIKPH
>Sulf_paer1 cys CGPSR Pseudomonas aeruginosa
GFEPYDR..........................................NDGVYPDDPAFADKRERAPYTH.YLRRLGFTGDNPWHDWANAAAGADGEILSGWRMRHAGLPTRLPEAHSETAYTTRRAMDFI..DEQGERPWCLHLSYIKPH
>Sulf_bbron1 cys CGPSR Bordetella bronchiseptica
GLLLSRG...........................................GFRELDRYDGHHEPGAESGYPAFLRRHGYDSPDPWSDYVISAIDAGGQVVSGW..HMRNTYLPSRVREAHSETAYMTGQALDFMRQRGGQPWVLHLSYVKPH
>Sulf_rmeta2 cys CGPSR Ralstonia metallidurans
GHFVEVD..................................................RHDGHHAEPRSPYADWLRAQGYDSADPWTDYVISAQTPDGEVVSGW..HMRNAGLPARVAEPHSETAYTVDRAMDYIGARGDDPWVLHLSLVKPH
>Sulf_scoel cys CTPAR Streptomyces coelicolor
GKWHAGN....................................RRTAADYGFDGPELPGWHNPV.DHPDYLAYLD...ERGLPPYEISDRVRGTLPNGGPGNLLAARLHQPVEATFEHYLATRAIERLEHYAADAHDRDRPFFLALHFFGPH
>Sulf_pmult2 cys CGPAR +SulfMod_pmult2 Pasteurella multocida
GKWHVGT...............................KSVPEDYDIKGHNFDGYGYPGSGVYKNLVFNQPPTHSNRYKEWLEEKGFEFPEVSKAYFGDNPHLRVQELCGFLSGTKEQTIPYFIIDEAKRYIQESLEENKPFFTWINFWGPH
>Sulf_rpalu1 cys CTSSR Rhodopseudomonas palustris
GKWHLNR......................................................................KFDTQETDRLFTKEMDDYGFSDYFSPGDIIGHTLGGYQFDPLIASSAITWLRRNGRPLTDDDKPWALFVSLVNPH
>Sulf_kpneu2 cys CTPSR +Sulf_kpneu2 Klebsiella pneumoniae
GKWHLTR.........................................................EIDQPVAGKSVEEMDLGEIPTPRLHEIMEKYGFSDYHGIGDVIGKSKGGYFFDSVTTGQTISWLRNTGRPLNDENKPWFAAVNLVNPH
>Sulf_styph4 cys CTPSR +isolated Salmonella typhimurium
GKWHLTE.........................................................KLEKPLPDEKDEDIDVGDIPEPELHKIMEKYGFADYHGIGDIIGHSKGGYFYDSTTTAQTINWLRCKGQPLNDQHKPWFLAVNLVNPH
>Sulf_styph1S ser SAPAR +SulfMod_sty1 Salmonella typhimurium
GKWHLGF..........................TPGSTPKDRGFRHSFALMGGGASHFDDAVPLGTVEIFHTYYTR................................DNQRISLPSSFYSSEAYASQINRWISETPREQPIFAWLAFTAPH
>Sulf_kpneu1S ser SAPAR +SulfMod_kpneu1 Klebsiella pneumoniae
GKWHLGF..........................VPGATPKDRGFNHAFAFMGGGTSHFNDAIPLGTVEAFHTYYTR................................DGERVSLPDDFYSSEAYARQMNSWIKATPKEQPVFAWLAFTAPH
>Sulf_paer3XR 1HDH cys CSPTR Pseudomonas aeruginosa
GKWHLGL..........................KPEQTPHARGFERSFSLLPGAANHYGFEPPYDESTP..........................RILKGTPALYVEDERYLDTLPEGFYSSDAFGDKLLQYLKERDQSRPFFAYLPFSAPH
>Sulf_styph3 cys CMPAR +isolated Salmonella typhimurium
NYHNRYS........................................................SWDVVRGQEGDHWKASVGEPPIPEVLRVPQKQTGGGVSGLWRHDWANREYIQQEADFPQTKVFDAGCDFIHKNHAEDNWLLQVETFDPH
>Sulf_ecoliH cys CMPAR Escherichia coli O157:H7
GxNYHNR......................................................YSSWEIVRGQEGDHWHASVAQPPIPEVLRVPQKQTGGGVSGLWRHDWANREYIQQEADFPQTKVFDAGCAFIHKNHAEDNWLLQIETFDPH
>Sulf_mtube3 cys CVPSR Mycobacterium tuberculosis
GMQHETS..........................................................................................YPKRLGFDEFDVSNSYCEYVVAKAQDWLHNRVPALDGQRFLLTAGFFETHRPYPH
>Sulf_Bmeli cys CHPSR Brucella melitensis
GWMKALR........................................DLDYYTTTISSFGERHGCWHWYAGFNEVMNCGKGGMENADEIVPMAIDWIARNKSRKWFLHVNLWDPHTPYRVPEEWGDPFAGEPLPAWMTEEVLARSIAGYGPH
>Sulf_bfrag8S ser SCPSR Bacteroides fragilis
GKWHVTV..............................................................EGAFTQPNGSYPVERGFEKYYGCLSGGGSYYTPKPVFSGLQRITEFPKDYYYTTAITDSAVSFIRQHPVDEPMFMYLAHYAPH
>Sulf_mloti2 cys CSPAR Mesorhizobium loti
GYTDVAL....................................DPRLLTSGDPRLKTYEGVLPGFTVRQLLPEHQKQWLSWLKQQGVDASAGSPGIHRPVGDEDDDSVTEAPPIYSKDHTPAAFLAGEFIRWLGEQEQAAPWFAHLSFISPH
>Sulf_bbron4 cys CIPAR Bordetella bronchiseptica
GKLHFRD.........................................................YGGDHGFSEEIIPMHIVGGKGDLMGLVRSDLPVRKGAYKMAQMAGPGESQYTFYDREIVSRAQIWLREQAPRHADKPWVLFVSFVSPH
>Sulf_bcary cys CGPAR Burkholderia caryophylli
GYDPALI..............................GYTTTTPDPRTTSARDPRFTVLGDIMDGFRSVGAFEPNMEGYFGWVAQNGFELPENREDIWLPEGEHSVPGATDKPSRIPKEFSDSTFFTERALTYLKGRDGKPFFLHLGYYRPH
>Sulf_mtube2 cys Mycobacterium tuberculosis
GKWHISH........................................ADLEDPATGAPLATNDNEGVVDSAAVRRYLDADPLGPYGFSGWVGPEPHGAGLANSGFRRDPLVADRVVAWLTERY.....ARRRAGDTAAMRPFLLVASFVNPH
>Sulf_pmult1 cys CGPCR +SulfMod_pmult1 Pasteurella multocida
GKWHLAS........................................DGELEEEPTIDYTTSAIPPERRGGYKGFWRASDVLEFTSHGYDGYVFDENMN.............KCEFKGYRVDCITDFALEYLDQYQG.DKPFFMTISHIEPH
>Sulf_efaec cys CVPAR Enterococcus faecium
GKMHVYP...............................SRKRLGFDHVLLHDGYLHVDRKYDKSYGEQFEYSSDYLMFLKESLGSDADLIDDGLNCNS.........WEARPWMYPEKFHPTNWVVSEGINFLRRKDPTVPFFLKLSFEKPH
>Sulf_ecoli1 cys CTPAR Escherichia coli
GKWHLDG...................HDYFGTGECPPEWDADYWFDGA....NYLSELTEKEISLWRNGLNSVEDLQANHIDETFTWAHRISN.................................RAVDFLQQPARADEPFLMVVSYDEPH
Sulf_paer3XR 1HDH cys CSPTR Pseudomonas aeruginosa 549 2.8e-56 1
ARSB_cal 239 2.0e-23 1
Sulf_styph1S ser SAPAR +SulfMod_sty1 Salmonella typhimurium 192 1.9e-18 1
Sulf_kpneu1S ser SAPAR +SulfMod_kpneu1 Klebsiella pneumo... 169 5.1e-16 1
ARSB_spo 144 2.3e-13 1
Sulf_dhafn cys CAVTR Desulfitobacterium hafniense 142 3.7e-13 1
Sulf_rmeta1 cys CAVTR Ralstonia metallidurans 142 3.7e-13 1
Sulf_bfrag9S ser SSPSR Bacteroides fragilis 78 1.3e-11 2
ARSB_cii 124 3.0e-11 1
ARSB2_cii 124 3.0e-11 1
ARSB_dm2 121 6.3e-11 1
SulfY_mmu 121 6.3e-11 1
SulfY_hsa 119 1.0e-10 1
Sulf_sarom4 cys CSPTR Sphingomonas aromaticivorans 119 1.0e-10 1
sulfZ_fru 118 1.3e-10 1
sulfY/Z_hpo 128 2.0e-10 1
SulfY_fru 116 2.1e-10 1
sulfY/Z_cii 128 3.3e-10 1
SulfY_gga 114 3.5e-10 1
SulfZYB_cii 113 4.4e-10 1
sulfZ/Y_cii 110 9.2e-10 1
ARSB_fca 122 9.8e-10 1
ARSB_rno 107 1.9e-09 1
Sulf_prevoS ser STPAR +SulMod_pre Prevotella sp 73 1.9e-09 2
ARSB_mmu 106 2.4e-09 1
ARSB_hsa 104 4.0e-09 1
SulfY_ame 102 6.5e-09 1
SulfY_odi 100 1.1e-08 1
GALNS_bta 100 1.1e-08 1
sulfZ/Y_hpo 108 2.1e-08 1
Sulf_tfusc cys CSSTR Thermobifida fusca 97 2.2e-08 1
GALNS_hsa 95 3.6e-08 1
Sulf_bbron5 cys CRAVR Bordetella bronchiseptica 95 3.6e-08 1
ARSB_dm4 94 4.6e-08 1
SulfY_ava 94 4.6e-08 1
SulfZ_hsa 94 4.6e-08 1
Sulf_ccres cys CAPSR Caulobacter crescentus 91 9.5e-08 1
SulfZ_mmu 90 1.2e-07 1
GALNS_rno 90 1.2e-07 1
GALNS_ssc 90 1.2e-07 1
Sulf_bbron3 cys CVPTR Bordetella bronchiseptica 90 1.2e-07 1
Sulf_sarom5 cys CTATR Sphingomonas aromaticivorans 90 1.2e-07 1
Sulf_bfrag8S ser SCPSR Bacteroides fragilis 89 1.5e-07 1
ARSB_dm3 87 2.5e-07 1
GALNS_mmu 86 3.2e-07 1
KIAA1001_cii 100 5.0e-07 1
GALNS_fru 84 5.2e-07 1
GALNS_gga 83 6.7e-07 1
GALNS_dre 83 6.7e-07 1
GALNS_mac 82 8.5e-07 1
GALNS_xla 82 8.5e-07 1
GALNS_cii 96 1.2e-06 1
Sulf_bcary cys CGPAR Burkholderia caryophylli 79 1.8e-06 1
Sulf_bfrag10S ser SAPSR Bacteroides fragilis 78 2.3e-06 1
GALNS_omy 77 2.9e-06 1
Sulf_mbark2 cys CSPSR Methanosarcina barkeri 75 4.7e-06 1
STS_hsaXR 315 1.0e-61 2
SGSH_dme 315 3.6e-58 2
STS_rno 195 2.5e-37 2
STS_mmu 204 3.9e-35 2
STS_fru 212 8.1e-35 2
ARSD_fru 176 4.8e-31 2
ARSD_hsa 164 1.3e-27 2
ARSE_bta 156 1.2e-26 2
ARSG_hsa 195 5.5e-26 2
ARSF_hsa 139 1.6e-23 2
ARSE_her 190 3.1e-18 1
ARSE_hpu 169 5.1e-16 1
ARSE_hsa 161 3.6e-15 1
STS_cii 158 7.5e-15 1
STS2_cii 156 1.2e-14 1
ARSE_spu 150 5.3e-14 1
ARSE_cel 150 5.3e-14 1
KIAA1001_hsa 123 3.8e-11 1
KIAA1001_mmu 116 2.1e-10 1
ARSA_fru 124 7.5e-10 1
KIAA1001_cii 124 1.4e-09 1
KIAA1001_ssc 100 1.1e-08 1
Sulf_bfrag9S ser SSPSR Bacteroides fragilis 70 1.3e-08 2
KIAA1001_rno 99 1.3e-08 1
GALNS_fru 85 4.1e-07 1
ARSA_gga 82 8.5e-07 1
GALNS_bta 81 1.1e-06 1
GALNS_ssc 81 1.1e-06 1
GALNS_mmu 76 3.7e-06 1
GALNS_rno 76 3.7e-06 1
GALNS_hsa 75 4.7e-06 1
GALNS_dre 74 6.0e-06 1
GALNS_xla 71 1.2e-05 1
GALNS_omy 71 1.2e-05 1
ARSB_hsa 532 1.8e-54 1
ARSB_fca 507 3.3e-51 1
ARSB_rno 437 2.0e-44 1
ARSB_mmu 433 5.4e-44 1
SulfY_fru 281 7.0e-28 1
SulfZ_hsa 281 7.0e-28 1
SulfZ_mmu 278 1.4e-27 1
SulfY_hsa 276 2.4e-27 1
SulfY_mmu 274 3.8e-27 1
sulfZ_fru 274 3.8e-27 1
sulfY/Z_hpo 281 7.8e-27 1
SulfY_gga 260 1.2e-25 1
ARSB2_cii 231 1.4e-22 1
sulfZ/Y_cii 225 6.0e-22 1
ARSB_dm2 191 2.4e-18 1
SulfY_ptr 181 2.7e-17 1
sulfY/Z_cii 191 6.4e-17 1
ARSB_dm3 171 3.2e-16 1
SulfZYB_cii 160 4.6e-15 1
ARSB_hro 158 7.5e-15 1
SulfY_ava 158 7.5e-15 1
ARSB_dm1 96 2.8e-14 2
ARSB_cii 145 1.8e-13 1
SulfY_odi 141 4.8e-13 1
ARSB_dm4 135 2.1e-12 1
ARSE_her 120 8.0e-11 1
ARSB_cal 103 5.1e-09 1
Sulf_tfusc cys CSSTR Thermobifida fusca 103 5.1e-09 1
GALNS_dre 102 6.5e-09 1
ARSG_hsa 115 6.5e-09 1
Sulf_bfrag9S ser SSPSR Bacteroides fragilis 61 7.9e-09 2
Sulf_bfrag8S ser SCPSR Bacteroides fragilis 101 8.2e-09 1
KIAA1001_mmu 77 1.0e-08 2
GALNS_omy 100 1.1e-08 1
GALNS_bta 98 1.7e-08 1
Sulf_prevoS ser STPAR +SulMod_pre Prevotella sp 59 2.0e-08 2
Sulf_bfrag4S ser SGPSR Bacteroides fragilis 57 3.2e-08 2
Sulf_paer3XR 1HDH cys CSPTR Pseudomonas aeruginosa 95 3.6e-08 1
ARSB_hsa 1FSU
GKWHLG.MYRKECLPTRRGFDTYFGYLLGSEDYYSHERCTLIDALNVTRCAL........DFRDGEEVATGYKNMYSTNIFTKRAIALITNHPPEKPLFLYLALQSVHEP
E...S...SSGGGTGGGTT.SEEEE.SSS...TTT.EEEEEEGGGTEEEEEE..........EETTEE..S.TTTTHHHHHHHHHHHHHHTTTTTS.EEEEEE..TTSSS
Sulf_paer3XR
GKWHLG..LKPEQTPHARGFERSFSLLPGAANHYGFEPPYDESTPRILKGTPA......LYVEDERYLDTLPEGFYSSDAFGDKLLQYLKERDQSRPFFAYLPFSAPHWP
E..S....SSGGGTGGGTT.SEEEEE.SS...TT......STTTTHHHHT..........EEETTEE.S...TTTTHHHHHHHHHHHHHHTTTTTS.EEEEEE..TTSSS
ArsA_hsa
GKWHLGVGPEGAFLPPHQGFHRFLGIPYSHDQGPCQNLTCFPPATPCDGGCDQGLVPIPLLANLSVEAQPPWLPGLEARYMAFAHDLMADAQRQDRPFFLYYASHHTHYP
E...SB..GGGTTTGGGGT.SEEEEESS.TTSSB.TTSBSBTTTEE.SS.BS.SS....EEETTEESEES..HHHHTHHHHHHHHHHHHHHHHTT..EEEEEE..TTSSS
STS_hsa
GKWHLGMSCHSKTDFCHHPLHHGFNYFYGISLTNLRDCKPGEGSVF.............CFMMRNYEIIQQPMSYDNLTQRLTVEAAQFIQRNTETPFLLVLSYLHVH
TTGFKRLVFLPLQIVGVTLLTLAALNCLGLLHVPLGVFFSLLFLAALILTLFLGFLHYFRPLN
H alpha helix
B residue in isolated beta bridge
E extended beta strand
G 3-10 helix
I pi helix
T hydrogen-bonded turn
S bend
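The eight-letter code above is finer-grained than the h/e/c strings used for the predicted structures elsewhere in this listing. The sketch below shows one way to collapse it for side-by-side comparison. The function name `reduce_dssp` and the grouping (H, G, I to helix; E, B to strand; everything else, including the `.` placeholder, to coil) are assumptions based on a common convention, not something stated in this file.

```python
# Hypothetical helper: collapse DSSP 8-state codes to a 3-state h/e/c string.
# Assumed grouping (a common convention, not taken from this file):
#   H, G, I -> h (helix)
#   E, B    -> e (strand)
#   T, S, '.' and any other character -> c (coil)
DSSP_TO_3STATE = {"H": "h", "G": "h", "I": "h", "E": "e", "B": "e"}


def reduce_dssp(dssp: str) -> str:
    """Return a 3-state string of the same length as the DSSP input."""
    return "".join(DSSP_TO_3STATE.get(ch, "c") for ch in dssp)


if __name__ == "__main__":
    # First 25 columns of the ARSB_hsa (1FSU) assignment line.
    print(reduce_dssp("E...S...SSGGGTGGGTT.SEEEE"))
```

Because the output has the same length as the input, it can be placed directly under the aligned sequence line it belongs to.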
>SGSH_hsa
10 20 30 40 50 60
| | | | | |
UNK_9160 GKKHVGPETVYPFDFAYTEENGSVLQVGRNITRIKLLVRKFLQTQDDRPFFLYVAFHDPH
DPM ccctetccceccechhhhhhtcceeeeeeeeeeeeeeeehehhchcccceeeeehecccc
DSC cccccccccccccceeecccccccchhhhhhhheehhhhhhhcccccccceeeeeccccc
GOR4 ccccccccccccccceecccccceeeeccchhhhhhhhhhhhccccccceeeeeeeceec
HNNC cccccccccccccceeecccccceeeechhhhhhhhhhhhhhhcccccceeeeeeecccc
PHD cccecccceeccccceeccccceeeeeccceeehhhhhhhhhhcccccceeeeeeeeccc
Predator cccccccccccccceeeeecccceeeeccchhhhhhhhhhhhhhccccccceeeeecccc
SIMPA96 cccccccccccccceecccccceeeeecchhhhhhhhhhhhhhcccccceeeeeeecccc
SOPM ttttcctttecceeeeeccttcceeeeccchhhhhhhhhhhhhcttccceeeeeeetttt
Sec.Cons. cccccccccccccceeecccccceeeeccchhhhhhhhhhhhhcccccceeeeeeecccc
10 20 30 40 50 60
| | | | | |
UNK_37250 GKKHVAPEAVYPFDFAETEENNSILQVGRNITRMKELAKQFFSMQLKNESFLLYIGFHDPH
DPM ccchehchhecchhhhhhhhtcceeeeeeeeehhhhhhhhhhhhhhhthheeeeeeecccc
DSC ccccccccccccccccchhccchhhhhhhhhhhhhhhhhhhhhhhccccccceeecccccc
GOR4 cccccccccccccccccccccchhhhhcchhhhhhhhhhhhhhhhhhcceeeeeeecccee
HNNC ccccccccccccccccccccccceehhchhhhhhhhhhhhhhhhhcccceeeeeeeccccc
PHD cccccccccccccccccccccchhhhhhhhhhhhhhhhhhhhhhcccccceeeeeeeeccc
Predator ccccccccccccccccccccccceeeeccchhhhhhhhhhhhhhhhcccccceeeeccccc
SIMPA96 ccccccccccccccccccccccchhhhchhhhhhhhhhhhhhhhhhcccceeeeecccccc
SOPM tccccctteecceeeccchttcheeeeccchhhhhhhhhhhhhhhctttceeeeeecccct
Sec.Cons. cccccccccccccccccccccc???hhchhhhhhhhhhhhhhhhh?cccceeeeeeccccc
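
The Sec.Cons. rows above look like a per-column vote over the eight predictor rows, with '?' where no single state wins. A minimal sketch of such a vote, assuming a plain plurality rule with ties written as '?'. This rule is inferred from the rows shown and may not be the exact rule the prediction server applies; the function name consensus is hypothetical.

```python
from collections import Counter


def consensus(predictions: list[str], tie: str = "?") -> str:
    """Column-wise plurality vote over equal-length prediction strings.

    Assumption: the most frequent state in a column wins, and a tie
    between the top two states is reported as `tie`.
    """
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all prediction strings must have the same length")
    out = []
    for column in zip(*predictions):
        ranked = Counter(column).most_common()
        if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
            out.append(tie)  # no unique winner in this column
        else:
            out.append(ranked[0][0])
    return "".join(out)


if __name__ == "__main__":
    print(consensus(["cch", "ceh", "ceh", "che"]))
```

Passing the DPM through SOPM rows as the list gives a string that can be compared against the reported Sec.Cons. row.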
>SulfX_
MAAVAAATRWHLLLVLSAAGLGVTGAPQPPNILLLLMDDMGWGD
LGVYGEPSRETPNLDRMAAEGMLFPSFYAANPLCSPSRAALLTGRLPIRTGFYTTNGH
ARNAYTPQEIVGGIPDPEHLLPELLKGAGYASKIVGKWHLGHRPQFHPLKHGFDEWFG
SPNCHFGPYDNRARPNIPVYRDWEMVGRFYEEFPINLKTGESNLTQIYLQEALDFIKR
QQATHHPFFLYWAIDATHAPVYASRAFLGTSQRGRYGDAVREIDDSVGRIVGLLRDLK
IAGNTFVFFTSDNGAALVSAPKQGGSNGPFLCGKQTTFEGGMREPAIAWWPGHIPAGQ
VSHQLGSVMDLFTTSLSLAGLEPPSDRAIDGLDLLPAMLQGRLTERPIFYYRGNTLMA
ATLGQYKAHFWTWTNSWEEFRQGVDFCPGQNVSGVTTHSQEEHTKLPLIFHLGRDPGE
RFPLSFASTEYLDALRKITLVVQQHQESLVPGQPQLNVCNPAVMNWAPPGCEKLGKCL
TPPESVPEKCSWPH
ARSD_gga
FLSGMASSNRYRALQWNAGSGGLPANETTFARLLQQQGYTTGLIGK...KGGKAMGGWEGGIRVPGIFRWPGVLPAGKVISEPTSLMDIYPTVVHLAGGVVPQDR
STS_gga
RTPNIDRLAREGVKLTQHIAAAPLCTPSRAAFLTGRYPIRSGWA
GALNS_spu sea urchin trace exon
RAALLTGRLPIRNGFYTTNGHAHNAWSQQIVKGGIPDSEILLPKLLKLSGYKSKIVGKW
>GNS_chi
GKYLNEYGAPDAGGLGHVPLGWSYWYALEKNSKYYNYTLSINGKARKHGENYSVDYLTDVLANVSLDFLDYKSNSEPFFMMISTPAPH
>GNS_dre
GKYLNQYGSKDAGGVAHVPPGWDQWHALVGNSKYYNYTLSVNGKEEKHGDSYEKDYLTDLVLNRSLHFLEERSPSHPFFMMLCPPAPH
>GNS__gga
GKYLNEYGAPDAGGLEHVPLGWSYWYALEKNSKYYNYTLSINGKARKHGENYSVDYLTDVLANVSLDFLDYKSN
MSRSALAALARGLALAALLVLSPAQAARQRPNVVLILTDDQDVFLGGMTPMKKTNALIAQMGVTFSNAYVPSALCCPSRASILTGKYPHNHHVVNNTLEGNCSSKLWQKIQEPNTFPALLKSMCGYQTFFAGKYLNEYGAEDAGGVSHVPPGWSFWYALEKNSKYYNYTLSVNGKARRHGENYSVDYLTDVLANMSLGLLGVQINFWNLFFIDGSQTPAP
>GNS_ssc
GKYLNEYGAPDAGGLAHVPLGWSYWYALEKNSKYYNYTLSINGKARKHGENYFFMMISTPAPH
>GNS_cfa
VPLGWSYWYALEKNSKYYNYTLSINGKARKHGENYSVDYLTDVLANISLGFLDYKSNSEP
>GNS_xla
GKYLNQYGSEEAGGINHVPPGWSYWFALEKNSKYYNYTLSENGRPKTHGQNYSQDYLTDVLSNVSLDFLNYKSNHEPFFMMIATPAPH
GKYLNQYGGK...SVGGPQHVPVGWNQWFGLVGNSKYYNYTISDNGVPVQHGANYHEDYLTDLLANRSV.......................................DFIHNHKMRYTQPFFMMISTPAPH
GKYLNQYGGK...SVGGPQHIPVGWDQWFGLVGNSKYYNYTISDNGVPVQHGANYHEDYLTDL
GKYLNQYGSEEAGGINHVPPGWSYWFALEKNSKYYNYTLSENGRPKTHG
>KIAA__gga
GSRSRSSILTGKYVHNHNTYTNNENCSSPSWQAQHEIRTFAVYLNNTGYRTAFFGKYLNE
YNGSYVPPGWKEWVGLLKNSRFYNYTLCRNGVKEKHGFDYSRDYLTDLITNDSITFFRIS
KKMYPHRPVLMVISHAAPHGPEDSAPQYSHLFPNASQHITPSYNYAPNPDKHWIMRYTGP
MKPIHMEFTNMLQRKRLQTLMSVDDSMEMIYNTLVETGELDNTYIYTQQIMVIILVSSGW
>KIAA1077__gga
QGHSSTLKSLRFRGRVQQERKNIRPNIILVLTDDQDVELGSLQVMNKTRRIMENGGASFINAFVTTPMCCPSRSSMLTGKYVHNHNIYTNNENCSSPSWQATHEPRTFAVYLNNTGYRTAFFGKYLNEYNGSYIPPGWREWVGLVKNSRFYNYTISRNGNKEKHGFDYAKDYFTDLITNESINYFRMSKRIYPHRPIMMVISHAAPHGPEDSAPQFSELYPQRFAAYHS
>KIAA1247__dre
MAVGWRPATLLLVFILTFICLSDGSTYLSGQRQRSRLQRDRRNVRPNMILILTDDQDIELGSMQAMNKTKRIMMQGGTHFSNAFATTPMCCPSRSTILTGKYVHNHHTYTNNENCSSPSWQAHHEPHTFAVHLNNSGYRTAFFGKYLNEYNGSYVPPGWREWVALVKNSRFYNYTLCRNGIXGXHGTQYPKDYLTXRITNDSINFLRMSKRMYPHRPVMMGLSHAAPHGP
>KIAA1077__str
NSPGCCPSRSSMLTGKYVHNHNIYTNNENCSSPSWQAIHEPRTFAVYLNNTGYRTVFFGK
YLNEYNGSYIPPGWREWLGLVKNSRFYNYTMCRNGFKEKHGFEYEKDYFTDLITNDSISY
FKLSKKLYPHRPIMMVISHA
>KIAA1077__str Halocynthia roretzi
MLSDFRILFLIMKRSLSAFDVTFLLLLTLACSVWSKKPNIVIMITDDQDVLLNSMEVMHHTHNELIQQGVEFANAFTTTPMCCPSRSTMLTGLYTHNHHVYTNNDNCSSTLWRKQFESRSYATYLNNSGYNTGYFGKYLNEYNGSYIPAGWKYWMGLIKNSKYYNYAVNHNSQKELHGDDYAKDYLTDLVTNRSMEFFRDSKTERPEDPVLVA
>KIAA1247__hgl Heterodera glycines
MIRPNIVLLITDDQDIELGSMAFMPKTLRLLQQRGTEFRNAFVSTPICCPSRSTILTGLYAHNHKVMTNNGNCAGEEWRSDFEKDTFAVYLERTGFLTGFFGKFLNNYDGSWVPPGWTKWAALVRNSRYYNYSLNKNGRNEWHGNRYENDYLTNLVANLSLQFIDESVLLNPHGQPFLVVLSFPAPHGPEDPAPQFGDLFE
>gi|27764275|emb|AL627362.2|CNS07TIX DNA centromeric region sequence from BAC DP15B03, DP38F06 of
chromosome 5 of Podospora anserina
Length = 156244
Score = 145 bits (365), Expect = 2e-35
Identities = 67/88 (76%), Positives = 79/88 (89%)
Frame = +1
>Sulf_pan
GKLFNAHTVENYNSPYPAGWNGSDFLLDPYTYNYLNSSFQRNQDPPKSYEGFHSVDVLAEKSLGFVDEAVRADGPFFLGIAPVAPH
>Sulf_cgl
GKLFNAQTVDNYDSPHAAGWTGSDFLDPYTYSYLNATFQRNKDAPVSHEGEYSTGVLAGKALGFLDDVVAEDKPFFLGIAPIAPH
>gi|6822088|emb|AJ271152.1|CGL271152 Colletotrichum gloeosporioides f. sp. malvae partial ars gene for
arylsulfatase, exons 1-3
Length = 1679
Score = 135 bits (341), Expect = 1e-32
Identities = 65/88 (73%), Positives = 73/88 (82%)
Frame = +2
Query: 1 GKLFNAHTVDNYDSPYIAGWNGSDFLLDPYTYSYLNATFQRNRDPPISYEGQYSVDVLAE 60
GKLFNA TVDNYDSP+ AGW GSDFL DPYTYSYLNATFQRN+D P+S+EG+YS VLA
Sbjct: 221 GKLFNAQTVDNYDSPHAAGWTGSDFL-DPYTYSYLNATFQRNKDAPVSHEGEYSTGVLAG 397
Query: 61 KAYGFLDEAAKNVHNRPFFLGIAPIAPH 88
KA GFLD+ ++PFFLGIAPIAPH
Sbjct: 398 KALGFLDDVV--AEDKPFFLGIAPIAPH 475
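
The Identities figure can be recomputed from the Query and Sbjct rows once the blocks are joined into two gapped strings of equal length. A minimal sketch: the helper names identity and percent are hypothetical, and the truncating percentage is an assumption chosen because it agrees with the figures quoted in these notes (65/88 is reported as 73%).

```python
def identity(query: str, subject: str, gap: str = "-") -> tuple[int, int]:
    """Return (identical positions, alignment length) for two gapped rows."""
    if len(query) != len(subject):
        raise ValueError("aligned rows must have the same length")
    matches = sum(1 for q, s in zip(query, subject) if q == s and q != gap)
    return matches, len(query)


def percent(matches: int, length: int) -> int:
    """Truncated percentage (assumed to match how the hits above are reported)."""
    return 100 * matches // length


if __name__ == "__main__":
    # First eight columns of the alignment above.
    m, n = identity("GKLFNAHT", "GKLFNAQT")
    print(m, n, percent(m, n))
```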
>Sulf_vca 649aa Volvox carteri Arylsulfatase sulphohydrolase 69% Chlamy Q10723 ARS_VOLCA ::123::45
RPNFVVIFTDDQDGIQNSTHPRYQPKLHEHIRYPGIELKNYFVTTPVCCPSRTNLWRGQFSHNTNFTDVLGPHGGYAKWKSLGIDKSYLPVWLQNLGYNTYYVGKFLVDYSVSNYQNVPAGWTDIDALVTPYTFDYNNPGFSRNGATPNIYPGFYSTDVIADKAVAQIKTAVAAGKPFYAQISPIAPHTSTQIYFDPVANATKTFFYPPIPAPRHWELFSDATLPEGTSHKNLYEADVSDKPAWIRALPLAQQNNRTYLEEVYRLRLRSLASVDELIDRVVATLQEAGVLDNTYLIYSADNGYHVGTHRFGAGKVTAYDEDLRVPFLIRGPGIRASHSDKPANSKVGLHVDFAPTILTLAGAGDQVGDKALDGTPLGLYANDDG ::12
NLLADYPRPANHRNQFQGEFWGGWSDEVLHHIPRYTNNSWKAVRVYDEDNQQAWKLIVSCTNERELYDLKTDPGELCNIYNKTRAAVRTRLEALLAVLVVCKGESCTNPWKILHPEGSVNSWNQSLDRKYDKYYANVAPFQYRTCLPYQDHNNEVSAFRSTVAAAAAAAAAAAAQQPGRRRMYTWTSAGRQLSATASAIATSPQPRSEPFVAEVERHSVPVPAEVLQSDVAKWFDNPLALA ::45
>Sulf_cre 647aa Chlamydomonas reinhardtii P14217 ARS_CHLRE Arylsulfatase GNS 30% ::123::45
KPNFVVIFTDDQDAIQNSTHPHYMPSLHKYIRYPGVELSQYFVTTPVCCPSRTNLXRGQFAHNTNFTSVLPPYGGWAKWKGLGIDQSYLPLWLKDQGYNTYYVGKFLVDYSVSNYQQVPRAGTISMPXVTPYTFDYNTRLQRNGATPNIYPGEYSTDVIRDKGVAQIKSAVAAGKPFYAQISPIAPHTSTQISTNPATGVTRSYFFPPIPAPPHWQLFSDANLPGGSXNKNLYEVDVSDKPAWIRALPLAQQNNRTYQEEIYRLRLRSLGPDELIEQVVKTLDEAGVLDNTYIIYSADNGYHVGAHRFGAGKTTGYEEDLRVPFLIRGPGIKASKSDKPQNSKVGLHVDFAPTILSLAGASHLLGDKGLDGTPLGLYANDDG ::12
TLPSDYPRPEQHRQQFQGEFWGGWSDELLQNLRSQPNNTWKVVRTYDESSKQGWKLIAQCTNERELYDLRKDPGELYNIYDKAKPAVRSRLEGLLAVLAVCKGESCSNPWKILHPDGTVKNFTQALNSKYDRIYNAIRPFTYKRCLPYLDWDNEDSQFKTQIRGANPAAGVGHHRLLTAASERAIATRRRAQAAVSAELADGPAVFQAKVEEKSVPVPQDILKADVEKWFAFNNAEYYLA ::45
>ARSA_afu 59aa Aspergillus fumigatus fragment 72% Sulf_psa3 55% ARSA_hsa
RPNFLVIVADDLGFSDCGCFGSEISTPNIDALAYSRGGLRFTSFHVAAACAPTRSMLMTGTDHHLTGLGQLPEYIA-LSRAHQGAPGHEGYLNERVVALPELLRDGGYYTLMSGKWHLGLKREYSPHARGFAKSYAMLSGAANHY
>ARSA_cal 59aa Candida albicans fragment 76% ARSB_spo 52% ARSA_mmu
QPNFLIIVADDLGFTDLSPFGGEINTPNLNKLATGANGVRLTDFHTASACSPTRSMLLSGTDNHIAGLGQMAEFAQRHPEKFNNQPGYEGYLNDKVVALPEILQDNGYHTFISGKWHLGLKKPYWPNKRGFNKSFTLLPGAGNHYKYITRDSQGNQIPFLPAIYVEDDKELLQPEIELPDDFYSTNYFTDKAIEFIKETPQGKPFFGMITYTAPHWPYQAPQDKIAKYNGVYDNGPEELRQKRLQSAKNLGLIDTNIIPHPIKTIRKSWDELTLLEKLKEIKIMQTYAAMVEILDENIGRLIDHLNSIDELNNTFILFMSDNGAEGMLMEALPLTNQRINKFIDEYYDNSLSNIGNKNSFTYYGDQWAQAATAPHAMYKMWSTEGAIVCPLIIHYPNLFSSSAVSGGGSGDGDGDGDGGKILKEFTTVMDILPTILELANVSHPGETYKGRQVVKPRGKSWVNYLINKTDQVHDENTVTGWELFGQQAIRKGSFKAIYIPKPFGPEKWQLFNIIEDPGEINDLSESSSEYQTILNELLDHWAVYAAETGLIELGSD
=-=-=-= new uro kiaa long =-=-=-
Boltenia villosa
Eukaryota; Metazoa; Chordata; Urochordata; Ascidiacea;
Stolidobranchia; Pyuridae; Boltenia.
REFERENCE 1 (bases 1 to 507)
AUTHORS Davidson,B. and Swalla,B.J.
TITLE A molecular analysis of ascidian metamorphosis reveals activation
of an innate immune response
JOURNAL Development 129 (20), 4739-4751 (2002)
MEDLINE 22248966
transcripts differentially expressed during metamorphosis in the ascidian Boltenia villosa by
suppressive PCR subtractions of staged larval and juvenile cDNAs. We employed a series of three subtractions to dissect gene
expression during metamorphosis. We have isolated 132 different protein coding sequences, and 65 of these transcripts show
significant matches to GenBank proteins. Some of these genes have putative functions relevant to key metamorphic events
including the differentiation of smooth muscle, blood cells, heart tissue and adult nervous system from larval rudiments.
VPYPVTQTQNPIEWEFLYPLEITKETEETLHSEQLREYQERIET
KRMIKAMRQAKLARIQKREKKGVLRLNRCGSNGVTCFKLSNATWKTEPLWNGGDQCYC
TNXNNNTYWCVRIINETTNVLYCEFITYFVEYYNLNDDPHQLVNYRDKISDEEHNALY
XEMAXLPSC
=-=-=-=new ids -=-=-=-=-=-
>IDS_dme 512aa AE003478 CG12014 gp Drosophila melanogaster 46% clearly IDS NLmPL is NLvPL genome ::123::45
RPNVVMVIFDDLRPVIGAYGDTLASTPYLDNFARGSHIFTRVYSQQSLCAPSRNSLLTGRRPDTLHLYDFYSYWRTFTGNFTTLPQYFKEHGYYTYSCGKVFHPGLSSNNTDDYPLSWSAPAFRPRTEQFMNSPVCPDKEGILRKNLICPVELQTQPYKTLPDIESVAEALRFVGSRSRHSQEPFFLAMGFHKPHINFRFPRQFLSRFNLSQFYNYTEDSLKPPDMPAVAWNPYTDVRARDDFKHSNISFPYGPISPLQAAQIRQSYYASVSYVDDLFGKLIGGLDLDETVVVALGDHGWSLGEHAEWAKYSNFEVALRVPLIIRSPQFPVAQTKYYHGITELLDVFPTLVDLAGLPKLDKCQSSQELTCG ::12
EGKSLYHQLMGLGRADEHVALSQYPRPGMLPTKHPNSDKPKLRNIKIMGYSLRTDIYRYTMWVRFHAQNFSRDWHDVYGEELYDHRLDSGEELNLmPLPQFDDVRQRLRRRLMEMVGS ::45
>IDS_gga 167aa chicken BI390037, to human 79%
MNVLFIVVDDLRPVLGCYGDNLVKSPNIDQLASQSIVFSNAYAQQAVCAPSRVSFLTGRRPDTTRLYDFYSYWRVHSGNYSTMPQYFKENGYVTMSVGKVFHPGISSNYSDDYPYSWSIPPFHPSTEKYENDKTCRGKDGRLYANLVCPIDVTEMPGGTLPDIETTEEAIRLLNVMKTKKQKFFLAVGYHKPHIPLRYPQEFLKLYPLENITLAPDPWVPEKLPPVAYNPWVDIRQRDDVKALNVTFPYGPLPDDFQRLIRQSYYAAVSYLDMQVGLLLNALDYVGLSNSTIVVFTADHGWSLGEHGE
>IDS_str Silurana tropicalis
MNLFGYLRFLMCATTVFAVWQQPFLPKHTATGGKNVLIIIADDLRTSLGCYGDSAVKSPNIDHLASQSIIFTNAYAQQAVCAPSRVSFLTGRRPDTTRLFDFNSYWRTHAGNYTTLPQYFKEHGYVTMSVGKIFHPGISSNHSDDYPYSWSVYPYHPSAEKYENSQTCKGKDGKLHANLVCPVDVSEVPEGTLPDIQSTEEAIRLLKTVKQQNASFFLAVGYHKPHIPFRFPKEFLKLYPIKNISLAPDP
>IDS_rno Rattus norvegicus
FSLLLGFFCIALVSAAQGNSATDALNILLIIVDDLRPSLGCYGDKLVRSPNIDQLASHSIVFENAFAQQAVCAPSRVSFLTGRRPDTTRLYDFNSYWRVHSGNFSTIPQYFKENGYVTMSVGKVFHPGISSNHSDDYPYSWSFPPYHPSSEKYENTKTCKGQDGKLHTNLLCPVDVADVPEGTLPDKQSTEEAIRLLEKMKTSVSPFFLAVGFHKPHIPFRYPKEF
>IDS_omy Oncorhynchus mykiss
MVLRPDSVYVSHDVVAKTVRRNVLFIMADDLRPTLGCYGDPIVKSPNIDQLASKSNVFLNAYAQQAVCGPSRTSLLTSRRPDTTRLYDFNSYWRVHAGNYTTLPQYFKSKGYTTMSVGKVFHPGIASNHSDDYPYSWSVPPYHPPSFKYENMKVCKGSDGKLHANLLCSVNVSETPLGTLPDMESTEEAIRLLKSTRNSGKNFFLAVGFHKPH
>IDS_dre
MNVMFVFTCWWFVFIFHLLGRDVFAAKSKNFNVLYLIADDLRPTLGCYSDPVVKSPNIDQLASLSVVFHNAYAQQAVCGPSRVSFLTSRRPDTTKLYDFNSYWRVHAGNYTTLPQYFKSNGYTTLSVGKVFHPGIASNHSDDYPYSWSVPPYHPPSFEYEKRKVCKDKDGTLHSNLLCPVNVSEMPLGTLPDMENTEEAIRLLRSMKGSQKPfFLSVGFYKPH
>IDS_aga Anopheles gambiae
ATDQPNVLLIILDDFRPVINYGYGDGNAITVNIDRLVQQGFFFQNAFAQQALCAPSRNSMLTGRRPDTVRLYDFYSYWRHTSGNYTTLPQYFKQHGYRTHSVGKVFHPGASSNFTDDFPLSWSEPAFHPLTDEYSNAAVCIDPADGRLKRNLLCPVRLETQPLHTLPDIESTEEAKRFLSTVGLSQPYFLAVGYRKPHIPFRIPAKYLGLHPVAKFATLDLDYPPYGLPTVAWSSY
>IDS_bmo Bombyx mori
GKVFHPGKSSNFTDDYPYSWSEYPYHPPTEMYKDAKVCRNKKTKKLERNLICPVSVKRQPGQSLPDLQSLDYAIDFLN
=-=-=-=-=new arsa ==--=-=-=-=-
>ARSA_gga
MAVWCGFPPWAVLLLWALRGAAGGPPSFVLLLADDLGFGDLGSYGHPSSATPNLD
RMAARGLRFTDFYSSSAVCSPSRAALLTGRFQMRSGIYPGVFYPGSRGGLPLSEVTIAEV
LKAKGYATAIVGKWHLGLGARGSFLPIHQGFDHFLGVPYSHDQGPCQNLTCFPPDIKCFG
TCDQGLVPVPLFWNQSIVQQPVSFLIWCRLQQICTGLHLPTAPGEALLXYYASHHTH
>SulfX_hsa 544 aa 8 exons
0 MLLLWVSVVAALALAVLAPGAGEQRRRAAKAPNVVLVVSDSF 0
0 DGRLTFHPGSQVVKLPFINFMKTRGTSFLNAYTNSPICCPSRA 1
2 AMWSGLFTHLTESWNNFKGLDPNYTTWMDVMERHGYRTQKFGKLDYTSGHHSIS 2
1 NRVEAWTRDVAFLLRQEGRPMVNLIRNRTKVRVMERDWQNTDKAVNWLRKEAINYTEPFVIYLGLNLPHPYPSPSSGENFGSSTFHTSLYWLEK 0
0 VSHDAIKIPKWSPLSEMHPVDYYSSYTKNCTGRFTKKEIKNIRAFYYAMCAETDAML 1
2 GEIILALHQLDLLQKTIVIYSSDHGELAMEHRQFYKMSMYEASAHVPLLMMGPGIKAGLQVSNVVSLVDIYPTML 1
2 DIAGIPLPQNLSGYSLLPLSSETFKNEHKVKNLHPPWILSEFHGCNVNASTYMLRTNHWKYIAYSDGASILPQLF 1
2 DLSSDPDELTNVAVKFPEITYSLDQKLHSIINYPKVSASVHQYNKEQFIKWKQSIGQNYSNVIANLRWHQDWQKEPRKYENAIDQWLKTHMNPRAV*
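
In the exon listing above, the number after each exon and the number before the next one add up to 0 or 3 at every junction. A minimal sketch of a check for that pattern, assuming the two numbers are the codon bases on either side of the intron. This reading is inferred from the numbers and is not stated in these notes; the name check_phases is hypothetical.

```python
from typing import Optional

Phase = Optional[int]


def check_phases(exons: list[tuple[Phase, Phase]]) -> list[int]:
    """Return 1-based junction indices where trailing + leading is not 0 mod 3.

    Each exon is (leading, trailing); None means the number is not given
    (for example after the terminal '*').
    """
    bad = []
    for i in range(len(exons) - 1):
        trailing = exons[i][1]
        leading = exons[i + 1][0]
        if trailing is None or leading is None:
            continue  # nothing to compare at this junction
        if (trailing + leading) % 3 != 0:
            bad.append(i + 1)
    return bad


if __name__ == "__main__":
    # Leading/trailing numbers read off the SulfX_hsa exon listing.
    sulfx_hsa = [(0, 0), (0, 1), (2, 2), (1, 0), (0, 1), (2, 1), (2, 1), (2, None)]
    print(check_phases(sulfx_hsa))
```

Under the same assumption, the SulfX_fru listing further down would be flagged at its third junction (trailing 0 followed by leading 1), which may be a transcription slip worth rechecking.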
>SulfX_mmu 553 aa 8 exons
0 MLLLLVSVVAALALAAPAPRTQKKRMQVNQAPNVVLVASDSF 0
0 DGRLTFQPGSQVVKLPFINFMRAHGTTFLNAYTNSPICCPSRA 1
2 AMWSGLFTHLTESWNNFKGLDPNYTTWMDIMEKHGYQTQKFGKVDYTSGHHSIS 2
1 NRVEAWTRDVAFLLRQEGRPIINLIPDKNRRRVMTKDWQNTDKAIEWLRQVNYTKPFVLYLGLNLPHPYPSPSSGENFGSSTFHTSLYWLEK 0
0 VAYDAIKIPKWLTLSQMHPVDFYSSYTKNCTGKFTENEIKNIRAFYYAMCAETDAML 1
2 GEIILALHKLDLLQKTIVIYTSDHGEMAMEHRQFYKMSMYEASVHVPLLMMGPGIKANLQVPSVVSLVDIYPTML 1
2 DIAGIALPPNLSGYSLLTLLSNASANEQAFKFHRPPWILSEFHGCNANASTYMLRTGQWKYIAYADGASVQPQLF 1
2 DLSLDPDELTNIATEFPEITYSLDQKLRSIVNYPKVSASVHQYNKEQFIMWKQSVGQNYSNVIAHLRWHQDWQRDPRKYENAIQHWLTAHSSPLASSPTQSTSGSQPTLPQSTSG* 0
>SulfX_rno
atgctgctgctgttggtttcagtgatcgtggcgttggcgctcgtggcaccggctcccgaaacacaggagaaaaggctgcaagtggcccaggcgcccaacgtggtgctggtcgccagtgactccttcgatggaagactaacattccaaccaggaagtcaggtagtaaaacttcccttcattaacttcatgagagcacgtggcaccaccttcctaaatgcctacactaactctcccatctgctgtccatcacgtgcagcaatgtggagtggccttttcactcacttaacagaatcttggaataattttaagggtctggatccaaattacacaacatggatggatgtcatggagaagcatggctatcagacacagaaatttggaaaactggactattcttcagggcatcattccatt
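
A nucleotide record such as the one above can be checked against a peptide record by translating it in frame 1. A minimal sketch, assuming the standard genetic code and no introns in the stretch being translated; the names CODON_TABLE and translate are illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate frame 1 of a DNA string; unknown codons become 'X'."""
    dna = dna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(dna) - 2, 3):  # trailing partial codon is ignored
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)


if __name__ == "__main__":
    # First 33 bases of the SulfX_rno nucleotide record.
    print(translate("atgctgctgctgttggtttcagtgatcgtggcg"))  # MLLLLVSVIVA
```

Mixed-case records, where case appears to mark alternating exons, translate the same way because the input is upper-cased first.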
>SulfX_rno all but one exon can be found using ESTs, HGRP, trace, assembly, and nrn.93%
MLLLLVSVIVALALVAPAPETQEKRLQVAQAPNVVLVASDSF
DGRLTFQPGSQVVKLPFINFMRARGTTFLNAYTNSPICCPSR
AAMWSGLFTHLTESWNNFKGLDPNYTTWMDVMEKHGYQTQKFGKLDYSSGHHSIS
NRVEAWTRDVAFLLRQEGRPIINLIPDKNRRRVMDKDWQNTDKAIAWLRQVNSTKPFVLYLGLNLPHPYPSPSSGENFGSSTFHTSLYWLEK
vaydaikipkwltlsqmhpvdfyssytknctgkfteneiknirafyyamcaetdaml
GEIILALHKLNLLQKTIVIYTSDHGEMAMEHRQFYKMSMYEASAHVPILMMGPGIKANLQVPSLVSLVDIYPTML
DIAGIPLPLNLSGYSLLPLSSNTSANDQAFRVHHPPWILSEFHGCNANASTYMLRTGQWKYIAYSDGTLVQPQLF
nLSLDPDELTNIATEFPEITYSLDQQLRSVINYPKVSASIHRYNKEQFIMWKQSVAQNYSNYIAHLRWHQDWQKDPRKYENAIQRWLAIHSSP
>SulfX_bta 420aa cow 88% to human positions 3-187 242 note DS not DD
MLLLWVSVVAASALAAPAPGADGQRRGAIQAWPD
APNVLLVVSDSFDGRLTFYPGSQVVKLPFINFMKAHGTSFLNAYTNSPICCPSRAAMWSGLFTHLTESWNNFKGLDPNYTTWMDVMEKHGYRTQKFGKLDYTSGHHSISNRVEAWTRDVAFLLRQEGRPMVNLAPKKTKVRVMQVDWKNTDRAVNWLRKEASNSTQPFVLYLGLNLPHPYPSPSSGENFGSSTFHTSRYWLKKVSYDAIKIPKWSPLSEMHPVDYYSSYTKTCPGKFPEKEIKNIRAFYYAMCAEXDAMLGEIILALRQLGLLQKXIVIYTSDHGELAMEHRQFYKMSMYEASSHVPLLIMGPGIQANLQVSSVVSLVDIYPTMLDIAGIPLPQNLSGYSLLPSSSEMFKNEQKFKNLHPPWILSEFHGCNVNASTYMLRTNQWKYIAYSDGASVLPQLFDLSSDPDELTNIAAKFPEVTSSLDQKL
>SulfX_ssc 219aa pig 89% similarity to human positions 117-263 455-526
GKLDYTSGHHSISNRVEAWTRDVAFLLRQEGRPMVNLISKKTKVRVMEGDWKNTDKAVKWLRKEAMNYTQPFVLYLGLNLPHPYPSPSSGENFGSSTFQTSLYWLKKVSYDAIKIPKWSPLSEMHPVDYYSSYTKNCTGKFTKKKKK QKLRSIINYPKVSASVHQYNKEQFIKWKQSIGQNYSNVIANLRWHQDWLKEPRKYESAIDQWLKTYSDPKKI
>SulfX_gga
PSRAAMWSGLFTHLTESWNNFKGLDPDYVTWMDLMQKHGYYTQKYGKLDYTSGHHSVSNRVEAWTRDVEFLLRQEGRPKVNLTGDRRHVRVMKTDWQVTDKAVTWIKKEAVNLTQPFALYLGLNLPHPYPSPYAGENFGSSTFLTSPYWLEKVKYEAIKIPTWTALSEMHPVDYYSSYTKNCTGEFTKQEVRRIRAFYYAMCAETDAML
>SulfX_omy
PSRAAMWSGRFVHLTESWNNYKCLDPNATTWMDMLQQNGYNTLSVGKLDYTSGSHSVSNRVEAWTRDVPFLLRQEGRPVTDLVGDASTTRVMTKDWRTTDIATQWIRHKAAALSQPFALYLGLNLPHPYVTDSLGPNAGGSTFRTSPYWLEKVMPEFISIPKW
>SulfX_xen 85aa frog fragment 95% to human KIAA1247_hsa
MEHWRILLLTLLMALVLPAIEGSVLSKQRMKGRFQRDRRNIRPNIILVLTDDQDVELGSMQVMNKTRRIMEQGGTHFINAFVTTPMCCPSRSSILTGKYVHNHNTYTNNENCSSPSWQAQHETHTFFVYLNNTGYRTAFFGKYLNEYNGTYVPPGW
...GSHSLSNRVEAWTRDVPFLLRQEGRPCANLTGNKTQTRVMALDWKNADTATAWIQKAAQNHSQPFFLYLGLNLPHPYPSETMGENFGSSTFLTSPYWLQKVPYKNVTIPKWKPLQSMHPVDYYSSYTKNCT
>KIAA1077_xla kiaa type
HNHNIYTNNENCSSPSWQAIHEPRTFAVYLNNTGYRTAFFGKYLNEYNGSYIPPGWREWLGLVKNSRFYNYTMCRNGFKEKHGFEYEKDYFTDLITNDSISYFKSSKKMYPHRPIMMVISHAAPHGPEDSAPQFSEFFPNASQHITPSYNYAPNMDKHWIMQYTGAMLPIHMEFTNVLHRKRLQTLLSVDDSMEKLYNMLVDTGELENTYIIYTSDHGYHIGQFGLVKGKSMPYDFDIRVPFFVRGPNVEPGSVVPQIVLNLDLAPTILDIAGAGRTP
>SulfX_ola
SRAAMWSGQFVHLTQSWNNYKCLEANATTWMDLLEENGYLTKMMGKLDFTSGSHSVSNRVEAWTRDVPFLLTQEGRPVSQLVGNTSTIKVMKKDWQNTDQASQWIRHRAAFSNQPFALYLGLNLPHPY
>SulfX_fru 8 exons 504 aa Scaffold_2094:15768-18243 SINFRUP00000086837 (62% Sulf_hsa)
0 MSVKLSALILLFLAFHQVLARNRTRPNFLVVMSDAF 0
0 DGRLTFDPGSKVVKLPFINYLRELGVTFINAYTNSPICCPSRA 1
2 AMWSGQFVHLTQSWNNYKCLDANATTWMDLLEVNGYLTKMMGKLDYTSGSHSvs 0
1 NRVEAWTRDVQFLLRQEGRPVTQLVGNMSTVRIMGKDWENIDKATQWIQQRAESSQQPFALYLGLNLPHPYKTESLGPTAGGSTFRTSPHWLEK 0
0 VSSEHVTVPKWLPGAAMHPVDFYSTFTKNCSGFFTEEEIMNIRAFYYAMCAEADAML 1
2 GQLISALRETHLLNNTVVIFTADHGELAMEHRQFYKMSMFEGSSHVPLLFMGPGLMSGVEADQLVSLVDIYPTVL 1
2 DLADVPPVGSLSGYSLLPLLSTCSSCPGRPHPDWVLSEYHGCNANASTYMLRSGRWKYIAYADGLRVPPQLF 1
2 DMILDKEELHNVVFKFSEVSAQLDKLLRSIVHYPEVSAAVHRYNKESFVAWRHTLGRNYSQVISSLRWHVDWQRNPLANERAIDEWLYGSF* 0
>SulfX_tni 8 exons CONTIG_5131_1 + CONTIG_27630_1
...DGRLTFDPGSSVVKLPFITYLQELGVTFLNAYTNSPICCP...
...GQFVHLTQSWNNFKCLDSNATTWLDLLESICYRSMRICkRDYTSGSHS...
...VSSEHVSVPKWLPVAAMHPVDLYSTFTKKCSGCFTQEEITNVRAFYYAMCARSGCHA
GQLISALRETRLLGNTVVVFTADHGELAMEHRQFYKMSMFEGSSHVPLLFTGPGLMSGVQVNQLVSLVDIYPTIL
...GRWKYLAYADGLSVPPQLF
DLSLDKEELHNVVFKFTDVYAHLDKLLRSIVDYPAVSAAVHLYNKKAFVAWSQTLGRNYSQVISN...
>SulfX_hsa 1584 bp 8 exons
ATGCTACTGCTGTGGGTGTCGGTGGTCGCAGCCTTGGCGCTGGCGGTACTGGCCCCCGGAGCAGGGGAGCAGAGGCGGAGAGCAGCCAAAGCGCCCAATgatggaaggttaacatttcatccaggaagtcaggtagtgaaacttccttttatcaactttatgaagacacgtgggacttcctttctgaatgcctacacaaactctccaatttgttgcccatcacgcgcagCAATGTGGAGTGGCCTCTTCACTCACTTAACAGAATCTTGGAATAATTTTAAGGGTCTAGATCCAAATTATACAACATGGATGGATGTCATGGAGAGGCATGGCTACCGAACACAGAAATTTGGGAAACTGGACTATACTTCAGGACATCACTCCATTAGtaatcgtgtggaagcgtggacaagagatgttgctttcttactcagacaagaaggcaggcccatggttaatcttatccgtaacaggactaaagtcagagtgatggaaagggattggcagaatacagacaaagcagtaaactggttaagaaaggaagcaattaattacactgaaccatttgttatttacttgggattaaatttaccacacccttacccttcaccatcttctggagaaaattttggatcttcaacatttcacacatctctttattggcttgaaaaaGTGTCTCATGATGCCATCAAAATCCCAAAGTGGTCACCTTTGTCAGAAATGCACCCTGTAGATTATTACTCTTCTTATACAAAAAACTGCACTGGAAGATTTACAAAAAAAGAAATTAAGAATATTAGAGCATTTTATTATGCTATGTGTGCTGAGACAGATGCCATGCTTGgtgaaattattttggcccttcatcaattagatcttcttcagaaaactattgtcatatactcctcagaccatggagagctggccatggaacatcgacagttttataaaatgagcatgtacgaggctagtgcacatgttccgcttttgatgatgggaccaggaattaaagccggcctacaagtatcaaatgtggtttctcttgtggatatttaccctaccatgcttgATATTGCTGGAATTCCTCTGCCTCAGAACCTGAGTGGATACTCTTTGTTGCCGTTATCATCAGAAACATTTAAGAATGAACATAAAGTCAAAAACCTGCATCCACCCTGGATTCTGAGTGAATTCCATGGATGTAATGTGAATGCCTCCACCTACATGCTTCGAACTAACCACTGGAAATATATAGCCTATTCGGATGGTGCATCAATATTGCCTCAACTCTTTGatctttcctcggatccagatgaattaacaaatgttgctgtaaaatttccagaaattacttattctttggatcagaagcttcattccattataaactaccctaaagtttctgcttctgtccaccagtataataaagagcagtttatcaagtggaaacaaagtataggacagaattattcaaacgttatagcaaatcttaggtggcaccaagactggcagaaggaaccaaggaagtatgaaaatgcaattgatcagtggcttaaaacccatatgaatccaagagcagtttga
>SulfX_mmu 1662 bp 8 exons
ATGCTGTTGCTGTTGGTGTCGGTGGTCGCAGCGTTAGCACTCGCAGCACCGGCCCCCAGAACACAGAAGAAAAGGATGCAAGTGAACCAGGCGCCCAACGTGGTGCTGGTCGCCAGTGACTCCTTCgatggaagactaacatttcaaccaggaagtcaggtagtaaaacttcccttcattaacttcatgagagcacatggcaccaccttcctaaatgcctacactaattcacccatctgctgtccatcacgtgcagCAATGTGGAGTGGCCTCTTCACTCACTTGACAGAATCTTGGAATAATTTTAAGGGTCTGGATCCAAATTATACGACATGGATGGACATCATGGAGAAGCATGGCTATCAGACACAGAAATTTGGAAAAGTGGACTATACTTCAGGACATCATTCCATTAGtaaccgtgtggaagcatggacaagagatgttgcattcttgctccgacaagaaggcagacccataattaatcttatccctgataagaatagaaggagagtgatgaccaaggactggcagaatacagacaaagcaatcgaatggctaagacaggttaactacaccaagccatttgtcctttacttgggattgaatttgccacacccttacccttcaccatcttcaggagaaaactttggctcttctacgtttcacacttccctttactggcttgaaaagGTAGCTTATGATGCAATCAAAATCCCAAAGTGGTTGACTTTGTCACAAATGCACCCTGTGGATTTTTACTCCTCCTATACAAAAAACTGCACTGGGAAATTTACTGAAAATGAAATTAAGAACATTAGAGCATTTTATTATGCTATGTGTGCTGAGACAGATGCCATGCTAGgtgaaattattttggctcttcacaagttagatcttcttcagaaaactattgttatatatacctcagaccatggagagatggctatggaacaccgccagttttataaaatgagtatgtatgaagctagtgtccatgttcctcttctgatgatgggaccaggaattaaggccaacctacaagtaccaagtgttgtttctcttgtggatatctaccctactatgcttgACATTGCTGGGATTGCTCTGCCTCCAAATCTGAGTGGATACTCCTTGTTGACGCTGTTGTCAAATGCATCTGCAAATGAACAGGCATTCAAATTCCACCGTCCACCTTGGATTCTGAGTGAATTCCATGGATGCAATGCAAATGCTTCTACCTACATGCTACGAACTGGCCAGTGGAAGTACATAGCCTACGCTGATGGTGCTTCCGTGCAGCCTCAGCTCTTCGatctttccttggatccggatgagctaacaaacattgctacagaatttccagaaattacttattctttggaccagaagcttcgttctattgtaaactaccctaaagtgtctgcttctgtccatcagtacaataaagaacagtttatcatgtggaagcaaagcgtagggcaaaattactcaaacgttatagcacacctcagatggcatcaagattggcagagagatccaaggaagtatgaaaatgcaatccaacattggctcacagcccactccagtccactggctagcagcccaacccagtccaccagtggctcacagcccactcttccccagtccaccagtggctga
>SulfX_fru 8 exons 1569 bp Scaffold_2094
ATGTCGGTAAAATTATCAGCCCTGATTTTGCTTTTTCTGGCTTTTCATCAAGTTTTGGCCCGTAATAGAACCCGACCAAACTTTTTGGTGGTGATGAGTGATGCTTTTgacggacgattgacctttgaccctggcagcaaagttgtgaagctgccattcataaactacctccgagagcttggtgtcaccttcataaatgcttacacgaactcacccatctgctgcccctcccgcgcagCAATGTGGAGCGGTCAGTTTGTTCACCTCACGCAGTCGTGGAACAACTACAAGTGTCTTGATGCCAATGCGACAACATGGATGGATCTGCTGGAGGTGAATGGTTACCTTACCAAGATGATGGGTAAGCTGGACTACACCTCAGGGAGCCACTCTGTCAGcaatcgagttgaggcgtggacacgagacgttcagtttcttctgcgccaagagggccggcctgttacgcaacttgttgggaacatgtcaacagtcaggatcatgggaaaagactgggaaaacatagacaaggctacgcagtggatccagcagagagccgaatcctcacagcagccattcgctctctacctcggcctgaacctgcctcacccctataaaaccgaatccctggggccgacggcaggaggctccaccttccgtacctcaccacactggctggagaagGTGTCTTCTGAGCATGTCACTGTTCCTAAATGGCTTCCAGGCGCCGCCATGCACCCCGTGGACTTCTACTCCACCTTCACCAAAAACTGCAGTGGCTTTTTTACTGAGGAGGAAATCATGAACATACGAGCCTTCTATTACGCCATGTGTGCTGAAGCTGATGCCATGCTGGgtcagctgatctcagccctgagagaaacccatctgctcaacaacaccgttgttatattcacggctgaccacggcgaactggccatggagcaccggcagttctacaaaatgtccatgtttgaaggcagttcccatgttcccctcctcttcatggggccaggtctgatgtcaggcgttgaggccgaccagctagtttccctggttgatatatatcccaccgtcttggACCTTGCCGACGTTCCACCTGTCGGTAGTCTGAGTGGCTACTCGCTCCTTCCTCTGCTGTCCACGTGCAGCTCTTGTCCAGGCAGACCACATCCAGACTGGGTTCTAAGTGAATATCATGGTTGTAACGCCAATGCGTCTACCTACATGCTCAGAAGTGGTCGTTGGAAGTATATTGCTTATGCAGATGGCCTGAGGGTCCCTCCGCAGCTTTTTGatatgattctggacaaggaggaactgcataatgtagtcttcaaattctcagaggtgtctgcacagttggacaagctgttgcgtagcatagtgcactacccagaagtctcagcagccgtccaccggtacaataaagagtcgtttgttgcctggagacacactttagggagaaactacagccaagtcatttcgtccctcaggtggcacgtggattggcaaaggaatccattagccaatgagagagctatagatgagtggctctatggctctttttaa
>GNS_hsa 552aa 12q14
GKYLNEYGAPDAGGLEHVPLGWSYWYALEKNSKYYNYTLSINGKARKHGENYSVDYLTDVLANVSLDFLDYKSNFEPFFMMIATPAPHSPWTAAPQYQKAFQNVFAPRNKNFNIHGTNKHWLIRQAKTPMTNSSIQFLDNAFRKRWQTLLSVDDLVEKLVKRLEFTGELNNTYIFYTSDNGYHTGQFSLPIDKRQLYEFDIKVPLLVRGPGIKPNQTSKM
>GNS_chi 559aa goat
GKYLNEYGAPDAGGLGHVPLGWSYWYALEKNSKYYNYTLSINGKARKHGENYSVDYLTDVLANVSLDFLDYKSNSEPFFMMISTPAPHSPWTAAPQYQNAFQNVFAPRNKNFNIHGTNKHWLIRQAKTPMTNSSIQFLDNAFRERWQTLLSVDDLVEKLVKRLEFNGELNNTYIFYTSDNGYHTGQFSLPIDKRQLYEFDIKVPLLVRGPGIKPNQTSKM
>GNS_mmu 544aa mouse
GKYLNEYGAPDAGGLEHIPLGWSYWYALEKNSKYYNYTLSINGKARKHGENYSVDYLTDVLANLSLDFLDYKSNSEPFFMMISTPAPHSPWTAAPQYQKAFQNVIAPRNKNFNIHGTNKHWLIRQAKTPMTNSSIRFLDDAFRRRWQTLLSVDDLVEKLVKRLDSTGELDNTYIFYTSDNGYHTGQFSLPIDKRQLYESDIKVPLLVRGPGIKPNQTSKM
>GNS_rno 544aa rat
GKYLNEYGAPDAGGLEHVPLGWSYWYALEKNSKYYNYTLSINGKARRHGENYSVDYLTDVLANLSLDFLDYKSNSEPFFMMISTPAPHSPWTAAPQYQKAFPNVIAPRNKNFNIHGTNKHWLIRQAKTPMTNSSIKFLDDAFRRRWQTLLSVDDLVEKLVKRLDSTGELDNTYIFYTSdngyhtgqfslpidkrqlyesdikvpllvrgpgikpnqtskm
>KIAA1077_hsa 871aa
GKYLNEYNGSYIPPGWREWLGLIKNSRFYNYTVCRNGIKEKHGFDYAKDYFTDLITNESINYFKMSKRMYPHRPVMMVISHAAPHGPEDSAPQFSKLYPNASQHITPSYNYAPNMDKHWIMQYTGPMLPIHMEFTNILQRKRLQTLMSVDDSVERLYNMLVETGELENTYIIYTADHGYHIGQFGLVKGKSMPYDFDIRVPFFIRGPSVEPGSIVPQIVL
>KIAA1077_mmu 870 aa
GKYLNEYNGSYIPPGWREWLGLIKNSRFYNYTVCRNGIKEKHGFDYAKDYFTDLITNESINYFKMSKRMYPHRPIMMVISHAAPHGPEDSAPQFSKLYPNASQHITPSYNYAPNMDKHWIMQYTGPMLPIHMEFTNVLQRKRLQTLMSVDDSVERLYNMLVESGELDNTYIIYTADHGYHIGQFGLVKGKSMPYDFDIRVPFFIRGPSIEPGSIVPQIVL
>KIAA1077_rno floor plate
GKYLNEYNGSYIPPGWREWLGLIKNSRFYNYTVCRNGIKEKHGFDYAKDYFTDLITNESINYFKMSKRMYPHRPVMMVISHAAPHGPEDSAPQFSKLYPNASQHITPSYNYAPNMDKHWIMQYTGPMLPIHMEFTNVLQRKRLQTLMSVDDSVERLYNMLVETGELGNTYIIYTADHGYHIGQFGLVKGKSMPYDFDIRVPFFIRGPSIEPGSIVPQIVL
>KIAA1077_cco 867aa quail
GKYLNEYNGSYIPPGWREWVGLVKNSRFYNYTISRNGNKEKHGFDYAKDYFTDLITNESINYFRMSKRIYPHRPIMMVISHAAPHGPEDSAPQFSELYPNASQHITPSYNYAPNMDKHWIMQYTGPMLPIHMEFTNVLQRKRLQTLMSVDDSMERLYQMLAEMGELENTYIIYTADHGYHIGQFGLVKGKSMPYDFDIRVPFFIRGPSVEPGSVVPQIVL
>KIAA1247_hsa 870aa chr 20q13.12
GKYLNEYNGSYVPPGWKEWVGLLKNSRFYNYTLCRNGVKEKHGSDYSKDYLTDLITNDSVSFFRTSKKMYPHRPVLMVISHAAPHGPEDSAPQYSRLFPNASQHITPSYNYAPNPDKHWIMRYTGPMKPIHMEFTNMLQRKRLQTLMSVDDSMETIYNMLVETGELDNTYIVYTADHGYHIGQFGLVKGKSMPYEFDIRVPFYVRGPNVEAGCLNPHIVL
>KIAA1247_mmu 875aa chr 2
gkylneyngSyvppgwkewvgllknsrfynytlcrngvkekhgsdystDYLTDLITNDSVSFFRTSKKMYPHRPVLMVISHAAPHGPEDSAPQYSRLFPNASQHITPSYNYAPNPDKHWIMRYTGPMKPIHMEFTNMLQRKRLQTLMSVDDSMETIYDMLVETGELDNTYILYTADHGYHIGQFGLVKGKSMPYEFDIRVPFYVRGPNVEAGSLNPHIVL
>ARSA_hsa 507aa 22q13.33 metachromatic leukodystrophy
GKWHLGVGPEGAFLPPHQGFHRFLGIPYSHDQGPCQNLTCFPPATPCDGGCDQGLVPIPLLANLSVEAQPPWLPGLEARYMAFAHDLMADAQRQDRPFFLYYASHHTHYPQFSGQSFAERSGRGPFGDSLMELDAAVGTLMTAIGDLGLLEETLVIFTADNGPETMRMSRGGCSGLLRCGKGTTYEGGVREPALAFWPGHIAPGVTHELASSLDLLPTLA
>ARSA_bta 505aa lower case 44aa missing, substituted human
GKWHLGVGPEGAFLPPHHGFHRFLGIPYSHDQGPCQNLTCFPPATPCEGICDQGLVPIPLLANLSVEAQPPWLPGLEARYVAFARDLMTDAQHQGRPFFLYYASHHTHYPQFSGQSFPGHSGRGPFGDSLMELDAAVGALMTAVGDLGLLGETLVFFTADNGPETMRMSHGGCSGLLRCGkgttyeggvrepalafwpghiapgvthelassldllptla
>ARSA_ssc 504aa pig lower case missing, substituted cow
GKWHLGVGPEGAFLPPHQGFHRFLGIPYSHDQGPCQNLTCFPPSTPCDGSCDQGLVPVPLLANLSVEAQPPWLPGLEARYVAFARDLMADAQRQGRPFFLYyAsHHTHYPQFSGQSFSGHSgRGPFGDSLMELDASVGALMTAVGDLGLLGETLVIFTADNGPETMRMSHGGCSGLLrCGKGttFEGGVREPALAFWPGHIAPGVTHELASSLDLLPTLA
>ARSA_mmu 506aa mouse
GKWHLGVGPEGAFLPPHQGFHRFLGIPYSHDQGPCQNLTCFPPDIPCKGGCDQGLVPIPLLANLTVEAQPPWLPGLEARYVSFSRDLMADAQRQGRPFFLYYASHHTHYPQFSGQSFTKRSGRGPFGDSLMELDGAVGALMTTVGDLGLLEETLVIFTADNGPELMRMSNGGCSGLLRCGKGTTFEGGVREPALVYWPGHITPGVTHELASSLDLLPTLA
>GALNS_hsa 522aa 16q24.3
GKWHLGHRPQFHPLKHGFDEWFGSPNCHFGPYDNKARPNIPVYRDWEMVGRYYEEFPINLKTGEANLTQIYLQEALDFIKRQARHHPFFLYWAVDATHAPVYASKPFLGTSQRGRYGDAVREIDDSIGKILELLQDLHVADNTFVFFTSDNGAALISAPEQGGSNGPFLCGKQTTFEGGMREPALAWWPGHVTAGQVSHQLGSIMDLFTTSLALAGLTPP
>GALNS_mmu 520aa chr 8 mouse
GKWHLGHRPQFHPLKHGFDEWFGSPNCHFGPYDNKAKPNIPVYRDWEMVGRFYEEFPINRKTGEANLTQLYLQEALDFIRTQHARQGPFFLYWAIDATHAPVYASRQFLGTSLRGRYGDAVREIDDSVGKILNLLQNLGISKNTFVFFTSDNGAALISAPNEGGSNGPFLCGKQTTFEGGMREPAIAWWPGHIAAGQVSHQLGSIMDLFTTSLSLAGLKP
>KIAA1001_hsa 525aa 17q24.2 ten exons
GKWHLGHHGSYHPNFRGFDYYFGIPYSHDMGCTDTPGYNHPPCPACPQGDGPSRNLQRDCYTDVALPLYENLNIVEQPVNLSSLAQKYAEKATQFIQRASTSGRPFLLYVALAHMHVPLPVTQLPAAPRGRSLYGAGLWEMDSLVGQIKDKVDHTVKENTFLWFTGDNGPWAQKCELAGSVGPFTGFWQTRQGGSPAKQTTWEGGHRVPALAYWPGRVPV
>KIAA1001_mmu 526aa
GKWHLGHHGSYHPNFRGFDYYFGIPYSNDMGCTDAPGYNYPPCPACPQRDGLWRNPGRDCYTDVALPLYENLNIVEQPVNLSGLAQKYAERAVEFIEQASTSGRPFLLYVGQAHMHVPLSVTPPLAHPQRQSLYRASLREMDSLVGQIKDKVDHVARENTLLWFTGDNGPWAQKCELAGSVGPFFGLWQTHQGGSPTKQTTWEGGHRVPALAYWPGRVPA
>STS_hsa 583 aa 10 exons 79 aa
GKWHLGMSCHSKTDFCHHPLHHGFNYFYGISLTNLRDCKPGEGSVFTTGFKRLVFLPLQIVGVTLLTLAALNCLGLLHVPLGVFFSLLFLAALILTLFLGFLHYFRPLNCFMMRNYEIIQQPMSYDNLTQRLTVEAAQFIQR
>ARSD_hsa 593 aa 10 exons 79 aa
GKWHQGVNCASRGDHCHHPLNHGFDYFYGMPFTLTNDCDPGRPPEVDAALRAQLWGYTQFLALGILTLAAGQTCGFFSVSARAVTGMAGVGCLFFISWYSSFGFVRRWNCILMRNHDVTEQPMVLEKTASLMLKEAVSYIEr
>ARSE_hsa 589 aa 10 exons 79 aa
GKWHLGLNCESASDHCHHPLHHGFEHFYGMPFSLMGDCARWELSEKRVNLEQKLNFLFQVLALVALTLVAGKLTHLIPVSWMPVIWSALSAVLLLASSYFVGALIVHADCFLMRNHTITEQPMCFQRTTPLILQEVASFLKR
>ARSF_hsa 590 aa 10 exons 79 aa
GKWHQGLNCDSRSDQCHHPYNYGFDYYYGMPFTLVDSCWPDPSRNTELAFESQLWLCVQLVAIAILTLTFGKLSGWVSVPWLLIFSMILFIFLLGYAWFSSHTSPLYWDCLLMRGHEITEQPMKAERAGSIMVKEAISFLER
>ARSG_hsa 699 aa 11 exons 79 aa
GKWHLGLSCASRNDHCYHPLNHGFHYFYGVPFGLLSDCQASKTPELHRWLRIKLWISTVALALVPFLLLIPKFARWFSVPWKVIFVFALLAFLFFTSWYSSYGFTRRWNCILMRNHEIIQQPMKEEKVASLMLKEALAFIER
>STS_mmu 624aa mouse bad RefSeq chr? 79 aa
GKWHLGLSCRGATDFCHHPLRHGFDRFLGVPTTNLRDCRPGAGTVFGPALRVFAAGPLAALGASLAAMAAARWAGLARVPGWALAGTAAAMLAVGGPRSASCLGFRPANCFLMDDLAVAQRPTDYGGLTRRLADEAALFLRR
>STS_rno 578aa rat chrX:87654459-87661379 79 aa
GKWHLGLSCQAASDFCHHPGRHGFDRFLGTPTTNLRDCKPGGGTVFGSAQQVFVVLPMNILGAVLLAMALARWAGLARPPGWVFGVTVAAMAAVGGAYVAFLYHFRPANCFLMADFTITQQPTDYKGLTQRLASEAGDFLRR
>ARSE_bta cow Bos taurus 71% complete 80 aa
GKWHLGLSCASPDDHCHHPLNHGFDHFYGMPFSMMADCERWELSEKRAVLESRLDVCFQLVALATLTLTIGKLTHLIPGASWTLVIWSAVVCLLLFATSCLVGALIMHADCFLMRNHSIAEQPMRSQRTTPLMLQEVSSFVKRHKQGPFLLFVSFLHVH
>SulfX_mmu 556 aa mouse 82% MPA N-trim BAB28703.1| AK01319 ::123::45 105
APNVVLVASDSFDGRLTFQPGSQVVKLPFINFMRAHGTTFLNAYTNSPICCPSRAAMWSGLFTHLTESWNNFKGLDPNYTTWMDIMEKHGYQTQKFGKVDYTSGHHSISNRVEAWTRDVAFLLRQEGRPIINLIPDKNRRRVMTKDWQNTDKAIEWLRQVNYTKPFVLYLGLNLPHPYPSPSSGENFGSSTFHTSLYWLEKVAYDAIKIPKWLTLSQMHPVDFYSSYTKNCTGKFTENEIKNIRAFYYAMCAETDAMLGEIILALHKLDLLQKTIVIYTSDHGEMAMEHRQFYKMSMYEASVHVPLLMMGPGIKANLQVPSVVSLVDIYPTMLDIAGIALPPNLSGYSLLTLLSN ::12
ASANEQAFKFHRPPWILSEFHGCNANASTYMLRTGQWKYIAYADGASVQPQLFDLSLDPDELTNIATEFPEITYSLDQKLRSIVNYPKVSASVHQYNKEQFIMWKQSVGQNYSNVIAHLRWHQDWQRDPRKYENAIQHWLTAHSSPLASSPTQSTSGSQPTLPQSTSG ::45
>ARSB_hsa 533aa 5p11q13 MPSVI PDB:1AUK
GKWHLGMYRKECLPTRRGFDTYFGYLLGSEDYYSHERCTLIDALNVTRCALDFRDGEEVATGYKNMYSTNIFTKRAIALITNHPPEKPLFLYLALQSVHEPLQVPEEYLKPYDFIQDKNRHHYAGMVSLMDEAVGNVTAALKSSGLWNNTVFIFSTDNGGQTLAGGNNWPLRGRKWSLWEGGVRGVGFVASPLLKQKGVKNRELIHISDWLPTLVKLARG
>ARSB_mmu 491aa mouse from Ests
GKWHLGMYRKECLPTRRGFDTYFGYLLGSEDYYTHEACAPIESLNGTRCALDLRDGEEPAKEYTNIYSTNIFTKRATPVIATHPPEKPLFLYLAFQSVHDPLQVPEEYMEPYGFIQDKHRRIYAGMVSLMDEAVGNVTKALKSHGLWNNTVFIFSTDNGGQTRSGGNNWPLRGRKGTLWEGGIRGTGFVASPLLKQKGVKSRELMHISDWLPTLVDLAGG
>ARSB_rno 533aa lower case missing, substituted mouse
GKWHLGMYRKECLPTRRGFDTYFGYLLGSEDYYTHEACAPIECLNGTRCALDLRDGEEPAKEYTDIYSTNIFTKRATTLIANHPPEKPLFLYLAFQSVHDPLQVPEEYMEPYDFIQDKHRRIYAGMVSLLDEAVGNVTKALKSRGLWNNTVLIFSTDNGGQTRSGGNNWPLRGRKGTLWEGGIRGAGFVASPLLKQKGVKSRELMHITDWLPTLVNLAGG
>ARSB_fca 535aa cat
GKWHLGMYRKECLPTRRGFDTYFGYLLGSEDYYSHERCALIDSLNVTRCALDFRDGEQVATGYKNMYSTNIFTERATALITSHPPEKPLFLYLALQSVHEPLQVPEEYLKPYDFIQDKNRHYYAGMVSLMDEAVGNVTAALKSHGLWNNTVFIFSTDNGGQTLAGGNNWPLRGRKWSLWEGGIRGVGFVASPLLKQKGVKNRELIHISDWLPTLVKLARG
>SulfY_hsa 574aa 2 exons chr4:125570802-125648441 size 77640 strand polym fitK/R gen/EST
GKWHLGFYRKECMPTRRGFDTFFGSLLGSGDYYTHYKCDSPGMCGYDLYENDNAAWDYDNGIYSTQMYTQRVQQILASHNPTKPIFLYIAYQAVHSPLQAPGRYFEHYRSIININRRRYAAMLSCLDEAINNVTLALKTYGFYNNSIIIYSSDNGGQPTAGGSNWPLRGSKGTYWEGGIRAVGFVHSPLLKNKGTVCKELVHITDWYPTLISLAEGQIDE
>SulfY_mmu 573aa from mouse htgs AC091322 94% human 2 or 3 exons
GKWHLGFYRKDCMPTKRGFDTFFGSLLGSGDYYTHYKCDSPGVCGYDLYENDNAAWDYDNGIYSTQMYTQRVQQILATHDPTKPLFLYVAYQAVHSPLQAPGRYFEHYRSIININRRRYAAMLSCLDEAIHNVTLALKRYGFYNNSIIIYSSDNGGQPTAGGSNWPLRGSKGTYWEGGIRAVGFVHSPLLKNKGTVCKELVHITDWYPTLISLAEGQIDE
>SulfZ_hsa 569aa NT_006951 ARSB type another chr 5 gene 2 exon, 4 glyco QLLTGR end of exon1
GKWHLGFYRKECLPTRRGFDTFLGSLTGNVDYYTYDNCDGPGVCGFDLHEGENVAWGLSGQYSTMLYAQRASHILASHSPQRPLFLYVAFQAVHTPLQSPREYLYRYRTMGNVARRKYAAMVTCMDEAVRNITWALKRYGFYNNSVIIFSSDNGGQTFSGGSNWPLRGRKGTYWEGGVRGLGFVHSPLLKRKQRTSRALMHITDWYPTLVGLAGGTTSAA
>SulfZ_mmu 572aa 95% human AA123795
GKWHLGFYRKECLPTRRGFDTFLGSLTGNVDYYTYDNCDGPGVCGFDLHEGESVACGLSGQYSTMLYAQRASHILARHNPQNPLFLYVAFQAVHTPLQSPREYLYRYRTMGNVAQRKYAGMVTCMDEAVRNITWALKRYGFYNNSVIIFSSDNGGQTFSGGSNWPLRGRKGTYWEGGVRGLGFVHSPLLKKKRRTSRALVHITDWYPTLVGLAGGTTSAA
>IDS_hsa 550aa Xq27.3q28 i
GKVFHPGISSNHTDDSPYSWSFPPYHPSSEKYENTKTCRGPDGELHANLLCPVDVLDVPEGTLPDKQSTEQAIQLLEKMKTSASPFFLAVGYHKPHIPFRYPKEFQKLYPLENITLAPDPEVPDGLPPVAYNPWMDIRQREDVQALNISVPYGPIPVDFQRKIRQSYFASVSYLDTQVGRLLSALDDLQLANSTIIAFTSDHGWALGEHGEWAKYSNFDV
>IDS_mmu 563aa chr X 27
GKVFHPGISSNHSDDYPYSWSFPPYHPSSEKYENTKTCKGQDGKLHANLLCPVDVADVPEGTLPDKQSTEEAIRLLEKMKTSGSPFFLAVGYHKPHIPFRYPKEFQKLYPLENITLAPDPHVPDSLPPVAYNPWMDIREREDVQALNISVPYGPIPEDFQRKIRQSYFASVSYLDTQVGHVLSALDDLRLAHNTIIAFTSDHGWALGEHGEWAKYSNFDV
>SGSH_hsa 502aa 17q25.3 MPSIIIA
GKKHVGPETVYPFDFAYTEENGSVLQVGRNITRIKLLVRKFLQTQDDRPFFLYVAFHDPHRCGHSQPQYGTFCEKFGNGESGMGRIPDWTPQAYDPLDVLVPYFVPNTPAARADLAAQYTTVGRMDQGVGLVLQELRDAGVLNDTLVIFTSDNGIPFPSGRTNLYWPGTAEPLLVSSPEHPKRWGQVSEAYVSLLDLTPTILDWFSIPYPSYAIFGSKTI
>SGSH_bta 505aa cow lower case missing, substituted pig needs fix
GKKHVGPEMVYPFDFAYTEENGSVLQVGRNITRIKLLVRKFLQTRGDRPFFLYVAFHDPHRCGHSQPQYGAFCEKFGNGESGMGRIPDWTPQTYNPKDVQVPYFVPDTPAARADLAAQYTTIGRMDQGIGLVLQELRGAGVLNDTLVIFTSDNGIPFPSGRTNLYWPGTAEPMLVSSPEHPKRWGQVSEAYVSLLDLTPTILDWFSIPphyYAIFGTKTV
>SGSH_ssc 505aa pig
GKKHVGPEAVYPFDFAHTEENDSILQVGRNITRMKLLVRKFLQTQDDRPFFLYVAFHDPHRCGHSHPQYGAFCEKFGNGESGMGWIPDWTPQTYNPQDVQVPYFVPDTPAARADLAAQYTTIGRMDQGIGLVLQELRGAGVLNDTLVIFTSDNGVPFPSGRTNLYWPGAAEPLLVSSPEHPQRWGQVSEAYVSLLDLTPTVLDWFSIPYPHYAIFGSKTV
>SGSH_mmu 502aa chr 11 Mus MPSIIIA
GKKHVGPETVYPFDFAFTEENSSVMQVGRNITRIKQLVQKFLQTQDDRPFFLYVAFHDPHRCGHSQPQYGTFCEKFGNGESGMGYIPDWTPQIYDPQDVMVPYFVPDTPAARADLAAQYTTIGRMDQGVGLVLQELRGAGVLNDTLIIFTSDNGIPFPSGRTNLYWPGTAEPLLVSSPEHPQRWGQVSDAYVSLLDLTPTILDWFSIPYPSYAIFGSKTI
>SGSH_clu 507aa AF217204 6108 bp 8 exons dog
GKKHVGPESVYPFEFAHTEENSSVLQVGRNITRIKLLVRKFLQTQDDRPFFLYVAFHDPHRCGHSQPQFGTFCEKFGNGESGMGRIPDWTPQTYDPLDVLVPYFVPDTPAARADLAAQYTTIGRMDQGVGLVLQELRGAGVLNDTLVIFTSDNGIPFPSGRTNLYWPGTAEPLLISSPEHRKRWGQVSEAYVSLLDLTPTILDWFSIPYPSYAIFGSKTV
>KIAA47/77_cel 709aa WP:CE04736 Sulf2_C.elegans K09C4.8 KIAA1077_mmu 51% CE04736 14 exons U43375
GKYLNEYDGSYIPPGWDEWHAIVKNSKFYNYTMNSNGEREKFGSEYEKDYFTDLVTNRSLKFIDKHIKIRAWQPFALIISYPAPHGPEDPAPQFAHMFENEISHRTGSWNFAPNPDKQWLLQRTGKMNDVHISFTDLLHRRRLQTLQSVDEGIERLFNLLRELNQLWNTYAIYTSDHGYHLGQFGLLKGKNMPYEFDIRVPFFMRGPGIPRNVTFNEIVT
>ARSE_hpu 551aa sea urchin Hemicentrotus pulcherrimus 22 aa
GKWHLGINENSSTDGAHLPFNHGFDFVGHNLPFTNSWSCDDTGLHKDFPDSQRCYLYVNATLVSQPYQHKGLTQLFTDDALGFIE
>ARSE_her 559aa sea urchin Heliocidaris erythrogramma 22 aa
GKWHLGINEQTSTDGAHLPFNHGFEYVGYNLPFTNSWNCDDTGLHVDFPNTEKCYLYKNATLVSQPYQHRNLTKLFTDDAIEFID
>ARSE_spu 567aa sea urchin Strongylocentrotus purpuratus 22 aa
GKWHLGINENSSSDGAHLPANRGFDFVGHNLPFGNSWRCDDTGLHQDFPDTNACFLYYNSTSVAQPFQHKGLTQLLRDDTVGFIE
>ARSE_cel 452aa Sulf1_C.elegans 23 aa
GKWHLGINENNATDGAHLPSKRGFEYVGVNLPFTNVWQCDTTREFYDKGPDPSLCFLYDGDDIVQQPMKFEHMTENLVGDWKRFLM
>ARSE_cbr 421 aa Sulf1_C.briggsae 23 aa
GKWHLGINENNATDGAHLPSKRGFDYVGVNLPFTNVWQCDTTKEYYDKGPDPSLCFLYDGDDIVQQPMKFEHMTENLVGDWKRFLM
>ARSB_dm1 542aa Drosophila melanogaster 542aa 51%
GKWHLGHWKLKYTPLYRGFSSHWGLDMRNGTQVAYDLHGHYTTDVITDHSVKVIANHNATKGPLFLYVAHAACHSSNPYNPLPVPDNDVIKMSHIPNYKRRKFAAMVSKMDNSVGQIVDQLRKSNMLENSIIIFSSDNGGPAQGFNLNFASNYPLKGVKNTLWEGGVRAAGLMWSPLLKKSQRVSNQTMHIIDWLPTLLEAAGGQPALSNLSKQIDGQSI
>ARSB_dm3 996aa Drosophila melanogaster
GKWHLGFSRPEYTPTRRGFDYHFGYWGAYIDYFQRRSKMPVANYSLGYDFRRNMELECRDRGVYVTDLLTAEAERLIKDHADKEQPLFLMLSHLAAHTANEDDPLQAPEEEIQKFSYIKDPNRRKYAAMISKLDQSVGRIITALSSTDQLENSIVIFYSDNGAPSVGMFSNTGSNFPLRGQKNTPWEGGVRVAGAIWSSgLQARGSIFRQPLYVADWLPT
>ARSB_dm2 579aa Drosophila melanogaster 43% ARSB
GKWHLGFWRKDLTPTMRGFDHHFGYYNGYIDYYDHQVRMLDRNYSAGLDFRRDLEPCPEANGTYATEAFTSEAKRIIEQHDKSKPLFMVLSHLAVHTGNEDSPMQAPEEEVAKFPHIRDPKRRTYAGMISSLDKSVAQTIGALKDNGMLNNSIILLYSDNGAPTIGIHSNAGSNYPYRGQKESPWEGGIRSAGALWSPLLKERGYVSNQAIHAVDWLPTL
>ARSB_dm4 585aa Drosophila melanogaster
GKWHLGLSQRNFTPTERGFDRHLGYLGAYVDYYTQSYEQQNKGYNGHDFRDSLKSTHDHVGHYVTDLLTDAAVKEIEDHGSKNSSQPLFLLLNHLAPHAANDDDPMQAPAEEVSRFEYISNKTHRYYAAMVSRLDKSVGSVIDALARQEMLQNSIILFLSDNGGPTQGQHSTTASNYPLRGQKNSPWEGALRSSAAIWSTEFERLGSVWKQQIYIGDLLP
>SGSH_dme 524aa Drosophila melanogaster
GKKHVGAANNFRFDFEQTEEQHSINQIGRNITRMKEYARQFLKQAKDEKKPFFLMVGFHDPHRCGHITPQFGEFCERWGSGEEGMGSIPDWKPIYYDWRNLDVPAWLPDTDVVRQELAAQYMTISRLDQGVGLMLKELEAAGVADQTLVIYTSDNGPPFPGGRTNLYEHGIRSPLIISSPNKEDRHHEATAAMVSLLDIYPSVMDALQIPRPNDTKIVGR
>IDS_dme 512aa Drosophila melanogaster
GKVFHPGLSSNNTDDYPLSWSAPAFRPRTEQFMNSPVCPDKEGILRKNLICPVELQTQPYKTLPDIESVAEALRFVGSRSRHSQEPFFLAMGFHKPHINFRFPRQFLSRFNLSQFYNYTEDSLKPPDMPAVAWNPYTDVRARDDFKHSNISFPYGPISPLQAAQIRQSYYASVSYVDDLFGKLIGGLDLDETVVVALGDHGWSLGEHAEWAKYSNFEVAL
>GNS_dme 492aa Drosophila C
GKYLNQYWGAGDVPKGWNHFYGLHGNSRYYNYTLRENSGNVHYESTYLTDLLRDRAADFLRNATQSSEPFFAMVAPPAAHEPFTPAPRHEGVFSHIEALRTPSFNQVKQDKHWLVRAARRLPNETINTIDTYFQKRWETLLAVDELVVTLMGVLNDTQSLENTY
>KIAA47/77_dme 1114aa Sulf1 Sulfated Drosophila melanogaster
GKYLNKYNGSYIPPGWREWGGLIMNSKYYNYSINLNGQKIKHGFDYAKDYYPDLIANDSIAFLRSSKQQNQRKPVLLTMSFPAPHGPEDSAPQYSHLFFNVTTHHTPSYDHAPNPDKQWILRVTEPMQPVHKRFTNLLMTKRLQTLQSVDVAVERVYNELKELGEL
>Sulf_ncr 639aa Neurospora crassa
GKLFNAHTVDNYDSPYIAGWNGSDFLLDPYTYSYLNATFQRNRDPPISYEGQYSVDVLAEKAYGFLDEAAKNVHNRPFFLGIAPIAPHSNVEPGFPSSSSSSSSSDSATLHRRPTNEHDDIEKSVSFTPPIPAARHAHLFPDVIVPRTPHFSRASGVSWIARLP
>Sulf_vca 649aa Volvox carteri
GKFLVDYSVSNYQNVPAGWTDIDALVTPYTFDYNNPGFSRNGATPNIYPGFYSTDVIADKAVAQIKTAVAAGKPFYAQISPIAPHTSTQIYFDPVANATKTFFYPPIPAPRHWELFSDATLPEGTSHKNLYEADVSDKPAWIRALPLAQQNNRTYLEEVYRLRLRSLASVDELIDRVVATLQEAGVLDNTYLIYSADNGYHVGTHRFGAGKVTAYDEDLRVPFL
>Sulf_cre 647aa Chlamydomonas reinhardtii P14217 ARS_CHLRE Arylsulfatase GNS 30%
GKFLVDYSVSNYQQVPRAGTISMPXVTPYTFDYNTRLQRNGATPNIYPGEYSTDVIRDKGVAQIKSAVAAGKPFYAQISPIAPHTSTQISTNPATGVTRSYFFPPIPAPPHWQLFSDANLPGGSXNKNLYEVDVSDKPAWIRALPLAQQNNRTYQEEIYRLRLR
>ARSB_spo 554aa S.pombe
GKWHLGLTPDRYPSKRGFKESFALLPGGGNHFAYEPGTRENPAVPFLPPLYTHNHDPVDHKSLKNFYSSNYFAEKLIDQLKNREKSQSFFAYLPFTAPHWPLQSPKEYINKYRGRYSEGPDVLRKNRLQAQKDLGLIPENVIPAPVDGMGTKSWDELTTEEKEFSARTMEVYAAMVELLDLNIGRVIDYLKTIGELDNTFVIFMSDNGAEGSVLEAIPVL
>KIAA1077_fru 19 exons 864 aa
GKYLNEYNGSYIPPGWREWVGLIKNSRFYNYTVCRNGYKEKHGGEYAKDYFTDLITNDSINFFRISKRMFPHRPVMMVISHAAPHGPEDSAPQYADHFPNASQHITPSYNYAPNMDKHWIMQYTGPMRPIHMEFTNFLHRKRLQTLMSVDDSVQK
>KIAA1247a_fru 20 exons 869 aa
GKYLNEYNGSYVPPGWKEWVALVKNSRFYNYTLCRNGVREKHSSDYPKDYLTDIITNESINYFRTSKRTYPNRPVMMVLSHVAPHGPEDSAPQYSSAFPNASQHITPSYNYAPNPDKHWILRYTGAMKPVHMQFTNMLQRRRMQTLLSVDDSVEK
>KIAA1247b_fru 22 exons 858 aa
GKYLNEYNGSYVPPGWKEWLGLVKNSRFYNYTLSRNGFREKHGAEYPQDYLTDLITAESMRYFRYSKRVYPHRPVLMVLSHAAPHGPEDSAPQYSTAFQNASQhiTPSYNYAPNLDKHWIMRYIGPMKPIHMEFTNVLQRKRLQTLLSVDDSVEK
>GNS_fru 543 aa (72% GNS_hsa) Scaffold_157:8104384785 SINFRUP00000081852
GKYLNQYGHAQAGGVEHIPPGWSFWVGLEKNSKYYNYTLSVNGKAQKHGSDYSKDYLTDVLANMSLEFLQYKSSYQPFFMMVSTPAPHSPWTAAPQYQNSFNGTKAPRDPNFNVHGKDKHWLIRQAKTPMSNSSVQFLDDAFRK
>GNSlike_fru 14 exons 568 aa
GKYLNQYGQKDAGHVGHIPPGWDHWHALVGNSQYYNYSLSVNGKEEKHGDNYGDDYLTDLITNRSLTFLDNRSPQLPFFLLLSPPAPHAPWTAAPQHQKDYADIKAPRDGSFDKPGKDKHWLLRQPANPMTASSLTYLDNAYRKR
>GALNS_fru 14 exons 520 aa (69% GALNS_hsa)
GKWHLGHRPQYLPLEHGFDEWFGAPNCHFGPYNNSVRPNIPVYRNSWMLGRYYEEFKIDKKTGESNLTQMYLLEGLDFIQSQAEAQKPFFLYWAPDATHAPVYASKDFLGKSQRGR
>KIAA1001_fru bp 11 exons 527 aa
GKWHLGHNGPYRPNRRGFDYYYGVPYSNDMGCTDVPGYNLPQCPPCDPPSGPsrSRHDGCYSKVALPLIENTTIVQQPLNLWRLTEQYKSAATRIIQNARAQGQPYFLYIALAHMHVPLAPPVGASATNDNKVYAASLQEMDDLVGAIKRISDETDRDNTLIWFT 1
>ARSA_fru 8 exons 501 aa
GKWHLGIGANGTFLPTRQGFDQYLGIPySHEMGPCQNLTCFPPDVKCFGLCDVGTVTVPLMYNEVIKQQPVNFLDLENAYRDFASDFISTSAKKRQPFFLYFPSHHTHYPQFAGPGAAGMSLRGPFGDALLELDNTIGSLLETLEGTGVLNNTLILFTSDNGPELMRMSRGGNAGPLRCGKSTTYEGGMREPAIAYWRGIIQP
>IDS_fru 12 exons 593 aa
GKVFHPGIASNHTDDYPYSWSIPPYHPASLHFEKQKMCKGDDGQLHANLLCAVNVTEQPGGTLPDLESTEEAIGLLKGRVQNTQPFFLAVGFHKPHIPFRIPQEYLSLYPIEKMTLAPDPIVPELLPPVAYN
>SGSH_fru 8 exons 499 aa
GKKHVGPGSVYPFDFAYTEENGSVLQVGRNITRIKLLVRKFFQAHKEDKVNSQEEERPFFLYVAFHDTHRCGHSQPQYGAFCEKFGNGEKGMGRIPDWKPVYYTPDQVKVPPFAPDTPVTRADLAAQYTTVSRLDQGIGLVLQELREAG
>SulfX_fru 8 exons 504 aa
WMDLLEVNGYLTKMMGKLDYTSGSHSvsNRVEAWTRDVQFLLRQEGRPVTQLVGNMSTVRIMGKDWENIDKATQWIQQRAESSQQPFALYLGLNLPHPYKTESLGPTAGGSTFRTSPHWLEKVSSEHVTVPKWLPGAAMHPVDFYSTFTKNCSGFFTEEEIMNIRAFYYAMCAEADAML
>SulfX_hsa 527aa
WMDVMERHGYRTQKFGKLDYTSGHHSISNRVEAWTRDVAFLLRQEGRPMVNLIRNRTKVRVMERDWQNTDKAVNWLRKEAINYTEPFVIYLGLNLPHPYPSPSSGENFGSSTFHTSLYWLEKVSHDAIKIPKWSPLSEMHPVDYYSSYTKNCTGRFTKKEIKNIRAFYYAMCAETDAMLGEIILALHQLDLLQKTIVIYSSDHGELAMEHRQFYKMSMYEASAH
>SulfX_mmu 556 aa mouse 82%
WMDIMEKHGYQTQKFGKVDYTSGHHSISNRVEAWTRDVAFLLRQEGRPIINLIPDKNRRRVMTKDWQNTDKAIEWLRQVNYTKPFVLYLGLNLPHPYPSPSSGENFGSSTFHTSLYWLEKVAYDAIKIPKWLTLSQMHPVDFYSSYTKNCTGKFTENEIKNIRAFYYAMCAETDAMLGEIILALHKLDLLQKTIVIYTSDHGEMAMEHRQFYKMSMYEASVHVP
>SulfY_fru 2 exons 560 aa
GKWHLGFYKRGCLPTQRGFDTFFGSLLGSGDHYSHYKCEAPGMCGYDLYEGEEAAWEQDRGLYSTVMFTQKAISILAKHDPHRKPLFLYLAYQAVHSPLQVPSRYLERYKGISNVHRRKYAAMVSCLDEAIRNLTLALKRYGYYDNTVLVYSSDNGGQPLLGGSNWPLRGSKASYWEGGIRAVGFVHSPLLRNKGTKCRSLIHITDWFPTLVSLGEGTLE
>sulfZ_fru 2 exons 579 aa
GKWHLGFYKKECLPTRRGFDTYFGSLTGSVNYYTYDSCDGPGMCGFDLHEGESVAWSQKGKYSTHLYTQRVRKILATHDPRSQPLFIFLSFQAVHTPLQCPREYIYPYRGLENIARRKYAAMVSAVDEAVRNITYGLRKYGYYENSIMIFSTDNGGQPLSGGSNWPLRGRKGTYWEGGVRGLGFIHSPLLRKKKRVSKALVHITDWYPTLVGLAGGKESH
>STS_fru revisited 75 aa genomic frameshift, one 2-2 exon
GKWHLGLNCESRDDHCHHPNAHGFNYFFGIPLTNLRDCQPGHGTVFQIHKYLPYRTLGTVLASTVLLYFSGLISSAEKGPFAFWLQRFWSCSFIVGFIMIIPLFNCVLMKDHSIVEQPFVSENLTQRMTREAVDFIER
>ARSD_fru revisited 80 aa split exon 2-1
GKWHLGVNCERRGDHCHHPNQHGFSYFYGLPFTLFNDCVPGEGSDILADLQKTLRSFTIFLGIGLATLVRLIVVFQASFYSLRLLVVLFFISILATVLWLTPFKFIPTWNCILMRNQEVVEQPMVVETLPRRLLTEAQQFIKR
>KIAA47/77_cii 14 exons 854aa
GKYLNEYNGSYIPQGWQYWMGLVRNSRYYNYSLRHNDVKESHRDNYRDDYFTDLIVNRSMTYFRRKKHEEPDSPILSVLSFPAPHGSEDGAPQYQHLYANVTSHITPSFDYGPNPDKHWIISSRKVPMDETQHRFSSILQQKRLQTLRSVDDAVDRFVSMLQDTGELDNTYLLYTSDHGFHIGQFGLAKGKSMPYDFDVRVPLFMRGPGIQAGLHVNEIILNID
>GNS_cii 11 exons 545aa
GKYLNQYGGKSVGGPQHVPVGWNQWFGLVGNSKYYNYTISDNGVPVQHGANYHEDYLTDLLANRSVDFIHNHKMRYTQPFFMMISTPAPHSPWDSAPQYSKMYENNTAPHTPSYNTKAVNKHWLVRQATHPMTKESMDYSDNAFRSRWRALKSVDDLVERVINALSKMKQLDNTYVFFSSDNGY
>KIAA1001_cii 8 exons 500 aa
GKWHLGITKAYHPCSRGFNYYYGLPYSNDMXCVDCDAYNHPQCKKCPKQSGITNDQAIECGNYDTALPLYENYDIIEQPANLVELGDRYVEKATLFIQQAKNKTQPFFLYVATAHTHVPLAYAKRFHNSTSHDTRYSDTLHELDDMIGRIMTSLKDNGLY
>GALNS_cii 14 exons 493aa
GKWHLGQQEQYLPLKHGFHEWFGSPNCHFGPYDDKTTPNIPVYNNTEMVGRYYEEFAIESHKYLSNMTQYYIQEALDFIERMERNEKPFFLYWAPDATHSPVYSSPMFRGASRRGPYGDAVMELDYGVGVIIQKLKQLGLDKNTLVLFSSDNGAAMIGSA
>SGSH_cii 2 exons 507aa ????
GKKHVAPEAVYPFDFAETEENNSILQVGRNITRMKELAKQFFSMQLKNESFLLYIGFHDPHRCGHTHPQYGEFCEKFGNGDYRMGKIPDWKPDYYSPDDVIVPPFVQDTPASRKDISAQYTTISRLDQGVGLIINELKQAGFLESTLILFTSDNGIPFPNGRTNLYNSGTAGPFILALPVQKHKQAVV
>SulfZYB_cii 10 exons 512aa
GKWHLGFSSSKYAPWNRGFHGFYGFLAGSENYWSKWLPMARHSNIGGVDFTDSTTGPTNETWGQYSAHVYASRARYVIQHHDQSKPLFLYLPLQTPHTPLGAPSHYYEPFKDIEDDDRMKYLSMVSVLDETVRNVTNYLKDAGMWEDTLLIFSTDNGGEV
>sulfZ/Y_cii 1 exon 517aa
GKWHLGFYKDEYLPWKRGFNSYFGYLTGGEDYYTKWRCDGKLCGYDMTSEKGPTNATYGQYSANLFANKANEAIDKHDKTKPLFLYVAFQSVHSPMEVPESYAKPFDYIKNHNRKMYGGMVAAMDEAVKNITEHLQAAGLWDNTILVFSADNGGQTLSGGNNWPLRGRKLTLWEGGIKGVGFVHGKILNVPNPNYIVNNEMIHISDWFPTIMEATQCPYV
>sulfY/Z_cii 10 exons 562aa
GKWHLGFFREEYLPWNRGFQNFFGFLNGGVNHFTRYHCEPKKTRRFCGYDMIDSRYGPTNATYGEYSTNLFIRKSKEMIDKHNKQKPMFLYLSLQAVHGPLQVPNQYLKRFKHIRDKNRRIYAGMVYAMDRGIRQLVKHLKRARMWKNTIFIFSTDNGGQTTRGGNNWPLRGKKGTLWEGGIRGVGFVHGKPLQVTTPRVNKELLHVSDWYPTIMSATHC
>ARSB2_cii 1 exons 522aa
GKWHLGFYKKECLPTSRGFDTFYGYYCGAEDYYTKQVHANFHFGNKTRRVSGFDFHDNSRTEWEANGTYSSYLYRDRAVRIIKSHNSSIPLFMYLPFQSVHFPLQVPAKYIKRYRHIKDRKRRTFSAMVTAMDEAIGSVVDALKWKGIWQDTLVVFTTDNGGQTLFGGNNWPLRGRKASLWEGGVRGVGLVRGYGIRDKGRSSNELVHISDWFPTLLYIA
>ARSB_cii 12 exons 492aa
GKWHVGYCDEAYTPTRRGFDSHYGFYNSGISYSNYSSTEGTDVGYDYRDDLALNLAAEGKYTTTDFTDQAKTLIDNHDQTNPMFLYMAYNAPHTPFEVEESYRDIYDGNLRDGNRKTYLGMISALDEQVGQLVDKLKEVGMWSNTVFVFYSDNGGTQPQSGQSGNNFPLRGKKGSLFEGGYRLIARTRAGNLELIASTSSTLFHISDMFATFIALAGGDA
>IDS_cii 11 exons 504aa
GKVFHPGICSNYNDDFPLSWSLPACHPPTQKYKMKQVCPGPDGKLHMNLLCPVNVSTQPEHSLPDIQSAGHAVEMIRKFSNNKSQPFFLAVGFHKPHIPYKFPEQYLDLYPISEIDLAPNPFIPKELPPVAFSPYTQMRIREDVKSLNLSFPFGPIPYDFQRKIRQHYYSSVTYMDSMVGKVLQQLEQSG
>STS_cii 21 aa 47% ARSE/D/STS
GKWHLGINELKQNDGRHLPKHHGFDFVGTNLPFTFHLFCSPSEYPVDKMKIKCFLSNKDEIIEQPIIPEKLTDKIVEGAKQFIT
>STS2_cii 22 aa 34% STS/E/F/D
GKWHLGINRNTSTDGYHLPHNHGFDFVGTNLPLSHSEMCNPAEFTVEELSTMCFLYNGSTIVEQPVNLSTLTDRITSDAKNFIS
new sts sequences:
>ARSD_gga chicken Gallus gallus, no STS_gga available
SLPGRQRSCLSIWLILCLFLRSCVSSPPKPNFLLILADDLGIGDVGCYGNDTIRTPHIDG
LAKEGVRLTQHIAAAAVCTPSRAAFLTGRYPIRSGMASSTERRILFWNGCSGGLPPNETT
FARVLHQQGYSTALVRNKHRPFLLFLSLLHVHTPLITTKEFLGRSRHGLYGDNVEEMDWM
VGRLLDVIDKEGLKNTTFIYFASDHGGSVEAHRGNVRLGGWNGIYKGKILVACTYIKTVY
YKESTE
>STS_hsa 583aa Xp22.31 steroid sulfatase ichthyosis PMID:10607842::123::45
RPNIILVMADDLGIGDPGCYGNKTIRTPNIDRLASGGVKLTQHLAASPLCTPSRAAFMTGRYPVRSGMASWSRTGVFLFTASSGGLPTDEITFAKLLKDQGYSTALIGKWHLGMSCHSKTDFCHHPLHHGFNYFYGISLTNLRDCKPGEGSVFTTGFKRLVFLPLQIVGVTLLTLAALNCLGLLHVPLGVFFSLLFLAALILTLFLGFLHYFRPLNCFMMRNYEIIQQPMSYDNLTQRLTVEAAQFIQRNTETPFLLVLSYLHVHTALFSSKDFAGKSQHGVYGDAVEEMDWSVGQILNLLDELRLANDTLIYFTSDQGAHVEEVSSKGEIHGGSNGIYKGGKANNWEGGIRVPGILRWPRVIQAGQKIDEPTSNMDIFPTVAKLAGAPLPEDRIIDGRDLMPLLEG ::12
KSQRSDHEFLFHYCNAYLNAVRWHPQNSTSIWKAFFFTPNFNPVGSNGCFATHVCFCFGSYVTHHDPPLLFDISKDPRERNPLTPASEPRFYEILKVMQEAADRHTQTLPEVPDQFSWNNFLWKPWLQLCCPSTGLSCQCDREKQDKRLSR ::45
>ARSD_hsa_ins
TLTNDCDPGRPPEVDAALRAQLWGYTQFLALGILTLAAGQTCGFFSVSARAVTGMAGVGCLFFISWYSSFGFVRRWNCI
>ARSE_hsa_ins
SLMGDCARWELSEKRVNLEQKLNFLFQVLALVALTLVAGKLTHLIPVSWMPVIWSALSAVLLLASSYFVGALIVHADCF
>ARSF_hsa_ins
TLVDSCWPDPSRNTELAFESQLWLCVQLVAIAILTLTFGKLSGWVSVPWLLIFSMILFIFLLGYAWFSSHTSPLYWDCL
>STS_hsa_ins
TNLRDCKPGEGSVFTTGFKRLVFLPLQIVGVTLLTLAALNCLGLLHVPLGVFFSLLFLAALILTLFLGFLHYFRPLNCF
>STS_mmu_ins
TNLRDCRPGAGTVFGPALRVFAAGPLAALGASLAAMAAARWAGLARVPGWALAGTAAAMLAVGGPRSASCLGFRPANCF
>STS_rra_ins
TNLRDCKPGGGTVFGSAQQVFVVLPMNILGAVLLAMALARWAGLARPPGWVFGVTVAAMAAVGGAYVAFLYHFRPANCF
-=-=-=-= reference set including exonic flanking =-=-=-=-
>STS_hsa 583 aa 10 exons 79 aa
GKWHLGMSCHSKTDFCHHPLHHGFNYFYGISLTNLRDCKPGEGSVFTTGFKRLVFLPLQIVGVTLLTLAALNCLGLLHVPLGVFFSLLFLAALILTLFLGFLHYFRPLNCFMMRNYEIIQQPMSYDNLTQRLTVEAAQFIQR
>ARSD_hsa 593 aa 10 exons 79 aa
GKWHQGVNCASRGDHCHHPLNHGFDYFYGMPFTLTNDCDPGRPPEVDAALRAQLWGYTQFLALGILTLAAGQTCGFFSVSARAVTGMAGVGCLFFISWYSSFGFVRRWNCILMRNHDVTEQPMVLEKTASLMLKEAVSYIEr
>ARSE_hsa 589 aa 10 exons 79 aa
GKWHLGLNCESASDHCHHPLHHGFEHFYGMPFSLMGDCARWELSEKRVNLEQKLNFLFQVLALVALTLVAGKLTHLIPVSWMPVIWSALSAVLLLASSYFVGALIVHADCFLMRNHTITEQPMCFQRTTPLILQEVASFLKR
>ARSF_hsa 590 aa 10 exons 79 aa
GKWHQGLNCDSRSDQCHHPYNYGFDYYYGMPFTLVDSCWPDPSRNTELAFESQLWLCVQLVAIAILTLTFGKLSGWVSVPWLLIFSMILFIFLLGYAWFSSHTSPLYWDCLLMRGHEITEQPMKAERAGSIMVKEAISFLER
>ARSG_hsa 699 aa 11 exons 79 aa
GKWHLGLSCASRNDHCYHPLNHGFHYFYGVPFGLLSDCQASKTPELHRWLRIKLWISTVALALVPFLLLIPKFARWFSVPWKVIFVFALLAFLFFTSWYSSYGFTRRWNCILMRNHEIIQQPMKEEKVASLMLKEALAFIER
>STS_mmu 624aa mouse bad RefSeq chr? 79 aa
GKWHLGLSCRGATDFCHHPLRHGFDRFLGVPTTNLRDCRPGAGTVFGPALRVFAAGPLAALGASLAAMAAARWAGLARVPGWALAGTAAAMLAVGGPRSASCLGFRPANCFLMDDLAVAQRPTDYGGLTRRLADEAALFLRR
>STS_rno 578aa rat chrX:87654459-87661379 79 aa
GKWHLGLSCQAASDFCHHPGRHGFDRFLGTPTTNLRDCKPGGGTVFGSAQQVFVVLPMNILGAVLLAMALARWAGLARPPGWVFGVTVAAMAAVGGAYVAFLYHFRPANCFLMADFTITQQPTDYKGLTQRLASEAGDFLRR
>ARSD_ssc pig Sus scrofa 73% nearly complete 79 aa
PSTLGSECHPGWPPQVGEALGGRLWLSTQMMALGVLTGAAGKTLGLVSVPWKFVWGAASLVLLFFGSWFASLGVLRRWNCILMRNHDVVEQPMALESTARLLSGEALSFIQR
>ARSE_ssc pig Sus scrofa 73% partial 59 aa
GKWHLGLNCESSEDHCHHPLNHGFDLFYGMPFSMMGDCLPSDISEKRVILERQVNVCCHIVALAALTLALGKLTRLTPGSWTPVVCSALAA
>ARSE_bta cow Bos taurus 71% complete 80 aa
gKWHLGLSCASPDDHCHHPLNHGFDHFYGMPFSMMADCERWELSEKRAVLESRLDVCFQLVALATLTLTIGKLTHLIPGASWTLVIWSAVVCLLLFATSCLVGALIMHADCFLMRNHSIAEQPMRSQRTTPLMLQEVSSFVKR
>STS_bta cow Bos taurus 82% partial 3 aa
GKWHLGISCHDPGDFCHHPTSHGFDYFHGLPLTNM
>ARSD_gga chicken Gallus gallus 60% nearly complete 79 aa
GVNCKSHRDHCHHPLNHGFEYFYGMSFTILNECQGTDDPELAKSSQDTYWLYTQIIFIAVLTLFvGKLTHLFSVKWKIIVCVTIFGLLYFLSWFSSYGFTKYWNCIMMRNHEITEQPMNLDKTTSNMLKEAVSFIER
>ARSD_xla_frag Xenopus laevis CA793586 59 aa
GKWHLGVNCRSRDDFCHHPLNHGFDYYYGLLYTLINDCQASMPSEIHVAFRAQLLFYAQLFAVTLLTAMVTKPNGILLQVSWKSSWPIFCPPSGE
>STS_xla_frag Xenopus laevis 63 aa
ptrpptrpaGAAKYIKATFQISFLALFTLVLISYSGLLNVPWKLIFYIVSVTSLLLGAVIFFFWNFQYLNCVLMRNDKIIQQPLVFDNLTQRITREALQYIKS
>ARSD_str_frag Silurana tropicalis 59 aa
GKWHLGLNCASRDDFCHHPNSHGFNYFYGMPFSLYSGCKPGSIPESPNSPKQQLSFVTQIIGFGVLTLTALKYSKILAINGKFLVSCAVFD
>STS_fru revisited 75 aa genomic frameshift, one 2-2 exon
GKWHLGLNCESRDDHCHHPNAHGFNYFFGIPLTNLRDCQPGHGTVFQIHKYLPYRTLGTVLASTVLLYFSGLISSAEKGPFAFWLQRFWSCSFIVGFIMIIPLFNCVLMKDHSIVEQPFVSENLTQRMTREAVDFIER
>ARSD_fru revisited 80 aa split exon 2-1 Actinopterygii
GKWHLGVNCERRGDHCHHPNQHGFSYFYGLPFTLFNDCVPGEGSDILADLQKTLRSFTIFLGIGLATLVRLIVVFQASFYSLRLLVVLFFISILATVLWLTPFKFIPTWNCILMRNQEVVEQPMVVETLPRRLLTEAQQFIKR
>STS_dre_frag zebrafish Danio rerio 76 aa BQ258682
GKWHLGLNCEDSSDHCHHPNSHGFHYFYGTIMTHLRDCQPGHGSVILYNVYSHIPFKPLSIGLVSLVLHIRGMLTVSRRVFFSFLILVGLVLSLFRLLVYTFPNLNCFVMRGTEIVEQPYISENLTQRMTSEAIEFLER
>ARSD_dre_frag zebrafish Danio rerio 5 aa CA377054
GKWHLGVNCELRGDHCHHPNTHGFSYFYGLPFTLLND
>ARSD_omy_frag Oncorhynchus mykiss 40 aa CA386455
rvcgllevSVSVIVAVTCLSLLAFSVWFVPFELLMTWNCIIMRNQEVVEQPMDLDTLSQRLLGEAQGFIER
>STS_cii 21 aa 47% ARSE/D/STS
GKWHLGINELKQNDGRHLPKHHGFDFVGTNLPFTFHLFCSPSEYPVDKMKIKCFLSNKDEIIEQPIIPEKLTDKIVEGAKQFIT
>STS2_cii 22 aa 34% STS/E/F/D
GKWHLGINRNTSTDGYHLPHNHGFDFVGTNLPLSHSEMCNPAEFTVEELSTMCFLYNGSTIVEQPVNLSTLTDRITSDAKNFIS
>ARSE_hpu 551aa sea urchin Hemicentrotus pulcherrimus 22 aa
GKWHLGINENSSTDGAHLPFNHGFDFVGHNLPFTNSWSCDDTGLHKDFPDSQRCYLYVNATLVSQPYQHKGLTQLFTDDALGFIE
>ARSE_her 559aa sea urchin Heliocidaris erythrogramma 22 aa
GKWHLGINEQTSTDGAHLPFNHGFEYVGYNLPFTNSWNCDDTGLHVDFPNTEKCYLYKNATLVSQPYQHRNLTKLFTDDAIEFID
>ARSE_spu 567aa sea urchin Strongylocentrotus purpuratus 22 aa
GKWHLGINENSSSDGAHLPANRGFDFVGHNLPFGNSWRCDDTGLHQDFPDTNACFLYYNSTSVAQPFQHKGLTQLLRDDTVGFIE
>ARSE_cel 452aa Sulf1_C.elegans 23 aa
GKWHLGINENNATDGAHLPSKRGFEYVGVNLPFTNVWQCDTTREFYDKGPDPSLCFLYDGDDIVQQPMKFEHMTENLVGDWKRFLM
>ARSE_cbr 421 aa Sulf1_C.briggsae 23 aa
GKWHLGINENNATDGAHLPSKRGFDYVGVNLPFTNVWQCDTTKEYYDKGPDPSLCFLYDGDDIVQQPMKFEHMTENLVGDWKRFLM
=-=-=-=-=-=-=-=-
Summary: The insert likely arose only once, in the ancestor of the STS/ARSD sulfatases, and subsequently propagated through speciation and gene duplication. Since tunicate, sea urchin, and nematode sulfatases in the STS class all have short (21-23 aa) intervening sequences similar to non-insertional sulfatases [need what happens elsewhere between anchors], the most parsimonious scenario has
-=-=-=-= end of reference set including exonic flanking =-=-=-=-
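The insert lengths quoted in the reference-set headers (79 aa for STS_hsa, 21-23 aa for the tunicate, sea urchin, and nematode entries) can be re-checked by locating an insert inside its flanked sequence, or by measuring the residues between two anchors. The following is a minimal Python sketch, not the method used to build these notes. The function names `locate_insert` and `between_anchors` are hypothetical, and the choice of anchor strings is left to the user as an assumption.

```python
def locate_insert(flanked, insert):
    """Return (start, end, length) of `insert` inside `flanked`, or None.

    Case is ignored because the notes use lowercase for uncertain residues.
    Coordinates are 0-based and end-exclusive.
    """
    start = flanked.upper().find(insert.upper())
    if start < 0:
        return None
    return start, start + len(insert), len(insert)


def between_anchors(seq, left, right):
    """Return the residues between the first `left` anchor and the next
    `right` anchor, or None if either anchor is absent.
    The anchors themselves are user-supplied assumptions."""
    upper = seq.upper()
    i = upper.find(left.upper())
    if i < 0:
        return None
    i += len(left)
    j = upper.find(right.upper(), i)
    if j < 0:
        return None
    return seq[i:j]


# Usage with the STS_hsa flanked record and the STS_hsa_ins record from these notes.
sts_hsa = ("GKWHLGMSCHSKTDFCHHPLHHGFNYFYGISLTNLRDCKPGEGSVFTTGFKRLVFLPLQIVGVTLLTLAALN"
           "CLGLLHVPLGVFFSLLFLAALILTLFLGFLHYFRPLNCFMMRNYEIIQQPMSYDNLTQRLTVEAAQFIQR")
sts_hsa_ins = ("TNLRDCKPGEGSVFTTGFKRLVFLPLQIVGVTLLTLAALNCLGLLHVPLGVFFSLLFLAALILTLFLGFLHYF"
               "RPLNCF")
print(locate_insert(sts_hsa, sts_hsa_ins))
```

Running `between_anchors` over every record with the same pair of anchors would give the "what happens elsewhere between anchors" comparison that the summary still asks for, provided anchors conserved across all classes can be chosen.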
>STS_cii 21 aa according to ESTs BW300043 BW047068; exon breaks differently, after GKWHL
MLKFPLWLLLIILVNQVSSRPNFVLIFADDVGYGDFQSYGHPTQERGPIDDLAAEGMRFTQWYSAASLCTPSRAALLT
GRLPIRSGMVGPTRVLHQNDAGGLPKNETTLAEALKDLGYKTGMVGKWHL
GINELKQNDGRHLPKHHGFDFVGTNLPFTFHLFCSPSE
YPVDKMKIKCFLSNKDEIIEQPIIPEKLTDKIVEGAKQFITENQKNPFFLYLSLPQTHVAMFCKEEFCNKSMRG
SYGDNVNEMSWAVGEVVNQLKDLNLDRNTLVMFLSDHGPAVEFCYT
GGSTGGLKGGKASS
WDGGIKVPAVAWWPGTIQPGVKTQVVSTMDIFPTFLQLAGN
SDDQAYYMMLLDYDITQVMKETMETSMECQSPIFF
cNEGNNGNLDGMSISDLLLSNHDNEVHEILFHYCSDRLMAVRYGRYK
IHFHTQHLHVFNSNCIDGKALENIVDYFDCYANTTTHNPPLIFDINTDPEELFPLEAAPRAHIIEEVEKQVAKHQKTIKPVASQLGRHGKDLQPCCNPPSCVCNYPNPDKR
>STS2_cii 8 exons 483aa ci0100146549 best 34% STS_cii 43%
PNFIVIMADDIGYGDFQSFGHPTQEYGGVDRMVKEGMRFTQWTSAATLCSPSRAALLTG
RYAIRSGLRGDVAPVFQPQSV
GGLPRKEITIAESLKALGYRTGLVGKWHL
GINRNTSTDGYHLPHNHGFDFVGTNLPLSHSEMCNPAE
FTVEELSTMCFLYNGSTIVEQPVNLSTLTDRITSDAKNFISNNRLNSFFLYFSPPQAHRALFCAERFCGRSKRG
PYGDTINEMSSAISDILDHLVQLEIDDNTLVIFLSDHGPNSDKCPDGGVPGLFKGTGKGTT
WEGGLRVPAVAWWPGVI
PAGTVSNAVVSTLDVHPTLLKIAALQNQKPIPSKLFDGIPIPDLICSMKHQRTSSCLSTPSNR
ILFHYCGEDILAVRYG
DLKFHFKSNPPLQRRSNCVRTVTADLIRTFSCGKR
THDPPLVFNLLIDPSEEIPLNISHYSEELSEVQRLIRKHKRSIKEVPAQYSPNVPEVQPCCNPPSCICNY
>STS2_cii 8 exons 483aa 21 aa N
GKWHLGINRNTSTDGYHLPHNHGFDFVGTNLPLSHSEMCNPAEvcfmyyciaclinrvtclnkyeiknlfiyfqFTVEELSTMCFLYNGSTIVEQPVNLSTLTDRITSDAKNFIS
atagggatcaaccgaaatacatcaacagatggttatcatttaccacataatcatggcttt
I G I N R N T S T D G Y H L P H N H G F
gactttgttggcaccaaccttcctttatctcattcggagatgtgtaatccagcagaggtc
D F V G T N L P L S H S E M C N P A E V
tgttttatgtattattgtatcgcatgcttaataaatagggtaacctgcttaaataaatat
C F M Y Y C I A C L I N R V T C L N K Y
atatgtaaaagctttttttaaattaagaatttgtttatttattttcagtttactgtggag
I C K S F F - I K N L F I Y F Q F T V E
gaattgtccacaatgtgtttcctttacaacggttctactatagtggaacaacctgtcaat
E L S T M C F L Y N G S T I V E Q P V N
ttaagcacattaactgacagaataacaagtgatgcaaagaattttatatca
L S T L T D R I T S D A K N F I S
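The interleaved DNA and one-letter translation lines above can be regenerated programmatically. The following is a minimal Python sketch assuming the standard genetic code and the convention used in these notes of writing a stop codon as `-`. The names `CODON_TABLE` and `translate` are illustrative and do not come from any tool used to produce the notes.

```python
# Standard genetic code, codons ordered with bases t, c, a, g at each position.
# Stop codons are written "-" to match the translation lines in these notes.
BASES = "tcag"
AMINO = "FFLLSSSSYY--CC-WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, frame=0):
    """Translate `dna` starting at offset `frame` (0, 1, or 2).

    Codons containing characters other than t/c/a/g become "X";
    a trailing partial codon is ignored.
    """
    dna = dna.lower().replace("u", "t")
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# First DNA line of the STS2_cii block above.
first_line = "atagggatcaaccgaaatacatcaacagatggttatcatttaccacataatcatggcttt"
print(" ".join(translate(first_line)))
# prints: I G I N R N T S T D G Y H L P H N H G F
```

Translating the same stretch with `frame=1` and `frame=2` is a quick way to look for the reading frame that restores a conserved motif after an apparent frameshift.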
>ARSE_hpu 551aa sea urchin Hemicentrotus pulcherrimus 22 aa
GKWHLGINENSSTDGAHLPFNHGFDFVGHNLPFTNSWSCDDTGLHKDFPDSQRCYLYVNATLVSQPYQHKGLTQLFTDDALGFIE
>ARSE_her 559aa sea urchin Heliocidaris erythrogramma 22 aa
GKWHLGINEQTSTDGAHLPFNHGFEYVGYNLPFTNSWNCDDTGLHVDFPNTEKCYLYKNATLVSQPYQHRNLTKLFTDDAIEFID
>ARSE_spu 567aa sea urchin Strongylocentrotus purpuratus 22 aa
GKWHLGINENSSSDGAHLPANRGFDFVGHNLPFGNSWRCDDTGLHQDFPDTNACFLYYNSTSVAQPFQHKGLTQLLRDDTVGFIE
>ARSE_cel 452aa Sulf1_C.elegans 23 aa
GKWHLGINENNATDGAHLPSKRGFEYVGVNLPFTNVWQCDTTREFYDKGPDPSLCFLYDGDDIVQQPMKFEHMTENLVGDWKRFLM
>ARSE_cbr 421 aa Sulf1_C.briggsae 23 aa
GKWHLGINENNATDGAHLPSKRGFDYVGVNLPFTNVWQCDTTKEYYDKGPDPSLCFLYDGDDIVQQPMKFEHMTENLVGDWKRFLM
-=-==- working on above ==-=-==-
>STS_dre screwed up with fugu
0 mplr 2
1 kmkIPCTFCLLLYTADAGSGTKPNFVFMMVDDLGIGDLGCYGNTTLR 2
1 TPNIDRVALEGVKLTQHIVXAPLCTPSRAAFLTGRYPVRS 1
2 GMAAHGHMGVFLFSASSGGLPQEEITFAKAVKVQGYSTAVI 1
2 GKWHLGLNCESRDDHCHHPNAHGFNYFFGIPLTNLRDCQPGHGTVFQIHKYLPYRTLGTVLASTVLL 2
1 NSARPFLLFFSFLQVHTAMFASAAFRATSQHGIYGDAVHEVDWSV 1
2 GQIMQALDKFNLKDDTLVYLTSDQGGHVEEISATGVVQGGWNGIYK 1
2 AGKATNWEGGIRVPGILRWPGKIPGGRKIDEPTSNMDLFPTVVQLSGASVPLDR 2
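The exon lines above carry a digit before and after each peptide segment. Assuming these digits are splice-phase markers (the number of codon bases contributed on each side of an exon junction, so that the trailing digit of one exon and the leading digit of the next sum to 0 or 3), the consistency of a gene model can be checked mechanically. That interpretation is an assumption, not something the notes state, and the names `parse_exon_line` and `junction_mismatches` are hypothetical.

```python
import re

# Optional leading digit, peptide letters, optional trailing digit.
EXON_LINE = re.compile(r"^\s*(?:([012])\s+)?([A-Za-z]+)\s*([012])?\s*$")


def parse_exon_line(line):
    """Split an exon line into (leading_phase, peptide, trailing_phase).
    A missing digit is returned as None; an unparsable line returns None."""
    m = EXON_LINE.match(line)
    if m is None:
        return None
    lead, pep, trail = m.groups()
    return (int(lead) if lead is not None else None,
            pep,
            int(trail) if trail is not None else None)


def junction_mismatches(exons):
    """Return indices n where exon n and exon n+1 do not complete a codon,
    under the assumed rule (trailing + leading) % 3 == 0.
    Junctions with a missing digit are skipped."""
    bad = []
    for n in range(len(exons) - 1):
        trail, lead = exons[n][2], exons[n + 1][0]
        if trail is None or lead is None:
            continue
        if (trail + lead) % 3 != 0:
            bad.append(n)
    return bad


# Usage on the first lines of the STS_dre record above.
lines = [
    "0 mplr 2",
    "1 kmkIPCTFCLLLYTADAGSGTKPNFVFMMVDDLGIGDLGCYGNTTLR 2",
    "1 TPNIDRVALEGVKLTQHIVXAPLCTPSRAAFLTGRYPVRS 1",
    "2 GMAAHGHMGVFLFSASSGGLPQEEITFAKAVKVQGYSTAVI 1",
]
exons = [parse_exon_line(l) for l in lines]
print(junction_mismatches(exons))
print("".join(pep for _, pep, _ in exons))
```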
>STS_dre_frag zebrafish Danio rerio
MKSFQWIPCTFCLLLYTADAGSGTKPNFVFMMVDDLGIGDLGCYGNTTLRTPNIDRLALEGVKLTQHIAAAPLCTPSRAAFLTGRYPIRSGMAAHGHMGVFLFSASSGGLPQEEITFAKAVKVQGYSTAVIGKWHLGLNCEDSSDHCHHPNSHGFHYFYGTIMTHLRDCQPGHGSV
>ARSD_dre_frag zebrafish Danio rerio 5 aa
CLLALLLLLDTGSDVTASEEDRKPNFVLMMVDDLGIGDIGCYGNDTIRTPNIDRLAAEGVKLTQHIAAAPLCTPSRAAFHTGRYALRSGLGSTGRVQVLLFLGGSGGLPPTETTFAKRLQEQGYTTGLVGKWHLGVNCELRGDHCHHPNTHGFSYFYGLPFTLLND
>ARSD_omy_frag Oncorhynchus mykiss
RVCGLLEVSVSVIVAVTCLSLLAFSVWFVPFELLMTWNCIIMRNQEVVEQPMDLDTLSQRLLGEAQGFIERNADRPFLLFLSLAHVHTPLFSSPGFAGKSRHGLYGDNVEEVDYMIGRMTETVDRLGLANNTMMYFTSDHGGHIEDADERGQKGGWNGIYKGGKAMGGWEGGIRVPGIFRWPGRLPAGRVFDTPTSLMDLYPTLTHLAGATHSDRLLDGYDLMPVLEGRTERSQHEF
>STS_fru revisited 75 aa genomic frameshift, one 2-2 exon
GKWHLGLNCESRDDHCHHPNAHGFNYFFGIPLTNLRDCQPGHGTVFQIHKYLPYRTLGTVLASTVLLYFSGLISSAEKGPFAFWLQRFWSCSFIVGFIMIIPLFNCVLMKDHSIVEQPFVSENLTQRMTREAVDFIER
agctttacatcatcgtattttctgttgttcagggaaatggcaccttggactcaactgtgag
G K W H L G L N C E
agcagagatgatcactgccaccaccccaatgctcacggctttaactatttttttgggatc
S R D D H C H H P N A H G F N Y F F G I
ccgttgaccaacctccgggactgccagccaggacatggtacggtctttcagatccataag
P L T N L R D C Q P G H G T V F Q I H K
tacctaccgtacaggacgctaggcaccgtgttggcttctacagtcttactgtacttcagt
Y L P Y R T L G T V L A S T V L L Y F S
ggactcatcagttcagcagaaaaaggaccttttgccttctggctgcagcggttttggtcgt
G L I S S A E K G P F A F W L Q R F W S
gtacctaccgtacaggacgctaggcaccgtgttggcttctacagtcttactgtacttcag
V P T V Q D A R H R V G F Y S L T V L Q
tggactcatcagttcagcagaaaaaggaccttttgccttctggctgcagcggttttggtc
W T H Q F S R K R T F C L L A A A V L V
gtgagtttcatagtaggattcatcatgattatcccgttattcaactgtgttctcatgaag
V S F I V G F I M I I P L F N C V L M K
gaccacagcattgtggagcagccgtttgtatcagaaaatctgacccaaaggatgacgcgt
D H S I V E Q P F V S E N L T Q R M T R
gaggccgtggacttcatagaaaggtagagactgtgaaaggttgcctctgaactgctattg
E A V D F I E R
>STS_fru 9 exons 496 aa Scaffold_4788:7785-13384 gene missing (61% STS_hsa) insert missing
GKWHLGLNCESRDDHCHHPNAHGFNYFFGIPLTNLRDCQPGHGTVFQIHKYLPYRTLGTVLASTVLL 2
1 NSARPFLLFFSFLQVHTAMFASAAFRATSQHGIYGDAVHEVDWSV 1
2 GQIMQALDKFNLKDDTLVYLTSDQGGHVEEISATGVVQGGWNGIYK 1
2 AGKATNWEGGIRVPGILRWPGKIPGGRKIDEPTSNMDLFPTVVQLSGASVPLDR 2
>ARSD_fru 10 exons 500 aa Scaffold_771:84193-88005 (58% ARSD_hsa) 88500--84180 first exon unproven
2 GKWHLGVNCERRGDHCHHPNQHGFSYFYGLPFTLFNDCV 2
1 NVDRPFLLFFSMAHIHTPLFRNPAFSGKSLHGLYGDNIEEVDWMI 1
2 GKMTETVDSLGLANNTLMYFTSDHGGHLEDSNSRVGQQGGWNSIYR 1
2 GGKAMGGWEGGIRVPAIFRWPGRLAPGRVVHEPTSLMDLYPTLKYLAGDTQPDR 2
87153 2 GKWHLGVNCERRGDHCHHPNQHGFSYFYGLPFTLFNDCVPGEGS DILADLQKTLRSFTIFLGIGLATLvrlivvfqasfyrlag
2 00 2
hlshfqalvrrcGLLDISLRLLVVLFFISILATVLWLTPF KFIPTWNCILMRNQEVVEQPMVVETLPRRLLTEAQQFIKR 86231 gt phase 2
2 1
>ARSD_fru revisited 80 aa split exon 2-1
KWHLGVNCERRGDHCHHPNQHGFSYFYGLPFTLFNDCVPGEGSDILADLQKTLRSFTIFLGIGLATLVRLIVVFQASFYSLRLLVVLFFISILATVLWLTPFKFIPTWNCILMRNQEVVEQPMVVETLPRRLLTEAQQFIKR
2 gtaagtggcacttgggagtgaactgtgagcgcagaggggaccactgccaccacccaaaccagcacggcttcagctacttctatggcctccccttcaccctgttcaacgactgtgtgcctggggagggcagcgacatcctggcagacctgcagaaaacgctccgaagcttcaccatttttctgggcatcggactggcaacactggtacggctcatcgttgttttccaggcctctttttacag cctgcggttgctggtggtgcttttttttatcagtatcctggcaaccgttctgtggttgacgccttttaaattcatcccgacctggaactgcatcctcatgagaaaccaggaggtggtcgagcagccgatggtggtggagacgctgccccggagactgctgacggaagcccaacagtttattaaaag 2
>STS_tni 496 aa 9 exons 86% fugu CONTIG_8306_1 + CONTIG_22961_1
2 GKWHLGLSCESRDDHCHHPNAHGFNYFFGIPLTNLRDCQPGHGTVFQIHKYLPYRTLGIIFASTVLLS 2
1 NSAKPFLLFFSFLQVHTAIFASAAFRGTSQHGIYGDAVHEVDWSV 1
2 GQIMETLDRFNLGDHTLVYLTSDQGGHVEEISAAGVVQGGWNGIYK 1
>ARSD_tni 500 aa CONTIG_24273_2 first exon unproven 2 GC-AG
2 GKWHLGVNCERRGDHCHHPNQHGFSYFYGLPFTLFNDCV 2
1 NVDRPFLLFFSMVHVHTPSLQKPAFAGKSLHGLYGDNIEEVDWMI 1
2 GKMTETVDSLCLANNTLMYFTSDHGGHIEGSTSRASQQGGWNSFYK 1
>STS_dre 117aa zebrafish BG799792, to human 78% 5-116
FLLIMADDLGIGDLGCYGNRTLRTPRTPHIDRLALEGVKLTQHLAAAPLCTPSRAAFLTGRYPVRSGMASHGRLGVFLFSASSGGLPPNEVTFAKLLKGQGYTTGLV
GKWHLGLSCQ
=--=-=-=-=-=-=-=-=-=-==
>ARSd_gga chicken Gallus gallus ???
>ARSD_xla_frag Xenopus laevis
MALIPAVFFLLWASTASQAHGNKPNFVLLMADDLGIGEVGCYGNNTLRTPNIDRLAREGVKLTHHIAASSLCTPSRAAFLTGRYPIRSGMTGHDGGYLVLMWSAVSGGLPTNETTFAKILQEQGYTTGII
GKWHLGVNCRSRDDFCHHPLNHGFDYYYGLLYTLINDCQASMPSEIHVAFRAQLLFYAQLFAVTLLTAMVTKPNGILLQVSWKSSWPIFCPPSGE
>STS_xla_frag Xenopus laevis 63 aa
ptrpptrpaGAAKYIKATFQISFLALFTLVLISYSGLLNVPWKLIFYIVSVTSLLLGAVIFFFWNFQYLNCVLMRNDKIIQQPLVFDNLTQRITREALQYIKS
NKDTPFLLFVSYVQVHTALYASQDFIGKSNHGIYGDATEEVDWSVGELLNELDRSHLQNKTVVYFTSDNGAHLEEISSSGEVHGGC
>ARSD_str_frag Silurana tropicalis 59 aa
LYLRTNSQLLLQFSSHRTPNIDRLAKEGLKLTQHISAAPLCTPSRAAFMTGRYPIRSGMDFSNGFRVIVSAAVSAGLPSNETTFATILQQQGYSTGLIGKWHLGLNCASRDDFCHHPNSHGFNYFYGMPFSLYSGCKPGSIPESPNSPKQQLSFVTQIIGFGVLTLTALKYSKILAINGKFLVSCAVFD
>ARSD_gga chicken Gallus gallus 60% nearly complete
GVNCKSHRDHCHHPLNHGFEYFYGMSFTILNECQGTDDPELAKSSQDTYWLYTQIIFIAVLTLFvGKLTHLFSVKWKIIVCVTIFGLLYFLSWFSSYGFTKYWNCIMMRNHEITEQPMNLDKTTSNMLKEAVSFIERNKHRPFLLFLSLLHVHTPLITTKEFLGRSRHGLYGDNVEEMDWMVGRLLDVIDKEGLKNTTFIYFASDHGGSVEAHRGNVRLGGWNGIYKGKILVACTYIR
>ARSD_ssc pig Sus scrofa 73% nearly complete
...PSTLGSECHPGWPPQVGEALGGRLWLSTQMMALGVLTGAAGKTLGLVSVPWKFVWGAASLVLLFFGSWFASLGVLRRWNCILMRNHDVVEQPMALESTARLLSGEALSFIQRHKPGPFLLFVSLLHVHVPLMTTKEFQGKSQHGLYGDNVEEMDGLVGDILNAIEEHGLKNTTLTYFTSDHGGHLEAIDGHVQLGGWNGIYRGGKGMGGWEGGIRVPGIFRWPGVLPAGRVIQEPTSLMDVFPTVVQLGGGQVPQDRVIDGRSLVPLLQGETEHSAHEFLFHYCGEHLHAARWHDKDSGRLW
>ARSE_ssc pig Sus scrofa 73% partial
MLSFRSGLALTIGVLLGSKPSAYGDLSASRPNILLLMADDLGIGDLGCYGNHTIRTPNIDRLAADGVMLTQHLAAASLCTPSRAAFLTGRYPLRSGMVSSTGSRVLQWVAASGGLPPNETTFAKILKDKGYVTGLVGKWHLGLNCESSEDHCHHPLNHGFDLFYGMPFSMMGDCLPSDISEKRVILERQVNVCCHIVALAALTLALGKLTRLTPGSWTPVVCSALAA
>ARSE_bta cow Bos taurus 71% complete
gKWHLGLSCASPDDHCHHPLNHGFDHFYGMPFSMMADCERWELSEKRAVLESRLDVCFQLVALATLTLTIGKLTHLIPGASWTLVIWSAVVCLLLFATSCLVGALIMHADCFLMRNHSIAEQPMRSQRTTPLMLQEVSSFVKRHKQGPFLLFVSFLHVHTPLVTT
>STS_bta cow Bos taurus 82% partial
MAWDMMTLLLLLLFLCEAQSRAASKPNFVLLMADDLGIGDPGCYGNKTLRTPNIDRLARGGVKLTQHLAASPLCTPSRAAFMTGRYPVRSGMASQSQVGVFLFSASSGGLPPSEITFAKLLKDQGYSTALIGKWHLGISCHDPGDFCHHPTSHGFDYFHGLPLTNM
>M86934
MDGLLLDTERLYSVVFQEICNRYDKKYSWDVKSLVMGKKALEAAQIIIDV
LQLPMSKEELVEESQTKLKEVFPMAALMPGAEKLIIHLRKHGIPFALATS
SGSASFDMKTSRHKEFFSLFSHIVLGDDPEVQHGKPDPDIFLACAKRFSP
PPAMEKCLVFEDAPNGVEAALAAGMQAVMVPDGNLSRDLTTKATLVLNSL
QDFQPELFGLPSYE
>AF167081
MSPKPRASGPPAKATEAGKRKSSSQPSPSDPKKKTTKVAKKGKA VRRGRRGKKGAATKMAAVTAPEAESAPAAPGPSDQPSQELPQHELPPEEPVSEGTQHD PLSQEAELEEPLSQESEVEEPLSQESQVEEPLSQESEVEEPLSQESQVEEPLSQESEV EEPLSQESQVEEPLSQESEMEEPLSQESQVEEPPSQESEMEELPSV
=-=-=-=-=-=-=-
groomed seqs