GKYLNEYGAPDAGGLEHVPLGWSYWYALEKNSKYYNYTLSINGKARKHGENYSVDYLTDVLANVSLDFLDYKSNFEPFFMMIATPAPH
GKYLNEYGAPDAGGLEHIPLGWSYWYALEKNSKYYNYTLSINGKARKHGENYSVDYLTDVLANLSLDFLDYKSNSEPFFMMISTPAPH
GKYLNEYGAPDAGGLEHVPLGWSYWYALEKNSKYYNYTLSINGKARRHGENYSVDYLTDVLANLSLDFLDYKSNSEPFFMMISTPAPH
GKYLNEYGAPDAGGLGHVPLGWSYWYALEKNSKYYNYTLSINGKARKHGENYSVDYLTDVLANVSLDFLDYKSNSEPFFMMISTPAPH
GKYLNEYGAEDAGGVSHVPPGWSFWYALEKNSKYYNYTLSVNGKARRHGENYSVDYLTDVLANMSLDFLEYKSWNLFFIDGSQTPAPh
GKYLNQYGSEEAGGINHVPPGWSYWFALEKNSKYYNYTLSENGRPKTHGQNYSQDYLTDVLSNVSLDFLNYKSNHEPFFMMIATPAPH
GKYLNQYGSKDAGGVAHVPPGWDQWHALVGNSKYYNYTLSVNGKEEKHGDSYEKDYLTDLVLNRSLHFLEERSPSHPFFMMLCPPAPH
GKYLNQYGHAQAGGVEHIPPGWSFWVGLEKNSKYYNYTLSVNGKAQKHGSDYSKDYLTDVLANMSLEFLQYKSSYQPFFMMVSTPAPH
GKYLNQYGQKDAGHVGHIPPGWDHWHALVGNSQYYNYSLSVNGKEEKHGDNYGDDYLTDLITNRSLTFLDNRSPQLPFFLLLSPPAPH
GKYLNQYGGKSVGGPQHVPVGWNQWFGLVGNSKYYNYTISDNGVPVQHGANYHEDYLTDLLANRSVDFIHNHKMTQPFFMMISTPAPH
GKYLNQYWGADVPKGWNHFYGWNQWFGLHGNSRYYNYTLRENSGNVQHGAHYESTYLTDLLRDRAADFLRNATQSEPFFAMVAPPAAH
........................e.hhh.......eeeeee.............hhhhhhhhhh.h.h.........eeeeee....
........................eehhhh......eeeeee.............hhhhhhhhhh.hhhh........eeeee.....
........................e.hhh.......eeeeee.............hhhhhhhhhh.hhhh........eeeee.....
........................eehhh.......eeeeee.............hhhhhhhhhh.h.h.........eeeee.....
.....hh.................eeee........eeeeee.............hhhhhhhhhhhhhhhhh...eeeee........
........................eeee.......eeeee................hhhhhhh...hhhh........eeeeee....
........................hhhhhe.....eeeeeee...........hhhhhhhhhhhhhhhh.........eeeee.....
........................eeee.......eeeeeee..............hhhhhhhhhhhhhhhh......eeeee.....
.....hh...................hee......eeeeeee..............hhhhhh....eeee........eeeee.....
....e....................eeee......eeeee..............hhhhhhhhhh...hh.h.......eeeee.....
.......e..................e........eeeeee...............hhhhhhhhhhhhhhhh.......eeee.....

>GNS_hsa	
GKYLNEYGAP...DAGGLEHVPLGWSYWYALEKNSK..YYNYTLSI.NGKARKHGENYSVDYLTDVLANVSL.......................................DFLDYKSN...FEPFFMMIATPAPH
>GNS_mmu	
GKYLNEYGAP...DAGGLEHIPLGWSYWYALEKNSK..YYNYTLSI.NGKARKHGENYSVDYLTDVLANLSL.......................................DFLDYKSN...SEPFFMMISTPAPH
>GNS_rno	
GKYLNEYGAP...DAGGLEHVPLGWSYWYALEKNSK..YYNYTLSI.NGKARRHGENYSVDYLTDVLANLSL.......................................DFLDYKSN...SEPFFMMISTPAPH
>GNS_chi	
GKYLNEYGAP...DAGGLGHVPLGWSYWYALEKNSK..YYNYTLSI.NGKARKHGENYSVDYLTDVLANVSL.......................................DFLDYKSN...SEPFFMMISTPAPH
>GNS__gga	
GKYLNEYGAE...DAGGVSHVPPGWSFWYALEKNSK..YYNYTLSV.NGKARRHGENYSVDYLTDVLANMSL.......................................DFLEYKSW...NLFFIDGSQTPAPh
>GNS_xla	
GKYLNQYGSE...EAGGINHVPPGWSYWFALEKNSK..YYNYTLSE.NGRPKTHGQNYSQDYLTDVLSNVSL.......................................DFLNYKSN...HEPFFMMIATPAPH
>GNS_dre	
GKYLNQYGSK...DAGGVAHVPPGWDQWHALVGNSK..YYNYTLSV.NGKEEKHGDSYEKDYLTDLVLNRSL.......................................HFLEERSP...SHPFFMMLCPPAPH
>GNS1_fru	
GKYLNQYGHA...QAGGVEHIPPGWSFWVGLEKNSK..YYNYTLSV.NGKAQKHGSDYSKDYLTDVLANMSL.......................................EFLQYKSS...YQPFFMMVSTPAPH
>GNS2_fru	
GKYLNQYGQK...DAGHVGHIPPGWDHWHALVGNSQ..YYNYSLSV.NGKEEKHGDNYGDDYLTDLITNRSL.......................................TFLDNRSP...QLPFFLLLSPPAPH
>GNS_cii	
GKYLNQYGGK...SVGGPQHVPVGWNQWFGLVGNSK..YYNYTISD.NGVPVQHGANYHEDYLTDLLANRSV.......................................DFIHNHKM...TQPFFMMISTPAPH
>GNS_dme	
GKYLNQYWGA...DVPKGWNHFYGWNQWFGLHGNSR..YYNYTLRE.NSGNVQHGAHYESTYLTDLLRDRAA.......................................DFLRNATQ...SEPFFAMVAPPAAH

GKLFNAHTVDNYDSPYIAGWNGSDFLLDPYTYSYLNATFQRNRDPPISYEGQYSVDVLAEKAYGFLDEAAKHNRPFFLGIAPIAPH
GKLFNAHTVENYNSPYPAGWNGSDFLLDPYTYNYLNSSFQRNQDPPKSYEGFHSVDVLAEKSLGFVDEAVRADGPFFLGIAPVAPH
GKLFNAQTVDNYDSPHAAGWTGSDFLlDPYTYSYLNATFQRNKDAPVSHEGEYSTGVLAGKALGFLDDVVAEDKPFFLGIAPIAPH
GKFLVDYSVSNYQNVPaAGWTDIDALVTPYTFDYlNNPFSRNGATPNIYPGFYSTDVIADKAVAQIKTAVAAGKPFYAQISPIAPH
GKFLVDYSVSNYQQVPRAGwTISMPlVTPYTFDYlNnTLQRNGATPNIYPGEYSTDVIRDKGVAQIKSAVAAGKPFYAQISPIAPH
..ee...........eee....................................hhhhhhhhhhhhhhhhh....eeeee......
......................................................hhhhhhh....hhhhh......eeee......
....................................hhh..................hhhhhh..hhhh......eeeee......
..eeeeee................h..............................hhhhhhhhhhhhhhhhh.....eee......
..eeeee.............eee................................hhhhh..hhhhhhhhh......ee.......

>Sulf_ncr	
GKLFNAHT.....VDNYDSPYIAGWNGSDFLLDPYTYSYLNAT.FQRNRDPPISYEGQYSV..DVLAEKAY........................................GFLDEAAK..HNRPFFLGIAPIAPH
>Sulf_pan	
GKLFNAHT.....VENYNSPYPAGWNGSDFLLDPYTYNYLNSS.FQRNQDPPKSYEGFHSV..DVLAEKSL........................................GFVDEAVR..ADGPFFLGIAPVAPH
>Sulf_cgl	
GKLFNAQT.....VDNYDSPHAAGWTGSDFLlDPYTYSYLNAT.FQRNKDAPVSHEGEYST..GVLAGKAL........................................GFLDDVVA..EDKPFFLGIAPIAPH
>Sulf_vca	
GKFLVDYS.....VSNYQNVPaAGWTDIDALVTPYTFDYlNNP.FSRNGATPNIYPGFYST..DVIADKAV........................................AQIKTAVA..AGKPFYAQISPIAPH
>Sulf_cre	
GKFLVDYS.....VSNYQQVPRAG.TISMPlVTPYTFDYlNnT.LQRNGATPNIYPGEYST..DVIRDKGV........................................AQIKSAVA..AGKPFYAQISPIAPH

GKYLNEYNGSYIPQGWQYWMGLVRNSRYYNYSLRHNDVKESHRDNYRDDYFTDLIVNRSMTYFRRKKHEEPDSPILSVLSFPAPH
GKYLNEYNGSYIPAGWKYWMGLIKNSKYYNYAVNHNSQKELHGDDYAKDYLTDLVTNRSMEFFRDSKTERPEDPVLVAlsfpaph
GKYLNKYNGSYIPPGWREWGGLIMNSKYYNYSINLNGQKIKHGFDYAKDYYPDLIANDSIAFLRSSKQQNQRKPVLLTMSFPAPH
GKYLNEYDGSYIPPGWDEWHAIVKNSKFYNYTMNSNGEREKFGSEYEKDYFTDLVTNRSLSKFIDKIKIRAWQPFALIISYPAPH
GKYLNEYNGSYIPPGWREWLGLIKNSRFYNYTVCRNGIKEKHGFDYAKDYFTDLITNESINYFKMSKRMYPHRPVMMVISHAAPH
GKYLNEYNGSYIPPGWREWLGLIKNSRFYNYTVCRNGIKEKHGFDYAKDYFTDLITNESINYFKMSKRMYPHRPIMMVISHAAPH
GKYLNEYNGSYIPPGWREWLGLIKNSRFYNYTVCRNGIKEKHGFDYAKDYFTDLITNESINYFKMSKRMYPHRPVMMVISHAAPH
GKYLNEYNGSYIPPGWREWVGLVKNSRFYNYTISRNGNKEKHGFDYAKDYFTDLITNESINYFRMSKRIYPHRPIMMVISHAAPH
GKYLNEYNGSYIPPGWREWVGLVKNSRFYNYTISRNGNKEKHGFDYAKDYFTDLITNESINYFRMSKRIYPHRPIMMVISHAAPH
GKYLNEYNGSYIPPGWREWLGLVKNSRFYNYTMCRNGFKEKHGFEYEKDYFTDLITNDSISYFKSSKKMYPHRPIMMVISHAAPH
GKYLNEYNGSYIPPGWREWLGLVKNSRFYNYTMCRNGFKEKHGFEYEKDYFTDLITNDSISYFKLSKKLYPHRPIMMVISHAaph
GKYLNEYNGSYIPPGWREWVGLIKNSRFYNYTVCRNGYKEKHGGEYAKDYFTDLITNDSINFFRISKRMFPHRPVMMVISHAAPH
................hhhhhhh...........................hhhhhhhhhhhhhhhh........eeeee......
................hhhhhhh......eeee............hhhhhhhhhh...hhhhh...........eeeee......
.....................eee......eeee........................hhhhhhh.........eeeee......
.................hhhhhh....eeeeee..............h..hhh......hhhhhhhhhh......eeee......
................hhhhhhh....eeeeeee...........hhhhhhhhhh....hhhhhhhhh......eeeeee.....
................hhhhhhh....eeeeeee...........hhhhhhhhhh....hhhhhhhh.......eeeeee.....
................hhhhhhh....eeeeeee...........hhhhhhhhhh....hhhhhhhhh......eeeeee.....
.................h.eee.....eeeeeee...........hhhhhhhhhh......hhhh.h.......eeeeee.....
.................h.eee.....eeeeeee...........hhhhhhhhhh......hhhh.h.......eeeeee.....
................hhhhhhh....eeeeee..............h..hhhh.....hh..h..........eeeeee.....
................hhhhhhh....eeeeee..............h..hhhh.....hhh.hhhh.......eeeeee.....
................hhh.h......eeeeeee............hhhhhhhh.......hhhhhhh......eeeeee.....

>KIAA47/77_cii	
GKYLNEYNGS...YIPQGWQYWMGL......VRNSR..YYNYSLRH.NDVKESHRDNYRDDYFTDLIVNRSM......................................TYFRRK.KHEEPDSPILSVLSFPAPH
>KIAA1077__hro	
GKYLNEYNGS...YIPAGWKYWMGL......IKNSK..YYNYAVNH.NSQKELHGDDYAKDYLTDLVTNRSM......................................EFFRDS.KTERPEDPVLVAlsfpaph
>KIAA47/77_dme	
GKYLNKYNGS...YIPPGWREWGGL......IMNSK..YYNYSINL.NGQKIKHGFDYAKDYYPDLIANDSI......................................AFLRSS.KQQNQRKPVLLTMSFPAPH
>KIAA47/77_cel	
GKYLNEYDGS...YIPPGWDEWHAI......VKNSK..FYNYTMNS.NGEREKFGSEYEKDYFTDLVTNRSL......................................SKFIDK.IKIRAWQPFALIISYPAPH
>KIAA1077_hsa	
GKYLNEYNGS...YIPPGWREWLGL......IKNSR..FYNYTVCR.NGIKEKHGFDYAKDYFTDLITNESI......................................NYFKMS.KRMYPHRPVMMVISHAAPH
>KIAA1077_mmu	
GKYLNEYNGS...YIPPGWREWLGL......IKNSR..FYNYTVCR.NGIKEKHGFDYAKDYFTDLITNESI......................................NYFKMS.KRMYPHRPIMMVISHAAPH
>KIAA1077_rno	
GKYLNEYNGS...YIPPGWREWLGL......IKNSR..FYNYTVCR.NGIKEKHGFDYAKDYFTDLITNESI......................................NYFKMS.KRMYPHRPVMMVISHAAPH
>KIAA1077_cco	
GKYLNEYNGS...YIPPGWREWVGL......VKNSR..FYNYTISR.NGNKEKHGFDYAKDYFTDLITNESI......................................NYFRMS.KRIYPHRPIMMVISHAAPH
>KIAA1077__gga	
GKYLNEYNGS...YIPPGWREWVGL......VKNSR..FYNYTISR.NGNKEKHGFDYAKDYFTDLITNESI......................................NYFRMS.KRIYPHRPIMMVISHAAPH
>KIAA1077_xla	
GKYLNEYNGS...YIPPGWREWLGL......VKNSR..FYNYTMCR.NGFKEKHGFEYEKDYFTDLITNDSI......................................SYFKSS.KKMYPHRPIMMVISHAAPH
>KIAA1077__str	
GKYLNEYNGS...YIPPGWREWLGL......VKNSR..FYNYTMCR.NGFKEKHGFEYEKDYFTDLITNDSI......................................SYFKLS.KKLYPHRPIMMVISHAaph
>KIAA1077_fru	
GKYLNEYNGS...YIPPGWREWVGL......IKNSR..FYNYTVCR.NGYKEKHGGEYAKDYFTDLITNDSI......................................NFFRIS.KRMFPHRPVMMVISHAAPH

GKFLNNYDGSWVPPGWTKWAALVRNSRYYNYSLNKNGRNEWHGNRYENDYLTNLVANLSLQFIDESLLNPHGQPFLVVLSFPAPH
GKYLNEYNGSYVPPGWREWVALVKNSRFYNYTLCRNGInGwHGTQYPKDYLTnRITNDSINFLRMSKRMYPHRPVMMGLSHAAPH
GKYLNEYNGSYVPPGWKEWLGLVKNSRFYNYTLSRNGFREKHGAEYPQDYLTDLITAESMRYFRYSKRVYPHRPVLMVLSHAAPH
GKYLNEYNGSYVPPGWKEWVALVKNSRFYNYTLCRNGVREKHSSDYPKDYLTDIITNESINYFRTSKRTYPNRPVMMVLSHVAPH
GKYLNEYNGSYVPPGWKEWVGLLKNSRFYNYTLCRNGVKEKHGFDYSRDYLTDLITNDSITFFRISKKMYPHRPVLMVISHAAPH
GKYLNEYNGSYVPPGWKEWVGLLKNSRFYNYTLCRNGVKEKHGSDYSKDYLTDLITNDSVSFFRTSKKMYPHRPVLMVISHAAPH
GKYLNEYNGSYVPPGWKEWVGLLKNSRFYNYTLCRNGVKEKHGSDYSTDYLTDLITNDSVSFFRTSKKMYPHRPVLMVISHAAPH
...e.............hhhhhhh.....eee................hhhhhhhhhhhhhhhhhh........eeeeee.....
................hhhhhhh....e..eeee.........................hhhhhhhh.......eeee.......
................hhhhhhh....eeeeee...............hhhhhhhhhhhhhhhhh.........eeeeee.....
................hhhhhhh....e.eeee.................hhhee......ee...........eeeeeee....
................hhhhhhh......eeee................hhhhh......ee...hh.......eeeeee.....
................hhhhhhh......eeee................hhhhh......eeee..........eeeeee.....
................hhhhhhh......eeee................hhhhh......eeee..........eeeeee.....

>KIAA1247__hgl	
GKFLNNYDGS...WVPPGWTKWAAL......VRNSR..YYNYSLNK.NGRNEWHGNRYENDYLTNLVANLSL......................................QFIDES.LLNPHGQPFLVVLSFPAPH
>KIAA1247__dre	
GKYLNEYNGS...YVPPGWREWVAL......VKNSR..FYNYTLCR.NGInGwHGTQYPKDYLTnRITNDSI......................................NFLRMS.KRMYPHRPVMMGLSHAAPH
>KIAA1247b_fru	
GKYLNEYNGS...YVPPGWKEWLGL......VKNSR..FYNYTLSR.NGFREKHGAEYPQDYLTDLITAESM......................................RYFRYS.KRVYPHRPVLMVLSHAAPH
>KIAA1247a_fru	
GKYLNEYNGS...YVPPGWKEWVAL......VKNSR..FYNYTLCR.NGVREKHSSDYPKDYLTDIITNESI......................................NYFRTS.KRTYPNRPVMMVLSHVAPH
>KIAA1247__gga	
GKYLNEYNGS...YVPPGWKEWVGL......LKNSR..FYNYTLCR.NGVKEKHGFDYSRDYLTDLITNDSI......................................TFFRIS.KKMYPHRPVLMVISHAAPH
>KIAA1247_hsa	
GKYLNEYNGS...YVPPGWKEWVGL......LKNSR..FYNYTLCR.NGVKEKHGSDYSKDYLTDLITNDSV......................................SFFRTS.KKMYPHRPVLMVISHAAPH
>KIAA1247_mmu	
GKYLNEYNGS...YVPPGWKEWVGL......LKNSR..FYNYTLCR.NGVKEKHGSDYSTDYLTDLITNDSV......................................SFFRTS.KKMYPHRPVLMVISHAAPH

>ARSB_spo	
GKWHLGLTPDRY.PSKRGFKESFALLPGGGNHFA......YEPGTRE.....................................................................NPAVPFLPPLYTHNHDPVDH
>ARSB_dm1	
GKWHLGHWKLKYTPLYRGFSSHWGLDMRNGTQVA......YDLHGH................................YTT........................DVITDHSVKVIANHNATKGPLFLYVAHAACH
>ARSB_cal	
GKWHLGLKKPYW.PNKRGFNKSFTLLPGAGNH........YKYITRDSQGNQIPFLPAIYVEDDKELLQPEIELPDDFYST.........................NYFTDKAIEFIKETPQGKPFFGMITYTAPH
>SulfY_ptr
GKWHVGHSRWTQTPTFRGFQSFFGFYLGAQD.........YNTHIKQGERGNAYEMHWDARGKC.GRDCSRLVDERGNYST.......................HVFTREAIRVIENHPQRPHEPLFLYLAHQAVH
>SulfZYB_cii	
GKWHLGFSSSKYAPWNRGFHGFYGFLAGSEN.........YWSKWLPMARHSNIG.....GVDFTDSTTGPTNETWGQYSA......................HVYAS..RARYVIQHH.DQSKPLFLYLPLQTPH
>ARSB_cii	
GKWHVGYCDEAYTPTRRGFDSHYGFYNSGIS.........YSNYSSTEGTDV........GYDYR.DDLALNLAAEGKYTT......................TDFTD..QAKTLIDNH.DQTNPMFLYMAYNAPH
>ARSB_dm3	
GKWHLGFSRPEYTPTRRGFDYHFGYWGAYID.........YFQRRSKMPVANYSL.....GYDFRR.NMELECRDRGVYVT......................DLLT..AEAERLIKDHADKEQPLFLMLSHLAAH
>ARSB_dm4	
GKWHLGLSQRNFTPTERGFDRHLGYLGAYVD.........YYTQSYEQQNKGYN......GHDFR.DSLKSTHDHVGHYVT......................DLLTDAAVKEIEDHGSKNSSQPLFLLLNHLAPH
>ARSB_dm2	
GKWHLGFWRKDLTPTMRGFDHHFGYYNGYID.........YYDHQVRMLDRNYSA.....GLDFRRD.LEPCPEANGTYAT......................EAFTS...EAKRIIEQHDKSKPLFMVLSHLAVH
>ARSB2_cii	
GKWHLGFYKKECLPTSRGFDTFYGYYCGAED.........YYTKQVHANFHFGNKTRRVSGFDFHDN.SRTEWEANGTYSS......................YLYRD...RAVRIIKSHNSSIPLFMYLPFQSVH
>ARSB_hro
GKWHLGFYRKECLPTIRGFDTHYGYYCGNQD.........YYTK
>SulfY_ame	
GKWHLGYPP.AFGPLRSGYEEFFGPMSGGVD.........YFT..HCSSNGTHD............LYLGEEEKQQDGYLT.....................DLITDHALDYVQRMAEGAKDGKPFFLSLHYTAPH
>SulfY_ava
GKWHLGYSSLNYTPTHRGFDSFYGFYNGPID.........YYRGIMEQEGH.........KGLDFWNGTHTVPLEERIYST......................TRFRD..QAESIIANRNSS.KPLFLYLAHQGVH
>sulfZ/Y_hpo
GKWHLGFYKQEYLPWNRGFDTYFGYLNAAED.........YFNHNVPWRQV.........RYLDLRDNNGPVRNETGQYSA........................HLFTGKAIDVVQSHNTS.KPLFLYLAYQSVH

GKWHLGMYRKECLPTRRGFDTYFGYLLGSEDYYSHERCTLIDALNVTRCALDFRDGEEVATGYKNMYSTNIFTKRAIALITNHPPEKPLFLYLALQSVH
GKWHLGMYRKECLPTRRGFDTYFGYLLGSEDYYTHEACAPIESLNGTRCALDLRDGEEPAKEYTNIYSTNIFTKRATPVIATHPPEKPLFLYLAFQSVH
GKWHLGMYRKECLPTRRGFDTYFGYLLGSEDYYTHEACAPIECLNGTRCALDLRDGEEPAKEYTDIYSTNIFTKRATTLIANHPPEKPLFLYLAFQSVH
GKWHLGMYRKECLPTRRGFDTYFGYLLGSEDYYSHERCALIDSLNVTRCALDFRDGEQVATGYKNMYSTNIFTERATALITSHPPEKPLFLYLALQSVH
.....................eeeee............ehhhhh..hhhh.......hhh...........hhhhhhhh.........hhhhhhhhhh.
...ee................eeeee......................eeee.............ee...ee......eee.......hhee.hhhh..
...ee................eeeee......................eeee..............e......hh.eeee........hhee.hhhh..
...ee................eeeee.........hhhhe......hhh......................hhhhhhhh.........hh..hhhh...

>ARSB_hsaXR	
GKWHLGMYRKECLPTRRGFDTYFGYLLGSED.........YYSHERCTLIDALNVTRC..ALDFRDG.EEVATGYKNMYST......................NIFTK..RAIALITNHPP.EKPLFLYLALQSVH
>ARSB_mmu	
GKWHLGMYRKECLPTRRGFDTYFGYLLGSED.........YYTHEACAPIESLNGTRC..ALDLRDG.EEPAKEYTNIYST......................NIFTK..RATPVIATHPP.EKPLFLYLAFQSVH
>ARSB_rno	
GKWHLGMYRKECLPTRRGFDTYFGYLLGSED.........YYTHEACAPIECLNGTRC..ALDLRDG.EEPAKEYTDIYST......................NIFTK..RATTLIANHPP.EKPLFLYLAFQSVH
>ARSB_fca	
GKWHLGMYRKECLPTRRGFDTYFGYLLGSED.........YYSHERCALIDSLNVTRC..ALDFRDG.EQVATGYKNMYST......................NIFTE..RATALITSHPP.EKPLFLYLALQSVH

GKWHLGFYRKECMPTRRGFDTFFGSLLGSGDYYTHYKCDSPGMCGYDLYENDNAAWDYDNGIYSTQMYTQRVQQILASHNPTKPIFLYIAYQAVH
GKWHLGFYRKDCMPTKRGFDTFFGSLLGSGDYYTHYKCDSPGVCGYDLYENDNAAWDYDNGIYSTQMYTQRVQQILATHDPTKPLFLYiAYQAVH
GKWHLGFYRRECMPTQRGFDTFFGSLLGSGDYYTHFKCDSPGICGYDLYENDNAAWDHDNGIYSTQMYTQKVQQILASHNPRKPIFLYXAYQAVH
GKWHLGFYKRGCLPTQRGFDTFFGSLLGSGDHYSHYKCEAPGMCGYDLYEGEEAAWEQDRGLYSTVMFTQKAISILAKHDPRKPLFLYLAYQAVH
GKWHLGLFTSNFLPHNRGFDHWVGTVGAGDHRYHRQCFNSMaCAYDLREGTNKDGVYEDKTRYDQKTEINEFQKIVDKHNTTNPLFAYLSFHAVH
GKWHLGLYKKEYTPLYRGFDSYYGYLEGGEDYYTYYNCDTFHWCGYDLRDMNEPVTDMNGTYSTHLYTKKAIDIINGASTGKaPFLLYLAYQAVH
..eeeeee.............ee.........eeee............................hhhhhhhhhhhhh.......eeeee.hhhh.
...ee................ee.........eeee........e.e................ehhhhhhhhhhhhh.......eeee.hhhh..
....eeee.............ee............................hhhhhhhh....eeeeehhhhhhhhh.......hhh.hhhhh..
...eeeee.............ee.ee........hh.h.hhhhh.....................hhhhhhhhhhhh.......hh...e.hh..
.....ee................e.......eeeee............................eee...eeeee.........hhhhhhhhh..

>SulfY_hsa	
GKWHLGFYRKECMPTRRGFDTFFGSLLGSGD.........YYTHYKCDSPG.....MC..GYDLYENDNAAWDYDNGIYST......................QMYTQ..RVQQILASHNP.TKPIFLYIAYQAVH
>SulfY_mmu	
GKWHLGFYRKDCMPTKRGFDTFFGSLLGSGD.........YYTHYKCDSPG.....VC..GYDLYENDNAAWDYDNGIYST......................QMYTQ..RVQQILATHDP.TKPLFLYVAYQAVH
>SulfY_gga	
GKWHLGFYRRECMPTQRGFDTFFGSLLGSGD.........YYTHFKCDSPG.....IC..GYDLYENDNAAWDHDNGIYST......................QMYTQ..KVQQILASHNP.RKPIFLYXAYQAVH
>SulfY_fru	
GKWHLGFYKRGCLPTQRGFDTFFGSLLGSGD.........HYSHYKCEAPG.....MC..GYDLYEGEEAAWEQDRGLYST......................VMFTQ..KAISILAKHDPHRKPLFLYLAYQAVH
>SulfY_odi
GKWHLGLFTSNFLPHNRGFDHWVGTVQGAGD.........HRYHRQCFNSPIKG*.MC..AYDLREGTNKDGVYEDKTRYD..................LNGTQKTEILTNEFQKIVDKHNTTNPLFAYLSFHAVH
>sulfY/Z_hpo
GKWHLGLYKKEYTPLYRGFDSYYGYLEGGED.........YYTYYNCDTFHNR*..WC..GYDLRDMNE.PVTDMNGTYST......................HLYTK..KAIDIINGASTGGKPFLLYLAYQAVH

GKWHLGFYKKECLPTRRGFDTYFGSLTGSVNYYTYDSCDGPGMCGFDLHEGESVAWSQKGKYSTHLYTQRVRKILATHDPSQPLFIFLSFQAVH
GKWHLGFYRKECLPTRRGFDTFLGSLTGNVDYYTYDNCDGPGVCGFDLHEGENVAWGLSGQYSTMLYAQRASHILASHSPQRPLFLYVAFQAVH
GKWHLGFYRKECLPTRRGFDTFLGSLTGNVDYYTYDNCDGPGVCGFDLHEGESVACGLSGQYSTMLYAQRASHILARHNPQNPLFLYVAFQAVH
GKWHLGFYKDEYLPWKRGFNSYFGYLTGGEDYYTKWRCDGKgLCGYDMhTSEKGPTNATGQYSANLFANKANEAIDKHDKTKPLFLYVAFQSVH
GKWHLGFFREEYLPWNRGFQNFFGFLNGGVNHFTRYHCEPKgFCGYDMIDSRYGPTNATGEYSTNLFIRKSKEMIDKHNKQKPMFLYLSLQAVH
...eee..............eee......eeeeee........ee.......eeeee........ehhhhhhhhhh.......eeeeeee....
...eeee.............hhe......eeeeee.......eee........eee......hhhhhhhhhhh.hh........eeeehhhhh.
..eeeee.............hhe......eeeeee.......eeee......eeee......hhhhhhhhhhhhhh........eeee..hh..
...eeee..............eeeee......eeeee......ee...............hhhhhhhhhhhhhhhhh......eeeeeee....
.....................hh.........eee............ee...............hh.hhhhhhhhhh......eeeeeeh....

>sulfZ_fru
GKWHLGFYKKECLPTRRGFDTYFGSLTGSVN.........YYTYDSCDGPG......MC..GFDLHEGESVAWSQK.GKYST......................HLYTQ..RVRKILATHDPSQPLFIFLSFQAVH
>SulfZ_hsa	
GKWHLGFYRKECLPTRRGFDTFLGSLTGNVD.........YYTYDNCDGPG......VC..GFDLHEGENVAWGLS.GQYST......................MLYAQ..RASHILASHSPQRPLFLYVAFQAVH
>SulfZ_mmu	
GKWHLGFYRKECLPTRRGFDTFLGSLTGNVD.........YYTYDNCDGPG......VC..GFDLHEGESVACGLS.GQYST......................MLYAQ..RASHILARHNPQNPLFLYVAFQAVH
>sulfZ/Y_cii	
GKWHLGFYKDEYLPWKRGFNSYFGYLTGGED.........YYTKWRCDGKg......LC..GYDM.TSEKGPTNAT.GQYSA......................NLFAN..KANEAIDKHDKTKPLFLYVAFQSVH
>sulfY/Z_cii	
GKWHLGFFREEYLPWNRGFQNFFGFLNGGVN.........HFTRYHCEPKg......FC..GYDMIDSRYGPTNAT.GEYST......................NLFIR..KSKEMIDKHNKQKPMFLYLSLQAVH

1AUK
GKWHLGVGPEGAFLPPHQGFHRFLGIPYSHDQGPCQNLTCFPPATPCDGGCDQGLVPIPLLANLSVEAQPPWLPGLEARYMAFAHDLMADAQRQDRPFFLYYASHHTH
e   sb  gggtttggggt seeeeess ttssb ttsbsbtttee ss bs ss    eeetteesees  hhhhthhhhhhhhhhhhhhhhtt  eeeeee  tts
e....................eeeee..................ee.............eee..ee.ee...hhhh.hhhhhhhhhhhhhhhh....eeeeee.....

GKWHLGVGPEGAFLPPHQGFHRFLGIPYSHDQGPCQNLTCFPPATPCDGGCDQGLVPIPLLANLSVEAQPPWLPGLEARYMAFAHDLMADAQRQDRPFFLYYASHHTH
GKWHLGVGPEGAFLPPHQGFHRFLGIPYSHDQGPCQNLTCFPPDIPCKGGCDQGLVPIPLLANLTVEAQPPWLPGLEARYVSFSRDLMADAQRQGRPFFLYYASHHTH
GKWHLGVGPEGAFLPPHHGFHRFLGIPYSHDQGPCQNLTCFPPATPCEGICDQGLVPIPLLANLSVEAQPPWLPGLEARYVAFARDLMTDAQHQGRPFFLYYASHHTH
GKWHLGVGPEGAFLPPHQGFHRFLGIPYSHDQGPCQNLTCFPPSTPCDGSCDQGLVPVPLLANLSVEAQPPWLPGLEARYVAFARDLMADAQRQGRPFFLYYASHHTH
GKWHLGLGARGSFLPIHQGFDHFLGVPYSHDQGPCQNLTCFPPDIKCFGTCDQGLVPVPLFWNQSIVQQPVSFLIWCRLQQICTGLHLPTAPGEArpfLLYYASHHTH
GKWHLGIGANGTFLPTRQGFDQYLGIPYSHEMGPCQNLTCFPPDVKCFGLCDVGTVTVPLMYNEVIKQQPVNFLDLENAYRDFASDFISTSAKKRQPFFLYFPSHHTH
..eee................ee....................................................hhhhhhhhhhhhhhhhhh....eeeeee.....
..eee................ee.......................................................h..hhhhhhhhhhhh....eeeeee.....
..eee.................e....................................................hhhhhhhhhhhhhhhhhh....eeeeee.....
..eee................ee....................................................hhhhhhhhhhhhhhhhhh....eeeeee.....
..eee.................e....................................e............hhhhhhhhhh...............eeeee......
..eeee........................................ee.......eeeee..hhhh......h.hhhhhhhhhhhhhhhh.......eeeee......

>ARSA_hsa	
GKWHLGVGP.....EGAFLPPHQGFHR.FLGIPYSHDQGPCQN......LTCFPPATPCDGGCDQGLVP.............IPLLANLSVEAQPPWLPGLEARY...MAFAHDLMADAQRQDRPFFLYYASHHTH
>ARSA_mmu	
GKWHLGVGP.....EGAFLPPHQGFHR.FLGIPYSHDQGPCQN......LTCFPPDIPCKGGCDQGLVP.............IPLLANLTVEAQPPWLPGLEARY...VSFSRDLMADAQRQGRPFFLYYASHHTH
>ARSA_bta	
GKWHLGVGP.....EGAFLPPHHGFHR.FLGIPYSHDQGPCQN......LTCFPPATPCEGICDQGLVP.............IPLLANLSVEAQPPWLPGLEARY...VAFARDLMTDAQHQGRPFFLYYASHHTH
>ARSA_ssc	
GKWHLGVGP.....EGAFLPPHQGFHR.FLGIPYSHDQGPCQN......LTCFPPSTPCDGSCDQGLVP.............VPLLANLSVEAQPPWLPGLEARY...VAFARDLMADAQRQGRPFFLYYASHHTH
>ARSA_gga
GKWHLGLGA.....RGSFLPIHQGFDH.FLGVPYSHDQGPCQN......LTCFPPDIKCFGTCDQGLVP.............VPLFWNQSIVQQPVSFLIWCRLQ...QICTGLHLPTAPGEArpfLLYYASHHTH
>ARSA_fru	
GKWHLGIGA.....NGTFLPTRQGFDQ.YLGIPYSHEMGPCQN......LTCFPPDVKCFGLCDVGTVT.............VPLMYNEVIKQQPVNFLDLENAY...RDFASDFISTSAKKRQPFFLYFPSHHTH

GKWHLGHHGSYHPNFRGFDYYFGIPYSHDMGCTDTPGYNHPPCPACPQGDGPSRNLQRDCYTDVALPLYENLNIVEQPVNLSSLAQKYAEKATQFIQRASTSGRPFLLYVALAHMH
GKWHLGHHGSYHPNFRGFDYYFGIPYSNDMGCTDAPGYNYPPCPACPQRDGLWRNPGRDCYTDVALPLYENLNIVEQPVNLSGLAQKYAERAVEFIEQASTSGRPFLLYVGQAHMH
GKWHLGHHGSYHPSFRFDYYYFGIPYSNDMGCTDNPGYNYPPCPACPQSDGRWRNPDRDCYTDVALPLYENLNIVEQPVNLSGLAQKYAERAVEFIEQASTSGRPFLLYVGLAHMH
GKWHLGHHGSYHPSFRFDYYYFGIPYSHDMGCTDTPGYNYPPCPACPRRHQPSRNLERDCYSDVALPLYENLNIVEQPVNLSGLARKYAEKATQFIQQARASGRPFLLYVGLAHMH
GKWHLGHNGPYRPNRRGFDYYYGVPYSNDMGCTDVPGYNLPQCPPCDPPSGPsrSRHDGCYSKVALPLIENTTIVQQPLNLWRLTEQYKSAATRIIQNARAQGQPYFLYIALAHMH
GKWHLGITKAYHPCSRGFNYYYGLPYSNDMGCVDCDAYNHPQCKKCPKQSGITNDQAIECGYDTALPLYENYDIIEQPANLVELGDRYVEKATLFIQQAKNKTQPFFLYVATAHTH
...................eee...................................................ee.....hhhhhhhhhhhhhhhhhhh.......eeehhhhhh.
...................eee...................................................ee......hhhhhhhhhhhhhhhhhh......eeeee......
..e.............eeeeee...................................................ee......hhhhhhhhhhhhhhhhhh......eeeeee.....
..e.............eeeeee...................................................ee.....hhhhhhhhhhhhhhhhhhhhh....eeeeee.....
..ee................ee.........................................e.........ee...hhhhhhhhhhhhhhhhhhhhhh.....h.eeehhhhh.
...ee.e.............e....................................eh...................hhhhhh.hhhhhhhhhhhhhh......eeeeee.....

>KIAA1001_hsa	
GKWHLGHHG.......SYHPNFRGFDY.YFGIPYSHDMG.CTD......TPGYNHPPCPACPQGDGPSRNLQRDCYTDVA..LPLYENLNIVEQPVNLSSLAQKY...AEKATQFIQRASTSGRPFLLYVALAHMH
>KIAA1001_mmu	
GKWHLGHHG.......SYHPNFRGFDY.YFGIPYSNDMG.CTD......APGYNYPPCPACPQRDGLWRNPGRDCYTDVA..LPLYENLNIVEQPVNLSGLAQKY...AERAVEFIEQASTSGRPFLLYVGQAHMH
>KIAA1001_rno
GKWHLGHHG.......SYHPSFRFDYY.YFGIPYSNDMG.CTD......NPGYNYPPCPACPQSDGRWRNPDRDCYTDVA..LPLYENLNIVEQPVNLSGLAQKY...AERAVEFIEQASTSGRPFLLYVGLAHMH
>KIAA1001_ssc	
GKWHLGHHG.......SYHPSFRFDYY.YFGIPYSHDMG.CTD......TPGYNYPPCPACPRRHQPSRNLERDCYSDVA..LPLYENLNIVEQPVNLSGLARKY...AEKATQFIQQARASGRPFLLYVGLAHMH
>KIAA1001_fru	
GKWHLGHNG.......PYRPNRRGFDY.YYGVPYSNDMG.CTD......VPGYNLPQCPPCDPPSGPsrSRHDGCYSKVA..LPLIENTTIVQQPLNLWRLTEQY...KSAATRIIQNARAQGQPYFLYIALAHMH
>KIAA1001_cii	
GKWHLGITK.......AYHPCSRGFNY.YYGLPYSNDMG.CVD......CDAYNHPQCKKCPKQSGITNDQAIECGYDTA..LPLYENYDIIEQPANLVELGDRY...VEKATLFIQQAKNKTQPFFLYVATAHTH

GKWHLGHRPQFHPLKHGFDEWFGSPNCHFGPYDNKARPNIPVYRDWEMVGRYYEEFPINLKTGEANLTQIYLQEALDFIKRQARHHgPFFLYWAVDATH
GKWHLGHRPQFHPLKHGFDEWFGSPNCHFGPYDNKAKPNIPVYRDWEMVGRFYEEFPINRKTGEANLTQLYLQEALDFIRTQHARQGPFFLYWAIDATH
GKWHLGHRPQFHPLKHGFDEWFGSPNCHFGPYDNKVKPNIPVYRDWEMVGRFYEEFPINLKTGEANLTQLYLQEALDFIRTQHARQSPFFLYWAIDATH
gkwHLGHRPQFHPLKHGFDEWFGSPNCHFGPYDNKARPNIPVYRDWEMVGRYYEEFPINLKTGEANLTQIYLQEALDFIKRQhARHHPFFLYWAVDATH
GKWHLGHRPQFHPLKHGFDEWFGSPNCHFGPYDNKARPNIPVYRDQEMVGRFYEEFPINLKTGEANLTQIYLQEALEFIQRQQAAHRPFFLYWAVDATH
GKWHLGHRPQFHPLKHGFDEWFGSPNCHFGPYDNRARPNIPVYRDWEMVGRFYEEFPINLKTGESNLTQIYLQEALDFIKRQQATHHPFFLYWAIDATH
GKWHLGHRPQFHPLKHGFDEWFGSPNCHFGPYDNRALPNIPVYRDWEMIGRYYEDFKIDLRTGEANLTQIYLQEMYGPAQAALVFHHPFFLYWAIDATH
GKWHLGHRPSYHPLRHGFDEWFGSXNCHFGPYDNKQIPNIPLDRDWNMGGRYYEEFNIIIKTGESNLTRSSLQeGLDFIQSQAEAQKPFFLYWAPDATH
GKWHLGHRPQYLPLEHGFDEWFGAPNCHFGPYNNSVRPNIPVYRNSWMLGRYYEEFKIDKKTGESNLTQMYLLEGLDFIQSQAEAQKPFFLYWAPDATH
GKWHLGHRAHHLPLEHGFDEWFGAPNCHFGPYNSSDRPNVPVYNNSQMVGRYYEEFGIDKKTGESNLTQIYLEEGLDFIFHQNMAQRPFFLYWAADATH
GKWHLGHRAHHLPLEHGFDEWFGAPNCHFGPYNDSSRPNIPVYNNSEMKGRYYEEFEINVKTGESNLTQLYLKEGLDFISQQAMAQRPFFLYWAPDATH
GKWHLGQQEQYLPLKHGFHEWFGSPNCHFGPYDDKTTPNIPVYNNTEMVGRYYEEFAIESHKYLSNMTQYYIQEALDFIERMERNEKPFFLYWAPDATH
.............................................hhhhhhhhh...........hhhhhhhhhhhhhhhhhh......eeeeee....
.............................................hhhhhhhhh..........hhhhhhhhhhhhhhhhhhhh.....eeeeee....
.............................................hhhhhhhhhh.........hhhhhhhhhhhhhhhhhhh......eeeeee....
.............................................hhhhhhhhh...........hhhhhhhhhhhhhhhhhh......eeeeee....
.............................................hhhhhhhhhh..........hhhhhhhhhhhhhhhhhhh....eeeeeee....
.............................................hhhhhhhhh............hhhhhhhhhhhhhhhhhh.....eeeeee....
.............................................hhhhhh..hhhee.......hhhhhhhhhh...hhhhhhh....eeeeee....
.......................................................eeeeee.......hhhhhhhhhhhhhhhhh...eeeee......
....................ee........................hhhhhhhhhhe.........hhhhhhhhhhhhhhhhhhh...eeeee......
....................e..........................hhhhhhhh...........hhhhhhhh..hhe..hhh.....eeeeeh....
....................ee...................e.........hhhheeeee......hhhhhhhh..hhhhhhhhhh...eeee......
........h..........e..........................hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh....eeeee......

>GALNS_hsa	
GKWHLGHRP.......QFHPLKHGFDEWFGSPN...CHFGPYDNKARPNIPVYRDWEMVG........................RYYEEFPINLKTGEANLTQIYLQE...ALDFIKRQARHHgPFFLYWAVDATH
>GALNS_mmu	
GKWHLGHRP.......QFHPLKHGFDEWFGSPN...CHFGPYDNKAKPNIPVYRDWEMVG........................RFYEEFPINRKTGEANLTQLYLQE...ALDFIRTQHARQGPFFLYWAIDATH
>GALNS_rno
GKWHLGHRP.......QFHPLKHGFDEWFGSPN...CHFGPYDNKVKPNIPVYRDWEMVG........................RFYEEFPINLKTGEANLTQLYLQE...ALDFIRTQHARQSPFFLYWAIDATH
>GALNS_mac	
...HLGHRP.......QFHPLKHGFDEWFGSPN...CHFGPYDNKARPNIPVYRDWEMVG........................RYYEEFPINLKTGEANLTQIYLQE...ALDFIKRQ.ARHHPFFLYWAVDATH
>GALNS_bta	
GKWHLGHRP.......QFHPLKHGFDEWFGSPN...CHFGPYDNKARPNIPVYRDQEMVG........................RFYEEFPINLKTGEANLTQIYLQE...ALEFIQRQQAAHRPFFLYWAVDATH
>GALNS_ssc
GKWHLGHRP.......QFHPLKHGFDEWFGSPN...CHFGPYDNRARPNIPVYRDWEMVG........................RFYEEFPINLKTGESNLTQIYLQE...ALDFIKRQQATHHPFFLYWAIDATH
>GALNS_gga	
GKWHLGHRP.......QFHPLKHGFDEWFGSPN...CHFGPYDNRALPNIPVYRDWEMIG........................RYYEDFKIDLRTGEANLTQIYLQE...MYGPAQAALVFHHPFFLYWAIDATH
>GALNS_xla 
GKWHLGHRP.......SYHPLRHGFDEWFGSXN...CHFGPYDNKQIPNIPLDRDWNMGG........................RYYEEFNIIIKTGESNLTRSSLQ....GLDFIQSQAEAQKPFFLYWAPDATH
>GALNS_fru	
GKWHLGHRP.......QYLPLEHGFDEWFGAPN...CHFGPYNNSVRPNIPVYRNSWMLG........................RYYEEFKIDKKTGESNLTQMYLLE...GLDFIQSQAEAQKPFFLYWAPDATH
>GALNS_omy	
GKWHLGHRA.......HHLPLEHGFDEWFGAPN...CHFGPYNSSDRPNVPVYNNSQMVG........................RYYEEFGIDKKTGESNLTQIYLEE...GLDFIFHQNMAQRPFFLYWAADATH
>GALNS_dre	
GKWHLGHRA.......HHLPLEHGFDEWFGAPN...CHFGPYNDSSRPNIPVYNNSEMKG........................RYYEEFEINVKTGESNLTQLYLKE...GLDFISQQAMAQRPFFLYWAPDATH
>GALNS_cii	
GKWHLGQQE.......QYLPLKHGFHEWFGSPN...CHFGPYDDKTTPNIPVYNNTEMVG........................RYYEEFAIESHKYLSNMTQYYIQE...ALDFIERMERNEKPFFLYWAPDATH

GKVFHPGISSNHTDDSPYSWSFPPYHPSSEKYENTKTCRGPDGELHANLLCPVDVLDVPEGTLPDKQSTEQAIQLLEKMKTSASPFFLAVGYHKPH
GKVFHPGISSNHSDDYPYSWSFPPYHPSSEKYENTKTCKGQDGKLHANLLCPVDVADVPEGTLPDKQSTEEAIRLLEKMKTSGSPFFLAVGYHKPH
GKVFHPGISSNHSDDYPYSWSFPPYHPSSEKYENTKTCKGQDGKLHTNLLCPVDVADVPEGTLPDKQSTEEAITLLEKMKTSVSPFFLAVGFHKPH
GKVFHPGISSNYSDDYPYSWSIPPFHPSTEKYENDKTCRGKDGRLYANLVCPIDVTEMPGGTLPDIETTEEAIRLLNVMKTKKQKFFLAVGYHKPH
GKIFHPGISSNHSDDYPYSWSVYPYHPSAEKYENSQTCKGKDGKLHANLVCPVDVSEVPEGTLPDIQSTEEAIRLLKTVKQQNASFFLAVGYHKPH
GKVFHPGIASNHTDDYPYSIWSPPYHPASLHFEKQKMCKGDDGQLHANLLCAVNVTEQPGGTLPDLESTEEAIGLLKGRVQNTQPFFLAVGFHKPH
GKVFHPGIASNHSDDYPYSWSVPPYHPPSFEYEKRKVCKDKDGTLHSNLLCPVNVSEMPLGTLPDMENTEEAIRLLRSMKGSQKPfFLSVGFYKPH
GKVFHPGIASNHSDDYPYSWSVPPYHPPSFKYENMKVCKGSDGKLHANLLCSVNVSETPLGTLPDMESTEEAIRLLKSTRNSGKNFFLAVGFHKPH
GKVFHPGICSNYNDDFPLSWSLPACHPPTQKYKMKQVCPGPDGKLHMNLLCPVNVSTQPEHSLPDIQSAGHAIMIRKFSNNKSQPFFLAVGFHKPH
GKVFHPGLSSNNTDDYPLSWSAPAFRPRTEQFMNSPVCPDKEGILRKNLICPVELQTQPYKTLPDIESVAEAIFVGSRSRHSQEPFFLAMGFHKPH
GKVFHPGASSNFTDDFPLSWSEPAFHPLTDEYSNAAVCIDPDGRLKRNLLCPVRLETQPLHTLPDIESTEEAiKRFLSTVGLSQPYFLAVGYRKPH
GKVFHPGKSSNFTDDYPYSWSEYPYHPPTEMYKDAKVCRNKTKKLERNLICPVSVKRQPGQSLPDLQSLDYAiIDFLNTVGLSQPYFLAVGYRKPH
......................................................e.............hhhhhhhhhhhh.....eeeee......
....................................................................hhhhhhhhhhhh.....eeeee......
....................................................................hhhhhhhhhhhhh....eeeee......
...e........................................eeeee.................hhhhhhhhhhhhhhhhh..eeeee......
..ee...............eee......................ee..ee................h.hhhhhhhhhhhhhh..eeeeee......
..............................hhhhhh.........hhh.eeeee............hhhhhhhhhhh........eeeee......
...............................hhhhhhh............................hhhhhhhhhhhhh......eeeeee.....
..................................eeee........hheeeee.............hhhhhhhhhhhhh......eeeeee.....
..ee.............................eeee........e..e..................hhhhheeehh........eeeeee.....
...........................................hhhh..................hhhhhhhhee..........hhh.h......
..ee.........................hhhhhhhhhhhhhhhhhh..................hhhhhhhhhhhhh.......eeeee......

>IDS_hsa	
GKVFHPGISSNHTDDSPYS.................WSFPPYHPSSEKYENTKTCRGPDGELHANLLCPVDVLDVPEGTLPDKQSTEQAI.......................QLLEKMKTSASPFFLAVGYHKPH
>IDS_mmu	
GKVFHPGISSNHSDDYPYS.................WSFPPYHPSSEKYENTKTCKGQDGKLHANLLCPVDVADVPEGTLPDKQSTEEAI.......................RLLEKMKTSGSPFFLAVGYHKPH
>IDS_rno	
GKVFHPGISSNHSDDYPYS.................WSFPPYHPSSEKYENTKTCKGQDGKLHTNLLCPVDVADVPEGTLPDKQSTEEAI.......................TLLEKMKTSVSPFFLAVGFHKPH
>IDS_gga	
GKVFHPGISSNYSDDYPYS.................WSIPPFHPSTEKYENDKTCRGKDGRLYANLVCPIDVTEMPGGTLPDIETTEEAI.......................RLLNVMKTKKQKFFLAVGYHKPH
>IDS_str	
GKIFHPGISSNHSDDYPYS.................WSVYPYHPSAEKYENSQTCKGKDGKLHANLVCPVDVSEVPEGTLPDIQSTEEAI.......................RLLKTVKQQNASFFLAVGYHKPH
>IDS_fru	
GKVFHPGIASNHTDDYPYS.................IWSPPYHPASLHFEKQKMCKGDDGQLHANLLCAVNVTEQPGGTLPDLESTEEAI.......................GLLKGRVQNTQPFFLAVGFHKPH
>IDS_dre	
GKVFHPGIASNHSDDYPYS.................WSVPPYHPPSFEYEKRKVCKDKDGTLHSNLLCPVNVSEMPLGTLPDMENTEEAI.......................RLLRSMKGSQKPfFLSVGFYKPH
>IDS_omy	
GKVFHPGIASNHSDDYPYS.................WSVPPYHPPSFKYENMKVCKGSDGKLHANLLCSVNVSETPLGTLPDMESTEEAI.......................RLLKSTRNSGKNFFLAVGFHKPH
>IDS_cii	
GKVFHPGICSNYNDDFPLS.................WSLPACHPPTQKYKMKQVCPGPDGKLHMNLLCPVNVSTQPEHSLPDIQSAGHAI.......................MIRKFSNNKSQPFFLAVGFHKPH
>IDS_dme	
GKVFHPGLSSNNTDDYPLS.................WSAPAFRPRTEQFMNSPVCPDKEGILRKNLICPVELQTQPYKTLPDIESVAEAI.......................FVGSRSRHSQEPFFLAMGFHKPH
>IDS_aga	
GKVFHPGASSNFTDDFPLS.................WSEPAFHPLTDEYSNAAVCIDPDGRLKRNLLCPVRLETQPLHTLPDIESTEEAi.......................KRFLSTVGLSQPYFLAVGYRKPH
>IDS_bmo	
GKVFHPGKSSNFTDDYPYS.................WSEYPYHPPTEMYKDAKVCRNKTKKLERNLICPVSVKRQPGQSLPDLQSLDYAi.......................IDFLNTVGLSQPYFLAVGYRKPH

WMDIMEKHGYQTQKFGKVDYTSGHHSISNRVEAWTRDVAFLLRQEGRPIINLIPDKNRRRVMTKDWQNTDKAIEWLRQeaVNYTKPFVLYLGLNLPH
WMDVMEKHGYQTQKFGKLDYSSGHHSISNRVEAWTRDVAFLLRQEGRPIINLIPDKNRRRVMDKDWQNTDKAIAWLRQeaVNSTKPFVLYLGLNLPH
WMDVMERHGYRTQKFGKLDYTSGHHSISNRVEAWTRDVAFLLRQEGRPMVNLIRNRTKVRVMERDWQNTDKAVNWLRKEAINYTEPFVIYLGLNLPH
WMDVMEKHGYRTQKFGKLDYTSGHHSISNRVEAWTRDVAFLLRQEGRPMVNLAPKKTKVRVMQVDWKNTDRAVNWLRKEASNSTQPFVLYLGLNLPH
WMDVMEKHGYRTQKFGKLDYTSGHHSISNRVEAWTRDVAFLLRQEGRPMVNLISKKTKVRVMEGDWKNTDKAVKWLRKEAMNYTQPFVLYLGLNLPH
WMDLMQKHGYYTQKYGKLDYTSGHHSVSNRVEAWTRDVEFLLRQEGRPKVNLTGDRRHVRVMKTDWQVTDKAVTWIKKEAVNLTQPFALYLGLNLPH
WMDLMQKHGYYTQKYGKLDYTSGSHSLSNRVEAWTRDVPFLLRQEGRPCANLTGNKTQTRVMALDWKNADTATAWIQKAAQNHSQPFFLYLGLNLPH
WMDLLEVNGYLTKMMGKLDYTSGSHSVSNRVEAWTRDVQFLLRQEGRPVTQLVGNMSTVRIMGKDWENIDKATQWIQQRAESSQQPFALYLGLNLPH
WMDMLQQNGYNTLSVGKLDYTSGSHSVSNRVEAWTRDVPFLLRQEGRPVTDLVGDASTTRVMTKDWRTTDIATQWIRHKAAALSQPFALYLGLNLPH
WMDLLEENGYLTKMMGKLDFTSGSHSVSNRVEAWTRDVPFLLTQEGRPVSQLVGNTSTIKVMKKDWQNTDQASQWIRHRAAFSNQPFALYLGLNLPH
..hhhhh.........eee........hhhhhhhhhhhhhhhhh....eeee.......eeee......hhhhhhhhhhhh.....eeeee......
hhhhhhh..........e..........hhhhhhhhhhhhhhhh....eeee.......e.........hhhhhhhhhhhh.....eeeee......
hhhhhhh..........e.........hhhhhhhhhhhhhhhhh.....eee......eeeeh......hhhhhhhhhhh......eeeee......
hhhhhhh..........e.........hhhhhhhhhhhhhhhhh....eeee.....eeeeeee.....hhhhhhhhh........eeeee......
hhhhhhh..........e.........hhhhhhhhhhhhhhhhh.....eeee....eeeeee......hhhhhhhhhhh......eeeee.....h
hhhhhhh...ee.....ee.........hhhhhhhhhhhhhhhh.....eee.....eeeee.......hhhhhhhhhh.......hheee.....h
hhhhhhh...ee.....e..........hhhhhhh.....hh................eeeeee......hhhhhhhhhhh.....eeeee......
hhhhhhh..hhhhhh.............hhhhhhhhhhhhhhhh.....eeee....eeeee....hhhhhhhhhhhhhh......h.eee.....h
hhhhhh.....eeee..e..........hhhhhhh....hhhh......eeee......eeee.......hhhhhhhhhhhhh...hheee......
hhhhhh....hhhh..............hhhhhh.....eeee......eeee....eeeeee.......hhhhhhhhhhh......eeee......

>SulfX_mmu	
WMDIMEKHGYQTQKFGKVDYTSGHHSISNRVEAWTRDV................AFLLRQEGRPIINLIPDKNRRRVMTKDWQNTDKAIEWLRQ.......................eaVNYTKPFVLYLGLNLPH
>SulfX_rno	
WMDVMEKHGYQTQKFGKLDYSSGHHSISNRVEAWTRDV................AFLLRQEGRPIINLIPDKNRRRVMDKDWQNTDKAIAWLRQ.......................eaVNSTKPFVLYLGLNLPH
>SulfX_hsa	
WMDVMERHGYRTQKFGKLDYTSGHHSISNRVEAWTRDV................AFLLRQEGRPMVNLIRNRTKVRVMERDWQNTDKAVNWLRK.......................EAINYTEPFVIYLGLNLPH
>SulfX_bta	
WMDVMEKHGYRTQKFGKLDYTSGHHSISNRVEAWTRDV................AFLLRQEGRPMVNLAPKKTKVRVMQVDWKNTDRAVNWLRK.......................EASNSTQPFVLYLGLNLPH
>SulfX_ssc	
WMDVMEKHGYRTQKFGKLDYTSGHHSISNRVEAWTRDV................AFLLRQEGRPMVNLISKKTKVRVMEGDWKNTDKAVKWLRK.......................EAMNYTQPFVLYLGLNLPH
>SulfX_gga	
WMDLMQKHGYYTQKYGKLDYTSGHHSVSNRVEAWTRDV................EFLLRQEGRPKVNLTGDRRHVRVMKTDWQVTDKAVTWIKK.......................EAVNLTQPFALYLGLNLPH
>SulfX_xla
WMDLMQKHGYYTQKYGKLDYTSGSHSLSNRVEAWTRDV................PFLLRQEGRPCANLTGNKTQTRVMALDWKNADTATAWIQK.......................AAQNHSQPFFLYLGLNLPH
>SulfX_fru	
WMDLLEVNGYLTKMMGKLDYTSGSHSVSNRVEAWTRDV................QFLLRQEGRPVTQLVGNMSTVRIMGKDWENIDKATQWIQQ.......................RAESSQQPFALYLGLNLPH
>SulfX_omy
WMDMLQQNGYNTLSVGKLDYTSGSHSVSNRVEAWTRDV................PFLLRQEGRPVTDLVGDASTTRVMTKDWRTTDIATQWIRH.......................KAAALSQPFALYLGLNLPH
>SulfX_ola
WMDLLEENGYLTKMMGKLDFTSGSHSVSNRVEAWTRDV................PFLLTQEGRPVSQLVGNTSTIKVMKKDWQNTDQASQWIRH.......................RAAFSNQPFALYLGLNLPH

GKKHVGPETVYPFDFAYTEENGSVLQVGRNITRIKLLVRKFLQTQDDRPFFLYVAFHDPH
GKKHVGPEMVYPFDFAYTEENGSVLQVGRNITRIKLLVRKFLQTRGDRPFFLYVAFHDPH
GKKHVGPEAVYPFDFAHTEENDSILQVGRNITRMKLLVRKFLQTQDDRPFFLYVAFHDPH
GKKHVGPETVYPFDFAhTEENSSVMQVGRNITRIKQLVQKFLQTQDDRPFFLYVAFHDPH
GKKHVGPESVYPFEFAHTEENSSVLQVGRNITRIKLLVRKFLQTQDDRPFFLYVAFHDPH
GKKHVGPGSVYPFDFAYTEENGSVLQVGRNITRIKLLVRKFFQSQEERPFFLYVAFHDTH
GKKHVAPEAVYPFDFAETEENNSILQVGRNITRMKELAKQFFSMQLKESFLLYIGFHDPH
GKKHVGAANNFRFDFEQTEEQHSINQIGRNITRMKEYARQFLQAKDEKPFFLMVGFHDPH
..............eee......eeee..hhhhhhhhhhhhhh......eeeeeee....
.........e....eee......eeee...hhhhhhhhhhhhh......eeeeeee....
......................hhhhhhhhhhhhhhhhhhhhh......eeeeeee....
.......................ee.h.hhhhhhhhhhhhhhh......eeeeeee....
............eee........eeee..hhhhhhhhhhhhhh......eeeeeee....
..............eee......eeee...hhhhhhhhhhhh.......eeeeeee....
.........................hh..hhhhhhhhhhhhhhhhhhhh.eeeee.....
...........ee....hhhhhhhhhh.hhhhhhhhhhhhhhhh.....eeeeee.....

>SGSH_hsa	
GKKHVGPETVYPFDFA..YTEENGSVLQVGRNITRIKLLV..............................................................RKFLQ...........TQDD.RPFFLYVAFHDPH
>SGSH_bta	
GKKHVGPEMVYPFDFA..YTEENGSVLQVGRNITRIKLLV..............................................................RKFLQ...........TRGD.RPFFLYVAFHDPH
>SGSH_ssc	
GKKHVGPEAVYPFDFA..HTEENDSILQVGRNITRMKLLV..............................................................RKFLQ...........TQDD.RPFFLYVAFHDPH
>SGSH_mmu	
GKKHVGPETVYPFDFA..hTEENSSVMQVGRNITRIKQLV..............................................................QKFLQ...........TQDD.RPFFLYVAFHDPH
>SGSH_clu	
GKKHVGPESVYPFEFA..HTEENSSVLQVGRNITRIKLLV..............................................................RKFLQ...........TQDD.RPFFLYVAFHDPH
>SGSH_fru	
GKKHVGPGSVYPFDFA..YTEENGSVLQVGRNITRIKLLV..............................................................RKFFQ...........SQEE.RPFFLYVAFHDTH
>SGSH_cii	
GKKHVAPEAVYPFDFA..ETEENNSILQVGRNITRMKELA..............................................................KQFFS...........MQLK.ESFLLYIGFHDPH
>SGSH_dme	
GKKHVGAANNFRFDFE..QTEEQHSINQIGRNITRMKEYA..............................................................RQFLKQ..........AKDE.KPFFLMVGFHDPH

GKWHLGMSCHSKTDFCHHPLHHGFNYFYGISLTNLRDCKPGEGSVFTTGFKRLVFLPLQIVGVTLLTLAALNCLGLLHVPLGVFFSLLFLAALILTLFLGFLHYFRPLNCFMMRNYEIIQQPMSYDNLTQRLTVEAAQFIQRNTETPFLLVLSYLHVH
GKWHLGLSCRGATDFCHHPLRHGFDRFLGVPTTNLRDCRPGAGTVFGPALRVFAAGPLAALGASLAAMAAARWAGLARVPGWALAGTAAAMLAVGGPRSASCLGFRPANCFLMDDLAVAQRPTDYGGLTRRLADEAALFLRRNRARPFLLFLSFLHVH
GKWHLGLSCQAASDFCHHPGRHGFDRFLGTPTTNLRDCKPGGGTVFGSAQQVFVVLPMNILGAVLLAMALARWAGLARPPGWVFGVTVAAMAAVGGAYVAFLYHFRPANCFLMADFTITQQPTDYKGLTQRLASEAGDFLRRNRDTPFLLFLSFMHVH
GKWHLGLNCESRDDHCHHPNAHGFNYFFGIPLTNLRDCQPGHGTVFQIHKYLPYRTLGTVLASTVLLYFSGLISSAEKGPFAFWLQRFWSCSFIVGAYVAFIMIIPLFNCVLMKDHSIVEQPFVSENLTQRMTREAVDFIERNSARPFLLFFSFLQVH
GKWHLGVNCERRGDHCHHPNQHGFSYFYGLPFTLFNDCVPGEGSDILADLQKTLRSFTIFLGIGLATLVRLIVVFQASFYSLRLLVVLFFISILATVLWLPFKFIPTWNCILMRNQEVVEQPMVVETLPRRLLTEAQQFIKRNVDRPFLLFFSMAHIH
GKWHQGVNCASRGDHCHHPLNHGFDYFYGMPFTLTNDCDPGRPPEVDAALRAQLWGYTQFLALGILTLAAGQTCGFFSVSARAVTGMAGVGCLFFISWYSSFGFVRRWNCILMRNHDVTEQPMVLEKTASLMLKEAVSYIERHKHGPFLLFLSLLHVH
GKWHLGLNCESASDHCHHPLHHGFEHFYGMPFSLMGDCARWELSEKRVNLEQKLNFLFQVLALVALTLVAGKLTHLIPVSWMPVIWSALSAVLLLASSYFVGALIVHADCFLMRNHTITEQPMCFQRTTPLILQEVASFLKRNKHGPFLLFVSFLHVH
GKWHLGLSCASPDDHCHHPLNHGFDHFYGMPFSMMADCERWELSEKRAVLESRLDVCFQLVALATLTLTIGKLTHLIPGASWTLVIWSAVVCLLLFATSCVGALIMHADCFLMRNHSIAEQPMRSQRTTPLMLQEVSSFVKRHKQGPFLLFVSFLHVH
GKWHQGLNCDSRSDQCHHPYNYGFDYYYGMPFTLVDSCWPDPSRNTELAFESQLWLCVQLVAIAILTLTFGKLSGWVSVPWLLIFSMILFIFLLGYAWFSSHTSPLYWDCLLMRGHEITEQPMKAERAGSIMVKEAISFLERHSKETFLLFFSFLHVH
GKWHLGLSCASRNDHCYHPLNHGFHYFYGVPFGLLSDCQASKTPELHRWLRIKLWISTVALALVPFLLLIPKFARWFSVPWKVIFVFALLAFLFFTSWYSSYGFTRRWNCILMRNHEIIQQPMKEEKVASLMLKEALAFIERYKREPFLLFFSFLHVH
.....e...................eeeeeee............eeee...eeee.......hhhhhhhhhhhh........hhhhhhhhhhhhhhhhhhhhhh...hhhhhhhhhh.h.......hhhhhhhhhhhhhhhh.....eeeeeeee...
..eee.e.........................................hhhhhh...hhhhhhhhhhhhhhhhhhh.....hhhhhhhhhhhh................h.ehhhhhhh........hhhhhhhhhhhhhhhh....hhhhhhh.e..
...ee......................................eee....eeeeee..hhhhhhhhhhhhhhhh........eeee.hhhhhhh...eeeeee.......eee..eee........hhhhhhhhhhhhhhhh.....eeeeeehh...
..eee....................eee...............eeeee..........hhhhhhh.ee............hhhhhhhhh...eehhhhhhh.ee..hh..eee.....e.....hhhhhhhhhhhhhhhhhh.....hhh..eeee..
...ee.e..................ee....eee..........hhhhhhhhhhhhhheee...hhhhhhhh..hh..hhhhhhhhhhhhhhhhhhhh............ee...h.hh...eeehh.hhhhhhhhhhhhhh.....ee.ehhhhhh.
.........................ee....ee............hhhhhhhhhh.hhhhhhhhhhhhh......eeeeehhhh.......eeeeeee.....ee.hheeeee.........hhhhhhhhhhhhhhhhhhhh.....ee..hhhhh..
..eee.....................e..........hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh....eee......hhhhhhhhhhhhhhhhh..eeeee..hhhh................hhhhhhhhhhhhh.....eeeeeeeeee.
...ee...........................hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh..h...eee.....eeeeehhhhh....h.....hh.hhhhhhhhhh................hhhhhhhhhhhhh....eeeeeeeeee.
.........................ee....eeee..........hhhhhhhhhhhhhhhhhhhhhhh........e...hhhhhhhhhhhhhhhh..........ehhhhhh......hhhhhhhhh.hhhhhhhhhhhhhh.hhhhhhhhhh.e..
...ee.e.................eeee................hhhhhhhhhhhhhhhhhhh.hhhhh......ee....h...hhhhhhhhhh...........hhhhhhhhhhhhhh..hhhhhhhhhhhhhhhhhhhhhh...e.eeehh....

>STS_hsaXR	
GKWHLGMSCHSKTDFCHHPLHHGFNYFYGI..SLTNLRDCKPGEGSVFTTGFKRLVFLPLQIVGVTLLTLAALNCLGLLHVPLGVFFSLLFLAALILTLFLG.FLHYFRPLNCFMMRNYEIIQQPMSYDNLTQRLTVEAAQFIQRNT.....ETPFLLVLSYLHVH
>STS_mmu	
GKWHLGLSCRGATDFCHHPLRHGFDRFLGV..PTTNLRDCRPGAGTVFGPALRVFAAGPLAALGASLAAMAAARWAGLARVPGWALAGTAAAMLAVGGPRSA.SCLGFRPANCFLMDDLAVAQRPTDYGGLTRRLADEAALFLRRNR.....ARPFLLFLSFLHVH
>STS_rno	
GKWHLGLSCQAASDFCHHPGRHGFDRFLGT..PTTNLRDCKPGGGTVFGSAQQVFVVLPMNILGAVLLAMALARWAGLARPPGWVFGVTVAAMAAVGGAYVA.FLYHFRPANCFLMADFTITQQPTDYKGLTQRLASEAGDFLRRNR.....DTPFLLFLSFMHVH
>STS_fru	
GKWHLGLNCESRDDHCHHPNAHGFNYFFGI..PLTNLRDCQPGHGTVFQIHKYLPYRTLGTVLASTVLLYFSGLISSAEKGPFAFWLQRFWSCSFIVGAYVA.FIMIIPLFNCVLMKDHSIVEQPFVSENLTQRMTREAVDFIERNS.....ARPFLLFFSFLQVH
>ARSD_fru	
GKWHLGVNCERRGDHCHHPNQHGFSYFYGL..PFTLFNDCVPGEGSDILADLQKTLRSFTIFLGIGLATLVRLIVVFQASFYSLRLLVVLFFISILATVLWL.PFKFIPTWNCILMRNQEVVEQPMVVETLPRRLLTEAQQFIKRNV.....DRPFLLFFSMAHIH
>ARSD_hsa	
GKWHQGVNCASRGDHCHHPLNHGFDYFYGM..PFTLTNDCDPGRPPEVDAALRAQLWGYTQFLALGILTLAAGQTCGFFSVSARAVTGMAGVGCLFFISWYS.SFGFVRRWNCILMRNHDVTEQPMVLEKTASLMLKEAVSYIERHK.....HGPFLLFLSLLHVH
>ARSE_hsa	
GKWHLGLNCESASDHCHHPLHHGFEHFYGM..PFSLMGDCARWELSEKRVNLEQKLNFLFQVLALVALTLVAGKLTHLIPVSWMPVIWSALSAVLLLASSYF.VGALIVHADCFLMRNHTITEQPMCFQRTTPLILQEVASFLKRNK.....HGPFLLFVSFLHVH
>ARSE_bta	
GKWHLGLSCASPDDHCHHPLNHGFDHFYGM..PFSMMADCERWELSEKRAVLESRLDVCFQLVALATLTLTIGKLTHLIPGASWTLVIWSAVVCLLLFATSC.VGALIMHADCFLMRNHSIAEQPMRSQRTTPLMLQEVSSFVKRHK.....QGPFLLFVSFLHVH
>ARSF_hsa	
GKWHQGLNCDSRSDQCHHPYNYGFDYYYGM..PFTLVDSCWPDPSRNTELAFESQLWLCVQLVAIAILTLTFGKLSGWVSVPWLLIFSMILFIFLLGYAWFS.SHTSPLYWDCLLMRGHEITEQPMKAERAGSIMVKEAISFLERHS.....KETFLLFFSFLHVH
>ARSG_hsa	
GKWHLGLSCASRNDHCYHPLNHGFHYFYGV..PFGLLSDCQASKTPELHRWLRIKLWISTVALALVPFLLLIPKFARWFSVPWKVIFVFALLAFLFFTSWYS.SYGFTRRWNCILMRNHEIIQQPMKEEKVASLMLKEALAFIERYK.....REPFLLFFSFLHVH

GKWHLGINELKQNDGRHLPKHHGFDFVGTNLPFTFHLFCSPSEYPVDKMKIKCFLSNKDEIIEQPIIPEKLTDKIVEGAKQFITENQKNPFFLYLSLPQTH
GKWHLGINRNTSTDGYHLPHNHGFDFVGTNLPLSHSEMCNPAEFTVEELSTMCFLYNGSTIVEQPVNLSTLTDRITSDAKNFISNNRLNSFFLYFSPPQAH
GKWHLGINENSSTDGAHLPFNHGFDFVGHNLPFTNSWSCDDTGLHKDFPDSQCYLYVNATLVSQPYQHKGLTQLFTDDALGFIEDNHADPFFLYVAFAHMH
GKWHLGINEQTSTDGAHLPFNHGFEYVGYNLPFTNSWNCDDTGLHVDFPNTECYLYKNATLVSQPYQHRNLTKLFTDDAIEFIDNNADNPFFLYVAFAHMH
GKWHLGINENSSSDGAHLPANRGFDFVGHNLPFGNSWRCDDTGLHQDFPDTNCFLYYNSTSVAQPFQHKGLTQLLRDDTVGFIEDNVNKPFFMYVSFAHMH
GKWHLGINENNATDGAHLPSKRGFEYVGVNLPFTNVWQCDTTREFYDKGPDPCFLYDGDDIVQQPMKFEHMTENLVGDWKRFLMTRLnRPFFFYFSFPQVH
........h................e......eeeeee..........eeeeeee.............hhhhhhhhhhhhhhhhh.....eeeee......
..eeeee......................................hhhh..eeeee....eee...........h......h........eeeee......
..eeeee.............................................eeeeeeeeee.........eeee....hhhh........eeeeeh....
..eeeee.................eeee...............eee......eeee...eee.......hhhhhh...hhhhh.......eeeee.hhh..
..eeeee.............................................eeeee.............hhhhhh....eeee.......eeeeee....
..eeeee.................eeeee..............hh........ee.....ee.....hhhhhhhhhhhhhhhhhhh.....eeeee.....

>STS_cii	
GKWHLGINELKQNDGRHLPKHHGFD.FVGTNLPFTFHLFCSPSEYPVDKMKIK...........................................................CFLSNKDEIIEQPIIPEKLTDKIVEGAKQFITENQ.....KNPFFLYLSLPQTH
>STS2_cii	
GKWHLGINRNTSTDGYHLPHNHGFD.FVGTNLPLSHSEMCNPAEFTVEELSTM...........................................................CFLYNGSTIVEQPVNLSTLTDRITSDAKNFISNNR.....LNSFFLYFSPPQAH
>ARSE_hpu	
GKWHLGINENSSTDGAHLPFNHGFD.FVGHNLPFTNSWSCDDTGLHKDFPDSQR..........................................................CYLYVNATLVSQPYQHKGLTQLFTDDALGFIEDNH.....ADPFFLYVAFAHMH
>ARSE_her	
GKWHLGINEQTSTDGAHLPFNHGFE.YVGYNLPFTNSWNCDDTGLHVDFPNTEK..........................................................CYLYKNATLVSQPYQHRNLTKLFTDDAIEFIDNNA.....DNPFFLYVAFAHMH
>ARSE_spu	
GKWHLGINENSSSDGAHLPANRGFD.FVGHNLPFGNSWRCDDTGLHQDFPDTNA..........................................................CFLYYNSTSVAQPFQHKGLTQLLRDDTVGFIEDNV.....NKPFFMYVSFAHMH
>ARSE_cel	
GKWHLGINENNATDGAHLPSKRGFE.YVGVNLPFTNVWQCDTTREFYDKGPDPSL.........................................................CFLYDGDDIVQQPMKFEHMTENLVGDWKRFLMTRL.....nRPFFFYFSFPQVH

>Sulf_pmari cys CAPTR Prochlorococcus marinus
GKWHETP................................................GRETTAAGPQTRWPT.RQGFEKFYGFIGAEENMYEPSLHD........GVTIIDYPDREDYHFLEDMTDQAIAWMRQQQGLRPDKPFFIYYASAGSH
>Sulf_mtube4 cys CSPTR Mycobacterium tuberculosis
GKCHEVP................................................VWQTSPVGPFDAWPSGGGGFEYFYGFIGGEANQWYPSLYE........GTTPVEVNRTPEYHFMADMTDKALGWIGQQKALAPDRPFFVYFAPGATH
>Sulf_npunc cys CSPTR Nostoc punctiforme
GKWHNTP.................................................DYETSAVGPFDRWPTGLGFEYFYGFMGGDTNQWSPALVE.....NTKRVAPPIEGNNPDYHLTPDLVDHAIAWIRSQQSIAPEKPFFTYLAIGATH
>Sulf_mbark2 cys CSPSR Methanosarcina barkeri
GKWHLTP.................................................ADQISAAGPYDRWPLGRGFECFYGFLGGETHQYYPELTY......DNHSVNPPKTPEEGYTLNEDLADRAIQFIADAKQVAPNKPFFMYFCTGAMH
>Sulf_mtube1 cys CSPTR Mycobacterium tuberculosis
GKWHLTP.................................................DNVQGAAGPFDNWPLGWGFDHFWGFPSGAAGQYDPIISQDNSVIGIPEGSGEDGRPYYFPDDLTDKAIEWLHTVRAQNATKPWMLYYATGATHAPH
>Sulf_sarom1 cys CSPTR Sphingomonas aromaticivorans
GKWHQVP.................................................DWEASPSGPFDRWPTGEGFERFYGFIGGETDQFDPSLFE........GTTPVMRPDVPNYHLTEDLADKSIAWLRTQHSVTPDKPFFLYFAPGATH
>Sulf_npunc cys CSPTR Nostoc punctiforme
GKWHNTP.................................................DYETSAVGPFDRWPTGLGFEYFYGFMGGDTNQWSPALVE.......NTKRVAPPINNPDYHLTPDLVDHAIAWIRSQQSIAPEKPFFTYLAIGATH
>Sulf_sarom3 cys CSPTR Sphingomonas aromaticivorans
GKSHLTP.................................................EWQTSAAGPFDQWPTGLGFEYFYGFLSADTSMWQPSIVE.......NTLPVEPPHDDPNYFFEKDMADHAIKWMRTQQAAAPDKPFFMYYAPGIAH
>Sulf_sarom2 cys CSPTR Sphingomonas aromaticivorans
GKHHNTP.................................................EPFVSPAGPFDLWPTGLGFEYFYGFMAASTNQFSPALYR..........NTSPIPTLRDGVLDKALADDAIGWIHAQKAAAPDKPFFLYYATGSAH
>Sulf_narom cys CSPTR Novosphingobium aromaticivorans
GWNTAMI..........................................GKHHNTPEPFVSPAGPFDLWPTGLGFEYFYGFMAASTNQFSPALYR..........NTSPIPTLRDGVLDKALADDAIGWIHAQKAAAPDKPFFLYYATGSAH
>Sulf_mtube5 cys CSPTR Mycobacterium tuberculosis
GKWHLTP.................................................LEESNMASTKRHWPTSRGFERFYGFLGGETDQWYPDLVYDNH..PVSPPGTPEGGYHLSKDIADKTIEFIRDAKVIAPDKPWFSYVCPGAGHAPHH
>Sulf_prevoS ser STPAR +SulMod_pre Prevotella sp.
GKWHMQC...............................................................RPKGFDFFRIFEGQGDYYNPLVLSHD.............SNGKYEREQGYATDIVTEHAVEFLNQRDEQKPFFLLVEHKAPH
>Sulf_bfrag4S ser SGPSR Bacteroides fragilis
GKWHLES...............................................................LPSGFNYWEIVPGQGDYYNPDFITQ...............NDTVQKHGYITNLITDDAIDWMENKRDESKPFCLLIHHKAIH
>Sulf_ecoli2S ser SGPSR +SulfMod_ecoli2 Escherichia coli
GKWHLSK..................................ISNVPVPEDKQTRDYHDNFTTFSAEEWQPQNRGFDYFMGFHAAGTAYYNSPSLFKN.............RERVPAKGYISDQLTDEAIGVVDRAKTLDQPFMLYLAYNAPH
>Sulf_bbron cys CTAGR Bordetella bronchiseptica
GKNHLGD...........................RDEYLPTKHGFDEFFGNLYHLNAEEEPERPYWPKDPKD..............PVVKAYKPRGVIKASADGKIEDTGALTSKRMETIDDETVGAALDYIDKHGKGDKPFFVWMNTTRMH
>Sulf_mbark1 cys CTAGR Methanosarcina barkeri
GKNHFGD...........................LDEYLPTNHGFDEFYGNLYHLNAEEEPENPDYPAEKDF.............PNFRKNYGPRGVIHSYANGRVEDTGPLTRKRMETVDLEFLDAAIDFIKRHHAAEKPFFVWLNTTWMH
>Sulf_mmagn cys CTAGR Magnetospirillum magnetotacticum
GKNHLGD...........................RDEYLPTNHGFDEFFGNLYHLNAEEEPEQRTYPRDPEF................RKRFGPRGVIKSSADGKIEDTGPLTKKRMETIDDETSAAAMDFIERQTRADKPFFCWFNSTRMH
>Sulf_bbron3 cys CVPTR Bordetella bronchiseptica
GKWHLGD...........................VPGRYPSDRGFDEWYGIPRTTDESQFTASIGFDP....................AVADLPYIMQGRAGEPSENVKLYDLESRRRIDE.ELVERSLGFMRGNAAAGRPFFLYLPLVHLH
>Sulf_bbron5 cys CRAVR Bordetella bronchiseptica
GKWHLGD...........................KEGRYPKDRGFDEWYGIPRTTNESMFMEAVGFDP....................DVVEVPYVMEGRKGSPAERRERYDLEMRRRIDE..VLTQRSCEFIGRHAGKAPFFLYVPLTQLH
>Sulf_mloti1 cys CTAGR Mesorhizobium loti
GKNHVGD...........................RNEFLPTVHGFDEFFGNLYHLNAEEEPENVDYPKNPEFHAKFGPRGVLKCTATETDDPTEDPRFGRVGKQKIEDTGPLTKKRMETVDEEFLGAAKDFIDRQHKASKPFFCWFNSTRMH
>Sulf_ecoli3S ser SSPTR +SulfMod_ecoli3 Escherichia coli
GKWHMGE...........................NKESQPQNVGFDDFRGFNSVSDMYTEWRDVHVNPEVALSPDRS........EYIKQLPFSKDDVHAVRGGEQQAIADITPKYMEDLDQRWMDYGVKFLDKMAKSDKPFFLYYGTRGCH
>Sulf_bfrag5S ser SAPAR Bacteroides fragilis
GKWGLGA..........................PGTEGTPNKQGFDSFYGYNCQRQAHSYYPAFLYKNEDRVYLANK...............VLDPHTTKLDAGADPRDEAAYAKFSQKEYANDLIFDELISFVGQNRKKPFFLMWTTPLPH
>Sulf_bfrag3S ser SAPSR Bacteroides fragilis
GKWGLGF..........................IGSTGDPKKQGIDEFYGYNCQLLAHSYYPDHLWDNDKRVELKDNTLD...............................VQYGKGTYSQDLIHSKALDFLDRMGKSGESFCMWYPTIIPH
>Sulf_bfrag10S ser SAPSR Bacteroides fragilis
GKWAGGY..........................EGSASTPDKRGIDEYYGYICQFQAHLYYPNFLNRYSPSLGDTGVVRVVMEENIKYPMYGP..................DYHKRTQYSADLIHQKAMEWIEKQD.GEQPFFGIFTYTLPH
>Sulf_bcepa1 cys CTAGR Burkholderia cepacia
GKNHLGD...........................KNEYLPTNHGFDEFYGNLYHLNAEEEPERPYWPKDKNDPYVKNFSPRGVIHSTSDGKVQDTGPLTAKRME..............TIDDETGAHAEEFIRKQVKNGTPFFVWMNFTRMH
>Sulf_bfrag6S ser STPSR Bacteroides fragilis
GKWHLGL....GDKSGEQDWNAPLPAALGDLGFDYSYIMAATADRVPCVFIENGKVANYDPSAPIEVSYRKPFEGEPLGKDHPELLYNQKHSHGHDMAIVNGIGRIGYMKGGGKALWKDENIADSITTHAINFIREHKDEPFFMYFATNDVH
>Sulf_bfrag7S ser STPSR Bacteroides fragilis
GKWHIGL......GDGHVDWNKEVHPGAAEIGYDYSFIQAATNDRVPCVFLENGRVVGLDPNDPLYVDYRKNFPGEPTGKENPELLRM.HPSVGHAGSIVNGVPRIGFQKGGKAAQWKDEEMAGLFLDKARQFVDDNKDKPFFLYYGLHQPH
>Sulf_styph2 cys CSPSR +SulfMod_sty2 Salmonella typhimurium
GKLHLNA...................................................GGDRTDQPQAKDMGFDYTLVNPAGFVTDATLDNAKERPRYGVVHPTGWIRNGQHIGRADKMSGEFVSSEVVNWLDNKK.DDNPFFLYVAFTEVH
>Sulf_dvulgT thr TIPVR Desulfovibrio vulgaris
GYSRGFD........................................YVRFCNGHELDHETFCNVPLDEEFKAEDYLSPNWLKKDENGE..YDSSSKSLIRETECYLRQRQFWASDADNYASVVISEADNWLKMKRNPQRPFFLWLDSFDPH
>Sulf_ypest2S ser SGPSR +Sulf_ypest2 Yersinia pestis
GKWHLSK........................................ISNVPVPEAEQTRDYHDNFTTYSADEWQPQNRGFQYFMGYHAAGTAYYNSPSLFHNKERVKAKGYISDQLTDEA.......IGVANRAKSLDEPFMMYLAYSAPH
>Sulf_ypest1a cys CGPSR +SulfMod_ypest1 Yersinia pestis
GKWHNAR.......................................IEKKAFVADEVKSRDYHDNMISVSAPGYAPEKRGFDYSYSYYASGAALWHSPAIWQNSKNIAAPGYLTHNLTDE.........TLKFIDDSGKKPFFISLAYSVPH
>Sulf_bfrag11S ser SSPSR Bacteroides fragilis
GKAHFGC..................LKSEGENPTNLGFDVNIAGSAIGHPGSYHGENGYGWIKGQRARAVPDLEQYHK............THTFLSDALTLEA.........................GKEIEKAVAEKKPFYLNMAHYAVH
>Sulf_bfrag2S ser SSPTR Bacteroides fragilis
GKAHFGA..................VNTPGESPYHMGFEVNIAGHAGGGLASYLGENNYGNRTDGKPNPWFAVPGLEKYW.........GTDTFVSEALTLEA.........................IKALDHAKEYNQPFFLYMAHYAIH
>Sulf_bfrag9S ser SSPSR Bacteroides fragilis
GKWHLAE....................SAEYYPEQNGFDINIGGNNTGHPSKGYFSPYGNPQLKDGPEG.......................EYLTDRLTDEV...........................IRYISEPKEKPFFVYLSYYTVH
>Sulf_ccres cys CAPSR Caulobacter crescentus
GKWHLGG....................VKGSRPEDQGFDESLGFMAGAALFAPVGDPGVVESRQDWDPIDKFLWGAAPFAVQFNGGKLFNPSHYMTDYLTDEA...........................VKAIDANKNRPFFMYLAYNAVH
>Sulf_tfusc cys CSSTR Thermobifida fusca
GKWHCGW....................LPWYSPLRIGFETFFGNFDGALDYFEHVDTLGKADLYEGETPVEEVGYYTEIISERA..............................................AEYITAHRNRPFYVQLNYTAPH
>Sulf_sarom4 cys CSPTR Sphingomonas aromaticivorans
GKWHLGE....................PPAHGPLKHGYDHFLGIVEGGADYFVHRMVMSGKPAGVGLAEDDAQTDRTGY...................LTDIFGDEA.....................VRVIEEG..GNQPFFLSLHFTAPH
>Sulf_sarom5 cys CTATR Sphingomonas aromaticivorans
GKWHLGS....................LPDFDPLKSGYQTFWGIRSGGVDYYTHATSNGQPDLWDGPTP....VERAGY...................LTDLLADRA.....................VSEIREASSGEAPWFMSLHFTAPH
>Sulf_bfrag1S ser SSPAR Bacteroides fragilis
GKWHLDA..............................................PYKPYVDTYNNRGKVAWNEWCPPERRHGFDHWIAYGTYDYHLKPMYWNTTAPRDSFYYVNQWGPEYE..ASKAIEYINGQK..DQKQPFALVVSMNPPH
>Sulf_ypest1b cys CTPFR + adjSulfMod_ypest1 Yersinia pestis
GKWHLDA............................................PEAPFVPSYNNPMEGRY.WNDWTPPEKRHGFDFWYSYGTYDLHLNPMYWTNDTPRDKPLKINQWSPEHEADIAIKYLRNEGGKYRDNDQPFALVVSMNPPH
>Sulf_bbron2 cys CAPSR Bordetella bronchiseptica
GKMHFVG...................................................PEQHHGFQERLTTDIYPSDFGWTPDWREEIPIAPTGMNMRSVIEAGECRRSMQIDYDDDVVYRGVQKIYDLGRLHR....DRPFFLAVSMTHPH
>Sulf_paer2 cys CAPSR Pseudomonas aeruginosa
GKMHFCG...................................................PDQLHGYEERLTSDIYPADYGWAVNWDEPEVRPSWYHNMSSVLQAGPCVRTNQLDFDEEVVFKARQYLYDHVRQ.HAG...QPFCLTVSMTHPH
>Sulf_bcepa2 cys CAPSR Burkholderia cepacia
GKMHFCG...................................................ADQLHGFEERLTTDIYPADFGWTPDWEHFETRPTWYHNMSSVIDAGPCVRTNQLDFDDEVTFTTRQKLFDIARERHAGKDARPFCLVASLTHPH
>Sulf_dhafn cys CAVTR Desulfitobacterium hafniense
GKWHLGS..............................................GLPNATNQGAAVGTSAPGQTPVSWGFEKSYALLGGGGDHFGRNGATAYVEDDHYVTPNTTSFFSSDFYTSTIIKYIDSSTGKNTDGKPFFAYLTYQAPH
>Sulf_rmeta1 cys CAVTR Ralstonia metallidurans
GKWHLGS..............................................GLPNATNQGAAVGTSAPGQTPVSWGFEKSYALLGGGGDHFGRNGATAYVEDDHYVTPNTTSFFSSDFYTSTIIKYIDSSTGKNTDGKPFFAYLTYQAPH
>Sulf_atume cys CAPAR Agrobacterium tumefaciens
GKMHFVG...................................................PDQLHGFEERLTTDIYPADFGWTPDYTKPGERIDWWYHNLGSVTGAGVAEITNQMEYDDEVAYHATRKLYDLSR...RLDDRPWCLTVSFTHPH
>Sulf_rspha cys CAPAR Rhodobacter sphaeroides
GKMHFVG...................................................PDQLHGFEARLTTDIYPADFGWTPDYRKPGERIDWWYHNLGSVTGAGVAEITNQLEYDDDVAHQAIQKLYDLSR...GADPRPWCLTVSFTHPH
>Sulf_mloti3 cys CAPGR Mesorhizobium loti
GKMHFVG...................................................PDQLHGFEERLTTDIYPADFGWTPDYRKPGERIDWWYHNLGSVSGAGVAEISNQMEYDDEVAFHAVQKLYDFARVSDDAAHRPWCLTVSFTHPH
>Sulf_smeli cys CAPAR Sinorhizobium meliloti
GKMHFVG...................................................PDQLHGFEERLTTDIYPADFGWTPDYRKPGERIDWWYHNLGSVTGAGVAEITNQMEYDDEVAFLANQKLYQLSRENDDESRRPWCLTVSFTHPH
>Sulf_bbron6 cys CMPNR Bordetella bronchiseptica
.................................................................................................................TRVPESCSTTAYLGRRTMQALDGYAQRGQPFFIQCSFPDPH
>Sulf_rpalu2 cys CGPSR Rhodopseudomonas palustris
GFEPYER...............................................DDGLHPDGPYDPAPDYDAYLRSQGFDADNPWEVWANSAEGGDGELLSGWLLSHADKPARVPDEHSETPYITRRAIEFIGEAEADGRPWCLHLSYIKPH
>Sulf_paer1 cys CGPSR Pseudomonas aeruginosa
GFEPYDR..........................................NDGVYPDDPAFADKRERAPYTH.YLRRLGFTGDNPWHDWANAAAGADGEILSGWRMRHAGLPTRLPEAHSETAYTTRRAMDFI..DEQGERPWCLHLSYIKPH
>Sulf_bbron1 cys CGPSR Bordetella bronchiseptica
GLLLSRG...........................................GFRELDRYDGHHEPGAESGYPAFLRRHGYDSPDPWSDYVISAIDAGGQVVSGW..HMRNTYLPSRVREAHSETAYMTGQALDFMRQRGGQPWVLHLSYVKPH
>Sulf_rmeta2 cys CGPSR Ralstonia metallidurans
GHFVEVD..................................................RHDGHHAEPRSPYADWLRAQGYDSADPWTDYVISAQTPDGEVVSGW..HMRNAGLPARVAEPHSETAYTVDRAMDYIGARGDDPWVLHLSLVKPH
>Sulf_scoel cys CTPAR Streptomyces coelicolor
GKWHAGN....................................RRTAADYGFDGPELPGWHNPV.DHPDYLAYLD...ERGLPPYEISDRVRGTLPNGGPGNLLAARLHQPVEATFEHYLATRAIERLEHYAADAHDRDRPFFLALHFFGPH
>Sulf_pmult2 cys CGPAR +SulfMod_pmult2 Pasteurella multocida
GKWHVGT...............................KSVPEDYDIKGHNFDGYGYPGSGVYKNLVFNQPPTHSNRYKEWLEEKGFEFPEVSKAYFGDNPHLRVQELCGFLSGTKEQTIPYFIIDEAKRYIQESLEENKPFFTWINFWGPH
>Sulf_rpalu1 cys CTSSR Rhodopseudomonas palustris
GKWHLNR......................................................................KFDTQETDRLFTKEMDDYGFSDYFSPGDIIGHTLGGYQFDPLIASSAITWLRRNGRPLTDDDKPWALFVSLVNPH
>Sulf_kpneu2 cys CTPSR +Sulf_kpneu2 Klebsiella pneumoniae
GKWHLTR.........................................................EIDQPVAGKSVEEMDLGEIPTPRLHEIMEKYGFSDYHGIGDVIGKSKGGYFFDSVTTGQTISWLRNTGRPLNDENKPWFAAVNLVNPH
>Sulf_styph4 cys CTPSR +isolated Salmonella typhimurium
GKWHLTE.........................................................KLEKPLPDEKDEDIDVGDIPEPELHKIMEKYGFADYHGIGDIIGHSKGGYFYDSTTTAQTINWLRCKGQPLNDQHKPWFLAVNLVNPH
>Sulf_styph1S ser SAPAR +SulfMod_sty1 Salmonella typhimurium
GKWHLGF..........................TPGSTPKDRGFRHSFALMGGGASHFDDAVPLGTVEIFHTYYTR................................DNQRISLPSSFYSSEAYASQINRWISETPREQPIFAWLAFTAPH
>Sulf_kpneu1S ser SAPAR +SulfMod_kpneu1 Klebsiella pneumoniae
GKWHLGF..........................VPGATPKDRGFNHAFAFMGGGTSHFNDAIPLGTVEAFHTYYTR................................DGERVSLPDDFYSSEAYARQMNSWIKATPKEQPVFAWLAFTAPH
>Sulf_paer3XR 1HDH cys CSPTR Pseudomonas aeruginosa
GKWHLGL..........................KPEQTPHARGFERSFSLLPGAANHYGFEPPYDESTP..........................RILKGTPALYVEDERYLDTLPEGFYSSDAFGDKLLQYLKERDQSRPFFAYLPFSAPH
>Sulf_styph3 cys CMPAR +isolated Salmonella typhimurium
NYHNRYS........................................................SWDVVRGQEGDHWKASVGEPPIPEVLRVPQKQTGGGVSGLWRHDWANREYIQQEADFPQTKVFDAGCDFIHKNHAEDNWLLQVETFDPH
>Sulf_ecoliH cys CMPAR Escherichia coli O157:H7
GxNYHNR......................................................YSSWEIVRGQEGDHWHASVAQPPIPEVLRVPQKQTGGGVSGLWRHDWANREYIQQEADFPQTKVFDAGCAFIHKNHAEDNWLLQIETFDPH

>Sulf_mtube3 cys CVPSR Mycobacterium tuberculosis
GMQHETS..........................................................................................YPKRLGFDEFDVSNSYCEYVVAKAQDWLHNRVPALDGQRFLLTAGFFETHRPYPH
>Sulf_Bmeli cys CHPSR Brucella melitensis
GWMKALR........................................DLDYYTTTISSFGERHGCWHWYAGFNEVMNCGKGGMENADEIVPMAIDWIARNKSRKWFLHVNLWDPHTPYRVPEEWGDPFAGEPLPAWMTEEVLARSIAGYGPH
>Sulf_bfrag8S ser SCPSR Bacteroides fragilis
GKWHVTV..............................................................EGAFTQPNGSYPVERGFEKYYGCLSGGGSYYTPKPVFSGLQRITEFPKDYYYTTAITDSAVSFIRQHPVDEPMFMYLAHYAPH
>Sulf_mloti2 cys CSPAR Mesorhizobium loti
GYTDVAL....................................DPRLLTSGDPRLKTYEGVLPGFTVRQLLPEHQKQWLSWLKQQGVDASAGSPGIHRPVGDEDDDSVTEAPPIYSKDHTPAAFLAGEFIRWLGEQEQAAPWFAHLSFISPH
>Sulf_bbron4 cys CIPAR Bordetella bronchiseptica
GKLHFRD.........................................................YGGDHGFSEEIIPMHIVGGKGDLMGLVRSDLPVRKGAYKMAQMAGPGESQYTFYDREIVSRAQIWLREQAPRHADKPWVLFVSFVSPH
>Sulf_bcary cys CGPAR Burkholderia caryophylli
GYDPALI..............................GYTTTTPDPRTTSARDPRFTVLGDIMDGFRSVGAFEPNMEGYFGWVAQNGFELPENREDIWLPEGEHSVPGATDKPSRIPKEFSDSTFFTERALTYLKGRDGKPFFLHLGYYRPH
>Sulf_mtube2 cys Mycobacterium tuberculosis
GKWHISH........................................ADLEDPATGAPLATNDNEGVVDSAAVRRYLDADPLGPYGFSGWVGPEPHGAGLANSGFRRDPLVADRVVAWLTERY.....ARRRAGDTAAMRPFLLVASFVNPH
>Sulf_pmult1 cys CGPCR +SulfMod_pmult1 Pasteurella multocida
GKWHLAS........................................DGELEEEPTIDYTTSAIPPERRGGYKGFWRASDVLEFTSHGYDGYVFDENMN.............KCEFKGYRVDCITDFALEYLDQYQG.DKPFFMTISHIEPH
>Sulf_efaec cys CVPAR Enterococcus faecium
GKMHVYP...............................SRKRLGFDHVLLHDGYLHVDRKYDKSYGEQFEYSSDYLMFLKESLGSDADLIDDGLNCNS.........WEARPWMYPEKFHPTNWVVSEGINFLRRKDPTVPFFLKLSFEKPH
>Sulf_ecoli1 cys CTPAR Escherichia coli
GKWHLDG...................HDYFGTGECPPEWDADYWFDGA....NYLSELTEKEISLWRNGLNSVEDLQANHIDETFTWAHRISN.................................RAVDFLQQPARADEPFLMVVSYDEPH

Sulf_paer3XR  1HDH cys CSPTR Pseudomonas aeruginosa            549  2.8e-56   1
ARSB_cal                                                       239  2.0e-23   1
Sulf_styph1S  ser SAPAR +SulfMod_sty1 Salmonella typhimurium   192  1.9e-18   1
Sulf_kpneu1S  ser SAPAR +SulfMod_kpneu1 Klebsiella pneumo...   169  5.1e-16   1
ARSB_spo                                                       144  2.3e-13   1
Sulf_dhafn    cys CAVTR Desulfitobacterium hafniense           142  3.7e-13   1
Sulf_rmeta1   cys CAVTR Ralstonia metallidurans                142  3.7e-13   1
Sulf_bfrag9S  ser SSPSR Bacteroides fragilis                    78  1.3e-11   2
ARSB_cii                                                       124  3.0e-11   1
ARSB2_cii                                                      124  3.0e-11   1
ARSB_dm2                                                       121  6.3e-11   1
SulfY_mmu                                                      121  6.3e-11   1
SulfY_hsa                                                      119  1.0e-10   1
Sulf_sarom4   cys CSPTR Sphingomonas aromaticivorans           119  1.0e-10   1
sulfZ_fru                                                      118  1.3e-10   1
sulfY/Z_hpo                                                    128  2.0e-10   1
SulfY_fru                                                      116  2.1e-10   1
sulfY/Z_cii                                                    128  3.3e-10   1
SulfY_gga                                                      114  3.5e-10   1
SulfZYB_cii                                                    113  4.4e-10   1
sulfZ/Y_cii                                                    110  9.2e-10   1
ARSB_fca                                                       122  9.8e-10   1
ARSB_rno                                                       107  1.9e-09   1
Sulf_prevoS   ser STPAR +SulMod_pre Prevotella sp               73  1.9e-09   2
ARSB_mmu                                                       106  2.4e-09   1
ARSB_hsa                                                       104  4.0e-09   1
SulfY_ame                                                      102  6.5e-09   1
SulfY_odi                                                      100  1.1e-08   1
GALNS_bta                                                      100  1.1e-08   1
sulfZ/Y_hpo                                                    108  2.1e-08   1
Sulf_tfusc    cys CSSTR Thermobifida fusca                      97  2.2e-08   1
GALNS_hsa                                                       95  3.6e-08   1
Sulf_bbron5   cys CRAVR Bordetella bronchiseptica               95  3.6e-08   1
ARSB_dm4                                                        94  4.6e-08   1
SulfY_ava                                                       94  4.6e-08   1
SulfZ_hsa                                                       94  4.6e-08   1
Sulf_ccres    cys CAPSR Caulobacter crescentus                  91  9.5e-08   1
SulfZ_mmu                                                       90  1.2e-07   1
GALNS_rno                                                       90  1.2e-07   1
GALNS_ssc                                                       90  1.2e-07   1
Sulf_bbron3   cys CVPTR Bordetella bronchiseptica               90  1.2e-07   1
Sulf_sarom5   cys CTATR Sphingomonas aromaticivorans            90  1.2e-07   1
Sulf_bfrag8S  ser SCPSR Bacteroides fragilis                    89  1.5e-07   1
ARSB_dm3                                                        87  2.5e-07   1
GALNS_mmu                                                       86  3.2e-07   1
KIAA1001_cii                                                   100  5.0e-07   1
GALNS_fru                                                       84  5.2e-07   1
GALNS_gga                                                       83  6.7e-07   1
GALNS_dre                                                       83  6.7e-07   1
GALNS_mac                                                       82  8.5e-07   1
GALNS_xla                                                       82  8.5e-07   1
GALNS_cii                                                       96  1.2e-06   1
Sulf_bcary    cys CGPAR Burkholderia caryophylli                79  1.8e-06   1
Sulf_bfrag10S ser SAPSR Bacteroides fragilis                    78  2.3e-06   1
GALNS_omy                                                       77  2.9e-06   1
Sulf_mbark2   cys CSPSR Methanosarcina barkeri                  75  4.7e-06   1

STS_hsaXR                                                      315  1.0e-61   2
SGSH_dme                                                       315  3.6e-58   2
STS_rno                                                        195  2.5e-37   2
STS_mmu                                                        204  3.9e-35   2
STS_fru                                                        212  8.1e-35   2
ARSD_fru                                                       176  4.8e-31   2
ARSD_hsa                                                       164  1.3e-27   2
ARSE_bta                                                       156  1.2e-26   2
ARSG_hsa                                                       195  5.5e-26   2
ARSF_hsa                                                       139  1.6e-23   2
ARSE_her                                                       190  3.1e-18   1
ARSE_hpu                                                       169  5.1e-16   1
ARSE_hsa                                                       161  3.6e-15   1
STS_cii                                                        158  7.5e-15   1
STS2_cii                                                       156  1.2e-14   1
ARSE_spu                                                       150  5.3e-14   1
ARSE_cel                                                       150  5.3e-14   1
KIAA1001_hsa                                                   123  3.8e-11   1
KIAA1001_mmu                                                   116  2.1e-10   1
ARSA_fru                                                       124  7.5e-10   1
KIAA1001_cii                                                   124  1.4e-09   1
KIAA1001_ssc                                                   100  1.1e-08   1
Sulf_bfrag9S  ser SSPSR Bacteroides fragilis                    70  1.3e-08   2
KIAA1001_rno                                                    99  1.3e-08   1
GALNS_fru                                                       85  4.1e-07   1
ARSA_gga                                                        82  8.5e-07   1
GALNS_bta                                                       81  1.1e-06   1
GALNS_ssc                                                       81  1.1e-06   1
GALNS_mmu                                                       76  3.7e-06   1
GALNS_rno                                                       76  3.7e-06   1
GALNS_hsa                                                       75  4.7e-06   1
GALNS_dre                                                       74  6.0e-06   1
GALNS_xla                                                       71  1.2e-05   1
GALNS_omy                                                       71  1.2e-05   1

ARSB_hsa                                                       532  1.8e-54   1
ARSB_fca                                                       507  3.3e-51   1
ARSB_rno                                                       437  2.0e-44   1
ARSB_mmu                                                       433  5.4e-44   1
SulfY_fru                                                      281  7.0e-28   1
SulfZ_hsa                                                      281  7.0e-28   1
SulfZ_mmu                                                      278  1.4e-27   1
SulfY_hsa                                                      276  2.4e-27   1
SulfY_mmu                                                      274  3.8e-27   1
sulfZ_fru                                                      274  3.8e-27   1
sulfY/Z_hpo                                                    281  7.8e-27   1
SulfY_gga                                                      260  1.2e-25   1
ARSB2_cii                                                      231  1.4e-22   1
sulfZ/Y_cii                                                    225  6.0e-22   1
ARSB_dm2                                                       191  2.4e-18   1
SulfY_ptr                                                      181  2.7e-17   1
sulfY/Z_cii                                                    191  6.4e-17   1
ARSB_dm3                                                       171  3.2e-16   1
SulfZYB_cii                                                    160  4.6e-15   1
ARSB_hro                                                       158  7.5e-15   1
SulfY_ava                                                      158  7.5e-15   1
ARSB_dm1                                                        96  2.8e-14   2
ARSB_cii                                                       145  1.8e-13   1
SulfY_odi                                                      141  4.8e-13   1
ARSB_dm4                                                       135  2.1e-12   1
ARSE_her                                                       120  8.0e-11   1
ARSB_cal                                                       103  5.1e-09   1
Sulf_tfusc    cys CSSTR Thermobifida fusca                     103  5.1e-09   1
GALNS_dre                                                      102  6.5e-09   1
ARSG_hsa                                                       115  6.5e-09   1
Sulf_bfrag9S  ser SSPSR Bacteroides fragilis                    61  7.9e-09   2
Sulf_bfrag8S  ser SCPSR Bacteroides fragilis                   101  8.2e-09   1
KIAA1001_mmu                                                    77  1.0e-08   2
GALNS_omy                                                      100  1.1e-08   1
GALNS_bta                                                       98  1.7e-08   1
Sulf_prevoS   ser STPAR +SulMod_pre Prevotella sp               59  2.0e-08   2
Sulf_bfrag4S  ser SGPSR Bacteroides fragilis                    57  3.2e-08   2
Sulf_paer3XR  1HDH cys CSPTR Pseudomonas aeruginosa             95  3.6e-08   1
 
ARSB_hsa 1FSU
GKWHLG.MYRKECLPTRRGFDTYFGYLLGSEDYYSHERCTLIDALNVTRCAL........DFRDGEEVATGYKNMYSTNIFTKRAIALITNHPPEKPLFLYLALQSVHEP 
E...S...SSGGGTGGGTT.SEEEE.SSS...TTT.EEEEEEGGGTEEEEEE..........EETTEE..S.TTTTHHHHHHHHHHHHHHTTTTTS.EEEEEE..TTSSS
Sulf_paer3XR 
GKWHLG..LKPEQTPHARGFERSFSLLPGAANHYGFEPPYDESTPRILKGTPA......LYVEDERYLDTLPEGFYSSDAFGDKLLQYLKERDQSRPFFAYLPFSAPHWP 
E..S....SSGGGTGGGTT.SEEEEE.SS...TT......STTTTHHHHT..........EEETTEE.S...TTTTHHHHHHHHHHHHHHTTTTTS.EEEEEE..TTSSS
ArsA_hsa
GKWHLGVGPEGAFLPPHQGFHRFLGIPYSHDQGPCQNLTCFPPATPCDGGCDQGLVPIPLLANLSVEAQPPWLPGLEARYMAFAHDLMADAQRQDRPFFLYYASHHTHYP  
E...SB..GGGTTTGGGGT.SEEEEESS.TTSSB.TTSBSBTTTEE.SS.BS.SS....EEETTEESEES..HHHHTHHHHHHHHHHHHHHHHTT..EEEEEE..TTSSS
STS_hsa
GKWHLGMSCHSKTDFCHHPLHHGFNYFYGISLTNLRDCKPGEGSVF.............CFMMRNYEIIQQPMSYDNLTQRLTVEAAQFIQRNTETPFLLVLSYLHVH
                 TTGFKRLVFLPLQIVGVTLLTLAALNCLGLLHVPLGVFFSLLFLAALILTLFLGFLHYFRPLN
H  alpha helix
B  residue in isolated beta bridge
E  extended beta strand
G  3-10 helix
I  pi helix
T  hydrogen-bonded turn
S  bend


>SGSH_hsa
                  10        20        30        40        50        60
                   |         |         |         |         |         |
UNK_9160  GKKHVGPETVYPFDFAYTEENGSVLQVGRNITRIKLLVRKFLQTQDDRPFFLYVAFHDPH
DPM       ccctetccceccechhhhhhtcceeeeeeeeeeeeeeeehehhchcccceeeeehecccc
DSC       cccccccccccccceeecccccccchhhhhhhheehhhhhhhcccccccceeeeeccccc
GOR4      ccccccccccccccceecccccceeeeccchhhhhhhhhhhhccccccceeeeeeeceec
HNNC      cccccccccccccceeecccccceeeechhhhhhhhhhhhhhhcccccceeeeeeecccc
PHD       cccecccceeccccceeccccceeeeeccceeehhhhhhhhhhcccccceeeeeeeeccc
Predator  cccccccccccccceeeeecccceeeeccchhhhhhhhhhhhhhccccccceeeeecccc
SIMPA96   cccccccccccccceecccccceeeeecchhhhhhhhhhhhhhcccccceeeeeeecccc
SOPM      ttttcctttecceeeeeccttcceeeeccchhhhhhhhhhhhhcttccceeeeeeetttt
Sec.Cons. cccccccccccccceeecccccceeeeccchhhhhhhhhhhhhcccccceeeeeeecccc
                  10        20        30        40        50        60
                   |         |         |         |         |         |
UNK_37250 GKKHVAPEAVYPFDFAETEENNSILQVGRNITRMKELAKQFFSMQLKNESFLLYIGFHDPH
DPM       ccchehchhecchhhhhhhhtcceeeeeeeeehhhhhhhhhhhhhhhthheeeeeeecccc
DSC       ccccccccccccccccchhccchhhhhhhhhhhhhhhhhhhhhhhccccccceeecccccc
GOR4      cccccccccccccccccccccchhhhhcchhhhhhhhhhhhhhhhhhcceeeeeeecccee
HNNC      ccccccccccccccccccccccceehhchhhhhhhhhhhhhhhhhcccceeeeeeeccccc
PHD       cccccccccccccccccccccchhhhhhhhhhhhhhhhhhhhhhcccccceeeeeeeeccc
Predator  ccccccccccccccccccccccceeeeccchhhhhhhhhhhhhhhhcccccceeeeccccc
SIMPA96   ccccccccccccccccccccccchhhhchhhhhhhhhhhhhhhhhhcccceeeeecccccc
SOPM      tccccctteecceeeccchttcheeeeccchhhhhhhhhhhhhhhctttceeeeeecccct
Sec.Cons. cccccccccccccccccccccc???hhchhhhhhhhhhhhhhhhh?cccceeeeeeccccc
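
The Sec.Cons. rows above summarize the individual predictors column by column. A minimal sketch of one way such a consensus can be formed, assuming a simple plurality vote per position with "?" marking ties (the function name `consensus` and the tie rule are illustrative assumptions, not necessarily the rule the prediction server applies):

```python
from collections import Counter


def consensus(predictions, tie_char="?"):
    """Plurality vote per column over equal-length prediction strings.

    Assumption: a column whose two most frequent states are tied is
    reported as tie_char; otherwise the most frequent state wins.
    """
    lengths = {len(p) for p in predictions}
    if len(lengths) != 1:
        raise ValueError("all prediction strings must have the same length")
    out = []
    for column in zip(*predictions):
        ranked = Counter(column).most_common()
        if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
            out.append(tie_char)  # no single winner in this column
        else:
            out.append(ranked[0][0])
    return "".join(out)


# Toy input, not taken from the tables above.
print(consensus(["cche", "cchh", "cehh"]))
```

Predictors that emit extra states (for example the "t" used by SOPM) simply take part in the vote as their own state under this scheme.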

>SulfX_




MAAVAAATRWHLLLVLSAAGLGVTGAPQPPNILLLLMDDMGWGD
                     LGVYGEPSRETPNLDRMAAEGMLFPSFYAANPLCSPSRAALLTGRLPIRTGFYTTNGH
                     ARNAYTPQEIVGGIPDPEHLLPELLKGAGYASKIVGKWHLGHRPQFHPLKHGFDEWFG
                     SPNCHFGPYDNRARPNIPVYRDWEMVGRFYEEFPINLKTGESNLTQIYLQEALDFIKR
                     QQATHHPFFLYWAIDATHAPVYASRAFLGTSQRGRYGDAVREIDDSVGRIVGLLRDLK
                     IAGNTFVFFTSDNGAALVSAPKQGGSNGPFLCGKQTTFEGGMREPAIAWWPGHIPAGQ
                     VSHQLGSVMDLFTTSLSLAGLEPPSDRAIDGLDLLPAMLQGRLTERPIFYYRGNTLMA
                     ATLGQYKAHFWTWTNSWEEFRQGVDFCPGQNVSGVTTHSQEEHTKLPLIFHLGRDPGE
                     RFPLSFASTEYLDALRKITLVVQQHQESLVPGQPQLNVCNPAVMNWAPPGCEKLGKCL
                     TPPESVPEKCSWPH
ARSD_gga
FLSGMASSNRYRALQWNAGSGGLPANETTFARLLQQQGYTTGLIGK...KGGKAMGGWEGGIRVPGIFRWPGVLPAGKVISEPTSLMDIYPTVVHLAGGVVPQDR

STS_gga
RTPNIDRLAREGVKLTQHIAAAPLCTPSRAAFLTGRYPIRSGWA



GALNS_spu sea urchin trace exon
RAALLTGRLPIRNGFYTTNGHAHNAWSQQIVKGGIPDSEILLPKLLKLSGYKSKIVGKW

>GNS_chi	
GKYLNEYGAPDAGGLGHVPLGWSYWYALEKNSKYYNYTLSINGKARKHGENYSVDYLTDVLANVSLDFLDYKSNSEPFFMMISTPAPH

>GNS_dre	
GKYLNQYGSKDAGGVAHVPPGWDQWHALVGNSKYYNYTLSVNGKEEKHGDSYEKDYLTDLVLNRSLHFLEERSPSHPFFMMLCPPAPH

>GNS__gga	
GKYLNEYGAPDAGGLEHVPLGWSYWYALEKNSKYYNYTLSINGKARKHGENYSVDYLTDVLANVSLDFLDYKSN

MSRSALAALARGLALAALLVLSPAQAARQRPNVVLILTDDQDVFLGGMTPMKKTNALIAQMGVTFSNAYVPSALCCPSRASILTGKYPHNHHVVNNTLEGNCSSKLWQKIQEPNTFPALLKSMCGYQTFFAGKYLNEYGAEDAGGVSHVPPGWSFWYALEKNSKYYNYTLSVNGKARRHGENYSVDYLTDVLANMSLGLLGVQINFWNLFFIDGSQTPAP 

>GNS_ssc	
GKYLNEYGAPDAGGLAHVPLGWSYWYALEKNSKYYNYTLSINGKARKHGENYFFMMISTPAPH

>GNS_cfa	
VPLGWSYWYALEKNSKYYNYTLSINGKARKHGENYSVDYLTDVLANISLGFLDYKSNSEP

>GNS_xla	
GKYLNQYGSEEAGGINHVPPGWSYWFALEKNSKYYNYTLSENGRPKTHGQNYSQDYLTDVLSNVSLDFLNYKSNHEPFFMMIATPAPH  

GKYLNQYGGK...SVGGPQHVPVGWNQWFGLVGNSKYYNYTISDNGVPVQHGANYHEDYLTDLLANRSV.......................................DFIHNHKMRYTQPFFMMISTPAPH
GKYLNQYGGK...SVGGPQHIPVGWDQWFGLVGNSKYYNYTISDNGVPVQHGANYHEDYLTDL

GKYLNQYGSEEAGGINHVPPGWSYWFALEKNSKYYNYTLSENGRPKTHG

>KIAA__gga
GSRSRSSILTGKYVHNHNTYTNNENCSSPSWQAQHEIRTFAVYLNNTGYRTAFFGKYLNE
YNGSYVPPGWKEWVGLLKNSRFYNYTLCRNGVKEKHGFDYSRDYLTDLITNDSITFFRIS
KKMYPHRPVLMVISHAAPHGPEDSAPQYSHLFPNASQHITPSYNYAPNPDKHWIMRYTGP
MKPIHMEFTNMLQRKRLQTLMSVDDSMEMIYNTLVETGELDNTYIYTQQIMVIILVSSGW

>KIAA1077__gga
QGHSSTLKSLRFRGRVQQERKNIRPNIILVLTDDQDVELGSLQVMNKTRRIMENGGASFINAFVTTPMCCPSRSSMLTGKYVHNHNIYTNNENCSSPSWQATHEPRTFAVYLNNTGYRTAFFGKYLNEYNGSYIPPGWREWVGLVKNSRFYNYTISRNGNKEKHGFDYAKDYFTDLITNESINYFRMSKRIYPHRPIMMVISHAAPHGPEDSAPQFSELYPQRFAAYHS


>KIAA1247__dre	
MAVGWRPATLLLVFILTFICLSDGSTYLSGQRQRSRLQRDRRNVRPNMILILTDDQDIELGSMQAMNKTKRIMMQGGTHFSNAFATTPMCCPSRSTILTGKYVHNHHTYTNNENCSSPSWQAHHEPHTFAVHLNNSGYRTAFFGKYLNEYNGSYVPPGWREWVALVKNSRFYNYTLCRNGIXGXHGTQYPKDYLTXRITNDSINFLRMSKRMYPHRPVMMGLSHAAPHGP

>KIAA1077__str	
NSPGCCPSRSSMLTGKYVHNHNIYTNNENCSSPSWQAIHEPRTFAVYLNNTGYRTVFFGK
YLNEYNGSYIPPGWREWLGLVKNSRFYNYTMCRNGFKEKHGFEYEKDYFTDLITNDSISY
FKLSKKLYPHRPIMMVISHA

>KIAA1077__str Halocynthia roretzi
MLSDFRILFLIMKRSLSAFDVTFLLLLTLACSVWSKKPNIVIMITDDQDVLLNSMEVMHHTHNELIQQGVEFANAFTTTPMCCPSRSTMLTGLYTHNHHVYTNNDNCSSTLWRKQFESRSYATYLNNSGYNTGYFGKYLNEYNGSYIPAGWKYWMGLIKNSKYYNYAVNHNSQKELHGDDYAKDYLTDLVTNRSMEFFRDSKTERPEDPVLVA

>KIAA1247__hgl	  Heterodera glycines
MIRPNIVLLITDDQDIELGSMAFMPKTLRLLQQRGTEFRNAFVSTPICCPSRSTILTGLYAHNHKVMTNNGNCAGEEWRSDFEKDTFAVYLERTGFLTGFFGKFLNNYDGSWVPPGWTKWAALVRNSRYYNYSLNKNGRNEWHGNRYENDYLTNLVANLSLQFIDESVLLNPHGQPFLVVLSFPAPHGPEDPAPQFGDLFE

>gi|27764275|emb|AL627362.2|CNS07TIX   DNA centromeric region sequence from BAC DP15B03, DP38F06 of
            chromosome 5 of Podospora anserina
          Length = 156244

 Score =  145 bits (365), Expect = 2e-35
 Identities = 67/88 (76%), Positives = 79/88 (89%)
 Frame = +1

>Sulf_pan	
GKLFNAHTVENYNSPYPAGWNGSDFLLDPYTYNYLNSSFQRNQDPPKSYEGFHSVDVLAEKSLGFVDEAVRADGPFFLGIAPVAPH

>Sulf_cgl	
GKLFNAQTVDNYDSPHAAGWTGSDFLDPYTYSYLNATFQRNKDAPVSHEGEYSTGVLAGKALGFLDDVVAEDKPFFLGIAPIAPH

    >gi|6822088|emb|AJ271152.1|CGL271152   Colletotrichum gloeosporioides f. sp. malvae partial ars gene for
           arylsulfatase, exons 1-3
          Length = 1679

 Score =  135 bits (341), Expect = 1e-32
 Identities = 65/88 (73%), Positives = 73/88 (82%)
 Frame = +2

Query: 1   GKLFNAHTVDNYDSPYIAGWNGSDFLLDPYTYSYLNATFQRNRDPPISYEGQYSVDVLAE 60
           GKLFNA TVDNYDSP+ AGW GSDFL DPYTYSYLNATFQRN+D P+S+EG+YS  VLA 
Sbjct: 221 GKLFNAQTVDNYDSPHAAGWTGSDFL-DPYTYSYLNATFQRNKDAPVSHEGEYSTGVLAG 397

Query: 61  KAYGFLDEAAKNVHNRPFFLGIAPIAPH 88
           KA GFLD+      ++PFFLGIAPIAPH
Sbjct: 398 KALGFLDDVV--AEDKPFFLGIAPIAPH 475
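
The Identities figure reported for this hit can be rechecked directly from the Query and Sbjct rows. The sketch below counts identical, non-gap columns over the aligned length; the helper name `alignment_identity` is hypothetical, and truncating the percentage (rather than rounding it) is an assumption chosen because it agrees with the figures shown here.

```python
def alignment_identity(query, subject, gap="-"):
    """Return (identical columns, aligned length, truncated percent)."""
    if len(query) != len(subject):
        raise ValueError("aligned strings must have the same length")
    # A column counts as identical only if both residues match and are not gaps.
    identical = sum(1 for q, s in zip(query, subject) if q == s and q != gap)
    return identical, len(query), identical * 100 // len(query)


# The two alignment blocks above, concatenated in order.
query = (
    "GKLFNAHTVDNYDSPYIAGWNGSDFLLDPYTYSYLNATFQRNRDPPISYEGQYSVDVLAE"
    "KAYGFLDEAAKNVHNRPFFLGIAPIAPH"
)
subject = (
    "GKLFNAQTVDNYDSPHAAGWTGSDFL-DPYTYSYLNATFQRNKDAPVSHEGEYSTGVLAG"
    "KALGFLDDVV--AEDKPFFLGIAPIAPH"
)

identical, length, percent = alignment_identity(query, subject)
print(f"Identities = {identical}/{length} ({percent}%)")  # Identities = 65/88 (73%)
```

Gap columns count toward the aligned length but never toward the identities, which is why the denominator is 88 even though the subject row contains three gap characters.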

>Sulf_vca 649aa Volvox carteri Arylsulfatase sulphohydrolase 69% Chlamy Q10723 ARS_VOLCA ::123::45
RPNFVVIFTDDQDGIQNSTHPRYQPKLHEHIRYPGIELKNYFVTTPVCCPSRTNLWRGQFSHNTNFTDVLGPHGGYAKWKSLGIDKSYLPVWLQNLGYNTYYVGKFLVDYSVSNYQNVPAGWTDIDALVTPYTFDYNNPGFSRNGATPNIYPGFYSTDVIADKAVAQIKTAVAAGKPFYAQISPIAPHTSTQIYFDPVANATKTFFYPPIPAPRHWELFSDATLPEGTSHKNLYEADVSDKPAWIRALPLAQQNNRTYLEEVYRLRLRSLASVDELIDRVVATLQEAGVLDNTYLIYSADNGYHVGTHRFGAGKVTAYDEDLRVPFLIRGPGIRASHSDKPANSKVGLHVDFAPTILTLAGAGDQVGDKALDGTPLGLYANDDG ::12
NLLADYPRPANHRNQFQGEFWGGWSDEVLHHIPRYTNNSWKAVRVYDEDNQQAWKLIVSCTNERELYDLKTDPGELCNIYNKTRAAVRTRLEALLAVLVVCKGESCTNPWKILHPEGSVNSWNQSLDRKYDKYYANVAPFQYRTCLPYQDHNNEVSAFRSTVAAAAAAAAAAAAQQPGRRRMYTWTSAGRQLSATASAIATSPQPRSEPFVAEVERHSVPVPAEVLQSDVAKWFDNPLALA ::45

>Sulf_cre 647aa Chlamydomonas reinhardtii P14217 ARS_CHLRE Arylsulfatase GNS 30% ::123::45
KPNFVVIFTDDQDAIQNSTHPHYMPSLHKYIRYPGVELSQYFVTTPVCCPSRTNLXRGQFAHNTNFTSVLPPYGGWAKWKGLGIDQSYLPLWLKDQGYNTYYVGKFLVDYSVSNYQQVPRAGTISMPXVTPYTFDYNTRLQRNGATPNIYPGEYSTDVIRDKGVAQIKSAVAAGKPFYAQISPIAPHTSTQISTNPATGVTRSYFFPPIPAPPHWQLFSDANLPGGSXNKNLYEVDVSDKPAWIRALPLAQQNNRTYQEEIYRLRLRSLGPDELIEQVVKTLDEAGVLDNTYIIYSADNGYHVGAHRFGAGKTTGYEEDLRVPFLIRGPGIKASKSDKPQNSKVGLHVDFAPTILSLAGASHLLGDKGLDGTPLGLYANDDG ::12
TLPSDYPRPEQHRQQFQGEFWGGWSDELLQNLRSQPNNTWKVVRTYDESSKQGWKLIAQCTNERELYDLRKDPGELYNIYDKAKPAVRSRLEGLLAVLAVCKGESCSNPWKILHPDGTVKNFTQALNSKYDRIYNAIRPFTYKRCLPYLDWDNEDSQFKTQIRGANPAAGVGHHRLLTAASERAIATRRRAQAAVSAELADGPAVFQAKVEEKSVPVPQDILKADVEKWFAFNNAEYYLA ::45

>ARSA_afu 59aa Aspergillus fumigatus fragment 72% Sulf_psa3 55% ARSA_hsa
RPNFLVIVADDLGFSDCGCFGSEISTPNIDALAYSRGGLRFTSFHVAAACAPTRSMLMTGTDHHLTGLGQLPEYIA-LSRAHQGAPGHEGYLNERVVALPELLRDGGYYTLMSGKWHLGLKREYSPHARGFAKSYAMLSGAANHY

>ARSA_cal 59aa Candida albicans fragment 76% ARSB_spo 52% ARSA_mmu
QPNFLIIVADDLGFTDLSPFGGEINTPNLNKLATGANGVRLTDFHTASACSPTRSMLLSGTDNHIAGLGQMAEFAQRHPEKFNNQPGYEGYLNDKVVALPEILQDNGYHTFISGKWHLGLKKPYWPNKRGFNKSFTLLPGAGNHYKYITRDSQGNQIPFLPAIYVEDDKELLQPEIELPDDFYSTNYFTDKAIEFIKETPQGKPFFGMITYTAPHWPYQAPQDKIAKYNGVYDNGPEELRQKRLQSAKNLGLIDTNIIPHPIKTIRKSWDELTLLEKLKEIKIMQTYAAMVEILDENIGRLIDHLNSIDELNNTFILFMSDNGAEGMLMEALPLTNQRINKFIDEYYDNSLSNIGNKNSFTYYGDQWAQAATAPHAMYKMWSTEGAIVCPLIIHYPNLFSSSAVSGGGSGDGDGDGDGGKILKEFTTVMDILPTILELANVSHPGETYKGRQVVKPRGKSWVNYLINKTDQVHDENTVTGWELFGQQAIRKGSFKAIYIPKPFGPEKWQLFNIIEDPGEINDLSESSSEYQTILNELLDHWAVYAAETGLIELGSD 


=-=-=-= new uro kiaaa long =-=-=-

Boltenia villosa
            Eukaryota; Metazoa; Chordata; Urochordata; Ascidiacea;
            Stolidobranchia; Pyuridae; Boltenia.
REFERENCE   1  (bases 1 to 507)
  AUTHORS   Davidson,B. and Swalla,B.J.
  TITLE     A molecular analysis of ascidian metamorphosis reveals activation
            of an innate immune response
  JOURNAL   Development 129 (20), 4739-4751 (2002)
  MEDLINE   22248966
transcripts differentially expressed during metamorphosis in the ascidian Boltenia villosa by
suppressive PCR subtractions of staged larval and juvenile cDNAs. We employed a series of three
subtractions to dissect gene expression during metamorphosis. We have isolated 132 different
protein coding sequences, and 65 of these transcripts show significant matches to GenBank
proteins. Some of these genes have putative functions relevant to key metamorphic events
including the differentiation of smooth muscle, blood cells, heart tissue and adult nervous
system from larval rudiments.

 VPYPVTQTQNPIEWEFLYPLEITKETEETLHSEQLREYQERIET
                     KRMIKAMRQAKLARIQKREKKGVLRLNRCGSNGVTCFKLSNATWKTEPLWNGGDQCYC
                     TNXNNNTYWCVRIINETTNVLYCEFITYFVEYYNLNDDPHQLVNYRDKISDEEHNALY
                     XEMAXLPSC


=-=-=-=new ids -=-=-=-=-=-


>IDS_dme 512aa AE003478 CG12014 gp Drosophila melanogaster 46% clearly IDS NLmPL is NLvPL genome ::123::45
RPNVVMVIFDDLRPVIGAYGDTLASTPYLDNFARGSHIFTRVYSQQSLCAPSRNSLLTGRRPDTLHLYDFYSYWRTFTGNFTTLPQYFKEHGYYTYSCGKVFHPGLSSNNTDDYPLSWSAPAFRPRTEQFMNSPVCPDKEGILRKNLICPVELQTQPYKTLPDIESVAEALRFVGSRSRHSQEPFFLAMGFHKPHINFRFPRQFLSRFNLSQFYNYTEDSLKPPDMPAVAWNPYTDVRARDDFKHSNISFPYGPISPLQAAQIRQSYYASVSYVDDLFGKLIGGLDLDETVVVALGDHGWSLGEHAEWAKYSNFEVALRVPLIIRSPQFPVAQTKYYHGITELLDVFPTLVDLAGLPKLDKCQSSQELTCG ::12
EGKSLYHQLMGLGRADEHVALSQYPRPGMLPTKHPNSDKPKLRNIKIMGYSLRTDIYRYTMWVRFHAQNFSRDWHDVYGEELYDHRLDSGEELNLmPLPQFDDVRQRLRRRLMEMVGS ::45


>IDS_gga 167aa chicken BI390037, to human 79%
MNVLFIVVDDLRPVLGCYGDNLVKSPNIDQLASQSIVFSNAYAQQAVCAPSRVSFLTGRRPDTTRLYDFYSYWRVHSGNYSTMPQYFKENGYVTMSVGKVFHPGISSNYSDDYPYSWSIPPFHPSTEKYENDKTCRGKDGRLYANLVCPIDVTEMPGGTLPDIETTEEAIRLLNVMKTKKQKFFLAVGYHKPHIPLRYPQEFLKLYPLENITLAPDPWVPEKLPPVAYNPWVDIRQRDDVKALNVTFPYGPLPDDFQRLIRQSYYAAVSYLDMQVGLLLNALDYVGLSNSTIVVFTADHGWSLGEHGE

>IDS_str  Silurana tropicalis
MNLFGYLRFLMCATTVFAVWQQPFLPKHTATGGKNVLIIIADDLRTSLGCYGDSAVKSPNIDHLASQSIIFTNAYAQQAVCAPSRVSFLTGRRPDTTRLFDFNSYWRTHAGNYTTLPQYFKEHGYVTMSVGKIFHPGISSNHSDDYPYSWSVYPYHPSAEKYENSQTCKGKDGKLHANLVCPVDVSEVPEGTLPDIQSTEEAIRLLKTVKQQNASFFLAVGYHKPHIPFRFPKEFLKLYPIKNISLAPDP

>IDS_rno Rattus norvegicus
FSLLLGFFCIALVSAAQGNSATDALNILLIIVDDLRPSLGCYGDKLVRSPNIDQLASHSIVFENAFAQQAVCAPSRVSFLTGRRPDTTRLYDFNSYWRVHSGNFSTIPQYFKENGYVTMSVGKVFHPGISSNHSDDYPYSWSFPPYHPSSEKYENTKTCKGQDGKLHTNLLCPVDVADVPEGTLPDKQSTEEAIRLLEKMKTSVSPFFLAVGFHKPHIPFRYPKEF

>IDS_omy	Oncorhynchus mykiss 
MVLRPDSVYVSHDVVAKTVRRNVLFIMADDLRPTLGCYGDPIVKSPNIDQLASKSNVFLNAYAQQAVCGPSRTSLLTSRRPDTTRLYDFNSYWRVHAGNYTTLPQYFKSKGYTTMSVGKVFHPGIASNHSDDYPYSWSVPPYHPPSFKYENMKVCKGSDGKLHANLLCSVNVSETPLGTLPDMESTEEAIRLLKSTRNSGKNFFLAVGFHKPH

>IDS_dre
MNVMFVFTCWWFVFIFHLLGRDVFAAKSKNFNVLYLIADDLRPTLGCYSDPVVKSPNIDQLASLSVVFHNAYAQQAVCGPSRVSFLTSRRPDTTKLYDFNSYWRVHAGNYTTLPQYFKSNGYTTLSVGKVFHPGIASNHSDDYPYSWSVPPYHPPSFEYEKRKVCKDKDGTLHSNLLCPVNVSEMPLGTLPDMENTEEAIRLLRSMKGSQKPfFLSVGFYKPH

>IDS_aga	Anopheles gambiae  
ATDQPNVLLIILDDFRPVINYGYGDGNAITVNIDRLVQQGFFFQNAFAQQALCAPSRNSMLTGRRPDTVRLYDFYSYWRHTSGNYTTLPQYFKQHGYRTHSVGKVFHPGASSNFTDDFPLSWSEPAFHPLTDEYSNAAVCIDPADGRLKRNLLCPVRLETQPLHTLPDIESTEEAKRFLSTVGLSQPYFLAVGYRKPHIPFRIPAKYLGLHPVAKFATLDLDYPPYGLPTVAWSSY

>IDS_bmo Bombyx mori
GKVFHPGKSSNFTDDYPYSWSEYPYHPPTEMYKDAKVCRNKKTKKLERNLICPVSVKRQPGQSLPDLQSLDYAIDFLN 



=-=-=-=-=new arsa ==--=-=-=-=-

>ARSA_gga
MAVWCGFPPWAVLLLWALRGAAGGPPSFVLLLADDLGFGDLGSYGHPSSATPNLD
RMAARGLRFTDFYSSSAVCSPSRAALLTGRFQMRSGIYPGVFYPGSRGGLPLSEVTIAEV
LKAKGYATAIVGKWHLGLGARGSFLPIHQGFDHFLGVPYSHDQGPCQNLTCFPPDIKCFG
TCDQGLVPVPLFWNQSIVQQPVSFLIWCRLQQICTGLHLPTAPGEALLXYYASHHTH

>SulfX_hsa 544 aa 8 exons
0 MLLLWVSVVAALALAVLAPGAGEQRRRAAKAPNVVLVVSDSF 0
0 DGRLTFHPGSQVVKLPFINFMKTRGTSFLNAYTNSPICCPSRA 1
2 AMWSGLFTHLTESWNNFKGLDPNYTTWMDVMERHGYRTQKFGKLDYTSGHHSIS 2
1 NRVEAWTRDVAFLLRQEGRPMVNLIRNRTKVRVMERDWQNTDKAVNWLRKEAINYTEPFVIYLGLNLPHPYPSPSSGENFGSSTFHTSLYWLEK 0
0 VSHDAIKIPKWSPLSEMHPVDYYSSYTKNCTGRFTKKEIKNIRAFYYAMCAETDAML1
2 GEIILALHQLDLLQKTIVIYSSDHGELAMEHRQFYKMSMYEASAHVPLLMMGPGIKAGLQVSNVVSLVDIYPTML 1
2 DIAGIPLPQNLSGYSLLPLSSETFKNEHKVKNLHPPWILSEFHGCNVNASTYMLRTNHWKYIAYSDGASILPQLF 1
2 DLSSDPDELTNVAVKFPEITYSLDQKLHSIINYPKVSASVHQYNKEQFIKWKQSIGQNYSNVIANLRWHQDWQKEPRKYENAIDQWLKTHMNPRAV*

>SulfX_mmu 553 aa 8 exons
0 MLLLLVSVVAALALAAPAPRTQKKRMQVNQAPNVVLVASDSF 0
0 DGRLTFQPGSQVVKLPFINFMRAHGTTFLNAYTNSPICCPSRA 1
2 AMWSGLFTHLTESWNNFKGLDPNYTTWMDIMEKHGYQTQKFGKVDYTSGHHSIS 2
1 NRVEAWTRDVAFLLRQEGRPIINLIPDKNRRRVMTKDWQNTDKAIEWLRQVNYTKPFVLYLGLNLPHPYPSPSSGENFGSSTFHTSLYWLEK 0
0 VAYDAIKIPKWLTLSQMHPVDFYSSYTKNCTGKFTENEIKNIRAFYYAMCAETDAML 1
2 GEIILALHKLDLLQKTIVIYTSDHGEMAMEHRQFYKMSMYEASVHVPLLMMGPGIKANLQVPSVVSLVDIYPTML 1
2 DIAGIALPPNLSGYSLLTLLSNASANEQAFKFHRPPWILSEFHGCNANASTYMLRTGQWKYIAYADGASVQPQLF 1
2 DLSLDPDELTNIATEFPEITYSLDQKLRSIVNYPKVSASVHQYNKEQFIMWKQSVGQNYSNVIAHLRWHQDWQRDPRKYENAIQHWLTAHSSPLASSPTQSTSGSQPTLPQSTSG* 0

>SulfX_rno
atgctgctgctgttggtttcagtgatcgtggcgttggcgctcgtggcaccggctcccgaaacacaggagaaaaggctgcaagtggcccaggcgcccaacgtggtgctggtcgccagtgactccttcgatggaagactaacattccaaccaggaagtcaggtagtaaaacttcccttcattaacttcatgagagcacgtggcaccaccttcctaaatgcctacactaactctcccatctgctgtccatcacgtgcagcaatgtggagtggccttttcactcacttaacagaatcttggaataattttaagggtctggatccaaattacacaacatggatggatgtcatggagaagcatggctatcagacacagaaatttggaaaactggactattcttcagggcatcattccatt 

>SulfX_rno all but one exon can be found using ESTs, HGRP, trace, assembly, and nrn.93%
MLLLLVSVIVALALVAPAPETQEKRLQVAQAPNVVLVASDSF
DGRLTFQPGSQVVKLPFINFMRARGTTFLNAYTNSPICCPSR
AAMWSGLFTHLTESWNNFKGLDPNYTTWMDVMEKHGYQTQKFGKLDYSSGHHSIS
NRVEAWTRDVAFLLRQEGRPIINLIPDKNRRRVMDKDWQNTDKAIAWLRQVNSTKPFVLYLGLNLPHPYPSPSSGENFGSSTFHTSLYWLEK 
vaydaikipkwltlsqmhpvdfyssytknctgkfteneiknirafyyamcaetdaml
GEIILALHKLNLLQKTIVIYTSDHGEMAMEHRQFYKMSMYEASAHVPILMMGPGIKANLQVPSLVSLVDIYPTML
DIAGIPLPLNLSGYSLLPLSSNTSANDQAFRVHHPPWILSEFHGCNANASTYMLRTGQWKYIAYSDGTLVQPQLF
nLSLDPDELTNIATEFPEITYSLDQQLRSVINYPKVSASIHRYNKEQFIMWKQSVAQNYSNYIAHLRWHQDWQKDPRKYENAIQRWLAIHSSP


>SulfX_bta 420aa cow 88% to human  positions 3-187 242 note DS not DD
MLLLWVSVVAASALAAPAPGADGQRRGAIQAWPD
APNVLLVVSDSFDGRLTFYPGSQVVKLPFINFMKAHGTSFLNAYTNSPICCPSRAAMWSGLFTHLTESWNNFKGLDPNYTTWMDVMEKHGYRTQKFGKLDYTSGHHSISNRVEAWTRDVAFLLRQEGRPMVNLAPKKTKVRVMQVDWKNTDRAVNWLRKEASNSTQPFVLYLGLNLPHPYPSPSSGENFGSSTFHTSRYWLKKVSYDAIKIPKWSPLSEMHPVDYYSSYTKTCPGKFPEKEIKNIRAFYYAMCAEXDAMLGEIILALRQLGLLQKXIVIYTSDHGELAMEHRQFYKMSMYEASSHVPLLIMGPGIQANLQVSSVVSLVDIYPTMLDIAGIPLPQNLSGYSLLPSSSEMFKNEQKFKNLHPPWILSEFHGCNVNASTYMLRTNQWKYIAYSDGASVLPQLFDLSSDPDELTNIAAKFPEVTSSLDQKL   

>SulfX_ssc 219aa pig 89% similarity to human positions 117-263 455-526
GKLDYTSGHHSISNRVEAWTRDVAFLLRQEGRPMVNLISKKTKVRVMEGDWKNTDKAVKWLRKEAMNYTQPFVLYLGLNLPHPYPSPSSGENFGSSTFQTSLYWLKKVSYDAIKIPKWSPLSEMHPVDYYSSYTKNCTGKFTKKKKK QKLRSIINYPKVSASVHQYNKEQFIKWKQSIGQNYSNVIANLRWHQDWLKEPRKYESAIDQWLKTYSDPKKI 

>SulfX_gga
PSRAAMWSGLFTHLTESWNNFKGLDPDYVTWMDLMQKHGYYTQKYGKLDYTSGHHSVSNRVEAWTRDVEFLLRQEGRPKVNLTGDRRHVRVMKTDWQVTDKAVTWIKKEAVNLTQPFALYLGLNLPHPYPSPYAGENFGSSTFLTSPYWLEKVKYEAIKIPTWTALSEMHPVDYYSSYTKNCTGEFTKQEVRRIRAFYYAMCAETDAML

>SulfX_omy
PSRAAMWSGRFVHLTESWNNYKCLDPNATTWMDMLQQNGYNTLSVGKLDYTSGSHSVSNRVEAWTRDVPFLLRQEGRPVTDLVGDASTTRVMTKDWRTTDIATQWIRHKAAALSQPFALYLGLNLPHPYVTDSLGPNAGGSTFRTSPYWLEKVMPEFISIPKW

>SulfX_xen 85aa frog fragment 95% to human KIAA1247_hsa
MEHWRILLLTLLMALVLPAIEGSVLSKQRMKGRFQRDRRNIRPNIILVLTDDQDVELGSMQVMNKTRRIMEQGGTHFINAFVTTPMCCPSRSSILTGKYVHNHNTYTNNENCSSPSWQAQHETHTFFVYLNNTGYRTAFFGKYLNEYNGTYVPPGW
...GSHSLSNRVEAWTRDVPFLLRQEGRPCANLTGNKTQTRVMALDWKNADTATAWIQKAAQNHSQPFFLYLGLNLPHPYPSETMGENFGSSTFLTSPYWLQKVPYKNVTIPKWKPLQSMHPVDYYSSYTKNCT

>KIAA1077_xla kiaa type 
HNHNIYTNNENCSSPSWQAIHEPRTFAVYLNNTGYRTAFFGKYLNEYNGSYIPPGWREWLGLVKNSRFYNYTMCRNGFKEKHGFEYEKDYFTDLITNDSISYFKSSKKMYPHRPIMMVISHAAPHGPEDSAPQFSEFFPNASQHITPSYNYAPNMDKHWIMQYTGAMLPIHMEFTNVLHRKRLQTLLSVDDSMEKLYNMLVDTGELENTYIIYTSDHGYHIGQFGLVKGKSMPYDFDIRVPFFVRGPNVEPGSVVPQIVLNLDLAPTILDIAGAGRTP


>SulfX_ola
SRAAMWSGQFVHLTQSWNNYKCLEANATTWMDLLEENGYLTKMMGKLDFTSGSHSVSNRVEAWTRDVPFLLTQEGRPVSQLVGNTSTIKVMKKDWQNTDQASQWIRHRAAFSNQPFALYLGLNLPHPY

>SulfX_fru 8 exons 504 aa Scaffold_2094:15768-18243 SINFRUP00000086837 (62% Sulf_hsa)
0 MSVKLSALILLFLAFHQVLARNRTRPNFLVVMSDAF 0
0 DGRLTFDPGSKVVKLPFINYLRELGVTFINAYTNSPICCPSRA 1
2 AMWSGQFVHLTQSWNNYKCLDANATTWMDLLEVNGYLTKMMGKLDYTSGSHSvs 0
1 NRVEAWTRDVQFLLRQEGRPVTQLVGNMSTVRIMGKDWENIDKATQWIQQRAESSQQPFALYLGLNLPHPYKTESLGPTAGGSTFRTSPHWLEK 0
0 VSSEHVTVPKWLPGAAMHPVDFYSTFTKNCSGFFTEEEIMNIRAFYYAMCAEADAML 1
2 GQLISALRETHLLNNTVVIFTADHGELAMEHRQFYKMSMFEGSSHVPLLFMGPGLMSGVEADQLVSLVDIYPTVL 1
2 DLADVPPVGSLSGYSLLPLLSTCSSCPGRPHPDWVLSEYHGCNANASTYMLRSGRWKYIAYADGLRVPPQLF 1
2 DMILDKEELHNVVFKFSEVSAQLDKLLRSIVHYPEVSAAVHRYNKESFVAWRHTLGRNYSQVISSLRWHVDWQRNPLANERAIDEWLYGSF* 0 

>SulfX_tni 8 exons CONTIG_5131_1 + CONTIG_27630_1
...DGRLTFDPGSSVVKLPFITYLQELGVTFLNAYTNSPICCP...
...GQFVHLTQSWNNFKCLDSNATTWLDLLESICYRSMRICkRDYTSGSHS...
...VSSEHVSVPKWLPVAAMHPVDLYSTFTKKCSGCFTQEEITNVRAFYYAMCARSGCHA
GQLISALRETRLLGNTVVVFTADHGELAMEHRQFYKMSMFEGSSHVPLLFTGPGLMSGVQVNQLVSLVDIYPTIL
...GRWKYLAYADGLSVPPQLF
DLSLDKEELHNVVFKFTDVYAHLDKLLRSIVDYPAVSAAVHLYNKKAFVAWSQTLGRNYSQVISN...


>SulfX_hsa 1584 bp 8 exons
ATGCTACTGCTGTGGGTGTCGGTGGTCGCAGCCTTGGCGCTGGCGGTACTGGCCCCCGGAGCAGGGGAGCAGAGGCGGAGAGCAGCCAAAGCGCCCAATgatggaaggttaacatttcatccaggaagtcaggtagtgaaacttccttttatcaactttatgaagacacgtgggacttcctttctgaatgcctacacaaactctccaatttgttgcccatcacgcgcagCAATGTGGAGTGGCCTCTTCACTCACTTAACAGAATCTTGGAATAATTTTAAGGGTCTAGATCCAAATTATACAACATGGATGGATGTCATGGAGAGGCATGGCTACCGAACACAGAAATTTGGGAAACTGGACTATACTTCAGGACATCACTCCATTAGtaatcgtgtggaagcgtggacaagagatgttgctttcttactcagacaagaaggcaggcccatggttaatcttatccgtaacaggactaaagtcagagtgatggaaagggattggcagaatacagacaaagcagtaaactggttaagaaaggaagcaattaattacactgaaccatttgttatttacttgggattaaatttaccacacccttacccttcaccatcttctggagaaaattttggatcttcaacatttcacacatctctttattggcttgaaaaaGTGTCTCATGATGCCATCAAAATCCCAAAGTGGTCACCTTTGTCAGAAATGCACCCTGTAGATTATTACTCTTCTTATACAAAAAACTGCACTGGAAGATTTACAAAAAAAGAAATTAAGAATATTAGAGCATTTTATTATGCTATGTGTGCTGAGACAGATGCCATGCTTGgtgaaattattttggcccttcatcaattagatcttcttcagaaaactattgtcatatactcctcagaccatggagagctggccatggaacatcgacagttttataaaatgagcatgtacgaggctagtgcacatgttccgcttttgatgatgggaccaggaattaaagccggcctacaagtatcaaatgtggtttctcttgtggatatttaccctaccatgcttgATATTGCTGGAATTCCTCTGCCTCAGAACCTGAGTGGATACTCTTTGTTGCCGTTATCATCAGAAACATTTAAGAATGAACATAAAGTCAAAAACCTGCATCCACCCTGGATTCTGAGTGAATTCCATGGATGTAATGTGAATGCCTCCACCTACATGCTTCGAACTAACCACTGGAAATATATAGCCTATTCGGATGGTGCATCAATATTGCCTCAACTCTTTGatctttcctcggatccagatgaattaacaaatgttgctgtaaaatttccagaaattacttattctttggatcagaagcttcattccattataaactaccctaaagtttctgcttctgtccaccagtataataaagagcagtttatcaagtggaaacaaagtataggacagaattattcaaacgttatagcaaatcttaggtggcaccaagactggcagaaggaaccaaggaagtatgaaaatgcaattgatcagtggcttaaaacccatatgaatccaagagcagtttga

>SulfX_mmu 1662 bp 8 exons
ATGCTGTTGCTGTTGGTGTCGGTGGTCGCAGCGTTAGCACTCGCAGCACCGGCCCCCAGAACACAGAAGAAAAGGATGCAAGTGAACCAGGCGCCCAACGTGGTGCTGGTCGCCAGTGACTCCTTCgatggaagactaacatttcaaccaggaagtcaggtagtaaaacttcccttcattaacttcatgagagcacatggcaccaccttcctaaatgcctacactaattcacccatctgctgtccatcacgtgcagCAATGTGGAGTGGCCTCTTCACTCACTTGACAGAATCTTGGAATAATTTTAAGGGTCTGGATCCAAATTATACGACATGGATGGACATCATGGAGAAGCATGGCTATCAGACACAGAAATTTGGAAAAGTGGACTATACTTCAGGACATCATTCCATTAGtaaccgtgtggaagcatggacaagagatgttgcattcttgctccgacaagaaggcagacccataattaatcttatccctgataagaatagaaggagagtgatgaccaaggactggcagaatacagacaaagcaatcgaatggctaagacaggttaactacaccaagccatttgtcctttacttgggattgaatttgccacacccttacccttcaccatcttcaggagaaaactttggctcttctacgtttcacacttccctttactggcttgaaaagGTAGCTTATGATGCAATCAAAATCCCAAAGTGGTTGACTTTGTCACAAATGCACCCTGTGGATTTTTACTCCTCCTATACAAAAAACTGCACTGGGAAATTTACTGAAAATGAAATTAAGAACATTAGAGCATTTTATTATGCTATGTGTGCTGAGACAGATGCCATGCTAGgtgaaattattttggctcttcacaagttagatcttcttcagaaaactattgttatatatacctcagaccatggagagatggctatggaacaccgccagttttataaaatgagtatgtatgaagctagtgtccatgttcctcttctgatgatgggaccaggaattaaggccaacctacaagtaccaagtgttgtttctcttgtggatatctaccctactatgcttgACATTGCTGGGATTGCTCTGCCTCCAAATCTGAGTGGATACTCCTTGTTGACGCTGTTGTCAAATGCATCTGCAAATGAACAGGCATTCAAATTCCACCGTCCACCTTGGATTCTGAGTGAATTCCATGGATGCAATGCAAATGCTTCTACCTACATGCTACGAACTGGCCAGTGGAAGTACATAGCCTACGCTGATGGTGCTTCCGTGCAGCCTCAGCTCTTCGatctttccttggatccggatgagctaacaaacattgctacagaatttccagaaattacttattctttggaccagaagcttcgttctattgtaaactaccctaaagtgtctgcttctgtccatcagtacaataaagaacagtttatcatgtggaagcaaagcgtagggcaaaattactcaaacgttatagcacacctcagatggcatcaagattggcagagagatccaaggaagtatgaaaatgcaatccaacattggctcacagcccactccagtccactggctagcagcccaacccagtccaccagtggctcacagcccactcttccccagtccaccagtggctga

>SulfX_fru 8 exons 1569 bp Scaffold_2094
ATGTCGGTAAAATTATCAGCCCTGATTTTGCTTTTTCTGGCTTTTCATCAAGTTTTGGCCCGTAATAGAACCCGACCAAACTTTTTGGTGGTGATGAGTGATGCTTTTgacggacgattgacctttgaccctggcagcaaagttgtgaagctgccattcataaactacctccgagagcttggtgtcaccttcataaatgcttacacgaactcacccatctgctgcccctcccgcgcagCAATGTGGAGCGGTCAGTTTGTTCACCTCACGCAGTCGTGGAACAACTACAAGTGTCTTGATGCCAATGCGACAACATGGATGGATCTGCTGGAGGTGAATGGTTACCTTACCAAGATGATGGGTAAGCTGGACTACACCTCAGGGAGCCACTCTGTCAGcaatcgagttgaggcgtggacacgagacgttcagtttcttctgcgccaagagggccggcctgttacgcaacttgttgggaacatgtcaacagtcaggatcatgggaaaagactgggaaaacatagacaaggctacgcagtggatccagcagagagccgaatcctcacagcagccattcgctctctacctcggcctgaacctgcctcacccctataaaaccgaatccctggggccgacggcaggaggctccaccttccgtacctcaccacactggctggagaagGTGTCTTCTGAGCATGTCACTGTTCCTAAATGGCTTCCAGGCGCCGCCATGCACCCCGTGGACTTCTACTCCACCTTCACCAAAAACTGCAGTGGCTTTTTTACTGAGGAGGAAATCATGAACATACGAGCCTTCTATTACGCCATGTGTGCTGAAGCTGATGCCATGCTGGgtcagctgatctcagccctgagagaaacccatctgctcaacaacaccgttgttatattcacggctgaccacggcgaactggccatggagcaccggcagttctacaaaatgtccatgtttgaaggcagttcccatgttcccctcctcttcatggggccaggtctgatgtcaggcgttgaggccgaccagctagtttccctggttgatatatatcccaccgtcttggACCTTGCCGACGTTCCACCTGTCGGTAGTCTGAGTGGCTACTCGCTCCTTCCTCTGCTGTCCACGTGCAGCTCTTGTCCAGGCAGACCACATCCAGACTGGGTTCTAAGTGAATATCATGGTTGTAACGCCAATGCGTCTACCTACATGCTCAGAAGTGGTCGTTGGAAGTATATTGCTTATGCAGATGGCCTGAGGGTCCCTCCGCAGCTTTTTGatatgattctggacaaggaggaactgcataatgtagtcttcaaattctcagaggtgtctgcacagttggacaagctgttgcgtagcatagtgcactacccagaagtctcagcagccgtccaccggtacaataaagagtcgtttgttgcctggagacacactttagggagaaactacagccaagtcatttcgtccctcaggtggcacgtggattggcaaaggaatccattagccaatgagagagctatagatgagtggctctatggctctttttaa

>GNS_hsa 552aa 12q14
GKYLNEYGAPDAGGLEHVPLGWSYWYALEKNSKYYNYTLSINGKARKHGENYSVDYLTDVLANVSLDFLDYKSNFEPFFMMIATPAPHSPWTAAPQYQKAFQNVFAPRNKNFNIHGTNKHWLIRQAKTPMTNSSIQFLDNAFRKRWQTLLSVDDLVEKLVKRLEFTGELNNTYIFYTSDNGYHTGQFSLPIDKRQLYEFDIKVPLLVRGPGIKPNQTSKM

>GNS_chi 559aa goat
GKYLNEYGAPDAGGLGHVPLGWSYWYALEKNSKYYNYTLSINGKARKHGENYSVDYLTDVLANVSLDFLDYKSNSEPFFMMISTPAPHSPWTAAPQYQNAFQNVFAPRNKNFNIHGTNKHWLIRQAKTPMTNSSIQFLDNAFRERWQTLLSVDDLVEKLVKRLEFNGELNNTYIFYTSDNGYHTGQFSLPIDKRQLYEFDIKVPLLVRGPGIKPNQTSKM

>GNS_mmu 544aa mouse
GKYLNEYGAPDAGGLEHIPLGWSYWYALEKNSKYYNYTLSINGKARKHGENYSVDYLTDVLANLSLDFLDYKSNSEPFFMMISTPAPHSPWTAAPQYQKAFQNVIAPRNKNFNIHGTNKHWLIRQAKTPMTNSSIRFLDDAFRRRWQTLLSVDDLVEKLVKRLDSTGELDNTYIFYTSDNGYHTGQFSLPIDKRQLYESDIKVPLLVRGPGIKPNQTSKM

>GNS_rno 544aa rat
GKYLNEYGAPDAGGLEHVPLGWSYWYALEKNSKYYNYTLSINGKARRHGENYSVDYLTDVLANLSLDFLDYKSNSEPFFMMISTPAPHSPWTAAPQYQKAFPNVIAPRNKNFNIHGTNKHWLIRQAKTPMTNSSIKFLDDAFRRRWQTLLSVDDLVEKLVKRLDSTGELDNTYIFYTSdngyhtgqfslpidkrqlyesdikvpllvrgpgikpnqtskm

>KIAA1077_hsa 871aa 
GKYLNEYNGSYIPPGWREWLGLIKNSRFYNYTVCRNGIKEKHGFDYAKDYFTDLITNESINYFKMSKRMYPHRPVMMVISHAAPHGPEDSAPQFSKLYPNASQHITPSYNYAPNMDKHWIMQYTGPMLPIHMEFTNILQRKRLQTLMSVDDSVERLYNMLVETGELENTYIIYTADHGYHIGQFGLVKGKSMPYDFDIRVPFFIRGPSVEPGSIVPQIVL

>KIAA1077_mmu 870 aa 
GKYLNEYNGSYIPPGWREWLGLIKNSRFYNYTVCRNGIKEKHGFDYAKDYFTDLITNESINYFKMSKRMYPHRPIMMVISHAAPHGPEDSAPQFSKLYPNASQHITPSYNYAPNMDKHWIMQYTGPMLPIHMEFTNVLQRKRLQTLMSVDDSVERLYNMLVESGELDNTYIIYTADHGYHIGQFGLVKGKSMPYDFDIRVPFFIRGPSIEPGSIVPQIVL

>KIAA1077_rno floor plate
GKYLNEYNGSYIPPGWREWLGLIKNSRFYNYTVCRNGIKEKHGFDYAKDYFTDLITNESINYFKMSKRMYPHRPVMMVISHAAPHGPEDSAPQFSKLYPNASQHITPSYNYAPNMDKHWIMQYTGPMLPIHMEFTNVLQRKRLQTLMSVDDSVERLYNMLVETGELGNTYIIYTADHGYHIGQFGLVKGKSMPYDFDIRVPFFIRGPSIEPGSIVPQIVL

>KIAA1077_cco 867aa quail  
GKYLNEYNGSYIPPGWREWVGLVKNSRFYNYTISRNGNKEKHGFDYAKDYFTDLITNESINYFRMSKRIYPHRPIMMVISHAAPHGPEDSAPQFSELYPNASQHITPSYNYAPNMDKHWIMQYTGPMLPIHMEFTNVLQRKRLQTLMSVDDSMERLYQMLAEMGELENTYIIYTADHGYHIGQFGLVKGKSMPYDFDIRVPFFIRGPSVEPGSVVPQIVL

>KIAA1247_hsa 870aa chr 20q13.12 
GKYLNEYNGSYVPPGWKEWVGLLKNSRFYNYTLCRNGVKEKHGSDYSKDYLTDLITNDSVSFFRTSKKMYPHRPVLMVISHAAPHGPEDSAPQYSRLFPNASQHITPSYNYAPNPDKHWIMRYTGPMKPIHMEFTNMLQRKRLQTLMSVDDSMETIYNMLVETGELDNTYIVYTADHGYHIGQFGLVKGKSMPYEFDIRVPFYVRGPNVEAGCLNPHIVL

>KIAA1247_mmu 875aa chr 2
gkylneyngSyvppgwkewvgllknsrfynytlcrngvkekhgsdystDYLTDLITNDSVSFFRTSKKMYPHRPVLMVISHAAPHGPEDSAPQYSRLFPNASQHITPSYNYAPNPDKHWIMRYTGPMKPIHMEFTNMLQRKRLQTLMSVDDSMETIYDMLVETGELDNTYILYTADHGYHIGQFGLVKGKSMPYEFDIRVPFYVRGPNVEAGSLNPHIVL

>ARSA_hsa 507aa 22q13.33 metachromatic leukodystrophy 
GKWHLGVGPEGAFLPPHQGFHRFLGIPYSHDQGPCQNLTCFPPATPCDGGCDQGLVPIPLLANLSVEAQPPWLPGLEARYMAFAHDLMADAQRQDRPFFLYYASHHTHYPQFSGQSFAERSGRGPFGDSLMELDAAVGTLMTAIGDLGLLEETLVIFTADNGPETMRMSRGGCSGLLRCGKGTTYEGGVREPALAFWPGHIAPGVTHELASSLDLLPTLA

>ARSA_bta 505aa lower case 44aa missing, substituted human
GKWHLGVGPEGAFLPPHHGFHRFLGIPYSHDQGPCQNLTCFPPATPCEGICDQGLVPIPLLANLSVEAQPPWLPGLEARYVAFARDLMTDAQHQGRPFFLYYASHHTHYPQFSGQSFPGHSGRGPFGDSLMELDAAVGALMTAVGDLGLLGETLVFFTADNGPETMRMSHGGCSGLLRCGkgttyeggvrepalafwpghiapgvthelassldllptla

>ARSA_ssc 504aa pig lower case missing, substituted cow
GKWHLGVGPEGAFLPPHQGFHRFLGIPYSHDQGPCQNLTCFPPSTPCDGSCDQGLVPVPLLANLSVEAQPPWLPGLEARYVAFARDLMADAQRQGRPFFLYyAsHHTHYPQFSGQSFSGHSgRGPFGDSLMELDASVGALMTAVGDLGLLGETLVIFTADNGPETMRMSHGGCSGLLrCGKGttFEGGVREPALAFWPGHIAPGVTHELASSLDLLPTLA

>ARSA_mmu 506aa mouse
GKWHLGVGPEGAFLPPHQGFHRFLGIPYSHDQGPCQNLTCFPPDIPCKGGCDQGLVPIPLLANLTVEAQPPWLPGLEARYVSFSRDLMADAQRQGRPFFLYYASHHTHYPQFSGQSFTKRSGRGPFGDSLMELDGAVGALMTTVGDLGLLEETLVIFTADNGPELMRMSNGGCSGLLRCGKGTTFEGGVREPALVYWPGHITPGVTHELASSLDLLPTLA

>GALNS_hsa 522aa 16q24.3  
GKWHLGHRPQFHPLKHGFDEWFGSPNCHFGPYDNKARPNIPVYRDWEMVGRYYEEFPINLKTGEANLTQIYLQEALDFIKRQARHHPFFLYWAVDATHAPVYASKPFLGTSQRGRYGDAVREIDDSIGKILELLQDLHVADNTFVFFTSDNGAALISAPEQGGSNGPFLCGKQTTFEGGMREPALAWWPGHVTAGQVSHQLGSIMDLFTTSLALAGLTPP

>GALNS_mmu 520aa chr 8 mouse 
GKWHLGHRPQFHPLKHGFDEWFGSPNCHFGPYDNKAKPNIPVYRDWEMVGRFYEEFPINRKTGEANLTQLYLQEALDFIRTQHARQGPFFLYWAIDATHAPVYASRQFLGTSLRGRYGDAVREIDDSVGKILNLLQNLGISKNTFVFFTSDNGAALISAPNEGGSNGPFLCGKQTTFEGGMREPAIAWWPGHIAAGQVSHQLGSIMDLFTTSLSLAGLKP

>KIAA1001_hsa 525aa 17q24.2 ten exons 
GKWHLGHHGSYHPNFRGFDYYFGIPYSHDMGCTDTPGYNHPPCPACPQGDGPSRNLQRDCYTDVALPLYENLNIVEQPVNLSSLAQKYAEKATQFIQRASTSGRPFLLYVALAHMHVPLPVTQLPAAPRGRSLYGAGLWEMDSLVGQIKDKVDHTVKENTFLWFTGDNGPWAQKCELAGSVGPFTGFWQTRQGGSPAKQTTWEGGHRVPALAYWPGRVPV

>KIAA1001_mmu 526aa
GKWHLGHHGSYHPNFRGFDYYFGIPYSNDMGCTDAPGYNYPPCPACPQRDGLWRNPGRDCYTDVALPLYENLNIVEQPVNLSGLAQKYAERAVEFIEQASTSGRPFLLYVGQAHMHVPLSVTPPLAHPQRQSLYRASLREMDSLVGQIKDKVDHVARENTLLWFTGDNGPWAQKCELAGSVGPFFGLWQTHQGGSPTKQTTWEGGHRVPALAYWPGRVPA

>STS_hsa 583 aa 10 exons  79 aa
GKWHLGMSCHSKTDFCHHPLHHGFNYFYGISLTNLRDCKPGEGSVFTTGFKRLVFLPLQIVGVTLLTLAALNCLGLLHVPLGVFFSLLFLAALILTLFLGFLHYFRPLNCFMMRNYEIIQQPMSYDNLTQRLTVEAAQFIQR

>ARSD_hsa 593 aa 10 exons 79 aa
GKWHQGVNCASRGDHCHHPLNHGFDYFYGMPFTLTNDCDPGRPPEVDAALRAQLWGYTQFLALGILTLAAGQTCGFFSVSARAVTGMAGVGCLFFISWYSSFGFVRRWNCILMRNHDVTEQPMVLEKTASLMLKEAVSYIEr

>ARSE_hsa 589 aa 10 exons  79 aa
GKWHLGLNCESASDHCHHPLHHGFEHFYGMPFSLMGDCARWELSEKRVNLEQKLNFLFQVLALVALTLVAGKLTHLIPVSWMPVIWSALSAVLLLASSYFVGALIVHADCFLMRNHTITEQPMCFQRTTPLILQEVASFLKR

>ARSF_hsa 590 aa 10 exons  79 aa
GKWHQGLNCDSRSDQCHHPYNYGFDYYYGMPFTLVDSCWPDPSRNTELAFESQLWLCVQLVAIAILTLTFGKLSGWVSVPWLLIFSMILFIFLLGYAWFSSHTSPLYWDCLLMRGHEITEQPMKAERAGSIMVKEAISFLER

>ARSG_hsa 699 aa 11 exons 79 aa
GKWHLGLSCASRNDHCYHPLNHGFHYFYGVPFGLLSDCQASKTPELHRWLRIKLWISTVALALVPFLLLIPKFARWFSVPWKVIFVFALLAFLFFTSWYSSYGFTRRWNCILMRNHEIIQQPMKEEKVASLMLKEALAFIER

>STS_mmu 624aa mouse bad RefSeq chr? 79 aa
GKWHLGLSCRGATDFCHHPLRHGFDRFLGVPTTNLRDCRPGAGTVFGPALRVFAAGPLAALGASLAAMAAARWAGLARVPGWALAGTAAAMLAVGGPRSASCLGFRPANCFLMDDLAVAQRPTDYGGLTRRLADEAALFLRR 

>STS_rno 578aa rat chrX:87654459-87661379 79 aa
GKWHLGLSCQAASDFCHHPGRHGFDRFLGTPTTNLRDCKPGGGTVFGSAQQVFVVLPMNILGAVLLAMALARWAGLARPPGWVFGVTVAAMAAVGGAYVAFLYHFRPANCFLMADFTITQQPTDYKGLTQRLASEAGDFLRR

>ARSE_bta cow Bos taurus 71% complete 80 aa
GKWHLGLSCASPDDHCHHPLNHGFDHFYGMPFSMMADCERWELSEKRAVLESRLDVCFQLVALATLTLTIGKLTHLIPGASWTLVIWSAVVCLLLFATSCLVGALIMHADCFLMRNHSIAEQPMRSQRTTPLMLQEVSSFVKRHKQGPFLLFVSFLHVH

>SulfX_mmu 556 aa mouse 82% MPA N-trim BAB28703.1| AK01319 ::123::45 105
APNVVLVASDSFDGRLTFQPGSQVVKLPFINFMRAHGTTFLNAYTNSPICCPSRAAMWSGLFTHLTESWNNFKGLDPNYTTWMDIMEKHGYQTQKFGKVDYTSGHHSISNRVEAWTRDVAFLLRQEGRPIINLIPDKNRRRVMTKDWQNTDKAIEWLRQVNYTKPFVLYLGLNLPHPYPSPSSGENFGSSTFHTSLYWLEKVAYDAIKIPKWLTLSQMHPVDFYSSYTKNCTGKFTENEIKNIRAFYYAMCAETDAMLGEIILALHKLDLLQKTIVIYTSDHGEMAMEHRQFYKMSMYEASVHVPLLMMGPGIKANLQVPSVVSLVDIYPTMLDIAGIALPPNLSGYSLLTLLSN ::12
ASANEQAFKFHRPPWILSEFHGCNANASTYMLRTGQWKYIAYADGASVQPQLFDLSLDPDELTNIATEFPEITYSLDQKLRSIVNYPKVSASVHQYNKEQFIMWKQSVGQNYSNVIAHLRWHQDWQRDPRKYENAIQHWLTAHSSPLASSPTQSTSGSQPTLPQSTSG ::45

>ARSB_hsa 533aa 5p11q13 MPSVI PDB:1AUK
GKWHLGMYRKECLPTRRGFDTYFGYLLGSEDYYSHERCTLIDALNVTRCALDFRDGEEVATGYKNMYSTNIFTKRAIALITNHPPEKPLFLYLALQSVHEPLQVPEEYLKPYDFIQDKNRHHYAGMVSLMDEAVGNVTAALKSSGLWNNTVFIFSTDNGGQTLAGGNNWPLRGRKWSLWEGGVRGVGFVASPLLKQKGVKNRELIHISDWLPTLVKLARG

>ARSB_mmu 491aa mouse from Ests
GKWHLGMYRKECLPTRRGFDTYFGYLLGSEDYYTHEACAPIESLNGTRCALDLRDGEEPAKEYTNIYSTNIFTKRATPVIATHPPEKPLFLYLAFQSVHDPLQVPEEYMEPYGFIQDKHRRIYAGMVSLMDEAVGNVTKALKSHGLWNNTVFIFSTDNGGQTRSGGNNWPLRGRKGTLWEGGIRGTGFVASPLLKQKGVKSRELMHISDWLPTLVDLAGG

>ARSB_rno 533aa lower case missing, substituted mouse
GKWHLGMYRKECLPTRRGFDTYFGYLLGSEDYYTHEACAPIECLNGTRCALDLRDGEEPAKEYTDIYSTNIFTKRATTLIANHPPEKPLFLYLAFQSVHDPLQVPEEYMEPYDFIQDKHRRIYAGMVSLLDEAVGNVTKALKSRGLWNNTVLIFSTDNGGQTRSGGNNWPLRGRKGTLWEGGIRGAGFVASPLLKQKGVKSRELMHITDWLPTLVNLAGG

>ARSB_fca 535aa cat
GKWHLGMYRKECLPTRRGFDTYFGYLLGSEDYYSHERCALIDSLNVTRCALDFRDGEQVATGYKNMYSTNIFTERATALITSHPPEKPLFLYLALQSVHEPLQVPEEYLKPYDFIQDKNRHYYAGMVSLMDEAVGNVTAALKSHGLWNNTVFIFSTDNGGQTLAGGNNWPLRGRKWSLWEGGIRGVGFVASPLLKQKGVKNRELIHISDWLPTLVKLARG

>SulfY_hsa 574aa 2 exons chr4:125570802-125648441 size 77640  strand polym fitK/R gen/EST
GKWHLGFYRKECMPTRRGFDTFFGSLLGSGDYYTHYKCDSPGMCGYDLYENDNAAWDYDNGIYSTQMYTQRVQQILASHNPTKPIFLYIAYQAVHSPLQAPGRYFEHYRSIININRRRYAAMLSCLDEAINNVTLALKTYGFYNNSIIIYSSDNGGQPTAGGSNWPLRGSKGTYWEGGIRAVGFVHSPLLKNKGTVCKELVHITDWYPTLISLAEGQIDE

>SulfY_mmu 573aa from mouse htgs AC091322 94% human 2 or 3 exons 
GKWHLGFYRKDCMPTKRGFDTFFGSLLGSGDYYTHYKCDSPGVCGYDLYENDNAAWDYDNGIYSTQMYTQRVQQILATHDPTKPLFLYVAYQAVHSPLQAPGRYFEHYRSIININRRRYAAMLSCLDEAIHNVTLALKRYGFYNNSIIIYSSDNGGQPTAGGSNWPLRGSKGTYWEGGIRAVGFVHSPLLKNKGTVCKELVHITDWYPTLISLAEGQIDE

>SulfZ_hsa 569aa NT_006951 ARSB type another chr 5 gene 2 exon, 4 glyco QLLTGR end of exon1
GKWHLGFYRKECLPTRRGFDTFLGSLTGNVDYYTYDNCDGPGVCGFDLHEGENVAWGLSGQYSTMLYAQRASHILASHSPQRPLFLYVAFQAVHTPLQSPREYLYRYRTMGNVARRKYAAMVTCMDEAVRNITWALKRYGFYNNSVIIFSSDNGGQTFSGGSNWPLRGRKGTYWEGGVRGLGFVHSPLLKRKQRTSRALMHITDWYPTLVGLAGGTTSAA

>SulfZ_mmu 572aa 95% human AA123795 
GKWHLGFYRKECLPTRRGFDTFLGSLTGNVDYYTYDNCDGPGVCGFDLHEGESVACGLSGQYSTMLYAQRASHILARHNPQNPLFLYVAFQAVHTPLQSPREYLYRYRTMGNVAQRKYAGMVTCMDEAVRNITWALKRYGFYNNSVIIFSSDNGGQTFSGGSNWPLRGRKGTYWEGGVRGLGFVHSPLLKKKRRTSRALVHITDWYPTLVGLAGGTTSAA

>IDS_hsa 550aa Xq27.3q28 i
GKVFHPGISSNHTDDSPYSWSFPPYHPSSEKYENTKTCRGPDGELHANLLCPVDVLDVPEGTLPDKQSTEQAIQLLEKMKTSASPFFLAVGYHKPHIPFRYPKEFQKLYPLENITLAPDPEVPDGLPPVAYNPWMDIRQREDVQALNISVPYGPIPVDFQRKIRQSYFASVSYLDTQVGRLLSALDDLQLANSTIIAFTSDHGWALGEHGEWAKYSNFDV
 
>IDS_mmu 563aa chr X 27 
GKVFHPGISSNHSDDYPYSWSFPPYHPSSEKYENTKTCKGQDGKLHANLLCPVDVADVPEGTLPDKQSTEEAIRLLEKMKTSGSPFFLAVGYHKPHIPFRYPKEFQKLYPLENITLAPDPHVPDSLPPVAYNPWMDIREREDVQALNISVPYGPIPEDFQRKIRQSYFASVSYLDTQVGHVLSALDDLRLAHNTIIAFTSDHGWALGEHGEWAKYSNFDV
 
>SGSH_hsa 502aa 17q25.3 MPSIIIA 
GKKHVGPETVYPFDFAYTEENGSVLQVGRNITRIKLLVRKFLQTQDDRPFFLYVAFHDPHRCGHSQPQYGTFCEKFGNGESGMGRIPDWTPQAYDPLDVLVPYFVPNTPAARADLAAQYTTVGRMDQGVGLVLQELRDAGVLNDTLVIFTSDNGIPFPSGRTNLYWPGTAEPLLVSSPEHPKRWGQVSEAYVSLLDLTPTILDWFSIPYPSYAIFGSKTI

>SGSH_bta 505aa cow lower case missing, substituted pig needs fix
GKKHVGPEMVYPFDFAYTEENGSVLQVGRNITRIKLLVRKFLQTRGDRPFFLYVAFHDPHRCGHSQPQYGAFCEKFGNGESGMGRIPDWTPQTYNPKDVQVPYFVPDTPAARADLAAQYTTIGRMDQGIGLVLQELRGAGVLNDTLVIFTSDNGIPFPSGRTNLYWPGTAEPMLVSSPEHPKRWGQVSEAYVSLLDLTPTILDWFSIPphyYAIFGTKTV

>SGSH_ssc 505aa pig
GKKHVGPEAVYPFDFAHTEENDSILQVGRNITRMKLLVRKFLQTQDDRPFFLYVAFHDPHRCGHSHPQYGAFCEKFGNGESGMGWIPDWTPQTYNPQDVQVPYFVPDTPAARADLAAQYTTIGRMDQGIGLVLQELRGAGVLNDTLVIFTSDNGVPFPSGRTNLYWPGAAEPLLVSSPEHPQRWGQVSEAYVSLLDLTPTVLDWFSIPYPHYAIFGSKTV

>SGSH_mmu 502aa chr 11 Mus MPSIIIA
GKKHVGPETVYPFDFAFTEENSSVMQVGRNITRIKQLVQKFLQTQDDRPFFLYVAFHDPHRCGHSQPQYGTFCEKFGNGESGMGYIPDWTPQIYDPQDVMVPYFVPDTPAARADLAAQYTTIGRMDQGVGLVLQELRGAGVLNDTLIIFTSDNGIPFPSGRTNLYWPGTAEPLLVSSPEHPQRWGQVSDAYVSLLDLTPTILDWFSIPYPSYAIFGSKTI

>SGSH_clu 507aa AF217204 6108 bp 8 exons dog
GKKHVGPESVYPFEFAHTEENSSVLQVGRNITRIKLLVRKFLQTQDDRPFFLYVAFHDPHRCGHSQPQFGTFCEKFGNGESGMGRIPDWTPQTYDPLDVLVPYFVPDTPAARADLAAQYTTIGRMDQGVGLVLQELRGAGVLNDTLVIFTSDNGIPFPSGRTNLYWPGTAEPLLISSPEHRKRWGQVSEAYVSLLDLTPTILDWFSIPYPSYAIFGSKTV

>KIAA47/77_cel 709aa WP:CE04736 Sulf2_C.elegans K09C4.8 KIAA1077_mmu 51% CE04736 14 exons U43375
GKYLNEYDGSYIPPGWDEWHAIVKNSKFYNYTMNSNGEREKFGSEYEKDYFTDLVTNRSLKFIDKHIKIRAWQPFALIISYPAPHGPEDPAPQFAHMFENEISHRTGSWNFAPNPDKQWLLQRTGKMNDVHISFTDLLHRRRLQTLQSVDEGIERLFNLLRELNQLWNTYAIYTSDHGYHLGQFGLLKGKNMPYEFDIRVPFFMRGPGIPRNVTFNEIVT

>ARSE_hpu 551aa sea urchin Hemicentrotus pulcherrimus 22 aa
GKWHLGINENSSTDGAHLPFNHGFDFVGHNLPFTNSWSCDDTGLHKDFPDSQRCYLYVNATLVSQPYQHKGLTQLFTDDALGFIE

>ARSE_her 559aa sea urchin Heliocidaris erythrogramma 22 aa 
GKWHLGINEQTSTDGAHLPFNHGFEYVGYNLPFTNSWNCDDTGLHVDFPNTEKCYLYKNATLVSQPYQHRNLTKLFTDDAIEFID

>ARSE_spu 567aa sea urchin Strongylocentrotus purpuratus 22 aa 
GKWHLGINENSSSDGAHLPANRGFDFVGHNLPFGNSWRCDDTGLHQDFPDTNACFLYYNSTSVAQPFQHKGLTQLLRDDTVGFIE

>ARSE_cel 452aa Sulf1_C.elegans 23 aa
GKWHLGINENNATDGAHLPSKRGFEYVGVNLPFTNVWQCDTTREFYDKGPDPSLCFLYDGDDIVQQPMKFEHMTENLVGDWKRFLM

>ARSE_cbr 421 aa Sulf1_C.briggsae 23 aa
GKWHLGINENNATDGAHLPSKRGFDYVGVNLPFTNVWQCDTTKEYYDKGPDPSLCFLYDGDDIVQQPMKFEHMTENLVGDWKRFLM

>ARSB_dm1 542aa Drosophila melanogaster 542aa 51%
GKWHLGHWKLKYTPLYRGFSSHWGLDMRNGTQVAYDLHGHYTTDVITDHSVKVIANHNATKGPLFLYVAHAACHSSNPYNPLPVPDNDVIKMSHIPNYKRRKFAAMVSKMDNSVGQIVDQLRKSNMLENSIIIFSSDNGGPAQGFNLNFASNYPLKGVKNTLWEGGVRAAGLMWSPLLKKSQRVSNQTMHIIDWLPTLLEAAGGQPALSNLSKQIDGQSI

>ARSB_dm3 996aa  Drosophila melanogaster 
GKWHLGFSRPEYTPTRRGFDYHFGYWGAYIDYFQRRSKMPVANYSLGYDFRRNMELECRDRGVYVTDLLTAEAERLIKDHADKEQPLFLMLSHLAAHTANEDDPLQAPEEEIQKFSYIKDPNRRKYAAMISKLDQSVGRIITALSSTDQLENSIVIFYSDNGAPSVGMFSNTGSNFPLRGQKNTPWEGGVRVAGAIWSSgLQARGSIFRQPLYVADWLPT

>ARSB_dm2 579aa Drosophila melanogaster 43% ARSB
GKWHLGFWRKDLTPTMRGFDHHFGYYNGYIDYYDHQVRMLDRNYSAGLDFRRDLEPCPEANGTYATEAFTSEAKRIIEQHDKSKPLFMVLSHLAVHTGNEDSPMQAPEEEVAKFPHIRDPKRRTYAGMISSLDKSVAQTIGALKDNGMLNNSIILLYSDNGAPTIGIHSNAGSNYPYRGQKESPWEGGIRSAGALWSPLLKERGYVSNQAIHAVDWLPTL

>ARSB_dm4 585aa Drosophila melanogaster 
GKWHLGLSQRNFTPTERGFDRHLGYLGAYVDYYTQSYEQQNKGYNGHDFRDSLKSTHDHVGHYVTDLLTDAAVKEIEDHGSKNSSQPLFLLLNHLAPHAANDDDPMQAPAEEVSRFEYISNKTHRYYAAMVSRLDKSVGSVIDALARQEMLQNSIILFLSDNGGPTQGQHSTTASNYPLRGQKNSPWEGALRSSAAIWSTEFERLGSVWKQQIYIGDLLP

>SGSH_dme 524aa Drosophila melanogaster
GKKHVGAANNFRFDFEQTEEQHSINQIGRNITRMKEYARQFLKQAKDEKKPFFLMVGFHDPHRCGHITPQFGEFCERWGSGEEGMGSIPDWKPIYYDWRNLDVPAWLPDTDVVRQELAAQYMTISRLDQGVGLMLKELEAAGVADQTLVIYTSDNGPPFPGGRTNLYEHGIRSPLIISSPNKEDRHHEATAAMVSLLDIYPSVMDALQIPRPNDTKIVGR

>IDS_dme 512aa Drosophila melanogaster
GKVFHPGLSSNNTDDYPLSWSAPAFRPRTEQFMNSPVCPDKEGILRKNLICPVELQTQPYKTLPDIESVAEALRFVGSRSRHSQEPFFLAMGFHKPHINFRFPRQFLSRFNLSQFYNYTEDSLKPPDMPAVAWNPYTDVRARDDFKHSNISFPYGPISPLQAAQIRQSYYASVSYVDDLFGKLIGGLDLDETVVVALGDHGWSLGEHAEWAKYSNFEVAL

>GNS_dme 492aa Drosophila C
GKYLNQYWGAGDVPKGWNHFYGLHGNSRYYNYTLRENSGNVHYESTYLTDLLRDRAADFLRNATQSSEPFFAMVAPPAAHEPFTPAPRHEGVFSHIEALRTPSFNQVKQDKHWLVRAARRLPNETINTIDTYFQKRWETLLAVDELVVTLMGVLNDTQSLENTY

>KIAA47/77_dme 1114aa Sulf1 Sulfated Drosophila melanogaster
GKYLNKYNGSYIPPGWREWGGLIMNSKYYNYSINLNGQKIKHGFDYAKDYYPDLIANDSIAFLRSSKQQNQRKPVLLTMSFPAPHGPEDSAPQYSHLFFNVTTHHTPSYDHAPNPDKQWILRVTEPMQPVHKRFTNLLMTKRLQTLQSVDVAVERVYNELKELGEL

>Sulf_ncr 639aa Neurospora crassa
GKLFNAHTVDNYDSPYIAGWNGSDFLLDPYTYSYLNATFQRNRDPPISYEGQYSVDVLAEKAYGFLDEAAKNVHNRPFFLGIAPIAPHSNVEPGFPSSSSSSSSSDSATLHRRPTNEHDDIEKSVSFTPPIPAARHAHLFPDVIVPRTPHFSRASGVSWIARLP

>Sulf_vca 649aa Volvox carteri 
GKFLVDYSVSNYQNVPAGWTDIDALVTPYTFDYNNPGFSRNGATPNIYPGFYSTDVIADKAVAQIKTAVAAGKPFYAQISPIAPHTSTQIYFDPVANATKTFFYPPIPAPRHWELFSDATLPEGTSHKNLYEADVSDKPAWIRALPLAQQNNRTYLEEVYRLRLRSLASVDELIDRVVATLQEAGVLDNTYLIYSADNGYHVGTHRFGAGKVTAYDEDLRVPFL

>Sulf_cre 647aa Chlamydomonas reinhardtii P14217 ARS_CHLRE Arylsulfatase GNS 30%
GKFLVDYSVSNYQQVPRAGTISMPXVTPYTFDYNTRLQRNGATPNIYPGEYSTDVIRDKGVAQIKSAVAAGKPFYAQISPIAPHTSTQISTNPATGVTRSYFFPPIPAPPHWQLFSDANLPGGSXNKNLYEVDVSDKPAWIRALPLAQQNNRTYQEEIYRLRLR

>ARSB_spo 554aa S.pombe
GKWHLGLTPDRYPSKRGFKESFALLPGGGNHFAYEPGTRENPAVPFLPPLYTHNHDPVDHKSLKNFYSSNYFAEKLIDQLKNREKSQSFFAYLPFTAPHWPLQSPKEYINKYRGRYSEGPDVLRKNRLQAQKDLGLIPENVIPAPVDGMGTKSWDELTTEEKEFSARTMEVYAAMVELLDLNIGRVIDYLKTIGELDNTFVIFMSDNGAEGSVLEAIPVL

>KIAA1077_fru 19 exons 864 aa 
GKYLNEYNGSYIPPGWREWVGLIKNSRFYNYTVCRNGYKEKHGGEYAKDYFTDLITNDSINFFRISKRMFPHRPVMMVISHAAPHGPEDSAPQYADHFPNASQHITPSYNYAPNMDKHWIMQYTGPMRPIHMEFTNFLHRKRLQTLMSVDDSVQK

>KIAA1247a_fru 20 exons 869 aa
GKYLNEYNGSYVPPGWKEWVALVKNSRFYNYTLCRNGVREKHSSDYPKDYLTDIITNESINYFRTSKRTYPNRPVMMVLSHVAPHGPEDSAPQYSSAFPNASQHITPSYNYAPNPDKHWILRYTGAMKPVHMQFTNMLQRRRMQTLLSVDDSVEK

>KIAA1247b_fru 22 exons 858 aa
GKYLNEYNGSYVPPGWKEWLGLVKNSRFYNYTLSRNGFREKHGAEYPQDYLTDLITAESMRYFRYSKRVYPHRPVLMVLSHAAPHGPEDSAPQYSTAFQNASQhiTPSYNYAPNLDKHWIMRYIGPMKPIHMEFTNVLQRKRLQTLLSVDDSVEK

>GNS_fru 543 aa (72% GNS_hsa) Scaffold_157:8104384785 SINFRUP00000081852
GKYLNQYGHAQAGGVEHIPPGWSFWVGLEKNSKYYNYTLSVNGKAQKHGSDYSKDYLTDVLANMSLEFLQYKSSYQPFFMMVSTPAPHSPWTAAPQYQNSFNGTKAPRDPNFNVHGKDKHWLIRQAKTPMSNSSVQFLDDAFRK

>GNSlike_fru 14 exons 568 aa 
GKYLNQYGQKDAGHVGHIPPGWDHWHALVGNSQYYNYSLSVNGKEEKHGDNYGDDYLTDLITNRSLTFLDNRSPQLPFFLLLSPPAPHAPWTAAPQHQKDYADIKAPRDGSFDKPGKDKHWLLRQPANPMTASSLTYLDNAYRKR

>GALNS_fru 14 exons 520 aa (69% GALNS_hsa)
GKWHLGHRPQYLPLEHGFDEWFGAPNCHFGPYNNSVRPNIPVYRNSWMLGRYYEEFKIDKKTGESNLTQMYLLEGLDFIQSQAEAQKPFFLYWAPDATHAPVYASKDFLGKSQRGR

>KIAA1001_fru bp 11 exons 527 aa 
GKWHLGHNGPYRPNRRGFDYYYGVPYSNDMGCTDVPGYNLPQCPPCDPPSGPsrSRHDGCYSKVALPLIENTTIVQQPLNLWRLTEQYKSAATRIIQNARAQGQPYFLYIALAHMHVPLAPPVGASATNDNKVYAASLQEMDDLVGAIKRISDETDRDNTLIWFT 1

>ARSA_fru 8 exons 501 aa
GKWHLGIGANGTFLPTRQGFDQYLGIPySHEMGPCQNLTCFPPDVKCFGLCDVGTVTVPLMYNEVIKQQPVNFLDLENAYRDFASDFISTSAKKRQPFFLYFPSHHTHYPQFAGPGAAGMSLRGPFGDALLELDNTIGSLLETLEGTGVLNNTLILFTSDNGPELMRMSRGGNAGPLRCGKSTTYEGGMREPAIAYWRGIIQP

>IDS_fru 12 exons 593 aa
GKVFHPGIASNHTDDYPYSWSIPPYHPASLHFEKQKMCKGDDGQLHANLLCAVNVTEQPGGTLPDLESTEEAIGLLKGRVQNTQPFFLAVGFHKPHIPFRIPQEYLSLYPIEKMTLAPDPIVPELLPPVAYN

>SGSH_fru 8 exons 499 aa
GKKHVGPGSVYPFDFAYTEENGSVLQVGRNITRIKLLVRKFFQAHKEDKVNSQEEERPFFLYVAFHDTHRCGHSQPQYGAFCEKFGNGEKGMGRIPDWKPVYYTPDQVKVPPFAPDTPVTRADLAAQYTTVSRLDQGIGLVLQELREAG

>SulfX_fru 8 exons 504 aa
WMDLLEVNGYLTKMMGKLDYTSGSHSvsNRVEAWTRDVQFLLRQEGRPVTQLVGNMSTVRIMGKDWENIDKATQWIQQRAESSQQPFALYLGLNLPHPYKTESLGPTAGGSTFRTSPHWLEKVSSEHVTVPKWLPGAAMHPVDFYSTFTKNCSGFFTEEEIMNIRAFYYAMCAEADAML

>SulfX_hsa 527aa
WMDVMERHGYRTQKFGKLDYTSGHHSISNRVEAWTRDVAFLLRQEGRPMVNLIRNRTKVRVMERDWQNTDKAVNWLRKEAINYTEPFVIYLGLNLPHPYPSPSSGENFGSSTFHTSLYWLEKVSHDAIKIPKWSPLSEMHPVDYYSSYTKNCTGRFTKKEIKNIRAFYYAMCAETDAMLGEIILALHQLDLLQKTIVIYSSDHGELAMEHRQFYKMSMYEASAH

>SulfX_mmu 556 aa mouse 82%  
WMDIMEKHGYQTQKFGKVDYTSGHHSISNRVEAWTRDVAFLLRQEGRPIINLIPDKNRRRVMTKDWQNTDKAIEWLRQVNYTKPFVLYLGLNLPHPYPSPSSGENFGSSTFHTSLYWLEKVAYDAIKIPKWLTLSQMHPVDFYSSYTKNCTGKFTENEIKNIRAFYYAMCAETDAMLGEIILALHKLDLLQKTIVIYTSDHGEMAMEHRQFYKMSMYEASVHVP

>SulfY_fru 2 exons 560 aa  
GKWHLGFYKRGCLPTQRGFDTFFGSLLGSGDHYSHYKCEAPGMCGYDLYEGEEAAWEQDRGLYSTVMFTQKAISILAKHDPHRKPLFLYLAYQAVHSPLQVPSRYLERYKGISNVHRRKYAAMVSCLDEAIRNLTLALKRYGYYDNTVLVYSSDNGGQPLLGGSNWPLRGSKASYWEGGIRAVGFVHSPLLRNKGTKCRSLIHITDWFPTLVSLGEGTLE

>sulfZ_fru 2 exons 579 aa
GKWHLGFYKKECLPTRRGFDTYFGSLTGSVNYYTYDSCDGPGMCGFDLHEGESVAWSQKGKYSTHLYTQRVRKILATHDPRSQPLFIFLSFQAVHTPLQCPREYIYPYRGLENIARRKYAAMVSAVDEAVRNITYGLRKYGYYENSIMIFSTDNGGQPLSGGSNWPLRGRKGTYWEGGVRGLGFIHSPLLRKKKRVSKALVHITDWYPTLVGLAGGKESH

>STS_fru revisited 75 aa, one 2-2 exon
GKWHLGLNCESRDDHCHHPNAHGFNYFFGIPLTNLRDCQPGHGTVFQIHKYLPYRTLGTVLASTVLLYFSGLISSAEKGPFAFWLQRFWSCSFIVGFIMIIPLFNCVLMKDHSIVEQPFVSENLTQRMTREAVDFIER

>ARSD_fru revisited 80 aa split exon 2-1
GKWHLGVNCERRGDHCHHPNQHGFSYFYGLPFTLFNDCVPGEGSDILADLQKTLRSFTIFLGIGLATLVRLIVVFQASFYSLRLLVVLFFISILATVLWLTPFKFIPTWNCILMRNQEVVEQPMVVETLPRRLLTEAQQFIKR

>KIAA47/77_cii 14 exons 854aa 
GKYLNEYNGSYIPQGWQYWMGLVRNSRYYNYSLRHNDVKESHRDNYRDDYFTDLIVNRSMTYFRRKKHEEPDSPILSVLSFPAPHGSEDGAPQYQHLYANVTSHITPSFDYGPNPDKHWIISSRKVPMDETQHRFSSILQQKRLQTLRSVDDAVDRFVSMLQDTGELDNTYLLYTSDHGFHIGQFGLAKGKSMPYDFDVRVPLFMRGPGIQAGLHVNEIILNID

>GNS_cii 11 exons 545aa 
GKYLNQYGGKSVGGPQHVPVGWNQWFGLVGNSKYYNYTISDNGVPVQHGANYHEDYLTDLLANRSVDFIHNHKMRYTQPFFMMISTPAPHSPWDSAPQYSKMYENNTAPHTPSYNTKAVNKHWLVRQATHPMTKESMDYSDNAFRSRWRALKSVDDLVERVINALSKMKQLDNTYVFFSSDNGY

>KIAA1001_cii 8 exons 500 aa 
GKWHLGITKAYHPCSRGFNYYYGLPYSNDMXCVDCDAYNHPQCKKCPKQSGITNDQAIECGNYDTALPLYENYDIIEQPANLVELGDRYVEKATLFIQQAKNKTQPFFLYVATAHTHVPLAYAKRFHNSTSHDTRYSDTLHELDDMIGRIMTSLKDNGLY

>GALNS_cii 14 exons 493aa 
GKWHLGQQEQYLPLKHGFHEWFGSPNCHFGPYDDKTTPNIPVYNNTEMVGRYYEEFAIESHKYLSNMTQYYIQEALDFIERMERNEKPFFLYWAPDATHSPVYSSPMFRGASRRGPYGDAVMELDYGVGVIIQKLKQLGLDKNTLVLFSSDNGAAMIGSA

>SGSH_cii 2 exons 507aa ????
GKKHVAPEAVYPFDFAETEENNSILQVGRNITRMKELAKQFFSMQLKNESFLLYIGFHDPHRCGHTHPQYGEFCEKFGNGDYRMGKIPDWKPDYYSPDDVIVPPFVQDTPASRKDISAQYTTISRLDQGVGLIINELKQAGFLESTLILFTSDNGIPFPNGRTNLYNSGTAGPFILALPVQKHKQAVV

>SulfZYB_cii 10 exons 512aa  
GKWHLGFSSSKYAPWNRGFHGFYGFLAGSENYWSKWLPMARHSNIGGVDFTDSTTGPTNETWGQYSAHVYASRARYVIQHHDQSKPLFLYLPLQTPHTPLGAPSHYYEPFKDIEDDDRMKYLSMVSVLDETVRNVTNYLKDAGMWEDTLLIFSTDNGGEV

>sulfZ/Y_cii 1 exon 517aa
GKWHLGFYKDEYLPWKRGFNSYFGYLTGGEDYYTKWRCDGKLCGYDMTSEKGPTNATYGQYSANLFANKANEAIDKHDKTKPLFLYVAFQSVHSPMEVPESYAKPFDYIKNHNRKMYGGMVAAMDEAVKNITEHLQAAGLWDNTILVFSADNGGQTLSGGNNWPLRGRKLTLWEGGIKGVGFVHGKILNVPNPNYIVNNEMIHISDWFPTIMEATQCPYV

>sulfY/Z_cii 10 exons 562aa
GKWHLGFFREEYLPWNRGFQNFFGFLNGGVNHFTRYHCEPKKTRRFCGYDMIDSRYGPTNATYGEYSTNLFIRKSKEMIDKHNKQKPMFLYLSLQAVHGPLQVPNQYLKRFKHIRDKNRRIYAGMVYAMDRGIRQLVKHLKRARMWKNTIFIFSTDNGGQTTRGGNNWPLRGKKGTLWEGGIRGVGFVHGKPLQVTTPRVNKELLHVSDWYPTIMSATHC

>ARSB2_cii 1 exons 522aa
GKWHLGFYKKECLPTSRGFDTFYGYYCGAEDYYTKQVHANFHFGNKTRRVSGFDFHDNSRTEWEANGTYSSYLYRDRAVRIIKSHNSSIPLFMYLPFQSVHFPLQVPAKYIKRYRHIKDRKRRTFSAMVTAMDEAIGSVVDALKWKGIWQDTLVVFTTDNGGQTLFGGNNWPLRGRKASLWEGGVRGVGLVRGYGIRDKGRSSNELVHISDWFPTLLYIA

>ARSB_cii 12 exons 492aa
GKWHVGYCDEAYTPTRRGFDSHYGFYNSGISYSNYSSTEGTDVGYDYRDDLALNLAAEGKYTTTDFTDQAKTLIDNHDQTNPMFLYMAYNAPHTPFEVEESYRDIYDGNLRDGNRKTYLGMISALDEQVGQLVDKLKEVGMWSNTVFVFYSDNGGTQPQSGQSGNNFPLRGKKGSLFEGGYRLIARTRAGNLELIASTSSTLFHISDMFATFIALAGGDA

>IDS_cii 11 exons 504aa 
GKVFHPGICSNYNDDFPLSWSLPACHPPTQKYKMKQVCPGPDGKLHMNLLCPVNVSTQPEHSLPDIQSAGHAVEMIRKFSNNKSQPFFLAVGFHKPHIPYKFPEQYLDLYPISEIDLAPNPFIPKELPPVAFSPYTQMRIREDVKSLNLSFPFGPIPYDFQRKIRQHYYSSVTYMDSMVGKVLQQLEQSG

>STS_cii 21 aa 47% ARSE/D/STS
GKWHLGINELKQNDGRHLPKHHGFDFVGTNLPFTFHLFCSPSEYPVDKMKIKCFLSNKDEIIEQPIIPEKLTDKIVEGAKQFIT

>STS2_cii 22 aa 34% STS/E/F/D
GKWHLGINRNTSTDGYHLPHNHGFDFVGTNLPLSHSEMCNPAEFTVEELSTMCFLYNGSTIVEQPVNLSTLTDRITSDAKNFIS


new sts sequences:



>ARSD_gga chicken Gallus gallus, no STS_gga available

SLPGRQRSCLSIWLILCLFLRSCVSSPPKPNFLLILADDLGIGDVGCYGNDTIRTPHIDG
LAKEGVRLTQHIAAAAVCTPSRAAFLTGRYPIRSGMASSTERRILFWNGCSGGLPPNETT
FARVLHQQGYSTALVRNKHRPFLLFLSLLHVHTPLITTKEFLGRSRHGLYGDNVEEMDWM
VGRLLDVIDKEGLKNTTFIYFASDHGGSVEAHRGNVRLGGWNGIYKGKILVACTYIKTVY
YKESTE

>STS_hsa 583aa Xp22.31 steroid sulfatase ichthyosis PMID:10607842::123::45
RPNIILVMADDLGIGDPGCYGNKTIRTPNIDRLASGGVKLTQHLAASPLCTPSRAAFMTGRYPVRSGMASWSRTGVFLFTASSGGLPTDEITFAKLLKDQGYSTALIGKWHLGMSCHSKTDFCHHPLHHGFNYFYGISLTNLRDCKPGEGSVFTTGFKRLVFLPLQIVGVTLLTLAALNCLGLLHVPLGVFFSLLFLAALILTLFLGFLHYFRPLNCFMMRNYEIIQQPMSYDNLTQRLTVEAAQFIQRNTETPFLLVLSYLHVHTALFSSKDFAGKSQHGVYGDAVEEMDWSVGQILNLLDELRLANDTLIYFTSDQGAHVEEVSSKGEIHGGSNGIYKGGKANNWEGGIRVPGILRWPRVIQAGQKIDEPTSNMDIFPTVAKLAGAPLPEDRIIDGRDLMPLLEG ::12
KSQRSDHEFLFHYCNAYLNAVRWHPQNSTSIWKAFFFTPNFNPVGSNGCFATHVCFCFGSYVTHHDPPLLFDISKDPRERNPLTPASEPRFYEILKVMQEAADRHTQTLPEVPDQFSWNNFLWKPWLQLCCPSTGLSCQCDREKQDKRLSR ::45

>ARSD_hsa_ins 
TLTNDCDPGRPPEVDAALRAQLWGYTQFLALGILTLAAGQTCGFFSVSARAVTGMAGVGCLFFISWYSSFGFVRRWNCI
   
>ARSE_hsa_ins  
SLMGDCARWELSEKRVNLEQKLNFLFQVLALVALTLVAGKLTHLIPVSWMPVIWSALSAVLLLASSYFVGALIVHADCF

>ARSF_hsa_ins  
TLVDSCWPDPSRNTELAFESQLWLCVQLVAIAILTLTFGKLSGWVSVPWLLIFSMILFIFLLGYAWFSSHTSPLYWDCL

>STS_hsa_ins  
TNLRDCKPGEGSVFTTGFKRLVFLPLQIVGVTLLTLAALNCLGLLHVPLGVFFSLLFLAALILTLFLGFLHYFRPLNCF

>STS_mmu_ins  
TNLRDCRPGAGTVFGPALRVFAAGPLAALGASLAAMAAARWAGLARVPGWALAGTAAAMLAVGGPRSASCLGFRPANCF
   
>STS_rno_ins
TNLRDCKPGGGTVFGSAQQVFVVLPMNILGAVLLAMALARWAGLARPPGWVFGVTVAAMAAVGGAYVAFLYHFRPANCF
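
The *_ins entries above are the inserted segments lifted out of the longer fragments. A minimal sketch, assuming plain substring matching is adequate, for checking where an insert sits in its parent fragment and how long it is (the function name locate_insert is hypothetical, not part of these notes):

```python
def locate_insert(fragment, insert):
    """Find `insert` inside `fragment`, ignoring case.

    Returns (start, end, left_flank, right_flank) with 0-based, end-exclusive
    coordinates, or None when the insert is not an exact substring.
    """
    frag = fragment.upper()
    ins = insert.upper()
    start = frag.find(ins)
    if start < 0:
        return None
    end = start + len(ins)
    return start, end, frag[:start], frag[end:]


# STS_hsa fragment and STS_hsa_ins, copied from the entries in these notes.
sts_hsa = (
    "GKWHLGMSCHSKTDFCHHPLHHGFNYFYGISLTNLRDCKPGEGSVFTTGFKRLVFLPLQIVGVTLLTLAALNC"
    "LGLLHVPLGVFFSLLFLAALILTLFLGFLHYFRPLNCFMMRNYEIIQQPMSYDNLTQRLTVEAAQFIQR"
)
sts_hsa_ins = (
    "TNLRDCKPGEGSVFTTGFKRLVFLPLQIVGVTLLTLAALNC"
    "LGLLHVPLGVFFSLLFLAALILTLFLGFLHYFRPLNCF"
)

hit = locate_insert(sts_hsa, sts_hsa_ins)
if hit is not None:
    start, end, left, right = hit
    print("insert length:", end - start)
    print("left flank ends with:", left[-10:])
    print("right flank starts with:", right[:10])
```

Exact matching is deliberately strict: a None result flags an insert that has drifted from its parent entry by even one residue.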

-=-=-=-=  reference set including flanking exons =-====--

>STS_hsa 583 aa 10 exons  79 aa
GKWHLGMSCHSKTDFCHHPLHHGFNYFYGISLTNLRDCKPGEGSVFTTGFKRLVFLPLQIVGVTLLTLAALNCLGLLHVPLGVFFSLLFLAALILTLFLGFLHYFRPLNCFMMRNYEIIQQPMSYDNLTQRLTVEAAQFIQR

>ARSD_hsa 593 aa 10 exons 79 aa
GKWHQGVNCASRGDHCHHPLNHGFDYFYGMPFTLTNDCDPGRPPEVDAALRAQLWGYTQFLALGILTLAAGQTCGFFSVSARAVTGMAGVGCLFFISWYSSFGFVRRWNCILMRNHDVTEQPMVLEKTASLMLKEAVSYIEr

>ARSE_hsa 589 aa 10 exons  79 aa
GKWHLGLNCESASDHCHHPLHHGFEHFYGMPFSLMGDCARWELSEKRVNLEQKLNFLFQVLALVALTLVAGKLTHLIPVSWMPVIWSALSAVLLLASSYFVGALIVHADCFLMRNHTITEQPMCFQRTTPLILQEVASFLKR

>ARSF_hsa 590 aa 10 exons  79 aa
GKWHQGLNCDSRSDQCHHPYNYGFDYYYGMPFTLVDSCWPDPSRNTELAFESQLWLCVQLVAIAILTLTFGKLSGWVSVPWLLIFSMILFIFLLGYAWFSSHTSPLYWDCLLMRGHEITEQPMKAERAGSIMVKEAISFLER

>ARSG_hsa 699 aa 11 exons 79 aa
GKWHLGLSCASRNDHCYHPLNHGFHYFYGVPFGLLSDCQASKTPELHRWLRIKLWISTVALALVPFLLLIPKFARWFSVPWKVIFVFALLAFLFFTSWYSSYGFTRRWNCILMRNHEIIQQPMKEEKVASLMLKEALAFIER

>STS_mmu 624aa mouse bad RefSeq chr? 79 aa
GKWHLGLSCRGATDFCHHPLRHGFDRFLGVPTTNLRDCRPGAGTVFGPALRVFAAGPLAALGASLAAMAAARWAGLARVPGWALAGTAAAMLAVGGPRSASCLGFRPANCFLMDDLAVAQRPTDYGGLTRRLADEAALFLRR 

>STS_rno 578aa rat chrX:87654459-87661379 79 aa
GKWHLGLSCQAASDFCHHPGRHGFDRFLGTPTTNLRDCKPGGGTVFGSAQQVFVVLPMNILGAVLLAMALARWAGLARPPGWVFGVTVAAMAAVGGAYVAFLYHFRPANCFLMADFTITQQPTDYKGLTQRLASEAGDFLRR

>ARSD_ssc pig Sus scrofa 73% nearly complete 79 aa
PSTLGSECHPGWPPQVGEALGGRLWLSTQMMALGVLTGAAGKTLGLVSVPWKFVWGAASLVLLFFGSWFASLGVLRRWNCILMRNHDVVEQPMALESTARLLSGEALSFIQR

>ARSE_ssc pig Sus scrofa 73% partial 59 aa
GKWHLGLNCESSEDHCHHPLNHGFDLFYGMPFSMMGDCLPSDISEKRVILERQVNVCCHIVALAALTLALGKLTRLTPGSWTPVVCSALAA

>ARSE_bta cow Bos taurus 71% complete 80 aa
gKWHLGLSCASPDDHCHHPLNHGFDHFYGMPFSMMADCERWELSEKRAVLESRLDVCFQLVALATLTLTIGKLTHLIPGASWTLVIWSAVVCLLLFATSCLVGALIMHADCFLMRNHSIAEQPMRSQRTTPLMLQEVSSFVKR

>STS_bta cow Bos taurus 82% partial 3 aa
GKWHLGISCHDPGDFCHHPTSHGFDYFHGLPLTNM

>ARSD_gga chicken Gallus gallus 60% nearly complete 79 aa
GVNCKSHRDHCHHPLNHGFEYFYGMSFTILNECQGTDDPELAKSSQDTYWLYTQIIFIAVLTLFvGKLTHLFSVKWKIIVCVTIFGLLYFLSWFSSYGFTKYWNCIMMRNHEITEQPMNLDKTTSNMLKEAVSFIER

>ARSD_xla_frag Xenopus laevis CA793586 59 aa
GKWHLGVNCRSRDDFCHHPLNHGFDYYYGLLYTLINDCQASMPSEIHVAFRAQLLFYAQLFAVTLLTAMVTKPNGILLQVSWKSSWPIFCPPSGE

>STS_xla_frag Xenopus laevis 63 aa
ptrpptrpaGAAKYIKATFQISFLALFTLVLISYSGLLNVPWKLIFYIVSVTSLLLGAVIFFFWNFQYLNCVLMRNDKIIQQPLVFDNLTQRITREALQYIKS
 
>ARSD_str_frag Silurana tropicalis 59 aa
GKWHLGLNCASRDDFCHHPNSHGFNYFYGMPFSLYSGCKPGSIPESPNSPKQQLSFVTQIIGFGVLTLTALKYSKILAINGKFLVSCAVFD

>STS_fru revisited 75 aa genomic frameshift, one 2-2 exon
GKWHLGLNCESRDDHCHHPNAHGFNYFFGIPLTNLRDCQPGHGTVFQIHKYLPYRTLGTVLASTVLLYFSGLISSAEKGPFAFWLQRFWSCSFIVGFIMIIPLFNCVLMKDHSIVEQPFVSENLTQRMTREAVDFIER

>ARSD_fru revisited 80 aa split exon 2-1 Actinopterygii
GKWHLGVNCERRGDHCHHPNQHGFSYFYGLPFTLFNDCVPGEGSDILADLQKTLRSFTIFLGIGLATLVRLIVVFQASFYSLRLLVVLFFISILATVLWLTPFKFIPTWNCILMRNQEVVEQPMVVETLPRRLLTEAQQFIKR

>STS_dre_frag zebrafish Danio rerio 76 aa BQ258682  
GKWHLGLNCEDSSDHCHHPNSHGFHYFYGTIMTHLRDCQPGHGSVILYNVYSHIPFKPLSIGLVSLVLHIRGMLTVSRRVFFSFLILVGLVLSLFRLLVYTFPNLNCFVMRGTEIVEQPYISENLTQRMTSEAIEFLER 

>ARSD_dre_frag zebrafish Danio rerio 5 aa CA377054 
GKWHLGVNCELRGDHCHHPNTHGFSYFYGLPFTLLND

>ARSD_omy_frag Oncorhynchus mykiss 40 aa CA386455 
rvcgllevSVSVIVAVTCLSLLAFSVWFVPFELLMTWNCIIMRNQEVVEQPMDLDTLSQRLLGEAQGFIER

>STS_cii 21 aa 47% ARSE/D/STS
GKWHLGINELKQNDGRHLPKHHGFDFVGTNLPFTFHLFCSPSEYPVDKMKIKCFLSNKDEIIEQPIIPEKLTDKIVEGAKQFIT

>STS2_cii 22 aa 34% STS/E/F/D
GKWHLGINRNTSTDGYHLPHNHGFDFVGTNLPLSHSEMCNPAEFTVEELSTMCFLYNGSTIVEQPVNLSTLTDRITSDAKNFIS

>ARSE_hpu 551aa sea urchin Hemicentrotus pulcherrimus 22 aa
GKWHLGINENSSTDGAHLPFNHGFDFVGHNLPFTNSWSCDDTGLHKDFPDSQRCYLYVNATLVSQPYQHKGLTQLFTDDALGFIE

>ARSE_her 559aa sea urchin Heliocidaris erythrogramma 22 aa 
GKWHLGINEQTSTDGAHLPFNHGFEYVGYNLPFTNSWNCDDTGLHVDFPNTEKCYLYKNATLVSQPYQHRNLTKLFTDDAIEFID

>ARSE_spu 567aa sea urchin Strongylocentrotus purpuratus 22 aa 
GKWHLGINENSSSDGAHLPANRGFDFVGHNLPFGNSWRCDDTGLHQDFPDTNACFLYYNSTSVAQPFQHKGLTQLLRDDTVGFIE

>ARSE_cel 452aa Sulf1_C.elegans 23 aa
GKWHLGINENNATDGAHLPSKRGFEYVGVNLPFTNVWQCDTTREFYDKGPDPSLCFLYDGDDIVQQPMKFEHMTENLVGDWKRFLM

>ARSE_cbr 421 aa Sulf1_C.briggsae 23 aa
GKWHLGINENNATDGAHLPSKRGFDYVGVNLPFTNVWQCDTTKEYYDKGPDPSLCFLYDGDDIVQQPMKFEHMTENLVGDWKRFLM

=-=-=-=-=-=-=-=-

Summary: It is likely that the insert arose only once, in the ancestor of the STS/ARSD sulfatases, and subsequently propagated through speciation and gene duplication. Since tunicate, sea urchin, and nematode sulfatases in the STS class all have short (21-23 aa) intervening sequences similar to non-insertional sulfatases [need what happens elsewhere between anchors], the most parsimonious scenario has



-=-=-=-= end of reference set including flanking exons =-=-=-=-=--
 

>STS_cii 21 aa according to ESTs BW300043 BW047068; exon breaks differently, after GKWHL
MLKFPLWLLLIILVNQVSSRPNFVLIFADDVGYGDFQSYGHPTQERGPIDDLAAEGMRFTQWYSAASLCTPSRAALLT
GRLPIRSGMVGPTRVLHQNDAGGLPKNETTLAEALKDLGYKTGMVGKWHL
GINELKQNDGRHLPKHHGFDFVGTNLPFTFHLFCSPSE
YPVDKMKIKCFLSNKDEIIEQPIIPEKLTDKIVEGAKQFITENQKNPFFLYLSLPQTHVAMFCKEEFCNKSMRG
SYGDNVNEMSWAVGEVVNQLKDLNLDRNTLVMFLSDHGPAVEFCYT
GGSTGGLKGGKASS
WDGGIKVPAVAWWPGTIQPGVKTQVVSTMDIFPTFLQLAGN
SDDQAYYMMLLDYDITQVMKETMETSMECQSPIFF
cNEGNNGNLDGMSISDLLLSNHDNEVHEILFHYCSDRLMAVRYGRYK
IHFHTQHLHVFNSNCIDGKALENIVDYFDCYANTTTHNPPLIFDINTDPEELFPLEAAPRAHIIEEVEKQVAKHQKTIKPVASQLGRHGKDLQPCCNPPSCVCNYPNPDKR
 

>STS2_cii 8 exons 483aa ci0100146549 best 34% STS_cii 43%
PNFIVIMADDIGYGDFQSFGHPTQEYGGVDRMVKEGMRFTQWTSAATLCSPSRAALLTG
RYAIRSGLRGDVAPVFQPQSV
GGLPRKEITIAESLKALGYRTGLVGKWHL
GINRNTSTDGYHLPHNHGFDFVGTNLPLSHSEMCNPAE
FTVEELSTMCFLYNGSTIVEQPVNLSTLTDRITSDAKNFISNNRLNSFFLYFSPPQAHRALFCAERFCGRSKRG
PYGDTINEMSSAISDILDHLVQLEIDDNTLVIFLSDHGPNSDKCPDGGVPGLFKGTGKGTT
WEGGLRVPAVAWWPGVI
PAGTVSNAVVSTLDVHPTLLKIAALQNQKPIPSKLFDGIPIPDLICSMKHQRTSSCLSTPSNR
ILFHYCGEDILAVRYG
DLKFHFKSNPPLQRRSNCVRTVTADLIRTFSCGKR
THDPPLVFNLLIDPSEEIPLNISHYSEELSEVQRLIRKHKRSIKEVPAQYSPNVPEVQPCCNPPSCICNY


>STS2_cii 8 exons 483aa  21 aa N
GKWHLGINRNTSTDGYHLPHNHGFDFVGTNLPLSHSEMCNPAEvcfmyyciaclinrvtclnkyeiknlfiyfqFTVEELSTMCFLYNGSTIVEQPVNLSTLTDRITSDAKNFIS

atagggatcaaccgaaatacatcaacagatggttatcatttaccacataatcatggcttt
 I  G  I  N  R  N  T  S  T  D  G  Y  H  L  P  H  N  H  G  F 
gactttgttggcaccaaccttcctttatctcattcggagatgtgtaatccagcagaggtc
 D  F  V  G  T  N  L  P  L  S  H  S  E  M  C  N  P  A  E  V 
tgttttatgtattattgtatcgcatgcttaataaatagggtaacctgcttaaataaatat
 C  F  M  Y  Y  C  I  A  C  L  I  N  R  V  T  C  L  N  K  Y 

atatgtaaaagctttttttaaattaagaatttgtttatttattttcagtttactgtggag
 I  C  K  S  F  F  -  I  K  N  L  F  I  Y  F  Q  F  T  V  E 
gaattgtccacaatgtgtttcctttacaacggttctactatagtggaacaacctgtcaat
 E  L  S  T  M  C  F  L  Y  N  G  S  T  I  V  E  Q  P  V  N 
ttaagcacattaactgacagaataacaagtgatgcaaagaattttatatca
 L  S  T  L  T  D  R  I  T  S  D  A  K  N  F  I  S  

>ARSE_hpu 551aa sea urchin Hemicentrotus pulcherrimus 22 aa
GKWHLGINENSSTDGAHLPFNHGFDFVGHNLPFTNSWSCDDTGLHKDFPDSQRCYLYVNATLVSQPYQHKGLTQLFTDDALGFIE

>ARSE_her 559aa sea urchin Heliocidaris erythrogramma 22 aa 
GKWHLGINEQTSTDGAHLPFNHGFEYVGYNLPFTNSWNCDDTGLHVDFPNTEKCYLYKNATLVSQPYQHRNLTKLFTDDAIEFID

>ARSE_spu 567aa sea urchin Strongylocentrotus purpuratus 22 aa 
GKWHLGINENSSSDGAHLPANRGFDFVGHNLPFGNSWRCDDTGLHQDFPDTNACFLYYNSTSVAQPFQHKGLTQLLRDDTVGFIE

>ARSE_cel 452aa Sulf1_C.elegans 23 aa
GKWHLGINENNATDGAHLPSKRGFEYVGVNLPFTNVWQCDTTREFYDKGPDPSLCFLYDGDDIVQQPMKFEHMTENLVGDWKRFLM

>ARSE_cbr 421 aa Sulf1_C.briggsae 23 aa
GKWHLGINENNATDGAHLPSKRGFDYVGVNLPFTNVWQCDTTKEYYDKGPDPSLCFLYDGDDIVQQPMKFEHMTENLVGDWKRFLM

-=-==-  working on above ==-=-==-
>STS_dre screwed up with fugu
0 mplr 2
1 kmkIPCTFCLLLYTADAGSGTKPNFVFMMVDDLGIGDLGCYGNTTLR 2
1 TPNIDRVALEGVKLTQHIVXAPLCTPSRAAFLTGRYPVRS 1
2 GMAAHGHMGVFLFSASSGGLPQEEITFAKAVKVQGYSTAVI 1
2 GKWHLGLNCESRDDHCHHPNAHGFNYFFGIPLTNLRDCQPGHGTVFQIHKYLPYRTLGTVLASTVLL 2
1 NSARPFLLFFSFLQVHTAMFASAAFRATSQHGIYGDAVHEVDWSV 1
2 GQIMQALDKFNLKDDTLVYLTSDQGGHVEEISATGVVQGGWNGIYK 1
2 AGKATNWEGGIRVPGILRWPGKIPGGRKIDEPTSNMDLFPTVVQLSGASVPLDR 2

>STS_dre_frag zebrafish Danio rerio
MKSFQWIPCTFCLLLYTADAGSGTKPNFVFMMVDDLGIGDLGCYGNTTLRTPNIDRLALEGVKLTQHIAAAPLCTPSRAAFLTGRYPIRSGMAAHGHMGVFLFSASSGGLPQEEITFAKAVKVQGYSTAVIGKWHLGLNCEDSSDHCHHPNSHGFHYFYGTIMTHLRDCQPGHGSV 

>ARSD_dre_frag zebrafish Danio rerio 5 aa
CLLALLLLLDTGSDVTASEEDRKPNFVLMMVDDLGIGDIGCYGNDTIRTPNIDRLAAEGVKLTQHIAAAPLCTPSRAAFHTGRYALRSGLGSTGRVQVLLFLGGSGGLPPTETTFAKRLQEQGYTTGLVGKWHLGVNCELRGDHCHHPNTHGFSYFYGLPFTLLND

>ARSD_omy_frag Oncorhynchus mykiss 
RVCGLLEVSVSVIVAVTCLSLLAFSVWFVPFELLMTWNCIIMRNQEVVEQPMDLDTLSQRLLGEAQGFIERNADRPFLLFLSLAHVHTPLFSSPGFAGKSRHGLYGDNVEEVDYMIGRMTETVDRLGLANNTMMYFTSDHGGHIEDADERGQKGGWNGIYKGGKAMGGWEGGIRVPGIFRWPGRLPAGRVFDTPTSLMDLYPTLTHLAGATHSDRLLDGYDLMPVLEGRTERSQHEF

>STS_fru revisited 75 aa genomic frameshift, one 2-2 exon
GKWHLGLNCESRDDHCHHPNAHGFNYFFGIPLTNLRDCQPGHGTVFQIHKYLPYRTLGTVLASTVLLYFSGLISSAEKGPFAFWLQRFWSCSFIVGFIMIIPLFNCVLMKDHSIVEQPFVSENLTQRMTREAVDFIER

agctttacatcatcgtattttctgttgttcagggaaatggcaccttggactcaactgtgag
                                G  K  W  H  L  G  L  N  C  E 
agcagagatgatcactgccaccaccccaatgctcacggctttaactatttttttgggatc
 S  R  D  D  H  C  H  H  P  N  A  H  G  F  N  Y  F  F  G  I 
ccgttgaccaacctccgggactgccagccaggacatggtacggtctttcagatccataag
 P  L  T  N  L  R  D  C  Q  P  G  H  G  T  V  F  Q  I  H  K 
tacctaccgtacaggacgctaggcaccgtgttggcttctacagtcttactgtacttcagt
 Y  L  P  Y  R  T  L  G  T  V  L  A  S  T  V  L  L  Y  F  S 
ggactcatcagttcagcagaaaaaggaccttttgccttctggctgcagcggttttggtcgt
 G  L  I  S  S  A  E  K  G  P  F  A  F  W  L  Q  R  F  W  S 

gtacctaccgtacaggacgctaggcaccgtgttggcttctacagtcttactgtacttcag
 V  P  T  V  Q  D  A  R  H  R  V  G  F  Y  S  L  T  V  L  Q 
tggactcatcagttcagcagaaaaaggaccttttgccttctggctgcagcggttttggtc
 W  T  H  Q  F  S  R  K  R  T  F  C  L  L  A  A  A  V  L  V 
gtgagtttcatagtaggattcatcatgattatcccgttattcaactgtgttctcatgaag
 V  S  F  I  V  G  F  I  M  I  I  P  L  F  N  C  V  L  M  K 
gaccacagcattgtggagcagccgtttgtatcagaaaatctgacccaaaggatgacgcgt
 D  H  S  I  V  E  Q  P  F  V  S  E  N  L  T  Q  R  M  T  R 
gaggccgtggacttcatagaaaggtagagactgtgaaaggttgcctctgaactgctattg
 E  A  V  D  F  I  E  R     
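
The two translation blocks above read overlapping fugu genomic sequence in different frames, which is how the genomic frameshift shows up. A minimal sketch of that frame comparison, assuming the standard genetic code and writing stops as "-" as in these notes (the names translate, CODON_TABLE, BASES and AMINO are illustrative only):

```python
BASES = "TCAG"
# Standard genetic code laid out in TCAG order (first, second, third base).
# Stop codons are written "-" to match the translations in these notes.
AMINO = "FFLLSSSSYY--CC-WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, frame=0):
    """Translate `dna` starting at offset `frame` (0, 1 or 2).

    Trailing bases that do not fill a codon are dropped; codons containing
    anything other than A, C, G, T are reported as "X".
    """
    seq = dna.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# First line of the second STS_fru block above.
line = "gtacctaccgtacaggacgctaggcaccgtgttggcttctacagtcttactgtacttcag"
print(translate(line, 0))  # VPTVQDARHRVGFYSLTVLQ
print(translate(line, 1))  # YLPYRTLGTVLASTVLLYF
```

Frame 0 reproduces the reading printed under that line, while frame 1 recovers the YLPYRTLGTVLASTVLLYF reading from the first block.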

>STS_fru 9 exons 496 aa Scaffold_4788:7785-13384 gene missing (61% STS_hsa) insert missing 
GKWHLGLNCESRDDHCHHPNAHGFNYFFGIPLTNLRDCQPGHGTVFQIHKYLPYRTLGTVLASTVLL 2
1 NSARPFLLFFSFLQVHTAMFASAAFRATSQHGIYGDAVHEVDWSV 1
2 GQIMQALDKFNLKDDTLVYLTSDQGGHVEEISATGVVQGGWNGIYK 1
2 AGKATNWEGGIRVPGILRWPGKIPGGRKIDEPTSNMDLFPTVVQLSGASVPLDR 2

>ARSD_fru 10 exons 500 aa Scaffold_771:84193-88005 (58% ARSD_hsa) 88500--84180 first exon unproven
2 GKWHLGVNCERRGDHCHHPNQHGFSYFYGLPFTLFNDCV 2
1 NVDRPFLLFFSMAHIHTPLFRNPAFSGKSLHGLYGDNIEEVDWMI 1
2 GKMTETVDSLGLANNTLMYFTSDHGGHLEDSNSRVGQQGGWNSIYR 1
2 GGKAMGGWEGGIRVPAIFRWPGRLAPGRVVHEPTSLMDLYPTLKYLAGDTQPDR 2
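
The digits flanking each exon line above appear to record split-codon phase. A minimal sketch of a consistency check, assuming each digit counts the codon bases contributed at that exon edge, so that the digit ending one exon plus the digit starting the next must be 0 or 3 (this reading, and the names parse_exon_lines and junctions_consistent, are assumptions rather than a convention stated in these notes):

```python
import re

# Assumed layout of one exon line: "<start digit> <SEQUENCE> <end digit>".
EXON_LINE = re.compile(r"^\s*([012])\s+([A-Za-z]+)\s+([012])\s*$")


def parse_exon_lines(lines):
    """Return a list of (start_phase, sequence, end_phase) tuples.

    Lines that do not match the assumed layout are skipped.
    """
    exons = []
    for text in lines:
        match = EXON_LINE.match(text)
        if match:
            exons.append((int(match.group(1)), match.group(2), int(match.group(3))))
    return exons


def junctions_consistent(exons):
    """For each adjacent exon pair, check the two edge digits complete a codon."""
    return [(prev[2] + nxt[0]) % 3 == 0 for prev, nxt in zip(exons, exons[1:])]


# The four ARSD_fru exon lines listed above.
arsd_fru_lines = [
    "2 GKWHLGVNCERRGDHCHHPNQHGFSYFYGLPFTLFNDCV 2",
    "1 NVDRPFLLFFSMAHIHTPLFRNPAFSGKSLHGLYGDNIEEVDWMI 1",
    "2 GKMTETVDSLGLANNTLMYFTSDHGGHLEDSNSRVGQQGGWNSIYR 1",
    "2 GGKAMGGWEGGIRVPAIFRWPGRLAPGRVVHEPTSLMDLYPTLKYLAGDTQPDR 2",
]
arsd_fru_exons = parse_exon_lines(arsd_fru_lines)
print(junctions_consistent(arsd_fru_exons))
```

A False entry would point at a junction where an exon boundary or a phase digit needs rechecking against the genomic sequence.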

87153 2 GKWHLGVNCERRGDHCHHPNQHGFSYFYGLPFTLFNDCVPGEGS DILADLQKTLRSFTIFLGIGLATLvrlivvfqasfyrlag  
                                                                          2    00      2
hlshfqalvrrcGLLDISLRLLVVLFFISILATVLWLTPF KFIPTWNCILMRNQEVVEQPMVVETLPRRLLTEAQQFIKR 86231 gt phase 2
        2        1

>ARSD_fru revisited 80 aa split exon 2-1
KWHLGVNCERRGDHCHHPNQHGFSYFYGLPFTLFNDCVPGEGSDILADLQKTLRSFTIFLGIGLATLVRLIVVFQASFYSLRLLVVLFFISILATVLWLTPFKFIPTWNCILMRNQEVVEQPMVVETLPRRLLTEAQQFIKR


2 gtaagtggcacttgggagtgaactgtgagcgcagaggggaccactgccaccacccaaaccagcacggcttcagctacttctatggcctccccttcaccctgttcaacgactgtgtgcctggggagggcagcgacatcctggcagacctgcagaaaacgctccgaagcttcaccatttttctgggcatcggactggcaacactggtacggctcatcgttgttttccaggcctctttttacag  cctgcggttgctggtggtgcttttttttatcagtatcctggcaaccgttctgtggttgacgccttttaaattcatcccgacctggaactgcatcctcatgagaaaccaggaggtggtcgagcagccgatggtggtggagacgctgccccggagactgctgacggaagcccaacagtttattaaaag 2

>STS_tni 496 aa 9 exons 86% fugu CONTIG_8306_1 + CONTIG_22961_1
2 GKWHLGLSCESRDDHCHHPNAHGFNYFFGIPLTNLRDCQPGHGTVFQIHKYLPYRTLGIIFASTVLLS 2
1 NSAKPFLLFFSFLQVHTAIFASAAFRGTSQHGIYGDAVHEVDWSV 1
2 GQIMETLDRFNLGDHTLVYLTSDQGGHVEEISAAGVVQGGWNGIYK 1

>ARSD_tni 500 aa CONTIG_24273_2 first exon unproven 2 GC-AG
2 GKWHLGVNCERRGDHCHHPNQHGFSYFYGLPFTLFNDCV 2
1 NVDRPFLLFFSMVHVHTPSLQKPAFAGKSLHGLYGDNIEEVDWMI 1
2 GKMTETVDSLCLANNTLMYFTSDHGGHIEGSTSRASQQGGWNSFYK 1

>STS_dre 117aa zebrafish BG799792, to human 78% 5-116
FLLIMADDLGIGDLGCYGNRTLRTPRTPHIDRLALEGVKLTQHLAAAPLCTPSRAAFLTGRYPVRSGMASHGRLGVFLFSASSGGLPPNEVTFAKLLKGQGYTTGLV
GKWHLGLSCQ

=--=-=-=-=-=-=-=-=-=-==

>ARSD_gga chicken Gallus gallus ???

>ARSD_xla_frag Xenopus laevis
MALIPAVFFLLWASTASQAHGNKPNFVLLMADDLGIGEVGCYGNNTLRTPNIDRLAREGVKLTHHIAASSLCTPSRAAFLTGRYPIRSGMTGHDGGYLVLMWSAVSGGLPTNETTFAKILQEQGYTTGII
GKWHLGVNCRSRDDFCHHPLNHGFDYYYGLLYTLINDCQASMPSEIHVAFRAQLLFYAQLFAVTLLTAMVTKPNGILLQVSWKSSWPIFCPPSGE

>STS_xla_frag Xenopus laevis 63 aa
ptrpptrpaGAAKYIKATFQISFLALFTLVLISYSGLLNVPWKLIFYIVSVTSLLLGAVIFFFWNFQYLNCVLMRNDKIIQQPLVFDNLTQRITREALQYIKS
NKDTPFLLFVSYVQVHTALYASQDFIGKSNHGIYGDATEEVDWSVGELLNELDRSHLQNKTVVYFTSDNGAHLEEISSSGEVHGGC
 
>ARSD_str_frag Silurana tropicalis 59 aa
LYLRTNSQLLLQFSSHRTPNIDRLAKEGLKLTQHISAAPLCTPSRAAFMTGRYPIRSGMDFSNGFRVIVSAAVSAGLPSNETTFATILQQQGYSTGLIGKWHLGLNCASRDDFCHHPNSHGFNYFYGMPFSLYSGCKPGSIPESPNSPKQQLSFVTQIIGFGVLTLTALKYSKILAINGKFLVSCAVFD
 
>ARSD_gga chicken Gallus gallus 60% nearly complete
GVNCKSHRDHCHHPLNHGFEYFYGMSFTILNECQGTDDPELAKSSQDTYWLYTQIIFIAVLTLFvGKLTHLFSVKWKIIVCVTIFGLLYFLSWFSSYGFTKYWNCIMMRNHEITEQPMNLDKTTSNMLKEAVSFIERNKHRPFLLFLSLLHVHTPLITTKEFLGRSRHGLYGDNVEEMDWMVGRLLDVIDKEGLKNTTFIYFASDHGGSVEAHRGNVRLGGWNGIYKGKILVACTYIR

>ARSD_ssc pig Sus scrofa 73% nearly complete
...PSTLGSECHPGWPPQVGEALGGRLWLSTQMMALGVLTGAAGKTLGLVSVPWKFVWGAASLVLLFFGSWFASLGVLRRWNCILMRNHDVVEQPMALESTARLLSGEALSFIQRHKPGPFLLFVSLLHVHVPLMTTKEFQGKSQHGLYGDNVEEMDGLVGDILNAIEEHGLKNTTLTYFTSDHGGHLEAIDGHVQLGGWNGIYRGGKGMGGWEGGIRVPGIFRWPGVLPAGRVIQEPTSLMDVFPTVVQLGGGQVPQDRVIDGRSLVPLLQGETEHSAHEFLFHYCGEHLHAARWHDKDSGRLW

>ARSE_ssc pig Sus scrofa 73% partial
MLSFRSGLALTIGVLLGSKPSAYGDLSASRPNILLLMADDLGIGDLGCYGNHTIRTPNIDRLAADGVMLTQHLAAASLCTPSRAAFLTGRYPLRSGMVSSTGSRVLQWVAASGGLPPNETTFAKILKDKGYVTGLVGKWHLGLNCESSEDHCHHPLNHGFDLFYGMPFSMMGDCLPSDISEKRVILERQVNVCCHIVALAALTLALGKLTRLTPGSWTPVVCSALAA

>ARSE_bta cow Bos taurus 71% complete
gKWHLGLSCASPDDHCHHPLNHGFDHFYGMPFSMMADCERWELSEKRAVLESRLDVCFQLVALATLTLTIGKLTHLIPGASWTLVIWSAVVCLLLFATSCLVGALIMHADCFLMRNHSIAEQPMRSQRTTPLMLQEVSSFVKRHKQGPFLLFVSFLHVHTPLVTT

>STS_bta cow Bos taurus 82% partial
MAWDMMTLLLLLLFLCEAQSRAASKPNFVLLMADDLGIGDPGCYGNKTLRTPNIDRLARGGVKLTQHLAASPLCTPSRAAFMTGRYPVRSGMASQSQVGVFLFSASSGGLPPSEITFAKLLKDQGYSTALIGKWHLGISCHDPGDFCHHPTSHGFDYFHGLPLTNM



>M86934
MDGLLLDTERLYSVVFQEICNRYDKKYSWDVKSLVMGKKALEAAQIIIDV
LQLPMSKEELVEESQTKLKEVFPMAALMPGAEKLIIHLRKHGIPFALATS
SGSASFDMKTSRHKEFFSLFSHIVLGDDPEVQHGKPDPDIFLACAKRFSP
PPAMEKCLVFEDAPNGVEAALAAGMQAVMVPDGNLSRDLTTKATLVLNSL
QDFQPELFGLPSYE

>AF167081
MSPKPRASGPPAKATEAGKRKSSSQPSPSDPKKKTTKVAKKGKAVRRGRRGKKGAATKMAAVTAPEAESAPAAPGPSDQPSQELPQHELPPEEPVSEGTQHDPLSQEAELEEPLSQESEVEEPLSQESQVEEPLSQESEVEEPLSQESQVEEPLSQESEVEEPLSQESQVEEPLSQESEMEEPLSQESQVEEPPSQESEMEELPSV


=-=-=-=-=-=-=-

groomed seqs