GenMap00: how the map was made
Last updated: 7 Feb 00
Previous maps of the human genome, such as GenMap99 and its predecessors, created a framework that locates a large number of microsatellites used to determine the locations of human disease genes. While radiation hybrid mapping may in principle be capable of sufficiently detailed mapping, as matters stand the map in the chromosome 20p12 region lacks sufficient precision. Many markers are allocated to multiple (contradictory) positions, which prevents a unique ordering and yields no meaningful recombination mapping distances.
This causes monogenic disease research to stall out at a critical juncture -- associating the disease with a particular gene. If the disease can only be mapped to a 3 million base pair region, gene density is such that 30-50 different genes or more would have to be screened in the patient set and controls.
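The arithmetic behind that screening burden can be sketched as follows. This is a rough illustration only: the two densities are simply the ones implied by "30-50 genes in 3 million base pairs", gene density is assumed uniform across the interval, and the function name is hypothetical.

```python
def genes_to_screen(region_bp, genes_per_mb):
    """Expected number of candidate genes in a mapped interval."""
    return region_bp / 1_000_000 * genes_per_mb

# Densities implied by "30-50 genes in 3 million base pairs" (assumed uniform).
LOW_DENSITY = 30 / 3    # 10 genes per megabase
HIGH_DENSITY = 50 / 3   # about 16.7 genes per megabase

for region_bp in (3_000_000, 1_000_000, 500_000):
    low = genes_to_screen(region_bp, LOW_DENSITY)
    high = genes_to_screen(region_bp, HIGH_DENSITY)
    print(f"{region_bp / 1e6:.1f} Mb interval: roughly {low:.0f}-{high:.0f} genes to screen")
```

Under these assumptions the candidate list shrinks in proportion to the interval, which is why tighter marker placement pays off directly in screening effort.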
A different approach was taken in the Whitehead Institute map. Here, chromosome 20 was tiled with a set of overlapping yeast artificial chromosomes (YACs); markers were positioned relative to hits on this panel. This resulted in an ordinal map: microsatellites were sequentially ordered with respect to their telomere to centromere position without any physical or centimorgan distances resulting. It quickly emerges that this map is superior in accuracy to GenMap99.
Meanwhile, the Sanger Centre continues direct sequencing of the chromosome itself. At this time, chr 20 is half done, with equal parts finished and unfinished (sequenced but unassembled contigs). These data and the associated WebAce database realization of mapping status, while released daily, are user-unfriendly. However, they resurface at GenBank in the high-throughput genome sequence (htgs) database and in the repository of finished chromosome 20 reference contigs (NT_00xxxx).
It is this third sequence map that enables integration of GenMap99 and the Whitehead map into GenMap00. The NT_00xxxx web page lists finished contigs in order of physical position and provides links to bare sequence, to annotated GenBank entries when available, and to some microsatellite and gene STS markers on each contig.
Needless to say, it is not so simple. The three maps do not use the same set of microsatellites and other markers, though there is considerable partial overlap (Venn diagram). Worse, each of 5-6 mapping centers using the composite marker set felt obligated to assign its own name (synonym), with the result that maps cannot be visually compared for a given marker. Worse still, no single look-up site (1, 2) carries more than a few of the name equivalencies; some older names were even withdrawn or given new definitions. This made the maps nearly impossible to use directly with the disease mapping nomenclature given in the medical literature.
GenMap00 had to address this synonymy issue early on in order to combine and consolidate the 3 maps. With a 'find' operation on the last 4 characters, say in Netscape, it is now easy to find a marker of interest in the final map, even if the name did not appear in one of the constituent maps.
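The last-4-characters lookup can be mimicked outside a browser with a small index. This is a minimal sketch, not the way GenMap00 itself is built: the function names are hypothetical, matching is made case-insensitive on the assumption that a browser 'find' behaves that way, and the name/synonym pairings in `ROWS` are illustrative rather than read from the map.

```python
from collections import defaultdict

def build_suffix_index(rows, n=4):
    """Map the last n characters of every name or synonym to its row numbers."""
    index = defaultdict(set)
    for row_number, names in rows.items():
        for name in names:
            index[name.strip().lower()[-n:]].add(row_number)
    return index

def find_marker(index, query, n=4):
    """Return the rows whose name or synonym ends like the query."""
    return sorted(index.get(query.strip().lower()[-n:], ()))

# Illustrative rows: marker name first, then synonyms (pairings are assumed).
ROWS = {
    1: ["AFM338td9", "D20S199"],
    2: ["WI-15102", "H28185"],
    3: ["AFMA057VB1", "A057VB1"],
}
INDEX = build_suffix_index(ROWS)

print(find_marker(INDEX, "057vb1"))   # a differently prefixed spelling of row 3
print(find_marker(INDEX, "D20S199"))  # a D-number used in the disease literature
```

A four-character suffix can collide between unrelated markers, so each hit still needs a glance at the full name.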
The STS and gene marker sets shown on sequenced NT_00xxxx contigs are very incomplete, even though a nearby division of GenBank maintains the STS database against which Blast searches might have been conducted (best done without repeat-masking for simple repeats). In making GenMap00, each of the annotated entries was opened and searched for markers ('misc_feature'), allowing for synonymy. Additionally, each of the residual markers on the Whitehead and GenMap99 maps was used as a filtered Blastn query against the unfinished htgs and finished nrn GenBank databases.
These matches must be evaluated with due consideration for misleading hits caused by the simple repeat nature of microsatellites. Some markers additionally contain ALU and similar elements; others are mRNAs from a parent gene on a different chromosome that match a pseudogene. Matches to unfinished contigs can also fail to the extent that the marker overlaps an end.
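Opening each annotated entry and searching its 'misc_feature' lines can be sketched as below. This is only an approximation: it relies on the GenBank flat-file layout (feature key in column 6, qualifiers from column 22), the sample entry is invented, the synonym pairings are illustrative, and the function names are hypothetical.

```python
import re

def misc_feature_blocks(genbank_text):
    """Return the text of each misc_feature in a GenBank flat-file feature table."""
    blocks, current = [], None
    for line in genbank_text.splitlines():
        is_key_line = line.startswith(" " * 5) and line[5:6].strip() != ""
        is_continuation = line.startswith(" " * 21)
        if is_key_line:
            if current is not None:
                blocks.append(" ".join(current))
            key, _, rest = line.strip().partition(" ")
            current = [rest.strip()] if key == "misc_feature" else None
        elif is_continuation and current is not None:
            current.append(line.strip())
        elif not is_continuation:
            if current is not None:
                blocks.append(" ".join(current))
            current = None
    if current is not None:
        blocks.append(" ".join(current))
    return blocks

def markers_found(genbank_text, synonyms):
    """Return primary marker names for which any synonym appears in a misc_feature."""
    tokens = set()
    for block in misc_feature_blocks(genbank_text):
        tokens.update(t.lower() for t in re.findall(r"[A-Za-z0-9_-]+", block))
    return {primary for primary, names in synonyms.items()
            if any(name.lower() in tokens for name in names)}

# Invented entry fragment, built line by line to keep the column layout exact.
SAMPLE = "\n".join([
    "FEATURES             Location/Qualifiers",
    "     source          1..153000",
    " " * 21 + '/organism="Homo sapiens"',
    "     misc_feature    1200..1450",
    " " * 21 + '/note="STS marker WI-15102"',
    "     misc_feature    90210..90480",
    " " * 21 + '/note="microsatellite, synonym of',
    " " * 21 + 'D20S199"',
    "ORIGIN",
])
# Illustrative synonym table (pairings are assumed, not taken from the map).
SYNONYMS = {
    "WI-15102": ["WI-15102", "H28185"],
    "AFMa131wf1": ["AFMa131wf1", "D20S199"],
    "WI-9632": ["WI-9632"],
}
print(sorted(markers_found(SAMPLE, SYNONYMS)))
```

Matching on whole tokens rather than substrings keeps a short name from matching inside a longer one.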
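One way to flag hits that rest mostly on simple repeat is to measure how much of the aligned stretch consists of short tandem repeats. The sketch below uses a regular expression as a stand-in for that judgment; it is not the filter used for GenMap00, and the unit length (1-4 bases), minimum copy number (5), and 0.5 threshold are all assumptions chosen for illustration.

```python
import re

# A unit of 1-4 bases followed by at least four more copies of itself.
TANDEM = re.compile(r"([ACGT]{1,4}?)\1{4,}")

def simple_repeat_fraction(sequence):
    """Fraction of the sequence covered by short tandem repeats."""
    seq = sequence.upper()
    if not seq:
        return 0.0
    covered = sum(m.end() - m.start() for m in TANDEM.finditer(seq))
    return covered / len(seq)

def is_suspect_hit(aligned_query, max_repeat_fraction=0.5):
    """Flag an alignment whose matched region is mostly simple repeat."""
    return simple_repeat_fraction(aligned_query) > max_repeat_fraction

hit = "TTGC" + "CA" * 8 + "GGAT"   # invented aligned segment with a CA repeat
print(round(simple_repeat_fraction(hit), 2), is_suspect_hit(hit))
```

A hit flagged this way is not necessarily wrong; it only means the unique flanking sequence, not the repeat, should carry the placement.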
GenMap00 also takes in various lists of protein-coding genes that have been compiled for this chromosomal region, provided that tblastn of the protein against finished or unfinished chr 20 could validate a location. This was supplemented by direct annotation of certain contigs; this varied from high intensity characterization of all features using GeneBander to quick-pass searches for easily identified known genes. Large pieces of several unfinished contigs were also annotated.
Genes and proteins also have a confused nomenclature arising from use of names that came up in weak homology matches (eg lactate dehydrogenase for goliath protein); proteins in this region mainly have poorly understood roles and so no logical names at this time. Many STS markers are mRNAs from these genes; in fact, it is not unusual for a half dozen slightly different STS markers to originate from a single gene. GenMap99 tended to use mediocre computer-generated EST annotations but at least this gave the same name, however inappropriate, to some nearby markers.
The NT_00xxxx map proposes a telomere-to-centromere order for most of its entries, even going beyond this to give absolute physical distances in kilobase units, with gaps estimated as well. This order is independent of, but largely consistent with, the YAC and radiation maps; it ultimately derives from much more detailed sequencing work at the Sanger Centre. However, certain NT_00xxxx contigs in the chr20p12 region lie within the unlabelled set at the bottom; apparently these await further processing. Some of these fail to be in the main non-redundant databases as well.
Orientation is not fully resolved. That is, a particular NT_00xxxx contig might be in correct relative physical order but be 'upside-down' relative to the telomere-centromere axis, ie, the reverse-complement sequence should have been provided. This is less likely in the case of megabase contigs. Contigs containing several markers from the Whitehead ordinal map allow for a consistency check. GenMap00 uses the NT_00xxxx contigs as given; unfinished contigs are intercalated less reliably, mainly based on Whitehead map order.
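The consistency check for contigs carrying several Whitehead markers can be sketched as follows. It assumes that Whitehead ordinals increase from telomere to centromere and that markers are listed in their order along the contig as deposited; the function name and return labels are hypothetical.

```python
def contig_orientation(whitehead_ordinals):
    """Compare marker order along a contig with Whitehead ordinal positions.

    whitehead_ordinals lists, in contig order, each marker's Whitehead
    ordinal, or None when the marker is absent from that map.
    """
    ranks = [r for r in whitehead_ordinals if r is not None]
    steps = [b - a for a, b in zip(ranks, ranks[1:])]
    rising = any(s > 0 for s in steps)
    falling = any(s < 0 for s in steps)
    if rising and falling:
        return "inconsistent"
    if rising:
        return "as given"
    if falling:
        return "reverse-complement"
    return "undetermined"   # fewer than two distinct ordinals on the contig

print(contig_orientation([None, 1, 2, None, 3]))
print(contig_orientation([9, 8, None, 7]))
```

An "inconsistent" result does not say which map is wrong; it only marks the contig for a closer look.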
On the plus side, nothing beats a long, reliably determined DNA sequence for establishing microsatellite position and order. GenMap00 has many instances where the absolute kilobase distance and order are established for sizeable groups of markers commonly used on the radiation hybrid and Whitehead maps as well as in disease research. Even marker sets in unfinished contigs (unordered internal pieces) helpfully block up and localize common map markers. It is important to use these because half the data is in this form.
GenMap00 is not a static object. Being a 'leading indicator' directly tied to the sequencing of chromosome 20, it rapidly converges to unambiguous status, though functional annotation of all its genes will take many years. GenMap00 is not exhaustive either: while most of the commonly used markers are precisely located, more of the constituent contigs could be annotated at higher intensities now. New markers can be designed directly from low complexity sequence in areas where needed. There are no obstacles to extending the strategies of GenMap00 to the whole genome.
For disease genes, where time is of the essence, the best strategy is to intensify classical microsatellite mapping using old and new markers of GenMap00 while at the same time intensifying gene annotation in the critical region. The synergy is a double winnowing: the disease becomes better localized while genes plausible for the phenotype become better annotated. This reduces the number of genes that have to be sequenced for confirmation in patients and controls, which is especially important in the circumstance of non-coding mutation.
The columns of GenMap00 are as follows:
1: marker number
2: disease marker
3: mouse synteny on chromosome 2
4: marker name
5: marker synonyms
6: gene symbol and name (to extent known)
7: GenMap99 map position
8: Whitehead YAC map order
9: GenMap00 cluster number
10: sequential order of markers occurring within a given contig, 0 if not known
11: NT_00xxxx finished RefSeq contig
12: GenBank accession of unfinished unassembled contigs
13: size of contig in kilobase pairs
14: physical distance from telomere in kilobase pairs
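Columns 11, 13 and 14 can be cross-checked against one another, since the span in column 14 should be about as long as the contig size in column 13. The record below is a small sketch of that check; the class and field names are hypothetical, the 1 kb tolerance is an assumption, and the example values are the first entries of those three columns, assumed to belong to the same row.

```python
from dataclasses import dataclass

@dataclass
class ContigPlacement:
    refseq_contig: str      # column 11
    size_kb: int            # column 13
    telomere_span_kb: str   # column 14, written as "start-end"

    def span(self):
        """Split the column 14 entry into integer start and end positions."""
        start, end = self.telomere_span_kb.split("-")
        return int(start), int(end)

    def size_agrees(self, tolerance_kb=1):
        """True when the span length matches the contig size within tolerance."""
        start, end = self.span()
        return abs((end - start) - self.size_kb) <= tolerance_kb

first_row = ContigPlacement("NT_002249", 153, "2378-2531")
print(first_row.span(), first_row.size_agrees())
```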
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 34 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 | - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - hss - hss - - - - - - - - - - - - - - - - hss - ch2 hss - - - - - - - - - - - - - - - - - - - - - - hss - - - - - - - - - hss - - - - - - - - - - - - - - - - - - hss hss - - - - - - - - - - - - hd hss hss hss hss - - - - - - - hss - - - - - - - - - - - - - hss - - gss - hd ch2 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - dia - - - - - - - - - mck - - - - hss - - - - - - - mck - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
- ch1 - - - - - - - - - - - | - - - - - - - - - - - - - - - - - - - - - - - - - - - - 84 - - - - - - - - - - - - - - - - - - - - - - - - - 73.1 73.1 73.1 73.1 73.1 73.1 73.1 73.1 73.1 73.1 73.1 73 73 - - - - 83.9 83.9 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 73 73 73 - - - - - - - - - - - - - - - 74 74 74 74 - - - - - - - - - - - 71 - - - 75 74.5 74.5 74.2 74.2 - - - - - - - - - - - - - - 74.9 74.9 74.9 - - - - - - - - - - - 75.2 - - - - - - - - - - - 73 75.2 75.2 75.2 75.2 75.2 - - - - - - - - - - - - - - - - - - - - - - - - - - - 75.6 75.6 75.6 - - - - - - - - - - - - - - - - - - - - - - - - - - - 76.2 - - - - - - - - 80 - - - - - - - - - - - - - - 77 77 - - - 78.2 78.2 78.2 - 76.7 - - - - - - - - - - - - - - - - - - - - - - - 73 - - - - 81.4 81.4 - - - - - - - - - - - - - - - - - - - - - - - - 82 84 84 84 84 - 84 - - 84 | stSG20090 stSG20157 stSG34040 WI-15102 SGC34218 A006U19 stSG49427 sts-T67132 stSG29515 A009P19 WI-14697 sts-T91069 sts-U02019 stSG20152 WI-15969 WI-12305 stSG20132 stSG20188 stSG2189 stSG54099 WIAF-15 stSG33834 A002B36 WI-14248 WI-12610 stSG41515 AFMa131wf1 WI-1352 WI-16682 stSG46646 stSG20187 stSG53482 SGC34571 sts-T29481 H15780 sts-X52220 stSG15083 IB255 WI-9632 stSG20190 WI-20603 SGC33430 sts-W80372 SGC44590 stSG3039 stSG40509 A006H16 stSG33832 AFM338td9 stSG52047 AFMA057VB1 WI-21595 stSG40517 stSG46180 sts-D29515 stSG20173 WI-5974 stSG54185 stSG20138 WI-14028 sts-M78966 SGC31938 HHCPJ80 WI-5974 stSG1730 sts-K02268 WI-7002 GAAT4E12 AFM240wb8 stSG41777 AFM205th8 AFM077xd3 sts-U08336 stSG32165 AFM248yc5 A005W30 WI-17015 stSG53329 stSG20093 H48462 stSG8614 WI-21992 stSG34085 stSG31043 A002A04 sts-Z41706 stSG52048 WI-9739 WI-3022 WI-3904 WI-9238 WI-9238 stSG52086 AFM074wa9 WI-8646 SGC32750 stSG40166 stSG42376 WI-16193 stSG20088 mm1037 stSG10918 stSG25692 AFM333xe5 stSG3037 WI-18557 stSG34027 sts-M11186 stSG49795 IB607 stSG40506 stSG3058 stSG44722 stSG9697 WIAF-1797 stSG20164 sts-F16794 sts-W72684 WI-13669 stSG20142 
A005R07 stSG4244 AFMA049YD1 AFM240zf4 WI-8798 WI-9015 WIAF-749 sts-M34668 - stSG20180 WI-16594 stSG20257 stSG9792 WI-4715 - - WI-5517 stSG53632 AFM308we1 ATA21E04 A005O05 stSG44152 stSG25693 WI-4876 stSG10910 WIAF-2006 stSG40394 Bdyc4e10 AFMb026xh5 sts-H22126 stSG30448 AA037460 AFM234f10 stSG8000 stSG53601 SGC32955 A002D12 stSG2202 stSG20158 WI-17847 stSG30106 A009O14 stSG20327 sts-M76446 Chr_20ctg73 AFM248td1 WI-3772 AFM036ya3 GATA51D03 WI-2640 - SGC44304 stSG10911 stSG20232 stSG20379 T27631 WI-18738 WI-7784 stSG20381 stSG29963 stSG53387 stSG62312 SGC34960 stSG20076 stSG25710 stSG35837 WI-12264 FB25H5 stSG10203 AFMB352XD9 R79078 stSG42745 stSG8643 SHGC-16916 SGC35090 stSG30431 WIAF-464 AA978290 stSG25485 stSG12840 stSG33855 WI-4689 AFMa130ya9 GC31723 sts-R79720 stSG58170 WI-17673 AFMb290wh5 W86724 AFM023ta1 WI-20801 WI-5288 pm0647 stSG20194 EST159813 sts-R10161 sts-T96330 stSG30232 stSG53094 SGC33687 stSG54023 stSG10925 H72100 WIAF-730 SGC32385 stSG20160 WI-17957 stSG3057 stSG9519 stSG20120 SGC30258 stSG20086 SGC30394 WI-9063 WI-8238 AFMA085ZH9 WI-9399 GGAA9H10 AFM317TE5 WI-9559 WI-8757 GGAA21B07 WI-15457 WI-12553 stSG33852 stSG20085 stSG20493 stSG25695 stSG33860 AFM299xd1 AFMB344XB1 stSG39181 AFMA196WF5 T92258 IP2017 sts-H58415 WI-5126 GATA72E11 GATA64G08 AFMc017ze1 AFM218yg3 WI-6281 AFM345TD1 AFMA114XE5 WI-16702 WI-18137 R20777 A006R17 AFMA218YB5 AFM238ZC11 AFM234ZB12 X83389 - AFMa086we9 NIB489 WI-15043 WI-6712 A001Z47 AFM164TG5 WI-7473 WI-7829 WI-6063 stSG10890 AFMB348YB5 WI-17032 AFM288ZF5 AFM292XB5 X54567 AFM120XC7 WI-1930 GATA100G04 GATA81E09 WI-3903 WI-9762 AFM211YB8 WI-1994 - WI-4171 AFMC013XE5 WI-2270 AFMA224XF1 AFM044XB4 WI-2364 WI-4893 AFM291WE5 AFM102XA7 AFMB067XG1 WI-6871 AFM260XG5 - WI-10777 IP20M12 AFM197XB12 AFM210VB4 GGAA7E02 WI-4582 WI-9181 WI-7877 - - GATA83F12 AFM242YF8 WI-3249 NIB1603 WI-3387 WI-6873 - - - pm1146 AA026396 stSG34957 WIAF-92 - - sts-R73406 stSG29447 GGAA9H11 - - - WI-7085 L14856 - - - - - - | - - - H28185 - - - - - - R85922 - - 
- R60806 G24251 - - - - - - - R38826 G23285 R85704 G21228 - D20S199 Z24636 D20S456 G03640 T64906 G22082 - - - - - CDS seq'd CDS seq'd CDS seq'd T03417 D20S735 - H19750 - - - - - G20805 not seq'd D20S906 - A057VB1 R44338 - - - - D20S762 - - EST211960 - - - D20S762 - - D20S1072 P8620 D20S179 - D20S113 D20S103 Z16528 stSG20054 - D20S117 Z17123 G32301 G21313 - - G28357 - H05471 SHGC-36460 - G19838 - - D20S737 G05248 D20S1049 D20S745 D20S1032 D20S1032 - Z66604 D20S1022 - - - EST285473 - - - - D20S198 G21789 EST91360 not seq'd not seq'd not seq'd T03618 - - T55794 R07576 - - - - stSG20142 WI-13669 G20430 - stSG20223 D20S181 - G07079 - - - - G23224 - - MR8569 - - D20S619 D20S842 D20S193 Z24264 D20S473 stSG408 - WI-18677 R00301 D20S752 CDC25B - - - D20S867 Z53139 - - - D20S889 - - - - - WI-17847 stSG20158 - G32702 - - - D20S116 Z17107 D20S742 D20S97 D20S482 D20S500 - - - - - - STS-D00015 D20S1014 - - - - - - - - EST265520 T03153 A008E19 D20S895 Z53825 - - - G15478 - - - H55768 Z94590 - - D20S751 D20S835 - - - D20158 D20S882 Z53348 - D20S95 Z16434 H91615 D20S760 G04858 - - - - - - - - - - - - - - R94932 - - - - - - D20S732 D20S1018 D20S916 Z51974 D20S1034 G05380 G09476 D20S194 Z24330 G07337 G06990 G10134 R05442 H20128 - - - - - D20S192 D20S892 - D20S846 - D20S59 - D20S755 G10052 D20S483 D20S900 D20S115 D20S621 G06114 D20S907 Z67291 T97637 H58383 - - D20S851 D20S177 D20S175 Z23728 D20S503 D20S5 near D20S917 T17174 H29897 D20S723 - Z66695 D20S729 G05247 G06721 D20S763 G06112 SNAP25 D20S894 Z53794 G24576 stSG2011 D20S188 D20S189 D20S27 D20S186 D20S492 G10188 G08057 D20S744 G04471 D20S1041 D20S172 Z23610 D20S495 G03650 D20S66 D20S747 G04587 D20S898 Z53993 D20S497 G03652 D20S852 Z52596 D20S98 Z16471 D20S741 G03948 D20S753 G04719 D20S904 Z51285 D20S104 Z16570 D20S875 Z53252 D20S725 G06120 D20S118 Z17167 clone 705D16 D20S1013 G11904 D20S48 D20S112 Z16842 D20S114 Z16950 D20S470 G08061 D20S614 G03662 D20S733 G06127 D20S1015 G06744 - G20550 G10264 D20S182 Z23791 D20S1050 G04240 
T16637 D20S1052 G04287 D20S1070 G06295 clone 738P15 clone 775C13 clone RP4-718P11 D20115 not seq'd not seq'd - 20p11.2 20p11.1-p11.2 not seq'd not seq'd D20S471 - 20p11.2 20p11.2 20p11.2 20p11.2 20p11.2 20p11.2 20p11 20p11 20p11 20p11 | - - - - - - - - SOX22 SRY (sex-determining region) - - - HNRPD hetero ribonuc phospho[H... phospho[H... - - - - - - - PSMF1 proteasome macropain PSMF1 proteasome macropain - - - - ANGPT4 (angiopoietin 4) R-spondin - - CSNK2A1 casein kinase 2 CSNK2A1 casein kinase 2 CSNK2A1 casein kinase 2 FKBP FK-506 binding FKBP FK-506 binding FKBP FK-506 binding KIAA0374 p47 AAD44488 p47 AAD44488 p47 AAD44488 p47 AAD44488 p47 AAD44488 p47 AAD44488 p47 AAD44488 p47 AAD44488 SIRP-alpha1 tyr phosphatase NR PTPNS1 alpha 4x tyrosine phosphatase SIRP beta 1 - 3x tyrosine phosphatase SIRP 3x tyrosine phosphatase SIRP - 3x tyrosine phosphatase SIRP PTPNS1 tyr phos, non PTPNS1 tyr phos, non PTPNS1 tyr stSG10889 SHPS-1 SIRP-a1 signal reg SHPS-1 [SIRP] similar to SHPS-1 PTPNS1 tyr phos, non PTPNS1 tyr phos, non PTPNS1 tyr phos, non PTPNS1 tyr phos, non PTPNS1 tyr phos, non PDYN prodynorphin, enkeph PDYN polyA - - - not seq'd NP_004600 transcription factor 15 TCF15 transcription factor - RPS10 neuron zinc finger RPS10 neuron zinc finger RPS10 neuron zinc finger - - L1 - clone HH419 - - clone HH419 clone HH419 HBV associated factor (XAP4),.. false NT_002249 location - TGASE E3 transglutaminases SNRPB small nuc stSG10927 SNRPB small nuc WI-18905 SNRPB [205973 231 aa] - IDH3B NAD+ isocitrate dh IDH3B NAD+ isocitrate dh IDH3B NAD+ isocitrate dh IDH3B NAD+ isocitrate dh IDH3B NAD+ isocitrate dh IDH3B NAD+ isocitrate dh IDH3B NAD+ isocitrate dh IDH3B NAD+ isocitrate dh [NOP56] nucleolar hN.. [NOP56] nucleolar hN.. [NOP56] nucleolar hN.. [NOP56] nucleolar hN.. 
OXT: oxytocin-neurophysin I OXT: oxytocin-neurophysin I arginine-vasopressin-neurophysin I bicarb anion exhcnage - - weak: huntingtin ubiquin - - - weak: pre-mRNA cleavage weak: pre-mRNA cleavage KIAA0552 Hs.90232 KIAA0552 Hs.90232 KIAA0552 Hs.90232 KIAA0860 - - PTPRA tyr phos, receptor type PTPRA tyr phos, receptor type PTPRA tyr phos, receptor type PTPRA tyr phos, receptor type GNRH2 gonadotropin-releasing 2 - - weak: procollagen alpha 2(V) weak: procollagen alpha 2(V) - CPX-1 carboxypeptidase RPL19 rib protein L19 - [olfactory ebf unigene] - ATRN attractin mahogany KIAA0534 KIAA0548 SF3A3 pseudogene weak: FZD7 frizzled sialoadhesin+CENPB+CDC25+hs CDC25B cell division cycle 25B CDC25B cell division cycle 25B CENPB centromere B CENPB centromere B Z53139 - - G1L FTLL1 goliath ferritin G1L FTLL1 goliath ferritin G1L goliath, LDHB lactate [G1L goliath] [G1L goliath] [G1L goliath] [G1L goliath] [G1L goliath ferritin] [G1L goliath] [G1L goliath] cyclin G1 interacting protei.. ADRA1A adrenergic alpha 1A ADRA1A adrenergic alpha 1A ADRA1A adrenergic alpha 1A not seq'd - psL7a contig - - psRPS4X prion prion prion prion prion prion PRNP old prion cds est prion gene region prion doppel old prion cds est KIAA0168 KIAA0168 KIAA0168 KIAA0168 KIAA0168 not seq'd: fetal brain CDS2 CDP-diacylglycerol; PCNA PCNA proliferating cyclin PCNA proliferating cyclin PCNA proliferating cyclin - PCNA proliferating cyclin - - - novel GS+ESt+Blastp- - novel Hs.129047 3'UTR - no quick genes - - DKFZp586A0422 DKFZp586A0422 - - - glycerophosphoryl diesterase - - - - - Hs.171917 cap-binding pr.. - weak: mig-2 mitogen inducible - - alternatively spli.. CHGB chromogranin B CHGB chromogranin B secretogranin 1, SCG1 - - - - - - - alternatively spli. psCyt C Ox MCM2/3/5 family CDP-alcohol phosphoTrans psCyt C Ox MCM2/3/5 family not seq'd NT_002265? 
not seq'd not seq'd not seq'd NM_007375 TAR bp - weak: chromogranin A - - - - - - SHGC-10569 - - BMP2 bone morphogen 2 - RRM RNA binding Gry - - - - - - FMN dehyd Glycolate Oxidase - [PHKBp1] - - KIAA1162 PLCB1 Phospholipase C Beta 1 KIAA0581 KIAA0581 - PLCB1, KIAA0581 - - - Serine/Threonine Protein Kinase PLCB4 phospholipase PLCB4 phospholipase HS1119D91 glucose induced not seq'd weak: ankyrin [SNAP] - SNAP-25 + psRPL23A - Jag Notch Alagille - psRPS11 - - KIAA0952 - - - - - - - WI-8076 RPS10 ribosome - - - - - - - not seq'd - - SNRPB2 small nuclear ribo U2 small nuclear RNA - - - PCSK2 neuroendocrine convertase PCSK2 neuroendocrine convertase - - - CP115 BFSP1 filensin cytoskeletal RRBP1 ribosome bp 1 ES/130 not seq'd - - - - - CD39L2 nuc phosphatase hnRNP RRM rna binding protein weak: serine palmotyltransferase oncogene mRN. oncogene mRN. oncogene mRN. not seq'd ps GAPDH - - - - - PAX1 paired homeobox cystatin C 7 gene complex SSTR4 somatostatin receptor 4 THBD thrombomodulin brain glycogen phosphorylase PYGB not seq'd: insulinoma-associated 1 HNF3B Hepatic nuclear factor no seq: inosine triphosphatase not seq'd myelin basic E tubulin alpha | 9.57 9.47 11.04 11.04 7.64 11.04 7.94 8.74 7.94 8.54 9.47 9.47 8.66 6.73 8.97 11.78 11.04 11.04 11.04 11.04 11.04 11.04 11.78 10.3 11.04 11.04 11.78 - 11.04 11.04 11.04 7.94 11.04 8.54 10.83 11.04 10.3 9.57 11.04 9.57 10.3 10.3 10.83 11.04 11.78 11.04 11.78 11.93 11.04 12.19 - 11.67 12.19 11.04 11.04 11.04 9.78 12.19 11.14 10.26 12.19 13.18 11.09 10.26 15 11.04 - - 13.35 12.19 16.6 7.94 9.17 11.46 6.73 8.54 8.56 11.04 10.2 7.94 7.94 8.56 8.71 9.47 9.5 9.67 10.78 - - - 13.45 - 18.42 - 11.93 11.93 11.93 11.93 12.93 13.45 13.45 18.31 18.73 - 13.35 13.45 11.93 18.42 13.45 11.1 12.09 12.09 20.2 20.2 20.26 20.26 12.09 20.2 11.1 12.09 20.46 18.42 - 19.12 11.46 - 11.46 13.45 - 13.45 13.45 11.93 18.42 - - - - 11.93 20.26 - 21.85 12.09 12.06 - 11.1 12.09 20.2 20.88 - 22.4 12.09 12.09 - 21.61 21.61 12.19 22.09 12.09 21.61 11.67 21.92 
24.77 12.09 24.1 - 21.61 - 25.17 - - - 20.93 12.19 24.46 21.14 24.77 21.57 - 23.65 21.61 24.77 23.65 28.85 43.14 43.14 12.04 43.35 - 21.19 - 28.85 28.85 43.14 28.85 44.35 28.85 43.14 31.8 31.8 31.8 43.14 - 28.85 43.14 31.9 30.78 28.85 31.8 31.9 32 32.5 - 41.96 43.14 43.4 32.91 36.5 40.68 36.6 31.8 40.68 31.7 36.6 41.96 38.62 38.72 44.31 42.61 42.61 43.46 43.51 44.31 43.14 - 49.03 - - - - - - - 36.6 38.62 38.62 40.68 44.31 44.31 46.1 36.6 - 40.68 - 36.6 - 38.62 - - - 48.93 43.93 - - - 42.97 39.98 40.86 55.11 - - - - - 55.21 - 55.21 - - - - - - 52.59 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 20.36 20.46 20.2 21.61 - - 11.93 28.85 - - - - - - - - - - - - | - - - - - - - - - - - - - - - - - - - - - - - - - - 1 2 - - - - - - 3 4 - - - - - - - - - - - 5 - - - - - - - - - - - - 7 - - 6 10 8 - 9 12 - - - - - - - - - - - - - - - 13 11 14 - 15 - - 16 - - - - - - - - 17 - - - - - - - - - - - - - - - - - - 18 19 20 21 - - - - - - - - - - 22 - 23 24 - - - 25 - - - - - - - - 26 - - - - - - - - - - - - 27 28 29 30 31 - - - - - - - 32 - - - - - - - - - 33 - 34 - - - - - - - - - - - 35 - - - - - 36 - 37 - 38 - - - - - - - - - - - - - - - - - - - - - 39 46 40 41 42 43 44 45 49 - - - - - - - 48 47 - 50 - 51 - 52 53 54 55 56 57 58 59 - - - - 60 61 62 - - 63 64 - 65 - 66 67 68 69 - 71 - 70 72 74 73 75 76 77 78 79 80 81 82 83 84 85 86 87 89 88 90 91 93 92 94 - 95 96 97 98 99 100 101 102 - - 103 104 105 106 107 108 - - - - - - - - - - - - - - - - - - - - - - - | 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 5 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 7 7 7 7 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 9 9 9 9 9 9 9 9 10 10 10 10 10 10 10 10 10 11 11 11 11 11 12 12 12 12 12 12 12 12 12 12 12 12 12 - - - 13 13 13 13 13 13 13 13.5 13.5 13.5 13.5 13.5 13.5 13.5 13.5 14 14 14 14 14 14 14 14 14 15 15 15 15 15 - 16 16 16 17 17 17 17 17 17 17 17 17 18 18 18 18 18 18 18 18 18 18 19 19 19 19 - 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 
20 20 20 - 21 21 21 21 21 21 21 21 21 22 22 22 23 23 23 23 24 24 24 24 24 24 24 24 24 24 24 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 - - - - 26 26 - 27 27 27 27 27 27 27 27 27 27 27 27 27 28 28 28 28 28 28 28 28 28 28 28 29 29 29 29 29 - - 30 30 30 30 30 - - 31 31 31 - - 32 32 32 32 - - - - - - - - - 35 35 - - 36 36 36 - - 37 37 37 37 38 38 38 38 38 38 38 38 38 - - 40 40 - - - - - - - - - - - - - - - - - - - - - - - - - | 0 0 0 1 2 3 4 5 6 7 8 9 10 11 11 0 0 0 0 0 0 1 2 3 4 5 6 7 8 9 - 1 2 3 - - - - 0 0 0 0 0 0 0 0 0 0 1 0 4 2 3 1 0 0 0 0 0 1 2 3 4 5 6 8 9 10 11 - - 0 0 0 1 2 3 4 5 - - - - - - - - - 1 2 3 3 4 0 0 0 0 0 0 0 0 0 0 1 2 3 - - - - - - - - - - 0 0 1 1 1 2 3 4 - - - - - - - - - 0 0 0 1 2 - - - - - - - - - - - - - - 1 2 3 4 5 6 7 8 9 - - - - - 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 - 0 0 0 0 0 0 0 0 0 1 2 3 1 2 3 4 - - - - - - - - - - - 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 2 3 - - - - - - - 0 0 0 0 0 0 0 1 2 3 4 5 6 1 2 3 4 5 6 7 8 9 10 11 0 1 2 3 4 - - 1 4 2 5 3 - - - - - - - 1 2 3 4 0 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - x x - - - - - - - - - - - - - - - - - 1 2 - - - - - - | NT_002249 NT_002249 NT_002249 NT_002249 NT_002249 NT_002249 NT_002249 NT_002249 NT_002249 NT_002249 NT_002249 NT_002249 NT_002249 NT_002249 NT_002249 NT_001933 NT_001933 NT_001933 NT_001933 NT_001933 NT_001933 NT_001933 NT_001933 NT_001933 NT_001933 NT_001933 NT_001933 NT_001933 NT_001933 NT_001933 NT_001933 NT_002369 NT_002369 NT_002369 - - - - - - - - - - - - - NT_002062 NT_002062 NT_002062 NT_002999 NT_002999 NT_002999 NT_002999 NT_001690 NT_001690 NT_001690 NT_001690 NT_001690 NT_001690 NT_001690 NT_001690 NT_001690 NT_001690 NT_001690 NT_001690 NT_001690 NT_001690 NT_001690 NT_001690 - - - - - - - - - - - - - - - - - - NT_002128 NT_002128 NT_002128 NT_002128 NT_002128 NT_002301 NT_002301 NT_002301 NT_002301 NT_002301 NT_002301 NT_002301 NT_002301 NT_002301 NT_002301 NT_002301 NT_002301 NT_002301 - - - - - - - - - - - - - - - - - - 
- - - - - - - - - NT_002855 NT_002855 NT_002855 NT_002855 NT_002855 - NT_003217 NT_003217 NT_003217 - - - - - - - - - NT_001989 NT_001989 NT_001989 NT_001989 NT_001989 NT_001989 NT_001989 NT_001989 NT_001989 NT_001989 - - - - - NT_001001+ NT_001001+ NT_001001+ NT_001001+ NT_001001+ NT_001001 NT_001001 NT_001001 NT_001001 NT_001001 NT_001001 NT_001001- NT_001001- NT_001001- NT_001001 NT_001001- NT_001001- NT_001001- NT_001001- NT_001001- NT_001001- - - - - - - - - - - NT_001934 NT_001934 NT_001934 NT_002650 NT_002650 NT_002650 NT_002650 - - - - - - - - - - - NT_002161 NT_002161 NT_002161 NT_002161 NT_002161 NT_002161 NT_002161 NT_002161 NT_002161 NT_002161 NT_002161 NT_002161 NT_002161 NT_002161 NT_002161 NT_002161 NT_002161 NT_002161 NT_002161 NT_002161 - - - - - - - NT_002065 NT_002065 NT_002065 NT_002065 NT_002065 NT_002065 NT_002065 NT_002065 NT_002065 NT_002065 NT_002065 NT_002065 NT_002065 NT_001820 NT_001820 NT_001820 NT_001820 NT_001820 NT_001820 NT_001820 NT_001820 NT_001820 NT_001820 NT_001820 NT_001003 NT_001003 NT_001003 NT_001003 NT_001003 NT_001515 NT_001515 NT_001000 NT_001000 NT_001000 NT_001000 NT_001000 - - NT_001691 NT_001691 NT_001691 - - NT_002283 NT_002283 NT_002283 NT_002283 NT_002409 - - NT_002862 NT_002862 - - NT_002019 NT_003003 NT_002063 NT_002063 - NT_002064 - NT_001662 NT_001662 - - NT_001692 NT_001692 NT_001692 NT_001692 NT_001005 NT_001005 NT_001005 NT_001005 NT_002857 NT_002857 NT_002857 NT_002857 - - NT_001636 NT_002340 NT_002340 - NT_000281 NT_001893 - - - - - - NT_001972 NT_002015 - - - NT_002563 NT_002532 - NT_002084 NT_002084 - - - - - - | AL034548 AL034548 AL034548 AL034548 AL034548 AL034548 AL034548 AL034548 AL034548 AL034548 AL034548 AL034548 AL034548 AL034548 AL034548 AL031665 AL031665 AL031665 AL031665 AL031665 AL031665 AL031665 AL031665 AL031665 AL031665 AL031665 AL031665 AL050325 AL050325 AL050325 AL050325 AL049761 AL049761 AL049761 AL136531 AL109658 AL136531 AL109658 AL136531 AL109658 AL136531? 
AL109658 AL109658 AL109658 AL109658 AL109658 AL109658 AL109658 AL109658 AL109658 AL049634 AL049634 AL049634 NT_002999 AL035460 AL109809 AL109809 AL109809 AL121760 AL034562 AL034562 AL034562 AL034562 AL034562 AL034562 AL034562 AL034562 AL034562 AL034562 AL034562 AL034562 AL034562 AL034562 AL034562 AL034562 - AL133231 AL133231 AL133231 AL121758 AL118502 AL121758 AL118502 AL121758 AL118502 AL118502 AL118502 AL121747 AL121747 AL121747 AL121747 AL121747 AL121747 AL121747 AL121747 AL121747 AL031678 AL031678 AL031678 AL031678 AL031678 AL049712 AL049712 AL049712 AL049712 AL049712 AL049712 AL049712 AL049712 AL049712 AL049712 AL049712 AL049712 AL049712 - - - AL109976 AL109976 AL109976 AL109976 AL109976 AL109976 AL109976 AL121891 AL121891 AL121891 AL121891 AL121891 AL121891 AL121891 AL121891 AL121905 AL121905 AL121905 AL121905 AL121905 AL121905 AL121905 AL121905 AL121905 AL035460 AL035460 AL035460 AL035460 AL035460 AL117334 AL353193 AL109805 AL132773 AL109805 AL132773 AL109805 AL132773 AL109804 AL109804 AL109804 AL109804 AL109804 AL109804 AL353194 AL109804 AL353194 AL109804 AL353194 AL109804 AL031670 AL031670 AL031670 AL031670 AL031670 AL031670 AL031670 AL031670 AL031670 AL031670 AL121675 AL121675 AL121675 AL121675 - AL121781 HSJ1164C1 AL121781 HSJ1164C1 AL121781 HSJ1164C1 AL121781 HSJ1164C1 AL121916 HSJ189G13 AL121781 HSJ1164C1 AL121781 HSJ1164C1 AL121781 HSJ1164C1 AL121781 HSJ1164C1 AL121781 HSJ1164C1 AL121781 HSJ1164C1 U29185 U29185 AL109808 HSJ1187J4 AL109808 AL133354 AL109808 AL133354 AL133354 AL133354 AL133354 AL133354 AL133354 - AL121924 AL121755 AL121924 AL121890 AL121924 AL121890 AL121924 AL121890 AL121924 AL121890 AL121924 AL121890 AL121924 AL121890 AL121924 AL121890 AL121924 AL121890 AL117377 AL035249 AL117377 AL035249 AL117377 AL035249 AL121757 AL121757 AL121757 AL121757 AL109935 AL109935 AL109935 AL109935 AL109935 AL109935 AL109935 AL109935 AL109935 AL109935 AL109935 AL118505 AL118505 AL118505 AL118505 AL035461 AL035461 AL035461 AL035461 AL035461 AL035461 
AL035461 AL035461 AL035461 AL035461 AL035461 AL035461 AL035461 AL035461 AL035461 AL035461 - - - - AL109811 AL109811 AL121911 AL035668 AL034554 AL035668 AL034554 AL035668 AL034554 AL035668 AL034554 AL035668 AL034554 AL035668 AL034554 AL035668 AL034554 AL035668 AL034554 AL035668 AL034554 AL035668 AL034554 AL035668 AL034554 AL035668 AL034554 AL035668 AL034554 AL031679 AL031679 AL078643 AL078643 AL078643 AL021879 AL021396 AL021396 AL021396 AL021396 AL021396 AL049632 AL031132 AL031655 AL031132 AL031683 AL121909 AL031682 AL034427 AL023805 AL133002 AL031652 AL031652 AL121740 - AL109754 AL023913 AL034430 AL023913 AL034430 AL023913 AL034430 AL135937 AL035456 AL035456 AL133340 AL049690 AL049649 AL049690 AL049649 AL049690 AL079337 AL049690 AL049649 AL034547 AL035448 AL096862 AL136460? AL034561 AL078623 AL078623 AL121754 AL132826 AL031677 AL118510 AL118503 AL118503 AL121584 AL118503 AL109912? AL035073 AL035073 - AL135938 AL135938 AL034428 AL034428 AL034428 AL031675 AL031675 AL031675 AL121779 AL031664 AL031664 AL132765 AL132765 AL132765 - AL035045 AL050321 AL050321 AL121893 - AL035252 AL109618 AL109983 - - - - AL034426 AL031673 - - AL132821 AL034551 - AL121894 AL121831 AL049651 AL049651 - - AL121772 - - - | 153 153 153 153 153 153 153 153 153 153 153 153 153 153 153 364 364 364 364 364 364 364 364 364 364 364 364 364 364 364 364 109 109 109 - - - 127 127 127 127 127 127 127 127 127 127 115 115 115 224 224 224 224 103 103 103 103 103 103 103 103 103 103 103 103 103 103 103 103 - 117 117 117 - - - - - 127 127 127 127 127 127 127 127 127 205 205 205 205 205 159 159 159 159 159 159 159 159 159 159 159 159 159 - - - 137 137 137 137 137 137 137 159 159 159 159 159 159 159 159 247 247 247 247 247 247 247 247 247 135 135 135 135 135 127 161 161 161 196 196 196 196 196 196 196 196 196 130 130 130 130 130 130 130 130 130 130 142 142 142 142 - 299 299 299 299 143 63 63 63 63 63 63 63 63 151 151 151 143 143 143 143 143 - 475 167 167 167 167 167 167 167 167 98 98 98 167 167 167 167 180 180 
180 180 180 180 180 180 180 180 180 139 139 139 139 139 139 139 139 139 139 139 139 139 139 139 139 139 139 139 139 - - - - - - - 330 330 330 330 330 330 330 330 330 330 330 330 330 1187 1187 1187 1187 1187 1187 1187 1187 1187 1187 1187 530 530 530 530 530 213 213 286 286 286 286 286 - - 215 215 215 - - 605 605 605 605 355 - - 177 177 - - 106 124 244 244 - 89 - 257 257 - - 373 373 373 373 89 89 89 89 155 155 155 155 155 - 122 101 101 - - 108 - - - - - - 148 127 - - - 138 138 - - - - - - - - - | 2378-2531 2378-2531 2378-2531 2378-2531 2378-2531 2378-2531 2378-2531 2378-2531 2378-2531 2378-2531 2378-2531 2378-2531 2378-2531 2378-2531 2378-2531 2957-3322 2957-3322 2957-3322 2957-3322 2957-3322 2957-3322 2957-3322 2957-3322 2957-3322 2957-3322 2957-3322 2957-3322 2957-3322 2957-3322 2957-3322 2957-3322 3142-3251 3142-3251 3142-3251 - - - - - - - - - - - - - 3851-3966 3851-3966 3851-3966 - - - - 3540-3643 3540-3643 3540-3643 3540-3643 3540-3643 3540-3643 3540-3643 3540-3643 3540-3643 3540-3643 3540-3643 3540-3643 3540-3643 3540-3643 3540-3643 3540-3643 - - - - - - - - - - - - - - - - - - 3641-3846 3641-3846 3641-3846 3641-3846 3641-3846 3857-4016 3857-4016 3857-4016 3857-4016 3857-4016 3857-4016 3857-4016 3857-4016 3857-4016 3857-4016 3857-4016 3857-4016 3857-4016 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 4924-5055 4924-5055 4924-5055 4924-5055 4924-5055 4924-5055 4924-5055 4924-5055 4924-5055 4924-5055 - - - - - 5863-5926 5863-5926 5863-5926 5863-5926 5863-5926 5863-5926 5863-5926 5863-5926 5863-5926 5863-5926 5863-5926 5863-5926 5863-5926 5863-5926 5863-5926 5863-5926 5863-5926 5863-5926 5863-5926 5863-5926 5863-5926 - - - - - - - - - - 8083-8182 8083-8182 8083-8182 8845-9012 8845-9012 8845-9012 8845-9012 - - - - - - - - - - - 9572-9711 9572-9711 9572-9711 9572-9711 9572-9711 9572-9711 9572-9711 9572-9711 9572-9711 9572-9711 9572-9711 9572-9711 9572-9711 9572-9711 9572-9711 9572-9711 9572-9711 9572-9711 9572-9711 9572-9711 - 
- - - - - - 9389-9720 9389-9720 9389-9720 9389-9720 9389-9720 9389-9720 9389-9720 9389-9720 9389-9720 9389-9720 9389-9720 9389-9720 9389-9720 8877-10064 8877-10064 8877-10064 8877-10064 8877-10064 8877-10064 8877-10064 8877-10064 8877-10064 8877-10064 8877-10064 9928-10459 9928-10459 9928-10459 9928-10459 9928-10459 20p12 20p12 10321-10608 10321-10608 10321-10608 10321-10608 10321-10608 - - 12634-12849 12634-12849 12634-12849 - - 13213-13818 13213-13818 13213-13818 13213-13818 14210-14565 - - - - - - 20p11.2-p12 - 20405-20650 20405-20650 - 21257-21346 - 21615-21872 21615-21872 - - 22889-23262 22889-23262 22889-23262 22889-23262 23531-23620 23531-23620 23531-23620 23531-23620 23219-23374 23219-23374 23219-23374 23219-23374 23219-23374 - 24125-24248 25308-25409 25308-25409 - - 30704-30813 - - - - - - - - - - - - - - 27506-27603 27506-27603 - - - - - - |
| 220 Features on Chromosome 20 | |
|---|---|
acetyl-CoA synthetase adenosine deaminase adrenergic, alpha-1A-, receptor agouti mouse-signaling 8 antigen NY-CO-33 ATP synthase, H+ transporting F1, epsilon ATP/GTP-binding site motif AP-loop: Similar to ATP5E, nuclear gene encoding mitochondrial ATPase type IV, phospholipid-transporting P-type,putative bactericidal/permeability-increasing bladder cancer related 10kD bone morphogenetic 2 bone morphogenetic 7 osteogenic 1 breast carcinoma amplified sequence 1 C Oxidase I and CDP-alcohol phosphatidyltransferase C.elegansP1:CEC47E128;Mouse alpha-mannosidaseP1:B54407 cathepsin Z CCAAT/enhancer binding C/EBP, beta CD39-like 2 CDP-diacylglycerol synthase cytidylyltransferase 2 cell division cycle 25B centromere B 80kD centrosome associated cep250 centrosome associated 2 CGI-107 CGI-15 CGI-53 cholinergic receptor, nicotinic, alpha 4 chromogranin B secretogranin 1 chromogranin B secretogranin 1, SCG1, pseudogene chromosome segregation 1 yeast homolog-like cleavage stimulation factor, 3' pre-RNA collagen, type IX, alpha 3 copine I core-binding factor, runt domain, alpha 2 cystatin C amyloid angiopathy and cerebral hemorrhage cystatin D 3 cysteine desulfurase cytochrome P450, XXIV vitamin D 24-hydroxylase D32579 comes from this gene death associated transcription factor 1 DKFZP434C0935 DKFZP434I114 DKFZP434N061 DKFZP564A032 DKFZP566A0946 DKFZP727M231 dolichyl-phosphate mannosyltransferase 1, catalytic drin actin depolymerizing factor E2F transcription factor 1 endothelial cell C/activated C receptor endothelin 3 epididymis-specific, whey-acidic type, four-disulfide erythrocyte membrane band 4.1-like 1 eukaryotic translation initiation factor 2, 2 beta eyes absent Drosophila homolog 2 F15D4.3 [C.elegans] F52C12.2 [C.elegans] ferritin, light polypeptide 2 FK506-binding 1A 12kD frizzled Drosophila homolog 7 gamma-glutamyltransferase ganglioside-induced differentiation associated 1 GEF-2 glutathione synthetase glycerophosphoryl dier phosphodierase domain goliath-like C3HC4 type 
growth differentiation factor 5 cartilage morphogenetic growth hormone releasing hormone 6 guanine nucleotide binding G, alpha stimulating H.sapiens seb4D mRNA HBV associated factor Helicase C-terminal domain and SNF2 N-terminal domains hemomucin [D.melanogaster] hemopoietic cell kinase hepatocyte nuclear factor 4 gamma hepatocyte nuclear factor 4, alpha heterogeneous nuclear ribonucleoprotein D hHCN2 HNF-3beta mRNA for hepatocyte nuclear factor-3 beta0 Human clone 23586 mRNA sequence Human putative cyclin G1 interacting mRNA Human ras inhibitor mRNA, 3' end hydroxyacid oxidase glycolate oxidase 1 hyperpolarization cyclic nucleotide-gated channel insulinoma-associated 1 symbol provisional integrin beta 4 binding isocitrate dehydrogenase 3 NAD+ beta jagged1 Alagille syndrome KIAA0168 KIAA0172 KIAA0181 KIAA0249 and CpG island KIAA0255 KIAA0308, a LY6 Lymphocyte antigen 6 T-cell KIAA0374 KIAA0395 KIAA0406 KIAA0548 KIAA0552 KIAA0581 KIAA0693 KIAA0772 KIAA0784 KIAA0823 KIAA0860 KIAA0939 KIAA0952 KIAA0964 KIAA0978 | kinesin family member 3B
Kreisler mouse maf-related leucine zipper KRML
laminin, alpha 5
lethal 3 malignant brain tumor l3mbt Drosophila homolog
M88866 comes from this gene [C.elegans] 37
matrix metalloproteinase gelatinase B, type IV collagenase
MCM2/3/5 family member, a pseudogene Cytochrome
microtubule-associated, RP/EB family1
mitogen inducible gene mig-2 1
mouse Dhm1 [M.musculus]
MyD-1 antigen 3
N-terminal acetyltransferase complex ard1
neuronal thread AD7c-NTP
neuronatin
nuclear receptor coactivator 3
nucleolar KKE/D repeat
ORF YNL059c [S.cerevisiae]
P24 [M.musculus] 7
PAK1 LIKE Serine/Threonine-Protein Kinase PLCB4
PCK1 gene for soluble phosphoenolpyruvate carboxykinase 1
peroxisomal acyl-CoA thioerase
Phopholipase C beta 1-Phosphatidylinositol 4,5-Bisphosphate
phosphoenolpyruvate carboxykinase 1 soluble
phospholipase C, beta 4
phospholipase C, gamma 1 formerly subtype 148
phospholipid transfer
phosphoprotein
pleiomorphic adenoma gene-like 2
PMP24 24 kDa intrinsic membrane
POLYADENYLATE-BINDING 1 39
potassium voltage-gated channel, Shab-related
potassium voltage-gated channel, subfamily G,
PRE-MRNA SPLICING FACTOR RNA HELICASE
preferentially expressed in colorectal cancer
prion Creutzfeld-Jakob disease
prodynorphin
proline-rich M14 precursor [M.musculus] 127
proprotein convertase subtilisin/kexin type 2
prostaglandin I2 prostacyclin synthase
protease inhibitor 3, skin-derived SKALP
proteasome
proteasome prosome, macropain inhibitor subunit 1 PI31
protective for beta-galactosidase galactosialidosis
protein kinase C binding 1
protein kinase cAMP-dependent, catalytic inhibitor gamma
protein phosphatase 1, regulatory subunit 6
protein phosphatase 2A BR gamma subunit 100
protein tyrosine phosphatase, non-receptor type 1
protein tyrosine phosphatase, receptor type, alpha
protein tyrosine phosphatase, receptor type, T
PROTEIN-TYROSINE PHOSPHATASE 1B 9
putative brain nuclearly-targeted
putative oncogene mRNA, partial cds
putative Rab5-interacting {clone L1-57}
Quions
RAE1 RNA export 1, S.pombe homolog
RAS-RELATED RAB-31
rat kidney-specific 108
RENAL SODIUM/DICARBOXYLATE COTRANSPORTER
RETINOBLASTOMA-LIKE 1 4
retinoblastoma-like 1 p107
reverse transcriptase
ribophorin II
RNA-binding autoantigenic
S-adenosylhomocysteine hydrolase
S. cerevisiae CBP3 precursor
S. cerevisiae VPS16 [C.elegans] 47
S.pombe hypothetical C1D4.09C [C.elegans]
S68401 cattle glucose induced gene
secretory leukocyte protease inhibitor antileukoproteinase
semenogelin II
serine/threonine kinase 4
serine/threonine kinase 15
small nuclear ribonucleoprotein polypeptide B''
small nuclear ribonucleoprotein polypeptides B and B1
sodium-dependent dicarboxylate transporter SDCT2
somatostatin receptor 4 5
sorting nexin 5
spermatogenesis associated PD1
splicing factor CC1.3
splicing factor, arginine/serine-rich 6
staufen Drosophila, RNA-binding
synaptosomal-associated, 25kD
syndecan 4 amphiglycan, ryudocan
syndrome, fatal familial insomnia
syntaxin 16
syntrophin, alpha dystrophin-associated A1
TAP pseudogene and the 3' l KIAA0188 and
TATA box binding TBP-associated RNA polymerase II
TH1 [D.melanogaster]
thrombomodulin
topoisomerase DNA I
transcription factor 15 basic helix-loop-helix
transcription factor AP-2 gamma activating enhancer gamma
transcription factor-like 5 basic helix-loop-helix
transformation-related
transglutaminase 2 C polypeptide,gamma-glutamyltransferase
transglutaminase 3 E polypeptide,gamma-glutamyltransferase
translocase of outer mitochondrial membrane
troponin C2, fast
tumor necrosis factor receptor superfamily, member 5
type II CALM/AF10 fusion 6
ubiquitin carrier E2-C
UDP-Gal:betaGlcNAc beta 1,4- galactosyltransferase5
undulin 2
v-myb avian myeloblastosis viral oncogene homolog-like 2
VAMP vesicle-associated membrane-associated B and C
Ydr531wp [S.cerevisiae]
ZINC FINGER 151 4
zinc finger 133 clone pHZ-13
|
last updated: 8 Jan 00. webmaster research
The Human Genome Project has been a boon to inherited disease research. Positioning of thousands of microsatellites has allowed the genes for many diseases to be mapped for the first time, including at least 4 diseases to the prion gene neighborhood in chr 20p12.3. However, for rare diseases, the final mapping resolution might be only a million base pairs and so the gene itself cannot quite be identified. (Gene density on chr 20p12 is perhaps 12 per million bp.)
At the same time, gene-finding tools applied to newly sequenced human chromosomes are rapidly identifying genes that have no known associated disease. Some genes, of course, will not have an associated disease because of minor function or compensation.
The question is, do any inherited diseases mapping to chr 20p12.3 match up with any of the new genes being identified there? It is fairly easy to recover all orphaned diseases (OMIM and Medline) and exhaustively list all genes for a given stretch of chromosome but there is no systematic way of matching these up.
For the doppel gene, we might expect an ataxia from mouse studies but as over 409 human ataxias have been described, gene-disease matching is not feasible. For nearby KIAA0168, a nuclear gene with a ras homology domain, again the disease class is too broad (cancer). Four pseudogenes near prion-doppel are not transcribed so probably have no role in any disease.
However, an autosomal recessive disease identified in the 1920's, Hallervorden-Spatz disease (now called neurodegeneration with brain iron accumulation or NBIA type 1; OMIM #234200) may be matchable to a newly discovered ferritin light chain gene just telomeric to the prion gene on chromosome 20.
Ferritin is the major intracellular iron storage protein in many organisms; it concentrates iron 12 orders of magnitude above its solubility. The protein oligomer has the shape of a hollow sphere which stores up to 4500 iron atoms as ferric hydroxide phosphate. Mammalian liver and spleen ferritin consists of 24 subunits, in variable proportions of 21k heavy and 19k light subunits, for a total of 450k. The ferritin light chain in these tissues maps to chr 19q13.3-qter, so an alternative brain-specific isoform must be invoked for Hallervorden-Spatz. Mutations of the chr 19 gene in a conserved 5' UTR regulatory region [iron response element, cis-acting stem loop] cause hyperferritinemia-cataract syndrome. At least 25 other inherited diseases with iron accumulation are now known (OMIM search or SJ Hayflick's list); seven of them have arisen previously in connection with CJD, namely Parkinson, Alzheimer, Huntington, ALS (SOD), CP (ceruloplasmin), Friedreich ataxia, and hemochromatosis. It is very clear that brain iron accumulation can result from defects in genes having nothing to do with iron metabolism. In fact, iron accumulation in Hallervorden-Spatz disease might proceed through the alpha-synuclein Lewy body mode as seen in PD and AD [Neurology 1998 Sep;51(3):887-9].

| Human Chromosome 20p12.3 | ||
|---|---|---|
| Marker | Zmax | Mapping |
...telomere AFMa057vb1 D20S906 D20S113 GAAT4E12 D20S198 D20S842 AFMa049yd1 D20S181 D20S193 D20S473 AFMa074wa9 D20S889 D20S116 D20S867 ...centromere | 3.4 9.0 5.7 6.2 9.6 13.6 13.8 7.6 7.0 8.4 2.3 ferritin 6.9 4.8 | Mapping in ten affected families has located the NBIA gene to an interval between D20S906 and D20S116 on chromosome 20p12.3-p13. Note that microsatellites within a given contig are readily determined by Blastn(sts) of unmasked sequence. GenBank entries are occasionally annotated for microsatellites. In this instance, the Sanger Centre noted the presence of (CA)n microsatellite D20S889 at positions 81245-81562 of a 130,263 bp chr 20 contig called 681N20 (accession AL031670) or genomic NT_001989.
At the Whitehead Center, D20S889 is found on a single yeast artificial chromosome, YAC 753_G_9, telomeric to D20S116 and the prion gene. Finished contig AL049634, which is even more telomeric, contains D20S906 at positions 28787-29123 (as well as a triplicated PTPNS1 feature). In other words, the new ferritin feature has an excellent fit relative to disease mapping data. |
The 23 Nov 1999 entry lists a CpG island at 14306-15895 followed by a ferritin light chain-like (FTLL1) single exon feature at positions 23174-24151 of the minus strand with coding sequence from 23424-23951 (as well as a multi-exon goliath-like gene). The coded protein is 96% identical to liver ferritin light chain. As we will see shortly, it is not at all clear whether this is a gene (as the GenBank entry claims) or a pseudogene.
Since ferritin is a well studied protein, it is instructive to look in other species for precedents of an alternative ferritin light chain. In fact, a second ferritin light chain protein was found in mouse in 1992. It was never mapped; to be homologous to the new ferritin on human chr 20, it would have to map to mouse chr 2. (Mouse also has 11 ferritin pseudogenes on 11 different chromosomes; the homologue to human spleen ferritin is on mouse chr 7.) The second ferritin gene in mouse is intronless, apparently staying functional despite having arisen as a retrotransposed processed mRNA from the main mouse ferritin light chain gene. Some rat strains also have a second ferritin light gene with introns.
Highlights of human chr 20 ferritin relative to authentic chr 19 liver ferritin:
fortuitous new promoter
flanking direct repeats 15 bp
5' UTR, possible iron regulatory element IRE
protein coding region 528 bp, lacking 3 introns
poly A upstream signal
genomic poly A 29 bp
3' UTR
flanking direct repeats 15 bp
agtcaaaacaagcaagcaaactaataatTAAAAtaaacagaaaaaggcaagttggaggaaaccaagatttatttttaaGAATAAGAGGTGATA
ggcagttcggcggtccagtgggtctgtctcttgcttcaacagtgtttggacggaacagatccggggacggtcttccagcctccgaccgccctccaattt
cctctccacttgcaacctccgggaccatcttctcggctatctcctgcttctgggacctgccagcaccgtttttgtcgttagctccttcttggcgaccaacc
ATGAGCTCCCAGATTCGTCAAAATTATTCCACCGACGTGGAGGCAGCCGTCAACAGCCTGGTCAATTTGTACCTGCAGG
CCTCCTACACCTACCTCTCTCTGGGCTTCTATTTCGACCGCGATGATGCGGCTCTGGAAGGCGTGAGCCACTTCTTCCG
CGAATTGACCGAGGAGAAGCGCGAGGGCTACGAGCGTCTCCTGAAGATGCAAAACCAGCGTGGCGGCCGCGCTCTCTTC
CAGGACATCAAGAAGCCAGCTGAAGATGAGTGGGGTAAAACCCCAGATGCCATGAAAGCTGCCATGGCCCTGGAGAAAA
AGCTGAACCAGGCCCTTTTGGATCTTCATGCCCTGGATTCTGCCCACATGGACCCCCATCTCTGTGACTTCCTGGAGAC
TCACTTCCTAGATGAGGAAGTGAAGCTCATCAAGAAGATGGGTGACCACCTGACGAACCTCCACAGGCTGGGAGGCCCA
GAGGCTGGGCTGGGCGAGTATCTCTTCGAAAGGCTCACTCTCAAGCACGTCTAAgagccttatgagcccagcgact
tctgaagggccccttgcaaagtaatagggcttctgcctaagcctctccctccagccaataggcagctttcttaactaccctaacaagccttggaccaaatgga
AATAAggctttctgatgcAAAAAAAAAAAAAAAAAAAAAAAAAAAAAGAATAAGAGGTGATA
ttacacccatgaaacaagaataagatgctattttgagacaagaaacaaaaaataagctttaaa
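The capitalized coding region in the listing above can be checked mechanically. The following is a minimal Python sketch, not part of any annotation pipeline: it joins the seven upper-case coding lines (copied from the listing) and translates them with the standard genetic code. The names `CDS_LINES`, `CODON_TABLE` and `translate` are illustrative.

```python
# Upper-case coding lines copied from the chr 20 ferritin listing above.
CDS_LINES = [
    "ATGAGCTCCCAGATTCGTCAAAATTATTCCACCGACGTGGAGGCAGCCGTCAACAGCCTGGTCAATTTGTACCTGCAGG",
    "CCTCCTACACCTACCTCTCTCTGGGCTTCTATTTCGACCGCGATGATGCGGCTCTGGAAGGCGTGAGCCACTTCTTCCG",
    "CGAATTGACCGAGGAGAAGCGCGAGGGCTACGAGCGTCTCCTGAAGATGCAAAACCAGCGTGGCGGCCGCGCTCTCTTC",
    "CAGGACATCAAGAAGCCAGCTGAAGATGAGTGGGGTAAAACCCCAGATGCCATGAAAGCTGCCATGGCCCTGGAGAAAA",
    "AGCTGAACCAGGCCCTTTTGGATCTTCATGCCCTGGATTCTGCCCACATGGACCCCCATCTCTGTGACTTCCTGGAGAC",
    "TCACTTCCTAGATGAGGAAGTGAAGCTCATCAAGAAGATGGGTGACCACCTGACGAACCTCCACAGGCTGGGAGGCCCA",
    "GAGGCTGGGCTGGGCGAGTATCTCTTCGAAAGGCTCACTCTCAAGCACGTCTAA",
]
CDS = "".join(CDS_LINES)

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string in frame 1; '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


protein = translate(CDS)
internal_stops = protein[:-1].count("*")  # stops before the final codon
print(len(CDS), len(protein) - 1, internal_stops)
```

If the listing has been copied faithfully, the reading frame is 528 bp long and runs from ATG to a single terminal stop, which is the property the GenBank coding coordinates (23424-23951) imply.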
The annotation graphic shows the first 30000 bp of this contig. The SVA CpG island shows up clearly, as does a less dramatic CpG enriched area upstream of the correct GenScan prediction of ferritin (on the minus strand). The total interspersed repeats percentage is high at 16450 bp or 54.83 %, with SINEs 16.4 %, LINEs 17.25 %, LTR elements 20.8 %, DNA elements 0.40 %, and small nuclear RNA 0.18 %. Three sequence tagged sites are also shown: G14691, G01533, and G43501. D20S889, a major microsatellite marker, is 57084 bp downstream of the ferritin gene.
Since the new human chr 20 ferritin light chain is also intronless, one might guess that the second minor ferritin gene arose prior to rodent/primate divergence. This is not supported by alignment data: the minor ferritins clearly cluster with their respective parental genes (note especially the 7 residue deletion in both human genes), suggesting that the second mouse ferritin represents parallel evolution with respect to the second human ferritin gene, rather than orthology. (However, it remains possible that mutational drift of ferritin light chains is driven by a coupling to the within-species ferritin heavy chain; synteny mapping of the second mouse light gene would settle this.)

Human, like mouse, apparently has a large number of ferritin light chain pseudogenes. Within finished human sequence, 9 strong matches to ferritin light chain are found by tBlastn, notably on chr Xp22 (tandem) and Xq27, chr 1p34, and chr 11q12. Unfinished human sequence, tBlastn(htgs), shows a further 11 features, notably on chromosomes 11, 6, 8, and 1. Since the human genome is half-finished, these numbers could double.
The human chr 20 ferritin light gene is not a pseudogene if you believe the GenBank annotation. The support supplied is in the form of an upstream CpG island and in numerous EST matches. The CpG-enriched area lies 8 kilobases upstream of the gene and is not centered about any known ferritin exon or promoter. CpG islands have no natural polarity and this one could equally well belong to some gene 5' to the contig boundary. Ominously, it coincides with a tandem SVA insertion from 13769-16004. It is conceivable that a CpG island brought in by a transposon was recruited to serve the ferritin gene; more likely this is a retro-CpG feature not serving any host gene.
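One common way to put a number on "CpG-enriched" is to compute GC content and the observed/expected CpG ratio over a window. The sketch below assumes the widely used ratio (number of CpG x window length) / (number of C x number of G). The source does not say which criterion the GenBank annotation applied, and the function name `cpg_stats` is hypothetical.

```python
def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for one window."""
    s = seq.upper()
    n = len(s)
    c = s.count("C")
    g = s.count("G")
    cpg = s.count("CG")  # CpG dinucleotides on the given strand
    gc_fraction = (c + g) / n if n else 0.0
    # Expected CpG count under independence is c * g / n.
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


# Toy windows only; real use would pass a slice of the contig sequence.
print(cpg_stats("CGCGCGCG"))
print(cpg_stats("ATATAT"))
```

Running this over the SVA interval and over the region immediately upstream of the ferritin feature, using sequence fetched separately, would show whether the enrichment is confined to the transposon.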
RepeatMasker, run on 12800-17000 of AL049634 on 10 Jan 00, masks 99.81% of the sequence. SVA is a long terminal repeat composite retroposon with a SINE found in 315 genomic sequences; it is odd to have 4 fragments back to back. The CpG enrichment may have to do with retroviral gene expression or merely an area of very high GC content.
C SVA LTR/Retroviral (937) 449 115
+ LTR5 LTR/Retroviral 332 875 (94)
+ SVA LTR/Retroviral 1 954 (432)
+ SVA LTR/Retroviral 630 694 (692)
+ SVA LTR/Retroviral 521 882 (504)
+ SVA LTR/Retroviral 519 1340 (46)
A second downstream CpG island in the annotation at 31916..32941 apparently belongs to the following goliath-like gene because it partly envelops its first untranslated exon 32479..32723.
It is equally crucial to look closely at proposed evidence of transcription to be sure that ESTs should not be allocated to the main ferritin light gene on chromosome 19 instead. This is a common mistake in computer gene annotation -- to set a high threshold for Blast matches and assume the ESTs surviving the cutoff belong to the gene under study. Here the two genes are so close in sequence that only their 7 diagnostic differences drive EST allocation [arrows on graphic] via tBlastn(est). Similar diagnostic residues are found in flanking UTR that allocate ESTs not reaching coding regions.
In fact, not a single one of the 100+ EST "matches" at GenBank belongs to the chromosome 20 ferritin light gene. After allowing for an inherent error rate in EST sequence determination, all belong to the main ferritin light chain gene on chr 19. Therefore there is no support for feature transcription. Studies in the mid-80's found no evidence at the protein level for light chain heterogeneity. (This gene may only be rarely transcribed and/or transcribed only in tissues not used in compiling the EST databases -- the disease manifests itself mainly in basal ganglia within the globus pallidus.)
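The allocation rule can be sketched in a few lines of Python, using the chr 20 probe and the four EST rows from the alignment excerpt shown in this section. Two things here are assumptions rather than facts from the source: the diagnostic columns (7, 9 and 37) are simply read off where those rows differ from the probe, and the rule treats any diagnostic mismatch as a chr 19 call without modelling EST sequencing error. The names `EST_ROWS` and `allocate` are illustrative.

```python
# chr 20 probe: one 60-nt line of blast query from the alignment excerpt.
PROBE = "ttcttggcgaccaaccatgagctcccagattcgtcaaaattattccaccgacgtggaggc"

# EST rows in 'flat master-slave with identities' form: '.' means identical
# to the probe. Runs of dots are written as counts so the 60 columns are
# easy to verify.
EST_ROWS = {
    "3086553": "." * 8 + "a" + "." * 27 + "g" + "." * 23,
    "6361411": "." * 6 + "c.a" + "." * 27 + "g" + "." * 23,
    "5880135": "." * 6 + "c.a" + "." * 27 + "g" + "." * 23,
    "5662977": "." * 6 + "c.a" + "." * 27 + "g" + "." * 23,
}

# Assumption: 1-based probe columns treated as diagnostic for chr 20 vs chr 19.
DIAGNOSTIC_COLUMNS = (7, 9, 37)


def allocate(row, columns=DIAGNOSTIC_COLUMNS):
    """Call 'chr20' only if the EST matches the probe at every diagnostic column."""
    mismatches = [col for col in columns if row[col - 1] != "."]
    return ("chr19" if mismatches else "chr20"), mismatches


calls = {est: allocate(row) for est, row in EST_ROWS.items()}
for est, (locus, cols) in calls.items():
    print(est, locus, "differs from chr 20 probe at columns", cols)
```

With a full set of ESTs, the same loop would be repeated for each 60-nucleotide diagnostic stretch, including those in the flanking UTR.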
The easy way to allocate ESTs is to take a differentially diagnostic stretch of 60 nucleotides at a time (one line of blast output), 0 descriptions, 100 alignments as 'flat master-slave with identities.' Below, EST matches to chr 20 probe are shown: all 100 belong to chr 19 ferritin according to the 3 diagnostic bases:
chr_20 1 ttcttggcgaccaaccatgagctcccagattcgtcaaaattattccaccgacgtggaggc 60
3086553 222 ........a...........................g....................... 163
6361411 236 ......c.a...........................g....................... 177
5880135 224 ......c.a...........................g....................... 165
5662977 224 ......c.a...........................g....................... 165
...
The coded protein has no internal stop codons nor frameshifts and its 7 amino acid differences are fairly conservative (though this is an inherent property of the genetic code). Key charged residues involved in iron sequestration have not been affected though an alanine to threonine change is seen within this region. [Of the H-, L-, and M-type ferritin subunits in animals, only the H and M types have a functional diiron site. Ferritins concentrate iron in cells as a mineral; cytotoxic reactions of both Fe2+ and O2 are controlled by ferritin chemistry as ferric oxo reaction products are directed to a large cavity as a ferric oxide hydrate. A peroxodiferric intermediate in the ferritin ferroxidase reaction shows the ferritin ferroxidase site to be very similar to O2-activating diiron enzyme sites.]
It cannot be determined bioinformatically whether the hypothetical chr 20 ferritin would interact properly with ferritin heavy chain to form a (globus pallidus-specific?) functional 24-mer nor even whether this would be the function of the new gene. [The 5' end potentially continues for 19 additional amino acids but this has no Blastp(nrp) support.] The 5' leader sequence could also be tested for stem-loop changes affecting the iron regulatory element IRE.
Could the new ferritin light chain be "older than it looks?" That is, the ratio of non-coding to coding changes may be high (parental gene is used for comparison). But this is not so: 7 of 14 nucleotide changes are non-synonymous, a 1:1 ratio. Adjacent UTR might establish baseline rates of mutational fixation in this region of the genome except for the fact that these 5' regions are conserved regulatory regions.
The 3' UTRs agree at 114 of the first 116 positions, slightly better than coding region. This does not support selective pressure acting at the protein level for some time interval following gene establishment. At a rate of 3 mutations fixed per 100 residues per 10 million years (a generic pseudogene rate), the ferritin feature on chr 20 is roughly 8.3 million years old.
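The age estimate is simple arithmetic on the counts given above. The source does not state which sites went into the 8.3 million year figure; the sketch below assumes it pools the 14 changes in the 528 bp coding region with the 2 changes in the first 116 positions of the 3' UTR, which happens to reproduce that number. The function name `age_my` is illustrative.

```python
# Generic pseudogene rate quoted in the text:
# 3 substitutions fixed per 100 residues per 10 million years.
RATE_PER_SITE_PER_MY = 3 / 100 / 10


def age_my(changes, sites, rate=RATE_PER_SITE_PER_MY):
    """Estimated age in millions of years from a raw divergence count."""
    return (changes / sites) / rate


coding = age_my(14, 528)               # coding region only
utr = age_my(116 - 114, 116)           # first 116 positions of the 3' UTR
pooled = age_my(14 + 2, 528 + 116)     # assumption: both regions pooled

print(f"coding {coding:.1f} My, 3' UTR {utr:.1f} My, pooled {pooled:.1f} My")
```

The per-region estimates bracket the pooled value, which gives a rough sense of how sensitive the date is to the choice of sites; none of these figures corrects for multiple hits or rate variation.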
SwissProt chr 19 ferritin P02792
INIT_MET 0 0
MOD_RES 1 1 ACETYLATION.
DOMAIN 53 60 CATALYTIC SITE FOR IRON OXIDATION.
METAL 53 53 IRON (POTENTIAL).
METAL 56 56 IRON (POTENTIAL).
METAL 57 57 IRON (POTENTIAL).
METAL 60 60 IRON (POTENTIAL).
METAL 63 63 IRON (POTENTIAL).
What is the bottom line here? On the one hand, the chromosome 20 ferritin has all the properties of a classical pseudogene derived recently from the liver ferritin mRNA from chr 19: it is processed (intronless) and retropositioned (direct flanking repeats, genomic poly A). There is no evidence that it is ever transcribed, much less translated into protein, much less specifically in the affected globus pallidus in HSS. (Note: gene expression (ESTs) has never been studied from basal ganglia in the globus pallidus.)
On the other hand, the insertion is full length (not truncated) and fortuitously follows a pentanucleotide identical to authentic ferritin promoter (if the entry to X03742 is to be believed). There are many precedents for intron-purged functional genes, especially from the X chromosome, arising in this manner. It is common for gene duplications to diverge in function via tissue-specific expression. It is also biochemically plausible that a ferritin light chain, when suitably mutated, could give rise to iron deposition. And here is a disease of iron accumulation mapping very close to a genomic feature concerned with iron metabolism.
Thus it is premature to decide between gene/pseudogene or to posit a role in Hallervorden-Spatz syndrome. It may be necessary to find all gene candidates in the mapping region and sequence all of them as well in affected individuals. However, the single-exon ferritin feature could readily be sequenced in a single pass and is a prime candidate for an early screen. Its diagnostic sequence differences also allow design of specific primers that could amplify rare or tissue-specific mRNAs. The putative protein could be produced recombinantly and possibly be resolved from liver ferritin by monoclonals.
Genes involved with iron metabolism are not that common. To have the disease map so close to such a gene, in conjunction with a lack of other plausible candidates from the current human genome project, favors this ferritin light gene as responsible for HSS.
Nat Genet 1996 Dec;14(4):479-81. Published erratum appears in Nat Genet 1997 May;16(1):109 (concerns unpublished locus heterozygosity).
Taylor TD, Litt M, Kramer P, Pandolfo M, Angelini L, Nardocci N, Davis S, Pineda M, Hattori H, Flett PJ, Cilio MR, Bertini E, Hayflick Susan J (503) 494-7703. Excellent HSS research website at OHSU.
Hallervorden-Spatz syndrome is a rare, autosomal recessive neurodegenerative disorder with brain iron accumulation as a prominent finding. Clinical features include extrapyramidal dysfunction, onset in childhood, and a relentlessly progressive course. Histologic study reveals massive iron deposits in the basal ganglia. Systemic and cerebrospinal fluid iron levels are normal, as are plasma levels of ferritin, transferrin and ceruloplasmin. Conversely, in disorders of systemic iron overload, such as haemochromatosis, brain iron is not increased, which suggests that fundamental differences exist between brain and systemic iron metabolism and transport. In normal brain, non-haem iron accumulates regionally and is highest in basal ganglia. Pathologic brain iron accumulation is seen in common disorders, including Parkinson's disease, Alzheimer's disease and Huntington disease. In order to gain insight into normal and abnormal brain iron transport, metabolism and function, our approach was to map the gene for HSS. A primary genome scan was performed using samples from a large, consanguineous family (HS1) (see Fig. 1). While this family was immensely powerful for mapping, the region demonstrating homozygosity in all affected members spans only 4 cM, requiring very close markers in order to detect linkage. The HSS gene maps to an interval flanked by D20S906 and D20S116 on chromosome 20p12.3-p13. Linkage was confirmed in nine additional families of diverse ethnic backgrounds.
Mamm Genome 1992;2(3):143-9
Renaudie F, Yachou AK, Grandchamp B, Jones R, Beaumont C
Multiple homologous sequences for the ferritin L subunit are present in mammalian genomes, but so far, only one expressed gene has been described. Here we report the isolation of a cDNA from a mouse bone marrow library, corresponding to an isoform of the mouse ferritin L subunit. This new subunit, that we named Lg, differs from the L subunit of ten amino acids. Specific amplification of mouse genomic DNA using the polymerase chain reaction (PCR) confirmed the presence of this Lg sequence in the mouse genome but also suggested that it must be encoded by an intronless gene.
Using a series of different Lg-specific oligonucleotides as probes, we subsequently isolated a genomic clone containing an uninterrupted sequence, identical to the Lg cDNA. This Lg gene lacks introns and does not contain the 28 base pairs (bp) conserved motif usually present at the 5' end of most ferritin mRNAs, which confers translational regulation by iron. When transiently transfected into K562 cells, this Lg genomic clone is actively transcribed, suggesting that, although it possesses the characteristics of a processed pseudogene, it is likely to correspond to the gene encoding this new ferritin subunit.
C R Acad Sci III 1995 Apr;318(4):431-7
Renaudie F, Boulanger L, Grandchamp B, Beaumont C
We have cloned the functional gene coding for the L ferritin subunit by successive rounds of screening of a mouse genomic library using different oligonucleotides so as to avoid cloning the multiple pseudogenes of this rather complex multigene family. The L gene consists in 4 exons interrupted by 3 introns and spanning 1.8 kb. Quantitative measurements of H and L ferritin mRNA in various mouse tissues using a ribonuclease protection assay reveals important variations in the L/H ratio, the liver displaying the highest amount of L mRNA. Functional analysis of 1 kb of upstream sequence by transient transfections into the hepatoma cell line HepG2 shows that the mouse L gene transcription relies upon a minimal 130 bp promoter region containing 1 TATA box and 2 CCAAT motifs. Elements with an enhancing activity specific of hepatic tissue are likely to be located outside of this 1 kb fragment.
Proc Natl Acad Sci U S A 1988 Dec;85(24):9503-7
Walden WE, Daniels-McQueen S, Brown PH, Gaffield L, Russell DA, Bielser D, Bailey LC, Thach RE
Mouse and rabbit ferritin mRNAs translate very poorly in rabbit reticulocyte lysates relative to most other mRNAs. This translational deficiency is not seen in wheat germ lysates, suggesting the presence of an inhibitor in reticulocyte lysate that is specific for ferritin mRNA. A specific repressor of ferritin mRNA translation has been partially purified. The inhibitory activity of this repressor against native ferritin mRNA can be relieved by adding in vitro transcripts of ferritin light-chain RNAs that contain the first 92 nucleotides of the 5' untranslated region. No other sequences appear to be necessary for this effect.
J Biol Chem 1987 May 25;262(15):7335-41
Leibold EA, Munro HN
The iron storage protein ferritin consists of two types of subunits of different molecular weight, heavy (H) and light (L). The rat genome contains approximately 20 copies of the ferritin L-subunit gene, of which we have sequenced seven. One is an expressed ferritin gene containing three introns located between the alpha-helical domains of the L-subunit protein. The remaining six have the characteristics of processed pseudogenes. Sequence divergence suggest that these pseudogenes arose approximately 3-12 X 10(6) years ago. By using intron probes derived from the expressed ferritin L-gene, a homologous second copy has been identified in some Fischer rats. Comparison of the 5'-untranslated region of the rat L-gene with the published sequences of this region of the human L show a strongly conserved 28-base pair sequence, suggesting a translational regulatory function. The 5' flanking region of the rat L-gene contains sequences homologous to those in the flanking areas of the human L- and H-genes.
last updated: 3 Jan 00 webmaster
61853 bp remaining in contig HSJ189G13 1396 bp to start of 406 bp psL7a feature 106461 bp to ATG of prion 25331 bp to ATG of doppel 38620 bp to start of dynein 16617 bp to end of contig 45000 bp to end of KIAA0168 295278 bp total lengthWe look here at the region upstream of the prion-doppel tandem to see what the nearest neighbor might be in the 5' direction based on a 63,249 bp unfinished contig, AL121916, positions 56750-57209, that showed up at htgs on 19 Dec 99. The adjacent 30,000 bp is basically just another heavily parasitized wasteland of the human genome, in this case with nothing in it but a slightly entertaining pseudogene for ribosomal protein large subunit, L7a.
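The distance tally that opens this section is a running sum, and it is easy to confirm that the pieces add up to the stated total. A minimal sketch follows; the list name `SEGMENTS` is illustrative, and the 406 bp length of the psL7a feature is treated as part of a label rather than a separate segment, which is an assumption that the arithmetic bears out.

```python
# (length in bp, label) pairs copied from the distance tally.
SEGMENTS = [
    (61853, "remaining in contig HSJ189G13"),
    (1396, "to start of 406 bp psL7a feature"),
    (106461, "to ATG of prion"),
    (25331, "to ATG of doppel"),
    (38620, "to start of dynein"),
    (16617, "to end of contig"),
    (45000, "to end of KIAA0168"),
]
STATED_TOTAL = 295278

total = sum(bp for bp, _ in SEGMENTS)
print(total, total == STATED_TOTAL)

# The first two entries together span the unfinished contig AL121916.
print(SEGMENTS[0][0] + SEGMENTS[1][0])
```

The first two entries sum to 63,249 bp, the length quoted for AL121916, so the psL7a feature sits 1396 bp from one end of that contig.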
There are many dozens of these L7a pseudogenes in the human genome; chromosome 20 alone will have about 8 of them. We saw earlier another ribosomal protein, RPS4X, nearer the prion gene. (Proteins made in great abundance have many mRNA copies and so a greater likelihood of retrotransposition?)
Processing the 30 kb contig through GeneBander, it quickly emerges that GenScan is mostly predicting actively transcribed repeat elements, not host genes. GenScan does however correctly find a piece of this pseudogene in the midst of an otherwise erroneously predicted gene.
It has become fashionable to inflate gene counts and downplay pseudogenes, e.g., the recent 3 Arabidopsis papers or human chr 22, via reliance on GenScan and XGrail. The psychology is that genome sequencing is such hard work that long barren stretches are more than anyone can bear to report, leading to gene exaggeration.
It is much better to take the blastn(est) matches of the repeat-masked sequence and tblastn(nrp) them, because the mRNA database is approaching saturation. ESTs have experimental reality (even though not all are mRNAs); ab initio predictions do not.
It is soon evident that the chr 20 feature contains exons 6, 7, and 8 of the authentic L7a gene, PRPL7a (but without the respective introns), as well as the 3'UTR, two poly A signals, and a poly A site, followed by genomic poly T (the feature is on the minus strand relative to prion and doppel). While direct flanking repeats are not evident, a translation gives both an internal frameshift and a stop codon, though identity is 86% after frame-jumping. There is no evident association with retrotransposons, nor are there any ESTs that 'belong' to this genomic stretch.
This fits the classical picture of a processed, non-expressed, recent retropositional mRNA pseudogene. However, a segment of intron 5 of authentic L7a also had an unmistakable match within the feature. Processed pseudogenes don't have introns. What is going on -- might this really be a genomic transposition? [This would possibly implicate prion and doppel as having originated elsewhere in the same event.]
No. Two ESTs also cover this intron (which then isn't really intronic): AI274211 and AI272858. In other words, alternative splicing occurs in the L7a gene (rarely today, less than 2%), leading to mRNAs with an upstream extension of exon 6. The ESTs themselves were too short to determine the upstream splice acceptor directly and could not be further tiled. Conceptual translations and web splice site predicting tools resolve this.
If alternatively spliced L7a mRNAs were as uncommon at the time of the event 15 million years ago as they are today, we are left thinking it odd that one of these was 'chosen' for retrotransposition and wondering if the 5' truncation in this very region is coincidental, that is, whether some structural anomaly of the alternatively spliced message predisposed it to fragmentary retrotransposition.
A ribosome does not seem offhand like a good place for an alternatively spliced protein because of the need for translational fidelity; it is not easily checked whether other ribosomal proteins exhibit this property. L7a is highly conserved: the human protein is not far from Drosophila or, for that matter, yeast. In fact, the 'extended' exon 6s are the rule in species such as Schizosaccharomyces pombe, suggesting the mammalian L7a short splice may not in fact be ancestral.
There has not been a lot of 'action' in chromosome 20 over the last 500 million years, either at the 300,000 bp scale here or on the p arm of the whole chromosome. Horse prion mapped to horse chr 22 [December 99 Genome Research] as did all human chr 20 q genes tested. That is why finding the KIAA0168 gene in zebrafish was so important; even if zfish prions cannot be primed, they can be identified from proximity to KIAA0168 and are very likely to be present on its clones.
chr 20 genomic psL7a reference sequence: 460 bp
aaggaaaatttctattattttaattatttttatgtacagaaaactcaacagcgtac
atttaacccagtttagtcgcaagttctttagccttcgccttttttagcttggtgat
gcgagccacagacttgggacccaggacattacctccccagtgacagcagatctcat
cgtatctgtcgctgtaattggtcctgatagtttccaccagcttagccaaagctcct
ttgtcttccaagttaacctgtgcgaaggcagcagtggtgcctctcttcttgtggac
tagaagtcccagtcttgacttccccttgataaagcagtaagggccccccatttttt
atgacacagggcaggcaggaagacaaccagctagaagaaagcactagctgaagagc
atattttgaccaaaagcagtaaatttcaaagctagctgggtagcaactgctctgggttaaaaagttca
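The minus-strand reading can be checked directly from the reference sequence. The sketch below, in Python with helper names of our own invention, reverse-complements bases 62-154 (counting the listed sequence from 1 and ignoring whitespace) and translates them with the standard genetic code. It should reproduce the chr 20 row of the alignment block that runs from 154 down to 62.

```python
# First three 56-base rows of the psL7a reference sequence (bases 1-168).
PLUS_STRAND = (
    "aaggaaaatttctattattttaattatttttatgtacagaaaactcaacagcgtac"
    "atttaacccagtttagtcgcaagttctttagccttcgccttttttagcttggtgat"
    "gcgagccacagacttgggacccaggacattacctccccagtgacagcagatctcat"
)

# Standard genetic code, built from the usual TCAG ordering of codons.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (upper case)."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons from the first base; '*' marks a stop."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def minus_strand_peptide(seq, high, low):
    """Translate the minus strand from 1-based position high down to low."""
    return translate(reverse_complement(seq[low - 1:high]))


peptide = minus_strand_peptide(PLUS_STRAND, 154, 62)
print(peptide)  # HWGGNVLGPKSVARITKLKKAKAKELATKLG
```

Extending the window by one codon, to position 59, ends the reading on a stop codon.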
chr 20: 366 LVVFLPALCHK 334
            LVVFLPALC K       site of frameshift
L7a:    166 LVVFLPALCRK 176

chr 20: 334 KMGGPYCFIKGKSRLGLLVHKKRGTTAAFAQVNLEDKGALAKLVETIRTNYSDRYDEICC 155
            KMG PYC IKGK+RLG LVH+K  TT AF QVN EDKGALAKLVE IRTNY+DRYDEI
L7a:    176 KMGVPYCIIKGKARLGRLVHRKTCTTVAFTQVNSEDKGALAKLVEAIRTNYNDRYDEIRR 235

chr 20: 154 HWGGNVLGPKSVARITKLKKAKAKELATKLG 62
            HWGGNVLGPKSVARI KL+KAKAKELATKLG
L7a:    236 HWGGNVLGPKSVARIAKLEKAKAKELATKLG 266
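Identity across the three alignment blocks can be tallied column by column. What follows is a minimal Python sketch with names of our own choosing. It assumes simple exact-match counting over the rows as printed, so the lysine at the frameshift is counted in both the first and second blocks. The percentage depends on such conventions and on which span is scored, so it need not reproduce the 86% figure quoted in the text.

```python
# Aligned rows copied from the three blocks: (chr 20 translation, authentic L7a).
blocks = [
    ("LVVFLPALCHK", "LVVFLPALCRK"),
    (
        "KMGGPYCFIKGKSRLGLLVHKKRGTTAAFAQVNLEDKGALAKLVETIRTNYSDRYDEICC",
        "KMGVPYCIIKGKARLGRLVHRKTCTTVAFTQVNSEDKGALAKLVEAIRTNYNDRYDEIRR",
    ),
    ("HWGGNVLGPKSVARITKLKKAKAKELATKLG", "HWGGNVLGPKSVARIAKLEKAKAKELATKLG"),
]


def count_identities(a, b):
    """Count columns where two equal-length aligned rows carry the same residue."""
    if len(a) != len(b):
        raise ValueError("aligned rows must have equal length")
    return sum(x == y for x, y in zip(a, b))


# (identical columns, total columns) for each block, then pooled totals.
per_block = [(count_identities(a, b), len(a)) for a, b in blocks]
matches = sum(m for m, _ in per_block)
columns = sum(n for _, n in per_block)
percent = 100 * matches / columns

print(per_block)                               # [(10, 11), (46, 60), (29, 31)]
print(f"{matches}/{columns} = {percent:.1f}%")  # 85/102 = 83.3%
```

Counting conservative substitutions (the '+' columns) as matches, or scoring a different span, would change the figure.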
11 Jan 99 webmaster
These diseases also map very near prion-doppel; more details shortly.
OMIM: The corneal dystrophies can be classified according to the site of predominant involvement, the cornea having 5 layers: from outside inward, epithelium, Bowman membrane, stroma, Descemet membrane, and endothelium. Most cases are recessive.
Ophthalmic Genet 1999 Dec;20(4):243-249 Kanis AB, Al-Rajhi AA, Taylor CM, Mathers WD, Folberg RY, Nishimura DY, Sheffield VC, Stone EM
This study sought to determine whether AR-CHED segregating in a consanguineous Saudi Arabian pedigree is linked to the previously mapped and overlapping loci for AD-CHED and PPMD on the pericentric region of chromosome 20. Forty members of a consanguineous Saudi Arabian pedigree segregating AR-CHED were ascertained. Short tandem-repeat polymorphic markers from the 20 cM interval on chromosome 20 containing both the PPMD and AD-CHED loci were used to genotype these individuals. LOD score analysis of the genotype data with the MENDEL software package utilizing a model of autosomal recessive inheritance with complete penetrance showed exclusion of CHED from the entire PPMD/AD-CHED interval by utilizing overlapping intervals of LOD scores of at least -2. The results obtained demonstrate that AR-CHED is not allelic to either AD-CHED or PPMD, although it has been proposed that AD-CHED may be allelic to PPMD.
Genomics 1999 Oct 1;61(1):1-4 Hand CK, Harmon DL, Kennedy SM, FitzSimon JS, Collum LM, Parfrey NA
Congenital hereditary endothelial dystrophy (CHED) is a corneal disorder that presents with diffuse bilateral corneal clouding. Both autosomal dominant (AD) and autosomal recessive (AR) forms of the disorder have been described. The gene responsible for AD CHED (HGMW-approved symbol CHED1) has been mapped to the pericentromeric region of chromosome 20. Investigating a large, consanguineous Irish pedigree with autosomal recessive CHED, we have previously excluded linkage to this AD CHED locus. We now describe a genome-wide search using homozygosity mapping and DNA pooling. Evidence of linkage to chromosome 20p was demonstrated with microsatellite marker D20S482.
A region of homozygosity in all affected individuals was identified, narrowing the disease gene locus to an 8-cM region flanked by markers D20S113 and D20S882. This AR CHED (HGMW-approved symbol CHED2) disease gene locus is physically and genetically distinct from the AD CHED locus.
Br J Ophthalmol 1999 Jan;83(1):115-9 Callaghan M, Hand CK, Kennedy SM, FitzSimon JS, Collum LM, Parfrey NA
Conventional genetic analysis in addition to a pooled DNA strategy excludes linkage of AR CHED to the AD CHED and larger PPMD loci. This demonstrates that AR CHED is genetically distinct from AD CHED and PPMD.
Hum Mol Genet 1995 Mar;4(3):485-8 Heon E, Mathers WD, Alward WL, Weisenthal RW, Sunden SL, Fishbaugh JA, Taylor CM, Krachmer JH, Sheffield VC, Stone EM
Posterior polymorphous dystrophy (PPMD) is an autosomal dominant disorder of the cornea that is clinically recognized by the presence of vesicles on the endothelial surface of the cornea. The corneal endothelium is normally a single layer of cells that lose their mitotic potential after development is complete. In PPMD, the endothelium is often multi-layered and has several other characteristics of an epithelium including the presence of desmosomes, tonofilaments, and microvilli. These abnormal cells retain their ability to divide and extend onto the trabecular meshwork to cause glaucoma in up to 40% of cases. A large family with 21 members affected with PPMD was genotyped with short tandem repeat polymorphisms distributed across the autosomal genome. Linkage was established with markers on the long arm of chromosome 20. The highest observed LOD score was 5.54 (theta = 0) with marker D20S45. Analysis of recombination events in four affected individuals revealed that the disease gene lies within a 30 cM interval between markers D20S98 and D20S108.