Flanking Genes in the Prion Gene Neighborhood

GenMap00: how the map was made
GenMap00: the map
Hallervorden-Spatz syndrome
L7a: A processed pseudogene with retained intron
Nearby diseases: AR CHED, AD CHED, and PPMD

Making GenMap00

Last updated: 7 Feb 00 webmaster research
Previous maps of the human genome, such as GenMap99 and its predecessors, have created a framework that locates a large number of microsatellites used to determine locations of human disease genes. While radiation hybrid mapping may in principle be capable of sufficiently detailed mapping, as matters stand the map in the chromosome 20p12 region lacks the needed precision. Many markers are allocated to multiple (contradictory) positions, which prevents a unique ordering and yields no meaningful recombination mapping distances.

This causes monogenic disease research to stall out at a critical juncture -- associating the disease with a particular gene. If the disease can only be mapped to a 3 million base pair region, gene density is such that 30-50 different genes or more would have to be screened in the patient set and controls.
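The arithmetic behind that screening burden can be made explicit. The sketch below uses only the figures quoted above (a 3 million base pair interval holding 30-50 genes); the function names and the assumption of uniform gene density are illustrative, not part of the original analysis.

```python
# Back-of-envelope screening burden for a mapped disease interval.
# Assumption: genes are spread evenly over the interval, which real
# chromosomal regions only approximate.

REGION_BP = 3_000_000            # interval size quoted in the text
GENES_LOW, GENES_HIGH = 30, 50   # gene count range quoted in the text


def genes_per_megabase(genes: int, region_bp: int) -> float:
    """Gene density implied by a gene count over a region."""
    return genes / (region_bp / 1_000_000)


def expected_genes(region_bp: int, density_per_mb: float) -> float:
    """Genes expected in a region of the given size at that density."""
    return density_per_mb * region_bp / 1_000_000


low = genes_per_megabase(GENES_LOW, REGION_BP)    # 10 genes per Mb
high = genes_per_megabase(GENES_HIGH, REGION_BP)  # about 16.7 genes per Mb

# Narrowing the interval to 500 kb would leave roughly 5 to 8 candidates.
narrowed = (expected_genes(500_000, low), expected_genes(500_000, high))
```

Under these assumptions, every sixfold reduction in the mapped interval removes about five sixths of the candidate genes, which is why marker precision matters so much at this stage.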

A different approach was taken in the Whitehead Institute map. Here, chromosome 20 was tiled with a set of overlapping yeast artificial chromosomes (YACs); markers were positioned relative to hits on this panel. This resulted in an ordinal map: microsatellites were sequentially ordered with respect to their telomere to centromere position without any physical or centimorgan distances resulting. It quickly emerges that this map is superior in accuracy to GenMap99.

Meanwhile, the Sanger Centre continues direct sequencing of the chromosome itself. At this time, chr 20 is half done, with equal parts finished and unfinished (sequenced but unassembled contigs). This data and the associated WebAce database realization of mapping status, while released daily, is user-unfriendly. However, it re-surfaces at GenBank at the high-throughput-genome-sequence (htgs) database and at the finished contig repository for chromosome 20 (NT_00xxxx) reference sequences.

It is this third sequence map that enables integration of GenMap99 and the Whitehead map into GenMap00. The NT_00xxxx web page lists finished contigs in order of physical position, provides links to bare sequence, to annotated GenBank entries when available, and to some microsatellite and gene STS markers on that contig.

Needless to say, it is not so simple. The three maps do not use the same set of microsatellites and other markers, though there is considerable partial overlap (Venn diagram). Worse, each of 5-6 mapping centers using the composite marker set felt obligated to assign its own name (synonym) to a marker, with the result that maps cannot be visually compared for a given marker. Worse still, no single look-up site (1, 2) carries more than a few of the name equivalencies; some older names were even withdrawn or given new definitions. This made the maps nearly impossible to use directly with the disease mapping nomenclature given in the medical literature.

GenMap00 had to address this synonymy issue early on in order to combine and consolidate the 3 maps. With a 'find' operation on the last 4 characters, say in Netscape, it is now easy to find a marker of interest in the final map, even if the name did not appear in one of the constituent maps.
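The same last-4-characters lookup can be scripted when many names need checking. This is a minimal sketch of the idea, not the procedure used to build the map: the function names are hypothetical, the sample names are taken from the marker columns of the table, and every candidate it returns still needs checking by eye, since unrelated markers can share a suffix.

```python
from collections import defaultdict


def suffix_key(name: str, n: int = 4) -> str:
    """Case-insensitive key built from the last n characters of a name."""
    return name.strip().upper()[-n:]


def build_suffix_index(names, n: int = 4):
    """Group marker names by their last n characters."""
    index = defaultdict(list)
    for name in names:
        index[suffix_key(name, n)].append(name)
    return dict(index)


def find_candidates(query: str, index, n: int = 4):
    """Return every indexed name sharing the query's last n characters."""
    return index.get(suffix_key(query, n), [])


# Names as they appear in the marker and synonym columns of the map.
names = ["AFMA057VB1", "A057VB1", "AFM338td9", "WI-5974", "stSG20142"]
index = build_suffix_index(names)

matches = find_candidates("a057vb1", index)  # ['AFMA057VB1', 'A057VB1']
```

Keying on the tail of the name works because mapping centers tended to vary the prefix (AFM, AFMA, sts-) while keeping the clone or accession suffix intact.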

The STS and gene marker sets shown on sequenced NT_00xxxx contigs are very incomplete, even though a nearby division of GenBank maintains the STS database against which Blast searches might have been conducted (best done by not repeat-masking for simple repeats). In making GenMap00, each of the annotated entries was opened and searched for markers ('misc_feature'), allowing for synonymy. Additionally, each of the residual markers on the Whitehead and GenMap99 maps was used as a filtered Blastn query against the unfinished htgs and finished nrn GenBank databases.

These matches must be evaluated with due consideration for misleading hits caused by the simple repeat nature of microsatellites. Some markers additionally contain ALU and similar elements; others are mRNAs of a parent gene on a different chromosome that match a pseudogene. Matches to unfinished contigs can also fail to the extent that the marker overlaps an end.
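Some of these cautions can be turned into automatic warnings before a hit is inspected by hand. The sketch below is an illustration under stated assumptions rather than the filter actually used for GenMap00: the `Hit` fields, the function names, and both thresholds are hypothetical, and the repeat detector only recognizes perfect tandem repeats of 1-4 base units, not ALU elements or pseudogene matches.

```python
import re
from dataclasses import dataclass

# A unit of 1-4 bases followed by at least four more copies of itself.
SIMPLE_REPEAT = re.compile(r"([ACGT]{1,4}?)\1{4,}")


def simple_repeat_fraction(seq: str) -> float:
    """Fraction of the sequence lying inside perfect short tandem repeats."""
    seq = seq.upper()
    if not seq:
        return 0.0
    covered = sum(m.end() - m.start() for m in SIMPLE_REPEAT.finditer(seq))
    return covered / len(seq)


@dataclass
class Hit:
    """Hypothetical summary of one Blastn match of a marker to a contig."""
    query_len: int       # length of the marker sequence
    align_len: int       # length of the aligned portion
    subject_start: int   # 1-based start on the contig
    subject_end: int     # 1-based end on the contig
    subject_len: int     # total contig length


def hit_warnings(hit: Hit, marker_seq: str,
                 max_repeat: float = 0.5, min_coverage: float = 0.9):
    """List reasons a hit deserves manual review (thresholds are guesses)."""
    warnings = []
    if simple_repeat_fraction(marker_seq) > max_repeat:
        warnings.append("mostly simple repeat")
    if hit.align_len / hit.query_len < min_coverage:
        warnings.append("partial alignment")
    lo, hi = sorted((hit.subject_start, hit.subject_end))
    if lo == 1 or hi == hit.subject_len:
        warnings.append("touches contig end")
    return warnings
```

A hit that draws no warnings is not thereby confirmed; the point is only to sort the obviously doubtful matches to the top of the review pile.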

GenMap00 also takes in various lists of protein-coding genes that have been compiled for this chromosomal region, provided that tblastn of the protein against finished or unfinished chr 20 could validate a location. This was supplemented by direct annotation of certain contigs; this varied from high intensity characterization of all features using GeneBander to quick-pass searches for easily identified known genes. Large pieces of several unfinished contigs were also annotated.

Genes and proteins also have a confused nomenclature arising from use of names that came up in weak homology matches (eg lactate dehydrogenase for goliath protein); proteins in this region mainly have poorly understood roles and so no logical names at this time. Many STS markers are mRNAs from these genes; in fact, it is not unusual for a half dozen slightly different STS markers to originate from a single gene. GenMap99 tended to use mediocre computer-generated EST annotations but at least this gave the same name, however inappropriate, to some nearby markers.

The NT_00xxxx map proposes a telomere-to-centromere order for most of its entries, even going beyond this to give absolute physical distances in kilobase units, with gaps estimated as well. This order is independent of, but largely consistent with, the YAC and radiation maps; it ultimately derives from much more detailed sequencing work at the Sanger Centre. However, certain NT_00xxxx contigs in the chr20p12 region lie within the unlabelled set at the bottom; apparently these await further processing. Some of these fail to be in the main non-redundant databases as well.

Orientation is not fully resolved. That is, a particular NT_00xxxx contig might be in correct relative physical order but be 'upside-down' relative to the telomere-centromere axis, ie, the reverse-complement sequence should have been provided. This is less likely in the case of megabase contigs. Contigs containing several markers from the Whitehead ordinal map allow for a consistency check. GenMap00 uses the NT_00xxxx contigs as given; unfinished contigs are intercalated less reliably, mainly based on Whitehead map order.
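The consistency check amounts to asking whether Whitehead order rises or falls with position along the contig. A minimal sketch follows, assuming each marker is given as a (position within contig, Whitehead order) pair; the function name, the labels it returns, and the sample numbers are hypothetical.

```python
def contig_orientation(markers):
    """Compare position along a contig with Whitehead ordinal rank.

    markers: iterable of (position_in_contig, whitehead_order) pairs.
    Returns 'forward', 'reversed', 'inconsistent', or 'undetermined'.
    """
    by_position = sorted(markers, key=lambda m: m[0])
    ranks = [rank for _, rank in by_position]
    # Differences between neighbours; ties in rank carry no information.
    steps = [b - a for a, b in zip(ranks, ranks[1:]) if b != a]
    if not steps:
        return "undetermined"
    if all(s > 0 for s in steps):
        return "forward"
    if all(s < 0 for s in steps):
        return "reversed"
    return "inconsistent"


# Made-up positions (kb) with Whitehead orders for illustration only.
as_given = contig_orientation([(10, 39), (55, 40), (120, 41)])  # 'forward'
flipped = contig_orientation([(10, 41), (55, 40), (120, 39)])   # 'reversed'
```

A 'reversed' result suggests the contig was deposited as the reverse complement; an 'inconsistent' one points instead to a marker placement error in one of the two maps.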

On the plus side, nothing beats a long, reliably determined DNA sequence for establishing microsatellite position and order. GenMap00 has many instances where the absolute kilobase distance and order are established for sizeable groups of markers commonly used on the radiation hybrid and Whitehead maps as well as in disease research. Even marker sets in unfinished contigs (unordered internal pieces) helpfully block up and localize common map markers. It is important to use these because half the data is in this form.

GenMap00 is not a static object. Being a 'leading indicator' directly tied to the sequencing of chromosome 20, it rapidly converges to unambiguous status, though functional annotation of all its genes will take many years. GenMap00 is not exhaustive either: while most of the commonly used markers are precisely located, more of the constituent contigs could be annotated at higher intensities now. New markers can be designed directly from low complexity sequence in areas where needed. There are no obstacles to extending the strategies of GenMap00 to the whole genome.

For disease genes, where time is of the essence, the best strategy is to intensify classical microsatellite mapping using old and new markers of GenMap00 while at the same time intensifying gene annotation in the critical region. The synergy is a double winnowing: the disease becomes better localized; at the same time, genes plausible for the phenotype become better annotated. This reduces the number of genes that have to be sequenced for confirmation in patients and controls; this is especially important in the circumstance of non-coding mutation.

GenMap00: the map

Last updated: 7 Feb 00 webmaster research
The columns of GenMap00 are as follows:

 1: marker number
 2: disease marker
 3: mouse synteny on chromosome 2
 4: marker name
 5: marker synonyms
 6: gene symbol and name (to extent known)
 7: GenMap99 map position
 8: Whitehead YAC map order
 9: GenMap00 cluster number
10: sequential order of markers occurring within a given contig, 0 if not known
11: NT_00xxxx finished RefSeq contig
12: GenBank accession of unfinished unassembled contigs
13: size of contig in kilobase pairs
14: physical distance from telomere in kilobase pairs
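Columns 13 and 14 can be cross-checked against each other, since the distance column is written as a start-end range in kilobases and '-' marks an empty cell. The sketch below assumes exactly that cell format; the function names are hypothetical.

```python
from typing import Optional, Tuple


def parse_kb_range(cell: str) -> Optional[Tuple[int, int]]:
    """Parse a column 14 cell such as '2378-2531'; '-' or blank means no value."""
    cell = cell.strip()
    if cell in ("", "-"):
        return None
    start, end = cell.split("-")
    return int(start), int(end)


def span_kb(cell: str) -> Optional[int]:
    """Length in kb implied by a start-end range, or None if the cell is empty."""
    parsed = parse_kb_range(cell)
    if parsed is None:
        return None
    return parsed[1] - parsed[0]


# The range listed for contig NT_002249, whose size column reads 153.
first_contig_span = span_kb("2378-2531")  # 153
```

Where the span and the stated contig size differ by more than a kilobase or so, one of the two cells is likely out of date.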
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
34
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
hss
-
hss
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
hss
-
ch2
hss
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
hss
-
-
-
-
-
-
-
-
-
hss
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
hss
hss
-
-
-
-
-
-
-
-
-
-
-
-
hd
hss
hss
hss
hss
-
-
-
-
-
-
-
hss
-
-
-
-
-
-
-
-
-
-
-
-
-
hss
-
-
gss
-
hd
ch2
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
dia
-
-
-
-
-
-
-
-
-
mck
-
-
-
-
hss
-
-
-
-
-
-
-
mck
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
ch1
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
84
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
73.1
73.1
73.1
73.1
73.1
73.1
73.1
73.1
73.1
73.1
73.1
73
73
-
-
-
-
83.9
83.9
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
73
73
73
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
74
74
74
74
-
-
-
-
-
-
-
-
-
-
-
71
-
-
-
75
74.5
74.5
74.2
74.2
-
-
-
-
-
-
-
-
-
-
-
-
-
-
74.9
74.9
74.9
-
-
-
-
-
-
-
-
-
-
-
75.2
-
-
-
-
-
-
-
-
-
-
-
73
75.2
75.2
75.2
75.2
75.2
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
75.6
75.6
75.6
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
76.2
-
-
-
-
-
-
-
-
80
-
-
-
-
-
-
-
-
-
-
-
-
-
-
77
77
-
-
-
78.2
78.2
78.2
-
76.7
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
73
-
-
-
-
81.4
81.4
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
82
84
84
84
84
-
84
-
-
84
stSG20090
stSG20157
stSG34040
WI-15102
SGC34218
A006U19
stSG49427
sts-T67132
stSG29515
A009P19
WI-14697
sts-T91069
sts-U02019
stSG20152
WI-15969
WI-12305
stSG20132
stSG20188
stSG2189
stSG54099
WIAF-15
stSG33834
A002B36
WI-14248
WI-12610
stSG41515
AFMa131wf1
WI-1352
WI-16682
stSG46646
stSG20187
stSG53482
SGC34571
sts-T29481
H15780
sts-X52220
stSG15083
IB255
WI-9632
stSG20190
WI-20603
SGC33430
sts-W80372
SGC44590
stSG3039
stSG40509
A006H16
stSG33832
AFM338td9
stSG52047
AFMA057VB1
WI-21595
stSG40517
stSG46180
sts-D29515
stSG20173
WI-5974
stSG54185
stSG20138
WI-14028
sts-M78966
SGC31938
HHCPJ80
WI-5974
stSG1730
sts-K02268
WI-7002
GAAT4E12
AFM240wb8
stSG41777
AFM205th8
AFM077xd3
sts-U08336
stSG32165
AFM248yc5
A005W30
WI-17015
stSG53329
stSG20093
H48462
stSG8614
WI-21992
stSG34085
stSG31043
A002A04
sts-Z41706
stSG52048
WI-9739
WI-3022
WI-3904
WI-9238 
WI-9238
stSG52086
AFM074wa9
WI-8646
SGC32750
stSG40166
stSG42376
WI-16193
stSG20088
mm1037
stSG10918
stSG25692
AFM333xe5
stSG3037
WI-18557
stSG34027
sts-M11186
stSG49795
IB607
stSG40506
stSG3058
stSG44722
stSG9697
WIAF-1797
stSG20164
sts-F16794
sts-W72684
WI-13669
stSG20142
A005R07
stSG4244
AFMA049YD1
AFM240zf4
WI-8798
WI-9015
WIAF-749
sts-M34668
-
stSG20180
WI-16594
stSG20257
stSG9792
WI-4715
-
-
WI-5517
stSG53632
AFM308we1
ATA21E04
A005O05
stSG44152
stSG25693
WI-4876
stSG10910
WIAF-2006
stSG40394
Bdyc4e10
AFMb026xh5
sts-H22126
stSG30448
AA037460
AFM234f10
stSG8000
stSG53601
SGC32955
A002D12
stSG2202
stSG20158
WI-17847
stSG30106
A009O14
stSG20327
sts-M76446
Chr_20ctg73
AFM248td1
WI-3772
AFM036ya3
GATA51D03
WI-2640
-
SGC44304
stSG10911
stSG20232
stSG20379
T27631
WI-18738
WI-7784
stSG20381
stSG29963
stSG53387
stSG62312
SGC34960
stSG20076
stSG25710
stSG35837
WI-12264
FB25H5
stSG10203
AFMB352XD9
R79078
stSG42745
stSG8643
SHGC-16916
SGC35090
stSG30431
WIAF-464
AA978290
stSG25485
stSG12840
stSG33855
WI-4689
AFMa130ya9
GC31723
sts-R79720
stSG58170
WI-17673
AFMb290wh5
W86724
AFM023ta1
WI-20801
WI-5288
pm0647
stSG20194
EST159813
sts-R10161
sts-T96330
stSG30232
stSG53094
SGC33687
stSG54023
stSG10925
H72100
WIAF-730
SGC32385
stSG20160
WI-17957
stSG3057
stSG9519
stSG20120
SGC30258
stSG20086
SGC30394
WI-9063
WI-8238
AFMA085ZH9
WI-9399
GGAA9H10
AFM317TE5
WI-9559
WI-8757
GGAA21B07
WI-15457
WI-12553
stSG33852
stSG20085
stSG20493
stSG25695
stSG33860
AFM299xd1
AFMB344XB1
stSG39181
AFMA196WF5
T92258
IP2017
sts-H58415
WI-5126
GATA72E11
GATA64G08
AFMc017ze1
AFM218yg3
WI-6281
AFM345TD1
AFMA114XE5
WI-16702
WI-18137
R20777
A006R17
AFMA218YB5
AFM238ZC11
AFM234ZB12
X83389
-
AFMa086we9
NIB489
WI-15043
WI-6712
A001Z47
AFM164TG5
WI-7473
WI-7829
WI-6063
stSG10890
AFMB348YB5
WI-17032
AFM288ZF5
AFM292XB5
X54567
AFM120XC7
WI-1930
GATA100G04
GATA81E09
WI-3903
WI-9762
AFM211YB8
WI-1994
-
WI-4171
AFMC013XE5
WI-2270
AFMA224XF1
AFM044XB4
WI-2364
WI-4893
AFM291WE5
AFM102XA7
AFMB067XG1
WI-6871
AFM260XG5
-
WI-10777
IP20M12
AFM197XB12
AFM210VB4
GGAA7E02
WI-4582
WI-9181
WI-7877
-
-
GATA83F12
AFM242YF8
WI-3249
NIB1603
WI-3387
WI-6873
-
-
-
pm1146
AA026396
stSG34957
WIAF-92
-
-
sts-R73406
stSG29447
GGAA9H11
-
-
-
WI-7085
L14856
-
-
-
-
-
-
-
-
-
H28185
-
-
-
-
-
-
R85922
-
-
-
R60806
G24251
-
-
-
-
-
-
-
R38826 G23285
R85704 G21228
-
D20S199 Z24636
D20S456 G03640
T64906 G22082
-
-
-
-
-
CDS seq'd
CDS seq'd
CDS seq'd
T03417
D20S735
-
H19750
-
-
-
-
-
G20805 
not seq'd
D20S906
-
A057VB1
R44338
-
-
-
-
D20S762
-
-
EST211960
-
-
-
D20S762
-
-
D20S1072
P8620
D20S179
-
D20S113
D20S103 Z16528 
stSG20054
-
D20S117 Z17123 
G32301
G21313
-
-
G28357 
-
H05471
SHGC-36460
-
G19838 
-
-
D20S737 G05248 
D20S1049
D20S745
D20S1032
D20S1032
-
Z66604
D20S1022
-
-
-
EST285473
-
-
-
-
D20S198
G21789
EST91360
not seq'd
not seq'd
not seq'd
T03618 
-
-
T55794
R07576 
-
-
-
-
stSG20142
WI-13669
G20430
-
stSG20223
D20S181
-
G07079
-
-
-
-
G23224 
-
-
MR8569
-
-
D20S619
D20S842
D20S193 Z24264 
D20S473
stSG408
-
WI-18677 R00301
D20S752
CDC25B
-
-
-
D20S867 Z53139 
-
-
-
D20S889
-
-
-
-
-
WI-17847
stSG20158
-
G32702
-
-
-
D20S116 Z17107 
D20S742
D20S97
D20S482
D20S500
-
-
-
-
-
-
STS-D00015
D20S1014
-
-
-
-
-
-
-
-
EST265520
T03153
A008E19
D20S895 Z53825
-
-
-
G15478 
-
-
-
H55768
Z94590
-
-
D20S751
D20S835
-
-
-
D20158
D20S882  Z53348 
-
D20S95 Z16434 
H91615
D20S760 G04858
-
-
-
-
-
-
-
-
-
-
-
-
-
-
R94932
-
-
-
-
-
-
D20S732
D20S1018
D20S916  Z51974 
D20S1034 G05380
G09476
D20S194 Z24330 
G07337
G06990
G10134
R05442
H20128
-
-
-
-
-
D20S192
D20S892
-
D20S846
-
D20S59
-
D20S755
G10052
D20S483
D20S900
D20S115
D20S621 G06114 
D20S907
Z67291
T97637
H58383
-
-
D20S851
D20S177
D20S175 Z23728 
D20S503
D20S5 near
D20S917
T17174
H29897
D20S723
-
Z66695
D20S729 G05247 
G06721
D20S763 G06112 
SNAP25
D20S894 Z53794 
G24576 stSG2011
D20S188
D20S189
D20S27
D20S186
D20S492
G10188
G08057
D20S744 G04471
D20S1041 
D20S172 Z23610
D20S495 G03650
D20S66
D20S747 G04587
D20S898 Z53993
D20S497 G03652
D20S852  Z52596
D20S98 Z16471
D20S741 G03948
D20S753 G04719
D20S904 Z51285
D20S104  Z16570
D20S875 Z53252 
D20S725 G06120
D20S118 Z17167 
clone 705D16
D20S1013 G11904
D20S48
D20S112 Z16842
D20S114 Z16950
D20S470 G08061
D20S614 G03662
D20S733 G06127
D20S1015 G06744
-
G20550
G10264
D20S182 Z23791
D20S1050 G04240
T16637
D20S1052 G04287
D20S1070 G06295
clone 738P15
clone 775C13
clone RP4-718P11
D20115
not seq'd
not seq'd
-
20p11.2
20p11.1-p11.2
not seq'd
not seq'd
D20S471 
-
20p11.2
20p11.2
20p11.2
20p11.2
20p11.2
20p11.2
20p11
20p11
20p11
20p11
-
-
-
-
-
-
-
-
SOX22 SRY (sex-determining region)
-
-
-
HNRPD hetero ribonuc
phospho[H...
phospho[H...
-
-
-
-
-
-
-
PSMF1 proteasome macropain 
PSMF1 proteasome macropain 
-
-
-
-
ANGPT4 (angiopoietin 4) R-spondin 
-
-
CSNK2A1 casein kinase 2
CSNK2A1 casein kinase 2
CSNK2A1 casein kinase 2
FKBP FK-506 binding
FKBP FK-506 binding
FKBP FK-506 binding
KIAA0374
p47 AAD44488
p47 AAD44488
p47 AAD44488
p47 AAD44488
p47 AAD44488
p47 AAD44488
p47 AAD44488
p47 AAD44488
SIRP-alpha1 tyr phosphatase NR
PTPNS1 alpha
4x tyrosine phosphatase SIRP beta 1
-
3x tyrosine phosphatase SIRP
3x tyrosine phosphatase SIRP
-
3x tyrosine phosphatase SIRP
PTPNS1 tyr phos, non  
PTPNS1 tyr phos, non  
PTPNS1 tyr stSG10889
SHPS-1 SIRP-a1 signal reg
SHPS-1 [SIRP]
similar to SHPS-1
PTPNS1 tyr phos, non  
PTPNS1 tyr phos, non  
PTPNS1 tyr phos, non  
PTPNS1 tyr phos, non  
PTPNS1 tyr phos, non  
PDYN  prodynorphin, enkeph
PDYN polyA
-
-
-
not seq'd
NP_004600 transcription factor 15
TCF15 transcription factor
-
RPS10 neuron zinc finger
RPS10 neuron zinc finger
RPS10 neuron zinc finger
-
-
L1
-
clone HH419 
-
-
clone HH419 
clone HH419 
HBV associated factor (XAP4),..
false NT_002249 location
-
TGASE E3 transglutaminases
SNRPB small nuc stSG10927
SNRPB small nuc WI-18905
SNRPB  [205973 231 aa]
-
IDH3B NAD+ isocitrate dh
IDH3B NAD+ isocitrate dh
IDH3B NAD+ isocitrate dh
IDH3B NAD+ isocitrate dh
IDH3B NAD+ isocitrate dh
IDH3B NAD+ isocitrate dh
IDH3B NAD+ isocitrate dh
IDH3B NAD+ isocitrate dh
[NOP56] nucleolar hN..
[NOP56] nucleolar hN..
[NOP56] nucleolar hN..
[NOP56] nucleolar hN..
OXT: oxytocin-neurophysin I
OXT: oxytocin-neurophysin I
arginine-vasopressin-neurophysin I
bicarb anion exhcnage
-
-
weak: huntingtin ubiquin
-
-
-
weak: pre-mRNA cleavage
weak: pre-mRNA cleavage
KIAA0552 Hs.90232
KIAA0552 Hs.90232
KIAA0552 Hs.90232
KIAA0860
-
-
PTPRA tyr phos, receptor type
PTPRA tyr phos, receptor type
PTPRA tyr phos, receptor type
PTPRA tyr phos, receptor type
GNRH2 gonadotropin-releasing 2
-
-
weak: procollagen alpha 2(V)
weak: procollagen alpha 2(V)
-
CPX-1 carboxypeptidase
RPL19 rib protein L19
-
[olfactory ebf unigene]
-
ATRN attractin mahogany KIAA0534
KIAA0548
SF3A3 pseudogene
weak: FZD7 frizzled
sialoadhesin+CENPB+CDC25+hs
CDC25B cell division cycle 25B
CDC25B cell division cycle 25B
CENPB centromere B
CENPB centromere B
Z53139
-
-
G1L FTLL1 goliath ferritin 
G1L FTLL1 goliath ferritin 
G1L goliath, LDHB lactate 
[G1L goliath]
[G1L goliath]
[G1L goliath]
[G1L goliath]
[G1L goliath ferritin]
[G1L goliath]
[G1L goliath]
cyclin G1 interacting protei..
ADRA1A adrenergic alpha 1A
ADRA1A adrenergic alpha 1A
ADRA1A adrenergic alpha 1A
not seq'd
-
psL7a contig
-
-
psRPS4X
prion
prion
prion
prion
prion
prion
PRNP
old prion cds est
prion gene region
prion doppel
old prion cds est
KIAA0168
KIAA0168
KIAA0168
KIAA0168
KIAA0168
not seq'd: fetal brain
CDS2 CDP-diacylglycerol; PCNA
PCNA proliferating cyclin
PCNA proliferating cyclin
PCNA proliferating cyclin
-
PCNA proliferating cyclin
-
-
-
novel GS+ESt+Blastp-
-
novel Hs.129047 3'UTR
-
no quick genes
-
-
DKFZp586A0422
DKFZp586A0422
-
-
-
glycerophosphoryl diesterase
-
-
-
-
-
Hs.171917 cap-binding pr..
-
weak: mig-2 mitogen inducible
-
-
alternatively spli..
CHGB chromogranin B
CHGB chromogranin B
secretogranin 1, SCG1
-
-
-
-
-
-
-
alternatively spli.
psCyt C Ox MCM2/3/5 family
CDP-alcohol phosphoTrans
psCyt C Ox MCM2/3/5 family
not seq'd NT_002265?
not seq'd
not seq'd
not seq'd
NM_007375 TAR bp
-
weak: chromogranin A
-
-
-
-
-
-
SHGC-10569
-
-
BMP2 bone morphogen 2
-
RRM RNA binding Gry
-
-
-
-
-
-
FMN dehyd Glycolate Oxidase 
-
[PHKBp1]
-
-
KIAA1162
PLCB1 Phospholipase C Beta 1
KIAA0581
KIAA0581
-
PLCB1, KIAA0581
-
-
-
Serine/Threonine Protein Kinase
PLCB4 phospholipase
PLCB4 phospholipase
HS1119D91 glucose induced 
not seq'd
weak: ankyrin
[SNAP]
-
SNAP-25 + psRPL23A
-
Jag Notch Alagille
-
psRPS11
-
-
KIAA0952
-
-
-
-
-
-
-
WI-8076 RPS10 ribosome
-
-
-
-
-
-
-
not seq'd
-
-
SNRPB2 small nuclear ribo
U2 small nuclear RNA
-
-
-
PCSK2 neuroendocrine convertase
PCSK2 neuroendocrine convertase
-
-
-
CP115 BFSP1 filensin cytoskeletal
RRBP1 ribosome bp 1 ES/130
not seq'd
-
-
-
-
-
CD39L2  nuc phosphatase
hnRNP RRM rna binding protein
weak: serine palmotyltransferase
oncogene mRN.
oncogene mRN.
oncogene mRN.
not seq'd
ps GAPDH
-
-
-
-
-
PAX1 paired homeobox
cystatin C 7 gene complex
SSTR4 somatostatin receptor 4
THBD thrombomodulin
brain glycogen phosphorylase PYGB
not seq'd: insulinoma-associated 1
HNF3B Hepatic nuclear factor
no seq: inosine triphosphatase
not seq'd myelin basic E
tubulin alpha
9.57
9.47
11.04
11.04
7.64
11.04
7.94
8.74
7.94
8.54
9.47
9.47
8.66
6.73
8.97
11.78
11.04
11.04
11.04
11.04
11.04
11.04
11.78
10.3
11.04
11.04
11.78
-
11.04
11.04
11.04
7.94
11.04
8.54
10.83
11.04
10.3
9.57
11.04
9.57
10.3
10.3
10.83
11.04
11.78
11.04
11.78
11.93
11.04
12.19
-
11.67
12.19
11.04
11.04
11.04
9.78
12.19
11.14
10.26
12.19
13.18
11.09
10.26
15
11.04
-
-
13.35
12.19
16.6
7.94
9.17
11.46
6.73
8.54
8.56
11.04
10.2
7.94
7.94
8.56
8.71
9.47
9.5
9.67
10.78
-
-
-
13.45
-
18.42
-
11.93
11.93
11.93
11.93
12.93
13.45
13.45
18.31
18.73
-
13.35
13.45
11.93
18.42
13.45
11.1
12.09
12.09
20.2
20.2
20.26
20.26
12.09
20.2
11.1
12.09
20.46
18.42
-
19.12
11.46
-
11.46
13.45
-
13.45
13.45
11.93
18.42
-
-
-
-
11.93
20.26
-
21.85
12.09
12.06
-
11.1
12.09
20.2
20.88
-
22.4
12.09
12.09
-
21.61
21.61
12.19
22.09
12.09
21.61
11.67
21.92
24.77
12.09
24.1
-
21.61
-
25.17
-
-
-
20.93
12.19
24.46
21.14
24.77
21.57
-
23.65
21.61
24.77
23.65
28.85
43.14
43.14
12.04
43.35
-
21.19
-
28.85
28.85
43.14
28.85
44.35
28.85
43.14
31.8
31.8
31.8
43.14
-
28.85
43.14
31.9
30.78
28.85
31.8
31.9
32
32.5
-
41.96
43.14
43.4
32.91
36.5
40.68
36.6
31.8
40.68
31.7
36.6
41.96
38.62
38.72
44.31
42.61
42.61
43.46
43.51
44.31
43.14
-
49.03
-
-
-
-
-
-
-
36.6
38.62
38.62
40.68
44.31
44.31
46.1
36.6
-
40.68
-
36.6
-
38.62
-
-
-
48.93
43.93
-
-
-
42.97
39.98
40.86
55.11
-
-
-
-
-
55.21
-
55.21
-
-
-
-
-
-
52.59
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
20.36
20.46
20.2
21.61
-
-
11.93
28.85
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
1
2
-
-
-
-
-
-



3
4
-
-
-
-
-
-
-
-
-
-
-
5
-
-
-
-
-
-
-
-
-
-
-
-
7
-
-
6
10
8
-
9
12
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
13
11
14
-
15
-
-
16
-
-
-
-
-
-
-
-
17
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
18
19
20
21
-
-
-
-
-
-
-
-
-
-
22
-
23
24
-
-
-
25
-
-
-
-
-
-
-
-
26
-
-
-
-
-
-
-
-
-
-
-
-
27
28
29
30
31
-
-
-
-
-
-
-
32
-
-
-
-
-
-
-
-
-
33
-
34
-
-
-
-
-
-
-
-
-
-
-
35
-
-
-
-
-
36
-
37
-
38
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
39
46
40
41
42
43
44
45
49
-
-
-
-
-
-
-
48
47
-
50
-
51
-
52
53
54
55
56
57
58
59
-
-
-
-
60
61
62
-
-
63
64
-
65
-
66
67
68
69
-
71
-
70
72
74
73
75
76
77
78
79
80
81
82
83
84
85
86
87
89
88
90
91
93
92
94
-
95
96
97
98
99
100
101
102
-
-
103
104
105
106
107
108
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
3
3
3
5
5
5
5
5
5
5
5
5
5
5
5
5
6
6
6
7
7
7
7
8
8
8
8
8
8
8
8
8
8
8
8
8
8
8
8
8
9
9
9
9
9
9
9
9
10
10
10
10
10
10
10
10
10
11
11
11
11
11
12
12
12
12
12
12
12
12
12
12
12
12
12
-
-
-
13
13
13
13
13
13
13
13.5
13.5
13.5
13.5
13.5
13.5
13.5
13.5
14
14
14
14
14
14
14
14
14
15
15
15
15
15
-
16
16
16
17
17
17
17
17
17
17
17
17
18
18
18
18
18
18
18
18
18
18
19
19
19
19
-
20
20
20
20
20
20
20
20
20
20
20
20
20
20
20
20
20
20
20
20
20
-
21
21
21
21
21
21
21
21
21
22
22
22
23
23
23
23
24
24
24
24
24
24
24
24
24
24
24
25
25
25
25
25
25
25
25
25
25
25
25
25
25
25
25
25
25
25
25
-
-
-
-
26
26
-
27
27
27
27
27
27
27
27
27
27
27
27
27
28
28
28
28
28
28
28
28
28
28
28
29
29
29
29
29
-
-
30
30
30
30
30
-
-
31
31
31
-
-
32
32
32
32
-
-
-
-
-
-
-
-
-
35
35
-
-
36
36
36
-
-
37
37
37
37
38
38
38
38
38
38
38
38
38
-
-
40
40
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
0
0
0
1
2
3
4
5
6
7
8
9
10
11
11
0
0
0
0
0
0
1
2
3
4
5
6
7
8
9
-
1
2
3
-
-
-
-
0
0
0
0
0
0
0
0
0
0
1
0
4
2
3
1
0
0
0
0
0
1
2
3
4
5
6
8
9
10
11
-
-
0
0
0
1
2
3
4
5
-
-
-
-
-
-
-
-
-
1
2
3
3
4
0
0
0
0
0
0
0
0
0
0
1
2
3
-
-
-
-
-
-
-
-
-
-
0
0
1
1
1
2
3
4
-
-
-
-
-
-
-
-
-
0
0
0
1
2
-
-
-
-
-
-
-
-
-
-
-
-
-
-
1
2
3
4
5
6
7
8
9
-
-
-
-
-
1
1
1
1
1
2
2
2
2
2
2
3
3
3
3
3
3
3
3
3
3
-
0
0
0
0
0
0
0
0
0
1
2
3
1
2
3
4
-
-
-
-
-
-
-
-
-
-
-
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1
2
3
-
-
-
-
-
-
-
0
0
0
0
0
0
0
1
2
3
4
5
6
1
2
3
4
5
6
7
8
9
10
11
0
1
2
3
4
-
-
1
4
2
5
3
-
-
-
-
-
-
-
1
2
3
4
0
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
x
x
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
1
2
-
-
-
-
-
-
NT_002249
NT_002249
NT_002249
NT_002249
NT_002249
NT_002249
NT_002249
NT_002249
NT_002249
NT_002249
NT_002249
NT_002249
NT_002249
NT_002249
NT_002249
NT_001933
NT_001933
NT_001933
NT_001933
NT_001933
NT_001933
NT_001933
NT_001933
NT_001933
NT_001933
NT_001933
NT_001933
NT_001933
NT_001933
NT_001933
NT_001933
NT_002369
NT_002369
NT_002369
-
-
-
-
-
-
-
-
-
-
-
-
-
NT_002062
NT_002062
NT_002062
NT_002999
NT_002999
NT_002999
NT_002999
NT_001690
NT_001690
NT_001690
NT_001690
NT_001690
NT_001690
NT_001690
NT_001690
NT_001690
NT_001690
NT_001690
NT_001690
NT_001690
NT_001690
NT_001690
NT_001690
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
NT_002128
NT_002128
NT_002128
NT_002128
NT_002128
NT_002301
NT_002301
NT_002301
NT_002301
NT_002301
NT_002301
NT_002301
NT_002301
NT_002301
NT_002301
NT_002301
NT_002301
NT_002301
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
NT_002855
NT_002855
NT_002855
NT_002855
NT_002855
-
NT_003217
NT_003217
NT_003217
-
-
-
-
-
-
-
-
-
NT_001989
NT_001989
NT_001989
NT_001989
NT_001989
NT_001989
NT_001989
NT_001989
NT_001989
NT_001989
-
-
-
-
-
NT_001001+
NT_001001+
NT_001001+
NT_001001+
NT_001001+
NT_001001
NT_001001
NT_001001
NT_001001
NT_001001
NT_001001
NT_001001-
NT_001001-
NT_001001-
NT_001001
NT_001001-
NT_001001-
NT_001001-
NT_001001-
NT_001001-
NT_001001-
-
-
-
-
-
-
-
-
-
-
NT_001934
NT_001934
NT_001934
NT_002650
NT_002650
NT_002650
NT_002650
-
-
-
-
-
-
-
-
-
-
-
NT_002161
NT_002161
NT_002161
NT_002161
NT_002161
NT_002161
NT_002161
NT_002161
NT_002161
NT_002161
NT_002161
NT_002161
NT_002161
NT_002161
NT_002161
NT_002161
NT_002161
NT_002161
NT_002161
NT_002161
-
-
-
-
-
-
-
NT_002065
NT_002065
NT_002065
NT_002065
NT_002065
NT_002065
NT_002065
NT_002065
NT_002065
NT_002065
NT_002065
NT_002065
NT_002065
NT_001820
NT_001820
NT_001820
NT_001820
NT_001820
NT_001820
NT_001820
NT_001820
NT_001820
NT_001820
NT_001820
NT_001003
NT_001003
NT_001003
NT_001003
NT_001003
NT_001515
NT_001515
NT_001000
NT_001000
NT_001000
NT_001000
NT_001000
-
-
NT_001691
NT_001691
NT_001691
-
-
NT_002283
NT_002283
NT_002283
NT_002283
NT_002409
-
-
NT_002862
NT_002862
-
-
NT_002019
NT_003003 
NT_002063 
NT_002063 
-
NT_002064 
-
NT_001662 
NT_001662 
-
-
NT_001692
NT_001692
NT_001692
NT_001692
NT_001005 
NT_001005 
NT_001005
NT_001005
NT_002857
NT_002857
NT_002857
NT_002857
-
-
NT_001636 
NT_002340
NT_002340
-
NT_000281
NT_001893
-
-
-
-
-
-
NT_001972
NT_002015
-
-
-
NT_002563
NT_002532
-
NT_002084
NT_002084
-
-
-
-
-
-
AL034548
AL034548
AL034548
AL034548
AL034548
AL034548
AL034548
AL034548
AL034548
AL034548
AL034548
AL034548
AL034548
AL034548
AL034548
AL031665
AL031665
AL031665
AL031665
AL031665
AL031665
AL031665
AL031665
AL031665
AL031665
AL031665
AL031665
AL050325
AL050325
AL050325
AL050325
AL049761
AL049761
AL049761
AL136531 AL109658
AL136531 AL109658
AL136531 AL109658
AL136531?
AL109658
AL109658
AL109658
AL109658
AL109658
AL109658
AL109658
AL109658
AL109658
AL049634
AL049634
AL049634 NT_002999
AL035460
AL109809 
AL109809 
AL109809 AL121760
AL034562
AL034562
AL034562
AL034562
AL034562
AL034562
AL034562
AL034562
AL034562
AL034562
AL034562
AL034562
AL034562
AL034562
AL034562
AL034562
-
AL133231
AL133231
AL133231 AL121758
AL118502 AL121758
AL118502 AL121758
AL118502 
AL118502 
AL118502 
AL121747 
AL121747 
AL121747 
AL121747 
AL121747 
AL121747 
AL121747 
AL121747 
AL121747 
AL031678
AL031678
AL031678
AL031678
AL031678
AL049712 
AL049712 
AL049712 
AL049712 
AL049712 
AL049712 
AL049712 
AL049712 
AL049712 
AL049712 
AL049712 
AL049712 
AL049712 
-
-
-
AL109976 
AL109976 
AL109976 
AL109976 
AL109976 
AL109976 
AL109976 
AL121891
AL121891
AL121891
AL121891
AL121891
AL121891
AL121891
AL121891
AL121905
AL121905
AL121905
AL121905
AL121905
AL121905
AL121905
AL121905
AL121905
AL035460
AL035460
AL035460
AL035460
AL035460
AL117334 AL353193
AL109805 AL132773
AL109805 AL132773 
AL109805 AL132773 
AL109804
AL109804
AL109804
AL109804
AL109804
AL109804
AL353194 AL109804
AL353194 AL109804
AL353194 AL109804
AL031670
AL031670
AL031670
AL031670
AL031670
AL031670
AL031670
AL031670
AL031670
AL031670
AL121675 
AL121675 
AL121675 
AL121675 
-
AL121781 HSJ1164C1
AL121781 HSJ1164C1
AL121781 HSJ1164C1
AL121781 HSJ1164C1
AL121916 HSJ189G13
AL121781 HSJ1164C1
AL121781 HSJ1164C1
AL121781 HSJ1164C1
AL121781 HSJ1164C1
AL121781 HSJ1164C1
AL121781 HSJ1164C1
U29185
U29185
AL109808 HSJ1187J4
AL109808 AL133354
AL109808 AL133354 
AL133354
AL133354
AL133354
AL133354
AL133354
-
AL121924 AL121755
AL121924 AL121890
AL121924 AL121890
AL121924 AL121890
AL121924 AL121890
AL121924 AL121890
AL121924 AL121890
AL121924 AL121890
AL121924 AL121890
AL117377 AL035249 
AL117377 AL035249 
AL117377 AL035249 
AL121757
AL121757
AL121757
AL121757
AL109935
AL109935
AL109935
AL109935
AL109935
AL109935
AL109935
AL109935
AL109935
AL109935
AL109935
AL118505
AL118505
AL118505
AL118505
AL035461
AL035461
AL035461
AL035461
AL035461
AL035461
AL035461
AL035461
AL035461
AL035461
AL035461
AL035461
AL035461
AL035461
AL035461
AL035461
-
-
-
-
AL109811
AL109811
AL121911
AL035668 AL034554 
AL035668 AL034554 
AL035668 AL034554 
AL035668 AL034554 
AL035668 AL034554 
AL035668 AL034554 
AL035668 AL034554 
AL035668 AL034554 
AL035668 AL034554 
AL035668 AL034554 
AL035668 AL034554 
AL035668 AL034554 
AL035668 AL034554 
AL031679
AL031679
AL078643
AL078643
AL078643
AL021879
AL021396
AL021396
AL021396
AL021396
AL021396
AL049632
AL031132 AL031655
AL031132
AL031683
AL121909
AL031682 
AL034427
AL023805
AL133002 
AL031652
AL031652
AL121740
-
AL109754 
AL023913 AL034430
AL023913 AL034430
AL023913 AL034430
AL135937 AL035456
AL035456 AL133340
AL049690 AL049649
AL049690 AL049649
AL049690 AL079337
AL049690 AL049649
AL034547 AL035448
AL096862 AL136460?
AL034561
AL078623
AL078623
AL121754
AL132826 
AL031677
AL118510
AL118503 
AL118503 
AL121584
AL118503 
AL109912?
AL035073
AL035073
-
AL135938
AL135938
AL034428
AL034428
AL034428
AL031675
AL031675
AL031675
AL121779
AL031664
AL031664
AL132765 
AL132765 
AL132765 
-
AL035045
AL050321
AL050321
AL121893 
-
AL035252
AL109618
AL109983
-
-
-
-
AL034426
AL031673
-
-
AL132821
AL034551
-
AL121894 AL121831
AL049651
AL049651
-
-
AL121772
-
-
-
153
153
153
153
153
153
153
153
153
153
153
153
153
153
153
364
364
364
364
364
364
364
364
364
364
364
364
364
364
364
364
109
109
109
-
-
-
127
127
127
127
127
127
127
127
127
127
115
115
115
224
224
224
224
103
103
103
103
103
103
103
103
103
103
103
103
103
103
103
103
-
117
117
117
-
-
-
-
-
127
127
127
127
127
127
127
127
127
205
205
205
205
205
159
159
159
159
159
159
159
159
159
159
159
159
159
-
-
-
137
137
137
137
137
137
137
159
159
159
159
159
159
159
159
247
247
247
247
247
247
247
247
247
135
135
135
135
135
127
161
161
161
196
196
196
196
196
196
196
196
196
130
130
130
130
130
130
130
130
130
130
142
142
142
142
-
299
299
299
299
143
63
63
63
63
63
63
63
63
151
151
151
143
143
143
143
143
-
475
167
167
167
167
167
167
167
167
98
98
98
167
167
167
167
180
180
180
180
180
180
180
180
180
180
180
139
139
139
139
139
139
139
139
139
139
139
139
139
139
139
139
139
139
139
139
-
-
-
-
-
-
-
330
330
330
330
330
330
330
330
330
330
330
330
330
1187
1187
1187
1187
1187
1187
1187
1187
1187
1187
1187
530
530
530
530
530
213
213
286
286
286
286
286
-
-
215
215
215
-
-
605
605
605
605
355
-
-
177
177
-
-
106
124
244
244
-
89
-
257
257
-
-
373
373
373
373
89
89
89
89
155
155
155
155
155
-
122
101
101
-
-
108
-
-
-
-
-
-
148
127
-
-
-
138
138
-
-
-
-
-
-
-
-
-
2378-2531
2378-2531
2378-2531
2378-2531
2378-2531
2378-2531
2378-2531
2378-2531
2378-2531
2378-2531
2378-2531
2378-2531
2378-2531
2378-2531
2378-2531
2957-3322
2957-3322
2957-3322
2957-3322
2957-3322
2957-3322
2957-3322
2957-3322
2957-3322
2957-3322
2957-3322
2957-3322
2957-3322
2957-3322
2957-3322
2957-3322
3142-3251
3142-3251
3142-3251
-
-
-
-
-
-
-
-
-
-
-
-
-
3851-3966
3851-3966
3851-3966
-
-
-
-
3540-3643
3540-3643
3540-3643
3540-3643
3540-3643
3540-3643
3540-3643
3540-3643
3540-3643
3540-3643
3540-3643
3540-3643
3540-3643
3540-3643
3540-3643
3540-3643
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
3641-3846
3641-3846
3641-3846
3641-3846
3641-3846
3857-4016
3857-4016
3857-4016
3857-4016
3857-4016
3857-4016
3857-4016
3857-4016
3857-4016
3857-4016
3857-4016
3857-4016
3857-4016
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-


4924-5055
4924-5055
4924-5055
4924-5055
4924-5055
4924-5055
4924-5055
4924-5055
4924-5055
4924-5055
-
-
-
-
-
5863-5926
5863-5926
5863-5926
5863-5926
5863-5926
5863-5926
5863-5926
5863-5926
5863-5926
5863-5926
5863-5926
5863-5926
5863-5926
5863-5926
5863-5926
5863-5926
5863-5926
5863-5926
5863-5926
5863-5926
5863-5926
-
-
-
-
-
-
-
-
-
-
8083-8182
8083-8182
8083-8182
8845-9012
8845-9012
8845-9012
8845-9012
-
-
-
-
-
-
-
-
-
-
-
9572-9711
9572-9711
9572-9711
9572-9711
9572-9711
9572-9711
9572-9711
9572-9711
9572-9711
9572-9711
9572-9711
9572-9711
9572-9711
9572-9711
9572-9711
9572-9711
9572-9711
9572-9711
9572-9711
9572-9711
-
-
-
-
-
-
-
9389-9720
9389-9720
9389-9720
9389-9720
9389-9720
9389-9720
9389-9720
9389-9720
9389-9720
9389-9720
9389-9720
9389-9720
9389-9720
8877-10064
8877-10064
8877-10064
8877-10064
8877-10064
8877-10064
8877-10064
8877-10064
8877-10064
8877-10064
8877-10064
9928-10459
9928-10459
9928-10459
9928-10459
9928-10459
20p12
20p12
10321-10608
10321-10608
10321-10608
10321-10608
10321-10608
-
-
12634-12849
12634-12849
12634-12849
-
-
13213-13818
13213-13818
13213-13818
13213-13818
14210-14565
-
-
-
-
-
-
20p11.2-p12
-
20405-20650
20405-20650
-
21257-21346
-
21615-21872
21615-21872
-
-
22889-23262
22889-23262
22889-23262
22889-23262
23531-23620
23531-23620
23531-23620
23531-23620
23219-23374
23219-23374
23219-23374
23219-23374
23219-23374
-
24125-24248
25308-25409
25308-25409
-
-
30704-30813
-
-
-
-
-
-
-
-
-
-
-
-
-
-
27506-27603
27506-27603
-
-
-
-
-
-
220 Features on Chromosome 20
acetyl-CoA synthetase  
adenosine deaminase  
adrenergic, alpha-1A-, receptor  
agouti mouse-signaling 8  
antigen NY-CO-33  
ATP synthase, H+ transporting F1, epsilon 
ATP/GTP-binding site motif AP-loop: Similar to
ATP5E, nuclear gene encoding mitochondrial  
ATPase type IV, phospholipid-transporting P-type,putative  
bactericidal/permeability-increasing  
bladder cancer related 10kD  
bone morphogenetic 2  
bone morphogenetic 7 osteogenic 1  
breast carcinoma amplified sequence 1  
C Oxidase I and   CDP-alcohol phosphatidyltransferase
C.elegansP1:CEC47E128;Mouse alpha-mannosidaseP1:B54407
cathepsin Z  
CCAAT/enhancer binding C/EBP, beta  
CD39-like 2  
CDP-diacylglycerol synthase  cytidylyltransferase 2  
cell division cycle 25B  
centromere B 80kD  
centrosome associated  
cep250 centrosome associated 2  
CGI-107  
CGI-15  
CGI-53  
cholinergic receptor, nicotinic, alpha 4  
chromogranin B secretogranin 1  
chromogranin B secretogranin 1, SCG1, pseudogene
chromosome segregation 1 yeast homolog-like  
cleavage stimulation factor, 3' pre-RNA
collagen, type IX, alpha 3  
copine I  
core-binding factor, runt domain, alpha 2
cystatin C amyloid angiopathy and cerebral hemorrhage  
cystatin D 3  
cysteine desulfurase  
cytochrome P450, XXIV vitamin D 24-hydroxylase  
D32579 comes from this gene 
death associated transcription factor 1  
DKFZP434C0935  
DKFZP434I114  
DKFZP434N061  
DKFZP564A032  
DKFZP566A0946  
DKFZP727M231  
dolichyl-phosphate mannosyltransferase 1, catalytic 
drin actin depolymerizing factor  
E2F transcription factor 1  
endothelial cell C/activated C receptor  
endothelin 3  
epididymis-specific, whey-acidic type, four-disulfide   
erythrocyte membrane band 4.1-like 1  
eukaryotic translation initiation factor 2,  2 beta    
eyes absent Drosophila homolog 2  
F15D4.3 [C.elegans]  
F52C12.2 [C.elegans]  
ferritin, light polypeptide 2 
FK506-binding 1A 12kD  
frizzled Drosophila homolog 7  
gamma-glutamyltransferase  
ganglioside-induced differentiation associated 1
GEF-2  
glutathione synthetase  
glycerophosphoryl diester phosphodiesterase domain
goliath-like C3HC4 type  
growth differentiation factor 5 cartilage morphogenetic
growth hormone releasing hormone 6  
guanine nucleotide binding G, alpha stimulating  
H.sapiens seb4D mRNA  
HBV associated factor  
Helicase C-terminal domain and SNF2 N-terminal domains 
hemomucin [D.melanogaster]  
hemopoietic cell kinase  
hepatocyte nuclear factor 4 gamma  
hepatocyte nuclear factor 4, alpha  
heterogeneous nuclear ribonucleoprotein D  
hHCN2  
HNF-3beta mRNA for hepatocyte nuclear factor-3 beta0
Human clone 23586 mRNA sequence  
Human putative cyclin G1 interacting mRNA  
Human ras inhibitor mRNA, 3' end  
hydroxyacid oxidase glycolate oxidase 1  
hyperpolarization cyclic nucleotide-gated channel
insulinoma-associated 1 symbol provisional  
integrin beta 4 binding  
isocitrate dehydrogenase 3 NAD+ beta  
jagged1 Alagille syndrome  
KIAA0168    
KIAA0172 
KIAA0181  
KIAA0249 and CpG island  
KIAA0255   
KIAA0308, a LY6 Lymphocyte antigen 6 T-cell 
KIAA0374   
KIAA0395   
KIAA0406 
KIAA0548    
KIAA0552  
KIAA0581    
KIAA0693    
KIAA0772   
KIAA0784  
KIAA0823  
KIAA0860  
KIAA0939  
KIAA0952  
KIAA0964  
KIAA0978
kinesin family member 3B  
Kreisler mouse maf-related leucine zipper KRML
laminin, alpha 5  
lethal 3 malignant brain tumor l3mbt Drosophila homolog  
M88866 comes from this gene [C.elegans] 37
matrix metalloproteinase gelatinase B, type IV collagenase
MCM2/3/5 family member, a pseudogene Cytochrome
microtubule-associated, RP/EB family1  
mitogen inducible gene mig-2 1  
mouse Dhm1 [M.musculus]  
MyD-1 antigen 3  
N-terminal acetyltransferase complex ard1 
neuronal thread AD7c-NTP  
neuronatin
nuclear receptor coactivator 3  
nucleolar KKE/D repeat
ORF YNL059c [S.cerevisiae]  
P24 [M.musculus] 7 
PAK1 LIKE Serine/Threonine-Protein Kinase  PLCB4
PCK1 gene for soluble phosphoenolpyruvate carboxykinase 1
peroxisomal acyl-CoA thioesterase  
Phospholipase C beta 1-Phosphatidylinositol 4,5-Bisphosphate 
phosphoenolpyruvate carboxykinase 1 soluble  
phospholipase C, beta 4  
phospholipase C, gamma 1 formerly subtype 148  
phospholipid transfer  
phosphoprotein  
pleiomorphic adenoma gene-like 2  
PMP24 24 kDa intrinsic membrane   
POLYADENYLATE-BINDING 1 39
potassium voltage-gated channel, Shab-related 
potassium voltage-gated channel, subfamily G,  
PRE-MRNA SPLICING FACTOR RNA HELICASE
preferentially expressed in colorectal cancer  
prion  Creutzfeldt-Jakob disease 
prodynorphin  
proline-rich M14 precursor [M.musculus] 127
proprotein convertase subtilisin/kexin type 2  
prostaglandin I2 prostacyclin synthase  
protease inhibitor 3, skin-derived SKALP  
proteasome  
proteasome prosome, macropain inhibitor subunit 1 PI31  
protective for beta-galactosidase galactosialidosis  
protein kinase C binding 1  
protein kinase cAMP-dependent, catalytic inhibitor gamma  
protein phosphatase 1, regulatory subunit 6  
protein phosphatase 2A BR gamma subunit 100
protein tyrosine phosphatase, non-receptor type 1  
protein tyrosine phosphatase, receptor type, alpha   
protein tyrosine phosphatase, receptor type, T  
PROTEIN-TYROSINE PHOSPHATASE 1B 9
putative brain nuclearly-targeted  
putative oncogene mRNA, partial cds  
putative Rab5-interacting {clone L1-57}
Quions
RAE1 RNA export 1, S.pombe homolog  
RAS-RELATED RAB-31  
rat kidney-specific 108
RENAL SODIUM/DICARBOXYLATE COTRANSPORTER
RETINOBLASTOMA-LIKE 1 4 
retinoblastoma-like 1 p107  
reverse transcriptase  
ribophorin II  
RNA-binding autoantigenic  
S-adenosylhomocysteine hydrolase  
S. cerevisiae CBP3 precursor
S. cerevisiae VPS16 [C.elegans] 47
S.pombe hypothetical C1D4.09C [C.elegans]
S68401 cattle glucose induced gene  
secretory leukocyte protease inhibitor antileukoproteinase  
semenogelin II  
serine/threonine kinase 4  
serine/threonine kinase 15  
small nuclear ribonucleoprotein polypeptide B''  
small nuclear ribonucleoprotein polypeptides B and B1  
sodium-dependent dicarboxylate transporter SDCT2
somatostatin receptor 4 5  
sorting nexin 5  
spermatogenesis associated PD1  
splicing factor CC1.3  
splicing factor, arginine/serine-rich 6  
staufen Drosophila, RNA-binding  
synaptosomal-associated, 25kD  
syndecan 4 amphiglycan, ryudocan  
syndrome, fatal familial insomnia  
syntaxin 16  
syntrophin, alpha dystrophin-associated A1
TAP pseudogene and the 3' l KIAA0188 and
TATA box binding TBP-associated RNA polymerase II
TH1 [D.melanogaster]  
thrombomodulin  
topoisomerase DNA I  
transcription factor 15 basic helix-loop-helix  
transcription factor AP-2 gamma activating enhancer gamma
transcription factor-like 5 basic helix-loop-helix  
transformation-related  
transglutaminase 2 C polypeptide,gamma-glutamyltransferase
transglutaminase 3 E polypeptide,gamma-glutamyltransferase 
translocase of outer mitochondrial membrane 
troponin C2, fast  
tumor necrosis factor receptor superfamily, member 5  
type II CALM/AF10 fusion 6  
ubiquitin carrier E2-C  
UDP-Gal:betaGlcNAc beta 1,4- galactosyltransferase5  
undulin 2  
v-myb avian myeloblastosis viral oncogene homolog-like 2  
VAMP vesicle-associated membrane-associated B and C  
Ydr531wp [S.cerevisiae]  
ZINC FINGER 151 4  
zinc finger 133 clone pHZ-13

Hallervorden-Spatz syndrome

 last updated: 8 Jan 00.  webmaster research
The Human Genome Project has been a boon to inherited disease research. Positioning of thousands of microsatellites has allowed the genes for many diseases to be mapped for the first time, including at least 4 diseases to the prion gene neighborhood in chr 20p12.3. However, for rare diseases, the final mapping resolution might be only a million base pairs and so the gene itself cannot quite be identified. (Gene density on chr 20p12 is perhaps 12 per million bp.)

At the same time, gene-finding tools applied to newly sequenced human chromosomes are rapidly identifying genes that have no known associated disease. Some genes, of course, will not have an associated disease because of minor function or compensation.

The question is, do any inherited diseases mapping to chr 20p12.3 match up with any of the new genes being identified there? It is fairly easy to recover all orphaned diseases (OMIM and Medline) and to list exhaustively all genes for a given stretch of chromosome, but there is no systematic way of matching these up.

For the doppel gene, we might expect an ataxia from mouse studies but as over 409 human ataxias have been described, gene-disease matching is not feasible. For nearby KIAA0168, a nuclear gene with a ras homology domain, again the disease class is too broad (cancer). Four pseudogenes near prion-doppel are not transcribed so probably have no role in any disease.

However, an autosomal recessive disease identified in the 1920's, Hallervorden-Spatz disease (now called neurodegeneration with brain iron accumulation or NBIA type 1; OMIM #234200) may be matchable to a newly discovered ferritin light chain gene just telomeric to the prion gene on chromosome 20.

Ferritin is the major intracellular iron storage protein in many organisms; it concentrates iron 12 orders of magnitude above its solubility. The protein oligomer has the shape of a hollow sphere which stores up to 4500 iron atoms as ferric hydroxide phosphate. Mammalian liver and spleen ferritin consists of 24 subunits, in variable proportions of 21k heavy and 19k light chains, for a total of about 450,000 daltons. The ferritin light chain in these tissues maps to chr 19q13.3-qter, so an alternative brain-specific isoform must be invoked for Hallervorden-Spatz. Mutations of the chr 19 gene in a conserved 5' UTR regulatory region (the iron response element, a cis-acting stem loop) cause hyperferritinemia-cataract syndrome.

Now at least 25 other inherited diseases with iron accumulation are known (OMIM search or SJ Hayflick's list); seven of them have arisen previously in connection with CJD, namely Parkinson, Alzheimer, Huntington, ALS (SOD), CP (ceruloplasmin), Friedreich ataxia, and hemochromatosis. It is very clear that brain iron accumulation can occur in diseases whose genes have nothing to do with iron metabolism. In fact, iron accumulation in Hallervorden-Spatz disease might proceed through the alpha-synuclein Lewy body mode as seen in PD and AD [Neurology 1998 Sep;51(3):887-9].

Human Chromosome 20p12.3

Marker        Zmax   Mapping
...telomere
AFMa057vb1     3.4
D20S906        9.0
D20S113        5.7
GAAT4E12       6.2
D20S198        9.6
D20S842       13.6
AFMa049yd1    13.8
D20S181        7.6
D20S193        7.0
D20S473        8.4
AFMa074wa9     2.3
D20S889              ferritin
D20S116        6.9
D20S867        4.8
...centromere

Mapping in ten affected families has located the NBIA gene to an interval between D20S906 and D20S116 on chromosome 20p12.3-p13. Note that microsatellites within a given contig are readily determined by Blastn(sts) of unmasked sequence. GenBank entries are occasionally annotated for microsatellites. In this instance, the Sanger Centre noted the presence of (CA)n microsatellite D20S889 at positions 81245-81562 of a 130,263 bp chr 20 contig called 681N20 (accession AL031670) or genomic NT_001989.
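A microsatellite such as a (CA)n run can also be spotted directly in raw sequence. The following is only a minimal sketch: the function name, the minimum repeat count of 10, and the toy sequence are illustrative assumptions, not the Sanger Centre's annotation method and not real contig data.

```python
import re

def find_dinucleotide_repeats(sequence, unit="CA", min_repeats=10):
    """Return (start, end, n_repeats) for each uninterrupted run of `unit`.

    Coordinates are 1-based and inclusive, as in GenBank feature tables.
    The min_repeats threshold is an arbitrary illustrative choice.
    """
    seq = sequence.upper()
    pattern = re.compile("(?:%s){%d,}" % (re.escape(unit.upper()), min_repeats))
    hits = []
    for match in pattern.finditer(seq):
        n_repeats = (match.end() - match.start()) // len(unit)
        hits.append((match.start() + 1, match.end(), n_repeats))
    return hits

# Toy sequence (hypothetical): 5 flanking bases, 12 CA repeats, 4 flanking bases.
toy = "GGATT" + "CA" * 12 + "TTGA"
print(find_dinucleotide_repeats(toy))
```

A (CA)n run on one strand reads as (TG)n on the other, so a real scan would look for both units; imperfect repeats would need a more tolerant pattern.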

At the Whitehead Center, D20S889 is found on a single yeast artificial chromosome, YAC 753_G_9, telomeric to D20S116 and the prion gene. Finished contig AL049634, which is even more telomeric, contains D20S906 at positions 28787-29123 (as well as a triplicated PTPNS1 feature). In other words, the new ferritin feature has an excellent fit relative to disease mapping data.

The 23 Nov 1999 entry lists a CpG island at 14306-15895 followed by a ferritin light chain-like (FTLL1) single exon feature at positions 23174-24151 of the minus strand with coding sequence from 23424-23951 (as well as a multi-exon goliath-like gene). The coded protein is 96% identical to liver ferritin light chain. As we will see shortly, it is not at all clear whether this is a gene (as the GenBank entry claims) or a pseudogene.

Since ferritin is a well studied protein, it is instructive to look in other species for precedents of an alternative ferritin light chain. In fact, a second ferritin light chain protein was found in mouse in 1992. It was never mapped; to be homologous to the new ferritin on human chr 20, it would have to map to mouse chr 2. (Mouse also has 11 ferritin pseudogenes on 11 different chromosomes; the homologue to human spleen ferritin is on mouse chr 7.) The second ferritin gene in mouse is intronless, apparently staying functional despite having arisen as a retrotransposed processed mRNA from the main mouse ferritin light chain gene. Some rat strains also have a second ferritin light gene with introns.

Highlights of human chr 20 ferritin relative to authentic chr 19 liver ferritin:

fortuitous new promoter
flanking direct repeats 15 bp
5' UTR, possible iron regulatory element IRE
protein coding region 528 bp, lacking 3 introns
poly A upstream signal
genomic poly A 29 bp
3' UTR
flanking direct repeats 15 bp

agtcaaaacaagcaagcaaactaataatTAAAAtaaacagaaaaaggcaagttggaggaaaccaagatttatttttaaGAATAAGAGGTGATA
ggcagttcggcggtccagtgggtctgtctcttgcttcaacagtgtttggacggaacagatccggggacggtcttccagcctccgaccgccctccaattt
cctctccacttgcaacctccgggaccatcttctcggctatctcctgcttctgggacctgccagcaccgtttttgtcgttagctccttcttggcgaccaacc

ATGAGCTCCCAGATTCGTCAAAATTATTCCACCGACGTGGAGGCAGCCGTCAACAGCCTGGTCAATTTGTACCTGCAGG
CCTCCTACACCTACCTCTCTCTGGGCTTCTATTTCGACCGCGATGATGCGGCTCTGGAAGGCGTGAGCCACTTCTTCCG
CGAATTGACCGAGGAGAAGCGCGAGGGCTACGAGCGTCTCCTGAAGATGCAAAACCAGCGTGGCGGCCGCGCTCTCTTC
CAGGACATCAAGAAGCCAGCTGAAGATGAGTGGGGTAAAACCCCAGATGCCATGAAAGCTGCCATGGCCCTGGAGAAAA
AGCTGAACCAGGCCCTTTTGGATCTTCATGCCCTGGATTCTGCCCACATGGACCCCCATCTCTGTGACTTCCTGGAGAC
TCACTTCCTAGATGAGGAAGTGAAGCTCATCAAGAAGATGGGTGACCACCTGACGAACCTCCACAGGCTGGGAGGCCCA
GAGGCTGGGCTGGGCGAGTATCTCTTCGAAAGGCTCACTCTCAAGCACGTCTAA
gagccttatgagcccagcgact
tctgaagggccccttgcaaagtaatagggcttctgcctaagcctctccctccagccaataggcagctttcttaactaccctaacaagccttggaccaaatgga

AATAAggctttctgatgcAAAAAAAAAAAAAAAAAAAAAAAAAAAAAGAATAAGAGGTGATA
ttacacccatgaaacaagaataagatgctattttgagacaagaaacaaaaaataagctttaaa
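The uppercase coding block above can be checked mechanically for length, reading frame, and internal stops. This is a minimal sketch using the standard genetic code; the names CDS, CODON_TABLE, and translate are illustrative, and the sequence is copied from the listing above rather than fetched from GenBank.

```python
# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Coding region of the chr 20 ferritin feature, as listed in the text.
CDS = (
    "ATGAGCTCCCAGATTCGTCAAAATTATTCCACCGACGTGGAGGCAGCCGTCAACAGCCTGGTCAATTTGTACCTGCAGG"
    "CCTCCTACACCTACCTCTCTCTGGGCTTCTATTTCGACCGCGATGATGCGGCTCTGGAAGGCGTGAGCCACTTCTTCCG"
    "CGAATTGACCGAGGAGAAGCGCGAGGGCTACGAGCGTCTCCTGAAGATGCAAAACCAGCGTGGCGGCCGCGCTCTCTTC"
    "CAGGACATCAAGAAGCCAGCTGAAGATGAGTGGGGTAAAACCCCAGATGCCATGAAAGCTGCCATGGCCCTGGAGAAAA"
    "AGCTGAACCAGGCCCTTTTGGATCTTCATGCCCTGGATTCTGCCCACATGGACCCCCATCTCTGTGACTTCCTGGAGAC"
    "TCACTTCCTAGATGAGGAAGTGAAGCTCATCAAGAAGATGGGTGACCACCTGACGAACCTCCACAGGCTGGGAGGCCCA"
    "GAGGCTGGGCTGGGCGAGTATCTCTTCGAAAGGCTCACTCTCAAGCACGTCTAA"
)

def translate(cds):
    """Translate a DNA coding sequence in frame 1; '*' marks a stop codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))

protein = translate(CDS)
print(len(CDS), "bp;", len(protein) - 1, "residues before the stop")
print(protein)
```

A single trailing '*' and no others means the open reading frame is intact, which is the property the later discussion relies on.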

The annotation graphic shows the first 30000 bp of this contig. The SVA CpG island shows up clearly, as does a less dramatic CpG enriched area upstream to the correct GenScan prediction of ferritin (on the minus strand). The total interspersed repeats percentage is high at 16450 bp or 54.83 %, with SINEs 16.4 %, LINEs 17.25 %, LTR elements 20.8 %, DNA elements 0.40 %, and small nuclear RNA 0.18 %. Three sequence tagged sites are also shown: G14691, G01533, and G43501. D20S889, a major microsatellite marker, is 57084 bp downstream of the ferritin gene.

Since the new human chr 20 ferritin light chain is also intronless, one might guess that the second minor ferritin gene arose prior to rodent/primate divergence. This is not supported by alignment data: the minor ferritins clearly cluster with their respective parental genes (note especially the 7 residue deletion in both human genes), suggesting that the second mouse ferritin represents parallel evolution with respect to the second human ferritin gene, rather than orthology. (However, it remains possible that mutational drift of ferritin light chains is driven by a coupling to the within-species ferritin heavy chain; synteny mapping of the second mouse light gene would settle this.)

Human, like mouse, apparently has a large number of ferritin light chain pseudogenes. Within finished human sequence, 9 strong matches to ferritin light chain are found by tBlastn, notably on chr Xp22 (tandem) and Xq27, chr 1p34, and chr 11q12. Unfinished human sequence, tBlastn(htgs), shows a further 11 features, notably on chromosomes 11, 6, 8, and 1. Since the human genome is half-finished, these numbers could double.

The human chr 20 ferritin light gene is not a pseudogene if you believe the GenBank annotation. The support supplied is in the form of an upstream CpG island and in numerous EST matches. The CpG-enriched area lies 8 kilobases upstream of the gene and is not centered about any known ferritin exon or promoter. CpG islands have no natural polarity and this one could equally well belong to some gene 5' to the contig boundary. Ominously, it coincides with a tandem SVA insertion from 13769-16004. It is conceivable that a CpG island brought in by a transposon was recruited to serve the ferritin gene; more likely this is a retro-CpG feature not serving any host gene.

RepeatMasker, run on 12800-17000 of AL049634 on 10 Jan 00, masks 99.81% of the sequence. SVA is a long terminal repeat composite retroposon with a SINE found in 315 genomic sequences; it is odd to have 4 fragments back to back. The CpG enrichment may have to do with retroviral gene expression or merely an area of very high GC content.

  C  SVA       LTR/Retroviral (937)  449    115  
+  LTR5      LTR/Retroviral   332  875   (94)  
+  SVA       LTR/Retroviral     1  954  (432)  
+  SVA       LTR/Retroviral   630  694  (692)  
+  SVA       LTR/Retroviral   521  882  (504)  
+  SVA       LTR/Retroviral   519 1340   (46)  
A second downstream CpG island in the annotation at 31916..32941 apparently belongs to the following goliath-like gene because it partly envelopes its first untranslated exon 32479..32723.
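The coordinate relationships used in this argument (the first CpG island lying inside the SVA insertion, the second island enveloping the goliath-like first exon) are simple interval comparisons. A minimal sketch, assuming 1-based inclusive coordinates as quoted in the text; the helper names are illustrative.

```python
def overlap_length(a, b):
    """Number of bases shared by two 1-based inclusive intervals (start, end)."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(0, end - start + 1)

def contains(outer, inner):
    """True if `inner` lies entirely within `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]

def gap_between(left, right):
    """Bases strictly between two non-overlapping intervals, left before right."""
    return max(0, right[0] - left[1] - 1)

# Coordinates on AL049634 as quoted in the text.
cpg_island_1 = (14306, 15895)
sva_insertion = (13769, 16004)
ferritin_feature = (23174, 24151)
cpg_island_2 = (31916, 32941)
goliath_exon_1 = (32479, 32723)

print(contains(sva_insertion, cpg_island_1))         # island 1 inside the SVA block?
print(overlap_length(cpg_island_1, sva_insertion))   # bases of island 1 within SVA
print(gap_between(cpg_island_1, ferritin_feature))   # spacing to the ferritin feature
print(contains(cpg_island_2, goliath_exon_1))        # island 2 covers goliath exon 1?
```

On these numbers the first island falls wholly within the SVA block, which is the basis for suspecting a transposon-derived CpG feature.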

It is equally crucial to look closely at proposed evidence of transcription to be sure that ESTs should not be allocated to the main ferritin light gene on chromosome 19 instead. This is a common mistake in computer gene annotation -- to set a high threshold for Blast matches and assume the ESTs surviving the cutoff belong to the gene under study. Here the two genes are so close in sequence that only their 7 diagnostic differences drive EST allocation [arrows on graphic] via tBlastn(est). Similar diagnostic residues are found in flanking UTR that allocate ESTs not reaching coding regions.

In fact, not a single one of the 100+ EST "matches" at GenBank belongs to the chromosome 20 ferritin light gene. After allowing for an inherent error rate in EST sequence determination, all belong to the main ferritin light chain gene on chr 19. Therefore there is no support for feature transcription. Studies in the mid-80's found no evidence at the protein level for light chain heterogeneity. (This gene may only be rarely transcribed and/or transcribed only in tissues not used in compiling the EST databases -- the disease manifests itself mainly in basal ganglia within the globus pallidus.)

The easy way to allocate ESTs is to take a differentially diagnostic stretch of 60 nucleotides at a time (one line of blast output), 0 descriptions, 100 alignments as 'flat master-slave with identities.' Below, EST matches to chr 20 probe are shown: all 100 belong to chr 19 ferritin according to the 3 diagnostic bases:

chr_20   1   ttcttggcgaccaaccatgagctcccagattcgtcaaaattattccaccgacgtggaggc 60
3086553  222 ........a...........................g....................... 163
6361411  236 ......c.a...........................g....................... 177
5880135  224 ......c.a...........................g....................... 165
5662977  224 ......c.a...........................g....................... 165
...
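The allocation rule can be sketched in a few lines: in a 'flat master-slave with identities' display a dot means the EST agrees with the chr 20 probe, so any letter in a diagnostic column argues against chr 20. The diagnostic columns passed in below (9 and 37, where all four rows shown carry a letter) are an assumption read off the rows; the text does not list which three bases it treats as diagnostic, and the function names are illustrative.

```python
MASTER = "ttcttggcgaccaaccatgagctcccagattcgtcaaaattattccaccgacgtggaggc"

# EST rows copied from the alignment above, keyed by the identifier shown there.
ROWS = {
    "3086553": "........a...........................g.......................",
    "6361411": "......c.a...........................g.......................",
    "5880135": "......c.a...........................g.......................",
    "5662977": "......c.a...........................g.......................",
}

def differences(master, row):
    """List (position, master_base, est_base) for every non-identity column (1-based)."""
    return [(i, m, r) for i, (m, r) in enumerate(zip(master, row), start=1) if r != "."]

def allocate(row, diagnostic_positions):
    """Assign an EST to 'chr 20' only if it matches the probe at every diagnostic column.

    Simplification: any mismatch at a diagnostic column is read as 'chr 19',
    without checking that the EST base is the chr 19 base.
    """
    if all(row[p - 1] == "." for p in diagnostic_positions):
        return "chr 20"
    return "chr 19"

for est_id, row in ROWS.items():
    print(est_id, differences(MASTER, row), allocate(row, (9, 37)))
```

Sliding such a 60-nucleotide window along the feature, one diagnostic stretch at a time, keeps sequencing errors elsewhere in an EST from influencing the call.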
The coded protein has no internal stop codons nor frameshifts and its 7 amino acid differences are fairly conservative (though this is an inherent property of the genetic code). Key charged residues involved in iron sequestration have not been affected though an alanine to threonine change is seen within this region. [Of the H-, L-, and M-type ferritin subunits in animals, only the H and M types have a functional diiron site. Ferritins concentrate iron in cells as a mineral; cytotoxic reactions of both Fe2+ and O2 are controlled by ferritin chemistry as ferric oxo reaction products are directed to a large cavity as a ferric oxide hydrate. A peroxodiferric intermediate in the ferritin ferroxidase reaction shows the ferritin ferroxidase site to be very similar to O2-activating diiron enzyme sites.]

It cannot be determined bioinformatically whether the hypothetical chr 20 ferritin would interact properly with ferritin heavy chain to form a (globus pallidus-specific?) functional 24-mer nor even whether this would be the function of the new gene. [The 5' end potentially continues for 19 additional amino acids but this has no Blastp(nrp) support.] The 5' leader sequence could also be tested for stem-loop changes affecting the iron regulatory element IRE.

Could the new ferritin light chain be "older than it looks?" That is, the ratio of silent (synonymous) to amino acid-changing (non-synonymous) changes may be high when the parental gene is used for comparison, as would be expected if selection had been preserving the protein. But this is not so: 7 of 14 nucleotide changes are non-synonymous, a 1:1 ratio. Adjacent UTR might establish baseline rates of mutational fixation in this region of the genome except for the fact that these 5' regions are conserved regulatory regions.
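Sorting nucleotide changes into synonymous and non-synonymous classes is a codon-by-codon comparison of two aligned, gap-free coding sequences. The sketch below is a simplification (a codon with several changes has all of them assigned to one class), and the two short sequences are toy examples, not the chr 19 and chr 20 genes.

```python
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def classify_changes(reference_cds, variant_cds):
    """Count (synonymous, non_synonymous) nucleotide changes between two aligned CDSs."""
    ref, var = reference_cds.upper(), variant_cds.upper()
    if len(ref) != len(var):
        raise ValueError("sequences must be aligned and of equal length")
    synonymous = non_synonymous = 0
    for i in range(0, len(ref) - len(ref) % 3, 3):
        ref_codon, var_codon = ref[i:i + 3], var[i:i + 3]
        n_changes = sum(1 for x, y in zip(ref_codon, var_codon) if x != y)
        if n_changes == 0:
            continue
        if CODON_TABLE[ref_codon] == CODON_TABLE[var_codon]:
            synonymous += n_changes
        else:
            non_synonymous += n_changes
    return synonymous, non_synonymous

# Toy example: CAA -> CAG is silent (Gln), GCT -> ACT changes Ala to Thr.
print(classify_changes("ATGCAAGCT", "ATGCAGACT"))
```

A gene under purifying selection is expected to show a clear excess of synonymous changes; a ratio near 1:1, as reported here, is what relaxed constraint tends to look like.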

The 3' UTRs agree at 114 of the first 116 positions, slightly better than coding region. This does not support selective pressure acting at the protein level for some time interval following gene establishment. At a rate of 3 mutations fixed per 100 residues per 10 million years (a generic pseudogene rate), the ferritin feature on chr 20 is roughly 8.3 million years old.
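The age figure can be reproduced arithmetically. The sketch below assumes the 8.3 million year estimate pools the 14 coding changes (over 528 bp) with the 2 changes in the first 116 positions of the 3' UTR; that pooling is an inference from the numbers quoted, not something the text states, and the function name is illustrative.

```python
def age_in_million_years(changes, positions, rate_per_100_per_10_my=3.0):
    """Age implied by a divergence, at a fixed rate of changes per 100 sites per 10 My."""
    divergence_percent = 100.0 * changes / positions
    return divergence_percent / rate_per_100_per_10_my * 10.0

coding_changes, coding_length = 14, 528   # from the coding comparison
utr_changes, utr_length = 2, 116          # 114 of 116 positions agree

pooled = age_in_million_years(coding_changes + utr_changes,
                              coding_length + utr_length)
print("coding only: %.1f My" % age_in_million_years(coding_changes, coding_length))
print("3' UTR only: %.1f My" % age_in_million_years(utr_changes, utr_length))
print("pooled:      %.1f My" % pooled)
```

With only 16 changes in total, the spread between the coding-only and UTR-only figures shows how rough such a date is.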

SwissProt chr 19 ferritin P02792

  INIT_MET      0      0
  MOD_RES       1      1       ACETYLATION.
  DOMAIN       53     60       CATALYTIC SITE FOR IRON OXIDATION.
  METAL        53     53       IRON (POTENTIAL).
  METAL        56     56       IRON (POTENTIAL).
  METAL        57     57       IRON (POTENTIAL).
  METAL        60     60       IRON (POTENTIAL).
  METAL        63     63       IRON (POTENTIAL).

What is the bottom line here? On the one hand, the chromosome 20 ferritin has all the properties of a classical pseudogene derived recently from the liver ferritin mRNA from chr 19: it is processed (intronless) and retropositioned (direct flanking repeats, genomic poly A). There is no evidence that it is ever transcribed, much less translated into protein, much less specifically in the affected globus pallidus in HSS. (Note: gene expression (ESTs) has never been studied from basal ganglia in the globus pallidus.)

On the other hand, the insertion is full length (not truncated) and fortuitously follows a pentanucleotide identical to authentic ferritin promoter (if the entry to X03742 is to be believed). There are many precedents for intron-purged functional genes, especially from the X chromosome, arising in this manner. It is common for gene duplications to diverge in function via tissue-specific expression. It is also biochemically plausible that a ferritin light chain, when suitably mutated, could give rise to iron deposition. And here is a disease of iron accumulation mapping very close to a genomic feature concerned with iron metabolism.

Thus it is premature to decide between gene/pseudogene or to posit a role in Hallervorden-Spatz syndrome. It may be necessary to find all gene candidates in the mapping region and sequence all of them as well in affected individuals. However, the single-exon ferritin feature could readily be sequenced in a single pass and is a prime candidate for an early screen. Its diagnostic sequence differences also allow design of specific primers that could amplify rare or tissue-specific mRNAs. The putative protein could be produced recombinantly and possibly be resolved from liver ferritin by monoclonals.

Genes involved with iron metabolism are not that common. To have the disease map so close to such a gene, in conjunction with a lack of other plausible candidates from the current human genome project, favors this ferritin light gene as responsible for HSS.

References:

Homozygosity mapping of Hallervorden-Spatz syndrome to chromosome 20p12.3-p13.

Nat Genet 1996 Dec;14(4):479-81  
Published erratum appears in Nat Genet 1997 May;16(1):109 concerns unpublished locus heterozygosity
Taylor TD, Litt M, Kramer P, Pandolfo M, Angelini L, Nardocci N, Davis S, Pineda M, Hattori H,
Flett PJ, Cilio MR, Bertini E, Hayflick Susan J (503) 494-7703
Excellent HSS research website at OHSU
Hallervorden-Spatz syndrome is a rare, autosomal recessive neurodegenerative disorder with brain iron accumulation as a prominent finding. Clinical features include extrapyramidal dysfunction, onset in childhood, and a relentlessly progressive course. Histologic study reveals massive iron deposits in the basal ganglia. Systemic and cerebrospinal fluid iron levels are normal, as are plasma levels of ferritin, transferrin and ceruloplasmin. Conversely, in disorders of systemic iron overload, such as haemochromatosis, brain iron is not increased, which suggests that fundamental differences exist between brain and systemic iron metabolism and transport. In normal brain, non-haem iron accumulates regionally and is highest in basal ganglia. Pathologic brain iron accumulation is seen in common disorders, including Parkinson's disease, Alzheimer's disease and Huntington disease. In order to gain insight into normal and abnormal brain iron transport, metabolism and function, our approach was to map the gene for HSS. A primary genome scan was performed using samples from a large, consanguineous family (HS1) (see Fig. 1). While this family was immensely powerful for mapping, the region demonstrating homozygosity in all affected members spans only 4 cM, requiring very close markers in order to detect linkage. The HSS gene maps to an interval flanked by D20S906 and D20S116 on chromosome 20p12.3-p13. Linkage was confirmed in nine additional families of diverse ethnic backgrounds.

A second ferritin L subunit is encoded by an intronless gene in the mouse.

Mamm Genome 1992;2(3):143-9 
Renaudie F, Yachou AK, Grandchamp B, Jones R, Beaumont C
Multiple homologous sequences for the ferritin L subunit are present in mammalian genomes, but so far, only one expressed gene has been described. Here we report the isolation of a cDNA from a mouse bone marrow library, corresponding to an isoform of the mouse ferritin L subunit. This new subunit, that we named Lg, differs from the L subunit of ten amino acids. Specific amplification of mouse genomic DNA using the polymerase chain reaction (PCR) confirmed the presence of this Lg sequence in the mouse genome but also suggested that it must be encoded by an intronless gene.

Using a series of different Lg-specific oligonucleotides as probes, we subsequently isolated a genomic clone containing an uninterrupted sequence, identical to the Lg cDNA. This Lg gene lacks introns and does not contain the 28 base pairs (bp) conserved motif usually present at the 5' end of most ferritin mRNAs, which confers translational regulation by iron. When transiently transfected into K562 cells, this Lg genomic clone is actively transcribed, suggesting that, although it possesses the characteristics of a processed pseudogene, it is likely to correspond to the gene encoding this new ferritin subunit.

[Cloning, characterization and expression of mouse ferritin L subunit gene].

C R Acad Sci III 1995 Apr;318(4):431-7 
Renaudie F, Boulanger L, Grandchamp B, Beaumont C
We have cloned the functional gene coding for the L ferritin subunit by successive rounds of screening of a mouse genomic library using different oligonucleotides so as to avoid cloning the multiple pseudogenes of this rather complex multigene family. The L gene consists in 4 exons interrupted by 3 introns and spanning 1.8 kb. Quantitative measurements of H and L ferritin mRNA in various mouse tissues using a ribonuclease protection assay reveals important variations in the L/H ratio, the liver displaying the highest amount of L mRNA. Functional analysis of 1 kb of upstream sequence by transient transfections into the hepatoma cell line HepG2 shows that the mouse L gene transcription relies upon a minimal 130 bp promoter region containing 1 TATA box and 2 CCAAT motifs. Elements with an enhancing activity specific of hepatic tissue are likely to be located outside of this 1 kb fragment.

Translational repression in eukaryotes: partial purification and characterization of repressor of ferritin mRNA translation.

Proc Natl Acad Sci U S A 1988 Dec;85(24):9503-7
Walden WE, Daniels-McQueen S, Brown PH, Gaffield L, Russell DA, Bielser D, Bailey LC, Thach RE
Mouse and rabbit ferritin mRNAs translate very poorly in rabbit reticulocyte lysates relative to most other mRNAs. This translational deficiency is not seen in wheat germ lysates, suggesting the presence of an inhibitor in reticulocyte lysate that is specific for ferritin mRNA. A specific repressor of ferritin mRNA translation has been partially purified. The inhibitory activity of this repressor against native ferritin mRNA can be relieved by adding in vitro transcripts of ferritin light-chain RNAs that contain the first 92 nucleotides of the 5' untranslated region. No other sequences appear to be necessary for this effect.

Characterization and evolution of the expressed rat ferritin light subunit gene and its pseudogene family. Conservation of sequences within noncoding regions of ferritin genes.

J Biol Chem 1987 May 25;262(15):7335-41
Leibold EA, Munro HN
The iron storage protein ferritin consists of two types of subunits of different molecular weight, heavy (H) and light (L). The rat genome contains approximately 20 copies of the ferritin L-subunit gene, of which we have sequenced seven. One is an expressed ferritin gene containing three introns located between the alpha-helical domains of the L-subunit protein. The remaining six have the characteristics of processed pseudogenes. Sequence divergence suggest that these pseudogenes arose approximately 3-12 X 10(6) years ago. By using intron probes derived from the expressed ferritin L-gene, a homologous second copy has been identified in some Fischer rats. Comparison of the 5'-untranslated region of the rat L-gene with the published sequences of this region of the human L show a strongly conserved 28-base pair sequence, suggesting a translational regulatory function. The 5' flanking region of the rat L-gene contains sequences homologous to those in the flanking areas of the human L- and H-genes.

A processed pseudogene with retained intron

last updated:  3 Jan 00 webmaster
 61853  bp remaining in contig HSJ189G13 
  1396  bp to start of 406 bp psL7a feature
106461  bp to ATG of prion
 25331  bp to ATG of doppel
 38620  bp to start of dynein
 16617  bp to end of contig
 45000  bp to end of KIAA0168
295278  bp total length
We look here at the region upstream of the prion-doppel tandem to see what the nearest neighbor might be in the 5' direction. The analysis is based on a 63,249 bp unfinished contig, AL121916, that showed up at htgs on 19 Dec 99; the feature of interest lies at positions 56750-57209. The adjacent 30,000 bp is basically just another heavily parasitized wasteland of the human genome, in this case with nothing in it but a slightly entertaining pseudogene for the large-subunit ribosomal protein L7a.
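As a quick sanity check on the distance table above, the rows can be summed. The sketch below assumes the rows are consecutive, non-overlapping segments (the table does not say so explicitly); the labels are shortened copies of the table text, and the variable names are purely illustrative.

```python
# Segment lengths (bp) copied from the distance table, in the order listed.
# Assumption: the rows are consecutive, non-overlapping segments.
segments = [
    ("remaining in contig HSJ189G13", 61853),
    ("to start of psL7a feature", 1396),
    ("to ATG of prion", 106461),
    ("to ATG of doppel", 25331),
    ("to start of dynein", 38620),
    ("to end of contig", 16617),
    ("to end of KIAA0168", 45000),
]

# Print each row with a running (cumulative) distance.
running = 0
for label, length in segments:
    running += length
    print(f"{length:>7} bp  {label:<30} cumulative {running:>7}")

total = sum(length for _, length in segments)
first_two = segments[0][1] + segments[1][1]

print("sum of all rows:", total)        # compare with the 295278 bp total in the table
print("first two rows :", first_two)    # compare with the 63,249 bp quoted for AL121916
```

Under that reading, the seven rows add up to the stated 295278 bp total, and the first two rows add up to the 63,249 bp quoted for the contig.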

There are many dozens of these L7a pseudogenes in the human genome; chromosome 20 alone will have about 8 of them. We saw earlier another ribosomal protein, RPS4X, nearer the prion gene. (Proteins made in great abundance have many mRNA copies and so a greater likelihood of retrotransposition?)

Processing the 30 kb contig through GeneBander, it quickly emerges that GenScan is mostly predicting actively transcribed repeat elements, not host genes. GenScan does however correctly find a piece of this pseudogene in the midst of an otherwise erroneously predicted gene.

It has become fashionable to inflate gene counts and downplay pseudogenes, e.g., the recent 3 Arabidopsis papers or human chr 22, via reliance on GenScan and XGrail. The psychology is that genome sequencing is such hard work that long barren stretches are more than anyone can bear to report, leading to gene exaggeration.

It is much better to tblastn(nrp) the blastn(est) repeat-masked sequence matches because the mRNA database is approaching saturation. ESTs have experimental reality (even though not all are mRNAs); ab initio predictions do not.

It is soon evident that the chr 20 feature contains exons 6, 7, and 8 of the authentic L7a gene, PRPL7a (but without the intervening introns), as well as the 3' UTR, two poly A signals, and a poly A site, followed by genomic poly T (the feature is on the minus strand relative to prion and doppel). While direct flanking repeats are not evident, a translation gives both an internal frameshift and a stop codon, though identity is 86% after frame-jumping. There is no evident association with retrotransposons, nor are there any ESTs that 'belong' to this genomic stretch.
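The minus-strand reading can be reproduced by hand. The sketch below reverse-complements part of the 460 bp psL7a reference sequence listed further down this page and translates it with the standard genetic code. The coordinates 154 down to 62 are those of the last alignment block printed with that sequence; the function and variable names are illustrative, not part of any tool used for the original analysis.

```python
# 460 bp chr 20 psL7a reference sequence, copied from the listing on this page.
PSL7A = (
    "aaggaaaatttctattattttaattatttttatgtacagaaaactcaacagcgtac"
    "atttaacccagtttagtcgcaagttctttagccttcgccttttttagcttggtgat"
    "gcgagccacagacttgggacccaggacattacctccccagtgacagcagatctcat"
    "cgtatctgtcgctgtaattggtcctgatagtttccaccagcttagccaaagctcct"
    "ttgtcttccaagttaacctgtgcgaaggcagcagtggtgcctctcttcttgtggac"
    "tagaagtcccagtcttgacttccccttgataaagcagtaagggccccccatttttt"
    "atgacacagggcaggcaggaagacaaccagctagaagaaagcactagctgaagagc"
    "atattttgaccaaaagcagtaaatttcaaagctagctgggtagcaactgctctgggttaaaaagttca"
).upper()

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate_minus(seq: str, high: int, low: int) -> str:
    """Translate the minus strand from 1-based position `high` down to `low`."""
    segment = reverse_complement(seq[low - 1:high])
    return "".join(
        CODON_TABLE[segment[i:i + 3]] for i in range(0, len(segment) - 2, 3)
    )


block3 = translate_minus(PSL7A, 154, 62)
print(len(PSL7A), "bp")
print(block3)
```

Read this way, positions 154 to 62 give HWGGNVLGPKSVARITKLKKAKAKELATKLG, the chr 20 row of the last alignment block. The other two blocks could be checked the same way with their own coordinates, bearing in mind that they sit in different reading frames because of the frameshift.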

This fits the classical picture of a processed, non-expressed, recent retropositional mRNA pseudogene. However, a segment of intron 5 of authentic L7a also had an unmistakable match within the feature. Processed pseudogenes don't have introns. What is going on -- might this really be a genomic transposition? [This would possibly implicate prion and doppel as having originated elsewhere in the same event.]

No. Two ESTs also cover this intron (which then isn't really intronic): AI274211 and AI272858. In other words, alternative splicing occurs in the L7a gene (rarely today, less than 2%), leading to mRNAs with an upstream extension of exon 6. The ESTs themselves were too short to determine the upstream splice acceptor directly and could not be further tiled. Conceptual translations and web splice-site prediction tools resolve this.

If alternatively spliced L7a mRNAs were as uncommon at the time of the event 15 million years ago as they are today, we are left thinking it odd that one of these was 'chosen' for retrotransposition and wondering if the 5' truncation in this very region is coincidental, that is, whether some structural anomaly of the alternatively spliced message predisposed it to fragmentary retrotransposition.

A ribosome does not seem offhand like a good place for alternatively spliced proteins because of the need for translational fidelity; it is not easily checked if other ribosomal proteins exhibit this property. L7a is highly conserved: the human protein is not far from Drosophila or for that matter yeast. In fact, the 'extended' exon 6s are the rule in species such as Schizosaccharomyces pombe, suggesting the mammalian L7a short splice may not in fact be ancestral.

There has not been a lot of 'action' in chromosome 20 over the last 500 million years, either at the 300,000 bp scale here or on the p arm of the whole chromosome. Horse prion mapped to horse chr 22 [December 99 Genome Research] as did all human chr 20 q genes tested. That is why finding the KIAA0168 gene in zebrafish was so important; even if zfish prions cannot be primed, they can be identified from proximity to KIAA0168 and are very likely to be present on its clones.

chr 20 genomic psL7a reference sequence: 460 bp
aaggaaaatttctattattttaattatttttatgtacagaaaactcaacagcgtac
atttaacccagtttagtcgcaagttctttagccttcgccttttttagcttggtgat
gcgagccacagacttgggacccaggacattacctccccagtgacagcagatctcat
cgtatctgtcgctgtaattggtcctgatagtttccaccagcttagccaaagctcct
ttgtcttccaagttaacctgtgcgaaggcagcagtggtgcctctcttcttgtggac
tagaagtcccagtcttgacttccccttgataaagcagtaagggccccccatttttt
atgacacagggcaggcaggaagacaaccagctagaagaaagcactagctgaagagc
atattttgaccaaaagcagtaaatttcaaagctagctgggtagcaactgctctggg
ttaaaaagttca

chr 20:366 LVVFLPALCHK 334
           LVVFLPALC K      site of frameshift
L7a:   166 LVVFLPALCRK 176

chr 20:334 KMGGPYCFIKGKSRLGLLVHKKRGTTAAFAQVNLEDKGALAKLVETIRTNYSDRYDEICC 155
           KMG PYC IKGK+RLG LVH+K  TT AF QVN EDKGALAKLVE IRTNY+DRYDEI  
L7a:   176 KMGVPYCIIKGKARLGRLVHRKTCTTVAFTQVNSEDKGALAKLVEAIRTNYNDRYDEIRR 235

chr 20:154 HWGGNVLGPKSVARITKLKKAKAKELATKLG 62
           HWGGNVLGPKSVARI KL+KAKAKELATKLG
L7a:   236 HWGGNVLGPKSVARIAKLEKAKAKELATKLG 266 
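The identities in the three alignment blocks above can be tallied directly, and the nucleotide coordinates can be checked against the number of residues (three bases per residue on the minus strand). The sketch below is only a bookkeeping aid under the assumption that the blocks are ungapped as printed; the tuple layout and names are illustrative.

```python
# Each entry: (chr 20 start, chr 20 end, chr 20 translation,
#              L7a start, L7a end, L7a protein), copied from the alignment.
BLOCKS = [
    (366, 334, "LVVFLPALCHK", 166, 176, "LVVFLPALCRK"),
    (334, 155,
     "KMGGPYCFIKGKSRLGLLVHKKRGTTAAFAQVNLEDKGALAKLVETIRTNYSDRYDEICC",
     176, 235,
     "KMGVPYCIIKGKARLGRLVHRKTCTTVAFTQVNSEDKGALAKLVEAIRTNYNDRYDEIRR"),
    (154, 62, "HWGGNVLGPKSVARITKLKKAKAKELATKLG",
     236, 266, "HWGGNVLGPKSVARIAKLEKAKAKELATKLG"),
]


def identities(a: str, b: str) -> int:
    """Count positions where two equal-length, ungapped rows agree."""
    if len(a) != len(b):
        raise ValueError("rows must have the same length")
    return sum(x == y for x, y in zip(a, b))


results = []
for nt_hi, nt_lo, pseudo, aa_lo, aa_hi, real in BLOCKS:
    same = identities(pseudo, real)
    codons = (nt_hi - nt_lo + 1) // 3   # residues implied by the chr 20 span
    residues = aa_hi - aa_lo + 1        # residues implied by the L7a span
    results.append((same, len(pseudo), codons, residues))
    print(f"chr 20 {nt_hi}-{nt_lo}: {same}/{len(pseudo)} identical, "
          f"{codons} codons, {residues} L7a residues")

total_same = sum(r[0] for r in results)
total_len = sum(r[1] for r in results)
print(f"overall: {total_same}/{total_len} = {100 * total_same / total_len:.1f}%")
```

Over these three blocks alone the count comes to 85 of 102 positions (about 83%), so the 86% quoted in the text was presumably computed over a somewhat different span or with different scoring. The shared residue K at L7a position 176 appears in both of the first two blocks, which is where the frameshift falls.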

PPMD, CHED 1, and CHED 2

11 Jan 99 webmaster
These diseases also map very near prion-doppel; more details shortly.
OMIM: The corneal dystrophies can be classified according to the site of predominant involvement, the cornea having 5 layers: from outside inward, epithelium, Bowman membrane, stroma, Descemet membrane, and endothelium. Most cases are recessive.

Exclusion of AR-CHED from the chromosome 20 region containing the PPMD and AD-CHED loci.

Ophthalmic Genet 1999 Dec;20(4):243-249
Kanis AB, Al-Rajhi AA, Taylor CM, Mathers WD, Folberg RY, Nishimura DY, Sheffield VC, Stone EM
This study sought to determine whether AR-CHED segregating in a consanguineous Saudi Arabian pedigree is linked to the previously mapped and overlapping loci for AD-CHED and PPMD on the pericentric region of chromosome 20. Forty members of a consanguineous Saudi Arabian pedigree segregating AR-CHED were ascertained. Short tandem-repeat polymorphic markers from the 20 cM interval on chromosome 20 containing both the PPMD and AD-CHED loci were used to genotype these individuals. LOD score analysis of the genotype data with the MENDEL software package utilizing a model of autosomal recessive inheritance with complete penetrance showed exclusion of CHED from the entire PPMD/AD-CHED interval by utilizing overlapping intervals of LOD scores of at least -2. The results obtained demonstrate that AR-CHED is not allelic to either AD-CHED or PPMD, although it has been proposed that AD-CHED may be allelic to PPMD.

Localization of the gene for autosomal recessive congenital hereditary endothelial dystrophy (CHED2) to chromosome 20 by homozygosity mapping.

Genomics 1999 Oct 1;61(1):1-4
Hand CK, Harmon DL, Kennedy SM, FitzSimon JS, Collum LM, Parfrey NA
Congenital hereditary endothelial dystrophy (CHED) is a corneal disorder that presents with diffuse bilateral corneal clouding. Both autosomal dominant (AD) and autosomal recessive (AR) forms of the disorder have been described. The gene responsible for AD CHED (HGMW-approved symbol CHED1) has been mapped to the pericentromeric region of chromosome 20. Investigating a large, consanguineous Irish pedigree with autosomal recessive CHED, we have previously excluded linkage to this AD CHED locus. We now describe a genome-wide search using homozygosity mapping and DNA pooling. Evidence of linkage to chromosome 20p was demonstrated with microsatellite marker D20S482.

A region of homozygosity in all affected individuals was identified, narrowing the disease gene locus to an 8-cM region flanked by markers D20S113 and D20S882. This AR CHED (HGMW-approved symbol CHED2) disease gene locus is physically and genetically distinct from the AD CHED locus.

Autosomal recessive CHED and autosomal dominant CHED are genetically distinct.

Br J Ophthalmol 1999 Jan;83(1):115-9
Callaghan M, Hand CK, Kennedy SM, FitzSimon JS, Collum LM, Parfrey NA
Conventional genetic analysis in addition to a pooled DNA strategy excludes linkage of AR CHED to the AD CHED and larger PPMD loci. This demonstrates that AR CHED is genetically distinct from AD CHED and PPMD.

Linkage of posterior polymorphous corneal dystrophy to 20q11.

Hum Mol Genet 1995 Mar;4(3):485-8
Heon E, Mathers WD, Alward WL, Weisenthal RW, Sunden SL, Fishbaugh JA, Taylor CM, Krachmer JH, Sheffield VC, Stone EM 
Posterior polymorphous dystrophy (PPMD) is an autosomal dominant disorder of the cornea that is clinically recognized by the presence of vesicles on the endothelial surface of the cornea. The corneal endothelium is normally a single layer of cells that lose their mitotic potential after development is complete. In PPMD, the endothelium is often multi-layered and has several other characteristics of an epithelium including the presence of desmosomes, tonofilaments, and microvilli. These abnormal cells retain their ability to divide and extend onto the trabecular meshwork to cause glaucoma in up to 40% of cases. A large family with 21 members affected with PPMD was genotyped with short tandem repeat polymorphisms distributed across the autosomal genome. Linkage was established with markers on the long arm of chromosome 20. The highest observed LOD score was 5.54 (theta = 0) with marker D20S45. Analysis of recombination events in four affected individuals revealed that the disease gene lies within a 30cM interval between markers D20S98 and D20S108.
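For readers unfamiliar with the LOD scores quoted in these abstracts, the sketch below computes a two-point LOD score in the simplest textbook setting: fully informative, phase-known meioses. That setting is an assumption made for illustration only. The studies above used pedigree likelihoods (for example via the MENDEL package), and the counts in the example calls are hypothetical, not taken from any of the papers.

```python
import math


def lod(theta: float, n_meioses: int, n_recombinants: int) -> float:
    """Two-point LOD score at recombination fraction theta.

    Simplifying assumption: every meiosis is informative and phase-known,
    so the likelihood is theta**k * (1 - theta)**(n - k), compared with
    the unlinked value 0.5**n.
    """
    n, k = n_meioses, n_recombinants
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    if not 0 <= k <= n:
        raise ValueError("recombinants must lie between 0 and n_meioses")
    if theta == 0.0 and k > 0:
        # Any recombinant is impossible under complete linkage.
        return float("-inf")
    log_linked = (k * math.log10(theta) if k else 0.0) \
        + (n - k) * math.log10(1.0 - theta)
    log_unlinked = n * math.log10(0.5)
    return log_linked - log_unlinked


# Hypothetical counts, for illustration only.
no_recombinants = lod(0.0, 10, 0)    # each such meiosis adds log10(2)
four_of_ten = lod(0.05, 10, 4)       # many recombinants close to the marker
print(f"{no_recombinants:.2f} {four_of_ten:.2f}")
```

In this simplified model, each non-recombinant meiosis contributes log10(2), about 0.30, to the score at theta = 0, which gives a feel for the amount of data behind a maximum LOD of 5.54. A score of -2 or lower at a given theta is the conventional threshold for excluding linkage there, as in the AR-CHED exclusion study cited above.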
