Supplementary Table 1. Proteins for which many of the dbSNP variants were likely to be mistakenly inferred. The first column is the amino acid substitution, the second column is the SIFT prediction, and the third column contains information about the sequence from which the nsSNP was inferred. Variants that may exist in the human population are in bold; these SNPs were the only base change in the sequence source and were detected in normal individuals (as opposed to cancer cells, which may have somatic mutations).
For variants submitted by Irizarry, et al. (1), the links from dbSNP to the sequence source were followed and sequences with quality >= 30 were retrieved. For variants submitted by Sunyaev, et al., evidence of the source of the variant was not provided in either HGBASE or dbSNP. Since these variants were inferred from EST databases (2), the reference gene was searched against the human EST database at EMBL, and the highest-scoring hit that contained the SNP was inferred to be the source of the variant.
1. Irizarry, K., Kustanovich, V., Li, C., Brown, N., Nelson, S., Wong, W., Lee, C.J. (2000) Genome-wide analysis of single-nucleotide polymorphisms in human expressed sequences. Nature Genetics 26:233-236.
2. Sunyaev, S., Hanke, J., Brett, D., Aydin, A., Zastrow, I., Lathe, W., Bork, P., Reich, J. (2000) Individual variation in protein-coding sequences of human genome. Advances in Protein Chemistry 54:409-437.
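The selection rules described in the notes above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used to build the table: the `EstHit` record, its field names, and the function names are hypothetical, the accessions in the usage example are placeholders, and the quality scale behind the threshold of 30 is assumed rather than specified in the text.

```python
from dataclasses import dataclass
from typing import List, Optional

QUALITY_THRESHOLD = 30  # "quality >= 30" from the note above; the scale is assumed


@dataclass
class EstHit:
    """Hypothetical record for one database hit against the reference gene."""
    accession: str              # placeholder accession, not a real record
    score: float                # alignment score from the EST database search
    has_variant: bool           # hit carries the substitution of interest
    other_changes: int = 0      # additional base changes relative to the reference
    normal_donor: bool = False  # donor tissue is normal rather than tumor-derived


def passes_quality(quality: int) -> bool:
    """Keep only sequence meeting the quality threshold."""
    return quality >= QUALITY_THRESHOLD


def infer_source(hits: List[EstHit]) -> Optional[EstHit]:
    """Pick the highest-scoring hit that contains the variant, if any."""
    carriers = [h for h in hits if h.has_variant]
    return max(carriers, key=lambda h: h.score) if carriers else None


def may_exist_in_population(hit: EstHit) -> bool:
    """Caption criterion: the only base change, seen in a normal individual."""
    return hit.has_variant and hit.other_changes == 0 and hit.normal_donor


# Usage with made-up hits: the top-scoring hit lacks the variant, so the
# second-best hit is taken as the inferred source.
hits = [
    EstHit("EST_A", 950.0, has_variant=False),
    EstHit("EST_B", 900.0, has_variant=True, other_changes=0, normal_donor=True),
    EstHit("EST_C", 700.0, has_variant=True, other_changes=3),
]
source = infer_source(hits)
```

Keeping the source inference separate from the population criterion mirrors the table: a variant can have an identifiable source sequence and still fail the test for being a plausible polymorphism.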
Proteasome subunit beta 7 (XP_005463), highly expressed |
||
V39A |
Tolerated |
Submitter: Irizarry, et al. R72984: Donor- placenta; T97170: Donor- fetal liver spleen; AA461585: Donor- male testis; R48896: Donor- breast; AA186917: Donor- endothelial cell; AA082245: Donor- endothelial cell; H50560: Donor- fetal liver spleen; H48417: Donor- fetal liver; AA002190: Donor- fetal liver; R93669: Donor- fetal liver spleen; H58983: Donor- fetal liver spleen; H67630: Donor- fetal liver spleen; R83475: Donor- fetal liver spleen; H50560: Donor- fetal liver spleen; H89822: Donor- fetal liver spleen; H72554: Donor- fetal liver spleen; R83475: Sequence has a frameshift. Donor- fetal liver spleen. |
G250W |
Affects protein function |
Submitter: Sunyaev, et al. 2000. Inferred that the SNP came from Hs520198; only a single base change observed. Donor- 20-week postconception fetus. |
T251I |
Affects protein function |
Submitter: Sunyaev, et al. 2000. Inferred that the SNP came from AA694109; only a single base change observed. Donor- 20-week postconception fetus. |
M274I |
Affects protein function |
Submitter: Sunyaev, et al. 2000. Inferred from H46275, which also has a T240P substitution. |
Myosin light polypeptide 6 (XP_012180) highly expressed |
T103P rs1050470, E126K rs1959688 |
Affects protein function, Affects protein function |
Submitter: Irizarry, et al. AI718825: This sequence also had T103P, E126K, and E124K. AI460303: This sequence also had D97Y, T103P, T115P, E122K, E126K, E124K, and an early stop at codon 133. AI718703: This sequence also had T103P, T115P, E122K, E126K, E124K, and an early stop at codon 141. |
E126K |
Affects protein function |
AI640410: This sequence also had T115P, E122K, E126K, E124K, D134A, S135T, A143K, F144L, H147M, I148V, S150N. |
E122G rs1802559 |
Affects protein function |
Submitter: Sunyaev, et al. 2000. Source N/A |
Y89H rs1804000 |
Affects protein function |
Submitter: Sunyaev, et al. 2000. Inferred from AA536129: This sequence also had a Q41S substitution. |
H11L rs1804001 |
Tolerated |
Submitter: Sunyaev, et al. 2000. Inferred from AA828057: This sequence also had a Q41S substitution. |
Ribosomal protein L11 (XP_001555) highly expressed |
||
L11P rs1804302 |
Affects protein function |
Submitter: Sunyaev, et al. 2000. Inferred from AA827959: From an invasive ovarian tumor; 1 other nonsynonymous change. |
L29S rs1804295 |
Affects protein function |
Submitter: Sunyaev, et al. 2000. AA564495: From a normal prostate epithelial cell. In high-quality sequence. Male, 45 years old. The only base change. |
H60P rs1059635 |
Tolerated |
Submitter: Irizarry, et al. 2000. AI491898: No other base changes. Donor- differentiated endometrial adenocarcinoma, 3 pooled tumors. Could be a somatic mutation. AI439058: Donor- B-cell, chronic lymphatic leukemia. Sequence has an early stop at 64 and also R43K, Y44N, and T45P substitutions. |
A68V rs1804301 |
Affects protein function |
Submitter: Sunyaev, et al. 2000. AA809603: 97% identical; 2 indels + 7 changes (3’ EST) result in 4 nonsynonymous changes. “Trace considered overall poor quality.” Donor- normal prostatic epithelial cells. |
F88I rs1804297 |
Tolerated |
Submitter: Sunyaev, et al. 2000. Source N/A |
E100G rs1804298 |
Affects protein function |
Submitter: Sunyaev, et al. 2000. Inferred from AA593143: 98% identical; 4 base changes + 1 early deletion resulting in a frameshift. Donor- prostatic intraepithelial neoplasia. |
Y108C rs1804296 |
Affects protein function |
Submitter: Sunyaev, et al. 2000. Source N/A |
I130T rs1804303 |
Tolerated |
Submitter: Sunyaev, et al. 2000. Source N/A |
Fau (XP_006522) highly expressed |
||
T53I |
Affects protein function |
NCBI panel of 90 individuals; frequency 0.006 |
V93M |
Affects protein function |
NCBI panel of 90 individuals; frequency 0.006 |
K102T rs1065065 |
Affects protein function |
Submitter: Irizarry, et al. 2000. AI053858: Cannot find the K102T substitution in this sequence. Donor- papillary serous ovarian carcinoma. |
R105G |
Affects protein function |
Submitter: Irizarry, et al. 2000. AA829613: 3 base changes total, 2 resulting in nonsynonymous changes. Donor- adult neuroendocrine lung carcinoid. |
TRIP1 Proteasome 26S subunit (Ref. Acc#: XP_008254) highly expressed |
||
D266S rs1050708, I300M rs1050740 |
Affects protein function, Affects protein function |
Submitter: Irizarry, et al. 2000. L38810: 7 base changes, 3 resulting in aa changes and 4 synonymous. Published in Mol Endocrin 9 (20) 243-254. NM_002805 derived from L38810. |
M138K rs1802130 |
Affects protein function |
Submitter: Sunyaev, et al. 2000. Source N/A |
D249V rs1302131 |
Affects protein function |
Submitter: Sunyaev, et al. 2000. AA677667: D249V is not in a high-quality region. |
Tight junction protein 2 (Ref. Acc#: XP_005446) highly expressed |
||
K822N rs1049624, N829D rs1049625, K834N rs1049626, Q842H rs1049627 |
Affects protein function, Tolerated, Affects protein function, Affects protein function |
Submitter: Irizarry, et al. 2000. L27476: All variants occur together, plus 7 other nonsynonymous aa changes. Cloned from a brain cDNA library. Human Molecular Genetics, 1994, Vol. 3, No. 6, 909-914. NM_004817: derived from L27476. |
Ephrin-A2 (Ref. Acc#: XP_002088) |
||
I40N rs1058370, I42F rs1058371, K45N rs1058372 |
Affects protein function, Affects protein function, Affects protein function |
Submitter: Irizarry, et al. 2000. M59371: 3 polymorphisms occur together on 2 sequences. Missing the last 15 residues of the transcript and extends much further (different isoform). Cloned from HeLa and keratinocyte. Ref: Molecular and Cellular Biology, Dec. 1990, Vol. 10, No. 12, p. 6316-6324. NM_0044313: derived from M59371. |
Xanthine dehydrogenase (Ref. Acc#: XP_002472) |
||
N525K rs669884 |
Affects protein function |
Submitter: SeaHashSNP. AL121657 and AC010743, variant sources that SeaHashSNP linked to, do not show this substitution. |
T526K rs566362 |
Affects protein function |
Submitter: SeaHashSNP. Neither of these occurs in the referenced GenBank accessions, AL121657 and AC010743. However, the translated DNA provided by ss737586 has the T526K substitution plus a Q534R substitution; it does not align throughout the entire DNA (probably poor quality). |
P766R rs1042036 |
Affects protein function |
Submitter: Irizarry, et al. 2000. U06117 and U394878: do not have this substitution. |