Supplementary Table 1.  Proteins for which many of the dbSNP variants were likely to be mistakenly inferred.  The first column is the amino acid substitution, the second column is the SIFT prediction, and the third column contains information about the sequence from which the nsSNP was inferred.  Variants that may exist in the human population are in bold; these SNPs were the only base change in the sequence source and were detected in normal individuals (as opposed to cancer cells, which may have somatic mutations).
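The criterion for marking a variant as possibly real can be written as a small check. This is only an illustrative sketch: the `SourceSequence` record and its field names are hypothetical, and the two example entries simply restate annotations given in the table for ribosomal protein L11.

```python
from dataclasses import dataclass


@dataclass
class SourceSequence:
    """Hypothetical record for the sequence a variant was inferred from."""
    accession: str
    other_base_changes: int  # base changes in the sequence besides the SNP itself
    donor_is_normal: bool    # False for tumor-derived or otherwise abnormal samples


def may_exist_in_population(src: SourceSequence) -> bool:
    """Apply the table's rule: the SNP is the only base change in the
    source sequence AND the source comes from a normal individual."""
    return src.other_base_changes == 0 and src.donor_is_normal


# AA564495: normal prostate epithelial cell, the only base change.
l29s_source = SourceSequence("AA564495", other_base_changes=0, donor_is_normal=True)
# AI491898: no other base changes, but from pooled endometrial tumors.
h60p_source = SourceSequence("AI491898", other_base_changes=0, donor_is_normal=False)

for src in (l29s_source, h60p_source):
    print(src.accession, may_exist_in_population(src))
```

Both conditions are required: a clean sequence from a tumor sample is still excluded, because the change could be a somatic mutation.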

For variants submitted by Irizarry, et al. (1), the links from dbSNP to the sequence source were followed and sequences with quality >= 30 were retrieved.  For variants submitted by Sunyaev, et al., evidence of the source of the variant was not provided in either HGBASE or dbSNP.  Since these variants were inferred from EST databases (2), the reference gene was searched against the human EST database at EMBL, and the highest-scoring hit that contained the SNP was inferred to be the source of the variant.
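The source-inference rule used for the Sunyaev, et al. variants can be sketched as follows. This is a minimal illustration under stated assumptions, not the code actually used: the `EstHit` record, its fields, and the placeholder accessions and scores are hypothetical. Only the rule itself (take the highest-scoring EST hit that contains the substitution) comes from the description above.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class EstHit:
    """Hypothetical summary of one EST database hit for the reference gene."""
    accession: str
    score: float  # alignment score reported by the search
    substitutions: Set[str] = field(default_factory=set)  # e.g. {"G250W"}


def infer_source(hits: List[EstHit], variant: str) -> Optional[EstHit]:
    """Return the highest-scoring hit that carries `variant`,
    or None when no hit contains it (source not available)."""
    carrying = [h for h in hits if variant in h.substitutions]
    return max(carrying, key=lambda h: h.score, default=None)


# Placeholder hits; accessions and scores are made up for illustration.
hits = [
    EstHit("EST_A", 950.0, {"T251I"}),
    EstHit("EST_B", 900.0, {"G250W"}),
    EstHit("EST_C", 700.0, {"G250W", "T240P"}),
]

source = infer_source(hits, "G250W")
print(source.accession if source else "N/A")
```

A variant with no carrying hit returns `None`, which corresponds to the "Source N/A" entries in the table.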

 

1.  Irizarry, K., Kustanovich, V., Li, C., Brown, N., Nelson, S., Wong, W., Lee, C.J.  (2000) Genome-wide analysis of single-nucleotide polymorphisms in human expressed sequences.  Nature Genetics 26:233-236.

 

2.  Sunyaev, S., Hanke, J., Brett, D., Aydin, A., Zastrow, I., Lathe, W., Bork, P., Reich, J.  (2000) Individual variation in protein-coding sequences of human genome.  Advances in Protein Chemistry 54:409-437.

Proteasome subunit beta 7 (XP_005463), highly expressed

     V39A

 

Tolerated

Submitter: Irizarry, et al.

R72984: Donor-placenta

T97170: Donor-fetal liver spleen

AA461585: Donor- male testis

R48896: Donor- breast

AA186917: Donor- endothelial cell

AA082245: Donor-endothelial cell

H50560: Donor- fetal liver spleen

H48417: Donor- fetal liver

AA002190: Donor- fetal liver

R93669: Donor- fetal liver spleen

H58983: Donor- fetal liver spleen

H67630: Donor- fetal liver spleen

R83475: Donor- fetal liver spleen

H50560: Donor- fetal liver spleen

H89822: Donor- fetal liver spleen

H72554: Donor- fetal liver spleen

R83475: Sequence has a frameshift.  Donor-fetal liver spleen.

     G250W

 

Affects protein function

Submitter: Sunyaev, et al. 2000

Inferred that the SNP came from Hs520198; only a single base change observed.

Donor- 20-week postconception fetus.

     T251I

 

Affects protein function

Submitter: Sunyaev, et al. 2000

Inferred that the SNP came from AA694109; only a single base change observed.

Donor- 20-week postconception fetus.

     M274I

Affects protein function

 

Submitter: Sunyaev, et al. 2000

Inferred from H46275, which also has the T240P substitution.

 

Myosin light polypeptide 6 (XP_012180), highly expressed

T103P, rs1050470

E126K

rs1959688

 

Affects protein function

 

Affects protein function

Submitter: Irizarry, et al.

AI718825: This sequence also had T103P, E126K, and E124K.

AI460303: This sequence also had D97Y, T103P, T115P, E122K, E126K, E124K  and early stop at codon 133.

AI718703: This sequence also had T103P, T115P, E122K, E126K, E124K, and early stop at 141.

E126K

Affects protein function

AI640410: This sequence also had  T115P, E122K, E126K, E124K, D134A, S135T, A143K, F144L, H147M, I148V, S150N.

E122G

rs1802559

Affects protein function

Submitter: Sunyaev, et al. 2000 

Source N/A

Y89H

rs1804000

Affects protein function

Submitter: Sunyaev, et al. 2000

Inferred from AA536129: This sequence also had Q41S substitution.

H11L

rs1804001

Tolerated

 

Submitter: Sunyaev, et al. 2000

Inferred from AA828057: This sequence also had Q41S substitution.

 

 

Ribosomal protein L11 (XP_001555) highly expressed

L11P

rs1804302

Affects protein function

Submitter: Sunyaev, et al. 2000

Inferred from AA827959: From invasive ovarian tumor

 + 1 other nonsynonymous change

L29S

rs1804295

Affects protein function

Submitter: Sunyaev, et al. 2000

AA564495: From a normal prostate epithelial cell.  In high-quality sequence.  Male, 45 years old.  The only base change.

H60P

rs1059635

Tolerated

Submitter: Irizarry, et al. 2000

AI491898: No other base changes.  Donor- differentiated endometrial adenocarcinoma, 3 pooled tumors.  Could be a somatic mutation.

AI439058: Donor - B-cell, chronic lymphatic leukemia.

Sequence has early stop at 64, also R43K, Y44N, T45P substitutions.

A68V

rs1804301

Affects protein function

Submitter: Sunyaev, et al. 2000

AA809603: 97% identical; 2 indels + 7 changes (3’ EST) result in 4 nonsynonymous changes.

“Trace considered overall poor quality” Donor- normal prostatic epithelial cells

F88I

rs1804297

Tolerated

Submitter: Sunyaev, et al. 2000

N/A

E100G

rs1804298

Affects protein function

Submitter: Sunyaev, et al. 2000

Inferred from AA593143, 98% identical

4 base changes + 1 early deletion resulting in frameshift.

Donor- prostatic intraepithelial neoplasia

Y108C

rs1804296

Affects protein function

Submitter: Sunyaev, et al. 2000

N/A

I130T

rs1804303

Tolerated

Submitter: Sunyaev, et al. 2000

N/A

 

 

 

Fau (XP_006522) highly expressed

T53I

Affects protein function

NCBI panel of 90 individuals

0.006 frequency

V93M

Affects protein function

NCBI panel of 90 individuals

0.006 frequency

K102T

rs1065065

Affects protein function

Submitter: Irizarry, et al. 2000

AI053858: Could not find the K102T substitution in this sequence.  Donor- papillary serous ovarian carcinoma.

R105G

Affects protein function

Submitter: Irizarry, et al. 2000

AA829613: 3 base changes total, 2 resulting in nonsynonymous changes.  Donor- adult neuroendocrine lung carcinoid.

 

 

TRIP1 Proteasome 26S subunit (Ref Acc#: XP_008254) highly expressed

D266S rs1050708, I300M rs1050740

Affects protein function

Affects protein function

Submitter: Irizarry, et al. 2000

L38810: 7 base changes: 3 resulting in amino acid changes, 4 synonymous.

Published in Mol Endocrin 9 (20) 243-254

NM_002805 derived from L38810

M138K

rs1802130

Affects protein function

Submitter: Sunyaev, et al. 2000

Source N/A

D249V

rs1302131

Affects protein function

Submitter: Sunyaev, et al. 2000

AA677667: D249V is not in a high-quality region.

 

Tight junction protein 2  (Ref Acc#: XP_005446) highly expressed

K822N

rs1049624

N829D

rs1049625

K834N

rs1049626

Q842H

rs1049627

Affects protein function

 

Tolerated

 

Affects protein function

 

Affects protein function

Submitter: Irizarry, et al. 2000

L27476: All variants occur together + 7 other nonsynonymous aa changes.

Cloned from cDNA library from brain.

Human Molecular Genetics, 1994, Vol. 3, No. 6, 909-914.

NM_004817: derived from L27476

 

 

Ephrin-A2  (Ref. Acc#: XP_002088)

I40N

rs1058370

I42F

rs1058371

K45N

rs1058372

Affects protein function

 

Affects protein function

 

Affects protein function

Submitter: Irizarry, et al. 2000

M59371: 3 polymorphisms occur together on 2 sequences.  Missing the last 15 residues of the transcript and extends much further (different isoform).

Cloned from HeLa and keratinocyte

Ref: Molecular and Cellular Biology, Dec. 1990, Vol. 10, No. 12, p. 6316-6324

NM_0044313: derived from M59371

 

 

Xanthine dehydrogenase  (Ref Acc#: XP_002472)

N525K

rs669884

Affects protein function

Submitter: SeaHashSNP.

AL121657 and AC010743, variant sources that SeaHashSNP linked to, do not show this substitution.

T526K

rs566362

Affects protein function

Submitter: SeaHashSNP

Neither of these occurs in the referenced GenBank accessions AL121657 and AC010743.

However, the translated DNA provided by ss737586 has the T526K and Q534R substitutions; it does not align along the entire DNA (probably poor quality).

P766R

rs1042036

Affects protein function

Submitter: Irizarry, et al. 2000

U06117 and U394878: do not have this substitution.