Similar results are obtained when sequences more than 90%, 95%, or 99% identical to the query are removed from the alignment. These near-identical sequences are removed to exclude pseudogenes and proteins that already contain the substitution of interest. Predictions were made for the 5780 nonsynonymous SNPs (nsSNPs) in 3005 proteins from dbSNP (build 95).
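The identity filter described above can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used: the function names `percent_identity` and `exclude_near_identical` are hypothetical, the sequences are assumed to be pre-aligned rows of equal length with `-` as the gap character, and identity is assumed to be computed over columns where both sequences have a residue. The source does not specify how percent identity was calculated.

```python
from typing import List


def percent_identity(query: str, hit: str) -> float:
    """Percent identity over aligned columns where neither row has a gap.

    Assumption: both strings are rows of the same alignment, with '-' as gap.
    """
    if len(query) != len(hit):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for q, h in zip(query, hit):
        if q == "-" or h == "-":
            continue  # skip gapped columns
        compared += 1
        if q == h:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def exclude_near_identical(query: str, hits: List[str], n: float) -> List[str]:
    """Drop hits that are greater than n% identical to the query."""
    return [h for h in hits if percent_identity(query, h) <= n]


if __name__ == "__main__":
    query = "ACDEFGHIKL"
    hits = ["ACDEFGHIKL", "ACDEFGHIKV", "ACDEWWHIKV"]
    for cutoff in (90, 95, 99):
        kept = exclude_near_identical(query, hits, cutoff)
        print(f"n={cutoff}: kept {len(kept)} of {len(hits)} sequences")
```

In this toy input the first hit is identical to the query and is removed at every cutoff, which mirrors the stated goal of excluding sequences that would already carry the substitution of interest.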

 

Exclude sequences greater than n% identical to the query   Sequence coverage   Amino acid substitution coverage   % Predicted as damaging
n=90                                                       60% (1789/3005)     53% (3084/5780)                    25% (757/3084)
n=95                                                       60% (1793/3005)     53% (3088/5780)                    24% (743/3088)
n=99                                                       60% (1794/3005)     53% (3091/5780)                    24% (736/3091)
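
As a quick arithmetic check, the percentages reported for each cutoff can be recomputed from the raw counts. The sketch below uses only the counts given here; the variable names and the choice to round to the nearest whole percent are assumptions made for illustration.

```python
TOTAL_PROTEINS = 3005       # proteins in the data set
TOTAL_SUBSTITUTIONS = 5780  # nsSNPs in the data set

# cutoff n -> (proteins covered, substitutions covered, predicted damaging)
COUNTS = {
    90: (1789, 3084, 757),
    95: (1793, 3088, 743),
    99: (1794, 3091, 736),
}


def pct(numerator: int, denominator: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


def summarize(cutoff: int) -> tuple:
    """Return (sequence coverage, substitution coverage, % damaging)."""
    proteins, substitutions, damaging = COUNTS[cutoff]
    return (
        pct(proteins, TOTAL_PROTEINS),
        pct(substitutions, TOTAL_SUBSTITUTIONS),
        pct(damaging, substitutions),  # denominator is the covered substitutions
    )


if __name__ == "__main__":
    for cutoff in sorted(COUNTS):
        seq_cov, sub_cov, dmg = summarize(cutoff)
        print(f"n={cutoff}: {seq_cov}% sequences, {sub_cov}% substitutions, {dmg}% damaging")
```

Note that the damaging fraction is taken over the substitutions that received a prediction at that cutoff, not over all 5780 nsSNPs.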