TY - JOUR
T1 - Computing the Pathogenicity of Wilson’s Disease ATP7B Mutations
T2 - Implications for Disease Prevalence
AU - Tang, Ning
AU - Sandahl, Thomas D.
AU - Ott, Peter
AU - Kepp, Kasper P.
PY - 2019
Y1 - 2019/12
AB - Genetic variations in the gene encoding the copper-transport protein ATP7B are the primary cause of Wilson's disease. Controversially, clinical prevalence seems much smaller than the prevalence estimated by genetic screening tools, causing fear that many people are undiagnosed, although early diagnosis and treatment is essential. To address this issue, we benchmarked 16 state-of-the-art computational disease-prediction methods against established data of missense ATP7B mutations. Our results show that the quality of the methods varies widely. We show the importance of optimizing the threshold of the methods used to distinguish pathogenic from nonpathogenic mutations against data of clinically confirmed pathogenic and nonpathogenic mutations. We find that most methods use thresholds that predict too many ATP7B mutations to be pathogenic. Thus, our findings explain the current controversy on Wilson's disease prevalence because meta-analysis and text search methods include many computational estimates that lead to higher disease prevalence than clinically observed. As proteins and diseases differ widely, a one-size-fits-all threshold cannot distinguish pathogenic and nonpathogenic mutations efficiently, as shown here. We also show that amino acid changes with small evolutionary substitution probability, mainly due to amino acid volume, are more associated with the disease, implying a pathological effect on the conformational state of the protein, which could affect copper transport or adenosine triphosphate recognition and hydrolysis. These findings may be a first step toward a more quantitative genotype-phenotype relationship of Wilson's disease.
UR - http://www.scopus.com/inward/record.url?scp=85076890301&partnerID=8YFLogxK
U2 - 10.1021/acs.jcim.9b00852
DO - 10.1021/acs.jcim.9b00852
M3 - Journal article
C2 - 31751128
AN - SCOPUS:85076890301
SN - 1549-9596
VL - 59
SP - 5230
EP - 5243
JO - Journal of Chemical Information and Modeling
JF - Journal of Chemical Information and Modeling
IS - 12
ER -