Popis: |
Additional file 1: Table S1. Annotation of clinical variants, using terms from ClinVar VCFs, according to the map below. Table S2. Predictors contained in NDamage. Table S3. Discretization of the ExAC_AF variable: allele frequency of variants based on all samples from ExAC. Table S4. Discretization of the COMMON variable, based on the 1000 Genomes database. Table S5. Discretization of the NDamage variable: number of predictors that classify a variant as pathogenic. Table S6. Discretization of the Interpro_Domain variable: functional domain or site associated with the mutation. Table S7. Discretization of the Transition/transversion variable, based on nucleotide transitions or transversions. Table S8. Discretization of the Charged/uncharged variable. Table S9. Discretization of the Hydrophobic/hydrophilic variable. Table S10. Discretization of the Essential/non-essential variable. Table S11. Discretization of the Initial/not initial exon variable: mutations affecting the gene start have a greater impact. Table S12. Discretization of the PPI variable. Table S13. Distribution of neutral and pathogenic mutations for each variable of the proposed model (as percentages), in the Best Conjecture of Training, Validation, and Testing, as well as in the 1000-step Monte Carlo simulation dataset. Table S14. Accuracy of the predictors and the proposed model in the 10-fold cross-validation process, according to ClinVar version 2017-05-30. Mean and standard deviation were calculated from ten steps of 10-fold cross-validation. Table S15. Coefficient φ for the two-by-two variable combinations of the proposed model. Figure S1. Distribution of neutral and pathogenic mutations of the variables used in the proposed tree, according to ClinVar version 2017-05-30. In the tables’ column and row names, 0 indicates mutations classified by a variable as neutral, and 1 indicates those classified as pathogenic. 
In the tables’ cells, the number to the left of the slash indicates the percentage of neutral mutations for each classification combination of each variable pair, and the number to the right of the slash indicates the percentage of pathogenic mutations. |