Abstrakt: |
An exploratory machine learning (ML) classification model for examining CaCO3 polymorph selection is presented. The ML model distinguishes whether a given peptide sequence binds to calcite or aragonite, two polymorphs of CaCO3. The classifier, built with a support vector machine (SVM) using amino acid chemical composition as the input descriptors, yielded satisfactory performance in the classification task, as characterized by AUC = 0.736 and F1 = 0.800 on the test set. Model optimization revealed that tiny, aliphatic, aromatic, acidic, and basic residues are essential descriptors for discriminating aragonite-binding biomineralization peptides from calcite-binding ones. The presented model offers valuable insights into the chemical attributes of biomineralization peptides that govern polymorph binding preference. This can deepen our understanding of the biomineralization phenomenon and may be applied in the future to the creation of biomimetic materials. Graphical Abstract: This study explores the use of machine learning (ML) to examine the binding preference of biomineralization peptides toward calcium carbonate polymorphs. The ML model classifies whether a given peptide sequence binds to calcite or aragonite, using descriptors based on amino acid composition as the features. The model exhibits reliable performance, as characterized by AUC = 0.736 and F1 = 0.800 on the test set. Model optimization revealed that tiny, aliphatic, aromatic, acidic, and basic residues are essential descriptors for discriminating aragonite-binding biomineralization peptides from calcite-binding ones. The presented model offers valuable insights into the chemical attributes of biomineralization peptides involved in polymorph selection. |
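To make the descriptor idea concrete, the following is a minimal sketch of how composition-based features and the F1 metric could be computed. It is not the authors' implementation. The residue groupings in `RESIDUE_CLASSES` follow a commonly used convention and are an assumption, since the abstract does not list them. The function names and the example peptide `DDGSK` are hypothetical.

```python
from collections import Counter

# Assumed residue classes (a common convention; not specified in the abstract).
RESIDUE_CLASSES = {
    "tiny": set("ACGST"),
    "aliphatic": set("AILV"),
    "aromatic": set("FHWY"),
    "acidic": set("DE"),
    "basic": set("HKR"),
}


def composition_descriptors(sequence):
    """Return the fraction of residues in each class for a one-letter peptide sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    length = len(seq)
    return {
        name: sum(counts[residue] for residue in members) / length
        for name, members in RESIDUE_CLASSES.items()
    }


def f1_score(y_true, y_pred, positive=1):
    """F1 = 2*TP / (2*TP + FP + FN) for the chosen positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


if __name__ == "__main__":
    # Hypothetical peptide, used only to illustrate the feature vector.
    print(composition_descriptors("DDGSK"))
    # Hypothetical labels (1 = aragonite, 0 = calcite), not data from the study.
    print(f1_score([1, 1, 1, 1, 1, 0, 0, 0], [1, 1, 1, 1, 0, 1, 0, 0]))
```

Feature vectors of this kind would then be passed to an SVM classifier. Note that a residue can belong to more than one class (for example, alanine is both tiny and aliphatic), so the fractions need not sum to one.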