Machine Learning of Synthetic Biological Sequences for Designer Attribution

Authors: Jessica S. Dymond, Brant W. Chee, Corban G. Rivera, Joseph Downs, Claire Marie Filone, Craig Howser, Miller Wilt, Joshua T. Wolfe
Language: English
Publication year: 2019
Subject:
DOI: 10.1101/555631
Description: Advances in genome editing and gene synthesis technologies have increased the ease with which biological agents can be engineered. Existing methods to identify the engineering source are insufficient for attribution. We hypothesized that strategies used for DNA design and optimization could act as identifiable fingerprints of design software or particular vendors, making engineered agents more attributable to their source. To test this hypothesis, sequences optimized using various gene synthesis vendors were characterized using a machine learning model. By capturing optimization signatures unique to each vendor, the trained model was able to identify a sequence's origin with an accuracy of up to 92%, indicating that it is possible to distinguish the algorithm used to optimize a genetic sequence based on the DNA sequence output alone.
Database: OpenAIRE