Description: |
Technology drives advances in science. Giving scientists access to more powerful tools for collecting and understanding data enables them to both ask and answer new kinds of questions that were previously beyond their reach. Among these new tools, machine learning offers the opportunity to understand and analyze data at unprecedented scales and levels of detail. The standard machine learning pipeline consists of data labeling, feature extraction, training, and evaluation. However, without expert machine learning knowledge, it is difficult for scientists to construct this pipeline optimally and fully leverage machine learning in their work. Using ecology as a motivating example, we analyze a typical scientist's data collection and processing workflow and highlight many problems facing practitioners when attempting to capitalize on advances in machine learning and pattern recognition. Understanding these shortcomings allows us to outline several novel and underexplored research directions. We end with recommendations to motivate progress in future cross-disciplinary work. |