Leveraging Data and People to Accelerate Data Science
Author: | Laura M. Haas |
---|---|
Year of publication: | 2017 |
Subject: |
060201 languages & linguistics; Process (engineering); Computer science; business.industry; Context (language use); 06 humanities and the arts; 02 engineering and technology; Data science; Domain (software engineering); Information engineering; Analytics; 0602 languages and literature; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Prescriptive analytics; business |
Source: | ICDE |
DOI: | 10.1109/icde.2017.9 |
Description: | Doing data science - extracting insight by analyzing data - is not easy. Data science is used to answer interesting questions that typically involve multiple diverse data sources, many different types of analysis, and often, large and messy data volumes. To answer one of these questions, several types of expertise may be needed to understand the context and domain being served, to import and transform individual data sets, to implement effective machine learning and/or statistical methods, to design and program applications and interfaces to extract and share data and insights, and to manage the data and systems used for analysis and storage. In the IBM Research Accelerated Discovery Lab, we are studying how data scientists work, and using what we learn to help them gain insights faster. In this talk, we will look at what we have learned to date, through user studies and experience with tens of analytics projects, and the environment that we’ve built as a result. In particular, I will describe how we capture information to enable contextual search, provenance queries, and other functionality to afford teams faster progress in data-intensive investigations. I will also touch on our efforts to leverage data and people to explain what happens during an investigation, with an ultimate goal of moving from descriptive to prescriptive analytics in order to accelerate data science and the analytic process. I will illustrate these various efforts using an ambitious current project on applying metagenomics to food safety, and will conclude with a discussion of where more work is needed and our future directions. |
Database: | OpenAIRE |
External link: |