Authors: |
Pitkämäki, Tinja; Pahikkala, Tapio; Perez, Ileana Montoya; Movahedi, Parisa; Nieminen, Valtteri; Southerington, Tom; Vaiste, Juho; Jafaritadi, Mojtaba; Khan, Muhammad Irfan; Kontio, Elina; Ranttila, Pertti; Pajula, Juha; Pölönen, Harri; Degerli, Aysen; Plomp, Johan; Airola, Antti |
Subject: |
|
Source: |
Applied Computing & Intelligence; 2024, Vol. 4 Issue 2, p1-26, 26p |
Abstract: |
The use of synthetic data could facilitate data-driven innovation across industries and applications. Synthetic data can be generated using a range of methods, from statistical modeling to machine learning and generative AI, resulting in datasets of different formats and utility. In the health sector, the use of synthetic data is often motivated by privacy concerns. As generative AI is becoming an everyday tool, there is a need for practice-oriented insights into the prospects and limitations of synthetic data, especially in privacy-sensitive domains. We present an interdisciplinary outlook on the topic, focusing on, but not limited to, the Finnish regulatory context. First, we emphasize the need for working definitions to avoid misplaced assumptions. Second, we consider use cases for synthetic data, viewing it as a helpful tool for experimentation, decision-making, and building data literacy. Yet the complementary uses of synthetic datasets should not diminish the continued efforts to collect and share high-quality real-world data. Third, we discuss how privacy-preserving synthetic datasets fit into the existing data protection frameworks. Neither the process of synthetic data generation nor synthetic datasets are automatically exempt from the regulatory obligations concerning personal data. Finally, we explore future research directions for generating synthetic data and conclude by discussing potential future developments at the societal level. [ABSTRACT FROM AUTHOR] |
Database: |
Complementary Index |