Showing 1 - 10 of 20 for search: '"Zhao, Dora"'
Machine learning (ML) datasets, often perceived as neutral, inherently encapsulate abstract and disputed social constructs. Dataset curators frequently employ value-laden terms such as diversity, bias, and quality to characterize datasets. Despite th…
External link:
http://arxiv.org/abs/2407.08188
Authors:
Hirota, Yusuke, Andrews, Jerone T. A., Zhao, Dora, Papakyriakopoulos, Orestis, Modas, Apostolos, Nakashima, Yuta, Xiang, Alice
We tackle societal bias in image-text datasets by removing spurious correlations between protected groups and image attributes. Traditional methods only target labeled attributes, ignoring biases from unlabeled ones. Using text-guided inpainting mode…
External link:
http://arxiv.org/abs/2407.03623
Authors:
Zhao, Dora, Scheuerman, Morgan Klaus, Chitre, Pooja, Andrews, Jerone T. A., Panagiotidou, Georgia, Walker, Shawn, Pine, Kathleen H., Xiang, Alice
Despite extensive efforts to create fairer machine learning (ML) datasets, there remains a limited understanding of the practical aspects of dataset curation. Drawing from interviews with 30 ML dataset curators, we present a comprehensive taxonomy of…
External link:
http://arxiv.org/abs/2406.06407
Authors:
Papakyriakopoulos, Orestis, Choi, Anna Seo Gyeong, Andrews, Jerone, Bourke, Rebecca, Thong, William, Zhao, Dora, Xiang, Alice, Koenecke, Allison
Speech datasets are crucial for training Speech Language Technologies (SLT); however, the lack of diversity of the underlying training data can lead to serious limitations in building equitable and robust SLT products, especially along dimensions of…
External link:
http://arxiv.org/abs/2305.04672
Authors:
Andrews, Jerone T. A., Zhao, Dora, Thong, William, Modas, Apostolos, Papakyriakopoulos, Orestis, Xiang, Alice
Human-centric computer vision (HCCV) data curation practices often neglect privacy and bias concerns, leading to dataset retractions and unfair models. HCCV datasets constructed through nonconsensual web scraping lack crucial metadata for comprehensi…
External link:
http://arxiv.org/abs/2302.03629
Authors:
Ramaswamy, Vikram V., Lin, Sing Yu, Zhao, Dora, Adcock, Aaron B., van der Maaten, Laurens, Ghadiyaram, Deepti, Russakovsky, Olga
Current dataset collection methods typically scrape large amounts of data from the web. While this technique is extremely scalable, data collected in this way tends to reinforce stereotypical biases, can contain personally identifiable information, a…
External link:
http://arxiv.org/abs/2301.02560
As computer vision systems become more widely deployed, there is increasing concern from both the research community and the public that these systems are not only reproducing but amplifying harmful social biases. The phenomenon of bias amplification…
External link:
http://arxiv.org/abs/2210.11924
As teenage use of social media platforms continues to proliferate, so do concerns about teenage privacy and safety online. Prior work has established that privacy on networked publics, such as social media, is complex, requiring users to navigate not…
External link:
http://arxiv.org/abs/2208.02796
Authors:
Meister, Nicole, Zhao, Dora, Wang, Angelina, Ramaswamy, Vikram V., Fong, Ruth, Russakovsky, Olga
Gender biases are known to exist within large-scale visual datasets and can be reflected or even amplified in downstream models. Many prior works have proposed methods for mitigating gender biases, often by attempting to remove gender expression info…
External link:
http://arxiv.org/abs/2206.09191
Image captioning is an important task for benchmarking visual reasoning and for enabling accessibility for people with vision impairments. However, as in many machine learning settings, social biases can influence image captioning in undesirable ways…
External link:
http://arxiv.org/abs/2106.08503