Showing 1 - 10 of 5,069 for search: '"CHENG Guang"'
In this paper, we propose a novel statistical framework for watermarking generative categorical data. Our method systematically embeds pre-agreed secret signals by splitting the data distribution into two components and modifying one distribution bas
External link:
http://arxiv.org/abs/2411.10898
Data collaboration via Data Clean Room offers value but raises privacy concerns, which can be addressed through synthetic data and multi-table synthesizers. Common multi-table synthesizers fail to perform when subjects occur repeatedly in both tables
External link:
http://arxiv.org/abs/2411.00879
In the current era of big data and machine learning, it's essential to find ways to shrink the size of training dataset while preserving the training performance to improve efficiency. However, the challenge behind it includes providing practical way
External link:
http://arxiv.org/abs/2410.09311
Authors:
Suh, Namjoon; Yang, Yuning; Hsieh, Din-Yin; Luan, Qitong; Xu, Shirong; Zhu, Shixiang; Cheng, Guang
In this paper, we leverage the power of latent diffusion models to generate synthetic time series tabular data. Along with the temporal and feature correlations, the heterogeneous nature of the feature in the table has been one of the main obstacles
External link:
http://arxiv.org/abs/2406.16028
The evaluation of synthetic data generation is crucial, especially in the retail sector where data accuracy is paramount. This paper introduces a comprehensive framework for assessing synthetic retail data, focusing on fidelity, utility, and privacy.
External link:
http://arxiv.org/abs/2406.13130
The promise of tabular generative models is to produce realistic synthetic data that can be shared and safely used without dangerous leakage of information from the training set. In evaluating these models, a variety of methods have been proposed to
External link:
http://arxiv.org/abs/2406.13012
Generative Foundation Models (GFMs) have produced synthetic data with remarkable quality in modalities such as images and text. However, applying GFMs to tabular data poses significant challenges due to the inherent heterogeneity of table features. E
External link:
http://arxiv.org/abs/2406.04619
Recommender systems play a crucial role in internet economies by connecting users with relevant products or services. However, designing effective recommender systems faces two key challenges: (1) the exploration-exploitation tradeoff in balancing ne
External link:
http://arxiv.org/abs/2406.04374
Diffusion models, a specific type of generative model, have achieved unprecedented performance in recent years and consistently produce high-quality synthetic samples. A critical prerequisite for their notable success lies in the presence of a substa
External link:
http://arxiv.org/abs/2405.16876
Authors:
Yu, Peiyu; Zhang, Dinghuai; He, Hengzhi; Ma, Xiaojian; Miao, Ruiyao; Lu, Yifan; Zhang, Yasi; Kong, Deqian; Gao, Ruiqi; Xie, Jianwen; Cheng, Guang; Wu, Ying Nian
Offline Black-Box Optimization (BBO) aims at optimizing a black-box function using the knowledge from a pre-collected offline dataset of function values and corresponding input designs. However, the high-dimensional and highly-multimodal input design
External link:
http://arxiv.org/abs/2405.16730