Showing 1 - 10 of 21 for search: '"Müller, Nicolas M."'
In this paper, we demonstrate that attacks in the latest ASVspoof5 dataset -- a de facto standard in the field of voice authenticity and deepfake detection -- can be identified with surprising accuracy using a small subset of very simplistic features
External link:
http://arxiv.org/abs/2408.15775
Published in:
Interspeech 2024
Recent research has highlighted a key issue in speech deepfake detection: models trained on one set of deepfakes perform poorly on others. The question arises: is this due to the continuously improving quality of Text-to-Speech (TTS) models, i.e., ar
External link:
http://arxiv.org/abs/2406.03512
For classification, the problem of class imbalance is well known and has been extensively studied. In this paper, we argue that imbalance in regression is an equally important problem which has so far been overlooked: Due to under- and over-represent
External link:
http://arxiv.org/abs/2402.11963
Author:
Müller, Nicolas M., Kawa, Piotr, Hu, Shen, Neu, Matthias, Williams, Jennifer, Sperl, Philip, Böttinger, Konstantin
Voice faking, driven primarily by recent advances in text-to-speech (TTS) synthesis technology, poses significant societal challenges. Currently, the prevailing assumption is that unaltered human speech can be considered genuine, while fake speech co
External link:
http://arxiv.org/abs/2402.06304
Author:
Müller, Nicolas M., Kawa, Piotr, Choong, Wei Herng, Casanova, Edresson, Gölge, Eren, Müller, Thorsten, Syga, Piotr, Sperl, Philip, Böttinger, Konstantin
Text-to-Speech (TTS) technology offers notable benefits, such as providing a voice for individuals with speech impairments, but it also facilitates the creation of audio deepfakes and spoofing attacks. AI-based detection methods can help mitigate the
External link:
http://arxiv.org/abs/2401.09512
Author:
Müller, Nicolas M., Burgert, Maximilian, Debus, Pascal, Williams, Jennifer, Sperl, Philip, Böttinger, Konstantin
Machine-learning (ML) shortcuts or spurious correlations are artifacts in datasets that lead to very good training and test performance but severely limit the model's generalization capability. Such shortcuts are insidious because they go unnoticed d
External link:
http://arxiv.org/abs/2310.19381
Current anti-spoofing and audio deepfake detection systems use either magnitude spectrogram-based features (such as CQT or Melspectrograms) or raw audio processed through convolution or sinc-layers. Both methods have drawbacks: magnitude spectrograms
External link:
http://arxiv.org/abs/2308.11800
For real-world applications of machine learning (ML), it is essential that models make predictions based on well-generalizing features rather than spurious correlations in the data. The identification of such spurious correlations, also known as shor
External link:
http://arxiv.org/abs/2302.04246
Machine learning is a data-driven field, and the quality of the underlying datasets plays a crucial role in learning success. However, high performance on held-out test data does not necessarily indicate that a model generalizes or learns anything me
External link:
http://arxiv.org/abs/2211.15510
Author:
Müller, Nicolas M., Czempin, Pavel, Dieckmann, Franziska, Froghyar, Adam, Böttinger, Konstantin
Current text-to-speech algorithms produce realistic fakes of human voices, making deepfake detection a much-needed area of research. While researchers have presented various techniques for detecting audio spoofs, it is often unclear exactly why these
External link:
http://arxiv.org/abs/2203.16263