Showing 1 - 10 of 29 for search: '"Boettinger, Konstantin"'
Published in:
Interspeech 2024
Recent research has highlighted a key issue in speech deepfake detection: models trained on one set of deepfakes perform poorly on others. The question arises: is this due to the continuously improving quality of Text-to-Speech (TTS) models, i.e., ar
External link:
http://arxiv.org/abs/2406.03512
Authors:
Müller, Nicolas M., Kawa, Piotr, Hu, Shen, Neu, Matthias, Williams, Jennifer, Sperl, Philip, Böttinger, Konstantin
Voice faking, driven primarily by recent advances in text-to-speech (TTS) synthesis technology, poses significant societal challenges. Currently, the prevailing assumption is that unaltered human speech can be considered genuine, while fake speech co
External link:
http://arxiv.org/abs/2402.06304
Authors:
Müller, Nicolas M., Kawa, Piotr, Choong, Wei Herng, Casanova, Edresson, Gölge, Eren, Müller, Thorsten, Syga, Piotr, Sperl, Philip, Böttinger, Konstantin
Text-to-Speech (TTS) technology offers notable benefits, such as providing a voice for individuals with speech impairments, but it also facilitates the creation of audio deepfakes and spoofing attacks. AI-based detection methods can help mitigate the
External link:
http://arxiv.org/abs/2401.09512
Neural networks build the foundation of several intelligent systems, which, however, are known to be easily fooled by adversarial examples. Recent advances made these attacks possible even in air-gapped scenarios, where the autonomous system observes
External link:
http://arxiv.org/abs/2311.08539
Authors:
Müller, Nicolas M., Burgert, Maximilian, Debus, Pascal, Williams, Jennifer, Sperl, Philip, Böttinger, Konstantin
Machine-learning (ML) shortcuts or spurious correlations are artifacts in datasets that lead to very good training and test performance but severely limit the model's generalization capability. Such shortcuts are insidious because they go unnoticed d
External link:
http://arxiv.org/abs/2310.19381
Current anti-spoofing and audio deepfake detection systems use either magnitude spectrogram-based features (such as CQT or Melspectrograms) or raw audio processed through convolution or sinc-layers. Both methods have drawbacks: magnitude spectrograms
External link:
http://arxiv.org/abs/2308.11800
For real-world applications of machine learning (ML), it is essential that models make predictions based on well-generalizing features rather than spurious correlations in the data. The identification of such spurious correlations, also known as shor
External link:
http://arxiv.org/abs/2302.04246
Published in:
Proc. 2nd Symposium on Security and Privacy in Speech Communication, 2022
Model inversion (MI) attacks allow reconstructing average per-class representations of a machine learning (ML) model's training data. It has been shown that in scenarios where each class corresponds to a different individual, such as face classifiers
External link:
http://arxiv.org/abs/2301.03206
Machine learning is a data-driven field, and the quality of the underlying datasets plays a crucial role in learning success. However, high performance on held-out test data does not necessarily indicate that a model generalizes or learns anything me
External link:
http://arxiv.org/abs/2211.15510
Neural networks follow a gradient-based learning scheme, adapting their mapping parameters by back-propagating the output loss. Samples unlike the ones seen during training cause a different gradient distribution. Based on this intuition, we design a
External link:
http://arxiv.org/abs/2206.10259