Audio style transfer using shallow convolutional networks and random filters

Authors:	Huihuang Zhao, Gaobo Yang, Manimaran Ramasamy, Jiyou Chen
Year of publication:	2020
Subject:	Computer Networks and Communications; Computer science; Speech recognition; ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION; 020207 software engineering; 02 engineering and technology; Convolutional neural network; Image (mathematics); Tone (musical instrument); Continuous wavelet; Hardware and Architecture; Transfer (computing); 0202 electrical engineering, electronic engineering, information engineering; Media Technology; Spectrogram; Sound quality; Representation (mathematics); Software
Source:	Multimedia Tools and Applications. 79:15043-15057
ISSN:	1573-7721 1380-7501
DOI:	10.1007/s11042-020-08798-6
Description:	With the advent of the Convolutional Neural Network (CNN) era, neural style transfer on images has become a very active research topic: the style of one image can be transferred to another through a CNN so that the resulting image retains its own content while adopting the style of the other. In this work, we propose an algorithm for audio style transfer that uses the power of CNNs to generate new audio from a style audio. We use the Continuous Wavelet Transform (CWT) to convert the audio into a spectrogram, treat the spectrogram as an image representation of the audio, apply an image style transfer method to obtain a new image, and finally generate audio using iterative phase reconstruction with Griffin-Lim. We succeeded in transferring audio such as light music but had difficulty transferring audio that has lyrics and high-level metrics such as emotion or tone. We propose several measures to improve audio quality, and extensive experimental results show that our method is better than other methods in terms of sound quality.
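The description treats a spectrogram as an image and applies image style transfer, and the title names shallow convolutional networks with random filters. One way such a style term could be computed is sketched below in Python with NumPy, assuming a single convolutional layer with untrained Gaussian filters, frequency bins used as input channels, and a Gram-matrix style loss in the manner of image style transfer. The function names, filter count, filter width, and scaling are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def random_conv_features(spec, n_filters=64, width=11, seed=0):
    """One shallow 1-D convolutional layer with random (untrained) filters + ReLU.

    spec: magnitude spectrogram of shape (n_freq, n_frames). Frequency bins are
    treated as input channels and the filters slide along the time axis.
    Returns features of shape (n_filters, n_frames - width + 1).
    """
    rng = np.random.default_rng(seed)  # fixed seed: same filters for every input
    n_freq, _ = spec.shape
    filters = rng.standard_normal((n_filters, n_freq, width))
    filters /= np.sqrt(n_freq * width)  # assumed scaling to keep activations moderate
    # windows has shape (n_freq, n_out, width): every length-`width` slice in time
    windows = np.lib.stride_tricks.sliding_window_view(spec, width, axis=1)
    feats = np.einsum("kfw,ftw->kt", filters, windows)
    return np.maximum(feats, 0.0)  # ReLU


def gram_matrix(feats):
    """Time-averaged correlations between filter responses (the style statistic)."""
    return feats @ feats.T / feats.shape[1]


def style_loss(spec_generated, spec_style, **layer_kwargs):
    """Squared Frobenius distance between the Gram matrices of two spectrograms."""
    g_gen = gram_matrix(random_conv_features(spec_generated, **layer_kwargs))
    g_sty = gram_matrix(random_conv_features(spec_style, **layer_kwargs))
    return float(np.sum((g_gen - g_sty) ** 2))
```

In a full pipeline one would iteratively adjust a generated spectrogram to lower this loss against the style spectrogram before inverting it to audio; that optimization, the CWT step, and the Griffin-Lim reconstruction are outside this sketch.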
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::7af6cfcf93959ed5fb7d238b5c8f1afe https://doi.org/10.1007/s11042-020-08798-6