BassNet: A Variational Gated Autoencoder for Conditional Generation of Bass Guitar Tracks with Learned Interactive Control
Author: Emmanuel Deruty, Maarten Grachten, Stefan Lattner
Language: English
Year of publication: 2020
Subject: machine learning; deep learning; artificial intelligence; music generation; latent space models; autoencoder; user control; bass guitar; bass (sound); timbre; diatonic scale; rock music
Source: Applied Sciences, Vol. 10, Iss. 18, Art. no. 6627 (2020)
ISSN: 2076-3417
DOI: 10.3390/app10186627
Description: Deep learning has given AI-based methods for music creation a boost over the past years. An important challenge in this field is to balance user control and autonomy in music generation systems. In this work, we present BassNet, a deep learning model for generating bass guitar tracks based on musical source material. An innovative aspect of our work is that the model is trained to learn a temporally stable two-dimensional latent space variable that offers interactive user control. We empirically show that the model can disentangle bass patterns that require sensitivity to harmony, instrument timbre, and rhythm. An ablation study reveals that this capability is due to the temporal stability constraint imposed on latent space trajectories during training. We also demonstrate that models trained on pop/rock music learn a latent space that offers control over, among other things, the diatonic characteristics of the output. Lastly, we present and discuss generated bass tracks for three different music fragments. The work presented here is a step toward the integration of AI-based technology into the workflow of musical content creators.
Database: OpenAIRE
External link:
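The abstract attributes the model's disentanglement ability to a temporal stability constraint on latent space trajectories. As a rough illustration of what such a constraint can look like, the sketch below implements a generic frame-to-frame smoothness penalty on a 2-D latent trajectory; the function name and weighting are hypothetical, and this is a minimal stand-in for the idea, not BassNet's actual training loss.

```python
import numpy as np

def temporal_stability_penalty(z, weight=1.0):
    """Penalize frame-to-frame movement of a latent trajectory.

    z: array of shape (T, D), a latent trajectory over T time steps.
    Returns weight * mean squared step length between consecutive latents.
    NOTE: a generic smoothness regularizer, not the paper's exact formulation.
    """
    diffs = np.diff(z, axis=0)                         # (T-1, D) step vectors
    return weight * float(np.mean(np.sum(diffs ** 2, axis=1)))

# A jittery trajectory incurs a larger penalty than a smooth drift.
smooth = np.linspace(0.0, 1.0, 50)[:, None] * np.ones((1, 2))   # steady 2-D drift
jitter = smooth + np.random.default_rng(0).normal(0.0, 0.1, smooth.shape)
```

Added to a reconstruction objective, a term of this kind discourages the latent variable from fluctuating rapidly over time, which is what makes the latent coordinates usable as a slowly varying interactive control.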