BassNet: A Variational Gated Autoencoder for Conditional Generation of Bass Guitar Tracks with Learned Interactive Control

Author: Emmanuel Deruty, Maarten Grachten, Stefan Lattner
Language: English
Year of publication: 2020
Subject:
Deep learning
Machine learning
Artificial intelligence
Music generation
Autoencoder
Latent space models
User control
Bass guitar
Timbre
Diatonic scale
Rock music
Source: Applied Sciences, Vol 10, Iss 18, p 6627 (2020)
ISSN: 2076-3417
DOI: 10.3390/app10186627
Description: Deep learning has given AI-based methods for music creation a boost over the past years. An important challenge in this field is to balance user control and autonomy in music generation systems. In this work, we present BassNet, a deep learning model for generating bass guitar tracks based on musical source material. An innovative aspect of our work is that the model is trained to learn a temporally stable two-dimensional latent space variable that offers interactive user control. We empirically show that the model can disentangle bass patterns that require sensitivity to harmony, instrument timbre, and rhythm. An ablation study reveals that this capability is due to the temporal stability constraint on latent space trajectories during training. We also demonstrate that models trained on pop/rock music learn a latent space that offers control over, among other things, the diatonic characteristics of the output. Lastly, we present and discuss generated bass tracks for three different music fragments. The work presented here is a step toward the integration of AI-based technology into the workflow of musical content creators.
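The temporal stability constraint mentioned in the abstract can be illustrated with a minimal sketch. This is a hypothetical illustration, not the authors' implementation: it penalizes the mean squared difference between consecutive points of a two-dimensional latent trajectory, so that latent variables vary slowly over time.

```python
def temporal_stability_penalty(trajectory):
    """Mean squared difference between consecutive 2-D latent points.

    `trajectory` is a list of (y1, y2) tuples sampled over time; a smaller
    value indicates a smoother (more temporally stable) latent trajectory.
    This is an illustrative sketch, not the paper's actual loss term.
    """
    if len(trajectory) < 2:
        return 0.0
    total = 0.0
    for (a1, a2), (b1, b2) in zip(trajectory, trajectory[1:]):
        total += (b1 - a1) ** 2 + (b2 - a2) ** 2
    return total / (len(trajectory) - 1)

# A smooth trajectory incurs a smaller penalty than a jumpy one.
smooth = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.1)]
jumpy = [(0.0, 0.0), (1.0, -1.0), (-1.0, 1.0)]
```

Adding such a penalty to the training objective pushes the encoder toward slowly varying latent trajectories, which is what makes the latent variable usable as an interactive control signal.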
Database: OpenAIRE