Musical agents based on self-organizing maps for audio applications

Resource type
Thesis type
(Thesis) Ph.D.
Date created
Musical agents are artificial agents that tackle musical creative tasks. They apply the technologies of Artificial Intelligence (AI) and Multi-agent Systems (MAS) to musical applications. The study of musical agents is situated within the interdisciplinary field of Musical Metacreation (MuMe), with a focus on agent architectures. Metacreation and MuMe combine the artistic practice of Generative Arts with the scientific literature of Computational Creativity. We define Musical Metacreation as an interdisciplinary field that studies the partial or complete automation of musical tasks. In this work, we concentrate on an audio-based musical agent architecture with unsupervised learning and present a literature review of musical agents. Our review surveys seventy-eight musical agent systems that have been presented in peer-reviewed publications. Building on this review, we propose a typology of musical agents in nine dimensions: agent architectures, musical tasks, environment types, number of agents, number of agent roles, communication types, corpus types, input/output types, and human interaction modality. Our typology builds on AI terminology and the agent architecture typology in MAS. In comparison to the agent typology of MAS, the categories in our typology address the specific phenomena that appear in agent-based applications of musical tasks. Our survey indicated a possibility for research on an audio-based musical agent architecture with unsupervised learning. The implementations of musical agents that we present in this thesis use audio recordings with unsupervised learning because a variety of musical styles are available in the audio domain. Audio recordings are accessible; thus, curating a corpus for agent learning is easier and reduces the amount of work needed to gather the training data.
To address this research possibility, we propose an architecture called Musical Agent based on Self-Organizing Maps (MASOM) for audio applications. We were inspired by Edgard Varèse’s definition of music, which suggested that music is “nothing but organized sounds.” We put the notion of music as organized sounds into practice by combining autonomous audio latent space generation with musical structure modelling. This combination suggests that an audio-based musical agent architecture requires two kinds of sound organization: organizing sounds in latent sonic space to differentiate sound objects, and organizing sounds in time to create temporal musical structures. We present two main real-time applications of MASOM: architectures for experimental electronic music with machine listening, and an architecture for virtual reality applications with respiratory user interaction. Our applications exemplify the strengths and possibilities of audio-based musical agents in the artistic domain. We believe that MASOM architectures can be useful for applications of “musical creativity as it is.” We also propose that the innovative perspective of MASOM architectures provides an exploration of “musical creativity as it could be.”
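The two kinds of sound organization described above can be illustrated with a minimal sketch. It is not the thesis implementation: the two-dimensional feature vectors, the 3x3 grid, the training schedule, and the first-order transition model used for the temporal part are all assumptions chosen for brevity, and the function names are hypothetical.

```python
import math
import random


def best_matching_unit(weights, x):
    """Index of the map node whose weight vector is closest to x."""
    return min(
        range(len(weights)),
        key=lambda n: sum((a - b) ** 2 for a, b in zip(weights[n], x)),
    )


def train_som(features, rows=3, cols=3, epochs=20, lr0=0.5, seed=0):
    """Organize sounds in feature space with a small self-organizing map."""
    rng = random.Random(seed)
    dim = len(features[0])
    # One weight vector per grid node, initialised from random samples.
    weights = [list(rng.choice(features)) for _ in range(rows * cols)]
    sigma0 = max(rows, cols) / 2.0
    total_steps = epochs * len(features)
    step = 0
    for _ in range(epochs):
        order = list(range(len(features)))
        rng.shuffle(order)
        for i in order:
            x = features[i]
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighbourhood
            br, bc = divmod(best_matching_unit(weights, x), cols)
            for n in range(rows * cols):
                r, c = divmod(n, cols)
                grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                for d in range(dim):
                    weights[n][d] += lr * h * (x[d] - weights[n][d])
            step += 1
    return weights


def transition_counts(node_sequence):
    """Organize sounds in time: count node-to-node transitions (assumed model)."""
    counts = {}
    for a, b in zip(node_sequence, node_sequence[1:]):
        counts.setdefault(a, {})
        counts[a][b] = counts[a].get(b, 0) + 1
    return counts


def generate(counts, start, length, seed=0):
    """Sample a node sequence by following the observed transitions."""
    rng = random.Random(seed)
    out = [start]
    while len(out) < length:
        nxt = counts.get(out[-1])
        if not nxt:  # no observed continuation from this node
            break
        nodes = list(nxt)
        out.append(rng.choices(nodes, weights=[nxt[n] for n in nodes])[0])
    return out


# Toy corpus: 40 "audio segments" alternating between two sound types,
# each described by a made-up 2-D feature vector.
_rng = random.Random(1)
segments = []
for i in range(40):
    cx, cy = (0.2, 0.2) if i % 2 == 0 else (0.8, 0.8)
    segments.append([cx + _rng.uniform(-0.05, 0.05), cy + _rng.uniform(-0.05, 0.05)])

som = train_som(segments)
sequence = [best_matching_unit(som, s) for s in segments]
model = transition_counts(sequence)
phrase = generate(model, sequence[0], 8)
```

In this sketch, each entry of `phrase` is a map node; an agent could play back any corpus segment assigned to that node. A real system would replace the toy vectors with features extracted from audio and would likely use a richer temporal model than first-order transitions.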
Copyright statement
Copyright is held by the author.
This thesis may be printed or downloaded for non-commercial research and scholarly purposes.
Scholarly level
Supervisor or Senior Supervisor
Thesis advisor: Pasquier, Philippe
Attachment Size
etd20334.pdf 37.58 MB