Autolume-Live: An interface for live visual performances using GANs

Resource type
Thesis type
(Thesis) M.Sc.
Date created
2023-03-09
Authors/Contributors
Abstract
At the time of writing, deep learning models can generate high-fidelity images. They are becoming an integral tool for creative expression and are democratizing creativity. With the introduction of prompt-based generative systems, we have seen that lowering the skill floor for access leads to an explosion of possibilities for creative practices. However, most creative AI systems remain inaccessible to users who are not trained in coding or in the respective research domain. We present Autolume-Live, an interface created to make a deep learning model a malleable material for live performances. We aggregate and link creative practices with current state-of-the-art frameworks from model compression, generative AI and interpretable AI, resulting in an interactive tool that opens up a new paradigm for audio-visual performances and visual performance in general. We show that by shifting the focus to creating a tool for a valid user scenario (VJing), it is possible to use current technologies to run StyleGAN2 in a tool for live performances at frame rates of 30 FPS (1024px) to 50 FPS (512px) without compressing the model, and 40 FPS and 60 FPS after compression. Autolume-Live can be used on its own to train and compress models, find salient features and create audio-visual mappings. Furthermore, we include networking protocols that allow users to control the tool via Open Sound Control (OSC). Hence, they can interface Autolume-Live's parameters with their preferred software and hardware, e.g. TouchDesigner, PureData or MaxMSP. Lastly, the resulting video stream is sent out as an NDI stream, allowing for postprocessing in software such as OBS or TouchDesigner. We have showcased the work in two installations during the COVID-19 pandemic that demonstrated its generative power (offline), and we present setups for live performances in this work. Beyond this thesis, development continues on new features and on both interactive installations and audio-visual pieces.
Document
Extent
58 pages.
Identifier
etd22382
Copyright statement
Copyright is held by the author(s).
Permissions
This thesis may be printed or downloaded for non-commercial research and scholarly purposes.
Supervisor or Senior Supervisor
Thesis advisor: Pasquier, Philippe
Language
English
Download file
etd22382.pdf
Size
1.1 MB

Views & downloads - as of June 2023

Views: 97
Downloads: 6