In-context concept learning using diffusion models

Thesis type
M.Sc. thesis
Date created
2024-05-31
Abstract
This thesis addresses the challenge of learning contextual visual concepts from limited image sets. We leverage pre-trained text-to-image diffusion models and introduce a novel optimization framework targeting token embeddings and linear projection parameters in the model's cross-attention layers. Our approach is guided by three primary objectives: perceiving the concept within its context, ensuring focused attention on the concept, and accurately capturing the concept's details. Additionally, we develop a controllable image-editing workflow that allows for precise control over the strength and location of the transferred concepts in new images. Comprehensive qualitative and quantitative experiments demonstrate the effectiveness of our method in learning and transferring contextual visual concepts, significantly outperforming existing techniques.
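The abstract's core idea of optimizing only a concept token embedding and the linear (key/value) projections of a cross-attention layer, with all other weights frozen, can be illustrated with a toy PyTorch sketch. This is not the thesis's code: the dimensions, the random stand-in features and target, and the plain MSE loss are all placeholders for a real pre-trained diffusion model's denoising objective.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
d = 32  # feature/embedding dimension (arbitrary for this sketch)

class CrossAttention(nn.Module):
    """Minimal single-head cross-attention layer."""
    def __init__(self, dim):
        super().__init__()
        self.to_q = nn.Linear(dim, dim, bias=False)
        self.to_k = nn.Linear(dim, dim, bias=False)
        self.to_v = nn.Linear(dim, dim, bias=False)

    def forward(self, x, context):
        q, k, v = self.to_q(x), self.to_k(context), self.to_v(context)
        attn = torch.softmax(q @ k.transpose(-1, -2) / d ** 0.5, dim=-1)
        return attn @ v

attn = CrossAttention(d)

# Freeze the whole layer, then unfreeze only the k/v projections,
# mirroring the idea of tuning a small subset of cross-attention weights.
for p in attn.parameters():
    p.requires_grad_(False)
attn.to_k.weight.requires_grad_(True)
attn.to_v.weight.requires_grad_(True)

# Two learnable tokens standing in for the new concept's embedding.
concept_tokens = nn.Parameter(torch.randn(1, 2, d))

image_feats = torch.randn(1, 4, d)  # stand-in for frozen U-Net features
target = torch.randn(1, 4, d)       # stand-in for a denoising target

opt = torch.optim.Adam(
    [concept_tokens, attn.to_k.weight, attn.to_v.weight], lr=1e-2
)
losses = []
for _ in range(200):
    out = attn(image_feats, concept_tokens)
    loss = F.mse_loss(out, target)
    opt.zero_grad()
    loss.backward()
    opt.step()
    losses.append(loss.item())
```

After a few hundred steps the loss drops even though the query projection and the "image" features stay fixed, which is the point of the parameter-efficient setup: the concept is absorbed entirely into the token embedding and the key/value projections.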
Document
Extent
52 pages.
Identifier
etd23108
Copyright statement
Copyright is held by the author(s).
Permissions
This thesis may be printed or downloaded for non-commercial research and scholarly purposes.
Supervisor or Senior Supervisor
Thesis advisor: Mahdavi-Amiri, Ali
Language
English
Download file
etd23108.pdf (18.75 MB)
