Sharma, Akshit

Resource type

Thesis

Thesis type

(Thesis) M.Sc.

Date created

2022-10-20

Authors/Contributors

Author: Sharma, Akshit

Abstract

Coreference resolution is a challenging problem that requires clustering the mentions in a text document according to the objects they refer to. Most prior work has relied extensively on text-only datasets, which provide no visual cues about the entities that the phrases represent. To address this, we introduce DenseRefer3D, a language and 3D dataset that aligns rich referring expressions with real-world objects, and an annotation tool, DenseRefer3D-Annotator, that facilitates the rendering of natural language sentences and 3D scenes. The tool provides functionality to efficiently manage the data collection workflow on the MTurk crowdsourcing platform and enables effective visualization of coreference links and phrase-to-object mappings. We outline several coreference experiments using an end-to-end deep learning approach, analyze the quality of detected mentions and clustering, propose a new task that directly aligns textual phrases with 3D objects, and explore ways to further research in the combined domain of language and vision.

Extent

86 pages.

Keywords

Identifier

etd22202

Copyright statement

Copyright is held by the author(s).

Permissions

This thesis may be printed or downloaded for non-commercial research and scholarly purposes.

Supervisor or Senior Supervisor

Thesis advisor: Chang, Angel

Language

English

Member of collection

Computing Science Theses

Download file	Size
etd22202.pdf	7.56 MB

DenseRefer3D: A language and 3D dataset for coreference resolution and referring expression comprehension
