Sheikhshabbafghi, Golnar

Thesis type

(Thesis) Ph.D.

Date created

2021-03-31

Authors/Contributors

Author: Sheikhshabbafghi, Golnar

Abstract

Human annotations, especially in highly technical domains, are expensive and time-consuming to gather, and can also be erroneous. As a result, we never have sufficiently accurate data to train and evaluate supervised methods. In this thesis, we address this problem by taking a semi-supervised approach to biomedical named entity recognition (NER), and by proposing an inventory-independent evaluation framework for supervised and unsupervised word sense disambiguation. Our contributions are as follows: We introduce a novel graph-based semi-supervised approach to named entity recognition (NER) and exploit pre-trained contextualized word embeddings in several biomedical NER tasks. We propose a new evaluation framework for word sense disambiguation that permits a fair comparison between supervised methods trained on different sense inventories as well as unsupervised methods without a fixed sense inventory.

Keywords

Identifier

etd21291

Copyright statement

Copyright is held by the author(s).

Permissions

This thesis may be printed or downloaded for non-commercial research and scholarly purposes.

Supervisor or Senior Supervisor

Thesis advisor: Sarkar, Anoop

Language

English

Member of collection

Computing Science Theses


Investigations into the value of labeled and unlabeled data in biomedical entity recognition and word sense disambiguation
