Resource type
Thesis
Thesis type
(Thesis) M.Sc.
Date created
2018-11-26
Authors/Contributors
Author: Ambartsoumian, Artaches
Abstract
Many machine learning tasks are structured as sequence modeling problems, predominantly dealing with text and data with a time dimension. It is thus very important to have a model that is good at capturing both short-range and long-range dependencies across sequence steps. Many approaches have been used over the past few decades, with various neural network architectures becoming the standard in recent years. The main neural network architecture types that have been applied are recurrent neural networks (RNNs) and convolutional neural networks (CNNs). In this work, we explore a new type of neural network architecture, self-attention networks (SANs), by testing it on the sequence modeling tasks of sentiment analysis classification and time-series regression. First, we perform a detailed comparison between simple SANs, RNNs, and CNNs on six sentiment analysis datasets, where we demonstrate that SANs achieve higher classification accuracy while also offering better model characteristics than RNNs, such as faster training and inference times, fewer trainable parameters, and lower memory consumption during training. Next, we propose a more complex self-attention-based architecture called ESSAN and use it to achieve state-of-the-art (SOTA) results on the Stanford Sentiment Treebank fine-grained sentiment analysis dataset. Finally, we apply our ESSAN architectures to the regression task of multivariate time-series prediction. Our preliminary results show that ESSAN once again achieves SOTA results, outperforming previous SOTA RNN-with-attention architectures.
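For readers unfamiliar with the mechanism behind SANs, the following is a minimal sketch of single-head scaled dot-product self-attention, the core operation such networks are built on. It assumes the standard formulation (as in Vaswani et al., 2017); the weight names, shapes, and NumPy implementation here are illustrative and do not reflect the thesis's actual ESSAN architecture.

```python
# Minimal sketch of single-head scaled dot-product self-attention.
# Names and shapes are illustrative, not the thesis's ESSAN model.
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """x: (seq_len, d_model) input sequence; w_*: (d_model, d_k) projections."""
    q = x @ w_q                                # queries
    k = x @ w_k                                # keys
    v = x @ w_v                                # values
    scores = q @ k.T / np.sqrt(k.shape[-1])    # pairwise similarity, scaled
    # Softmax over keys: every position attends to every other position,
    # so long-range dependencies are a single step away (unlike in RNNs,
    # where information must propagate through each intermediate step).
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v                         # (seq_len, d_k) contextual outputs

rng = np.random.default_rng(0)
seq_len, d_model, d_k = 8, 16, 16
x = rng.standard_normal((seq_len, d_model))
w_q, w_k, w_v = (rng.standard_normal((d_model, d_k)) * 0.1 for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)
print(out.shape)  # (8, 16)
```

Because all pairwise interactions are computed in parallel rather than sequentially, this construction also accounts for the faster training and inference times the abstract reports relative to RNNs.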
Document
Identifier
etd19959
Copyright statement
Copyright is held by the author.
Scholarly level
Supervisor or Senior Supervisor
Thesis advisor: Popowich, Fred
Member of collection
| Download file | Size |
|---|---|
| etd19959.pdf | 894.08 KB |