Automated natural language headline generation using discriminative machine learning models

Date created: 
2007
Abstract: 

Headline or short summary generation is an important problem in text summarization with several practical applications. We present a discriminative learning framework and a rich feature set for the headline generation task. We also present a novel BLEU-based scheme for evaluating headline generation models that does not require human-produced references; we achieve this by building a test corpus using the Google News service. We propose two stacked log-linear models, one for headline word selection (Content Selection) and one for ordering words into a grammatical and coherent headline (Headline Synthesis). For decoding, a beam search algorithm combines the two log-linear models to produce a list of k-best human-readable headlines from a news story. Systematic training and experimental results on the Google News test dataset demonstrate the effectiveness of our approach.
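
The decoding step described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the function name `beam_search_headlines`, the two scoring callables, the additive combination of their scores, the no-repeated-words constraint, and the toy weights are all assumptions made for illustration. Each callable stands in for the log-score of one of the two log-linear models.

```python
from typing import Callable, List, Sequence, Tuple

# A hypothesis is (cumulative log-score, words chosen so far).
Hypothesis = Tuple[float, Tuple[str, ...]]


def beam_search_headlines(
    words: Sequence[str],
    selection_score: Callable[[str], float],      # stand-in for Content Selection
    ordering_score: Callable[[str, str], float],  # stand-in for Headline Synthesis
    length: int,
    beam_width: int = 5,
    k: int = 3,
) -> List[Hypothesis]:
    """Return up to k best word sequences of the given length (hypothetical sketch)."""
    beam: List[Hypothesis] = [(0.0, ())]
    for _ in range(length):
        expanded: List[Hypothesis] = []
        for score, seq in beam:
            prev = seq[-1] if seq else "<s>"  # "<s>" marks the headline start
            for w in words:
                if w in seq:  # assumption: a word is used at most once
                    continue
                # Assumption: the two model scores are simply added.
                new_score = score + selection_score(w) + ordering_score(prev, w)
                expanded.append((new_score, seq + (w,)))
        if not expanded:
            break
        # Keep only the highest-scoring partial headlines.
        expanded.sort(key=lambda h: h[0], reverse=True)
        beam = expanded[:beam_width]
    return beam[:k]


# Toy scores, invented purely for illustration.
sel = {"markets": 1.0, "rally": 0.8, "the": -1.0, "after": -0.2}
bigram = {("<s>", "markets"): 0.5, ("markets", "rally"): 0.7}

result = beam_search_headlines(
    list(sel),
    lambda w: sel[w],
    lambda a, b: bigram.get((a, b), -0.5),
    length=2,
)
print(" ".join(result[0][1]))  # best-scoring two-word headline: "markets rally"
```

With a beam width of 1 this reduces to greedy decoding; a wider beam trades computation for a better chance of recovering a high-scoring headline whose first word looks weak in isolation.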

Description: 
The author has placed restrictions on the PDF copy of this thesis. The PDF is neither printable nor copyable. If you would like the SFU Library to attempt to contact the author for permission to print a copy, please email your request to summit-permissions@sfu.ca.
Language: 
English
Document type: 
Thesis
Rights: 
Copyright remains with the author
Department: 
School of Computing Science - Simon Fraser University
Thesis type: 
(Computing Science) Project (M.Sc.)