Linguistics, Department of

Rethinking the Phonetics of Baby-talk: Differences Across Canada and Vanuatu in the Articulation of Mothers' Speech to Infants

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2021-10-11
Abstract: 

Infant-directed speech (IDS) is phonetically distinct from adult-directed speech (ADS): It is typically considered to have special prosody (like higher pitch and slower speaking rates) as well as unique speech sound properties, for example, more breathy, hyperarticulated, and/or variable consonant and vowel articulation. These phonetic features are widely observed in the IDS of caregivers from urbanized contexts who speak a handful of very well-researched languages. Yet studies with more diverse socio-cultural and linguistic samples show that this “typical” IDS prosody is not consistently observed across cultures. We extended cross-cultural work by examining IDS speech segment articulation, which, like prosody, is also thought to be a characteristic phonetic feature of IDS that might aid speech and language development. Here we asked whether IDS vowels have different articulatory features compared to ADS vowels in two distinct linguistic and socio-cultural contexts: urban English-speaking Canadian mothers, and rural Lenakel- and Southwest Tanna-speaking ni-Vanuatu mothers (n = 57, 20–46 years of age). Replicating prior work, Canadian mothers had more variable vowels in IDS compared to ADS, but also did not show clear register differences for breathiness or hyperarticulation. Vowels spoken by ni-Vanuatu mothers showed very distinct articulatory tendencies, with less variable (and less breathy) IDS vowels. Along with other work showing diversity in IDS phonetics across populations, this paper suggests that any understanding of how IDS might aid speech and language development is best examined through a culturally- and linguistically-specific lens.

Document type: 
Article

On the Difficulty of Defining “Difficult” in Second-Language Vowel Acquisition

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2021-08-05
Abstract: 

Hierarchies of difficulty in second-language (L2) phonology have long played a role in the postulation and evaluation of learning models. In L2 pronunciation teaching, hierarchies are assumed to be helpful in the development of instructional strategies based on anticipated areas of difficulty. This investigation addressed the practicality of defining a pedagogically useful hierarchy of difficulty for English tense and lax close vowels (/i ɪ u ʊ/) produced by Cantonese speakers. Unlike their English counterparts, Cantonese close tense-lax pairs are allophonic variants with [i u] occurring before alveolars and [ɪ ʊ] before velars. Each tense-lax pair represents a “phonemic split” in which members of a single L1 category are realized contrastively in L2. Despite evidence that English tense-lax distinctions are challenging for Cantonese speakers, no previous empirical work has closely considered the problem from the standpoint of vowel intelligibility across multiple phonetic contexts and in different words sharing the same rhyme. In a picture-based word-elicitation task, 18 Cantonese-speaking participants produced 31 high-frequency CV and CVC words. Vowels were evaluated for intelligibility by phonetically trained judges. A series of mixed-effects binary logistic models were fitted to the scores, with vowel quality, phonetic context (rhyme) and word as factors, and length of Canadian residence and daily use of English as co-variates. As expected, the general hierarchy of difficulty for vowels that emerged (/i/ > /u/ > /ʊ/ > /ɪ/) was complicated by large differences across phonetic contexts. Results were not readily explicable in terms of transfer; moreover, different words with the same rhyme were not produced with equal intelligibility. The most serious modeling complication was the sizeable inter-speaker variability in difficulties, which could not be accounted for by model co-variates. Although some difficulties were roughly systematic at the group level, it is argued that establishing a pedagogically useful hierarchy on such data would prove intractable. Rather, L2 learners might be better served by assessment and instructional targeting of their individual problem areas than by a focus on errors predicted from hierarchies of difficulty.

Document type: 
Article

The Gender Gap Tracker: Using Natural Language Processing To Measure Gender Bias in Media

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2021-01-29
Abstract: 

We examine gender bias in media by tallying the number of men and women quoted in news text, using the Gender Gap Tracker, a software system we developed specifically for this purpose. The Gender Gap Tracker downloads and analyzes the online daily publication of seven English-language Canadian news outlets and enhances the data with multiple layers of linguistic information. We describe the Natural Language Processing technology behind this system, the curation of off-the-shelf tools and resources that we used to build it, and the parts that we developed. We evaluate the system in each language processing task and report errors using real-world examples. Finally, by applying the Tracker to the data, we provide valuable insights about the proportion of people mentioned and quoted, by gender, news organization, and author gender. Data collected between October 1, 2018 and September 30, 2020 shows that, in general, men are quoted about three times as frequently as women. While this proportion varies across news outlets and time intervals, the general pattern is consistent. We believe that, in a world with about 50% women, this should not be the case. Although journalists naturally need to quote newsmakers who are men, they also have a certain amount of control over who they approach as sources. The Gender Gap Tracker relies on the same principles as fitness or goal-setting trackers: By quantifying and measuring regular progress, we hope to motivate news organizations to provide a more diverse set of voices in their reporting.

Document type: 
Article

Evaluation in Political Discourse Addressed to Women: Appraisal Analysis of Cosmopolitan's Coverage of the 2014 US Midterm Elections

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2017-08
Abstract: 

Before the US midterm elections of November 2014, the well-known women’s magazine Cosmopolitan decided to include politics in its contents. The editorial board stated that their aim was to encourage readers to vote and to be engaged with women’s rights advocacy in the election process. To that end, Cosmopolitan created a new website, CosmoVotes, with content ranging from discussion of political issues to endorsement of specific candidates who were believed to advance women’s issues. Topics include labour rights, abortion, contraception, health, minimum wage and social equity.

This paper evaluates the discourse of this new section of the Cosmopolitan website, together with readers’ responses, concentrating on evaluative language. In particular, we are concerned with differences between the editorial position and readers’ responses as viewed through the Appraisal framework (Martin & White, 2005), and the role that verbal processes play in the expression of evaluative meanings. The corpus used for the analysis consists of a selection of articles and readers’ opinions from CosmoVotes. The methodology is based on annotation of Appraisal features and processes related to the interpersonal dimension of meaning. Those features reveal how attitudes are evaluated and capture ideological positionings in this discourse. Our results show that CosmoVotes has special characteristics, such as a predominance of high intensification in the readers’ opinions, and strong negative judgements and expressions, while the magazine’s pieces on political issues are more nuanced and eschew intensification.

Document type: 
Article

Big Data and Quality Data for Fake News and Misinformation Detection

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2019-05-23
Abstract: 

Fake news has become an important topic of research in a variety of disciplines including linguistics and computer science. In this paper, we explain how the problem is approached from the perspective of natural language processing, with the goal of building a system to automatically detect misinformation in news. The main challenge in this line of research is collecting quality data, i.e., instances of fake and real news articles on a balanced distribution of topics. We review available datasets and introduce the MisInfoText repository as a contribution of our lab to the community. We make available the full text of the news articles, together with veracity labels previously assigned based on manual assessment of the articles’ truth content. We also perform a topic modelling experiment to elaborate on the gaps and sources of imbalance in currently available datasets to guide future efforts. We appeal to the community to collect more data and to make it available for research purposes.

Document type: 
Article

History of Language Teaching Methods

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2018-07-28
Abstract: 

The earliest European written accounts of language teaching methods are from the 5th century AD, referring specifically to Latin. For centuries the language of the Romans was the primary foreign code throughout much of Europe, functioning as the language of scholarship, trade, and government. The founding of universities in the later Middle Ages led to the development of the Grammar-Translation Method, based on the centuries-long tradition of reading learned Latin and Greek texts. In the 15th century, Europeans began shifting from Latin to using the continent’s modern languages more widely. By the 19th century, the Direct Method had been developed, modeled on first language acquisition and addressing the greater need for speaking skills in languages such as French, German, and English. In the early 20th century, research largely in educational psychology led to the development of the Audio-lingual Method in the 1940s. Because language use was believed to be a matter of stimulus and response, teaching methods emphasized repetition and dialogue memorization. A decade later, Chomsky’s landmark research on cognitive aspects of language acquisition recognized that children do not acquire an inventory of linguistic stimuli and responses. Instead, deep processing in the brain enables them to generate sentences they have never heard before. This led to modernizing the Direct Method by incorporating cognitive dimensions of language learning. Since the 1970s, language has been further recognized as a social phenomenon that inherently entails expressing, interpreting, and negotiating meaning. To foster such competence, the current approach of Communicative Language Teaching emphasizes having learners do meaningful activities involving the exchange of new information.

Document type: 
Article

Quinlingualism in the Maghreb? English Use in Moroccan Outdoor Advertising

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2019-05-27
Document type: 
Article

RST Signalling Corpus: A Corpus of Signals of Coherence Relations

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2018-03
Abstract: 

We present the RST Signalling Corpus (Das et al. in RST signalling corpus, LDC2015T10. https://catalog.ldc.upenn.edu/LDC2015T10, 2015), a corpus annotated for signals of coherence relations. The corpus is developed over the RST Discourse Treebank (Carlson et al. in RST Discourse Treebank, LDC2002T07. https://catalog.ldc.upenn.edu/LDC2002T07, 2002) which is annotated for coherence relations. In the RST Signalling Corpus, these relations are further annotated with signalling information. The corpus includes annotation not only for discourse markers which are considered to be the most typical (or sometimes the only type of) signals in discourse, but also for a wide array of other signals such as reference, lexical, semantic, syntactic, graphical and genre features as potential indicators of coherence relations. We describe the research underlying the development of the corpus and the annotation process, and provide details of the corpus. We also present the results of an inter-annotator agreement study, illustrating the validity and reproducibility of the annotation. The corpus is available through the Linguistic Data Consortium, and can be used to investigate the psycholinguistic mechanisms behind the interpretation of relations through signalling, and also to develop discourse-specific computational systems such as discourse parsing applications.

Document type: 
Article

On Being Negative

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2017-04-01
Abstract: 

This paper investigates the pragmatic expressions of negative evaluation (negativity) in two corpora: (i) comments posted online in response to newspaper opinion articles; and (ii) online reviews of movies, books and consumer products. We propose a taxonomy of linguistic resources that are deployed in the expression of negativity, with two broad groups at the top level of the taxonomy: resources from the lexicogrammar or from discourse semantics. We propose that rhetorical figures can be considered part of the discourse semantic resources used in the expression of negativity. Using our taxonomy as a starting point, we carry out a corpus analysis, and focus on three phenomena: adverb + adjective combinations; rhetorical questions; and rhetorical figures. Although the analysis in this paper is corpus-assisted rather than corpus-driven, the final goal of our research is to make it quantitative by extracting patterns and resources that can be detected automatically.

Document type: 
Article

The Semantics of Evaluational Adjectives: Perspectives from Natural Semantic Metalanguage and Appraisal

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2017
Abstract: 

We apply the Natural Semantic Metalanguage (NSM) approach (Goddard & Wierzbicka 2014) to the lexical-semantic analysis of English evaluational adjectives and compare the results with the picture developed in the Appraisal Framework (Martin & White 2005). The analysis is corpus-assisted, with examples mainly drawn from film and book reviews, and supported by collocational and statistical information from WordBanks Online. We propose NSM explications for 15 evaluational adjectives, arguing that they fall into five groups, each of which corresponds to a distinct semantic template. The groups can be sketched as follows: “First-person thought-plus-affect”, e.g. wonderful; “Experiential”, e.g. entertaining; “Experiential with bodily reaction”, e.g. gripping; “Lasting impact”, e.g. memorable; “Cognitive evaluation”, e.g. complex, excellent. These groupings and semantic templates are compared with the classifications in the Appraisal Framework’s system of Appreciation. In addition, we are particularly interested in sentiment analysis, the automatic identification of evaluation and subjectivity in text. We discuss the relevance of the two frameworks for sentiment analysis and other language technology applications.

Document type: 
Article