Please use this identifier to cite or link to this item: /library/oar/handle/123456789/137284
Title: On the cusp of comprehensibility : can language models distinguish between metaphors and nonsense?
Authors: Griciute, Bernadeta (2022)
Keywords: Natural language generation (Computer science)
Metaphor
Issue Date: 2022
Citation: Griciute, B. (2022). On the cusp of comprehensibility: can language models distinguish between metaphors and nonsense? (Master's dissertation).
Abstract: Highly creative texts can be difficult to understand, balancing on the edge of comprehensibility. However, good language skills and common sense allow advanced language users both to interpret creative texts and to reject some linguistic input as nonsense. The goal of this thesis is to evaluate whether current language models can also distinguish creative language use, namely (unconventional) metaphors, from nonsense. To test this, the mean rank and pseudo-log-likelihood (PLL) score of metaphorical and nonsensical sentences were computed, and several pre-trained models (BERT, RoBERTa) were fine-tuned for binary classification between the two categories. There was a significant difference between the categories in the mean ranks and PLL scores, and the classifiers reached around 70.0% to 85.5% accuracy, which is close to the 87% accuracy of the human baseline. This satisfactory performance suggests that current language models can already be trained to distinguish between metaphors and nonsense. It also raises further questions about which characteristics of metaphorical and nonsensical sentences make successful classification possible.
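Note: the abstract names two sentence-level measures without defining them. The sketch below illustrates one common reading. PLL is taken as the sum of the log-probabilities a masked language model assigns to each token when that token alone is masked. Mean rank is taken as the average position of the true token in the model's ranked predictions. Both definitions, the function names, and the toy bigram "model" standing in for BERT or RoBERTa are assumptions made for illustration; they are not taken from the dissertation.

```python
import math
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical interface: given a token sequence and a masked position,
# return a probability distribution over the vocabulary for that position.
MaskedPredictor = Callable[[Sequence[str], int], Dict[str, float]]


def pll_and_mean_rank(
    tokens: Sequence[str], predict: MaskedPredictor
) -> Tuple[float, float]:
    """Return (pseudo-log-likelihood, mean rank) for one sentence."""
    total_log_prob = 0.0
    ranks = []
    for i, token in enumerate(tokens):
        dist = predict(tokens, i)  # distribution for position i, as if masked
        p = dist.get(token, 0.0)
        total_log_prob += math.log(p) if p > 0 else float("-inf")
        # Rank 1 means no vocabulary item is more probable than the true token.
        ranks.append(1 + sum(1 for q in dist.values() if q > p))
    return total_log_prob, sum(ranks) / len(ranks)


# Toy stand-in for a masked language model (assumption, not a real model):
# add-one-smoothed bigram counts conditioned on the left neighbour only.
VOCAB = ["time", "is", "a", "river", "spoon"]
BIGRAMS = {("time", "is"): 5, ("is", "a"): 5, ("a", "river"): 5}


def toy_predict(tokens: Sequence[str], i: int) -> Dict[str, float]:
    left = tokens[i - 1] if i > 0 else "<s>"
    counts = {w: 1 + BIGRAMS.get((left, w), 0) for w in VOCAB}
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


expected_like = pll_and_mean_rank("time is a river".split(), toy_predict)
unexpected_like = pll_and_mean_rank("time is a spoon".split(), toy_predict)
print(expected_like, unexpected_like)
```

With a real masked language model, the predictor would mask position i and read the output distribution at that position. A sentence the model finds less expected receives a lower PLL and a higher mean rank.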
Description: M.Sc. (HLST)(Melit.)
URI: https://www.um.edu.mt/library/oar/handle/123456789/137284
Appears in Collections: Dissertations - FacICT - 2022
Dissertations - FacICTAI - 2022

Files in This Item:
File: 2318ICTCSA531005075347_1.PDF
Description: Restricted Access
Size: 1.37 MB
Format: Adobe PDF


Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.