Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/165027
Title: Building generalizable models for discourse phenomena evaluation and machine translation
Authors: Jwalapuram, Prathyusha
Keywords: Engineering::Computer science and engineering::Computing methodologies::Document and text processing; Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
Issue Date: 2022
Publisher: Nanyang Technological University
Source: Jwalapuram, P. (2022). Building generalizable models for discourse phenomena evaluation and machine translation. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/165027
Abstract: The neural revolution in machine translation has made it easier to model larger contexts beyond the sentence level, which can potentially help resolve some discourse-level ambiguities and enable better translations. Although a growing number of machine translation systems incorporate contextual information, the evidence for translation quality improvement is sparse, especially for discourse phenomena. Most of these phenomena go virtually unnoticed by traditional automatic evaluation measures such as BLEU. This work presents testsets and evaluation measures for four discourse phenomena (anaphora, lexical consistency, discourse connectives, and coherence) and highlights the need for such fine-grained evaluation. We present benchmarking results for several context-aware machine translation models using these testsets and evaluation measures, showing that performance is not always consistent across languages. We also present a targeted fine-tuning strategy that improves pronoun translations by leveraging errors in already seen training data and additional losses, instead of building specialized architectures that do not generalize across languages.
URI: https://hdl.handle.net/10356/165027
DOI: 10.32657/10356/165027
Schools: School of Computer Science and Engineering 
Rights: This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
Fulltext Permission: open
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Theses

Files in This Item:
File: I_Thesis__Generalizable_Models_for_Evaluation-6.pdf
Size: 3.63 MB
Format: Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.