Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/183922
Title: Vaporizer: breaking watermarking schemes for large language model outputs
Authors: Ng, Jonathan Hong Jin
Keywords: Computer and Information Science
Issue Date: 2025
Publisher: Nanyang Technological University
Source: Ng, J. H. J. (2025). Vaporizer: breaking watermarking schemes for large language model outputs. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/183922
Project: CCDS24-0133
Abstract: In this report, we investigate three recently proposed schemes for watermarking the output generated by Large Language Models (LLMs), namely Provable Robust Watermarking for AI-Generated Text [1], Publicly Detectable Watermarking [2] and SynthID-Text [3]. These techniques are claimed to be robust, scalable and production-grade, aimed at promoting responsible usage of LLMs. We analyse the effectiveness of these watermarking techniques against an extensive collection of modified text attacks, which perform targeted semantic changes without altering the general meaning of the text content. Our approach encompasses multiple attack strategies, which include lexical alterations, machine translation (with MarianMT models), and even neural paraphrasing (with usage of BART and Pegasus models). The attack efficacy is measured with two target criteria: successful removal of the watermark and preservation of semantic content. We evaluate semantic preservation through BERT scores, text complexity measures, grammatical errors, and Flesch Reading Ease indices. The experimental results reveal varying levels of effectiveness among different watermarking models, with the same underlying result that it is possible to remove the watermark with reasonable effort. This study sheds light on the strengths and weaknesses of existing LLM watermarking systems, suggesting how they should be constructed to improve security of available schemes.
URI: https://hdl.handle.net/10356/183922
Schools: College of Computing and Data Science 
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: CCDS Student Reports (FYP/IA/PA/PI)

Files in This Item:
File: Ng_Hong_Jin_FYP_Final.pdf
Description: Restricted Access
Size: 4.67 MB
Format: Adobe PDF

Page view(s): 21 (updated on May 7, 2025)

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.