Scalable watermarking for identifying large language model outputs

A scheme for watermarking the text generated by large language models preserves text quality, achieves high detection accuracy, adds little latency, and is feasible in large-scale production settings.

Watermarking can help identify synthetic text and limit accidental or deliberate misuse⁴, but has not been adopted in production systems owing to stringent quality, detectability and computational efficiency requirements. SynthID-Text comes with rigorous and customizable non-distortion properties that can be configured to guarantee text quality preservation; we confirm this empirically, including via real user feedback measured over approximately 20 million Gemini chatbot interactions.

Sumanth Dathathri, Abigail See, Sumedh Ghaisas, Po-Sen Huang, Johannes Welbl, Vandana Bachani, Alex Kaskasoli, Robert Stanforth, Tatiana Matejovicova, Jamie Hayes, Jonah Brown-Cohen, Rudy Bunel, Borja Balle, Taylan Cemgil, Zahra Ahmed, Kitty Stacpoole, Ilia Shumailov, Sven Gowal, Demis Hassabis & Pushmeet Kohli
