Using an LLM to compress text
Introduction

Large language models are trained on huge datasets of text to learn the relationships and contexts of words within larger documents. These relationships are what allow the model to generate text. Recently I've read concerns about LLMs being trained on copyrighted text and reproducing it. This got me thinking:
I figured that, for the most part, many texts contain sections that would naturally align with the language relationships the model has learned. As a first test, I decided to use the first chapter of "Alice's Adventures in Wonderland", as I assumed it would be in the model's training data.
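One way to picture the idea (a minimal sketch, not necessarily the exact scheme explored later in this post): if a model can predict the next token well, you only need to store *how far down* its ranked guesses the true token sits. Here a toy bigram frequency model stands in for the LLM, and all names (`train_bigram`, `compress`, and so on) are illustrative inventions:

```python
# Rank-based compression sketch: a toy bigram model plays the role of
# an LLM's next-token predictor. Text the model predicts well encodes
# as a stream of small ranks (mostly zeros).
from collections import Counter, defaultdict

def train_bigram(tokens):
    # Count which token follows which, so candidates can be ranked.
    follows = defaultdict(Counter)
    for a, b in zip(tokens, tokens[1:]):
        follows[a][b] += 1
    return follows

def ranked_candidates(model, prev, vocab):
    # Tokens the model expects next, most likely first; unseen tokens
    # follow in a fixed sorted order so ranks stay deterministic.
    seen = [t for t, _ in model[prev].most_common()]
    rest = sorted(v for v in vocab if v not in model[prev])
    return seen + rest

def compress(tokens, model, vocab):
    # Store the first token literally, then only the rank of each
    # subsequent token within the model's predicted ordering.
    ranks = [ranked_candidates(model, prev, vocab).index(cur)
             for prev, cur in zip(tokens, tokens[1:])]
    return tokens[0], ranks

def decompress(first, ranks, model, vocab):
    # Replay the model's predictions, picking the stored rank each step.
    out = [first]
    for r in ranks:
        out.append(ranked_candidates(model, out[-1], vocab)[r])
    return out

text = "alice was beginning to get very tired of sitting by her sister".split()
vocab = set(text)
model = train_bigram(text)
first, ranks = compress(text, model, vocab)
assert decompress(first, ranks, model, vocab) == text
```

Every rank the model gets right is a zero, and a run of zeros is exactly what a conventional entropy coder shrinks well. A real LLM replaces the bigram table with full next-token probabilities, which is where text that "aligns with the model's training data" should compress best.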