How crawlers impact the operations of the Wikimedia projects

Since the beginning of 2024, the demand for the content created by the Wikimedia volunteer community – especially for the 144 million images, videos, and other files on Wikimedia Commons – has grow…

But with the rise of AI, the dynamic is changing: We are observing a significant increase in request volume, with most of this traffic being driven by scraping bots collecting training data for large language models (LLMs) and other use cases. We have started to work towards addressing these questions systemically, and have set a major focus on establishing sustainable ways for developers and reusers to access knowledge content in the Foundation’s upcoming fiscal year. Our content is free, our infrastructure is not: We need to act now to re-establish a healthy balance, so we can dedicate our engineering resources to supporting and prioritizing the Wikimedia projects, our contributors and human access to knowledge.
