ByteDance’s Bytespider is scraping at much higher rates than other platforms
The crawler, dubbed Bytespider, is scraping the internet at 3,000 times the rate of crawlers run by other genAI companies like Anthropic.
It’s scraping data at a rate that’s many multiples of that of other major companies, such as Google, Meta, Amazon, OpenAI, and Anthropic, which use their own scraper bots to help create and improve their large language or multimodal models, known as LLMs or LMMs. Earlier this year, ByteDance released a chat-based LLM called Doubao, but work on that model would have been completed before Bytespider scraped its more recent training data.
“Given the audience and the amount of use, TikTok with a search environment that is a completely biddable space with keywords and topics, that would be very interesting to a lot of people spending a ton of money with Google right now,” the person said.