
ByteDance’s Bytespider is scraping at much higher rates than other platforms


The crawler, dubbed Bytespider, is scraping the internet at 3,000 times the rate of crawlers run by other generative AI companies like Anthropic.

It’s scraping data at a rate many multiples that of other major companies, such as Google, Meta, Amazon, OpenAI, and Anthropic, which use their own scraper bots to help create and improve their large language models (LLMs) or large multimodal models (LMMs). Earlier this year, ByteDance released a chat-based LLM called Doubao, but work on that model would have been completed before Bytespider scraped its more recent training data.

“Given the audience and the amount of use, TikTok with a search environment that is a completely biddable space with keywords and topics, that would be very interesting to a lot of people spending a ton of money with Google right now,” the person said.


