Google’s Gemini 1.5 Pro enters public preview on Vertex AI
The model's headline feature is the amount of context it can process: up to 1 million tokens, which is roughly equivalent to 700,000 words or 30,000 lines of code.
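To put those figures in perspective, here is a back-of-the-envelope sketch in Python. It uses only the approximate numbers quoted above. The function names are illustrative, and the ratios are rough assumptions derived from those numbers, not properties of Gemini's actual tokenizer, which will vary with language and content.

```python
import math

# Approximate figures quoted in the article.
CONTEXT_TOKENS = 1_000_000
APPROX_WORDS = 700_000
APPROX_CODE_LINES = 30_000


def words_per_token() -> float:
    """Average words per token implied by the quoted figures."""
    return APPROX_WORDS / CONTEXT_TOKENS


def tokens_per_code_line() -> float:
    """Average tokens per line of code implied by the quoted figures."""
    return CONTEXT_TOKENS / APPROX_CODE_LINES


def estimate_tokens(word_count: int) -> int:
    """Rough token estimate for a text of `word_count` words.

    Assumes the article's overall ratio holds for the text in question.
    """
    return math.ceil(word_count * CONTEXT_TOKENS / APPROX_WORDS)


def fits_in_context(word_count: int) -> bool:
    """Whether the estimated token count stays within the quoted window."""
    return estimate_tokens(word_count) <= CONTEXT_TOKENS


if __name__ == "__main__":
    print(f"words per token: {words_per_token():.2f}")
    print(f"tokens per line of code: {tokens_per_code_line():.1f}")
    print(f"350,000 words fit: {fits_in_context(350_000)}")
```

By this estimate, a token averages a little under one word, and a line of code runs to a few dozen tokens.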
As an added upside, large-context models can, hypothetically at least, better grasp the narrative flow of the data they take in, generate contextually richer responses and reduce the need for fine-tuning and factual grounding. Gemini 1.5 Pro is multilingual and multimodal: it can understand images and videos and, as of Tuesday, audio streams in addition to text. As a result, the model can also analyze and compare content in media such as TV shows, movies, radio broadcasts and conference call recordings across different languages. In a pre-recorded demo earlier this year, Google showed Gemini 1.5 Pro searching the transcript of the Apollo 11 moon landing telecast (about 400 pages) for quotes containing jokes, and then finding a scene in movie footage that resembled a pencil sketch.
Or read this on TechCrunch