Grounding AI in reality with a little help from Data Commons

September 12, 2024
Jennifer Chen, Software Engineer, and Prem Ramaswami, Head of Data Commons, Google Technology & Society, Data Commons Team

Google's DataGemma models bridge the gap between large language models (LLMs) and real-world data by using the Data Commons knowledge graph to improve the factuality and trustworthiness of LLM responses. LLMs have revolutionized how we interact with information, but grounding their responses in verifiable facts remains a fundamental challenge.

Data Commons, a broad and openly available repository, continues to expand its global coverage and exemplifies what it means to make data AI-ready, providing a rich foundation for building more grounded and reliable AI. Using retrieval techniques, DataGemma helps LLMs access data sourced from trusted institutions (including governmental and intergovernmental organizations and NGOs) and incorporate it into their responses, mitigating the risk of hallucinations and improving the trustworthiness of their outputs. One such approach fine-tunes Gemma 2 to identify statistics within its responses and annotate each one with a call to Data Commons. The annotation includes a relevant query and the model's initial answer, so the two values can be compared.
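To make the annotation step concrete, here is a minimal Python sketch of how an annotated response might be post-processed. Everything in it is an assumption made for illustration: the `[DC("query") -> "answer"]` syntax, the `resolve_annotations` helper, and the in-memory `FAKE_STORE` are hypothetical and are not DataGemma's actual output format or the Data Commons API. The stored figure is a placeholder, not a real statistic.

```python
import re
from typing import Callable, Optional

# Hypothetical annotation syntax, assumed for illustration only:
#   [DC("<natural-language query>") -> "<model's initial answer>"]
ANNOTATION = re.compile(
    r'\[DC\("(?P<query>[^"]*)"\)\s*->\s*"(?P<guess>[^"]*)"\]'
)


def resolve_annotations(text: str, lookup: Callable[[str], Optional[str]]) -> str:
    """Replace each annotation with a retrieved value, keeping the
    model's initial answer alongside it for comparison."""

    def substitute(match: re.Match) -> str:
        query = match.group("query")
        guess = match.group("guess")
        retrieved = lookup(query)
        if retrieved is None:
            # Nothing retrieved: keep the model's answer, flagged as unverified.
            return f"{guess} (unverified)"
        return f"{retrieved} (model's initial answer: {guess})"

    return ANNOTATION.sub(substitute, text)


# Stand-in for a real data lookup; the value is a placeholder.
FAKE_STORE = {"population of Exampleland in 2020": "1,234,567"}


def fake_lookup(query: str) -> Optional[str]:
    return FAKE_STORE.get(query)


draft = (
    'Exampleland had [DC("population of Exampleland in 2020") '
    '-> "1.2 million"] residents.'
)
print(resolve_annotations(draft, fake_lookup))
```

Keeping the model's initial answer next to the retrieved value means a reader or a downstream check can see where the two disagree, which is the comparison the annotation is meant to enable.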
