Wikidata’s next leap: the open database powering tomorrow’s AI and Wikipedia

Many people have never heard of Wikidata, yet it’s a thriving knowledge graph that powers enterprise IT projects, AI assistants, civic tech, and even Wikipedia’s data backbone. As one of the world’s largest freely editable databases, it makes structured, freely reusable data available to developers, businesses, and communities tackling global challenges.

With a gleaming new API, an AI-ready initiative, and a long-standing vision of decentralization, Wikidata is redefining open data’s potential. This article explores its real-world impact through projects like AletheiaFact and Sangkalak, its many technical advances, and its community-driven mission to build knowledge “by the people, for the people,” while quietly but effectively enhancing Wikipedia’s global reach.

Wikidata’s impact: from enterprise to civic innovation

Launched in 2012 to support Wikipedia’s multilingual content, Wikidata today centralizes structured data, such as names, dates, and relationships, and streamlines updates across Wikipedia’s language editions. A single edit (like the name of a firm’s CEO) propagates to all linking pages, ensuring consistency for global enterprises and editors alike. Beyond Wikipedia, Wikidata’s machine-readable format makes it ideal for business-tech solutions and ripe for developer innovation.

Wikidata’s database includes over 1.3 billion structured facts and even more connections that link related data together. This massive scale makes it a powerful tool for developers. They can access the data using tools like SPARQL (a query language for exploring linked data) or the EventStreams API for real-time updates. The information is available in a wide variety of tool-friendly formats like JSON-LD, XML, and Turtle. Best of all, the data is freely available under CC0, the Creative Commons public domain dedication, making it easy for businesses and startups to build on.

Wikibase’s robust and open infrastructure drives transformative projects.
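
For developers curious what the SPARQL access mentioned above looks like in practice, here is a minimal Python sketch, not an official client. It assumes the public Wikidata Query Service endpoint at https://query.wikidata.org/sparql and the standard SPARQL JSON results layout. The helper names (build_query, build_request, parse_bindings, run_query) and the User-Agent string are hypothetical, chosen purely for illustration.

```python
import json
import re
import urllib.parse
import urllib.request

# Public Wikidata Query Service endpoint (assumed here; check the service docs).
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Hypothetical User-Agent; replace with your own tool name and contact details.
USER_AGENT = "example-wikidata-sketch/0.1 (mailto:you@example.org)"


def build_query(class_qid: str, limit: int = 5) -> str:
    """Return a SPARQL query listing items that are instances of class_qid."""
    if not re.fullmatch(r"Q[1-9]\d*", class_qid):
        raise ValueError(f"not a Wikidata item ID: {class_qid!r}")
    return (
        "SELECT ?item ?itemLabel WHERE {\n"
        f"  ?item wdt:P31 wd:{class_qid} .\n"  # P31 = "instance of"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def build_request(query: str) -> urllib.request.Request:
    """Wrap the query in a GET request that asks for JSON results."""
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    return urllib.request.Request(
        f"{WDQS_ENDPOINT}?{params}",
        headers={
            "User-Agent": USER_AGENT,
            "Accept": "application/sparql-results+json",
        },
    )


def parse_bindings(payload: dict) -> list[tuple[str, str]]:
    """Turn SPARQL JSON results into (item ID, label) pairs."""
    rows = []
    for binding in payload["results"]["bindings"]:
        # Item URIs end in the item ID, e.g. .../entity/Q42
        qid = binding["item"]["value"].rsplit("/", 1)[-1]
        label = binding.get("itemLabel", {}).get("value", qid)
        rows.append((qid, label))
    return rows


def run_query(query: str, timeout: float = 30.0) -> list[tuple[str, str]]:
    """Send the query over the network and return the parsed rows."""
    with urllib.request.urlopen(build_request(query), timeout=timeout) as resp:
        return parse_bindings(json.load(resp))
```

Calling run_query(build_query("Q146")) would ask the live service for a handful of items classed as house cats, the query service’s customary sample query. Separating query building and parsing from the network call keeps the pieces easy to test offline.
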
AletheiaFact, a São Paulo-based platform for verifying political claims, harnesses Wikidata’s records to drive civic transparency, empowering communities with trusted government insights and showcasing open knowledge’s transformative impact. In India, Wikidata was used to create a map of medical facilities in Murshidabad district, color-coded by type (sub-centers, hospitals, etc.), making healthcare access easier. In Bangladesh, Sangkalak opens up access to Bengali Wikisource texts, unlocking a trove of open knowledge for the region. These projects rely on a mix of SPARQL for fast queries, the REST API for synchronization, and Wikimedia’s Toolforge platform for free hosting, empowering even the smallest of teams to deploy impactful tools.

Many large tech companies also use Wikidata’s data. One example is WolframAlpha, which uses Wikidata through its WikidataData function, retrieving data such as chemical properties via SPARQL for computational tasks. This integration with free and open data streamlines data models, cuts redundancy, and boosts query accuracy for businesses, all with zero proprietary constraints.

Wikidata’s vision: scaling for a trusted, AI-driven future

Handling nearly 500,000 daily edits, Wikidata pushes the limits of MediaWiki, the software it shares with Wikipedia, and the team is working on several fronts to scale it. As part of this work, a new RESTful API has simplified data access, benefiting Paulina, a public domain book discovery tool, and LangChain, an AI framework with strong Wikidata support. Developers enjoy the API’s responsiveness, sparking excitement for Wikidata’s potential in everything from civic platforms like AletheiaFact to quirky experiments.

The REST API release has had immediate impact.
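
To give a feel for the REST-style access described above, here is a rough Python sketch rather than a definitive integration. It assumes the Wikibase REST API on Wikidata is served under /w/rest.php/wikibase/v1 and that an item response carries id, labels, descriptions, and statements keys; both assumptions should be checked against the current API reference. The function names and the User-Agent string are hypothetical.

```python
import json
import re
import urllib.request

# Assumed base path of the Wikibase REST API on Wikidata; the version
# segment may differ, so verify it against the current documentation.
REST_BASE = "https://www.wikidata.org/w/rest.php/wikibase/v1"

# Hypothetical User-Agent; replace with your own tool name and contact details.
USER_AGENT = "example-wikidata-sketch/0.1 (mailto:you@example.org)"


def item_url(qid: str) -> str:
    """Build the URL for a single item, e.g. Q42."""
    if not re.fullmatch(r"Q[1-9]\d*", qid):
        raise ValueError(f"not a Wikidata item ID: {qid!r}")
    return f"{REST_BASE}/entities/items/{qid}"


def summarize_item(item: dict, lang: str = "en") -> dict:
    """Reduce an item response to a few fields an application might show.

    Assumes labels and descriptions are plain language-to-text maps and
    that statements are grouped by property ID.
    """
    return {
        "id": item.get("id"),
        "label": item.get("labels", {}).get(lang),
        "description": item.get("descriptions", {}).get(lang),
        "statement_properties": sorted(item.get("statements", {})),
    }


def fetch_item(qid: str, timeout: float = 30.0) -> dict:
    """Fetch one item over the network and return the decoded JSON."""
    req = urllib.request.Request(
        item_url(qid),
        headers={"User-Agent": USER_AGENT, "Accept": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

A caller could run summarize_item(fetch_item("Q42")) to get a compact view of one entry. An agent framework would wrap a call like this as a tool so that a language model can look up facts on demand.
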
For example, developer Daniel Erenrich has used the API to integrate access to Wikidata’s data into LangChain, allowing AI agents to retrieve real-time, structured facts directly from Wikidata, which in turn helps generative AI systems ground their output in verifiable data. Another example is the aforementioned Paulina, which relies on the API to surface public domain literature from Wikisource, the Internet Archive and more, a fine demonstration of how easier access to open data can enrich cultural discovery.

Then there is the visionary leap of the Wikibase Ecosystem project, which enables organizations to store data in their own federated knowledge graphs using MediaWiki and Wikibase, interconnected according to Linked Open Data standards. Decentralizing the data reduces strain on Wikidata and lets it go on serving core data. With its vision of thousands of interconnected Wikibase instances, this project could create a global open data network, boosting Wikidata’s value for enterprises and communities.

The potential here is enormous: local governments, enterprises, libraries, research labs, and museums could each maintain their own Wikibase instance, contributing regionally relevant data while maintaining interoperability with global systems. Such decentralization makes the platform more resilient and more inclusive, offering open data stewardship at every scale.

Community events drive this mission. WikidataCon, organized by Wikimedia Deutschland and running from 31 October to 2 November 2025, unites developers, editors, and organizations in an effort to refine tools and data quality. Wikidata Days, local meetups and editathons foster collaboration and offer support for budding projects like Paulina.
These events embody Wikidata’s ethos of knowledge built by the people, for the people, and help it remain transparent and community-governed.

Wikidata and AI: the Embedding Project and beyond

The Wikidata Embedding Project is an effort to represent Wikidata’s structured knowledge as vectors, enabling generative AI systems to employ up-to-date, verifiable information. It aims to address persistent challenges in AI, such as hallucinations and outdated training data, by grounding machine outputs in curated, reliable sources. This could render applications like virtual assistants significantly more accurate, transparent, and aligned with public knowledge.

The next decade holds promising opportunities for Wikidata’s continued relevance. As enterprise needs become more complex and interconnected, the demand for interoperable, machine-readable, and trusted datasets will only grow. Wikidata is uniquely positioned to meet this demand: it remains free, open, community-driven, and technically adaptable.

Enterprise IT teams will find particular value in Wikidata’s real-time APIs and its nearly 10,000 external identifiers, which link entries across platforms like IMDb, Instagram, and national library systems. These links reduce duplication, streamline data integration, and bridge otherwise isolated datasets. Whether it’s mapping identities across services or enhancing AI with structured facts, Wikidata provides a scalable foundation that saves time and improves precision.

With AI chatbots and large language models now woven into everything from enterprise search to productivity software, the need for accurate, real-time information is more urgent than ever. Wikidata’s linked data embeddings could herald a new generation of AI tools that blend the speed of automation with the reliability of human-curated, public knowledge.

As AI reshapes the digital landscape, Wikidata stands out as a beacon of trust and collaboration.
By empowering developers, enterprises, and communities alike through projects like AletheiaFact and Sangkalak, it supports transparency, civic innovation, and educational equity. With the Embedding Project improving AI accuracy, the Wikibase Ecosystem enabling federated knowledge networks, and events like WikidataCon and Wikidata Days sparking global collaboration, Wikidata is building an accountable future full of open data. More than a knowledge graph, it’s a people-powered infrastructure for the trustworthy web.

This article was produced as part of TechRadarPro's Expert Insights channel where we feature the best and brightest minds in the technology industry today. The views expressed here are those of the author and are not necessarily those of TechRadarPro or Future plc. If you are interested in contributing, find out more here: https://www.techradar.com/news/submit-your-story-to-techradar-pro