AI web search risks: Mitigating business data accuracy threats

Over half of us now use AI to search the web, yet the stubbornly low data accuracy of common tools creates new business risks. While generative AI (GenAI) offers undeniable efficiency gains, a new investigation highlights a disparity between user trust and technical accuracy that poses specific risks to corporate compliance, legal standing, and financial planning.

For the C-suite, the adoption of these tools represents a classic ‘shadow IT’ challenge. According to a survey of 4,189 UK adults conducted in September 2025, around a third of users believe AI is already more important to them than standard web searching. If employees trust these tools for personal queries, they are almost certainly employing them for business research.

The investigation, conducted by Which?, suggests that unverified reliance on these platforms could be costly. Around half of AI users report trusting the information they receive to a ‘reasonable’ or ‘great’ extent. Yet, looking at the granularity of the responses provided by AI models, that trust is often misplaced.

The accuracy gap when using AI to search the web

The study tested six major tools – ChatGPT, Google Gemini (both standard and ‘AI Overviews’), Microsoft Copilot, Meta AI, and Perplexity – across 40 common questions spanning finance, law, and consumer rights.

Perplexity achieved the highest total score at 71 percent, closely followed by Google Gemini AI Overviews at 70 percent. In contrast, Meta AI scored the lowest at 55 percent. ChatGPT, despite its widespread adoption, received a total score of 64 percent, making it the second-lowest performer among the tools tested. This disconnect between market dominance and reliable output underlines the danger of assuming popularity equals performance in the GenAI space.

However, the investigation revealed that all of these AI tools frequently misread information or provided incomplete advice that could pose serious business risks.
For financial officers and legal departments, the nature of these errors is particularly concerning. When asked how to invest a £25,000 annual ISA allowance, both ChatGPT and Copilot failed to identify a deliberate error in the prompt regarding the statutory limit. Instead of correcting the figure, they offered advice that potentially risked breaching HMRC rules. While Gemini, Meta, and Perplexity successfully identified the error, the inconsistency across platforms necessitates a rigorous ‘human-in-the-loop’ protocol for any business process involving AI.

For legal teams, the tendency of AI to generalise regional regulations when used for web search presents a distinct business risk. The testing found it common for tools to overlook the fact that legal statutes often differ between UK regions, such as Scotland versus England and Wales.

Furthermore, the investigation highlighted an ethical gap in how these models handle high-stakes queries. On legal and financial matters, the tools infrequently advised users to consult a registered professional. For example, when queried about a dispute with a builder, Gemini advised withholding payment – a tactic that experts noted could place a user in breach of contract and weaken their legal position.

This “overconfident advice” creates operational hazards. If an employee relies on an AI for preliminary compliance checks or contract review without verifying the jurisdiction or legal nuance, the organisation could face regulatory exposure.

Source transparency issues

A primary concern for enterprise data governance is the lineage of information. The investigation found that AI search tools carry a high responsibility to be transparent, yet frequently cited sources that were vague, non-existent, or of dubious accuracy, such as old forum threads.
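To make the ‘human-in-the-loop’ idea concrete, a minimal sketch of one such guardrail – screening statutory figures in a prompt before an AI answer is trusted – could look like the following. The `check_prompt` helper and its limits table are hypothetical illustrations, not part of the investigation; the £20,000 figure reflects the current annual ISA allowance, but any real system should source limits from official HMRC guidance.

```python
import re

# Hypothetical table of statutory limits; a real system would source these
# from official guidance rather than hard-coding them.
STATUTORY_LIMITS_GBP = {"annual ISA allowance": 20_000}

def check_prompt(prompt: str) -> list[str]:
    """Return warnings when a figure in the prompt exceeds a known limit.

    This is the kind of deterministic pre-check the tested AI tools
    lacked: ChatGPT and Copilot accepted £25,000 without question.
    """
    warnings = []
    for name, limit in STATUTORY_LIMITS_GBP.items():
        if name in prompt:
            for figure in re.findall(r"£([\d,]+)", prompt):
                amount = int(figure.replace(",", ""))
                if amount > limit:
                    warnings.append(
                        f"£{amount:,} exceeds the {name} of £{limit:,}"
                    )
    return warnings

print(check_prompt("How should I invest my £25,000 annual ISA allowance?"))
# ["£25,000 exceeds the annual ISA allowance of £20,000"]
```

A check like this does not make the model smarter; it simply ensures a known-bad premise is flagged for a human before any AI-generated advice is acted upon.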
This opacity can lead to financial inefficiency. In one test regarding tax codes, ChatGPT and Perplexity presented links to premium tax-refund companies rather than directing the user to the free official HMRC tool. These third-party services are often characterised by high fees.

In a business procurement context, such algorithmic steering could lead to unnecessary vendor spend, or engagement with service providers that fail to meet corporate due diligence standards.

The major technology providers acknowledge these limitations, placing the burden of verification firmly on the user – and, by extension, the enterprise. A Microsoft spokesperson emphasised that their tool acts as a synthesiser rather than an authoritative source. “Copilot answers questions by distilling information from multiple web sources into a single response,” the company noted, adding that they “encourage people to verify the accuracy of content.”

OpenAI, responding to the findings, said: “Improving accuracy is something the whole industry’s working on. We’re making good progress and our latest default model, GPT-5, is the smartest and most accurate we’ve built.”

Mitigating AI business risk through policy and workflow

For business leaders, the path forward is not to ban AI tools – a move that often backfires by driving usage further into the shadows – but to implement robust governance frameworks that ensure the accuracy of their output when used for web search:

Enforce specificity in prompts: The investigation notes that AI is still learning to interpret prompts. Corporate training should emphasise that vague queries yield risky data. If an employee is researching regulations, they must specify the jurisdiction (e.g., “legal rules for England and Wales”) rather than assuming the tool will infer the context.

Mandate source verification: Trusting a single output is operationally unsound. Employees must demand to see sources and check them manually.
The study suggests that for high-risk topics, users should verify findings across multiple AI tools, or ‘double source’ the information. Tools like Google’s Gemini AI Overviews, which allow users to review the presented web links directly, scored slightly better because they facilitate this verification process.

Operationalise the ‘second opinion’: At this stage of technical maturity, GenAI outputs should be viewed as just one opinion among many. For complex issues involving finance, law, or medical data, AI lacks the ability to fully comprehend nuance. Enterprise policy must dictate that professional human advice remains the final arbiter for decisions with real-world consequences.

AI tools are evolving and their web search accuracy is gradually improving, but as the investigation concludes, relying on them too heavily right now could prove costly. For the enterprise, the difference between a business efficiency gain from AI and a compliance failure lies in the verification process.

The post AI web search risks: Mitigating business data accuracy threats appeared first on AI News.
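The ‘double source’ recommendation can also be reduced to a simple programmatic gate: query more than one tool and escalate to a human whenever the answers disagree on the figures that matter. The sketch below is a hypothetical illustration – the `ask_tool_*` functions stand in for real AI tool clients, and comparing cited £ figures is an assumed heuristic, not the study's method.

```python
import re

# Hypothetical stand-ins for two real AI search tool clients.
def ask_tool_a(question: str) -> str:
    return "The annual ISA allowance is £20,000."

def ask_tool_b(question: str) -> str:
    return "You can invest up to £20,000 per tax year across your ISAs."

def needs_human_review(answers: list[str]) -> bool:
    """Escalate when independently sourced answers disagree on key figures."""
    figures = {tuple(re.findall(r"£[\d,]+", a)) for a in answers}
    # More than one distinct set of cited figures means the tools
    # disagree, so the query should be routed to a professional.
    return len(figures) != 1

question = "How much can I invest in ISAs each tax year?"
answers = [ask_tool_a(question), ask_tool_b(question)]
print(needs_human_review(answers))  # both answers cite £20,000, so False
```

The point is not the string-matching heuristic itself, but the workflow shape: agreement across independent sources lets routine queries pass, while any disagreement forces the ‘second opinion’ the investigation recommends.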