Introducing Watchdog: Your Brand's AI Fact-Checker
AI platforms are telling your customers about your brand right now. Some of what they say is accurate. Some of it is fabricated. Watchdog monitors what AI engines claim about your brand, verifies it against ground truth, and alerts you when something is wrong. It builds a fact profile of your company and gets more precise with every monitoring run.
This is the first tool purpose-built for detecting AI hallucinations about your brand, not just tracking whether you get mentioned.
TL;DR
Watchdog is the first tool purpose-built for detecting AI hallucinations about your brand. It builds a verified fact profile of your company, classifies every AI response for factual accuracy, and generates alerts when AI platforms contradict your ground truth. A learning period establishes baselines, and semantic similarity suppression reduces false positives over time.

AI Hallucinations Are a Brand Problem
ChatGPT has over 400 million weekly active users. Perplexity, Gemini, and Copilot are growing fast. When these tools answer questions about your industry, they make factual claims about your company. Sometimes those claims are wrong.
In February 2024, a Canadian tribunal ordered Air Canada to pay damages after its AI chatbot fabricated a bereavement discount policy that did not exist. The company argued the chatbot was a separate entity. The tribunal disagreed. Google lost $100 billion in market value when Bard hallucinated a false claim in a promotional demo. In October 2025, Deloitte issued a partial refund on a A$440,000 government report after AI-generated content was found to contain fabricated academic sources and a fake court quote.
These are high-profile cases. The smaller ones go undetected. A SaaS company discovers ChatGPT is telling prospects it has features that only exist in a competitor's product. An automotive brand finds AI models citing safety recalls that never happened. A financial services firm sees fabricated executive quotes attributed to its CEO.
According to MIT research published in January 2025, AI models use more confident language when hallucinating than when stating facts. The wrong information sounds more authoritative than the right information. That makes these errors harder for consumers to catch on their own.
The Numbers Are Worse Than You Think
The Vectara hallucination leaderboard tracks factual accuracy across major LLMs. Even the best-performing models hallucinate at measurable rates. The average general knowledge hallucination rate sits around 9.2%. Some models reach nearly 30%.
| Model | Hallucination Rate | Context |
|---|---|---|
| Gemini 2.0 Flash | 0.7% | Lowest measured (April 2025) |
| GPT-5 | ~8% | Summarization and reasoning tasks |
| Average across models | ~9.2% | General knowledge queries |
| Falcon-7B-Instruct | ~29.9% | Highest measured |
These rates apply to general knowledge. Brand-specific facts, which are less represented in training data, are likely hallucinated at higher rates. If your company changed its pricing six months ago, raised a funding round, or launched a new product, AI models may not reflect those changes for months or years.
A KPMG global study from 2025 found that less than half of respondents (46%) are willing to trust AI. Trust is declining even as adoption grows. Every hallucinated fact erodes the trust your customers place in the information AI delivers about you.
Visibility Tools Track Mentions. They Do Not Verify Facts.
The AI visibility monitoring space has grown significantly. Profound, Otterly.ai, Peec AI, Semrush, and others now track brand mentions across AI platforms. They answer a valid question: is my brand showing up in AI responses?
That question matters. But it is only half the picture. Being mentioned is useless if the mention contains a fabricated pricing claim, an outdated product description, or a nonexistent partnership with a competitor.
| Capability | Visibility Tools | Sill Watchdog |
|---|---|---|
| Track brand mentions | Yes | Yes |
| Share of voice metrics | Yes | Yes (via monitoring) |
| Detect factual errors | No | Yes |
| Verify claims against ground truth | No | Yes |
| Flag competitor framing | Limited | Yes |
| Detect single-platform anomalies | No | Yes |
| Learn and suppress false positives | No | Yes |
Most tools in this space, including Profound (which raised $96M at a $1B valuation in February 2026), Otterly.ai (starting at $29/month), and Peec AI (backed by a EUR 21M Series A), focus on visibility analytics and sentiment. They are useful tools for understanding reach. They do not tell you whether what AI says about you is true.
What Watchdog Does
Watchdog starts by building a fact profile of your company. It researches your brand across the web, gathering verified facts about your identity, leadership, financials, market position, products, and recent news. Each fact is tagged with a confidence level and source freshness. You can review, edit, and add your own facts to the profile.
Every time Sill runs a monitoring cycle, Watchdog classifies every AI response where your brand was mentioned. It extracts two things: caveats (qualifying statements that limit the recommendation) and factual claims (concrete assertions about your company). Each claim is compared against your fact profile to determine whether it is accurate, unverified, or a contradiction.
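To make that comparison step concrete, here is a rough sketch in Python. Everything in it (the `Fact` record, `classify_claim`, the exact-match lookup) is illustrative rather than Watchdog's actual internals; real claim matching is semantic, not a string comparison.

```python
from dataclasses import dataclass
from enum import Enum

class Verdict(Enum):
    ACCURATE = "accurate"            # matches a verified fact
    UNVERIFIED = "unverified"        # no fact in the profile covers it
    CONTRADICTION = "contradiction"  # conflicts with a verified fact

@dataclass
class Fact:
    topic: str         # e.g. "starting_price"
    value: str         # the verified ground truth
    confidence: float  # how well-sourced the fact is
    source_url: str

def classify_claim(topic: str, claimed: str, profile: dict[str, Fact]) -> Verdict:
    """Compare one extracted claim against the fact profile."""
    fact = profile.get(topic)
    if fact is None:
        return Verdict.UNVERIFIED  # nothing to check against
    if claimed.strip().lower() == fact.value.strip().lower():
        return Verdict.ACCURATE
    return Verdict.CONTRADICTION

# An AI platform claims your product starts at $99/month:
profile = {"starting_price": Fact("starting_price", "$49/month", 0.95,
                                  "https://example.com/pricing")}
print(classify_claim("starting_price", "$99/month", profile))  # Verdict.CONTRADICTION
```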
Watchdog uses semantic deduplication to group similar claims and caveats together, so you see patterns rather than noise. If three different AI platforms phrase the same pricing error in slightly different ways, Watchdog recognizes it as one issue, not three.
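A toy version of that grouping logic, assuming an embedding function is supplied (the `embed` parameter below is a placeholder, not a real API): claims whose vectors clear a cosine-similarity threshold collapse into one group.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def group_claims(claims: list[str], embed, threshold: float = 0.85) -> list[list[str]]:
    """Greedy grouping: attach each claim to the first group whose
    representative vector is similar enough, else start a new group."""
    groups: list[tuple[list[float], list[str]]] = []
    for claim in claims:
        vec = embed(claim)
        for rep_vec, members in groups:
            if cosine(vec, rep_vec) >= threshold:
                members.append(claim)  # same issue, different phrasing
                break
        else:
            groups.append((vec, [claim]))  # a genuinely new claim
    return [members for _, members in groups]
```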
Seven Types of Alerts
Watchdog generates alerts across seven categories. Each one represents a different kind of risk to your brand in AI outputs.
| Alert Type | What It Means |
|---|---|
| Factual Contradiction | AI states something that directly conflicts with a verified fact in your profile. Wrong CEO name, outdated pricing, fabricated product features. |
| Caveat Detected | AI consistently qualifies its recommendation of your brand with a specific concern, such as pricing, complexity, or reliability. |
| Competitive Framing | A competitor is repeatedly positioned as the solution to one of your brand's perceived weaknesses. |
| Single-Model Anomaly | A claim appears on only one AI platform and nowhere else, suggesting a platform-specific hallucination rather than a real perception. |
| Novel Claim | A new factual assertion about your brand that has never appeared before. Could be accurate or could be the start of a hallucination spreading. |
| Competitor Discovered | A competitor that was not previously present starts appearing consistently in AI responses alongside your brand. |
| Presence Change | Your brand's presence on an external platform changed significantly, such as gaining or losing a Wikipedia page or a notable shift in third-party coverage. |
Each alert includes the exact AI response that triggered it, the platforms involved, and how many times the issue has recurred. For factual contradictions, Watchdog shows both what the AI claimed and what your verified fact profile says.
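In data terms, an alert might carry fields like the ones below. The schema is illustrative, not Watchdog's actual one.

```python
from dataclasses import dataclass

@dataclass
class Alert:
    alert_type: str                  # one of the seven categories above
    triggering_response: str         # the exact AI output that fired it
    platforms: list[str]             # e.g. ["chatgpt", "perplexity"]
    occurrence_count: int            # how many times the issue has recurred
    claimed: str | None = None       # what the AI asserted (contradictions only)
    ground_truth: str | None = None  # what the fact profile says
```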
It Gets Smarter as You Use It
Watchdog does not fire alerts on day one. The first three monitoring runs are a learning period. During this phase, Watchdog classifies every AI response and builds baselines: what caveats appear consistently, which claims recur, and how different platforms talk about your brand. No alerts are shown during this phase to avoid false positives from incomplete data.
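The gating itself is conceptually simple. A sketch, with the run count taken from the description above and everything else assumed:

```python
LEARNING_RUNS = 3  # the first three runs only build baselines

def maybe_emit(run_number: int, fingerprints: list[str], baseline: set[str]) -> list[str]:
    """During the learning period, record what recurs instead of alerting;
    afterwards, surface only what the baseline does not already explain."""
    if run_number <= LEARNING_RUNS:
        baseline.update(fingerprints)  # establish what is "normal"
        return []                      # nothing is shown to the user yet
    return [fp for fp in fingerprints if fp not in baseline]
```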
Starting from the fourth run, alerts begin. But the learning does not stop. Every action you take teaches Watchdog what matters to you.
When you dismiss an alert, Watchdog uses embedding similarity to suppress semantically equivalent alerts in the future. If you dismiss "pricing is expensive," it will also suppress "the cost is on the higher end" and "not the most affordable option" without you having to dismiss each variant individually. You can set suppression to be permanent or temporary (one week, one month, or three months).
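A minimal sketch of that suppression check, reusing the `cosine` helper from the grouping example and assuming dismissed alerts are stored with an embedding and an optional expiry (all names hypothetical):

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Dismissal:
    embedding: list[float]
    expires_at: datetime | None  # None = permanent suppression

def is_suppressed(alert_vec: list[float], dismissals: list[Dismissal],
                  threshold: float = 0.85) -> bool:
    """True if the new alert is semantically close to any active dismissal."""
    now = datetime.now()
    return any(
        (d.expires_at is None or d.expires_at > now)
        and cosine(alert_vec, d.embedding) >= threshold
        for d in dismissals
    )

# Dismissing "pricing is expensive" for one month would look like:
# Dismissal(embedding=embed("pricing is expensive"),
#           expires_at=datetime.now() + timedelta(days=30))
```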
When you confirm a factual contradiction, Watchdog updates your fact profile and strengthens its ground truth for future comparisons. When you tell it "this is actually correct," it adjusts its records to match what the AI got right. Over time, the fact profile becomes a living document that reflects the most up-to-date truth about your company.
This feedback loop means that after a few weeks of use, Watchdog surfaces fewer false positives and catches more genuine issues. The signal-to-noise ratio improves with every review cycle.
The Fact Profile: Your Brand's Source of Truth
The fact profile is the foundation of everything Watchdog does. It contains structured, verified facts about your company organized into four categories: identity, leadership, financials, and market position. Each fact includes a source URL, a confidence rating, and a freshness indicator.
Watchdog builds the initial profile automatically by researching your brand across the web. You can then review every fact, correct errors, add information the research missed, and set how quickly facts should be considered outdated. A fast-moving startup might set "current" to 30 days and "recent" to 90 days. A more established company might use 90 and 365 days.
When the classifier encounters a claim that contradicts a fact marked "outdated," it treats the conflict with more nuance. A company that changed its CEO two years ago might legitimately have outdated references. A pricing change from last week is a different kind of problem. Freshness metadata makes this distinction possible.
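A sketch of how that bucketing might work, using the 30/90-day thresholds from the startup example above (the function and its defaults are assumptions, not Watchdog's implementation):

```python
from datetime import date

def freshness(last_verified: date, current_days: int = 30,
              recent_days: int = 90) -> str:
    """Bucket a fact by the age of its most recent verification."""
    age = (date.today() - last_verified).days
    if age <= current_days:
        return "current"
    if age <= recent_days:
        return "recent"
    return "outdated"  # contradictions here are treated with more nuance
```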
35% of Brands Say AI Inaccuracies Have Hurt Their Reputation. The Stakes Are Real.
The World Economic Forum's 2025 Global Risks Report ranked AI misinformation as a top concern for the second consecutive year. G2's 2025 Buyer Behavior report found that AI-generated recommendations are the number one source influencing B2B vendor shortlists. A Resolver survey found that 35% of brands report inaccurate AI responses have damaged their reputation.
These numbers will grow. As AI search grows from today's 2-6% share of organic traffic toward the 25% migration Gartner has predicted, the cost of unchecked hallucinations grows proportionally. A wrong fact in a Google result at least has a link you can click to verify. A wrong fact in a ChatGPT response is presented as authoritative truth with no obvious source to check.
RAG (retrieval-augmented generation) reduces hallucination rates by roughly 71% when used properly. But brand-specific information often falls outside RAG knowledge bases, especially for mid-market companies without Wikipedia pages or extensive press coverage. The brands most vulnerable to hallucinations are the ones least likely to have robust information available for AI retrieval.
Getting Started with Watchdog
Watchdog is available to all Sill monitoring subscribers. If you already have monitoring set up, Watchdog activates automatically. Your fact profile begins building as soon as your monitoring configuration is saved.
The first step is reviewing your fact profile. Watchdog's automated research is thorough, but you know your company better than any web scraper. Add any facts it missed, correct any it got wrong, and set your freshness preferences. The more complete and accurate your fact profile, the more useful Watchdog's alerts will be.
After three monitoring runs (typically three days), alerts start appearing. Review them in the Watchdog dashboard, where you can filter by alert type, platform, and status. Confirm contradictions, dismiss known caveats, and watch the system learn your preferences.
Know What AI Is Saying About Your Brand
Watchdog monitors every AI response about your brand, verifies factual claims against ground truth, and alerts you when something is wrong. Start monitoring today.
References
- Moffatt v. Air Canada. BC Civil Resolution Tribunal, 2024. Air Canada ordered to pay $812 for chatbot misrepresentation. americanbar.org
- Vectara. "Hallucination Leaderboard." LLM factual accuracy benchmarks across major models. github.com/vectara
- KPMG. "Trust in AI Global Study, 2025." 46% of respondents (less than half) willing to trust AI. kpmg.com
- MIT. "AI Confidence in Hallucinations, January 2025." AI models use more confident language when hallucinating than when providing factual information.
- G2. "Buyer Behavior in 2025." AI recommendations rank #1 in B2B vendor shortlist influence. company.g2.com
- World Economic Forum. "Global Risks Report 2025." AI misinformation ranked as top short-term and medium-term concern for the second year.
- Harvard Law School Forum on Corporate Governance. "Misinformation and Disinformation in the Digital Age, May 2025." corpgov.law.harvard.edu
