Why Your Reports Are Wrong: Data Quality Issues You're Missing

A VP of Sales pulls up the Q4 pipeline report, presents it to the board, and commits to a number. Three weeks later, the forecast is off by 30%. Not because the market shifted — because the data underneath was garbage. Deals with no amounts, contacts in the wrong lifecycle stage, leads attributed to "direct traffic" because nobody tagged the campaign properly.

This happens more than anyone wants to admit. The reports look clean. The dashboards are polished. But the data feeding them is full of holes, inconsistencies, and outright fiction. And HubSpot won't tell you — it just reports what it has.

Lifecycle Stage Chaos

This is the single biggest source of bad reporting we see. Here's how it usually goes wrong:

Marketing says an MQL is a contact who downloaded two resources and visited the pricing page. Sales says an MQL is anyone who looks vaguely interested. There's no written definition, so three different workflows set lifecycle stage based on three different interpretations. A contact downloads a whitepaper, gets set to MQL by workflow A. Then a deal gets created and workflow B sets them to SQL. Then workflow C runs a nightly cleanup and resets them to Lead because they don't meet the lead score threshold.

Your MQL-to-SQL conversion report? Meaningless. Your funnel metrics? Fiction.

The fix is boring but necessary: write down your lifecycle definitions, get sales and marketing to agree on them, implement them in one workflow (not three), and restrict who can change lifecycle stage manually.
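To make "one workflow, not three" concrete, here is a minimal Python sketch of a single function that owns the stage decision. It is an illustration, not HubSpot workflow configuration. The field names (resources_downloaded, visited_pricing_page, has_open_deal, lifecycle_stage), the "open deal means SQL" rule, and the policy that stages never move backward are all assumptions made for the example. The MQL rule is the marketing definition quoted above.

```python
# Ordered stages; a higher index means further along the funnel.
STAGES = ["lead", "mql", "sql"]


def target_stage(contact: dict) -> str:
    """Return the stage a contact qualifies for under one written definition."""
    # Assumed rule: an open deal makes the contact sales-qualified.
    if contact.get("has_open_deal"):
        return "sql"
    # Marketing's definition: two resource downloads plus a pricing page visit.
    if contact.get("resources_downloaded", 0) >= 2 and contact.get("visited_pricing_page"):
        return "mql"
    return "lead"


def next_stage(contact: dict) -> str:
    """Apply the definition, but never move a contact backward (assumed policy)."""
    current = contact.get("lifecycle_stage", "lead")
    proposed = target_stage(contact)
    # Keep whichever stage is further along the funnel.
    return max(current, proposed, key=STAGES.index)
```

Because every stage change goes through next_stage, a nightly cleanup cannot quietly reset an SQL to Lead. If you do want stages to move backward, that becomes an explicit change to one function instead of a side effect of a third workflow.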

Attribution Black Holes

Pull up a random sample of 100 contacts. Check the "Original Source" field. In a typical portal, 25-40% will show "Offline Sources" or "Direct Traffic" — which basically means "we don't know where they came from."

Every one of those contacts is invisible to your channel performance reports. If 35% of your contacts have no real attribution, your "which channels drive revenue" analysis is based on 65% of the picture. You might be underfunding your best channel because its contacts keep showing up as "direct."
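The sample check can be run as a quick script against an export. The sketch below assumes each contact is a dict with a hypothetical original_source key holding the display label. The two "unknown" labels are the ones named above, and blank values are treated as unknown as well.

```python
from collections import Counter

# Labels treated as "we don't know where they came from" (blank counts too).
UNKNOWN_SOURCES = {"Offline Sources", "Direct Traffic", ""}


def attribution_summary(contacts: list[dict]) -> dict:
    """Count contacts per source and report the share with no real attribution."""
    counts = Counter((c.get("original_source") or "").strip() for c in contacts)
    total = sum(counts.values())
    unknown = sum(n for source, n in counts.items() if source in UNKNOWN_SOURCES)
    share = unknown / total if total else 0.0
    return {"total": total, "unknown": unknown, "unknown_share": share}
```

An unknown_share of 0.35 means channel reports only see the remaining 65% of contacts.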

Common culprits: links in emails that strip UTM parameters, form submissions from landing pages without tracking codes, contacts imported from events without source data, and integrations that create contacts without passing source information. None of these are hard to fix individually. But nobody fixes them because nobody notices until the report doesn't add up.

Deal Amount Problems

Run this report right now: open deals grouped by pipeline stage, showing total amount. Then filter for deals where amount is empty or $0. In most portals, 15-25% of open deals have no amount at all.

That means your pipeline value is understated by whatever those deals are actually worth. Your weighted forecast is wrong. Your average deal size metric is wrong (it only averages the deals where someone bothered to fill in the field). Your revenue projection is a guess built on incomplete data.

Some teams use $1 as a placeholder so the deal shows up in reports. Others put in the "best case" number. Some put in the annual value, others the monthly value. When those all land in the same report, the total is nonsensical.
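A rough way to size the problem from a deal export is sketched below. It assumes each deal is a dict with a numeric amount (or None), and that anything at or below a placeholder_max of 1 is a placeholder. Both the data shape and the threshold are assumptions to adjust for your portal.

```python
def audit_deal_amounts(deals: list[dict], placeholder_max: float = 1.0) -> dict:
    """Split open deals into missing, placeholder, and usable amounts."""
    missing, placeholder, usable = [], [], []
    for deal in deals:
        amount = deal.get("amount")
        if amount is None or amount == 0:
            missing.append(deal)
        elif amount <= placeholder_max:
            placeholder.append(deal)
        else:
            usable.append(deal)
    total = len(deals)
    return {
        "missing_share": len(missing) / total if total else 0.0,
        "placeholder_share": len(placeholder) / total if total else 0.0,
        # Average over deals with a real amount only; this is the figure a
        # report shows when blank amounts are silently skipped.
        "avg_usable_amount": (
            sum(d["amount"] for d in usable) / len(usable) if usable else None
        ),
    }
```

This does not catch the annual-versus-monthly mix-up. That needs a written convention for the amount field, because the numbers themselves look valid.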

Date Field Confusion

HubSpot has multiple date fields on deals: create date, close date, last modified date. When you build a time-based report, which one are you using?

We worked with a company whose "deals closed this quarter" report was showing 40% more revenue than they actually closed. The issue: they were filtering on "close date" but some reps were updating the close date on old deals when they pushed them to a new quarter. So a deal originally expected to close in Q2, pushed to Q3, then pushed again to Q4, showed up in Q4's report after having already been counted in the Q2 and Q3 reports when those were first run. And because a deal holds only one close date at a time, anyone who reran the Q2 or Q3 report retroactively would find the deal gone, so the historical numbers no longer matched what had been presented.

Use create date for "when did this opportunity enter the pipeline." Use close date only for deals that have actually closed, so open deals with a pushed close date don't count as closed revenue. And consider a custom "original close date" property if you want to track forecast accuracy.
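The "original close date" idea can be sketched in a few lines of Python. The property names (original_close_date, push_count), the dict-based deal shape, and the "closedwon" stage value are assumptions for illustration; in a real portal this logic would live in a workflow or integration.

```python
from datetime import date


def quarter(d: date) -> str:
    """Label a date with its calendar quarter, e.g. 2024-Q2."""
    return f"{d.year}-Q{(d.month - 1) // 3 + 1}"


def push_close_date(deal: dict, new_close: date) -> dict:
    """Move the close date while preserving the first forecast and counting pushes."""
    updated = dict(deal)
    # Record the original forecast only once, the first time the date moves.
    updated.setdefault("original_close_date", deal["close_date"])
    updated["close_date"] = new_close
    updated["push_count"] = deal.get("push_count", 0) + 1
    return updated


def closed_in_quarter(deals: list[dict], label: str) -> list[dict]:
    """Only closed-won deals count as 'closed this quarter'."""
    return [
        d for d in deals
        if d.get("stage") == "closedwon" and quarter(d["close_date"]) == label
    ]
```

With the original date preserved, forecast accuracy becomes a comparison between original_close_date and the date the deal actually closed, and push_count shows which deals keep sliding.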

Duplicate Inflation

If 15% of your contact records are duplicates (which is normal; we've written about why this matters), every metric that counts contacts is overstated. Only 85% of the records are distinct people, so the raw count runs about 18% above the true number. Your "new contacts this month" number is overstated. Your email engagement rates are diluted. Your conversion rates are understated because the denominator is too big.

The worst part: duplicates split activity history. Contact A has 3 page views and 2 email opens. Contact B (same person) has a form submission and a meeting. Neither record tells the full story. Any lead scoring model built on incomplete activity data is making bad decisions.
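Two small sketches make this concrete: how much larger the raw count is than the true one for a given duplicate share, and what a combined activity view looks like once records are grouped. Matching on a lowercased, trimmed email is an assumption made for the example (real deduplication usually needs more than that), and the activity dict shape is hypothetical.

```python
from collections import defaultdict


def inflation_over_true_count(duplicate_share: float) -> float:
    """How far the raw record count exceeds the true count, as a fraction."""
    return duplicate_share / (1 - duplicate_share)


def merge_activity_by_email(contacts: list[dict]) -> dict:
    """Combine activity counters across records that share a normalized email."""
    merged = defaultdict(lambda: defaultdict(int))
    for contact in contacts:
        key = contact["email"].strip().lower()
        for activity, count in contact.get("activity", {}).items():
            merged[key][activity] += count
    return {email: dict(acts) for email, acts in merged.items()}
```

Measured against the true number of people, the overcount is slightly larger than the duplicate share itself. For Contact A and Contact B above, the merged view holds all four activity types on one record, which is what a lead scoring model should be looking at.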

What to Do About It

You don't fix all of this in a week. Start with the problem that's causing the most visible damage — usually lifecycle stage definitions or deal amount completeness — and work outward.

  1. Pick one metric your leadership team relies on.
  2. Trace it back to the underlying data. Check for gaps, conflicts, and inconsistencies.
  3. Fix the data entry point, not just the existing data. If deal amounts are missing, make the field required at a specific stage.
  4. Monitor the metric weekly for a month to verify it's improving.
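Step 3 is the one that tends to get skipped, so here is a minimal sketch of a "required at a specific stage" check. The stage name, the property names, and the REQUIRED_AT_STAGE mapping are hypothetical. In HubSpot you would normally configure required properties in the pipeline settings; this only illustrates the rule being enforced.

```python
# Hypothetical mapping of stage -> properties that must be filled before entering it.
REQUIRED_AT_STAGE = {
    "proposal": ["amount", "close_date"],
}


def missing_required(deal: dict, stage: str) -> list[str]:
    """List required properties that are empty for the stage a deal is moving into."""
    required = REQUIRED_AT_STAGE.get(stage, [])
    # Treat None, empty string, and 0 as "not filled in".
    return [name for name in required if deal.get(name) in (None, "", 0)]
```

Counting how many deals return a non-empty list each week also gives you a simple number to monitor in step 4.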

Our free HubSpot audit tool scans for the most common data quality issues — duplicate rates, missing fields, lifecycle stage consistency — and gives you a starting point. It's not a complete data quality assessment, but it shows you where to look first.


Need help getting your HubSpot data and reports back on track? Check out our services or book a discovery call.
