CASE STUDY · 2025–2026 · ACTIVE ENGAGEMENT · 80M+ MONTHLY VISITS
How we got an 80M-visit holiday authority out of the research business.
Holidaycalendar.io is the global authority on cultural holidays: 80M+ visits per month across 6,000+ observances. When they expanded coverage to include deals last year, two full-time researchers were absorbed scouring the web for which brands were running which promotions. We built a system that does the discovery and grades what it finds, scoring every candidate on source quality, independent corroboration, and specificity. The editorial team got their week back. They confirm what the system surfaces and write the articles their audience comes for.
M01 · KPI
80M+
Monthly visits served
global cultural-holiday authority
M02 · KPI
6,000+
Holidays covered
global · every observance
M03 · KPI
2 FT
Researchers freed
discovery moved to the system
M04 · KPI
100%
Source-traced deals
every published row carries provenance
EXHIBIT 01
Two researchers, full-time, and the deal surface kept growing.
Holidaycalendar.io is known for editorial depth: history, origin, ways to celebrate. When they expanded into deal coverage last year, two full-time researchers were absorbed in scouring social media, press releases, news, and brand pages for every holiday window. The work was multi-channel and multi-pass, and every year more brands hooked campaigns to more holidays. The team didn't want more researchers. They wanted the editorial calendar back.
⊘ Status quo
Manual research that doesn't scale. Editorial time spent hunting, not writing.
×01 Two full-time researchers on deal discovery, not on editorial writing
×03 Speculation everywhere. 'What to expect for next year' articles look identical to real deal coverage
×04 Brand trend rising. Every year, more brands hooking campaigns to more holidays
→ Mandate
Automate the discovery. Verify what's real. Hand the team a queue to approve.
→01 Programmatic search across every channel a researcher used to check
→02 Confidence scoring on each candidate: source quality, independent corroboration, specificity
→03 Editorial queue of high-confidence deals, ready to write up
→04 Full source provenance attached to every published row
EXHIBIT 02
Discovery is automatable. The hard part is knowing what's real.
Most of what the web says about a holiday isn't a deal. It's speculation (what to expect for next year), the same offer paraphrased across five articles, or a roundup citing one primary source. We built confidence scoring that grades every candidate on three signals: source quality (is this a known reliable source?), independent corroboration (how many distinct sources confirm it?), and specificity (real dates, real mechanics?). High-confidence candidates surface for editorial review. Speculation and noise get rejected before they reach the queue.
DIAGRAM · 4-STAGE VERIFICATION SYSTEM
Source quality, independent corroboration, and specificity together separate real deals from speculation.
EXHIBIT 03
Two researchers' work absorbed. Deal volume up. Editorial team back to writing.
HEADLINE METRIC
The system absorbed the discovery surface. Two full-time researchers' worth of weekly research now runs on cron, scoring every candidate before it hits the editorial queue. The team's calendar shifted from finding deals to confirming them and writing the articles, which is what their audience comes to Holidaycalendar.io for. Deal volume went up because the system surfaces more candidates per holiday than two researchers could check, and coverage stays comprehensive across all 6,000+ holidays.
CHART · Discovery hours per week · two-researcher manual vs. system actual
Series: Forecast (board commit) · Actual
Two FT researchers' discovery work, absorbed by the system
"We didn't ship until the system could explain itself. Every published deal answers who said it, when, and why we kept it."
Engagement retrospective · Boost
Holidaycalendar Deal-Finder · #4
EXHIBIT 04
Two patterns the system handles so the editorial team doesn't have to.
EVIDENCE
Two recurring patterns from the operating logs. Each one shows the system doing work a human would otherwise have to.
Pattern · A
Speculation, rejected before it reaches the queue.
→ Source filtering · pre-extraction
The web is full of articles predicting what brands will do for upcoming holidays. They look identical to real deal coverage, and they fool naive extractors. The system flags speculation language and 'next year' framings at the URL stage and rejects the source before extraction runs. None of it reaches the editorial queue.
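A minimal sketch of what a pre-extraction speculation gate could look like. The phrase list, the function names, and the choice to match on URL slug plus title are illustrative assumptions; the case study does not describe the actual rules.

```python
import re

# Hypothetical phrase list; the production rules are not described in the source.
SPECULATION_PATTERNS = [
    r"\bwhat to expect\b",
    r"\bnext year\b",
    r"\bpredict(?:s|ed|ing|ion|ions)?\b",
    r"\bexpected to\b",
]
_SPECULATION_RE = re.compile("|".join(SPECULATION_PATTERNS), re.IGNORECASE)


def looks_speculative(url: str, title: str = "") -> bool:
    """Return True when the URL slug or title reads like a forecast, not a live deal."""
    # Slugs separate words with hyphens, underscores, or slashes; turn those
    # into spaces so the word-boundary patterns can match.
    slug_text = re.sub(r"[-_/+]+", " ", url)
    return bool(_SPECULATION_RE.search(f"{slug_text} {title}"))


def filter_url_queue(candidates):
    """Keep only the (url, title) pairs that pass the speculation gate."""
    return [(url, title) for url, title in candidates if not looks_speculative(url, title)]
```

Running the gate on the URL queue rather than on extracted rows means a rejected source never incurs extraction cost, which matches the ordering described above.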
Pattern · B
Independent corroboration makes deals more credible, not less.
→ Confidence scoring · merge
Most deals get covered by multiple sources, often paraphrased. The system collapses paraphrased duplicates into a single entry, and that entry inherits the strongest evidence from the group: the count of independent confirmations, the most authoritative source URL, and the widest verified date span. Published deals get more credible the more sources confirm them.
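The merge rule can be sketched as follows. This assumes paraphrase detection has already assigned each sighting a shared `deal_key`, and that each sighting's dates are already verified; `Sighting`, `MergedDeal`, and the field names are hypothetical, not the system's actual schema.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Sighting:
    """One source's report of a deal (hypothetical shape)."""
    deal_key: str        # assumed normalised key shared by paraphrases
    source_url: str
    source_domain: str
    domain_trust: float  # assumed scale: 0.0 (unknown) to 1.0 (most trusted)
    start: date
    end: date


@dataclass
class MergedDeal:
    deal_key: str
    confirmations: int   # distinct domains, not article count
    best_source_url: str
    start: date
    end: date


def merge_sightings(sightings):
    """Collapse sightings that share a deal_key into one entry per deal."""
    groups = {}
    for s in sightings:
        groups.setdefault(s.deal_key, []).append(s)

    merged = []
    for key, group in groups.items():
        best = max(group, key=lambda s: s.domain_trust)
        merged.append(MergedDeal(
            deal_key=key,
            # Two articles on the same domain count as one confirmation.
            confirmations=len({s.source_domain for s in group}),
            best_source_url=best.source_url,
            # Widest span: earliest start to latest end across the group.
            start=min(s.start for s in group),
            end=max(s.end for s in group),
        ))
    return merged
```

Counting distinct domains rather than articles is one way to keep a single outlet's repeated coverage from inflating the corroboration count.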
EXHIBIT 05
Operating model. Four stages, one system.
The system runs continuously. Each stage feeds the next: search surfaces sources, extraction turns them into structured candidates, scoring grades them, and the editorial queue presents the high-confidence ones for review. Hourly cadence costs roughly what the team's prior daily research did, because gates short-circuit the work when there's nothing new to score.
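One way a short-circuit gate of this kind could work, as a sketch: fingerprint the current candidate set and skip scoring when it matches the previous run. The fingerprinting approach and both function names are assumptions for illustration; the case study does not say how the gates are implemented.

```python
import hashlib


def fingerprint(candidate_ids):
    """Order-independent digest of the current candidate set."""
    joined = "\n".join(sorted(candidate_ids))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def run_scoring_if_changed(candidate_ids, last_fingerprint, score_fn):
    """Run score_fn only when the candidate set differs from the last run.

    Returns (new_fingerprint, ran), where ran says whether scoring happened.
    """
    current = fingerprint(candidate_ids)
    if current == last_fingerprint:
        return current, False  # gate closed: nothing new to score
    score_fn(candidate_ids)
    return current, True
```

With a gate like this, an hourly run that finds no new candidates costs one hash comparison instead of a full scoring pass.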
Daily · 15-min backfill
PHASE 01
Search
Programmatic queries across the channels two researchers used to check: social media for brand announcements, press releases for quick-service restaurant (QSR) offers, news feeds for retail roundups, brand pages for the official mechanics. Daily seed at 4 AM; a 15-minute backfill catches anything that didn't run.
Outputs
→ Programmatic search across every channel
→ Daily seed cron · 4 AM
→ 15-min backfill · catches misses
→ Domain-trust filter on the URL queue
Per-URL
PHASE 02
Extract
Structured deal candidates pulled from raw articles. Speculation language and 'next year' framings get rejected before extraction runs. Schema validators catch missing fields, wrong dates, and wrong-year tags so bad rows never enter the candidate set.
Outputs
→ Structured deal candidates
→ Speculation rejected pre-extraction
→ Schema validation · deterministic
→ Day-vs-month framing check
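A deterministic validator for the checks named above (missing fields, wrong dates, wrong-year tags) might look like the sketch below. The field names and error strings are hypothetical, and the day-vs-month framing check is left out because the case study does not define it.

```python
from datetime import date

# Hypothetical required fields; the real schema is not given in the source.
REQUIRED_FIELDS = ("brand", "offer", "holiday", "start", "end", "source_url")


def validate_candidate(row: dict, target_year: int) -> list:
    """Return a list of problems; an empty list means the row passes."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not row.get(name)]

    start, end = row.get("start"), row.get("end")
    if start is not None and end is not None:
        if not (isinstance(start, date) and isinstance(end, date)):
            problems.append("start and end must be date values")
        else:
            if end < start:
                problems.append("end date precedes start date")
            if target_year not in (start.year, end.year):
                problems.append("dates fall outside target year")
    return problems
```

Returning every problem at once, rather than failing on the first, keeps the rejection reason for each row inspectable later.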
Hourly
PHASE 03
Score
Every candidate gets a confidence score from three signals. Source quality: is this a known reliable source? Independent corroboration: how many distinct sources confirm it? Specificity: does it have real dates and real mechanics? Paraphrased duplicates collapse into one entry; corroboration counts add up; the strongest source wins.
Outputs
→ Confidence score per candidate
→ Independent corroboration counter
→ Paraphrased duplicates merged
→ Strongest source preserved
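To make the three signals concrete, here is one possible way to blend them into a single score. The weights, the corroboration cap, and the linear form are all assumptions chosen for illustration; the case study names the signals but not the formula.

```python
def confidence_score(domain_trust, confirmations, has_dates, has_mechanics,
                     weights=(0.4, 0.4, 0.2), corroboration_cap=3):
    """Blend the three signals into a score between 0.0 and 1.0.

    domain_trust:  assumed 0.0 to 1.0 rating of the strongest source
    confirmations: number of distinct sources reporting the deal
    has_dates / has_mechanics: whether the candidate states real dates
                               and real mechanics
    """
    source_quality = min(max(domain_trust, 0.0), 1.0)

    # A single source is no corroboration; credit saturates at the cap.
    extra_sources = max(confirmations - 1, 0)
    corroboration = min(extra_sources, corroboration_cap) / corroboration_cap

    specificity = (int(has_dates) + int(has_mechanics)) / 2

    w_quality, w_corroboration, w_specificity = weights
    return (w_quality * source_quality
            + w_corroboration * corroboration
            + w_specificity * specificity)
```

Under these assumed weights, a vague single-source mention from a mid-trust domain scores low, while a dated, specific offer confirmed by several independent sources approaches 1.0. A review threshold on this score would then decide what reaches the editorial queue.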
Editorial · confirm + write
PHASE 04
Author
High-confidence candidates surface in a review dashboard with full lineage attached: source URL, domain trust score, dedup history. The editorial team confirms each candidate, writes the article in the cultural context of its holiday, and publishes. Discovery on cron; humans on writing.
Outputs
→ Editorial confirmation · review dashboard
→ Article writeup in CMS · cultural context
→ Per-deal source lineage attached
→ System-monitoring dashboard · cron + drift