GRS Marketing Bot — Real Emails.
No Fabrication.
A 4-level enrichment hierarchy that finds verified business contacts from real web data — after learning the hard way that every LLM will fabricate contact information if you let it.
We Tried the Fast Path.
It Cost Us 64% Bounce Rate.
Building a compliant outbound engine required solving the AI fabrication problem first.
The original GRS email outreach system used an LLM-based enrichment step to generate and enrich contact data from prospect descriptions. It produced plausible-looking email addresses at scale. It was also fabricating approximately 85% of them — resulting in a 64% email bounce rate and Hostinger rate-limiting the sending domain within weeks.
The lesson: all LLMs fabricate emails when asked to generate contact data. Real contact enrichment requires real web data: finding actual email addresses on actual websites, verifying them against real mail exchange records, and scoring them against real business signals.
The current architecture uses a 4-level enrichment hierarchy: (1) website scraping for explicit email addresses, (2) Serper.dev web search for contact pages, (3) pattern-based email inference from known format + domain verification, (4) manual research flag for high-value prospects that don't yield to automated methods. Each level is attempted in sequence; contacts only advance when verification succeeds.
Four Enrichment Levels.
One Sends-Per-Day Cap.
Every prospect goes through enrichment before entering the send queue. No exceptions.
Website Scraping
Extracts email addresses from prospect website contact pages, footers, and about pages using regex patterns. Highest confidence level — explicit contact information from the business's own site. ~40% of prospects yield an email at this stage.
Serper.dev Web Search
For prospects that don't yield an email from their own website, Serper.dev runs a targeted web search for "[business name] email contact" and related patterns. Results are scraped for email addresses appearing in search snippets, directories, and review sites.
Pattern Inference + Domain Verification
If a domain is known but no email is found, common email patterns (firstname@domain, info@domain, contact@domain) are tested against the domain's MX records to identify valid mailboxes without sending. Reduces guessing to a structured verification process.
Manual Research Flag
High-value prospects that don't yield to automated enrichment are flagged for manual research and written to a priority queue in Postgres. These are surfaced to Adrian for direct outreach rather than automated sequence enrollment.
Scheduling + CAN-SPAM Compliance
Enriched contacts enter the send queue. Schedule: Tue/Wed/Thu 10AM + 2PM, Fri 11AM + 1PM. Volume cap: 20/day. Random 45–120 second delay between sends. Opt-out processed immediately. Physical address in footer. No deceptive subject lines. Sending from a dedicated outreach alias (protects primary business email reputation).
First Real Campaign Delivered.
Enrichment Pipeline Maturing.
832+ prospects scored and in the database. First real campaign of 10 emails sent successfully after the UTC timezone bug was identified and fixed (JavaScript Date() in containers returns UTC; fixed by using Schedule Trigger output fields instead).
91 bad emails cleaned from the database following the LLM-era bounce events. The current enrichment pipeline produces verified contacts only — fabrication is prevented by design: the system doesn't attempt to generate emails, only to find and verify them.
The Serper.dev integration for Level 02 enrichment is operational. Reacher (self-hosted email verification) is recommended but not yet deployed — its absence means Level 03 inference relies on MX record checks rather than full SMTP verification. This is the current gap in the enrichment pipeline and the next build priority.
Every LLM Fabricates Emails.
Plan for It.
AI-generated contact data is not contact data.
LLMs fabricate emails when asked to generate contact information. This is not a model-specific bug; it's a category error. Real enrichment requires real web sources.
Protect the sending domain at all costs.
Using a dedicated outreach alias protects the primary business email from reputation damage during cold outreach. Once a sending domain is rate-limited, recovery takes weeks.
UTC is not local time inside Docker containers.
JavaScript Date() returns UTC in containerized environments. Scheduling logic that relies on local business hours must use the Schedule Trigger's output fields (input.Hour) or explicit timezone conversion — not JavaScript Date objects.
Volume caps are compliance infrastructure, not performance limits.
The 20-sends-per-day cap isn't a bottleneck — it's a deliverability protection. High-volume cold email sends trigger spam filters and ISP rate limits regardless of content quality. Pacing is part of the system design.
Want This Outbound Engine
for Your Brokerage or Practice?
The same 4-level enrichment hierarchy deployed for your vertical — legal, freight, or retail. 30-minute discovery call.