Releases


Lead Enrichment Pipeline & Email Intelligence Improvements


This release introduces a full lead enrichment pipeline that transforms completed scraping jobs into structured, verified, and confidence-scored leads. It also improves email classification accuracy by refining how role-based and generic emails are filtered out before personal contacts are identified.

  • Added a complete lead enrichment pipeline for processing finished scraping jobs into structured lead data
  • Introduced email verification using domain validation and MX record checks to improve deliverability confidence
  • Implemented confidence scoring for leads based on email validity, domain quality, and classification signals
  • Improved email classification logic to better distinguish personal contacts from role-based and generic addresses
  • Enhanced filtering of blocked prefixes and domains to reduce false-positive lead identification
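
As a rough illustration of how verification and confidence scoring could fit together, here is a minimal sketch. It is not the pipeline's actual code: the point weights, the `ScoredLead` type, and the `score_lead` function are hypothetical, and the MX lookup is passed in as a callable (`has_mx`) so that no particular DNS library is assumed.

```python
import re
from dataclasses import dataclass
from typing import Callable

# Hypothetical point weights; the real pipeline's weights are not documented here.
POINTS = {"syntax": 20, "mx": 40, "domain": 20, "personal": 20}

# Deliberately loose syntax check: something@something.tld with no spaces.
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


@dataclass(frozen=True)
class ScoredLead:
    email: str
    verified: bool      # True when the domain reports MX records
    confidence: float   # 0.0 to 1.0


def score_lead(
    email: str,
    has_mx: Callable[[str], bool],
    is_generic_domain: Callable[[str], bool],
    is_personal: Callable[[str], bool],
) -> ScoredLead:
    """Verify an address and combine the signals into one confidence score."""
    email = email.strip().lower()
    if not EMAIL_RE.fullmatch(email):
        # Malformed addresses cannot be verified and score zero.
        return ScoredLead(email, verified=False, confidence=0.0)

    domain = email.rsplit("@", 1)[1]
    mx_ok = has_mx(domain)  # injected so the DNS lookup can be swapped or mocked

    points = POINTS["syntax"]
    if mx_ok:
        points += POINTS["mx"]
    if not is_generic_domain(domain):
        points += POINTS["domain"]
    if is_personal(email):
        points += POINTS["personal"]
    return ScoredLead(email, verified=mx_ok, confidence=points / 100)
```

Passing the three checks in as callables keeps the scoring step independent of how DNS lookups and classification are actually implemented, which also makes it easy to test without network access.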


New Lead Retrieval Endpoint & Workflow Integration


A new API endpoint exposes enriched lead data directly from completed jobs.

  • Added /leads/<job_id> endpoint to serve enriched and processed leads
  • Added load_job_result helper to simplify access to completed job data
  • Updated client workflow to automatically poll /result and fetch enriched leads from /result/leads
  • Streamlined job-to-lead retrieval flow for faster downstream consumption
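
The poll-then-fetch flow described above might look roughly like the sketch below. The exact URL shapes, the `status` field, and the `leads` response key are assumptions made for illustration, as is the `fetch_enriched_leads` name; the HTTP call is injected as `get_json` so the sketch does not depend on any specific client library.

```python
import time
from typing import Any, Callable


def fetch_enriched_leads(
    job_id: str,
    get_json: Callable[[str], dict[str, Any]],
    result_path: str = "/result/{job_id}",  # assumed URL shape for polling
    leads_path: str = "/leads/{job_id}",    # assumed URL shape for the leads endpoint
    interval: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> list[dict[str, Any]]:
    """Poll until the job is finished, then fetch its enriched leads."""
    for attempt in range(max_attempts):
        result = get_json(result_path.format(job_id=job_id))
        if result.get("status") == "completed":  # assumed status field and value
            payload = get_json(leads_path.format(job_id=job_id))
            return payload.get("leads", [])      # assumed response key
        if attempt < max_attempts - 1:
            sleep(interval)  # wait before the next poll
    raise TimeoutError(f"job {job_id} did not complete after {max_attempts} polls")
```

Bounding the number of polls means a stuck job surfaces as a `TimeoutError` instead of blocking the client indefinitely.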


Data Quality & Email Filtering Improvements


This release tightens lead quality checks so that outputs are cleaner and more actionable.

  • Added stricter filtering for role-based email prefixes (e.g. info, support, sales)
  • Introduced generic domain filtering to reduce low-quality or non-personal leads
  • Ensured that valid personal-name emails are classified as personal contacts rather than filtered out
  • Reduced false positives in lead generation through improved validation logic
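
The filtering rules above can be sketched as a small classifier. The rule sets here are illustrative only: `info`, `support`, and `sales` come from these notes, while the other prefixes, the placeholder domains, and the `classify_email` function itself are assumptions rather than the project's actual lists or API.

```python
import re

# Illustrative rule sets; the project's real lists are not shown in these notes.
ROLE_PREFIXES = frozenset({"info", "support", "sales", "admin", "contact"})
GENERIC_DOMAINS = frozenset({"example.com", "mailinator.com"})


def classify_email(email: str) -> str:
    """Return 'invalid', 'role', 'generic', or 'personal'."""
    local, sep, domain = email.strip().lower().rpartition("@")
    if not sep or not local or "." not in domain:
        return "invalid"

    # Compare only the first token of the local part, so "sales+eu" and
    # "info.team" are caught while "salesman" is not.
    first_token = re.split(r"[.+_\-]", local, maxsplit=1)[0]
    if first_token in ROLE_PREFIXES:
        return "role"
    if domain in GENERIC_DOMAINS:
        return "generic"
    return "personal"


def personal_contacts(emails: list[str]) -> list[str]:
    """Keep only the addresses classified as personal contacts."""
    return [e for e in emails if classify_email(e) == "personal"]
```

Matching on the first token rather than a raw string prefix is one way to reduce false positives: a name that merely starts with a blocked word is not discarded.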


Refactors & Pipeline Centralization


The internal structure has been reorganized to make lead generation more maintainable and reusable across services.

  • Centralized lead-building logic into a dedicated service module
  • Refactored configuration-driven filtering for reusable prefix and domain rules
  • Streamlined internal data flow between scraping, processing, and lead generation layers
  • Improved separation between job execution and lead enrichment responsibilities
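
To show what configuration-driven rules feeding a single lead-building entry point could look like, here is a hedged sketch. `FilterRules`, `build_leads`, and the config and result keys (`blocked_prefixes`, `blocked_domains`, `emails`, `job_id`) are hypothetical names chosen for this example, not the module's real interface.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class FilterRules:
    """Reusable prefix and domain rules loaded from configuration."""
    blocked_prefixes: frozenset[str] = frozenset()
    blocked_domains: frozenset[str] = frozenset()

    @classmethod
    def from_config(cls, config: Mapping[str, Any]) -> "FilterRules":
        # Normalize to lowercase so matching is case-insensitive.
        return cls(
            blocked_prefixes=frozenset(p.lower() for p in config.get("blocked_prefixes", [])),
            blocked_domains=frozenset(d.lower() for d in config.get("blocked_domains", [])),
        )

    def allows(self, email: str) -> bool:
        local, _, domain = email.lower().rpartition("@")
        return (
            bool(local)
            and local not in self.blocked_prefixes
            and domain not in self.blocked_domains
        )


def build_leads(job_result: Mapping[str, Any], rules: FilterRules) -> list[dict[str, str]]:
    """Single entry point that turns a finished job's result into leads."""
    seen: set[str] = set()
    leads: list[dict[str, str]] = []
    for email in job_result.get("emails", []):  # assumed result key
        normalized = email.strip().lower()
        if normalized in seen or not rules.allows(normalized):
            continue  # skip duplicates and blocked addresses
        seen.add(normalized)
        leads.append({"email": normalized, "job_id": str(job_result.get("job_id", ""))})
    return leads
```

Keeping the rules in an immutable object and the lead-building logic in one function means other services can reuse both without touching job execution code.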