Releases
Lead Enrichment Pipeline & Email Intelligence Improvements
This release introduces a full lead enrichment pipeline that transforms completed scraping jobs into structured, verified, and confidence-scored leads. It also improves email classification accuracy by refining how role-based and generic addresses are filtered out before personal contacts are identified.
- Added a complete lead enrichment pipeline for processing finished scraping jobs into structured lead data
- Introduced email verification using domain validation and MX record checks to improve deliverability confidence
- Implemented confidence scoring for leads based on email validity, domain quality, and classification signals
- Improved email classification logic to better distinguish personal contacts from role-based and generic addresses
- Enhanced filtering of blocked prefixes and domains to reduce false-positive lead identification
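As a rough illustration of how the signals listed above might combine into a single score, the sketch below adds fixed weights for syntax validity, MX presence, and a personal-contact classification. The weights, the `LeadSignals` type, and the `confidence_score` function are hypothetical; these notes do not describe the actual scoring formula. The MX lookup is assumed to happen elsewhere and is passed in as a boolean.

```python
import re
from dataclasses import dataclass

# Hypothetical weights; the real scoring formula is not published in these notes.
WEIGHTS = {"syntax": 0.3, "mx": 0.4, "personal": 0.3}

# Deliberately simple syntax check, not a full RFC 5322 validator.
EMAIL_RE = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")


@dataclass
class LeadSignals:
    email: str
    has_mx: bool       # outcome of an MX record lookup performed elsewhere
    is_personal: bool  # outcome of the email classifier


def confidence_score(signals: LeadSignals) -> float:
    """Sum the weights of every signal that holds, giving a value in [0, 1]."""
    score = 0.0
    if EMAIL_RE.match(signals.email):
        score += WEIGHTS["syntax"]
    if signals.has_mx:
        score += WEIGHTS["mx"]
    if signals.is_personal:
        score += WEIGHTS["personal"]
    return round(score, 2)
```

An additive score like this is easy to explain and tune, at the cost of ignoring interactions between signals (for example, a personal-looking address on a domain without MX records).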
New Lead Retrieval Endpoint & Workflow Integration
A new API layer has been introduced to expose enriched lead data directly from completed jobs.
- Added /leads/<job_id> endpoint to serve enriched and processed leads
- Added load_job_result helper to simplify access to completed job data
- Updated client workflow to automatically poll /result and fetch enriched leads from /result/leads
- Streamlined job-to-lead retrieval flow for faster downstream consumption
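The poll-then-fetch flow on the client side could look roughly like the sketch below. Only the `/result` and `/result/leads` paths come from these notes. The `fetch` callable, the `"status": "done"` response field, and the retry parameters are assumptions made for illustration; how the job ID and base URL are bound to the request is left to the caller.

```python
import time
from typing import Any, Callable


def poll_then_fetch_leads(
    fetch: Callable[[str], dict],
    interval: float = 1.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Poll /result until the job reports completion, then fetch its leads.

    `fetch` takes a path and returns the decoded JSON body. It is injected
    so that the HTTP client, base URL, and job ID stay the caller's concern.
    """
    for _ in range(max_attempts):
        result = fetch("/result")
        # Assumed response shape: {"status": "done"} once the job has finished.
        if result.get("status") == "done":
            return fetch("/result/leads")
        sleep(interval)
    raise TimeoutError(f"job did not finish after {max_attempts} polls")
```

Injecting `fetch` and `sleep` keeps the loop testable without a network connection or real delays.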
Data Quality & Email Filtering Improvements
This release focuses on lead quality, producing cleaner and more actionable output.
- Added stricter filtering for role-based email prefixes (e.g. info, support, sales)
- Introduced generic domain filtering to reduce low-quality or non-personal leads
- Ensured proper classification of valid personal-name emails for higher accuracy
- Reduced false positives in lead generation through improved validation logic
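The filtering rules above can be sketched as a small classifier. The three role prefixes are the examples named in these notes; the generic-domain set is a placeholder, since the notes do not say which domains count as generic. The function name, the label strings, and the handling of `+tag` suffixes are likewise assumptions, not the shipped logic.

```python
# Role prefixes taken from the examples in these notes; the real list may be longer.
ROLE_PREFIXES = frozenset({"info", "support", "sales"})

# Placeholder only: the notes do not list the domains treated as generic.
GENERIC_DOMAINS = frozenset({"generic.example"})


def classify_email(
    email: str,
    role_prefixes: frozenset = ROLE_PREFIXES,
    generic_domains: frozenset = GENERIC_DOMAINS,
) -> str:
    """Return 'invalid', 'generic', 'role', or 'personal' for an address."""
    local, sep, domain = email.strip().lower().rpartition("@")
    if not sep or not local or not domain:
        return "invalid"
    if domain in generic_domains:
        return "generic"
    # Treat "sales+eu@..." like "sales@..." (an assumption, not documented behavior).
    prefix = local.split("+", 1)[0]
    if prefix in role_prefixes:
        return "role"
    return "personal"
```

Only addresses labelled `personal` would go on to become leads in this sketch; everything else is filtered out before enrichment.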
Refactors & Pipeline Centralization
Improved internal structure to make lead generation more maintainable and reusable across services.
- Centralized lead-building logic into a dedicated service module
- Refactored configuration-driven filtering for reusable prefix and domain rules
- Streamlined internal data flow between scraping, processing, and lead generation layers
- Improved separation between job execution and lead enrichment responsibilities
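One way to picture the centralized, configuration-driven design is a rules object built from plain configuration and handed to a single lead-building function. Everything named here (`FilterRules`, `from_mapping`, `build_leads`, the `blocked_prefixes` and `blocked_domains` keys, and the `emails` field on the job result) is hypothetical and only illustrates the separation of concerns; the actual service module is not shown in these notes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FilterRules:
    """Reusable prefix and domain rules, loaded once and shared across services."""
    blocked_prefixes: frozenset = frozenset()
    blocked_domains: frozenset = frozenset()

    @classmethod
    def from_mapping(cls, config: dict) -> "FilterRules":
        # `config` might come from a JSON or YAML file; the key names are assumed.
        return cls(
            blocked_prefixes=frozenset(p.lower() for p in config.get("blocked_prefixes", [])),
            blocked_domains=frozenset(d.lower() for d in config.get("blocked_domains", [])),
        )


def build_leads(job_result: dict, rules: FilterRules) -> list:
    """Turn a finished job's raw emails into lead records, applying the rules.

    Knows nothing about how the job was executed; it only reads its result.
    """
    leads = []
    for email in job_result.get("emails", []):  # assumed job result shape
        local, _, domain = email.lower().rpartition("@")
        if not local or not domain:
            continue  # malformed address
        if local in rules.blocked_prefixes or domain in rules.blocked_domains:
            continue  # filtered out by configuration
        leads.append({"email": email, "domain": domain})
    return leads
```

Because `build_leads` depends only on a job result and a rules object, the same function can be reused by any service that has access to completed job data.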