Releases

0.0.5 - 10 Jun 2026

Lead Enrichment Pipeline & Email Intelligence Improvements

This release introduces a full lead enrichment pipeline that transforms completed scraping jobs into structured, verified, and confidence-scored leads. It also improves email classification accuracy by refining how role-based and generic emails are filtered out before identifying personal contacts.

Added a complete lead enrichment pipeline for processing finished scraping jobs into structured lead data
Introduced email verification using domain validation and MX record checks to improve deliverability confidence
Implemented confidence scoring for leads based on email validity, domain quality, and classification signals
Improved email classification logic to better distinguish personal contacts from role-based and generic addresses
Enhanced filtering of blocked prefixes and domains to reduce false-positive lead identification
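
As a rough illustration of how verification and confidence scoring could fit together, the sketch below combines a syntax check, an MX check, and a classification signal into a single score. The function name score_lead, the weights, the regular expression, and the injected mx_lookup callable are all assumptions made for this example, not the pipeline's actual implementation.

```python
import re
from typing import Callable, Iterable

# Hypothetical syntax check: deliberately simple, not a full RFC 5322 parser.
EMAIL_RE = re.compile(r"^[A-Za-z0-9._%+-]+@([A-Za-z0-9-]+\.)+[A-Za-z]{2,}$")

# Illustrative weights (assumed values); together they add up to 1.0.
WEIGHTS = {"syntax": 0.2, "mx": 0.4, "personal": 0.4}


def score_lead(
    email: str,
    mx_lookup: Callable[[str], Iterable[str]],
    is_personal: bool,
) -> float:
    """Return a confidence score in [0, 1] for a single email address.

    mx_lookup is any callable that returns the MX hosts for a domain
    (an empty result means the domain has no MX records).
    """
    if not EMAIL_RE.match(email):
        return 0.0  # malformed addresses get no credit at all

    score = WEIGHTS["syntax"]
    domain = email.rsplit("@", 1)[1].lower()

    try:
        has_mx = any(mx_lookup(domain))
    except Exception:
        # Treat lookup failures (timeouts, NXDOMAIN, ...) as "no MX".
        has_mx = False

    if has_mx:
        score += WEIGHTS["mx"]
    if is_personal:
        score += WEIGHTS["personal"]
    return round(score, 2)
```

Passing the lookup in as a callable keeps the scoring logic testable without network access. In a real deployment it could wrap a DNS library call, for example dnspython's dns.resolver.resolve(domain, "MX").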

New Lead Retrieval Endpoint & Workflow Integration

A new API layer has been introduced to expose enriched lead data directly from completed jobs.

Added /leads/<job_id> endpoint to serve enriched and processed leads
Added load_job_result helper to simplify access to completed job data
Updated client workflow to automatically poll /result and fetch enriched leads from /result/leads
Streamlined job-to-lead retrieval flow for faster downstream consumption
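
The client-side flow described above might look roughly like the following sketch: poll the job result until it completes, then fetch the enriched leads. The function fetch_leads, the injected get_json callable, the "status" field and its values, and the exact path layout are assumptions for illustration; adjust them to the real API.

```python
import time
from typing import Any, Callable, Dict


def fetch_leads(
    job_id: str,
    get_json: Callable[[str], Dict[str, Any]],
    interval: float = 2.0,
    timeout: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll a job until it completes, then return its enriched leads.

    get_json is any callable that performs a GET on the given path and
    returns the decoded JSON body (hypothetical; e.g. a thin HTTP wrapper).
    """
    deadline = time.monotonic() + timeout
    while True:
        # Assumed path and response shape: {"status": "..."}.
        result = get_json(f"/result/{job_id}")
        status = result.get("status")

        if status == "completed":
            # Leads endpoint as named in these notes: /leads/<job_id>.
            return get_json(f"/leads/{job_id}")
        if status == "failed":
            raise RuntimeError(f"job {job_id} failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout}s")

        sleep(interval)
```

Injecting get_json and sleep keeps the loop independent of any particular HTTP client and makes it easy to exercise with canned responses.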

Data Quality & Email Filtering Improvements

This release tightens lead quality so that outputs are cleaner and more actionable.

Added stricter filtering for role-based email prefixes (e.g. info, support, sales)
Introduced generic domain filtering to reduce low-quality or non-personal leads
Ensured that valid personal-name emails are classified as personal contacts rather than being filtered out
Reduced false positives in lead generation through improved validation logic
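
The filtering rules listed above can be sketched as a small classifier. The prefix and domain sets, the classify_email name, and the returned labels are hypothetical; only info, support, and sales come from these notes, and the real rules may differ.

```python
import re
from typing import FrozenSet

# Assumed rule sets for illustration; "info", "support", and "sales" are the
# examples named in the notes, the remaining entries are hypothetical.
ROLE_PREFIXES: FrozenSet[str] = frozenset(
    {"info", "support", "sales", "admin", "contact"}
)
GENERIC_DOMAINS: FrozenSet[str] = frozenset({"example.com", "example.org"})


def classify_email(
    email: str,
    role_prefixes: FrozenSet[str] = ROLE_PREFIXES,
    generic_domains: FrozenSet[str] = GENERIC_DOMAINS,
) -> str:
    """Label an address as 'invalid', 'generic', 'role', or 'personal'."""
    local, sep, domain = email.strip().lower().rpartition("@")
    if not sep or not local or not domain:
        return "invalid"

    if domain in generic_domains:
        return "generic"

    # Compare only the first token of the local part, so variants such as
    # "sales-eu" or "info.team" are also caught by the prefix rule.
    prefix = re.split(r"[._+-]", local, maxsplit=1)[0]
    if prefix in role_prefixes:
        return "role"

    return "personal"
```

For instance, under these assumed rules classify_email("info@acme.io") yields "role", while classify_email("jane.doe@acme.io") yields "personal".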

Refactors & Pipeline Centralization

The internal structure has been reorganized to make lead generation more maintainable and reusable across services.

Centralized lead-building logic into a dedicated service module
Refactored configuration-driven filtering for reusable prefix and domain rules
Streamlined internal data flow between scraping, processing, and lead generation layers
Improved separation between job execution and lead enrichment responsibilities