professional, tool, desktop
JobScraper
Playwright browser automation with route optimization
About
A Python scraping engine that uses Playwright to automate browser sessions and extract fiber installation job data. It orchestrates 8 to 16 concurrent browser workers with async task management, parses job text into structured data, and feeds the results into a route optimizer that solves the vehicle routing problem with time windows (VRPTW) using a VROOM solver, with OSRM as the routing backend and Nominatim as the geocoding backend.
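As an illustration of the worker orchestration described above, here is a minimal sketch of a fixed-size asyncio worker pool. It is not the project's actual code: `scrape_job`, `worker`, and `run_pool` are hypothetical names, and `scrape_job` only simulates the work with a short sleep where a real worker would drive a Playwright page.

```python
import asyncio


async def scrape_job(job_id: str) -> dict:
    """Hypothetical stand-in for one browser session.

    A real worker would open a Playwright page here and extract job text.
    """
    await asyncio.sleep(0.01)  # simulate page load and extraction
    return {"job_id": job_id, "status": "ok"}


async def worker(queue: asyncio.Queue, results: list) -> None:
    """Pull job ids off the shared queue until cancelled."""
    while True:
        job_id = await queue.get()
        try:
            results.append(await scrape_job(job_id))
        except Exception as exc:
            # Record the failure so one bad job does not stop the worker.
            results.append({"job_id": job_id, "status": "error", "error": str(exc)})
        finally:
            queue.task_done()


async def run_pool(job_ids: list[str], num_workers: int = 8) -> list[dict]:
    """Process all job ids with at most `num_workers` running concurrently."""
    queue: asyncio.Queue = asyncio.Queue()
    for job_id in job_ids:
        queue.put_nowait(job_id)

    results: list[dict] = []
    workers = [asyncio.create_task(worker(queue, results)) for _ in range(num_workers)]

    await queue.join()  # wait until every queued job has been handled
    for task in workers:
        task.cancel()  # workers loop forever, so stop them explicitly
    await asyncio.gather(*workers, return_exceptions=True)
    return results


if __name__ == "__main__":
    scraped = asyncio.run(run_pool([f"JOB-{i}" for i in range(20)], num_workers=8))
    print(len(scraped), "jobs processed")
```

A queue with a fixed number of workers caps how many browser sessions are open at once, which is one common way to keep a range such as 8 to 16 workers within memory limits.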
Features
- Playwright browser automation with 8-16 concurrent worker orchestration
- Job text parsing and structured data extraction pipeline
- VRPTW route optimization via self-hosted VROOM solver
- Self-hosted geocoding (Nominatim) and routing (OSRM) backends
- Strategy pattern for swappable geocoding/routing backends with cloud fallbacks
- 795+ automated tests with comprehensive coverage
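The swappable-backend feature listed above could look roughly like the following sketch of the strategy pattern with an ordered fallback chain. All class names (`Geocoder`, `LookupGeocoder`, `FallbackGeocoder`) are assumptions made for illustration, and the backends wrap plain callables rather than real Nominatim or cloud clients so the example runs without network access.

```python
from typing import Callable, Optional, Protocol, Sequence, Tuple

Coordinate = Tuple[float, float]  # (longitude, latitude)


class Geocoder(Protocol):
    """Strategy interface: anything with a geocode() method qualifies."""

    def geocode(self, address: str) -> Optional[Coordinate]: ...


class LookupGeocoder:
    """Hypothetical backend that wraps any address -> coordinate callable.

    In a real system the callable would query a self-hosted or cloud service.
    """

    def __init__(self, name: str, lookup: Callable[[str], Optional[Coordinate]]) -> None:
        self.name = name
        self._lookup = lookup

    def geocode(self, address: str) -> Optional[Coordinate]:
        return self._lookup(address)


class FallbackGeocoder:
    """Try each backend in order and return the first usable result."""

    def __init__(self, backends: Sequence[Geocoder]) -> None:
        self._backends = list(backends)

    def geocode(self, address: str) -> Optional[Coordinate]:
        for backend in self._backends:
            try:
                result = backend.geocode(address)
            except Exception:
                continue  # backend unavailable, move on to the next one
            if result is not None:
                return result
        return None  # every backend failed or found nothing


def unavailable(address: str) -> Optional[Coordinate]:
    """Simulates a self-hosted service that is down."""
    raise ConnectionError("self-hosted geocoder unreachable")


if __name__ == "__main__":
    geocoder = FallbackGeocoder([
        LookupGeocoder("self-hosted", unavailable),
        LookupGeocoder("cloud", lambda address: (-122.4, 37.8)),
    ])
    print(geocoder.geocode("1 Main St"))
```

Because callers depend only on the `geocode` method, a backend can be replaced or reordered without touching the code that consumes coordinates; the same shape would apply to routing backends.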
Tech Stack
Python, Playwright, Pandas, VROOM, OSRM, Nominatim