Saudi Car Marketplace Scraper
Overview
A continuous scraping system that extracts car listings, pricing, and images from three major Saudi marketplaces: Dubizzle.sa, Syarah, and Haraj. Listings are filtered for validity and synced into a Supabase/PostgreSQL database that powers the client's React-based comparison platform. Outcome: the client's comparison platform receives real-time, validated listings from Saudi Arabia's largest car marketplaces.
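The "continuous" part of the system can be pictured as a polling loop over the three marketplaces. The following is a minimal sketch under stated assumptions, not the project's actual scheduler: `run_cycles`, the `Scraper` type, the `handle` callback, and the 300-second interval are all hypothetical names and values chosen for illustration.

```python
import time
from typing import Callable, Dict, Iterable, List, Optional

# Hypothetical type: a scraper is any callable that returns raw listing dicts.
Scraper = Callable[[], Iterable[dict]]


def run_cycles(
    scrapers: Dict[str, Scraper],
    handle: Callable[[str, List[dict]], None],
    interval_s: float = 300.0,           # assumed polling interval
    max_cycles: Optional[int] = None,    # None means run until interrupted
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll every scraper in turn and pass its results to `handle`.

    A failure in one marketplace is caught and reported so that it does
    not stop the others. Returns the number of completed cycles.
    """
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        for source, scrape in scrapers.items():
            try:
                listings = list(scrape())
            except Exception as exc:  # keep the loop alive on per-site errors
                print(f"[{source}] scrape failed: {exc}")
                continue
            handle(source, listings)
        cycles += 1
        # Skip the final sleep when a cycle limit has been reached.
        if max_cycles is None or cycles < max_cycles:
            sleep(interval_s)
    return cycles
```

Passing `sleep` and `max_cycles` as parameters is a design choice that lets the loop be exercised in tests without waiting; a production deployment might equally use cron or a task queue instead.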
Architecture & Pipeline
flowchart LR
n0["Dubizzle.sa · Syarah · Haraj<br/>Source marketplaces"]
n1["Scrapers<br/>Python · Selenium · BeautifulSoup"]
n2["Validation Filters<br/>Price · condition · availability"]
n3["Image Pipeline<br/>Per-listing assets"]
n4["Supabase / PostgreSQL<br/>Real-time DB"]
n5["Client React Platform<br/>Comparison UI"]
n0 --> n1
n1 --> n2
n2 --> n3
n3 --> n4
n4 --> n5
classDef step0 fill:#f1f5f9,stroke:#64748b,color:#1e293b,stroke-width:2px,rx:10,ry:10;
classDef step1 fill:#ecfeff,stroke:#06b6d4,color:#1e293b,stroke-width:2px,rx:10,ry:10;
classDef step2 fill:#f0fdfa,stroke:#0d9488,color:#1e293b,stroke-width:2px,rx:10,ry:10;
classDef step3 fill:#ecfdf5,stroke:#10b981,color:#1e293b,stroke-width:2px,rx:10,ry:10;
classDef step4 fill:#fffbeb,stroke:#f59e0b,color:#1e293b,stroke-width:2px,rx:10,ry:10;
class n0 step0;
class n1 step1;
class n2 step2;
class n3 step2;
class n4 step3;
class n5 step4;
End-to-end flow derived from this project's scope and tech stack.
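The middle stages of the diagram (validation filters, then the image step) can be sketched as a small generator pipeline. This is an illustrative sketch only: the `Listing` fields, the allowed condition values, and the price bounds are assumptions, since the source does not specify the real schema or thresholds.

```python
from dataclasses import dataclass, field
from typing import Iterable, Iterator, List, Optional


@dataclass
class Listing:
    # Hypothetical record shape; the real schema is not given in the source.
    source: str
    external_id: str
    title: str
    price_sar: Optional[float]
    condition: Optional[str]
    available: bool
    image_urls: List[str] = field(default_factory=list)


ALLOWED_CONDITIONS = {"new", "used"}  # assumed vocabulary


def is_valid(listing: Listing,
             min_price: float = 1_000,
             max_price: float = 5_000_000) -> bool:
    """Apply the three filters named in the diagram: price, condition, availability."""
    if listing.price_sar is None:
        return False
    if not (min_price <= listing.price_sar <= max_price):
        return False
    if (listing.condition or "").strip().lower() not in ALLOWED_CONDITIONS:
        return False
    return listing.available


def with_images(listing: Listing) -> Listing:
    """Clean per-listing image URLs: drop blanks and duplicates, keep order."""
    cleaned = (u.strip() for u in listing.image_urls if u and u.strip())
    listing.image_urls = list(dict.fromkeys(cleaned))
    return listing


def pipeline(raw: Iterable[Listing]) -> Iterator[Listing]:
    """Scraped listings -> validation filters -> image step -> rows ready for the DB."""
    for listing in raw:
        if is_valid(listing):
            yield with_images(listing)
```

Filtering before the image step mirrors the order in the diagram and avoids spending work on assets for listings that will be discarded.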
Key Features
- Continuous scraping with real-time database updates
- Validation filters for price, condition, and availability
- Image extraction tied to structured listing records
- Direct Supabase/PostgreSQL integration
- Designed to scale across additional marketplaces
Tech Stack: Python, BeautifulSoup, Selenium, Supabase, PostgreSQL
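As an illustration of the direct Supabase/PostgreSQL integration, the sketch below upserts validated listings in batches so that repeated scrapes update existing rows rather than duplicating them. The table name `listings`, the column names, and the `(source, external_id)` conflict key are assumptions; the upsert also presumes a matching unique constraint exists in the database. `create_client` and `table(...).upsert(..., on_conflict=...).execute()` are calls from the `supabase` Python client.

```python
from typing import Any, Dict, Iterable


def to_row(listing: Dict[str, Any]) -> Dict[str, Any]:
    """Map a validated listing to a flat row (column names are assumptions)."""
    return {
        "source": listing["source"],
        "external_id": str(listing["external_id"]),
        "title": listing["title"],
        "price_sar": listing["price_sar"],
        "image_urls": list(listing.get("image_urls", [])),
    }


def sync_listings(client: Any,
                  listings: Iterable[Dict[str, Any]],
                  batch_size: int = 500) -> int:
    """Upsert listings in batches; returns the number of rows sent.

    `client` is anything exposing the supabase-style
    table(...).upsert(...).execute() chain.
    """
    rows = [to_row(item) for item in listings]
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        # Assumed unique key: one row per marketplace listing.
        client.table("listings").upsert(
            batch, on_conflict="source,external_id"
        ).execute()
    return len(rows)


def make_client(url: str, key: str) -> Any:
    """Create a Supabase client (requires the `supabase` package)."""
    # Imported lazily so the helpers above can be used without the package.
    from supabase import create_client
    return create_client(url, key)
```

Keying the upsert on the marketplace plus its own listing ID keeps the sync idempotent, which suits a scraper that revisits the same listings on every cycle.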