Amazon Price Comparison Scraper
Overview
A Python-based scraper that monitors selected product categories on Amazon and compares their prices against the client's other sources. The script runs every four hours, exports a clean comparison spreadsheet, and delivers it to the client by email or shared cloud storage.
Outcome: Replaced manual price tracking with a hands-off pipeline that produces fresh, decision-ready price reports six times a day.
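The four-hour cadence and the "six times a day" figure can be checked with a small sketch. This only illustrates the arithmetic and is not the project's actual scheduler; the names `RUN_INTERVAL_HOURS`, `runs_per_day`, and `next_run_times` are hypothetical.

```python
from __future__ import annotations

from datetime import datetime, timedelta

RUN_INTERVAL_HOURS = 4  # cadence stated in the overview


def runs_per_day(interval_hours: int = RUN_INTERVAL_HOURS) -> int:
    """Number of runs that fit in 24 hours at the given interval."""
    return 24 // interval_hours


def next_run_times(
    start: datetime, count: int, interval_hours: int = RUN_INTERVAL_HOURS
) -> list[datetime]:
    """Return the next `count` run times after `start`, one interval apart."""
    step = timedelta(hours=interval_hours)
    return [start + step * i for i in range(1, count + 1)]


if __name__ == "__main__":
    print(runs_per_day())
    for run_at in next_run_times(datetime(2024, 1, 1, 0, 0), count=6):
        print(run_at.isoformat())
```

In a deployment like the one described, the trigger itself would more likely be a cron entry or a similar system scheduler on the Linux server. That is an assumption, not something the project description states.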
Architecture & Pipeline
flowchart LR
n0["Scheduler (4 h)<br/>Recurring runs"]
n1["Amazon<br/>Source listings"]
n2["Selenium Scrape<br/>Names · IDs · prices · images"]
n3["Compare Prices<br/>Vs client sources"]
n4["JSON / Excel + Images<br/>Structured output"]
n5["Email / Drive Delivery<br/>Client report"]
n0 --> n1
n1 --> n2
n2 --> n3
n3 --> n4
n4 --> n5
classDef step0 fill:#f1f5f9,stroke:#64748b,color:#1e293b,stroke-width:2px,rx:10,ry:10;
classDef step1 fill:#ecfeff,stroke:#06b6d4,color:#1e293b,stroke-width:2px,rx:10,ry:10;
classDef step2 fill:#f0fdfa,stroke:#0d9488,color:#1e293b,stroke-width:2px,rx:10,ry:10;
classDef step3 fill:#ecfdf5,stroke:#10b981,color:#1e293b,stroke-width:2px,rx:10,ry:10;
classDef step4 fill:#fffbeb,stroke:#f59e0b,color:#1e293b,stroke-width:2px,rx:10,ry:10;
class n0 step0;
class n1 step1;
class n2 step2;
class n3 step2;
class n4 step3;
class n5 step4;
End-to-end flow derived from this project's scope and tech stack.
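The "Compare Prices" step in the flow above could look roughly like the following. This is a minimal sketch using only the standard library, assuming both sources have already been reduced to `{product_id: price}` mappings. The function name, field names, and sample values are illustrative and do not come from the project.

```python
from __future__ import annotations


def compare_prices(
    amazon: dict[str, float], client: dict[str, float]
) -> list[dict]:
    """Join two {product_id: price} maps and compute the price gap per product.

    Only products present in both sources are compared.
    """
    rows = []
    for product_id in sorted(amazon.keys() & client.keys()):
        amazon_price = amazon[product_id]
        client_price = client[product_id]
        # Negative difference means Amazon is cheaper than the client source.
        diff = round(amazon_price - client_price, 2)
        if diff < 0:
            cheaper = "amazon"
        elif diff > 0:
            cheaper = "client"
        else:
            cheaper = "tie"
        rows.append(
            {
                "product_id": product_id,
                "amazon_price": amazon_price,
                "client_price": client_price,
                "difference": diff,
                "cheaper_source": cheaper,
            }
        )
    return rows


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    amazon_prices = {"A1": 19.99, "B2": 5.00, "C3": 7.50}
    client_prices = {"A1": 24.99, "B2": 4.50}
    for row in compare_prices(amazon_prices, client_prices):
        print(row)
```

Since the tech stack lists Pandas, the same join could equally be expressed as a `DataFrame.merge` on the product ID column. The dictionary version is used here only to keep the sketch free of dependencies.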
Key Features
- Scheduled execution every four hours with full logging
- Structured output in JSON and Excel for easy downstream use
- Image extraction with PNG/JPG export
- Automatic delivery via Google Drive and email
- Deployed on an Ubuntu remote desktop server for reliability
- **Tech Stack:** Python, Selenium, BeautifulSoup, Pandas, MongoDB, Linux
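The logging and structured-output features listed above might be combined along these lines. This is a hedged sketch only: the logger name `price_scraper`, the `export_report` helper, and the `comparison_<label>.json` file naming are assumptions made for illustration, not details taken from the project.

```python
import json
import logging
import tempfile
from pathlib import Path

# Hypothetical logger name; the project's real logging setup is not described.
logger = logging.getLogger("price_scraper")


def export_report(rows: list, out_dir: Path, run_label: str) -> Path:
    """Write comparison rows to a JSON file and log what was written."""
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / f"comparison_{run_label}.json"
    with out_path.open("w", encoding="utf-8") as fh:
        json.dump(rows, fh, indent=2, ensure_ascii=False)
    logger.info("Wrote %d rows to %s", len(rows), out_path)
    return out_path


if __name__ == "__main__":
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    )
    # Made-up sample row, written to a temporary directory.
    sample_rows = [{"product_id": "A1", "amazon_price": 19.99, "client_price": 24.99}]
    with tempfile.TemporaryDirectory() as tmp:
        path = export_report(sample_rows, Path(tmp) / "reports", "run1")
        print(path.read_text(encoding="utf-8"))
```

An Excel file could be produced from the same rows with pandas `DataFrame.to_excel`, which needs an Excel writer engine such as openpyxl to be installed.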