Deep RL Agents for Flappy Bird & Lunar Lander

Overview

This project trains Deep Q-Network (DQN) and Double DQN (DDQN) agents to play Flappy Bird and Lunar Lander. The agents achieve high scores and sustain long play sessions, demonstrating practical value-based deep reinforcement learning.

Outcome: working DQN and DDQN implementations that consistently outperform baselines on these classic RL benchmarks.
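
The difference between the two agent types comes down to how the bootstrap target is built. The sketch below is a minimal, dependency-free illustration, not the project's actual TensorFlow code: the function names, the list-based Q-values, and the `gamma` default are assumptions made for the example. DQN lets the target network both pick and score the next action, while DDQN picks the action with the online network and scores it with the target network, which tends to reduce overestimation.

```python
from typing import Sequence


def dqn_target(reward: float, done: bool,
               q_target_next: Sequence[float], gamma: float = 0.99) -> float:
    """DQN target: the target network selects and evaluates the next action."""
    if done:
        # Terminal transition: there is no future value to bootstrap from.
        return reward
    return reward + gamma * max(q_target_next)


def ddqn_target(reward: float, done: bool,
                q_online_next: Sequence[float],
                q_target_next: Sequence[float], gamma: float = 0.99) -> float:
    """Double DQN target: online network selects, target network evaluates."""
    if done:
        return reward
    # Index of the action the online network currently prefers in the next state.
    best_action = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    return reward + gamma * q_target_next[best_action]


if __name__ == "__main__":
    # Hypothetical Q-values for a two-action environment (e.g. flap / no flap).
    q_online_next = [1.0, 2.0]   # online network prefers action 1
    q_target_next = [3.0, 0.5]   # target network overrates action 0
    # gamma=0.5 is chosen only to keep the arithmetic easy to follow.
    print(dqn_target(1.0, False, q_target_next, gamma=0.5))                   # 2.5
    print(ddqn_target(1.0, False, q_online_next, q_target_next, gamma=0.5))   # 1.25
```

In this toy case the DQN target inherits the target network's optimistic estimate for action 0, whereas the DDQN target uses the value of the action the online network would actually take.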

Architecture & Pipeline

flowchart LR
    n0["Game Environment
Flappy Bird · Lunar Lander"]
    n1["State Observation
Pixels / vector"]
    n2["DQN / DDQN Agent
TensorFlow"]
    n3["Action
Discrete control"]
    n4["Reward Signal
Score / landing"]
    n5["Replay Buffer
Off-policy training"]
    n6["Updated Policy
High-score play"]
    n0 --> n1
    n1 --> n2
    n2 --> n3
    n3 --> n4
    n4 --> n5
    n5 --> n6
classDef step0 fill:#f1f5f9,stroke:#64748b,color:#1e293b,stroke-width:2px,rx:10,ry:10;
classDef step1 fill:#ecfeff,stroke:#06b6d4,color:#1e293b,stroke-width:2px,rx:10,ry:10;
classDef step2 fill:#f0fdfa,stroke:#0d9488,color:#1e293b,stroke-width:2px,rx:10,ry:10;
classDef step3 fill:#ecfdf5,stroke:#10b981,color:#1e293b,stroke-width:2px,rx:10,ry:10;
classDef step4 fill:#fffbeb,stroke:#f59e0b,color:#1e293b,stroke-width:2px,rx:10,ry:10;
    class n0 step0;
    class n1 step1;
    class n2 step1;
    class n3 step2;
    class n4 step3;
    class n5 step3;
    class n6 step4;

End-to-end flow derived from this project's scope and tech stack.
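
The action and replay-buffer stages of the flow above can be sketched in plain Python. This is an illustrative outline under stated assumptions, not the project's implementation: `Transition`, `ReplayBuffer`, and `epsilon_greedy` are hypothetical names, and the source does not say which exploration strategy or buffer design was used.

```python
import random
from collections import deque, namedtuple

# One step of experience: what the agent saw, did, and received.
Transition = namedtuple("Transition", "state action reward next_state done")


class ReplayBuffer:
    """Fixed-size store of past transitions for off-policy training."""

    def __init__(self, capacity, seed=None):
        # A bounded deque silently evicts the oldest transition when full.
        self._storage = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def push(self, state, action, reward, next_state, done):
        self._storage.append(Transition(state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement breaks up temporal correlation.
        return self._rng.sample(list(self._storage), batch_size)

    def __len__(self):
        return len(self._storage)


def epsilon_greedy(q_values, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


if __name__ == "__main__":
    buffer = ReplayBuffer(capacity=3, seed=0)
    for step in range(5):
        buffer.push(state=step, action=0, reward=1.0, next_state=step + 1, done=False)
    print(len(buffer))  # 3: the two oldest transitions were evicted
    print(epsilon_greedy([0.1, 0.9], epsilon=0.0, rng=random.Random(0)))  # 1
```

Sampling past transitions at random, rather than training on consecutive frames, is what makes the training off-policy in the sense shown in the diagram.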

Key Features

DQN and DDQN agents for two distinct environments
Sustained high-score performance across long episodes
Reproducible training and evaluation setup
Tech Stack: Python, TensorFlow
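
One common ingredient of a reproducible training and evaluation setup is seeding every random number generator in one place. The helper below is a hedged sketch of that idea: `seed_everything` is a hypothetical name, and the source does not describe how the project actually handles seeding. NumPy and TensorFlow are seeded only if they are installed.

```python
import random


def seed_everything(seed):
    """Seed the available RNGs and return the names of the libraries seeded."""
    seeded = ["random"]
    random.seed(seed)

    try:
        import numpy as np
        np.random.seed(seed)
        seeded.append("numpy")
    except ImportError:
        pass  # NumPy is optional for this sketch.

    try:
        import tensorflow as tf
        tf.random.set_seed(seed)  # Global TensorFlow seed (TF 2.x API).
        seeded.append("tensorflow")
    except ImportError:
        pass  # TensorFlow is optional for this sketch.

    return seeded


if __name__ == "__main__":
    seed_everything(123)
    first = random.random()
    seed_everything(123)
    second = random.random()
    print(first == second)  # True: the same seed replays the same draws
```

If the environments follow the Gymnasium API, passing the same seed to `env.reset(seed=...)` would make the environment side repeatable as well; that is an assumption about the setup rather than something the project states.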
