Deep RL Agents for Flappy Bird & Lunar Lander
Overview
This project trains DQN and Double DQN (DDQN) agents to play Flappy Bird and Lunar Lander. The agents reach high scores and sustain long play sessions, showing that value-based deep reinforcement learning works in practice on these tasks. Outcome: working DQN and DDQN implementations that consistently outperform baselines on classic RL benchmarks.
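The difference between the two agent types comes down to how the bootstrap target is computed. The sketch below illustrates the standard DQN and Double DQN targets for a single transition. It is written in plain Python rather than TensorFlow to stay self-contained; the function names, the `gamma` default, and the list-based Q-value inputs are illustrative assumptions, not the project's actual code.

```python
from typing import Sequence


def dqn_target(reward: float, done: bool,
               next_q_target: Sequence[float], gamma: float = 0.99) -> float:
    """Vanilla DQN: the target network both picks and scores the next action."""
    if done:
        return reward  # no bootstrapping past a terminal state
    return reward + gamma * max(next_q_target)


def ddqn_target(reward: float, done: bool,
                next_q_online: Sequence[float],
                next_q_target: Sequence[float], gamma: float = 0.99) -> float:
    """Double DQN: the online network picks the action, the target network scores it."""
    if done:
        return reward
    # Action selection uses the online network's estimates...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ...while the value of that action comes from the target network.
    return reward + gamma * next_q_target[best_action]
```

Decoupling selection from evaluation is what lets Double DQN reduce the overestimation that the `max` in the vanilla target tends to introduce.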
Architecture & Pipeline
flowchart LR
n0["Game Environment<br/>Flappy Bird · Lunar Lander"]
n1["State Observation<br/>Pixels / vector"]
n2["DQN / DDQN Agent<br/>TensorFlow"]
n3["Action<br/>Discrete control"]
n4["Reward Signal<br/>Score / landing"]
n5["Replay Buffer<br/>Off-policy training"]
n6["Updated Policy<br/>High-score play"]
n0 --> n1
n1 --> n2
n2 --> n3
n3 --> n4
n4 --> n5
n5 --> n6
classDef step0 fill:#f1f5f9,stroke:#64748b,color:#1e293b,stroke-width:2px,rx:10,ry:10;
classDef step1 fill:#ecfeff,stroke:#06b6d4,color:#1e293b,stroke-width:2px,rx:10,ry:10;
classDef step2 fill:#f0fdfa,stroke:#0d9488,color:#1e293b,stroke-width:2px,rx:10,ry:10;
classDef step3 fill:#ecfdf5,stroke:#10b981,color:#1e293b,stroke-width:2px,rx:10,ry:10;
classDef step4 fill:#fffbeb,stroke:#f59e0b,color:#1e293b,stroke-width:2px,rx:10,ry:10;
class n0 step0;
class n1 step1;
class n2 step1;
class n3 step2;
class n4 step3;
class n5 step3;
class n6 step4;
End-to-end flow derived from this project's scope and tech stack.
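The replay buffer stage of the pipeline can be sketched as a fixed-size store of transitions that is sampled uniformly at random for off-policy updates. This is a minimal illustration using only the standard library; `Transition`, `ReplayBuffer`, the capacity, and the seed are hypothetical names and values chosen for the example, not taken from the project.

```python
import random
from collections import deque
from typing import List, NamedTuple, Optional, Sequence


class Transition(NamedTuple):
    state: Sequence[float]
    action: int
    reward: float
    next_state: Sequence[float]
    done: bool


class ReplayBuffer:
    """Fixed-capacity FIFO store; the oldest transitions are evicted first."""

    def __init__(self, capacity: int, seed: Optional[int] = None) -> None:
        self._storage = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def add(self, transition: Transition) -> None:
        self._storage.append(transition)

    def sample(self, batch_size: int) -> List[Transition]:
        # Uniform sampling without replacement breaks the temporal
        # correlation between consecutive environment steps.
        return self._rng.sample(list(self._storage), batch_size)

    def __len__(self) -> int:
        return len(self._storage)


# Usage: store five dummy transitions in a buffer that holds three.
buffer = ReplayBuffer(capacity=3, seed=0)
for step in range(5):
    buffer.add(Transition([float(step)], step, 1.0, [float(step + 1)], False))
batch = buffer.sample(2)
```

In a full training loop, each sampled batch would be passed to the agent's update step, and sampling would typically start only once the buffer holds at least one batch of transitions.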
Key Features
- DQN and DDQN agents for two distinct environments
- Sustained high-score performance across long episodes
- Reproducible training and evaluation setup
- Tech Stack: Python, TensorFlow
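One common ingredient of a reproducible training setup is seeding every random number generator from a single value. The helper below is a sketch under that assumption; `set_global_seed` and its `extra_seeders` parameter are hypothetical and do not describe how this project actually handles seeding.

```python
import random
from typing import Callable, Iterable


def set_global_seed(seed: int,
                    extra_seeders: Iterable[Callable[[int], None]] = ()) -> None:
    """Seed Python's RNG, then any framework-specific seeders passed in."""
    random.seed(seed)
    for seeder in extra_seeders:
        # Each seeder is a callable that accepts the integer seed.
        seeder(seed)


# Usage: the same seed yields the same random draws.
set_global_seed(42)
first_draw = random.random()
set_global_seed(42)
second_draw = random.random()
```

With TensorFlow installed, `tf.random.set_seed` could be passed in `extra_seeders`, and the environment itself would usually need to be seeded as well.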