Deep RL Agents for Flappy Bird & Lunar Lander

Overview

This project trains DQN and DDQN agents to play Flappy Bird and Lunar Lander. The agents achieve high scores and sustain long play sessions, demonstrating practical value-based deep reinforcement learning.

Outcome: working DQN and DDQN implementations that consistently outperform baselines on classic RL benchmarks.
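To make the difference between the two agents concrete, here is a minimal sketch of how each one computes the bootstrap target for a single transition. It is plain Python for illustration, not this project's TensorFlow code: the function names, the `gamma` default, and the use of Python lists for Q-values are all assumptions. In standard DQN the target network both selects and evaluates the next action. In Double DQN the online network selects the action and the target network evaluates it, which is the usual remedy for overestimated Q-values.

```python
from typing import Sequence


def dqn_target(reward: float, next_q_target: Sequence[float],
               done: bool, gamma: float = 0.99) -> float:
    """DQN target: the target network selects AND evaluates the next action."""
    if done:
        # Terminal transition: nothing to bootstrap from.
        return reward
    return reward + gamma * max(next_q_target)


def ddqn_target(reward: float, next_q_online: Sequence[float],
                next_q_target: Sequence[float],
                done: bool, gamma: float = 0.99) -> float:
    """Double DQN target: online network selects, target network evaluates."""
    if done:
        return reward
    # Action choice comes from the online network's estimates...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ...but its value is read from the target network.
    return reward + gamma * next_q_target[best_action]


# Hypothetical Q-values for a two-action environment (e.g. flap / no-op).
q_online = [1.0, 3.0]   # online network prefers action 1
q_target = [4.0, 2.0]   # target network's largest estimate is action 0

y_dqn = dqn_target(1.0, q_target, done=False, gamma=0.5)
y_ddqn = ddqn_target(1.0, q_online, q_target, done=False, gamma=0.5)
```

With these made-up numbers the DQN target bootstraps from the target network's largest estimate, while the DDQN target uses the target network's value for the action the online network chose, so the DDQN target is lower.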

Architecture & Pipeline

flowchart LR
    n0["Game Environment<br/>Flappy Bird · Lunar Lander"]
    n1["State Observation<br/>Pixels / vector"]
    n2["DQN / DDQN Agent<br/>TensorFlow"]
    n3["Action<br/>Discrete control"]
    n4["Reward Signal<br/>Score / landing"]
    n5["Replay Buffer<br/>Off-policy training"]
    n6["Updated Policy<br/>High-score play"]
    n0 --> n1
    n1 --> n2
    n2 --> n3
    n3 --> n4
    n4 --> n5
    n5 --> n6
    classDef step0 fill:#f1f5f9,stroke:#64748b,color:#1e293b,stroke-width:2px,rx:10,ry:10;
    classDef step1 fill:#ecfeff,stroke:#06b6d4,color:#1e293b,stroke-width:2px,rx:10,ry:10;
    classDef step2 fill:#f0fdfa,stroke:#0d9488,color:#1e293b,stroke-width:2px,rx:10,ry:10;
    classDef step3 fill:#ecfdf5,stroke:#10b981,color:#1e293b,stroke-width:2px,rx:10,ry:10;
    classDef step4 fill:#fffbeb,stroke:#f59e0b,color:#1e293b,stroke-width:2px,rx:10,ry:10;
    class n0 step0;
    class n1 step1;
    class n2 step1;
    class n3 step2;
    class n4 step3;
    class n5 step3;
    class n6 step4;

End-to-end flow derived from this project's scope and tech stack.
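The action and replay-buffer stages of this flow can be sketched in a few lines of standard-library Python. This is an assumed, simplified version rather than the project's actual code: `Transition`, `ReplayBuffer`, `epsilon_greedy`, the capacity, and the seed are hypothetical names and values chosen for illustration. The buffer stores past transitions and returns uniform random minibatches, which is what makes the training off-policy. Epsilon-greedy selection is one common way to produce the discrete action from the agent's Q-values.

```python
import random
from collections import deque, namedtuple

# One step of experience: (s, a, r, s', done).
Transition = namedtuple(
    "Transition", ["state", "action", "reward", "next_state", "done"]
)


class ReplayBuffer:
    """Fixed-size FIFO buffer with uniform random sampling."""

    def __init__(self, capacity, seed=None):
        # deque(maxlen=...) silently evicts the oldest item once full.
        self._storage = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        self._storage.append(Transition(state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement breaks temporal correlation.
        return self._rng.sample(list(self._storage), batch_size)

    def __len__(self):
        return len(self._storage)


def epsilon_greedy(q_values, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Usage with placeholder integer states; a real agent would store
# pixel frames or state vectors from the environment instead.
rng = random.Random(0)
buffer = ReplayBuffer(capacity=3, seed=0)
for step in range(5):
    action = epsilon_greedy([0.1, 0.7], epsilon=0.0, rng=rng)
    buffer.add(state=step, action=action, reward=1.0,
               next_state=step + 1, done=False)

batch = buffer.sample(2)
```

Passing a fixed seed to both the buffer and the action-selection RNG keeps sampling repeatable between runs, which is one simple ingredient of a reproducible training setup.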

Key Features

  • DQN and DDQN agents for two distinct environments
  • Sustained high-score performance across long episodes
  • Reproducible training and evaluation setup
  • Tech Stack: Python, TensorFlow