Urdu Image Caption Generation

Overview

An attention-based LSTM model that generates Urdu-language captions for input images. The project required building a custom dataset of images paired with native Urdu captions, then training and deploying the model as an online service. Outcome: First-of-its-kind plug-and-play Urdu captioning model, available online for direct use.

Architecture & Pipeline

flowchart LR
    n0["Image Input
Custom Urdu-caption dataset"]
    n1["CNN Encoder
Visual features"]
    n2["Attention LSTM Decoder
TensorFlow"]
    n3["Urdu Caption
Online inference"]
    n0 --> n1
    n1 --> n2
    n2 --> n3
classDef step0 fill:#f1f5f9,stroke:#64748b,color:#1e293b,stroke-width:2px,rx:10,ry:10;
classDef step1 fill:#ecfeff,stroke:#06b6d4,color:#1e293b,stroke-width:2px,rx:10,ry:10;
classDef step3 fill:#ecfdf5,stroke:#10b981,color:#1e293b,stroke-width:2px,rx:10,ry:10;
classDef step4 fill:#fffbeb,stroke:#f59e0b,color:#1e293b,stroke-width:2px,rx:10,ry:10;
    class n0 step0;
    class n1 step1;
    class n2 step3;
    class n3 step4;

End-to-end flow derived from this project's scope and tech stack. Tap View Fullscreen for a larger view, or scroll horizontally on small screens.

Key Features

Attention-based LSTM architecture
Custom Urdu image-caption dataset
Trained and deployed for online inference
Tech Stack:** Python, TensorFlow

Overview

Architecture & Pipeline

Key Features

More AI & ML projects

COVID-19 Detection from Chest X-rays

Graph Neural Network Fraud Detection

K-Means Fleet Order Assignment