ReSign — Real-time Sign Language Recognition

title: "ReSign — Real-time Sign Language Recognition"
date: "2024-04-01"
description: "Webcam-input sign language classifier built at DiamondHacks 2024."
tech: ["Python", "TensorFlow", "React"]
status: "archived"
links:
  - label: "GitHub"
    href: "https://github.com/randamit123/diamondhacks-frontend/tree/main"

Motivation

A weekend hackathon project: take a webcam stream and turn it into ASL letter predictions in real time. The interesting bit isn't the model accuracy — it's the latency budget you have before the experience starts to feel broken.

Approach

The core is a small CNN classifier trained on a public ASL dataset, paired with MediaPipe-style hand landmark extraction so that the input is a stable 21-point hand skeleton rather than raw pixels. The React frontend captures frames at a fixed cadence and ships them to the model worker.
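To make the "stable 21-point skeleton" idea concrete, here is a minimal sketch of one common way to turn landmarks into a feature vector that ignores where the hand sits in the frame and how far it is from the camera. The function name `normalize_landmarks` and the wrist-centred, max-distance scaling are assumptions chosen for illustration, not necessarily what the hackathon code did; the only borrowed fact is that MediaPipe's hand model places the wrist at index 0.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]

NUM_LANDMARKS = 21  # MediaPipe-style hand skeleton: wrist plus 4 joints per finger


def normalize_landmarks(points: Sequence[Point]) -> List[float]:
    """Flatten 21 (x, y, z) landmarks into a translation- and scale-invariant vector.

    Assumption: index 0 is the wrist, as in MediaPipe's hand model.
    """
    if len(points) != NUM_LANDMARKS:
        raise ValueError(f"expected {NUM_LANDMARKS} landmarks, got {len(points)}")

    wx, wy, wz = points[0]
    # Translate so the wrist sits at the origin.
    centered = [(x - wx, y - wy, z - wz) for x, y, z in points]
    # Scale by the landmark farthest from the wrist so hand size and
    # camera distance cancel out.
    scale = max(math.sqrt(x * x + y * y + z * z) for x, y, z in centered)
    if scale == 0.0:
        scale = 1.0  # degenerate input (all points identical); avoid dividing by zero
    # 21 points * 3 coordinates = 63 features.
    return [c / scale for p in centered for c in p]
```

With this kind of normalisation, the same hand shape shifted across the frame or held closer to the camera produces (up to floating-point error) the same 63-value vector, which is what lets a small classifier concentrate on the pose itself.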

Frame rate decoupled from model rate, so the UI never blocks
Confidence smoothing across frames to suppress flicker
Lightweight model targeting sub-50 ms inference
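The first two points above can be sketched roughly as follows. This is an illustrative sketch, not the project's actual code: `LatestFrameSlot` and `ConfidenceSmoother` are hypothetical names, and the exponential moving average with `alpha` and `threshold` parameters is one assumed way to smooth predictions. The slot lets the capture loop overwrite unprocessed frames so a slow model never builds a backlog, and the smoother only emits a letter once its averaged confidence clears a threshold.

```python
import threading
from typing import Dict, Optional


class LatestFrameSlot:
    """Single-item mailbox: the producer overwrites, the consumer takes the newest frame."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._frame: Optional[object] = None

    def put(self, frame: object) -> None:
        with self._lock:
            self._frame = frame  # any older, unprocessed frame is dropped

    def take(self) -> Optional[object]:
        with self._lock:
            frame, self._frame = self._frame, None
            return frame


class ConfidenceSmoother:
    """Exponential moving average over per-letter probabilities (assumed scheme)."""

    def __init__(self, alpha: float = 0.3, threshold: float = 0.6) -> None:
        self.alpha = alpha          # weight given to the newest frame
        self.threshold = threshold  # minimum smoothed confidence to emit a letter
        self._avg: Dict[str, float] = {}

    def update(self, probs: Dict[str, float]) -> Optional[str]:
        # Decay every known label; labels missing from this frame count as 0.
        for label in set(self._avg) | set(probs):
            prev = self._avg.get(label, 0.0)
            self._avg[label] = (1 - self.alpha) * prev + self.alpha * probs.get(label, 0.0)
        if not self._avg:
            return None
        best = max(self._avg, key=lambda k: self._avg[k])
        return best if self._avg[best] >= self.threshold else None
```

In this sketch a single-frame misprediction lowers the running average without changing the displayed letter, which is the flicker suppression described above. The cost is a few frames of lag before a new letter appears, so `alpha` trades responsiveness against stability.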

Results

The pipeline worked end-to-end at the hackathon for the ASL alphabet. It was recognisably wrong on signs the dataset under-represented, a useful reminder that the real bottleneck is data coverage rather than the model.

What's next

Move from per-letter to continuous signing (sequence model)
Profile the inference path with the actual deployment runtime in mind
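As a starting point for the profiling item, a small timing harness along these lines could report latency percentiles to compare against the sub-50 ms target. It is a hedged sketch: `profile_latency` and its `infer` callable are hypothetical stand-ins for whatever wraps the real model call, and measurements taken in plain Python on a laptop will not match the deployment runtime.

```python
import math
import time
from typing import Callable, Dict, List


def profile_latency(
    infer: Callable[[], object], runs: int = 200, warmup: int = 20
) -> Dict[str, float]:
    """Time `infer` and report latency percentiles in milliseconds."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    for _ in range(warmup):
        infer()  # warm caches and lazy initialisation before measuring

    samples: List[float] = []
    for _ in range(runs):
        start = time.perf_counter()
        infer()
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()

    def pct(p: float) -> float:
        # Nearest-rank percentile over the sorted samples.
        idx = min(len(samples) - 1, max(0, math.ceil(p / 100.0 * len(samples)) - 1))
        return samples[idx]

    return {"p50": pct(50), "p95": pct(95), "max": samples[-1]}
```

Looking at p95 and max rather than the mean matters here, because it is the occasional slow frame, not the average one, that makes a real-time interface feel broken.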

References

The dataset and a number of public Kaggle baselines were our starting point.