---
title: "ReSign — Real-time Sign Language Recognition"
date: "2024-04-01"
description: "Webcam-input sign language classifier built at DiamondHacks 2024."
tech: ["Python", "TensorFlow", "React"]
status: "archived"
links:
  - label: "GitHub"
    href: "https://github.com/randamit123/diamondhacks-frontend/tree/main"
---

# ReSign — Real-time Sign Language Recognition
## Motivation
A weekend hackathon project: take a webcam stream and turn it into ASL letter predictions in real time. The interesting bit isn't the model accuracy — it's the latency budget you have before the experience starts to feel broken.
## Approach
A small CNN classifier trained on a public ASL dataset, paired with MediaPipe-style hand landmark extraction so the input was a stable 21-point skeleton rather than raw pixels. The React frontend captures frames at a fixed cadence and ships them to the model worker.
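The landmark step can be sketched as follows. This is a minimal illustration rather than the project's actual code: it assumes hand landmarks arrive as a `(21, 3)` array of MediaPipe-style `(x, y, z)` coordinates, and the `normalize_landmarks` helper name is invented for the example.

```python
import numpy as np

WRIST = 0  # MediaPipe hand-landmark index 0 is the wrist

def normalize_landmarks(points: np.ndarray) -> np.ndarray:
    """Turn a (21, 3) hand skeleton into a translation- and
    scale-invariant feature vector for a classifier.

    Raw coordinates vary with where the hand sits in the frame and
    how far it is from the camera; anchoring on the wrist and
    dividing by the skeleton's extent removes both sources of
    variation, which is why a 21-point skeleton is a more stable
    input than raw pixels.
    """
    assert points.shape == (21, 3)
    centered = points - points[WRIST]      # translation invariance
    scale = np.abs(centered).max() or 1.0  # guard against a degenerate all-wrist frame
    return (centered / scale).ravel()      # flat 63-dim feature vector

# Example: a skeleton shifted far from the origin still normalizes cleanly
skeleton = np.linspace(0.0, 1.0, 63).reshape(21, 3) + 100.0
features = normalize_landmarks(skeleton)
print(features.shape)  # (63,)
```

The same vector works as input to either a small dense network or the CNN mentioned above once reshaped.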
- Frame-rate decoupled from model rate so the UI never blocks
- Confidence smoothing across frames to suppress flicker
- Lightweight model targeted at sub-50ms inference
## Results
Worked end-to-end at the hackathon for the alphabet. It was recognisably wrong on letters the dataset under-represented, a useful reminder that data coverage, not model capacity, was the real bottleneck.
## What's next
- Move from per-letter to continuous signing (sequence model)
- Profile the inference path with the actual deployment runtime in mind
## References
- The dataset and a number of public Kaggle baselines were our starting point