Week 1: Foundations
Day 3: Recognition vs Production
Separate systems that read signs from systems that create sign motion.
Goal
Separate sign recognition from sign production.
Learn
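- Recognition means the system reads sign-language input and predicts something: a label, gloss sequence, translation, or class. When the prediction is a sentence in another language, such as English, this lesson treats the task as translation.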
- Production means the system creates sign-language output: pose motion, avatar animation, or a video-like result from text, gloss, or another prompt.
- Recognition and production run in opposite directions, so success in one says little about the other. A good recognition score does not prove an avatar signs naturally, and a smooth avatar does not prove the language is correct.
Example
- Recognition: video clip in, predicted gloss out, for example IX-1 WANT COFFEE (roughly "I want coffee"; IX-1 is commonly used to gloss a first-person point).
- Production: gloss in, pose sequence out, then an avatar renders that motion.
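- Translation: sign video in, English sentence out. This is especially hard because ASL and English are different languages with different grammar, so the English cannot be a sign-by-sign mapping, and the right wording often depends on context.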
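The three examples above differ only in what goes in and what comes out, so the direction can be read off the input and output types. Here is a minimal sketch of that idea in Python. The `Modality` names, the `SIGNED` grouping, and the `task_direction` function are hypothetical and chosen for illustration. The rule that sign input with English output counts as translation follows this lesson's examples and is not a universal definition.

```python
from enum import Enum


class Modality(Enum):
    SIGN_VIDEO = "sign video"
    POSE_SEQUENCE = "pose sequence"
    AVATAR_ANIMATION = "avatar animation"
    GLOSS = "gloss"
    CLASS_LABEL = "class label"
    ENGLISH_TEXT = "English text"


# Assumption: these modalities carry signed content, as input or as output.
SIGNED = {Modality.SIGN_VIDEO, Modality.POSE_SEQUENCE, Modality.AVATAR_ANIMATION}


def task_direction(source: Modality, target: Modality) -> str:
    """Name the task from its input and output modalities."""
    # Signed input, English sentence output: translation between languages.
    if source in SIGNED and target is Modality.ENGLISH_TEXT:
        return "translation"
    # Signed input, symbolic output (gloss, label): the system reads signs.
    if source in SIGNED and target not in SIGNED:
        return "recognition"
    # Symbolic or text input, signed output: the system creates signs.
    if source not in SIGNED and target in SIGNED:
        return "production"
    # Anything else (for example pose to avatar rendering) is outside this sketch.
    return "other"


examples = [
    (Modality.SIGN_VIDEO, Modality.GLOSS),
    (Modality.GLOSS, Modality.POSE_SEQUENCE),
    (Modality.SIGN_VIDEO, Modality.ENGLISH_TEXT),
]
for source, target in examples:
    print(f"{source.value} -> {target.value}: {task_direction(source, target)}")
```

The sketch only looks at types. It says nothing about quality, which is why a recognition score and avatar naturalness still have to be evaluated separately.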
Practice
- Label these as recognition, production, or translation: predicting a sign class, generating avatar movement, producing English captions from sign video, and converting gloss into pose.
- For each one, write what the input is and what the output is.
Checkpoint
Before moving on, check that you can identify whether a system is reading signs, creating signs, or translating between languages.
Pipeline note
Keep the task direction visible in project notes. It prevents vague claims like 'the model does ASL' when the real task is narrower.
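One way to keep the direction visible is to make it a required field in whatever record describes each model or experiment. The sketch below is one possible shape, assuming project notes are kept as small Python records. `TaskCard` and its field names are hypothetical and do not come from any existing tool.

```python
from dataclasses import dataclass

ALLOWED_DIRECTIONS = {"recognition", "production", "translation"}


@dataclass(frozen=True)
class TaskCard:
    """A project-notes entry that cannot omit the task direction."""
    name: str
    direction: str    # must be one of ALLOWED_DIRECTIONS
    input_desc: str   # what the system receives
    output_desc: str  # what the system emits

    def __post_init__(self):
        # Reject vague entries such as "does ASL".
        if self.direction not in ALLOWED_DIRECTIONS:
            raise ValueError(
                f"direction must be one of {sorted(ALLOWED_DIRECTIONS)}, "
                f"got {self.direction!r}"
            )

    def summary(self) -> str:
        """One line that states the direction, the input, and the output."""
        return f"{self.name}: {self.direction} ({self.input_desc} -> {self.output_desc})"


card = TaskCard(
    name="isolated-sign classifier",
    direction="recognition",
    input_desc="video clip of one sign",
    output_desc="sign class label",
)
print(card.summary())
```

Because the constructor rejects any other value, an entry like `direction="does ASL"` fails immediately instead of ending up in the notes.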