Week 1: Foundations
Day 1: What a SignLLM Is
Goal
Understand SignLLMs as AI systems for sign-language recognition, translation, or production.
Learn
- A SignLLM is a sign-language AI system. Depending on the project, it may read sign video, predict gloss, compare pose sequences, generate pose, or help drive an avatar.
- There is no single magic SignLLM that fully translates all ASL in every setting. Current systems are usually built for a specific task, dataset, language variety, camera setup, and output format.
- Inputs and outputs can be paired in several ways: video to gloss, pose to gloss, text to gloss, gloss to pose, pose to avatar, or video to written language. Each direction needs different data and different checks.
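One way to keep these directions straight is to write them down as input/output pairs. The sketch below is only an illustration: the names Modality, DIRECTIONS, and outputs_from are made up for this lesson and do not come from any SignLLM library, and "written language" is represented here as TEXT.

```python
from enum import Enum


class Modality(Enum):
    """Kinds of data a sign-language pipeline may read or write (illustrative names)."""
    VIDEO = "video"
    POSE = "pose"
    GLOSS = "gloss"
    TEXT = "text"      # written language
    AVATAR = "avatar"


# The six directions named in the lesson, as (input, output) pairs.
DIRECTIONS = [
    (Modality.VIDEO, Modality.GLOSS),
    (Modality.POSE, Modality.GLOSS),
    (Modality.TEXT, Modality.GLOSS),
    (Modality.GLOSS, Modality.POSE),
    (Modality.POSE, Modality.AVATAR),
    (Modality.VIDEO, Modality.TEXT),
]


def outputs_from(source):
    """Return the sorted names of every output reachable in one step from `source`."""
    return sorted(dst.value for src, dst in DIRECTIONS if src is source)


if __name__ == "__main__":
    for modality in Modality:
        print(modality.value, "->", outputs_from(modality))
```

Listing the pairs this way makes it easier to see that a project covering "video to gloss" says nothing yet about "gloss to pose": each pair is a separate task with its own data and checks.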
Example
- Recognition example: a learner signs on camera, the system extracts pose, and the model predicts a single gloss label such as HELLO or a gloss sequence such as MY NAME R-A-L-P-H.
- Production example: a prompt is converted into planned gloss, then pose motion, then an avatar preview. The avatar still needs human review before anyone treats it as correct signing.
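The production example can be sketched as a short chain of stages with an explicit review flag. Everything here is a placeholder under stated assumptions: PLANNED_GLOSS, plan_gloss, gloss_to_pose, and ProductionDraft are hypothetical names, the lookup table stands in for a real gloss planner, and the pose frames are empty stubs rather than real motion data.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for a gloss planner: a tiny lookup table.
# A real system would need Deaf language expertise, not a dictionary.
PLANNED_GLOSS = {"hello": ["HELLO"]}


@dataclass
class ProductionDraft:
    prompt: str
    gloss: list = field(default_factory=list)
    pose_frames: list = field(default_factory=list)
    # Stays False until a qualified human reviewer signs off.
    reviewed_by_deaf_expert: bool = False


def plan_gloss(prompt):
    """Stage 1 (placeholder): prompt -> planned gloss."""
    key = prompt.strip().lower()
    if key not in PLANNED_GLOSS:
        raise KeyError(f"no planned gloss for prompt: {prompt!r}")
    return list(PLANNED_GLOSS[key])


def gloss_to_pose(gloss):
    """Stage 2 (placeholder): one empty pose frame per gloss item."""
    return [{"gloss": g, "keypoints": []} for g in gloss]


def build_draft(prompt):
    """Run both stages and return an unreviewed draft for the avatar preview."""
    gloss = plan_gloss(prompt)
    return ProductionDraft(prompt=prompt, gloss=gloss, pose_frames=gloss_to_pose(gloss))


def can_publish(draft):
    """A draft is never treated as correct signing before human review."""
    return draft.reviewed_by_deaf_expert


if __name__ == "__main__":
    draft = build_draft("hello")
    print(draft.gloss, can_publish(draft))
```

The design choice worth noticing is that review is part of the data structure: the draft starts as unreviewed, and nothing in the pipeline can flip that flag on its own.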
Practice
- Draw five boxes: video, text, pose/keypoints, gloss, and avatar motion.
- Add arrows for video to gloss, text to gloss, gloss to pose, and pose to avatar.
- Write one sentence under each arrow explaining what the system is trying to do.
Checkpoint
Before moving on
You can explain that SignLLM is a broad term for a pipeline or model family, not one single tool that automatically understands every sign.
Deaf-first note
Talk about SignLLMs carefully. A model can support access work, research, or prototyping, but Deaf language expertise is still required for meaning, grammar, and cultural fit.