Week 1: Foundations
Day 6: Why Pose Data Is Used
Goal
Understand why researchers convert videos into pose data (named keypoints).
Learn
- Raw video includes identity, clothing, background, lighting, camera quality, and many details that may distract the model.
- Pose extraction reduces a frame to named points on the body, hands, and face. This can make patterns easier to compare across signers and backgrounds.
- Pose data is useful but incomplete. It can lose handshape detail, facial expression, mouth movement, and subtle timing if the extractor is weak or the video is hard to read.
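To make "easier to compare across signers" concrete, here is a minimal sketch of one possible normalization step: re-centering keypoints on the shoulder midpoint and scaling by shoulder width. This is an assumed approach for illustration only, not a method prescribed by this course. The keypoint names, the coordinates, and the normalize_pose helper are all hypothetical.

```python
from math import hypot

def normalize_pose(keypoints):
    """Re-center and re-scale (x, y) keypoints relative to the shoulders."""
    lx, ly = keypoints["left_shoulder"]
    rx, ry = keypoints["right_shoulder"]
    cx, cy = (lx + rx) / 2, (ly + ry) / 2   # shoulder midpoint becomes the origin
    scale = hypot(lx - rx, ly - ry)         # shoulder width becomes the unit length
    if scale == 0:
        raise ValueError("shoulders coincide; cannot normalize")
    return {name: ((x - cx) / scale, (y - cy) / scale)
            for name, (x, y) in keypoints.items()}

# Two hypothetical signers in the same posture, but at different
# positions and sizes in the frame (all coordinates are invented).
signer_a = {"left_shoulder": (300, 200), "right_shoulder": (500, 200),
            "right_wrist": (500, 400)}
signer_b = {"left_shoulder": (100, 100), "right_shoulder": (200, 100),
            "right_wrist": (200, 200)}

norm_a = normalize_pose(signer_a)
norm_b = normalize_pose(signer_b)
print(norm_a["right_wrist"], norm_b["right_wrist"])
```

After normalization both wrists land on the same relative position, even though the raw pixel values differ. Doing the same comparison on raw video pixels would be much harder, because background, clothing, and camera distance all change the pixel values.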
Example
- A video frame may become numbers like left_wrist = (412, 288, 0.91) and right_index_tip = (540, 330, 0.83). Here the first two values read as x and y positions in pixels. The final value is often a confidence score.
- Low confidence on fingertips is a warning that the handshape in that frame may be wrong, so the model should not blindly trust it.
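The confidence check described in the example above can be sketched in a few lines. The 0.5 threshold, the left_index_tip entry, and the low_confidence_points helper are assumptions made for illustration. Real extractors differ in how they report confidence and in what counts as "low".

```python
# Hypothetical frame: each keypoint is (x, y, confidence).
frame = {
    "left_wrist": (412, 288, 0.91),
    "right_index_tip": (540, 330, 0.83),
    "left_index_tip": (398, 301, 0.22),  # invented low-confidence point
}

MIN_CONFIDENCE = 0.5  # assumed threshold; tune it for your extractor and data

def low_confidence_points(frame, threshold=MIN_CONFIDENCE):
    """Return the names of keypoints whose confidence is below the threshold."""
    return sorted(name for name, (_x, _y, conf) in frame.items()
                  if conf < threshold)

flagged = low_confidence_points(frame)
print(flagged)
```

A flagged frame does not have to be thrown away. It is simply a cue to treat those points with caution or to look at the original video.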
Practice
- List three benefits of pose data: smaller files, less background noise, easier motion comparison.
- List three risks: missed fingers, lost facial grammar, identity or signer variation still leaking through body motion.
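The "smaller files" benefit can be checked with rough arithmetic. The sketch below assumes an uncompressed 1920x1080 RGB frame and a hypothetical extractor that outputs 75 keypoints, each stored as three 32-bit floats (x, y, confidence). Both figures are assumptions; real video is compressed, so the gap in practice is smaller than this raw comparison suggests.

```python
# Assumed frame size: 1920 x 1080 pixels, 3 bytes per pixel (RGB, uncompressed).
frame_bytes = 1920 * 1080 * 3

# Assumed pose layout: 75 keypoints (hypothetical count),
# each stored as x, y, confidence in 4-byte floats.
NUM_KEYPOINTS = 75
pose_bytes = NUM_KEYPOINTS * 3 * 4

ratio = frame_bytes / pose_bytes
print(f"raw frame: {frame_bytes} bytes, pose: {pose_bytes} bytes, ratio: {ratio:.0f}x")
```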
Checkpoint
Before moving on
You can explain why pose data is useful but incomplete.
Quality note
Pose data should be visualized and inspected. Numbers alone can hide obvious signing problems.
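As a dependency-free stand-in for a real visualization, the sketch below plots keypoints onto a small text grid so that misplaced points become visible at a glance. In practice you would more likely draw the keypoints over the original video frame with a plotting or image library. The 640x480 frame size, the grid dimensions, and the render_ascii helper are all assumptions for illustration.

```python
def render_ascii(keypoints, frame_w, frame_h, cols=40, rows=12):
    """Draw each keypoint as the first letter of its name on a text grid."""
    grid = [["." for _ in range(cols)] for _ in range(rows)]
    for name, (x, y, _conf) in keypoints.items():
        # Map pixel coordinates to grid cells, clamping points that fall
        # outside the frame onto the nearest edge.
        col = max(0, min(cols - 1, int(x / frame_w * cols)))
        row = max(0, min(rows - 1, int(y / frame_h * rows)))
        grid[row][col] = name[0].upper()
    return "\n".join("".join(row) for row in grid)

# Hypothetical frame, assumed to come from a 640 x 480 video.
frame = {
    "left_wrist": (412, 288, 0.91),
    "right_index_tip": (540, 330, 0.83),
}

picture = render_ascii(frame, frame_w=640, frame_h=480)
print(picture)
```

Even a crude picture like this can reveal problems, such as a hand that jumps to a corner of the frame, that are easy to miss in a column of numbers.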