Week 2: Data Pipeline
Day 14: Week 2 Review
Connect video, gloss, keypoints, NPZ, and metadata.
Goal
Connect video, gloss, keypoints, NPZ pose files, and metadata into one training example.
Learn
- A clean training example usually includes aligned video, frame timing, extracted motion, gloss or labels, metadata, and quality notes.
- Alignment matters. The gloss should match the exact frames used for training, not a longer clip with extra movement.
- Metadata makes the dataset auditable. It tells reviewers where the sample came from, how it was processed, and whether it is approved.
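The alignment point above can be checked mechanically. The following is a minimal sketch in Python. It assumes that frame_end is inclusive and that the pose file holds one keypoint row per frame of the labeled segment. Both conventions, and the helper names expected_frame_count and is_aligned, are assumptions made for illustration and are not part of the course material.

```python
def expected_frame_count(frame_start: int, frame_end: int) -> int:
    """Number of frames in the labeled segment, assuming an inclusive end."""
    if frame_end < frame_start:
        raise ValueError("frame_end must not be before frame_start")
    return frame_end - frame_start + 1


def is_aligned(frame_start: int, frame_end: int, num_pose_frames: int) -> bool:
    """True when the pose data covers exactly the labeled frames."""
    return num_pose_frames == expected_frame_count(frame_start, frame_end)


# Frame range taken from the sample record in this lesson: 18 to 82.
print(expected_frame_count(18, 82))  # 65 under the inclusive assumption
print(is_aligned(18, 82, 65))        # True
print(is_aligned(18, 82, 120))       # False: pose data covers a longer clip
```

If the pipeline treats frame_end as exclusive instead, the count is frame_end - frame_start. Whichever convention is used, it is worth recording it in the metadata or the dataset notes.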
Example
- Sample metadata record: sample_id = asl_00042, source_video = raw/signer03/clip0042.mp4, frame_start = 18, frame_end = 82, gloss = THANK-YOU, pose_file = pose/asl_00042.npz, signer_ref = signer03, consent_id = c-2026-04, qa_status = approved, notes = clear face and both hands.
- If qa_status is review, the training script can skip the sample until a human resolves it.
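The skip rule above might look like the following sketch in Python. The SampleRecord class and the trainable function are hypothetical names chosen for illustration. The first record copies the field values from the example, and the second record is invented only to show the skip path.

```python
from dataclasses import dataclass


@dataclass
class SampleRecord:
    sample_id: str
    source_video: str
    frame_start: int
    frame_end: int
    gloss: str
    pose_file: str
    signer_ref: str
    consent_id: str
    qa_status: str
    notes: str = ""


def trainable(records):
    """Keep only approved samples; skip review and any other status."""
    return [r for r in records if r.qa_status == "approved"]


records = [
    SampleRecord("asl_00042", "raw/signer03/clip0042.mp4", 18, 82, "THANK-YOU",
                 "pose/asl_00042.npz", "signer03", "c-2026-04", "approved",
                 "clear face and both hands"),
    # Hypothetical second record, invented here to show a skipped sample.
    SampleRecord("example_review", "raw/example.mp4", 0, 10, "EXAMPLE",
                 "pose/example_review.npz", "signer_example", "c-example",
                 "review"),
]

print([r.sample_id for r in trainable(records)])  # ['asl_00042']
```

Filtering on an explicit approved value, rather than excluding review, means a record with a missing or misspelled status is also skipped by default.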
Practice
- Create one sample record with source video, signer ID, gloss, frame range, pose file, split, consent reference, and quality status.
- Mark the sample as keep, review, or reject, and explain why.
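One way to check a practice record for completeness is sketched below in Python. The key names follow the example record where it provides them. The key name split, the value train, and the helper missing_fields are assumptions made for illustration.

```python
# Fields listed in the practice task; "split" is an assumed key name.
REQUIRED_FIELDS = [
    "source_video", "signer_ref", "gloss", "frame_start", "frame_end",
    "pose_file", "split", "consent_id", "qa_status",
]


def missing_fields(record: dict) -> list:
    """Return required fields that are absent or empty in the record."""
    return [f for f in REQUIRED_FIELDS if record.get(f) in (None, "")]


# Values copied from the example record in this lesson.
record = {
    "sample_id": "asl_00042",
    "source_video": "raw/signer03/clip0042.mp4",
    "frame_start": 18,
    "frame_end": 82,
    "gloss": "THANK-YOU",
    "pose_file": "pose/asl_00042.npz",
    "signer_ref": "signer03",
    "consent_id": "c-2026-04",
    "qa_status": "approved",
}

print(missing_fields(record))  # ['split']

record["split"] = "train"  # hypothetical value, for illustration only
print(missing_fields(record))  # []
```

The example record as written has no split field, so the check flags it. A frame_start of 0 is not treated as missing, because only None and the empty string count as empty.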
Checkpoint
Before moving on
You can describe what one clean training example contains.
Quality note
A model-ready sample should be traceable back to the source video and review decision.