Week 3: Training and Review
Day 20: Validation
Goal
Understand validation as a way to test the model during development, before the final evaluation.
Learn
- Validation uses examples the model did not train on. It shows whether the model is learning patterns that generalize rather than memorizing the training data.
- A separate test set is usually saved for final evaluation. Do not tune the model against the test set: once tuning decisions depend on it, it stops being a fair test.
- For sign-language data, splits should avoid leakage. If the same signer or a near-duplicate clip appears in both train and validation, scores may look better than real-world performance.
Example
- A simple split could be 70 percent train, 15 percent validation, and 15 percent test.
- A signer-independent split is stricter: some signers appear only in validation or test, so the model must handle people it did not train on.
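The signer-independent split above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function name, the (clip_id, signer_id) pair format, and the default fractions are assumptions made for the example. Note that the fractions are applied to signers, so clip counts match 70/15/15 only approximately unless every signer has the same number of clips.

```python
import random
from collections import defaultdict


def signer_independent_split(clips, val_frac=0.15, test_frac=0.15, seed=0):
    """Assign whole signers to train, validation, or test.

    `clips` is a list of (clip_id, signer_id) pairs. This layout is
    hypothetical and chosen only for illustration.
    """
    # Group clip IDs by the signer who appears in them.
    by_signer = defaultdict(list)
    for clip_id, signer_id in clips:
        by_signer[signer_id].append(clip_id)

    # Shuffle signers (not clips) with a fixed seed so the split is repeatable.
    signers = sorted(by_signer)
    random.Random(seed).shuffle(signers)

    # Reserve at least one signer each for validation and test.
    n_val = max(1, round(len(signers) * val_frac))
    n_test = max(1, round(len(signers) * test_frac))
    groups = {
        "validation": signers[:n_val],
        "test": signers[n_val:n_val + n_test],
        "train": signers[n_val + n_test:],
    }

    # Expand each group of signers back into its list of clip IDs.
    return {
        name: [clip for signer in members for clip in by_signer[signer]]
        for name, members in groups.items()
    }
```

Because every clip from a given signer lands in the same split, validation and test scores reflect people the model has never seen.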
Practice
- Pretend you have 100 reviewed clips.
- Split them into train, validation, and test counts.
- Write one rule to prevent duplicates or signer leakage.
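One possible way to check your answer is sketched below. It assumes a 70/15/15 split and a hypothetical record layout in which each clip is a dict with "signer_id" and "sha256" fields; neither the field names nor the helper names come from the course pipeline. The rule it encodes is: no signer ID and no file hash may appear in more than one split. An exact hash catches only byte-identical copies, so near-duplicate clips would still need a separate check.

```python
def split_counts(n_clips, train_frac=0.70, val_frac=0.15):
    """Turn fractions into whole clip counts; test takes the remainder."""
    n_train = round(n_clips * train_frac)
    n_val = round(n_clips * val_frac)
    n_test = n_clips - n_train - n_val
    return {"train": n_train, "validation": n_val, "test": n_test}


def leaked_values(splits, key):
    """Return values of `key` that appear in more than one split.

    `splits` maps a split name to a list of clip records (dicts).
    The record fields are hypothetical, e.g. "signer_id" or "sha256".
    """
    first_seen = {}  # value -> name of the first split it appeared in
    leaks = set()
    for split_name, records in splits.items():
        for record in records:
            value = record[key]
            if first_seen.setdefault(value, split_name) != split_name:
                leaks.add(value)
    return leaks


counts = split_counts(100)  # 70 train, 15 validation, 15 test
```

An empty set from leaked_values for both "signer_id" and "sha256" means the rule holds for that split.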
Checkpoint
Before moving on
You can explain why a model must be evaluated on examples it did not see during training.
Pipeline note
Record each clip's split name in its metadata so the same split can be reproduced in later training runs.
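As a rough sketch of what such metadata could look like, the snippet below stores each clip's split name alongside the random seed and a label for the split strategy in a JSON file. The field names ("seed", "strategy", "clips", "clip_id", "split") and both helper functions are assumptions for illustration, not an existing pipeline format.

```python
import json


def build_split_metadata(assignments, seed, strategy):
    """Build a JSON-friendly record of which clip went to which split.

    `assignments` maps clip_id -> split name. All field names here are
    hypothetical.
    """
    records = [
        {"clip_id": clip_id, "split": assignments[clip_id]}
        for clip_id in sorted(assignments)  # sorted for stable output
    ]
    return {"seed": seed, "strategy": strategy, "clips": records}


def save_split_metadata(metadata, path):
    """Write the metadata to disk as indented JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(metadata, f, indent=2, sort_keys=True)


metadata = build_split_metadata(
    {"clip_002": "validation", "clip_001": "train"},
    seed=0,
    strategy="signer-independent",
)
```

Saving the seed and strategy next to the per-clip split names lets a later run either reload the exact assignment or regenerate it the same way.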