Week 3: Training and Review

Day 20: Validation

Day 20 of 28, 18 min

Goal

Understand validation as testing during development.

Learn

  • Validation uses examples the model did not train on. It helps show whether the model is learning a pattern that may generalize.
  • A separate test set is usually saved for final evaluation. Do not tune the model repeatedly against the test set, or it stops being a fair measure of performance on unseen data.
  • For sign-language data, splits should avoid leakage. If the same signer or near-duplicate clip appears in train and validation, scores may look better than real-world performance.
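
The leakage point above can be checked mechanically. The following is a minimal sketch, assuming each clip is described by a record with the hypothetical keys "clip_id" and "signer_id" (neither name comes from this lesson). It reports signers and clips that appear in both train and validation.

```python
def find_leakage(train, validation):
    """Return signer IDs and clip IDs that appear in both splits.

    Each record is a dict with the hypothetical keys "clip_id" and "signer_id".
    """
    shared_signers = {r["signer_id"] for r in train} & {r["signer_id"] for r in validation}
    shared_clips = {r["clip_id"] for r in train} & {r["clip_id"] for r in validation}
    return sorted(shared_signers), sorted(shared_clips)


# Made-up records for illustration only.
train = [
    {"clip_id": "c001", "signer_id": "s01"},
    {"clip_id": "c002", "signer_id": "s02"},
]
validation = [
    {"clip_id": "c003", "signer_id": "s02"},  # same signer as a training clip
    {"clip_id": "c004", "signer_id": "s03"},
]

signers, clips = find_leakage(train, validation)
print(signers)  # ['s02']
print(clips)    # []
```

Matching on exact IDs only catches repeated signers and repeated clip IDs. Near-duplicate clips stored under different IDs would need a separate similarity check, which this sketch does not attempt.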

Example

  • A simple split could be 70 percent train, 15 percent validation, and 15 percent test.
  • A signer-independent split is stricter: some signers appear only in validation or test, so the model must handle people it did not train on.
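
Both kinds of split can be sketched in a few lines. This is an illustrative sketch rather than a prescribed method: the function names, the clip and signer IDs, and the choice of which signers to hold out are all assumptions made for the example.

```python
import random


def simple_split(clip_ids, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle clips and cut them into train / validation / test lists."""
    ids = sorted(clip_ids)  # sort first so the result does not depend on input order
    random.Random(seed).shuffle(ids)  # fixed seed so the split can be reproduced
    n_train = round(len(ids) * train_frac)
    n_val = round(len(ids) * val_frac)
    return {
        "train": ids[:n_train],
        "validation": ids[n_train:n_train + n_val],
        "test": ids[n_train + n_val:],
    }


def signer_independent_split(clip_to_signer, val_signers, test_signers):
    """Assign every clip by its signer, so no signer spans two splits."""
    if val_signers & test_signers:
        raise ValueError("a signer cannot be held out for both validation and test")
    splits = {"train": [], "validation": [], "test": []}
    for clip_id, signer in sorted(clip_to_signer.items()):
        if signer in test_signers:
            splits["test"].append(clip_id)
        elif signer in val_signers:
            splits["validation"].append(clip_id)
        else:
            splits["train"].append(clip_id)
    return splits


clips = [f"clip_{i:03d}" for i in range(100)]
by_count = simple_split(clips)
print({name: len(ids) for name, ids in by_count.items()})
# {'train': 70, 'validation': 15, 'test': 15}

# Hypothetical data: 10 signers with 10 clips each.
clip_to_signer = {clip: f"s{i // 10:02d}" for i, clip in enumerate(clips)}
by_signer = signer_independent_split(
    clip_to_signer, val_signers={"s08"}, test_signers={"s09"}
)
print({name: len(ids) for name, ids in by_signer.items()})
# {'train': 80, 'validation': 10, 'test': 10}
```

In the signer-independent version the split sizes follow from how many clips each held-out signer recorded, so the percentages will rarely land exactly on 70/15/15.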

Practice

  1. Pretend you have 100 reviewed clips.
  2. Split them into train, validation, and test counts.
  3. Write one rule to prevent duplicates or signer leakage.

Checkpoint

Before moving on

You can explain why models need unseen examples.

Pipeline note

Record each clip's split name (train, validation, or test) in its metadata so the same split can be reused when training is repeated later.
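
One possible way to record this is a small JSON record that maps each clip to its split. The sketch below is an assumption about layout, not a required format: the field names "seed", "note", and "assignments" are hypothetical, and storing the seed is an extra suggested here rather than something this lesson requires.

```python
import json


def build_split_metadata(splits, seed, note=""):
    """Flatten a {split_name: [clip_id, ...]} mapping into a JSON-ready record."""
    assignments = {}
    for split_name, clip_ids in splits.items():
        for clip_id in clip_ids:
            if clip_id in assignments:
                # A clip listed twice would be leakage, so stop early.
                raise ValueError(
                    f"{clip_id} is in both {assignments[clip_id]} and {split_name}"
                )
            assignments[clip_id] = split_name
    return {"seed": seed, "note": note, "assignments": assignments}


# Made-up clip IDs for illustration only.
splits = {"train": ["c001", "c002"], "validation": ["c003"], "test": ["c004"]}
metadata = build_split_metadata(splits, seed=0, note="example only")

text = json.dumps(metadata, indent=2, sort_keys=True)  # this string could be saved to a file
restored = json.loads(text)
print(restored["assignments"]["c003"])  # validation
```

Reading the saved record back gives the exact clip-to-split assignment, so a later training run does not have to reshuffle the data.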