Lesson 5: Training, Testing, and Model Evaluation (1 hour)
Learning Objectives
- Understand why we need to split data into training and testing sets
- Understand the concept of model accuracy
- Recognize overfitting and why it's a problem
- Evaluate a simple ML model
- Understand the importance of data quality
Materials Needed
- Internet connection
- Teachable Machine or similar tool
- Student notebooks
- Examples of good vs. bad models
- Data visualization tools (optional)
Time Breakdown
- Review three types of ML (5 min)
- Training vs. testing data (20 min)
- Model evaluation and accuracy (15 min)
- Hands-on: Building and evaluating models (15 min)
- Wrap-up and unit review (5 min)
Activities
1. Review Three Types of ML (5 min)
- Quick quiz: Name the three types
- When would you use each?
- Bridge: "Now that we know how ML learns, how do we know if it learned well?"
2. Training vs. Testing Data (20 min)
The Problem:
- If we test a model on the same data it trained on, a high score may only mean it memorized those examples
- We need to see if it can handle NEW data (generalization)
The Solution: Split Data
Training Data (70-80% of data):
- Used to teach the model
- Model sees examples and learns patterns
- Like: Studying for a test with practice problems
Testing Data (20-30% of data):
- Used to evaluate the model
- Model has NEVER seen this data before
- Like: Taking the actual test with new problems
Analogy: Learning to Drive
- Training: Practice in parking lot, empty roads (training data)
- Testing: Drive on busy highway you've never been on (testing data)
- If you can only drive in the parking lot, you haven't really learned (overfitting)
Visual Example:
- Show dataset split visually
- Training: 80 examples
- Testing: 20 examples
- Model trains on 80, tests on 20
Why This Matters:
- A model might get 100% on training data (memorized)
- But only 60% on testing data (didn't really learn patterns)
- Good model: Similar performance on both
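The train/test gap can be demonstrated with a short Python sketch. This toy "model" does nothing but memorize its training examples, so it scores perfectly on training data and poorly on unseen data; the numbers and data are invented for illustration, not from a real model:

```python
# A toy "model" that memorizes its training examples exactly.
# Illustrative only: it shows why testing on training data is misleading.

# 10 labeled examples: (fruit weight in grams, label) -- made-up data
data = [(120, "apple"), (130, "apple"), (115, "apple"), (125, "apple"),
        (150, "orange"), (160, "orange"), (155, "orange"), (145, "orange"),
        (118, "apple"), (158, "orange")]

train, test = data[:8], data[8:]           # 80/20 split

memory = {x: label for x, label in train}  # "training" = pure memorization

def predict(x):
    # Returns the memorized label, or a default guess for unseen inputs
    return memory.get(x, "apple")

def accuracy(examples):
    correct = sum(predict(x) == label for x, label in examples)
    return correct / len(examples)

print(accuracy(train))  # 1.0 -- perfect on memorized data
print(accuracy(test))   # 0.5 -- unseen inputs expose the memorization
```

A real model that learned the pattern (oranges weigh more) would score similarly on both sets; a large gap like this is the signature of overfitting.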
3. Model Evaluation and Accuracy (15 min)
What is Accuracy?
- How often the model is correct
- Accuracy = (Correct predictions / Total predictions) × 100%
- Example: 85 out of 100 correct = 85% accuracy
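The accuracy formula translates directly into code; the counts below are the example numbers from above, not real results:

```python
def accuracy(correct, total):
    """Accuracy = (correct predictions / total predictions) x 100%."""
    return correct / total * 100

print(accuracy(85, 100))  # 85.0
```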
Good vs. Bad Models:
Good Model:
- Training accuracy: 90%
- Testing accuracy: 88%
- Close performance = learned general patterns
Bad Model (Overfitting):
- Training accuracy: 99%
- Testing accuracy: 65%
- Big gap = memorized training data, can't generalize
What is Overfitting?
- Model memorizes training data too well
- Learns noise and specific details instead of patterns
- Fails on new data
- Like: Memorizing answers to specific test questions instead of learning the concepts
Ways to Evaluate Models:
- Accuracy: Overall correctness
- Precision: When it says "yes," how often is it right?
- Recall: Of all the "yes" cases, how many did it find?
- Confusion Matrix: Shows what it got right/wrong
For this age group, focus mainly on accuracy; introduce precision and recall for advanced students
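For advanced students, precision and recall can be computed from the four cells of a 2x2 confusion matrix. A minimal sketch for a hypothetical "cat vs. not-cat" classifier (all counts invented):

```python
# Hypothetical confusion-matrix counts for 100 predictions (invented numbers)
true_positives  = 40   # said "cat", was a cat
false_positives = 10   # said "cat", was not a cat
false_negatives = 5    # said "not cat", was actually a cat
true_negatives  = 45   # said "not cat", was not a cat

# Precision: when it says "cat", how often is it right?
precision = true_positives / (true_positives + false_positives)

# Recall: of all the actual cats, how many did it find?
recall = true_positives / (true_positives + false_negatives)

# Accuracy: overall correctness across all 100 predictions
accuracy = (true_positives + true_negatives) / 100

print(f"precision={precision:.2f} recall={recall:.2f} accuracy={accuracy:.2f}")
# precision=0.80 recall=0.89 accuracy=0.85
```

Note that precision and recall can differ even when accuracy looks fine, which is why a confusion matrix shows more than a single accuracy number.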
4. Hands-On: Building and Evaluating Models (15 min)
Activity: Train and Test a Model
Using Teachable Machine or similar:
Step 1: Collect Data (5 min)
- Create an image classification model
- Collect 30 images per class (2-3 classes)
- Split: Use 20 for training, 10 for testing (mentally note which are which)
Step 2: Train Model (2 min)
- Train on the training set
- Note training accuracy (if shown)
Step 3: Test Model (5 min)
- Test on training images (should do well)
- Test on new testing images (the real test)
- Compare: How does it perform?
- Try edge cases: similar objects, different lighting, angles
Step 4: Experiment (3 min)
- Try with very little training data (5 images) - what happens?
- Try with lots of training data (50 images) - what happens?
- Try with similar objects - what happens?
- Discuss observations
Key Observations:
- More training data usually = better (but not always)
- Diverse training data = better generalization
- Testing on new data is the real test
- Edge cases are harder
5. Wrap-Up and Unit Review (5 min)
Key Takeaways:
- Training data: What model learns from
- Testing data: How we evaluate if it really learned
- Accuracy: How often it's correct
- Overfitting: Memorizing instead of learning patterns
Unit 2 Summary:
- Machine learning: Learning from data
- Supervised: Learning with labels
- Unsupervised: Finding patterns without labels
- Reinforcement: Learning from rewards
- Training/testing: How we ensure models work well
Preview Unit 3: Neural Networks - how do models actually learn? What's inside the "black box"?
Differentiation Strategies
- Younger students: Focus on simple accuracy concept, use analogies, simpler models
- Older students: Explore precision/recall, analyze confusion matrices, research cross-validation
- Struggling learners: Provide more structure, use simpler examples, more guidance
- Advanced learners: Research validation sets, explore different evaluation metrics, analyze bias-variance tradeoff
Assessment
- Understanding of training vs. testing
- Successful model evaluation
- Quality of observations
- Unit quiz completion
Unit 2 Assessment Rubric
Formative Assessment (Throughout Unit)
- Participation in discussions: 20%
- Hands-on activities completion: 30%
- Reflection journal entries: 20%
- Homework assignments: 30%
End-of-Unit Assessment
Unit Quiz (25 questions, open-note):
- Machine learning basics (5 questions)
- Supervised learning (5 questions)
- Unsupervised learning (5 questions)
- Reinforcement learning (5 questions)
- Training/testing/evaluation (5 questions)
Hands-On Project: Build and Evaluate a Model
- Create image classification model using Teachable Machine
- Train with at least 2 classes, 20+ images per class
- Test model on new images
- Write reflection:
- What did you build?
- What was your accuracy?
- What worked well? What didn't?
- What would you improve?
Rubric for Hands-On Project
| Criteria | Excellent (4) | Good (3) | Satisfactory (2) | Needs Improvement (1) |
|---|---|---|---|---|
| Model Creation | Successfully created working model | Created model with minor issues | Created model with significant issues | Model doesn't work |
| Understanding | Shows deep understanding of ML concepts | Shows good understanding | Shows basic understanding | Shows limited understanding |
| Evaluation | Thoroughly evaluated model with testing | Evaluated model adequately | Basic evaluation | Limited or no evaluation |
| Reflection | Thoughtful reflection with insights | Good reflection | Basic reflection | Limited reflection |
Unit 2 Resources
Required Tools
- Teachable Machine: https://teachablemachine.withgoogle.com/
- Internet connection for demos
Recommended Exploration
- Google's Machine Learning Crash Course (simplified sections)
- "AI for Kids" resources on ML basics
- Interactive ML visualizations online
Teacher Notes
- Students may struggle with abstract concepts - use lots of analogies
- Hands-on activities are crucial for understanding
- Be prepared for technical issues with online tools
- Emphasize that ML is about patterns, not memorization
- Some students may want to dive deeper - have extension activities ready
Unit 2 Extension Activities (Optional)
For Advanced Students
- Research specific ML algorithms (linear regression, decision trees, etc.)
- Explore more complex models in Teachable Machine
- Research how recommendation systems work
- Create a comparison chart of different ML types with examples
- Research bias in ML models and data
For Students Needing More Support
- Create visual flashcards for ML types
- Draw diagrams showing training vs. testing
- Make a simple guide: "When to use each type of ML"
- Practice identifying ML types with more examples
Next Unit Preview
Unit 3: Neural Networks will explore how some AI systems actually learn. We'll look inside the "black box" to understand neurons, layers, and how neural networks process information. Get ready to visualize and build simple neural network concepts!