Unit 2: Machine Learning Basics

Lesson 5: Training, Testing, and Model Evaluation (1 hour)

Learning Objectives

  • Understand why we need to split data into training and testing sets
  • Understand the concept of model accuracy
  • Recognize overfitting and why it's a problem
  • Evaluate a simple ML model
  • Understand the importance of data quality

Materials Needed

  • Internet connection
  • Teachable Machine or similar tool
  • Student notebooks
  • Examples of good vs. bad models
  • Data visualization tools (optional)

Time Breakdown

  • Review three types of ML (5 min)
  • Training vs. testing data (20 min)
  • Model evaluation and accuracy (15 min)
  • Hands-on: Building and evaluating models (15 min)
  • Wrap-up and unit review (5 min)

Activities

1. Review Three Types of ML (5 min)

  • Quick quiz: Name the three types
  • When would you use each?
  • Bridge: "Now that we know how ML learns, how do we know if it learned well?"

2. Training vs. Testing Data (20 min)

The Problem:

  • If we test a model on the same data it trained on, it might just memorize
  • We need to see if it can handle NEW data (generalization)

The Solution: Split Data

Training Data (70-80% of data):

  • Used to teach the model
  • Model sees examples and learns patterns
  • Like: Studying for a test with practice problems

Testing Data (20-30% of data):

  • Used to evaluate the model
  • Model has NEVER seen this data before
  • Like: Taking the actual test with new problems

Analogy: Learning to Drive

  • Training: Practice in parking lot, empty roads (training data)
  • Testing: Drive on busy highway you've never been on (testing data)
  • If you can only drive in the parking lot, you haven't really learned (overfitting)

Visual Example:

  • Show dataset split visually
  • Training: 80 examples
  • Testing: 20 examples
  • Model trains on 80, tests on 20
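
For classes that are ready to see the split in code, here is a minimal Python sketch of the 80/20 example above. The file names, the `split_dataset` helper, and the fixed seed are illustrative assumptions, not part of any particular tool.

```python
import random

def split_dataset(examples, train_fraction=0.8, seed=42):
    """Shuffle a copy of the examples, then split it into (training, testing)."""
    shuffled = list(examples)
    # Shuffling first keeps the testing set from being, say, only the last photos taken.
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical dataset of 100 image file names.
examples = [f"image_{i:03d}.jpg" for i in range(100)]
training, testing = split_dataset(examples)

print(len(training), len(testing))        # 80 20
print(set(training).isdisjoint(testing))  # True: no image is in both sets
```

The key property to point out is the last line: no example appears in both sets, so the testing images really are new to the model.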

Why This Matters:

  • A model might get 100% on training data (memorized)
  • But only 60% on testing data (didn't really learn patterns)
  • Good model: High accuracy on both sets, with only a small gap between them

3. Model Evaluation and Accuracy (15 min)

What is Accuracy?

  • How often the model is correct
  • Accuracy = (Correct predictions / Total predictions) × 100%
  • Example: 85 out of 100 correct = 85% accuracy
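
The accuracy formula can be sketched in a few lines of Python. The function name `accuracy_percent` and the cat/dog labels are made up for illustration.

```python
def accuracy_percent(predictions, true_labels):
    """Accuracy = (correct predictions / total predictions) x 100."""
    if len(predictions) != len(true_labels) or not true_labels:
        raise ValueError("need two non-empty lists of the same length")
    correct = sum(p == t for p, t in zip(predictions, true_labels))
    return 100 * correct / len(true_labels)

# Hypothetical results: 100 cat photos, and the model called 85 of them "cat".
true_labels = ["cat"] * 100
predictions = ["cat"] * 85 + ["dog"] * 15

print(accuracy_percent(predictions, true_labels))  # 85.0
```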

Good vs. Bad Models:

Good Model:

  • Training accuracy: 90%
  • Testing accuracy: 88%
  • Close performance = learned general patterns

Bad Model (Overfitting):

  • Training accuracy: 99%
  • Testing accuracy: 65%
  • Big gap = memorized training data, can't generalize

What is Overfitting?

  • Model memorizes training data too well
  • Learns noise and specific details instead of patterns
  • Fails on new data
  • Like: Memorizing answers to specific test questions instead of learning the concepts
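
To make the "memorizing vs. learning" idea concrete, the toy Python sketch below compares two made-up models on made-up data (numbers labeled "small" or "big"). `Memorizer` and `ThresholdModel` are hypothetical classes written for this illustration; real overfitting is usually subtler than a lookup table, so treat this as a caricature.

```python
def true_label(number):
    """The real pattern behind the made-up data: 50 and up is 'big'."""
    return "big" if number >= 50 else "small"

class Memorizer:
    """Stores every training example; guesses 'small' for anything unseen."""
    def __init__(self):
        self.memory = {}
    def train(self, examples):
        for number, label in examples:
            self.memory[number] = label
    def predict(self, number):
        return self.memory.get(number, "small")

class ThresholdModel:
    """Learns one cutoff halfway between the largest 'small' and smallest 'big'."""
    def __init__(self):
        self.cutoff = None
    def train(self, examples):
        largest_small = max(n for n, label in examples if label == "small")
        smallest_big = min(n for n, label in examples if label == "big")
        self.cutoff = (largest_small + smallest_big) / 2
    def predict(self, number):
        return "big" if number > self.cutoff else "small"

def accuracy_percent(model, examples):
    correct = sum(model.predict(n) == label for n, label in examples)
    return 100 * correct / len(examples)

training = [(n, true_label(n)) for n in range(0, 100, 5)]   # 20 examples
testing = [(n, true_label(n)) for n in range(2, 100, 10)]   # 10 unseen examples

for model in (Memorizer(), ThresholdModel()):
    model.train(training)
    print(type(model).__name__,
          "training:", accuracy_percent(model, training),
          "testing:", accuracy_percent(model, testing))
# Memorizer training: 100.0 testing: 50.0
# ThresholdModel training: 100.0 testing: 100.0
```

Both models are perfect on the training data, so training accuracy alone cannot tell them apart. Only the unseen testing numbers reveal that the memorizer never learned the pattern.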

Ways to Evaluate Models:

  1. Accuracy: Overall correctness
  2. Precision: When it says "yes," how often is it right?
  3. Recall: Of all the "yes" cases, how many did it find?
  4. Confusion Matrix: Shows what it got right/wrong

For this age group, focus mainly on accuracy. Introduce precision and recall for advanced students.
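
For advanced students, the four measures listed above can be worked through on a small yes/no example. This is a minimal Python sketch; the `confusion_counts` helper and the 20 made-up predictions are assumptions for illustration only.

```python
def confusion_counts(predictions, true_labels, positive="yes"):
    """Count the four cells of a two-class confusion matrix."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for predicted, actual in zip(predictions, true_labels):
        if predicted == positive and actual == positive:
            counts["tp"] += 1   # said "yes", really "yes"
        elif predicted == positive:
            counts["fp"] += 1   # said "yes", really "no"
        elif actual == positive:
            counts["fn"] += 1   # said "no", really "yes"
        else:
            counts["tn"] += 1   # said "no", really "no"
    return counts

# Hypothetical test set: 10 real "yes" cases and 10 real "no" cases.
true_labels = ["yes"] * 10 + ["no"] * 10
predictions = ["yes"] * 8 + ["no"] * 2 + ["yes"] * 4 + ["no"] * 6

c = confusion_counts(predictions, true_labels)
accuracy = (c["tp"] + c["tn"]) / len(true_labels)
precision = c["tp"] / (c["tp"] + c["fp"])   # when it says "yes", how often is it right?
recall = c["tp"] / (c["tp"] + c["fn"])      # of all real "yes" cases, how many did it find?

print(c)  # {'tp': 8, 'fp': 4, 'fn': 2, 'tn': 6}
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
# accuracy=0.70 precision=0.67 recall=0.80
```

Note that the same model scores differently on each measure, which is why accuracy alone can hide what kind of mistakes a model makes.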

4. Hands-On: Building and Evaluating Models (15 min)

Activity: Train and Test a Model

Using Teachable Machine or similar:

  1. Collect Data (5 min)

    • Create image classification model
    • Collect 30 images per class (2-3 classes)
    • Split: Use 20 for training and set aside 10 for testing (keep the testing images separate and do not add them to any class)
  2. Train Model (2 min)

    • Train on training set
    • Note training accuracy (if shown)
  3. Test Model (5 min)

    • Test on training images (should do well)
    • Test on new testing images (real test)
    • Compare: How does it perform?
    • Try edge cases: Similar objects, different lighting, angles
  4. Experiment (3 min)

    • Try with very little training data (5 images) - what happens?
    • Try with lots of training data (50 images) - what happens?
    • Try with similar objects - what happens?
    • Discuss observations
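
If students record each hand-tested image as a (true label, predicted label) pair in their notebooks, the tallies can be turned into per-class accuracy. The Python sketch below is one possible way to do that; the `per_class_accuracy` helper, the apple/tomato classes, and the counts are invented for illustration.

```python
from collections import Counter

def per_class_accuracy(results):
    """results: list of (true_label, predicted_label) pairs recorded by hand."""
    totals = Counter(true for true, _ in results)
    correct = Counter(true for true, predicted in results if true == predicted)
    return {label: 100 * correct[label] / totals[label] for label in totals}

# Hypothetical tally from testing 10 new images per class.
results = (
    [("apple", "apple")] * 9 + [("apple", "tomato")] * 1
    + [("tomato", "tomato")] * 6 + [("tomato", "apple")] * 4
)

print(per_class_accuracy(results))  # {'apple': 90.0, 'tomato': 60.0}
```

Breaking accuracy down by class makes it easier to spot which similar-looking objects the model confuses.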

Key Observations:

  • More training data usually improves results, but not when the extra examples are low quality or all look alike
  • Diverse training data = better generalization
  • Testing on new data is the real test
  • Edge cases are harder

5. Wrap-Up and Unit Review (5 min)

Key Takeaways:

  • Training data: What model learns from
  • Testing data: How we evaluate if it really learned
  • Accuracy: How often it's correct
  • Overfitting: Memorizing instead of learning patterns

Unit 2 Summary:

  • Machine learning: Learning from data
  • Supervised: Learning with labels
  • Unsupervised: Finding patterns without labels
  • Reinforcement: Learning from rewards
  • Training/testing: How we ensure models work well

Preview Unit 3: Neural Networks - how do models actually learn? What's inside the "black box"?

Differentiation Strategies

  • Younger students: Focus on simple accuracy concept, use analogies, simpler models
  • Older students: Explore precision/recall, analyze confusion matrices, research cross-validation
  • Struggling learners: Provide more structure, use simpler examples, more guidance
  • Advanced learners: Research validation sets, explore different evaluation metrics, analyze the bias-variance tradeoff

Assessment

  • Understanding of training vs. testing
  • Successful model evaluation
  • Quality of observations
  • Unit quiz completion

Unit 2 Assessment Rubric

Formative Assessment (Throughout Unit)

  • Participation in discussions: 20%
  • Hands-on activities completion: 30%
  • Reflection journal entries: 20%
  • Homework assignments: 30%

End-of-Unit Assessment

Unit Quiz (25 questions, open-note):

  • Machine learning basics (5 questions)
  • Supervised learning (5 questions)
  • Unsupervised learning (5 questions)
  • Reinforcement learning (5 questions)
  • Training/testing/evaluation (5 questions)

Hands-On Project: Build and Evaluate a Model

  • Create image classification model using Teachable Machine
  • Train with at least 2 classes, 20+ images per class
  • Test model on new images
  • Write reflection:
    • What did you build?
    • What was your accuracy?
    • What worked well? What didn't?
    • What would you improve?

Rubric for Hands-On Project

Criteria | Excellent (4) | Good (3) | Satisfactory (2) | Needs Improvement (1)
Model Creation | Successfully created working model | Created model with minor issues | Created model with significant issues | Model doesn't work
Understanding | Shows deep understanding of ML concepts | Shows good understanding | Shows basic understanding | Shows limited understanding
Evaluation | Thoroughly evaluated model with testing | Evaluated model adequately | Basic evaluation | Limited or no evaluation
Reflection | Thoughtful reflection with insights | Good reflection | Basic reflection | Limited reflection

Unit 2 Resources

Required Tools

Recommended Exploration

  • Google's Machine Learning Crash Course (simplified sections)
  • "AI for Kids" resources on ML basics
  • Interactive ML visualizations online

Teacher Notes

  • Students may struggle with abstract concepts - use lots of analogies
  • Hands-on activities are crucial for understanding
  • Be prepared for technical issues with online tools
  • Emphasize that ML is about patterns, not memorization
  • Some students may want to dive deeper - have extension activities ready

Unit 2 Extension Activities (Optional)

For Advanced Students

  • Research specific ML algorithms (linear regression, decision trees, etc.)
  • Explore more complex models in Teachable Machine
  • Research how recommendation systems work
  • Create a comparison chart of different ML types with examples
  • Research bias in ML models and data

For Students Needing More Support

  • Create visual flashcards for ML types
  • Draw diagrams showing training vs. testing
  • Make a simple guide: "When to use each type of ML"
  • Practice identifying ML types with more examples

Next Unit Preview

Unit 3: Neural Networks will explore how some AI systems actually learn. We'll look inside the "black box" to understand neurons, layers, and how neural networks process information. Get ready to visualize and build simple neural network concepts!