Unit 2: Machine Learning Basics

Lesson 4: Reinforcement Learning (1 hour)

Learning Objectives

  • Understand what reinforcement learning is
  • Recognize reinforcement learning examples
  • Understand the concepts of agents, environments, rewards, and actions
  • Experience reinforcement learning through interactive demos

Materials Needed

  • Internet connection
  • Reinforcement learning demos/games
  • Examples of RL in action
  • Student notebooks
  • Optional: Simple game or simulation

Time Breakdown

  • Review previous learning types (5 min)
  • Introduction to reinforcement learning (15 min)
  • RL concepts: Agent, Environment, Rewards (15 min)
  • Hands-on: RL demos and games (20 min)
  • Wrap-up (5 min)

Activities

1. Review Previous Learning Types (5 min)

  • Supervised learning: Learning with labeled examples
  • Unsupervised learning: Finding patterns without labels
  • Today: Learning through trial and error with rewards

2. Introduction to Reinforcement Learning (15 min)

  • Definition: In reinforcement learning, an agent learns by interacting with an environment and receiving rewards or penalties for its actions
  • Key Concept: It's like training a pet: you reward good behavior, and the pet learns to repeat it
  • Key Idea: The agent tries actions, gets feedback (reward or penalty), and learns which actions work

Real-World Examples:

  • Game-playing AI (chess, Go, video games)
  • Self-driving cars (reward: staying on road, penalty: crash)
  • Robot learning to walk
  • Recommendation systems (reward: user clicks, penalty: user ignores)
  • Trading algorithms (reward: profit, penalty: loss)

Why RL?

  • When you can't provide labeled examples of the correct action
  • When the best action depends on the situation
  • When the system needs to learn through experience
  • When the agent must try new actions to discover better strategies (exploration)

3. RL Concepts: Agent, Environment, Rewards (15 min)

The Four Key Components:

  1. Agent: The learner (AI system)
    • Example: Game-playing AI, robot, self-driving car
    • Makes decisions and takes actions
  2. Environment: The world the agent interacts with
    • Example: Game board, road, room
    • Changes based on agent's actions
  3. Actions: What the agent can do
    • Example: Move pieces, steer car, move robot arm
    • Agent chooses actions to take
  4. Rewards: Feedback (positive or negative)
    • Positive reward: Good outcome (score points, reach goal)
    • Negative reward (penalty): Bad outcome (lose points, crash)
    • Agent learns to maximize rewards

The Learning Process:

  1. Agent observes environment
  2. Agent chooses action
  3. Environment responds
  4. Agent receives reward/penalty
  5. Agent learns from experience
  6. Repeat - agent gets better over time
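
For teachers who want to show the loop in code, the six steps above can be sketched roughly as follows. This is a minimal illustration, not a standard library API: the `Corridor` environment, its reward values (+10 at the goal, -1 per step), and the `run_episode` helper are all hypothetical names invented for this example.

```python
class Corridor:
    """Hypothetical environment: a hallway of 5 cells with the goal in the last cell."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position

    def step(self, action):
        # action is -1 (move left) or +1 (move right); walls keep the agent inside
        self.position = min(max(self.position + action, 0), self.length - 1)
        done = self.position == self.length - 1
        reward = 10 if done else -1  # assumed values: +10 at the goal, -1 per step
        return self.position, reward, done


def run_episode(env, choose_action, learn=None, max_steps=100):
    state = env.reset()                              # 1. agent observes environment
    total_reward = 0
    for _ in range(max_steps):                       # 6. repeat
        action = choose_action(state)                # 2. agent chooses action
        next_state, reward, done = env.step(action)  # 3. environment responds
        total_reward += reward                       # 4. agent receives reward/penalty
        if learn is not None:
            learn(state, action, reward, next_state)  # 5. agent learns from experience
        state = next_state
        if done:
            break
    return total_reward


always_right = lambda state: 1
print(run_episode(Corridor(), always_right))  # prints 7: three -1 steps, then +10
```

The `learn` hook is left empty here on purpose; any learning rule (for example Q-learning) could be plugged in at step 5.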

Simple Analogy: Training a Dog

  • Agent: Dog
  • Environment: Living room
  • Actions: Sit, stay, come, fetch
  • Rewards: Treats (positive), "No" (negative)
  • Dog learns which actions get treats

Game Example: Pac-Man

  • Agent: Pac-Man AI
  • Environment: Game maze
  • Actions: Move up, down, left, right
  • Rewards: +10 for eating a dot, +200 for eating a ghost
  • Penalties: -1 for each step (encourages efficiency), -500 if caught by a ghost
  • AI learns best strategies through playing
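
To make the scoring concrete, the short sketch below adds up the rewards for one imagined game using the values listed above. The event names and the sample sequence of events are made up for illustration; a real Pac-Man agent would receive these signals from the game itself.

```python
# Reward values taken from the Pac-Man example; the event names are hypothetical.
REWARDS = {"dot": 10, "ghost_eaten": 200, "step": -1, "caught": -500}


def episode_return(events):
    """Total reward the agent collects over one game."""
    return sum(REWARDS[event] for event in events)


# One imagined game: four steps, two dots, one ghost eaten, then caught.
events = ["step", "dot", "step", "dot", "step", "ghost_eaten", "step", "caught"]
print(episode_return(events))  # prints -284
```

Even after eating a ghost, being caught leaves the total negative, which is why the agent learns to avoid ghosts first.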

4. Hands-On: RL Demos and Games (20 min)

Activity 1: Simple RL Demo (if available online)

  • Show reinforcement learning visualization
  • Watch agent learn to navigate maze or play game
  • Observe: Starts poorly, improves over time
  • Discuss: What is the agent learning? What are the rewards?

Activity 2: Human RL Simulation

  • Game: "Find the Treasure"
    • Draw simple grid/maze on board
    • One student is "agent"
    • Other students give rewards (clap for good moves, "boo" for bad)
    • Agent learns best path
    • Compare: First attempt vs. later attempts
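
For classes that want to see the same game played by a program, here is a hedged sketch using tabular Q-learning, one common RL method (it is not the only way to solve the task). The 4x4 grid, the reward values (+10 for the treasure as the "clap", -1 per move as a mild "boo"), and the settings `alpha`, `gamma`, and `epsilon` are all assumptions chosen for illustration.

```python
import random

SIZE = 4                                  # assumed 4x4 grid
START, TREASURE = (0, 0), (3, 3)
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(state, action):
    """Move one cell (walls block movement) and return (next_state, reward, done)."""
    d_row, d_col = ACTIONS[action]
    row = min(max(state[0] + d_row, 0), SIZE - 1)
    col = min(max(state[1] + d_col, 0), SIZE - 1)
    next_state = (row, col)
    if next_state == TREASURE:
        return next_state, 10, True       # "clap": treasure found
    return next_state, -1, False          # mild "boo": every other move costs a little


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    # Q-table: the agent's current estimate of how good each action is in each cell
    q = {((r, c), a): 0.0 for r in range(SIZE) for c in range(SIZE) for a in ACTIONS}
    steps_per_episode = []
    for _ in range(episodes):
        state, done, steps = START, False, 0
        while not done and steps < 200:
            if rng.random() < epsilon:    # explore: try a random move
                action = rng.choice(list(ACTIONS))
            else:                         # exploit: use the best-known move
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            next_state, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            # Q-learning update: nudge the estimate toward reward + discounted future value
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state, steps = next_state, steps + 1
        steps_per_episode.append(steps)
    return q, steps_per_episode


def greedy_path_length(q):
    """Steps needed when the agent always picks its best-known move."""
    state, steps = START, 0
    while state != TREASURE and steps < 50:
        action = max(ACTIONS, key=lambda a: q[(state, a)])
        state, _, _ = step(state, action)
        steps += 1
    return steps


q_table, history = train()
print("first 5 episodes:", history[:5])
print("last 5 episodes:", history[-5:])
print("steps with learned policy:", greedy_path_length(q_table))
```

Comparing the first and last few entries of `history` mirrors the classroom comparison of first attempt vs. later attempts: the step counts should generally shrink as the agent learns.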

Activity 3: Online RL Games (if available)

  • A Snake game played by an RL agent (if an accessible demo is available)
  • Or other browser-based RL demos
  • Students watch agent learn
  • Discuss observations

Activity 4: Design Your Own RL Scenario

  • In pairs, students design a simple RL problem:
    • Agent: What is learning?
    • Environment: Where is it?
    • Actions: What can it do?
    • Rewards: What feedback tells it how well it is doing?
  • Share examples with class
  • Examples: Robot learning to sort objects, AI learning to recommend movies, etc.

Reflection Questions:

  • How is RL different from supervised learning?
  • Why might RL be useful for games?
  • What makes a good reward system?
  • How does the agent balance exploring (trying new things) vs. exploiting (using what works)?
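
For the last question, one simple strategy worth discussing is "epsilon-greedy": most of the time the agent uses what works, but with a small probability (epsilon) it tries something at random. The sketch below is an illustration under assumed numbers: two slot machines with hidden win chances of 0.3 and 0.7, and the helper names `epsilon_greedy` and `play` are invented for this example.

```python
import random


def epsilon_greedy(estimates, epsilon, rng):
    """Pick a random option with probability epsilon, otherwise the best-known one."""
    if rng.random() < epsilon:
        return rng.randrange(len(estimates))                       # explore
    return max(range(len(estimates)), key=lambda i: estimates[i])  # exploit


def play(epsilon, pulls=2000, seed=1):
    rng = random.Random(seed)
    win_chance = [0.3, 0.7]   # assumed values, hidden from the agent
    estimates = [0.0, 0.0]    # agent's running average reward for each machine
    counts = [0, 0]           # how often each machine was chosen
    total = 0
    for _ in range(pulls):
        arm = epsilon_greedy(estimates, epsilon, rng)
        reward = 1 if rng.random() < win_chance[arm] else 0
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]  # update the average
        total += reward
    return total, counts


print("never explores:", play(epsilon=0.0))
print("explores 10% of the time:", play(epsilon=0.1))
```

With `epsilon=0.0` the agent in this sketch never tries the second machine at all, so it cannot discover that it pays better; a little exploration fixes that at the cost of some wasted pulls.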

5. Wrap-Up (5 min)

  • Three types of ML: Supervised (with labels), Unsupervised (find patterns), Reinforcement (learn from rewards)
  • When would you use each type?
  • Preview: Next lesson - Putting it all together, training vs. testing

Differentiation Strategies

  • Younger students: Focus on game examples, simpler analogies, hands-on activities
  • Older students: Explore more complex RL concepts, research AlphaGo or similar, analyze reward functions
  • Struggling learners: Provide more structure, use very simple examples, offer more guidance
  • Advanced learners: Research Q-learning, explore policy gradients, analyze exploration vs. exploitation trade-offs

Assessment

  • Understanding of reinforcement learning concepts
  • Participation in RL activities
  • Quality of designed RL scenarios
  • Ability to distinguish the three ML types