Explore learning without labels and decision-making through rewards
35-40 minutes · Advanced Level · 10 Quiz Questions
Beyond Supervised Learning
So far, we've focused on supervised learning where we have labeled data to train our models. But what happens when we don't have labels? Or when we need to make sequential decisions in an environment? This lesson explores two powerful paradigms that expand the horizons of machine learning.
Unsupervised learning discovers hidden patterns in data without explicit targets, while reinforcement learning teaches agents to make optimal decisions through trial and error. Together, they represent some of the most exciting frontiers in AI.
🎯 The Three Pillars of Machine Learning
📊 Supervised Learning
Learning with labeled examples:
Input-output pairs provided
Goal: predict labels for new data
Examples: classification, regression
Like learning with a teacher
🔍 Unsupervised Learning
Finding patterns without labels:
Only input data provided
Goal: discover hidden structure
Examples: clustering, dimensionality reduction
Like learning by exploration
🎮 Reinforcement Learning
Learning through interaction and rewards:
Agent interacts with environment
Goal: maximize cumulative reward
Examples: game playing, robotics
Like learning through trial and error
Unsupervised Learning: Finding Hidden Patterns
What is Unsupervised Learning?
Unsupervised learning algorithms analyze data to find patterns, structures, or relationships without being given explicit target outputs. It's like being a detective looking for clues in data without knowing what crime was committed.
K-Means is one of the most popular clustering algorithms. Here's how it works (a runnable sketch follows the steps):
Initialize: Choose k cluster centers randomly
Assign: Assign each point to the nearest cluster center
Update: Move cluster centers to the mean of assigned points
Repeat: Continue until cluster centers stabilize
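To make the steps concrete, here is a minimal from-scratch sketch in Python (the two-blob toy data and the `kmeans` function are illustrative choices, not from any particular library):

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy data: two well-separated Gaussian blobs (made up for this demo)
X = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(3, 0.5, (50, 2))])

def kmeans(X, k=2, max_iters=100):
    # 1. Initialize: pick k random data points as the starting centers
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(max_iters):
        # 2. Assign: label each point with its nearest center
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # 3. Update: move each center to the mean of its assigned points
        new_centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        # 4. Repeat: stop once the centers no longer move
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels

centers, labels = kmeans(X, k=2)
print("cluster centers:\n", centers.round(2))
```

In practice you would reach for scikit-learn's KMeans, which adds smarter initialization (k-means++) and multiple restarts to avoid bad local optima.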
🔧 Choosing the Right Number of Clusters (k)
Elbow Method: Plot inertia (the within-cluster sum of squared distances) vs k and look for the "elbow" where gains level off
Silhouette Analysis: Measures how close each point is to its own cluster versus the nearest other cluster (scores near 1 are better); both methods are sketched in code after this list
Domain Knowledge: Sometimes you know how many groups to expect
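Both diagnostics are easy to compute with scikit-learn. Here is a short sketch on synthetic data with three blobs, so the "right" answer is k = 3 (the blob centers and the range of k are made-up demo values):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
# Three synthetic 2-D blobs centered at 0, 3, and 6
X = np.vstack([rng.normal(c, 0.5, (50, 2)) for c in (0, 3, 6)])

for k in range(2, 7):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    sil = silhouette_score(X, km.labels_)
    print(f"k={k}  inertia={km.inertia_:.1f}  silhouette={sil:.3f}")
```

Inertia always decreases as k grows, which is why you look for the elbow rather than the minimum; the silhouette score instead peaks at the best-separated k.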
Reinforcement Learning: Learning Through Interaction
The RL Framework
Reinforcement Learning is inspired by how humans and animals learn through trial and error. An agent interacts with an environment, taking actions and receiving rewards or penalties, with the goal of maximizing cumulative reward.
🎮 The RL Loop
🤖 Agent ↔️ 🌍 Environment
Agent observes state → takes action → receives reward + new state (a minimal version of this loop in code follows)
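In code, the loop itself is only a few lines. Below is a minimal sketch with a toy stand-in environment (the `ToyEnv` class and its reward scheme are invented for illustration; real projects typically use a library such as Gymnasium, whose environments follow the same reset/step shape):

```python
import random

class ToyEnv:
    """A tiny 1-D walk: reach position +3 for a reward of 1."""

    def reset(self):
        self.pos = 0
        return self.pos                        # initial state

    def step(self, action):
        self.pos += 1 if action == 1 else -1   # action 1 = right, 0 = left
        reward = 1.0 if self.pos == 3 else 0.0
        done = abs(self.pos) >= 3              # episode ends at either edge
        return self.pos, reward, done          # new state, reward, episode over?

env = ToyEnv()
state = env.reset()
done = False
while not done:
    action = random.choice([0, 1])             # a random policy, for illustration
    state, reward, done = env.step(action)     # act, then observe reward + new state
```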
Key RL Concepts
📊 State (S)
The current situation or configuration of the environment that the agent can observe
Example: Chess board position, robot's location
⚡ Action (A)
A move or decision the agent can take; the set of all available moves is the action space
Example: Move chess piece, turn left/right
🏆 Reward (R)
The feedback signal indicating how good/bad an action was
Example: +1 for winning, -1 for losing, 0 for neutral
🎯 Policy (π)
The strategy that defines how the agent chooses actions given states
Example: If state X, then take action Y
Popular RL Algorithms
🎯 Value-Based Methods
Q-Learning
Learns the value of taking each action in each state (Q-values)
Q-Table: Stores Q(state, action) values
Bellman Equation: Q(s, a) = R + γ × max_a′ Q(s′, a′)
Exploration vs Exploitation: ε-greedy strategy, which explores randomly with probability ε and otherwise takes the best-known action (a tabular sketch follows this list)
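Putting these pieces together, here is a tabular Q-learning sketch on a toy 5-state chain (the environment, reward scheme, and hyperparameter values are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

n_states, n_actions = 5, 2              # chain of states 0..4; actions: 0 = left, 1 = right
alpha, gamma, epsilon = 0.1, 0.9, 0.1   # learning rate, discount factor, exploration rate
Q = np.zeros((n_states, n_actions))     # the Q-table: Q(state, action)

def step(s, a):
    s_next = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
    reward = 1.0 if s_next == n_states - 1 else 0.0
    done = s_next == n_states - 1       # episode ends at the rightmost state
    return s_next, reward, done

for episode in range(500):
    s, done = 0, False
    while not done:
        # ε-greedy: explore with probability ε, otherwise exploit the Q-table
        a = int(rng.integers(n_actions)) if rng.random() < epsilon else int(Q[s].argmax())
        s_next, r, done = step(s, a)
        # Nudge Q(s, a) toward the Bellman target R + γ × max_a' Q(s', a')
        target = r + gamma * (0.0 if done else Q[s_next].max())
        Q[s, a] += alpha * (target - Q[s, a])
        s = s_next

print(np.round(Q, 2))
```

After training, taking the argmax across each row of the Q-table recovers the greedy policy: always move right.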
Deep Q-Networks (DQN)
Uses neural networks to approximate Q-values for complex state spaces
Handles high-dimensional states (like images)
Experience replay and target networks for stability (a replay-buffer sketch follows this list)
Famous for mastering Atari games
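A full DQN is too long to show here, but experience replay, one of its two stabilizing tricks, fits in a few lines (the `ReplayBuffer` name and default sizes are illustrative choices):

```python
import random
from collections import deque

import numpy as np

class ReplayBuffer:
    """Stores past transitions; sampling them at random breaks the
    correlation between consecutive steps that destabilizes training."""

    def __init__(self, capacity=10_000):
        self.buffer = deque(maxlen=capacity)   # oldest transitions fall off the end

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size=32):
        batch = random.sample(self.buffer, batch_size)
        states, actions, rewards, next_states, dones = map(np.array, zip(*batch))
        return states, actions, rewards, next_states, dones
```

Each training step, the Q-network fits a sampled batch against targets computed by a periodically synced copy of itself (the target network), the second stabilizing trick.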
🎮 Policy-Based Methods
Policy Gradient Methods
Directly optimize the policy itself rather than deriving it from learned value functions
REINFORCE: The basic policy gradient algorithm (sketched in code after this list)
Actor-Critic: Combines policy gradients with value estimation
PPO (Proximal Policy Optimization): Stable and efficient modern method
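To see a policy being optimized directly, here is a REINFORCE sketch on a 3-armed bandit with a softmax policy (the arm reward means, learning rate, and episode count are made-up demo values):

```python
import numpy as np

rng = np.random.default_rng(0)

true_means = np.array([0.2, 0.5, 0.8])   # hidden mean reward of each arm (made up)
theta = np.zeros(3)                      # policy parameters: one logit per arm
alpha = 0.1                              # learning rate

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

for episode in range(2000):
    probs = softmax(theta)
    a = rng.choice(3, p=probs)                 # sample an action from the policy
    reward = rng.normal(true_means[a], 0.1)    # noisy reward for the chosen arm
    # For a softmax policy, the gradient of log π(a) w.r.t. theta is one_hot(a) - probs
    grad_log_pi = -probs
    grad_log_pi[a] += 1.0
    theta += alpha * reward * grad_log_pi      # REINFORCE: reward-weighted gradient step

print("learned action probabilities:", softmax(theta).round(3))
```

The policy shifts probability toward the highest-reward arm. Actor-critic methods reduce the variance of this update by subtracting a learned value estimate (a baseline) from the reward.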
Real-World Applications
🤖 Robotics
Unsupervised: Learning to walk without explicit movement instructions
RL: Robot navigation, manipulation, and control
🎮 Game AI
Unsupervised: Discovering game strategies from gameplay data