# Chapter 1: Introduction

• 1.1 What is Machine Learning

Notation

• Capital Letters: Random variables, matrices, number of classes (clear by context)
• Bold: Vectors (column vectors, unless transposed)
• Hats: estimates

Vocab

• i.i.d.: Independent and identically distributed
• Joint Distribution: The probability distribution of two or more random variables taken together
• Supervised Learning: Learning with a labeled training set
• Unsupervised Learning: Learning with a set of data without labels
• Feature Vectors: Vectors whose components describe the features of a data point
• Feature Space: Vector space containing the feature vectors
• Finite Set: Labels take values in a finite set such as {-1, 1}
• Non-Finite Set: Labels take values on the real number line
• Classification (Pattern Recognition): Supervised learning with a finite set of labels
• Labels are called classes
• Clustering: Grouping data so that similar points fall together, revealing patterns in the data

1.1 What is Machine Learning

• Supervised Learning: Machine Learning with a training set T. Feature vectors x generate labels y
• Data is drawn i.i.d. from some joint distribution
• Vectors x_i are feature vectors or inputs
• Values y_i are known as labels (outputs, responses)
• Feature Space: Vector space containing the feature vectors
• The goal of supervised learning is to use the training data to create a map from feature vectors to labels
• This allows us to create a model to predict output values
• This process is called learning or training a model
• Classification (Pattern Recognition)
• Labels are a finite set {-1, 1}
• Labels are now known as classes
• Then the supervised problem is known as classification or pattern recognition
• If instead the labels form a non-finite set (any real number)
• Then the supervised problem is known as regression
• Unsupervised Learning: Machine Learning without a training set but only a set of data (feature vectors)
• Data is drawn i.i.d. and we want to infer something useful from the data
• Clustering is one way to identify patterns in the data
• Take the vectors in the data set D and group similar ones together
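
The contrast between the two settings above can be sketched in a few lines of Python. This is only an illustration under assumed choices that do not come from the notes: the toy joint distribution in `draw_sample`, the nearest-centroid rule used as the learned map, and the two-cluster k-means routine `two_means` are all hypothetical stand-ins for whichever model and clustering method is actually used.

```python
import random


def draw_sample(rng):
    """Draw one (x, y) pair from an assumed toy joint distribution."""
    y = rng.choice([-1, 1])  # label from the finite set {-1, 1}
    # 2-D feature vector centred at (2y, 2y) with unit-variance noise
    x = [rng.gauss(2.0 * y, 1.0), rng.gauss(2.0 * y, 1.0)]
    return x, y


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(v[j] for v in vectors) / n for j in range(len(vectors[0]))]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def train_nearest_centroid(training_set):
    """Supervised: learn a map x -> predicted label from labeled pairs."""
    centroids = {
        label: mean([x for x, y in training_set if y == label])
        for label in (-1, 1)
    }

    def predict(x):
        # Predict the class whose centroid is closest to x
        return min(centroids, key=lambda label: sq_dist(x, centroids[label]))

    return predict


def nearest(x, centers):
    """Index (0 or 1) of the center closest to x."""
    return 0 if sq_dist(x, centers[0]) <= sq_dist(x, centers[1]) else 1


def two_means(data, iterations=20):
    """Unsupervised: split unlabeled vectors into two groups (k-means, k = 2)."""
    # Deterministic start: lexicographically smallest and largest points
    centers = [min(data), max(data)]
    for _ in range(iterations):
        groups = [[], []]
        for x in data:
            groups[nearest(x, centers)].append(x)
        # Keep the old center if a group happens to be empty
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers, [nearest(x, centers) for x in data]


rng = random.Random(0)

# Supervised: training set T of i.i.d. (x_i, y_i) pairs, then predict on new data
T = [draw_sample(rng) for _ in range(200)]
predict = train_nearest_centroid(T)
test_set = [draw_sample(rng) for _ in range(200)]
accuracy = sum(predict(x) == y for x, y in test_set) / len(test_set)

# Unsupervised: the same feature vectors with the labels discarded
D = [x for x, _ in T]
centers, assignments = two_means(D)

print("test accuracy:", accuracy)
print("cluster sizes:", assignments.count(0), assignments.count(1))
```

The supervised half needs the labels y_i to build its map, while `two_means` only ever sees the feature vectors in D. On well-separated toy data like this, the recovered groups tend to line up with the hidden classes, but nothing in the clustering step guarantees that in general.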