Table of Contents

- 1.1 What is Machine Learning

Notation

- Capital Letters: Random variables, matrices, number of classes (clear by context)
- **Bold**: Vectors (column vectors, unless transposed)
- Hats: estimates

Vocab

- i.i.d: Independent and identically distributed
- Joint Distribution: Probability of something occurring, taking into account 2 or more random variables
- Supervised Learning: Learning with a training set
- Unsupervised Learning: Learning from a set of data without labels
- Feature Vectors: These are the vectors that “describe” a defining feature of the data
- Feature Space: Vector space containing the feature vectors
- Finite Set: Labels take values in a finite set such as {-1, 1}
- Non-Finite Set: Labels take values anywhere on the real number line
- Classification (Pattern Recognition): Supervised learning with a finite set of labels
  - Labels are called classes

- Clustering: Grouping similar data together so that the groups reveal some sort of pattern
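To make the i.i.d and joint distribution entries concrete, here is a minimal sketch of drawing labeled data. The specific distribution (a label picked uniformly from {-1, 1}, then a 2-D Gaussian feature vector centered on that label) and the name `draw_training_set` are assumptions chosen for illustration, not something these notes specify.

```python
import random


def draw_training_set(n, seed=0):
    """Draw n (x, y) pairs i.i.d. from a hypothetical joint distribution."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    samples = []
    for _ in range(n):
        # Every pair is drawn the same way (identically distributed)
        # and without looking at earlier pairs (independent).
        y = rng.choice([-1, 1])                   # label
        x = (rng.gauss(y, 0.5), rng.gauss(y, 0.5))  # feature vector depends on y
        samples.append((x, y))
    return samples


T = draw_training_set(5)
```

Because x depends on y, the pair (x, y) has a genuine joint distribution rather than two unrelated marginals.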

1.1 What is Machine Learning

**Supervised Learning**: Machine Learning with a training set T. Feature vectors x generate labels y
- Data is drawn i.i.d from some joint distribution
- Vectors x_i are feature vectors or inputs
- Values y_i are known as labels (outputs, responses)
- Feature Space: Vector space containing the feature vectors
- The **goal** of supervised learning is to use the training data to create a **map** from feature vectors to labels
  - This allows us to create a model to predict output values
  - This process is called learning or training a model
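As a rough illustration of "creating a map from the training data", the sketch below builds a predictor from a tiny training set. It uses a 1-nearest-neighbor rule, which is only one possible choice of map and is not prescribed by these notes. The function name and the data in `T` are hypothetical.

```python
import math


def train_nearest_neighbor(training_set):
    """Return a map f(x) -> y built from a training set of (x, y) pairs."""
    if not training_set:
        raise ValueError("training set must not be empty")
    # "Training" here just stores the pairs; prediction does the work.
    stored = [(tuple(x), y) for x, y in training_set]

    def predict(x):
        # Return the label of the stored feature vector closest to x.
        _, label = min(stored, key=lambda pair: math.dist(pair[0], x))
        return label

    return predict


# Hypothetical training set T: feature vectors x_i in R^2 with labels y_i.
T = [((0.0, 0.0), -1), ((0.2, 0.1), -1), ((1.0, 1.0), 1), ((0.9, 1.2), 1)]
f = train_nearest_neighbor(T)
```

Calling `f((0.1, 0.0))` predicts a label for a feature vector that was never in T, which is the point of training a model.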

- Classification (Pattern Recognition)
  - Labels are a **finite set**, e.g. {-1, 1}
  - Labels are now known as **classes**
  - Then the supervised problem is known as **classification** or pattern recognition

- Regression
  - Labels are a **non-finite set**, any real number
  - Then the supervised problem is known as **regression**
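To contrast the two kinds of label sets, here is a small sketch that fits a one-dimensional least-squares line and then uses it in both ways: directly for real-valued labels, and thresholded at zero for labels in {-1, 1}. Least squares and the zero threshold are illustrative assumptions, and all names and data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for 1-D features."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_value(x, slope, intercept):
    # Regression: the output can be any real number.
    return slope * x + intercept


def predict_class(x, slope, intercept):
    # Classification: the output is restricted to the finite set {-1, 1}.
    return 1 if predict_value(x, slope, intercept) >= 0 else -1


xs = [0.0, 1.0, 2.0, 3.0]

# Regression: labels are real numbers.
reg_model = fit_line(xs, [1.0, 3.0, 5.0, 7.0])

# Classification: labels come from {-1, 1}.
clf_model = fit_line(xs, [-1, -1, 1, 1])
```

The feature vectors are identical in both cases; only the set the labels come from changes which problem is being solved.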

**Unsupervised Learning**: Machine Learning without a labeled training set, using only a set of data D (feature vectors)
- Data is drawn i.i.d and we want to infer something useful from the data
- **Clustering** is used to identify such patterns
  - Take vectors from D and group them into similar groups
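One way to group the vectors in D is a basic k-means loop, sketched below. These notes do not name a particular clustering algorithm, so k-means, the fixed iteration count, the hand-picked starting centers, and the sample data are all assumptions made for illustration.

```python
import math


def kmeans(points, centers, iterations=10):
    """Group points around centers with a basic k-means loop."""
    centers = [tuple(c) for c in centers]
    assignments = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        assignments = [
            min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            for p in points
        ]
        # Update step: move each center to the mean of its assigned points.
        new_centers = []
        for k, old in enumerate(centers):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:
                mean = tuple(sum(c) / len(members) for c in zip(*members))
                new_centers.append(mean)
            else:
                new_centers.append(old)  # leave an empty cluster's center in place
        centers = new_centers
    return assignments, centers


# Hypothetical unlabeled data set D: feature vectors only, no labels.
D = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
assignments, centers = kmeans(D, centers=[D[0], D[3]])
```

No labels are used anywhere: the cluster indices in `assignments` come purely from distances between feature vectors.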