Chapter 1: Introduction

Table of Contents

  • 1.1 What is Machine Learning

Notation

  • Capital Letters: Random variables, matrices, number of classes (clear by context)
  • Bold: Vectors (column vectors, unless transposed)
  • Hats: estimates

Vocab

  • i.i.d.: Independent and identically distributed
  • Joint Distribution: Probability distribution over two or more random variables considered together
  • Supervised Learning: Learning from a training set of feature vectors paired with labels
  • Unsupervised Learning: Learning from a set of data without labels
  • Feature Vectors: Vectors that describe the defining features of each data point (the inputs)
  • Feature Space: Vector space containing the feature vectors
  • Finite Set: Labels take values in a finite set, e.g. {-1, 1}
  • Non-Finite Set: Labels take values anywhere on the real number line
  • Classification (Pattern Recognition): Supervised learning with a finite set of labels
    • Labels are called classes
  • Clustering: Grouping similar data points together to reveal patterns in the data
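
To make the vocabulary concrete, here is a minimal sketch of drawing an i.i.d. training set from a joint distribution. The distribution itself (a fair coin for the label, then a 2-D Gaussian centred at (y, y) for the feature vector) and the names draw_sample and draw_training_set are assumptions made up for illustration; they are not from the chapter.

```python
import random

def draw_sample(rng):
    """Draw one (x, y) pair from a toy joint distribution (an assumption)."""
    y = rng.choice([-1, 1])  # label from the finite set {-1, 1}
    # Feature vector in R^2 whose distribution depends on the label y.
    x = [rng.gauss(y, 0.5), rng.gauss(y, 0.5)]
    return x, y

def draw_training_set(n, seed=0):
    """Draw n pairs independently from the same distribution (i.i.d.)."""
    rng = random.Random(seed)
    return [draw_sample(rng) for _ in range(n)]

T = draw_training_set(100)
features = [x for x, _ in T]  # feature vectors: points in the feature space R^2
labels = [y for _, y in T]    # labels (classes, since the set is finite)
```

Dropping the labels list and keeping only features gives the kind of data an unsupervised method would see.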

1.1 What is Machine Learning

  • Supervised Learning: Machine learning with a training set T of pairs (x_i, y_i), where each feature vector x_i comes with a label y_i
    • The pairs are drawn i.i.d. from some joint distribution
    • Vectors x_i are feature vectors or inputs
    • Values y_i are known as labels (outputs, responses)
    • Feature Space: Vector space containing the feature vectors
    • The goal of supervised learning is to use the training data to learn a map from feature vectors to labels
      • The learned map is a model that predicts output values for new inputs
      • This process is called learning or training a model
  • Classification (Pattern Recognition) and Regression
    • If the labels come from a finite set, e.g. {-1, 1}
      • The labels are called classes
      • The supervised problem is known as classification or pattern recognition
    • If the labels come from a non-finite set, i.e. any real number
      • The supervised problem is known as regression
  • Unsupervised Learning: Machine learning without labels, given only a set of data D (feature vectors)
    • Data is drawn i.i.d. and we want to infer something useful from it
    • Clustering is one way to find patterns in the data
      • Take the vectors from D and group similar ones together
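
The three problem types above can be sketched in a few lines of plain Python. The specific methods here (a nearest-centroid classifier, a least-squares line, and k-means) are illustrative choices, not methods named in the chapter, and the tiny data sets and all function names are made up for the example.

```python
import random

def mean(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

# Supervised, finite labels {-1, 1}: classification.
def train_nearest_centroid(T):
    """Learn one centroid per class from T = [(x, y), ...]."""
    return {c: mean([x for x, y in T if y == c]) for c in (-1, 1)}

def predict(centroids, x):
    """Map a feature vector x to the class with the closest centroid."""
    return min(centroids, key=lambda c: sq_dist(centroids[c], x))

# Supervised, real-valued labels: regression.
def fit_line(T):
    """Least-squares fit of y = a*x + b for scalar features x."""
    n = len(T)
    mx = sum(x for x, _ in T) / n
    my = sum(y for _, y in T) / n
    a = sum((x - mx) * (y - my) for x, y in T) / sum((x - mx) ** 2 for x, _ in T)
    return a, my - a * mx

# Unsupervised, no labels: clustering.
def k_means(D, k, iterations=20, seed=0):
    """Group the vectors in D into k groups around k centers."""
    rng = random.Random(seed)
    centers = rng.sample(D, k)  # start from k distinct data points
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for x in D:
            nearest = min(range(k), key=lambda j: sq_dist(centers[j], x))
            groups[nearest].append(x)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [mean(g) if g else centers[j] for j, g in enumerate(groups)]
    return centers, groups

T_cls = [([0.0, 0.1], -1), ([0.2, 0.0], -1), ([1.0, 0.9], 1), ([0.9, 1.1], 1)]
centroids = train_nearest_centroid(T_cls)
print(predict(centroids, [0.1, 0.2]))  # -1
print(predict(centroids, [1.0, 1.0]))  # 1

T_reg = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
print(fit_line(T_reg))                 # (2.0, 1.0)

D = [x for x, _ in T_cls]              # same vectors, labels thrown away
centers, groups = k_means(D, k=2)
```

Note that the classifier and the regression both consume (x, y) pairs, while k_means only ever sees the vectors in D; that difference is exactly the supervised/unsupervised split.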
