Transcription of: A Gentle Introduction to Machine Learning

Gonna start this StatQuest with a silly song, but if you don't like silly songs, that's okay. StatQuest! Hello, I'm Josh Starmer, and welcome to StatQuest. Today we're going to do a gentle introduction to machine learning. Note: this StatQuest was originally prepared for and presented at the Society for Scientific Advancement's annual conference. One of the things that SoSA does is promote science and technology in Jamaica.

Let's start with a silly example. Do you like silly songs? If you like silly songs, are you interested in machine learning? If you like silly songs and machine learning, then you'll love StatQuest. If you like silly songs but not machine learning, are you interested in statistics? If you like silly songs and statistics, but not machine learning, then you'll still love StatQuest. Otherwise, you might not like StatQuest. Wah wah. If you don't like silly songs, are you interested in machine learning? If you don't like silly songs but you like machine learning, then you'll love StatQuest. If you don't like silly songs or machine learning, are you interested in statistics? If you don't like silly songs or machine learning, but you're interested in statistics, then you will love StatQuest. Otherwise, you might not like StatQuest. Wah wah.

This is a silly example, but it illustrates a decision tree, a simple machine learning method. The purpose of this particular decision tree is to predict whether or not someone will love StatQuest. Alternatively, we could say that this decision tree classifies a person as either someone who loves StatQuest or someone who doesn't. Since decision trees are a type of machine learning, if you understand how we use this tree to predict or classify if someone would love StatQuest, you are well on your way to understanding machine learning. BAM!

Here's another silly example of machine learning. Imagine we measured how quickly someone could run 100 meters and how much yam they ate. This is me; I'm not very fast, and I don't eat much yam.
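The decision tree described above can be sketched as a few nested conditionals. This is just an illustrative sketch; the function name and the true/false inputs are made up for this example:

```python
def loves_statquest(likes_silly_songs, likes_machine_learning, likes_statistics):
    """Predict whether someone will love StatQuest, following the
    decision tree described above (a hypothetical sketch)."""
    if likes_silly_songs:
        if likes_machine_learning:
            return True                # silly songs + machine learning -> loves StatQuest
        return likes_statistics        # silly songs + statistics (no ML) -> still loves StatQuest
    else:
        if likes_machine_learning:
            return True                # no silly songs, but machine learning -> loves StatQuest
        return likes_statistics        # no silly songs or ML, but statistics -> loves StatQuest

# Someone who likes silly songs and machine learning:
print(loves_statquest(True, True, False))   # True
```

Running a person "down the tree" is just answering each question in turn until we reach a prediction.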
These are some other people, and this is Usain Bolt. Bolt is very fast, and he eats a lot of yam. Given this pretend data, we see that the more yam someone eats, the faster they run the 100-meter dash. We can fit a black line to the data to show the trend, but we can also use the black line to make predictions. For example, if someone told us they ate this much yam, then we could use the black line to predict how fast that person might run. This is the predicted speed. The black line is a type of machine learning, because we can use it to make predictions. In general, machine learning is all about making predictions and classifications. BAM!

Now that we can make predictions and classifications, let's talk about some of the main ideas in machine learning. First of all, in machine learning lingo, the original data is called training data, so the black line is fit to the training data. Alternatively, we could have fit a green squiggle to the training data. The green squiggle fits the training data better than the black line, but remember, the goal of machine learning is to make predictions, so we need a way to decide if the green squiggle is better or worse than the black line at making predictions. So we find a new person and measure how fast they run and how much yam they eat, and then we find another, and another, and another. Altogether, the blue dots represent testing data. We use the testing data to compare the predictions made by the black line to the predictions made by the green squiggle.

Let's start by seeing how well the black line predicts the speed of each person in the testing data. Here's the first person in the testing data: they ate this much yam, and they ran this fast. However, the black line predicts that someone who ate this much yam should run a little slower, so let's measure the distance between the actual speed and the predicted speed, and save the distance on the right while we focus on the other people in the testing data.
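Fitting a straight line to training data and using it to predict can be sketched in a few lines of Python. The yam and speed numbers here are made up for illustration:

```python
import numpy as np

# Pretend training data: yam eaten (arbitrary units) vs. 100 m speed (m/s)
yam   = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
speed = np.array([5.1, 6.0, 6.8, 8.1, 8.9])

# Fit the "black line": a degree-1 polynomial (a straight line)
slope, intercept = np.polyfit(yam, speed, deg=1)

# Use the line to predict the speed of someone who ate 3.5 units of yam
predicted = slope * 3.5 + intercept
print(round(predicted, 2))
```

The fitted line summarizes the trend in the training data; the prediction is just the line's height at the new yam value.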
Here's the second person in the testing data: they ate this much yam, and they ran this fast. But the black line predicts that they will run a little faster, so we measure the distance between the actual speed and the predicted speed and add it to the one we measured for the first person in the testing data. Then we measure the distance between the real and predicted speed for the third person in the testing data and add it to our running total of distances for the black line. Then we do the same thing for the fourth person in the testing data and add that distance to our running total. This is the sum of all the distances between the real and predicted speeds for the black line.

Now let's calculate the distances between the real and predicted speeds using the green squiggle. Remember, the green squiggle did a great job fitting the training data, but when we are doing machine learning, we are more interested in how well the green squiggle can make predictions with new data. So, just like before, we determine this person's real speed and their predicted speed and measure the distance between them, and just like we did for the black line, we'll keep track of the distances for the green squiggle over here. Then we do the same thing for the second person in the testing data, and the third person, and the fourth person. This is the sum of the distances between the real and predicted speeds for the green squiggle.

The sum of the distances is larger for the green squiggle than for the black line. In other words, even though the green squiggle fit the training data way better than the black line, the black line did a better job predicting speeds with the testing data. So, if we had to choose between using the black line or the green squiggle to make predictions, we would choose the black line. BAM!

This example teaches two main ideas about machine learning. First, we use testing data to evaluate machine learning methods. Second, don't be fooled by how well a machine learning method fits the training data.
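The comparison above, totaling up the distances between real and predicted speeds on the testing data for each method, can be sketched like this. Here the "green squiggle" is approximated by a high-degree polynomial that passes through every training point, and all of the numbers are made up for illustration:

```python
import numpy as np

# Made-up training data (yam eaten vs. 100 m speed)
train_yam   = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
train_speed = np.array([5.0, 6.2, 6.5, 7.9, 8.1, 9.2])

# Made-up testing data (the blue dots)
test_yam   = np.array([1.5, 2.5, 3.5, 4.5])
test_speed = np.array([5.6, 6.3, 7.3, 8.0])

# "Black line": a straight line.  "Green squiggle": a degree-5 polynomial,
# which fits all six training points exactly (it overfits).
black_line     = np.poly1d(np.polyfit(train_yam, train_speed, deg=1))
green_squiggle = np.poly1d(np.polyfit(train_yam, train_speed, deg=5))

# Sum of distances between real and predicted speeds on the testing data
black_error = np.abs(test_speed - black_line(test_yam)).sum()
green_error = np.abs(test_speed - green_squiggle(test_yam)).sum()

print(black_error < green_error)  # if True, the black line predicts better
```

Even though the squiggle has zero error on the training data, its testing error is larger, so the straight line wins.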
Note: fitting the training data well but making poor predictions is called the bias-variance tradeoff. Oh no, a shameless self-promotion! If you want to learn more about the bias-variance tradeoff, there's a StatQuest that will walk you through it, one step at a time.

Before we move on, you may be wondering why we used a simple black line and a silly green squiggle instead of deep learning, or a convolutional neural network, or [insert the bestest, most fancy machine learning method here]. There are tons of fancy-sounding machine learning methods, and each year something new and exciting comes on the scene, but regardless of what you use, the most important thing isn't how fancy it is, but how it performs with testing data. Double BAM!

Now let's go back to the decision tree that we started with. Remember, we wanted to classify if someone loved StatQuest based on a few questions. To create the decision tree, we collected data from people who loved StatQuest and from people who did not love StatQuest. Altogether, this was the training data, and we used it to build the decision tree. Then we got data from a few more people who love StatQuest and a few more people who did not love StatQuest. Altogether, this forms the testing data. We can use the testing data to see how well our decision tree predicts if someone will love StatQuest.

The first person in the testing data did not like silly songs, so we go to the right side of the decision tree. They didn't like machine learning either, so we keep on going down the right side of the decision tree. They didn't like statistics either, so the decision tree predicts that this person will not love StatQuest. However, this person loves StatQuest, so the decision tree made a mistake. Wah wah. The second person in the testing data liked silly songs, and that takes us down the left side of the decision tree. They were also interested in machine learning, so we predict that that person loves StatQuest, and since this person actually loves StatQuest, the decision tree
did a good job. Hooray! Now we just run all of the other people in the testing data down the decision tree and compare the predictions to reality. Then we can compare this decision tree to the latest, greatest machine learning method, and ultimately we pick the method that does the best job predicting if someone will love StatQuest or not. Triple BAM!

In summary, machine learning is all about making predictions and classifications. There are tons of fancy machine learning methods, but the most important thing to know about them isn't what makes them so fancy; it's that we decide which method fits our needs the best by using testing data.

One last thing before we go: you may be wondering how we decide which data go into the training set and which data go into the testing set. Earlier, we just arbitrarily decided that these red dots were the training data, but the blue dots could have just as easily been the training data. The good news is that there are ways to determine which samples should be used for training data and which samples should be used for testing data, and if you're interested in learning more about this, check out the StatQuest. And there are lots more StatQuests that walk you through machine learning concepts, step by step, so check them out.

Hooray! We've made it to the end of another exciting StatQuest. If you like this StatQuest and want to see more, please subscribe. And if you want to support StatQuest, well, consider buying one or two of my original songs, or getting a t-shirt, or a hoodie, or some other slick merchandise. There are links on the screen and links in the description below. Alright, until next time: Quest on!
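One common way to divide samples between the training set and the testing set is a simple random split. This sketch uses only Python's standard library; the 70/30 ratio and the fixed random seed are just illustrative choices:

```python
import random

# Pretend we have 10 samples, labeled 0 through 9
samples = list(range(10))

# Shuffle a copy, then use 70% for training and 30% for testing
random.seed(42)            # fixed seed so the split is reproducible
shuffled = samples[:]
random.shuffle(shuffled)

split = int(0.7 * len(shuffled))
training_data = shuffled[:split]
testing_data  = shuffled[split:]

print(len(training_data), len(testing_data))  # 7 3
```

Every sample lands in exactly one of the two sets, and because the assignment is random, neither set is systematically different from the other.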