A Supervised Learning Primer for Machine Learning (ML) Algorithms


In this post, you’ll learn the basics of supervised learning algorithms with an example. You’ll also learn how to define your own model and find out how to select the best training data for it!

What is supervised learning?

Supervised learning is a machine learning task that uses labeled training data to learn a function. The goal of supervised learning is to learn from existing data and predict an output value for a given input value. For example, we can use supervised learning methods to predict how much money someone would spend on food each week, given their income level.
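Here is a minimal sketch of that idea: fitting a straight line that maps income to weekly food spending using ordinary least squares. The numbers are invented toy data for illustration, not real statistics.

```python
# Minimal supervised learning sketch: learn a line y = a*x + b from
# labeled (income, spending) pairs, then predict for a new income.
# All figures below are made-up toy data.

def fit_line(xs, ys):
    """Return slope and intercept of the least-squares line y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Labeled training data: (annual income in $1000s, weekly food spend in $)
incomes = [30, 45, 60, 75, 90]
spending = [55, 70, 85, 100, 115]

a, b = fit_line(incomes, spending)
print(round(a * 50 + b))  # predicted weekly spend for a $50k income
```

The "learning" here is just estimating the two line parameters from the labeled pairs; every supervised algorithm follows the same pattern of fitting parameters to labeled examples.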

The labels are used to evaluate the output of the model: if our model predicts $50 per week but the labeled value in our dataset is $100 per week, the prediction counts as an error according to our dataset's labels.

What is a training set?

Your training set is the data you use to train your model. It should be representative of the real-world data you want to model, and large enough that your model can learn from it.

The goal of supervised learning is for a computer program or algorithm to make predictions about new instances based on its knowledge of past instances (i.e., its training set). For example, if we're trying to predict whether someone will like our product after they've used it for a week, then our training set would include all users who have used our product for at least one week, along with whether or not they liked it (or some other measure).
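The example above can be sketched as code: build the training set by keeping only users who have used the product for at least a week, and pair each user's input features with their label. The field names and records here are hypothetical.

```python
# Sketch of assembling a training set from user records.
# Records and field names are invented for illustration.

all_users = [
    {"days_used": 10, "sessions": 14, "liked": True},
    {"days_used": 3,  "sessions": 2,  "liked": False},   # too new: excluded
    {"days_used": 21, "sessions": 30, "liked": True},
    {"days_used": 8,  "sessions": 4,  "liked": False},
]

# Each training example is a pair: (input features, label).
training_set = [
    ((u["days_used"], u["sessions"]), u["liked"])
    for u in all_users
    if u["days_used"] >= 7
]

print(len(training_set))  # 3 users meet the one-week requirement
```

Filtering like this keeps the training set representative of the population you actually want to predict for: users with at least a week of usage.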

What is the role of a label in supervised learning?

Labels are used to assess the quality of a model. They provide an objective measure of how well your algorithm is performing, and they can take many forms: numeric or categorical, binary or multi-class, continuous or discrete.

In a classification task, each training example carries exactly one label (e.g., spam vs. non-spam emails), and the model learns to map the input features to that label; the presence of these known answers is what makes the task "supervised." When no labels exist at all, we are in the territory of unsupervised learning: instead of comparing predictions against reference values like true/false answers, the algorithm looks for patterns among the examples themselves, for instance by grouping similar examples into clusters.
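A small illustration of the difference: a supervised dataset carries one label per example, while the unsupervised setting is the same inputs with the labels stripped away. The email texts here are invented placeholders.

```python
# Supervised data: each example is paired with a known label.
labeled = [
    ("win a free prize now",  "spam"),
    ("meeting moved to 3pm",  "not spam"),
    ("claim your reward!!!",  "spam"),
]

# Unsupervised data: the same inputs, with no labels attached.
# An unsupervised algorithm would look for patterns here directly.
unlabeled = [text for text, label in labeled]

print(len(labeled), len(unlabeled))
```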

How do you define a success or failure metric for your model?

You need to define a success metric and a failure metric. A success metric is used to determine how well your model performed on a given task, while a failure metric is used to determine how badly it performed.

For example, you might have trained your neural network to classify images of cats and dogs into their respective categories. You can then use this model as an image classifier by feeding it new pictures and asking which category they fall under: cat or dog? In this case, the accuracy rate would be one way of measuring how well your neural network recognizes cats versus dogs in images. It's simply defined as:

Accuracy = Number of Correct Predictions / Total Number of Predictions on the Test Set
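The formula above translates directly into code. The predictions and true labels below are invented for illustration.

```python
# Accuracy: the fraction of predictions that match the true labels.

def accuracy(predictions, labels):
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

# Hypothetical outputs of a cat/dog classifier on five test images.
preds = ["cat", "dog", "cat", "dog", "cat"]
truth = ["cat", "dog", "dog", "dog", "cat"]

print(accuracy(preds, truth))  # 4 of 5 correct -> 0.8
```

Accuracy is only one possible success metric; for imbalanced classes, measures such as precision and recall are often more informative.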

Supervised learning algorithms are used to predict outputs from inputs, given a known set of labels.

Examples of supervised learning algorithms include:

  • Decision trees – You can think of these as chains of simple if-then rules used to classify data. For example: if someone’s age is greater than 18, classify them as an adult; otherwise, classify them as a minor. Given a new person’s age and height measurements, the tree follows its rules from the root down to predict which class that person belongs to.
  • Random forests – Here, multiple decision trees work together so that we get better results overall than any individual tree would give alone. We call this ensemble learning, because the models act as one big team, an “ensemble”. The technique not only improves accuracy but also reduces overfitting, which happens when a model fits its training data so closely that it performs poorly on data outside that dataset.
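The two ideas above can be sketched together: each "tree" below is a hand-written if-then rule, and the "forest" combines them by majority vote. In a real random forest the rules are learned from data; the thresholds here are made up for illustration.

```python
# Three toy "decision trees": each is just a hard-coded if-then rule.
# Real trees learn these splits from training data.

def tree_a(age, height_ft):
    return "adult" if age > 18 else "minor"

def tree_b(age, height_ft):
    return "adult" if height_ft >= 5.0 and age > 16 else "minor"

def tree_c(age, height_ft):
    return "adult" if age >= 21 else "minor"

def forest(age, height_ft):
    """Ensemble prediction: each tree votes, majority wins."""
    votes = [tree(age, height_ft) for tree in (tree_a, tree_b, tree_c)]
    return max(set(votes), key=votes.count)

print(forest(25, 5.8))  # all three rules agree: "adult"
print(forest(15, 4.5))  # all three rules agree: "minor"
```

Because the trees disagree on borderline cases, the majority vote smooths out the quirks of any single rule, which is the intuition behind the reduced overfitting of ensembles.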


Supervised learning is a powerful tool for predicting outputs from inputs. It can be used to solve many real-world problems and has applications in fields like healthcare, finance, marketing, and more.