Understanding K-Nearest Neighbors in Supervised Learning

K-nearest neighbors is a straightforward supervised machine learning algorithm that classifies new data points based on proximity to existing data. Learn how it uses simple distance measurements to identify the closest neighbors and make predictions.

Key Insights

K-nearest neighbors (KNN) classifies new data points by analyzing their proximity to existing labeled data, typically using a default value of k equal to three.
The algorithm identifies the closest data points (neighbors) in a two-dimensional space, then assigns the new data point to the class represented most frequently among those neighbors.
Unlike other complex machine learning methods, KNN consistently employs the same straightforward algorithm, making it a distinct type of supervised learning model.

Note: These materials offer prospective students a preview of how our classes are structured. Students enrolled in this course will receive access to the full set of materials, including video lectures, project-based assignments, and instructor feedback.

All right, so let's talk about how k-nearest neighbors works. Typically, you set k to three. We're going to look at two of our three nearest neighbors and see which class they belong to.

So in this example, we've got some things classified as green triangles, just to visualize them, and some things classified as yellow squares. Those could be, again, like, you know, dogs and cats by weight and height. I don't know.

But some points on an X and Y according to a couple of variables. And what k-nearest neighbors is going to do is we're going to train them on this data. We're going to say that's a cat or a yellow square or whatever.

That's a dog or a green triangle or whatever. Actually, we're just giving it numbers. And it knows this number; this one over here is this number; that one over there is this number.

So it learns from that where these things are in space, in their X and Y values. Then we can throw in a new instance. Once we've trained the model, saying, "Here's all the data, " we can then say, "Hey, I want to test you."

Data Analytics Certificate: Live & Hands-on, In NYC or Online, 0% Financing, 1-on-1 Mentoring, Free Retake, Job Prep. Named a Top Bootcamp by Forbes, Fortune, & Time Out. Noble Desktop. Learn More.

What do you think this is? And it runs a fairly simple algorithm. Again, it's unlike many other machine learning models. It's not coming up with a new algorithm.

It's always using the same k-nearest neighbors algorithm, which is why this is a slightly different kind of machine learning model. So this supervised learning model, it's going to say, okay, I'm going to look, according to my k-nearest neighbors algorithm, you told me to use a three.

I'm going to look at the three closest neighbors. And it's going to look at this green triangle and this yellow square and say those are the closest ones and more of them are green triangles than yellow squares. I'm going to guess that's a green triangle, whatever class that is.

And that's how k-nearest neighbors works. We'll be exploring that more in a moment.

Colin Jaffe

Colin Jaffe is a programmer, writer, and teacher with a passion for creative code, customizable computing environments, and simple puns. He loves teaching code, from the fundamentals of algorithmic thinking to the business logic and user flow of application building—he particularly enjoys teaching JavaScript, Python, API design, and front-end frameworks.

Colin has taught code to a diverse group of students since learning to code himself, including young men of color at All-Star Code, elementary school kids at The Coding Space, and marginalized groups at Pursuit. He also works as an instructor for Noble Desktop, where he teaches classes in the Full-Stack Web Development Certificate and the Data Science & AI Certificate.

Colin lives in Brooklyn with his wife, two kids, and many intricate board games.

Key Insights

Colin Jaffe

How to Learn Machine Learning