Exploring the K-Nearest Neighbors Algorithm

Gain clarity on the k-nearest neighbors machine learning algorithm, a supervised model that classifies data points based on proximity to existing data. Learn how this method differs from regression by examining examples such as height, weight, and flower classification.

Key Insights

The k-nearest neighbors (KNN) algorithm is supervised machine learning that classifies data points by referencing the closest known data points rather than performing regression calculations.
KNN classification can be applied in practical contexts, such as categorizing animals based on height and weight or distinguishing flowers in standard machine learning datasets.
The article outlines initial setup instructions, including importing necessary libraries, initializing the KNN classifier model, and configuring Google Drive integration for data access.

Note: These materials offer prospective students a preview of how our classes are structured. Students enrolled in this course will receive access to the full set of materials, including video lectures, project-based assignments, and instructor feedback.

Here in part three, we're going to be working on the k-nearest neighbors machine learning algorithm. The k-nearest neighbors algorithm is a supervised machine learning algorithm for classifying data points based on, hey, what is the value of the closest existing points? So it learns in a different way than the original algorithms, where we did regressions.

KNN is not a regression. KNN looks instead at its memory of the data points it has seen. Right? Again, we give it, hey, this data point is this value.

This data point is this value. And we can plot all kinds of things with that. For example, height and weight.

We could plot X against Y and get a lot of data points of height and weight, maybe of animals. And be able to classify, saying, "Okay, this height and weight area over here seems to be dogs, " and this one over here seems to be cats.

We're going to look at some data next in the second part of part three. And we're going to look at some flower data and classify flowers in a classic machine learning dataset. But for now, we're going to explore just the concept of k-nearest neighbors and try to visualize it and try to get a sense of what it's doing.

Data Analytics Certificate: Live & Hands-on, In NYC or Online, 0% Financing, 1-on-1 Mentoring, Free Retake, Job Prep. Named a Top Bootcamp by Forbes, Fortune, & Time Out. Noble Desktop. Learn More.

All right. So very first thing, let's make sure we've got all of our ducks in a row regarding getting everything imported. We've combined a couple of things here.

But these are our basic data science imports, including our Jupyter Notebook ability to display images. Here we have our k-nearest neighbors classifier, our model. And we've also set up Google Drive.

So run that. And then as is typical, the base URL for any files we need to look at. Execute that.

And we can also execute this and start talking about k and N. All right. We'll talk about this image and what it means in just a moment.

Colin Jaffe

Colin Jaffe is a programmer, writer, and teacher with a passion for creative code, customizable computing environments, and simple puns. He loves teaching code, from the fundamentals of algorithmic thinking to the business logic and user flow of application building—he particularly enjoys teaching JavaScript, Python, API design, and front-end frameworks.

Colin has taught code to a diverse group of students since learning to code himself, including young men of color at All-Star Code, elementary school kids at The Coding Space, and marginalized groups at Pursuit. He also works as an instructor for Noble Desktop, where he teaches classes in the Full-Stack Web Development Certificate and the Data Science & AI Certificate.

Colin lives in Brooklyn with his wife, two kids, and many intricate board games.

Key Insights

Colin Jaffe

How to Learn Machine Learning