· Hakan Çelik · OpenCV / Machine Learning · 2 min read

Understanding K-Means Clustering

Learn the concepts behind the K-Means Clustering algorithm. We walk through the algorithm step by step, using a T-shirt sizing problem to explain how iterative centroid updates work.

Goal

In this chapter, we will learn the concepts behind K-Means Clustering and how the algorithm works.

Theory

Consider a company that is about to release a new T-shirt model. It will have to manufacture the shirt in different sizes to fit people of all sizes. So the company collects data on people’s heights and weights and plots them on a graph:

[Figure: T-shirt problem, initial data]

The company can’t make T-shirts in every size. Instead, it divides people into Small, Medium, and Large, and manufactures only these three sizes. This grouping of people into three groups can be done by k-means clustering, and the algorithm gives us the best three sizes:

[Figure: K-Means demo]

How does it work?

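This algorithm is an iterative process. For simplicity, the steps below use two clusters, but the same procedure applies to any number of centroids: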

Step 1: The algorithm randomly chooses two centroids, C1 and C2.

Step 2: It calculates the distance from each point to both centroids. If a data point is closer to C1, it is labelled ‘0’. If it is closer to C2, it is labelled ‘1’. Here, points labelled ‘0’ are colored red and points labelled ‘1’ are colored blue.

Step 3: Next, we calculate the average of all the red points and of all the blue points separately. These two averages become the new centroids, C1 and C2.
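As a rough illustration of Steps 2 and 3, here is a single pass in plain Python. The height and weight values and the two starting centroids are made up for this sketch, and the centroids are fixed by hand rather than chosen randomly so that the result is reproducible.

```python
import math

# Hypothetical (height cm, weight kg) samples, made up for illustration.
points = [(150, 50), (152, 52), (154, 54), (176, 80), (178, 82), (180, 84)]

# Step 1 (simplified): fixed starting centroids instead of random ones.
c1, c2 = (150, 50), (180, 84)

# Step 2: label each point 0 if it is closer to C1, otherwise 1.
labels = [0 if math.dist(p, c1) <= math.dist(p, c2) else 1 for p in points]


def mean_point(group):
    """Component-wise average of a non-empty list of 2D points."""
    return (sum(x for x, _ in group) / len(group),
            sum(y for _, y in group) / len(group))


# Step 3: the new centroids are the averages of each labelled group.
group0 = [p for p, label in zip(points, labels) if label == 0]
group1 = [p for p, label in zip(points, labels) if label == 1]
c1, c2 = mean_point(group0), mean_point(group1)

print(labels)  # [0, 0, 0, 1, 1, 1]
print(c1, c2)  # (152.0, 52.0) (178.0, 82.0)
```

After this one pass, each centroid has moved from its starting position to the middle of the points assigned to it.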

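Steps 2 and 3 are repeated until both centroids converge to fixed points, or until a stopping criterion such as a maximum number of iterations is reached. At convergence, the sum of distances between the data points and their corresponding centroids is at a minimum:

J = Σ distance(C1, Red_Point) + Σ distance(C2, Blue_Point) → minimize

Because Step 3 uses the average, the distance being minimized here is, strictly speaking, the squared Euclidean distance, and the minimum reached is a local one that depends on the initial centroids.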
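The whole loop, including the value of J at each iteration, can be sketched in plain Python as follows. This is a minimal sketch under a few assumptions: the function name `kmeans`, the sample data, and the starting centroids are all hypothetical, and squared Euclidean distance is used as the distance term in J, since that is the quantity the averaging in Step 3 is guaranteed not to increase.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Repeat Steps 2 and 3 until the centroids stop moving.

    Returns the final centroids, the labels, and the value of J
    recorded at every iteration.
    """
    k = len(centroids)
    labels, history = [], []
    for _ in range(max_iter):
        # Step 2: label each point with the index of its nearest centroid.
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        # J: sum of squared distances from each point to its centroid.
        history.append(sum(math.dist(p, centroids[label]) ** 2
                           for p, label in zip(points, labels)))
        # Step 3: move each centroid to the average of its points.
        updated = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                updated.append(tuple(sum(axis) / len(members)
                                     for axis in zip(*members)))
            else:
                updated.append(centroids[j])  # empty cluster: leave it in place
        if updated == centroids:  # nothing moved, so we have converged
            break
        centroids = updated
    return centroids, labels, history


# Hypothetical (height cm, weight kg) data, made up for illustration.
points = [(150, 50), (152, 52), (154, 54),
          (164, 64), (166, 66), (168, 68),
          (178, 80), (180, 82), (182, 84)]

# Step 1 would normally pick these at random, e.g. random.sample(points, 3);
# they are fixed here so that the run is reproducible.
start = [(150, 50), (166, 66), (182, 84)]

centroids, labels, history = kmeans(points, start)
print(centroids)  # three centroids, one per size
print(labels)     # cluster index for every point
print(history)    # J at each iteration; it does not increase
```

With random starting centroids, different runs can settle on different groupings, so it is common to run the loop several times and keep the result with the lowest J.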


Source: OpenCV Python Tutorials — Original Documentation
