K-Means What…??

Niharika Dhanik
4 min readJul 19, 2021

--

In this blog, I will be sharing important aspects of k-mean clustering and its real use-case in the security domain.

In today’s world security is an aspect which is given higher priority by all political and government worldwide and aiming to reduce crime incidence. As data mining is the appropriate field to apply on high volume crime dataset and knowledge gained from data mining approaches will be useful and support police force. So In this article, we will discuss crime analysis, done by performing k-means clustering.

What is K-means Clustering —

K-means clustering is one of the method of cluster analysis which aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean. It is an unsupervised learning algorithm. There is no labeled data for this clustering, unlike in supervised learning. K-Means performs the division of objects into clusters that share similarities and are dissimilar to the objects belonging to another cluster.

K means algorithm complexity is O(tkn), where n is instances, c is clusters, and t is iterations and relatively efficient . It often terminates at a local optimum. Its disadvantage is applicable only when mean is defined and need to specify c, the number of clusters, in advance. It unable to handle noisy data and outliers and not suitable to discover clusters with non-convex shapes.

The term ‘K’ is a number. You need to tell the system how many clusters you need to create. For example, K = 2 refers to two clusters. There is a way of finding out what is the best or optimum value of K for a given data.

Types of Clustering —

Clustering is a type of unsupervised learning wherein data points are grouped into different sets based on their degree of similarity.

The various types of clustering are:

  • Hierarchical clustering
  • Partitioning clustering

Hierarchical clustering is further subdivided into:

  • Agglomerative clustering
  • Divisive clustering

Partitioning clustering is further subdivided into:

  • K-Means clustering
  • Fuzzy C-Means clustering

The following images gives a clear idea of the various types of clustering.

Fuzzy C-Means /K-means clustering / Agglomerative clustering / Divisive clustering /Hierarchical clustering

Applications of K-Means Clustering —

K-Means clustering is used in a variety of examples or business cases in real life, like:

  • Academic performance
  • Diagnostic systems
  • Search engines
  • Wireless sensor networks

Applications of K-Means Clustering With Respect To Cyber Security—

The following describes the use case of k-mean clustering in the security domain. The agenda of the use-case is discussed below —

1. Extraction of crime patterns by analysis of available crime and criminal data.

2. Prediction of crime based on spatial distribution of existing data and anticipation of crime rate using different data mining techniques

3. Detection of crime

The steps to be followed are — Initially, the number of clusters must be known let it be. Then the initial step is the choose a set of K instances as the center of the clusters. Next, the algorithm considers each instance and assigns it to the cluster which is closest. The cluster centroids are recalculated either after whole cycle of re-assignment or each instance assignment. This process is iterated.

This is one of the example that shows how K-Means Clustering Algorithm is used for security application. That’s all for this blog.

Thank You!

--

--

Niharika Dhanik
Niharika Dhanik

No responses yet