To write a program to implement the K Means Clustering for Customer Segmentation.
- Hardware: PCs
- Anaconda: Python 3.7 Installation / Jupyter notebook
1. Start the program.
2. Import pandas and matplotlib.pyplot.
3. Read the dataset and transform it.
4. Import KMeans and fit the data in the model.
5. Plot the cluster graph.
6. End the program.
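Step 4 leaves the actual clustering to scikit-learn. As a rough illustration of what fitting a K-Means model involves, the sketch below runs the basic assign-and-update loop (Lloyd's iteration) in plain Python. It is not scikit-learn's implementation: the function names, the toy points, and the hand-picked starting centers are all assumptions made for the example, and it skips the k-means++ initialisation that the program uses.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centers):
    """For each point, return the index of its nearest center."""
    return [
        min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
        for p in points
    ]


def update(points, labels, centers):
    """Move each center to the mean of its members (unchanged if it has none)."""
    new_centers = []
    for j, old in enumerate(centers):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if members:
            new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
        else:
            new_centers.append(old)
    return new_centers


def lloyd_kmeans(points, centers, max_iter=100):
    """Alternate assignment and update steps until the centers stop moving."""
    centers = [tuple(c) for c in centers]
    labels = assign(points, centers)
    for _ in range(max_iter):
        new_centers = update(points, labels, centers)
        if new_centers == centers:
            break
        centers = new_centers
        labels = assign(points, centers)
    return labels, centers


# Hypothetical 2-D points forming two obvious groups
toy_points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
# Hand-picked starting centers (an assumption; the program uses k-means++)
toy_labels, toy_centers = lloyd_kmeans(toy_points, [(1, 1), (8, 8)])
print(toy_labels)
print(toy_centers)
```

On this toy data the loop settles after one update, with the first three points in one cluster and the last three in the other.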
# Program to implement the K Means Clustering for Customer Segmentation.
# Developed by: Karthick P
# RegisterNumber: 212222100021
import pandas as pd
import matplotlib.pyplot as plt
data = pd.read_csv("/content/Mall_Customers.csv")
data.head()
data.info()
data.isnull().sum()
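from sklearn.cluster import KMeans

# Elbow method: fit K-Means for k = 1..10 and record the
# within-cluster sum of squares (WCSS) for each k.
# Columns from index 3 onward are the features used for clustering
# (annual income and spending score in this dataset).
wcss = []
for i in range(1, 11):
    kmeans = KMeans(n_clusters=i, init="k-means++", n_init=10, random_state=0)
    kmeans.fit(data.iloc[:, 3:])
    wcss.append(kmeans.inertia_)

plt.plot(range(1, 11), wcss)
plt.xlabel("No of Clusters")
plt.ylabel("wcss")
plt.title("Elbow Method")
plt.show()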
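# Fit the final model with 5 clusters and label every customer
km = KMeans(n_clusters=5, n_init=10, random_state=0)
km.fit(data.iloc[:, 3:])
y_pred = km.predict(data.iloc[:, 3:])
y_pred
# Store each customer's cluster label alongside the original columns
data["cluster"] = y_pred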
df0 = data[data["cluster"]==0]
df1 = data[data["cluster"]==1]
df2 = data[data["cluster"]==2]
df3 = data[data["cluster"]==3]
df4 = data[data["cluster"]==4]
plt.scatter(df0["Annual Income (k$)"],df0["Spending Score (1-100)"],c="red",label="cluster0")
plt.scatter(df1["Annual Income (k$)"],df1["Spending Score (1-100)"],c="black",label="cluster1")
plt.scatter(df2["Annual Income (k$)"],df2["Spending Score (1-100)"],c="blue",label="cluster2")
plt.scatter(df3["Annual Income (k$)"],df3["Spending Score (1-100)"],c="green",label="cluster3")
plt.scatter(df4["Annual Income (k$)"],df4["Spending Score (1-100)"],c="magenta",label="cluster4")
plt.xlabel("Annual Income (k$)")
plt.ylabel("Spending Score (1-100)")
plt.legend()
plt.title("Customer Segments")
plt.show()
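The `wcss` list in the program collects `kmeans.inertia_`, the within-cluster sum of squares: the sum of squared distances from each point to the center of its assigned cluster. The following minimal sketch computes that quantity by hand so the elbow curve is easier to interpret. The helper name `within_cluster_ss`, the four points, and the labels and centers are hypothetical values chosen for the example, not taken from the mall dataset.

```python
def within_cluster_ss(points, labels, centers):
    """Sum of squared distances from each point to its assigned center."""
    total = 0.0
    for point, label in zip(points, labels):
        total += sum((a - b) ** 2 for a, b in zip(point, centers[label]))
    return total


# Hypothetical points: two pairs that sit far apart along the x-axis
sample_points = [(1, 1), (1, 3), (9, 1), (9, 3)]

# k = 1: every point belongs to a single cluster centered at the overall mean
wcss_one = within_cluster_ss(sample_points, [0, 0, 0, 0], [(5, 2)])

# k = 2: each pair gets its own center
wcss_two = within_cluster_ss(sample_points, [0, 0, 1, 1], [(1, 2), (9, 2)])

print(wcss_one, wcss_two)
```

Going from one cluster to two drops the value sharply here because the extra center removes the large horizontal distances. In an elbow plot, the bend marks the point where adding more clusters stops producing drops of that size.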
data.head() function
data.info()
data.isnull().sum() function
Elbow method Graph
KMeans clusters
Customer segments Graph
Thus, the program to implement K Means Clustering for Customer Segmentation was written and verified using Python.