- Find the best combination of scaler, encoder, clustering algorithm
- Print plot result of each combinations
- Clustering result analysis using a quality measure tool (Silhouetees score, knee method and purity)
- Create a program structure that will make it possible to automatically run different combinations.
- california housing price dataset from kaggle.
- https://www.kaggle.com/camnugent/california-housing-prices
- 10 columns(features)
- K-means, EM(GMM), CLARANS, DBSCAN, Spectral
- StandardScaler(), MinMaxScaler(), MaxAbsScaler(), RobustScaler()
- OrdinalEncoder(), OneHotEncoder(), LabelEncoder()