For the Toronto neighborhood data, a Wikipedia page provides all the information needed to explore and cluster the city's neighborhoods. The data is scraped from that page, wrangled and cleaned, and then loaded into a pandas dataframe so that it is in a structured format. With the data in that form, the neighborhoods in the city of Toronto are explored and clustered.
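The scrape-and-load step described above might look roughly like the sketch below. It parses an inline HTML string instead of the live page so that it runs without network access. The table layout, the column names, the sample rows, and the "Not assigned" cleaning rule are illustrative assumptions, not taken from the actual Wikipedia page.

```python
from bs4 import BeautifulSoup
import pandas as pd

# Hypothetical stand-in for the page HTML. In practice the real page would be
# downloaded first (for example with the requests library) and its text used here.
html = """
<table class="wikitable">
  <tr><th>Postal Code</th><th>Borough</th><th>Neighbourhood</th></tr>
  <tr><td>M1A</td><td>Not assigned</td><td>Not assigned</td></tr>
  <tr><td>M3A</td><td>North York</td><td>Parkwoods</td></tr>
  <tr><td>M5A</td><td>Downtown Toronto</td><td>Regent Park, Harbourfront</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
table = soup.find("table", class_="wikitable")

# Column names come from the <th> cells of the header row.
columns = [th.get_text(strip=True) for th in table.find_all("th")]

# Each data row becomes a list of cell strings; the header row has no <td>
# cells, so it yields an empty list and is skipped.
rows = []
for tr in table.find_all("tr"):
    cells = [td.get_text(strip=True) for td in tr.find_all("td")]
    if cells:
        rows.append(cells)

df = pd.DataFrame(rows, columns=columns)

# Example cleaning step (an assumption): drop rows with no assigned borough.
df = df[df["Borough"] != "Not assigned"].reset_index(drop=True)
print(df)
```

Building the dataframe from plain lists keeps the parsing logic visible. `pandas.read_html` could probably replace the loop, at the cost of hiding how the table is walked.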
- Scrape the website and parse the HTML using the Python package BeautifulSoup, then convert the data into a pandas dataframe.
- Implement k-means clustering, which is a form of unsupervised learning.
- Use clustering and the Foursquare API to segment and cluster the neighborhoods in the city of Toronto.
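
As a rough illustration of the k-means step listed above, here is a minimal from-scratch sketch in plain Python. It is not the project's implementation: the `kmeans` helper, the deterministic farthest-first initialisation, and the two-dimensional sample points are all assumptions made for the example. In a real analysis the feature vectors would more likely be venue-category frequencies per neighborhood, and a library implementation such as scikit-learn's `KMeans` would usually be used instead.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iterations=100):
    """Cluster points into k groups; returns (labels, centroids)."""
    # Deterministic farthest-first initialisation (a simplification chosen
    # for reproducibility; random or k-means++ seeding is more common).
    centroids = [points[0]]
    while len(centroids) < k:
        farthest = max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids),
        )
        centroids.append(farthest)

    labels = [0] * len(points)
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                mean = tuple(sum(dim) / len(members) for dim in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:  # converged: nothing moved
            break
        centroids = new_centroids
    return labels, centroids


# Hypothetical feature vectors for six neighborhoods (two obvious groups).
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
labels, centroids = kmeans(points, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

The algorithm needs no labelled examples, which is why it counts as unsupervised learning: the groups emerge only from the distances between feature vectors.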