The Titanic dataset is a well-known dataset containing information about the passengers who were on board the Titanic when it sank. It is widely used by data analysts and machine learning enthusiasts to build models that predict passenger survival. In this project, the R language and RStudio were used to predict the survival of passengers and to create various data visualizations.
The project begins by importing the Titanic dataset into RStudio. The dataset is cleaned and preprocessed so that no missing values or irrelevant data remain. The data is then split into two subsets, a training set and a test set. The training set is used to train the machine learning model (Random Forest), and the test set is used to evaluate the model's performance. Once the model is trained, it is used to predict survival for the passengers in the test set. The predictions are then compared to the actual survival status of those passengers, and the model's accuracy is calculated.
The data is further explored using the ggplot2 library to create colourful visualizations. A scatter plot shows the relationship between age and fare, a bar plot shows the number of survivors and non-survivors in each passenger class, and several other plots are available. The entire system runs in a clean menu-driven format that displays the relevant information when the user requests it.
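The split, train, predict and evaluate steps described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes the `randomForest` package (the text names the algorithm but not the package), an 80/20 split, a fixed seed and 200 trees, none of which are stated in the project description. Because the project's data file is not shown, base R's built-in aggregated `Titanic` table is expanded to one row per person as a stand-in; it has no numeric age or fare columns.

```r
# Assumption: the randomForest package is installed; the project description
# names the algorithm but not the package that implements it.
library(randomForest)

# Stand-in data: base R's aggregated `Titanic` table, expanded to one row
# per person. The project's own dataset is not shown here.
counts <- as.data.frame(Titanic)
passengers <- counts[rep(seq_len(nrow(counts)), counts$Freq),
                     c("Class", "Sex", "Age", "Survived")]
rownames(passengers) <- NULL

# Cleaning step: drop rows with missing values (this stand-in has none).
passengers <- na.omit(passengers)

# Split into a training set and a test set. The 80/20 ratio and the seed
# are assumptions made for this sketch.
set.seed(42)
train_idx <- sample(nrow(passengers), size = floor(0.8 * nrow(passengers)))
train_set <- passengers[train_idx, ]
test_set  <- passengers[-train_idx, ]

# Train a Random Forest classifier on the training set only.
model <- randomForest(Survived ~ Class + Sex + Age,
                      data = train_set, ntree = 200)

# Predict on the held-out test set and compare with the actual outcomes.
predicted <- predict(model, newdata = test_set)
accuracy  <- mean(predicted == test_set$Survived)
cat(sprintf("Test accuracy: %.3f\n", accuracy))
```

Accuracy here is the fraction of test-set passengers whose predicted survival status matches the recorded one. The value depends on the split and on the features used, so it will differ from whatever the project itself reports.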
The menu-driven interface presents the following options:
Choice 1 uses the model to predict whether a new passenger would survive.
Choice 2 displays the model used and its accuracy.
Choice 3 opens a submenu of visualizations built with the ggplot2 library.
Choice 4 displays creator information and project details.
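One way such a menu could be wired up in base R is sketched below. The handler functions, their return strings, the `0` exit option and the names `menu_actions`, `handle_choice` and `run_menu` are all hypothetical placeholders introduced for illustration; the project's real handlers would call the model and the plotting code instead.

```r
# Hypothetical handlers standing in for the project's real ones.
predict_passenger   <- function() "Prediction for a new passenger"
show_model_accuracy <- function() "Model used and its accuracy"
show_plot_menu      <- function() "Visualization submenu (ggplot2)"
show_about          <- function() "Creator information and project details"

# Map each menu choice to the function that handles it.
menu_actions <- list(
  "1" = predict_passenger,
  "2" = show_model_accuracy,
  "3" = show_plot_menu,
  "4" = show_about
)

# Look up the chosen action and run it; unknown input gets an error message.
handle_choice <- function(choice) {
  if (!(choice %in% names(menu_actions))) {
    return("Invalid choice, please try again.")
  }
  menu_actions[[choice]]()
}

# Interactive loop. The "0. Exit" option is an assumption of this sketch.
run_menu <- function() {
  repeat {
    cat("1. Predict survival of a new passenger\n",
        "2. Show model and accuracy\n",
        "3. Visualizations\n",
        "4. About\n",
        "0. Exit\n", sep = "")
    choice <- trimws(readline("Enter your choice: "))
    if (choice == "0") break
    cat(handle_choice(choice), "\n")
  }
}

# Only start the loop in an interactive session such as the RStudio console.
if (interactive()) run_menu()
```

Keeping the dispatch in `handle_choice` separate from the input loop means each option can be checked without typing into the console, for example `handle_choice("2")`.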
In conclusion, this project demonstrates the use of the R language and RStudio to analyze the Titanic dataset, build a machine learning model, and create various data visualizations. It shows how the ggplot2 package can be used to visualize data and how a machine learning algorithm such as Random Forest can be used to predict the survival of new passengers, all within a user-friendly, interactive, menu-driven environment.
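As an illustration of the ggplot2 usage summarized here, the bar plot of survivors and non-survivors per passenger class might be drawn along these lines. This is a sketch under assumptions: it uses base R's built-in `Titanic` table as a stand-in for the project's data (so the crew appears as a fourth class), and the labels and the dodged bar layout are illustrative choices. The age-versus-fare scatter plot cannot be reproduced from this stand-in because it has no numeric age or fare columns.

```r
# Assumption: the ggplot2 package is installed.
library(ggplot2)

# Stand-in data: expand base R's aggregated `Titanic` table to one row per
# person. The project's own dataset is not shown here.
counts <- as.data.frame(Titanic)
passengers <- counts[rep(seq_len(nrow(counts)), counts$Freq),
                     c("Class", "Sex", "Age", "Survived")]

# Bar plot: number of survivors and non-survivors in each class,
# drawn side by side and coloured by survival status.
survival_by_class <- ggplot(passengers, aes(x = Class, fill = Survived)) +
  geom_bar(position = "dodge") +
  labs(title = "Survivors and non-survivors by class",
       x = "Class", y = "Number of people", fill = "Survived")

# Render the plot (in RStudio it appears in the Plots pane).
print(survival_by_class)
```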