Three questions were formulated and answered using the dataset. Each question applies a different methodology learned in class and has both quantitative and qualitative components. A Jupyter notebook is used to state the concrete questions, hypotheses, and methodologies, and to present the significance of the results.
Question 1: Among passengers with long arrival delays, which ones might still be satisfied?
Question 2: How many passengers give a high score for “Food and Drink”?
Question 3: What is the accuracy of k-Nearest Neighbors (k = 3, 5, and 7) and other classifiers on this dataset?
All three questions have been answered. For the first question, the satisfied passengers among those with long arrival delays are predominantly loyal customers. For the second, passengers might not have a strong opinion on food and drink, so there is room for airlines to improve these services and attract more customers. For the third, several classifiers were trained and tested on this dataset, and the accuracy of each was estimated.
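To make the third question concrete, the following is a minimal sketch of how the accuracy of a k-Nearest Neighbors classifier could be estimated for k = 3, 5, and 7. It is not the notebook's actual code. The text does not say which library, features, or train/test split were used, so this sketch relies only on the Python standard library, a hold-out split of 30%, and synthetic two-cluster data standing in for the real passenger records. The names `make_toy_data`, `knn_predict`, and `holdout_accuracy` are hypothetical.

```python
import random
from collections import Counter
from math import dist


def make_toy_data(n_per_class=100, seed=0):
    """Generate two Gaussian clusters as a stand-in for the real features (hypothetical data)."""
    rng = random.Random(seed)
    data = []
    for label, center in ((0, (0.0, 0.0)), (1, (3.0, 3.0))):
        for _ in range(n_per_class):
            point = (rng.gauss(center[0], 1.0), rng.gauss(center[1], 1.0))
            data.append((point, label))
    rng.shuffle(data)
    return data


def knn_predict(train, query, k):
    """Predict the majority label among the k training points closest to the query."""
    neighbors = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def holdout_accuracy(data, k, test_fraction=0.3):
    """Estimate accuracy as the share of correct predictions on a held-out split."""
    n_test = int(len(data) * test_fraction)
    test, train = data[:n_test], data[n_test:]
    correct = sum(knn_predict(train, point, k) == label for point, label in test)
    return correct / n_test


# Assumption: a single 70/30 hold-out split; the notebook may have used another scheme.
data = make_toy_data()
accuracies = {k: holdout_accuracy(data, k) for k in (3, 5, 7)}
for k, acc in accuracies.items():
    print(f"k={k}: accuracy={acc:.3f}")
```

The same `holdout_accuracy` pattern (fit on the training split, count correct predictions on the test split) would apply to any other classifier being compared. Using odd values of k avoids tied votes when there are two classes.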