Mini-group-project-2

HKUST (GZ) course project

This project uses Python with Dask to efficiently manipulate and analyze a comprehensive dataset on global trade. Our objectives include deriving descriptive statistics, identifying trade patterns through unsupervised learning, and predicting trade flows using machine learning techniques.

Descriptive analysis

The project begins by identifying the countries with the most and the fewest trading partners, using Dask DataFrames for efficient computation. The analysis then covers overall trade volume and the highest-value sectors in the trade dataset.

Top and Bottom Trading Countries

  • Calculated the number of unique trading partners for each country.
  • Identified the top 10 and bottom 10 countries based on the number of trading partners.

Trade Volume and High-Value Sectors

  • Merged product descriptions with trade data to categorize trade volumes.
  • Highlighted sectors of economic significance, including petroleum, pharmaceuticals, metals, communications, and automobiles.

Result illustrations

The trade data are explored with aggregation functions such as groupby.

Unsupervised Learning: Identifying Trade Patterns

We use unsupervised learning to uncover latent patterns in the global trade data. The data are first normalized, and K-means clustering then groups countries by their trading behavior.

Clustering Countries Based on Trade Patterns

  • Applied K-means clustering to distinguish groups with similar export/import values and sectors.
  • Utilized PCA for dimensionality reduction, aiding in visualization and analysis.
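
The normalize, cluster, and project sequence can be sketched with scikit-learn as follows. The feature matrix here is synthetic (two artificial groups of "countries" with four made-up features), and the choice of two clusters is an assumption for the toy data, not the number used in the project.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic per-country features, e.g. export value, import value, and
# two sector shares. Two well-separated groups of 10 countries each.
small = rng.normal(loc=[1.0, 1.0, 0.2, 0.1], scale=0.1, size=(10, 4))
large = rng.normal(loc=[10.0, 9.0, 0.6, 0.5], scale=0.1, size=(10, 4))
features = np.vstack([small, large])

# Standardize so that large-valued columns do not dominate the distances.
scaled = StandardScaler().fit_transform(features)

# Assign each country to one of two clusters.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(scaled)

# Project to two principal components for a scatter plot.
coords = PCA(n_components=2).fit_transform(scaled)
print(labels)
print(coords.shape)
```

Plotting `coords` colored by `labels` gives the kind of two-dimensional cluster view that PCA is used for here.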

Machine Learning: Predicting Trade Flows

A machine learning model predicts export quantities based on several features including distance, GDP, and specific sectors. Feature engineering and hyperparameter tuning are conducted to enhance model performance.

Model Building and Evaluation

  • Split the dataset into training and testing sets.
  • Utilized RandomForestRegressor for predictions.
  • Evaluated the model using metrics such as MAE, MSE, and R^2 scores.
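
The split, fit, and evaluate steps can be sketched as below. The data are synthetic and the feature names (`log_distance`, `log_gdp`, an integer-coded `sector`) are assumptions; the hyperparameters shown are defaults for illustration, not the tuned values from the project.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 500

# Synthetic features standing in for distance, GDP, and sector.
log_distance = rng.uniform(5, 10, n)
log_gdp = rng.uniform(20, 30, n)
sector = rng.integers(0, 5, n)

# Synthetic target: exports rise with GDP and fall with distance.
log_exports = (
    0.8 * log_gdp - 1.0 * log_distance + 0.3 * sector + rng.normal(0, 0.2, n)
)

X = np.column_stack([log_distance, log_gdp, sector])
X_train, X_test, y_train, y_test = train_test_split(
    X, log_exports, test_size=0.2, random_state=42
)

model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
pred = model.predict(X_test)

# Evaluate on the held-out 20% only.
mae = mean_absolute_error(y_test, pred)
mse = mean_squared_error(y_test, pred)
r2 = r2_score(y_test, pred)
print(f"MAE={mae:.3f}  MSE={mse:.3f}  R^2={r2:.3f}")
```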

Predictions for New Country-Sector Pairs

  • Predicted export quantities for unseen country-sector pairs.
  • Discussed the implications and limitations of using machine learning in trade flow predictions.
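
Scoring an unseen country-sector pair requires the new rows to have the same feature columns as the training data. The sketch below shows one way to do that with one-hot encoding and `reindex`; all column names and numbers are hypothetical.

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Hypothetical observed country-sector pairs.
observed = pd.DataFrame({
    "distance_km": [1000, 8000, 3000, 12000, 500, 7000],
    "gdp_usd_bn": [500, 500, 2000, 2000, 300, 300],
    "sector": ["metals", "petroleum", "metals",
               "automobiles", "petroleum", "automobiles"],
    "export_qty": [90.0, 40.0, 150.0, 60.0, 110.0, 30.0],
})

# One-hot encode the sector so the forest receives numeric inputs.
X = pd.get_dummies(
    observed[["distance_km", "gdp_usd_bn", "sector"]], columns=["sector"]
)
y = observed["export_qty"]
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# Hypothetical pairs that do not appear in the training data.
new_pairs = pd.DataFrame({
    "distance_km": [2000, 9000],
    "gdp_usd_bn": [800, 1500],
    "sector": ["metals", "petroleum"],
})

# Align columns with the training matrix; sectors absent here become 0.
X_new = pd.get_dummies(new_pairs, columns=["sector"]).reindex(
    columns=X.columns, fill_value=0
)
predicted = model.predict(X_new)
print(predicted)
```

One limitation is visible even in this toy setup: a random forest averages training targets, so its predictions cannot fall outside the range of export quantities it was trained on.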

Bonus Questions: Economic Implications of Distance on Trade

This section explores the impact of distance on commodity exports, focusing on transportation costs and the nature of the goods. It theorizes how distance influences trade patterns and proposes an analytical framework for further investigation.

Economic Theory and Data Requirements

  • Outlined the necessity of export data, distance metrics, and transportation cost information.
  • Theorized the relationship between transportation costs, the nature of goods, and distance.

Details

  • Economically, the impact of distance on exporting commodities is related to transportation costs and the nature of the goods being traded.
  • Transportation Costs: The farther a commodity needs to travel to reach its market, the higher the transportation costs. The heavier a commodity is, the higher the cost.
  • Nature of Goods: Different commodities have different sensitivities to distance.
    • Perishable items like fruits and vegetables suffer more from long distances due to spoilage risks.
    • On the other hand, high-value goods like electronics might be less affected by distance because transportation costs form a smaller part of their overall value.
  • Data Needed:
    • Export Data: Detailed information on what each country exports and to where.
    • Distance Data: Distances between exporting countries and major trading partners.
    • Transportation Costs Data: Information on shipping costs, including rates and modes of transport.

Analytical Framework

  • Presented a structured approach for analyzing the impact of distance on trade patterns.
  • Suggested comparative studies, regression analysis, and detailed sector-specific case studies.

Details

  • Compare Distances: Calculate average distances for each commodity sector to different export markets.
  • Cost Analysis: Analyze how transportation costs vary across sectors and how they relate to distance. (Regression analysis)
  • Case Studies: Look at specific examples within sectors to understand how distance impacts trade.
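
The first two steps of the framework can be sketched as a per-sector average distance and a per-sector linear fit of transport cost on distance. The sectors, distances, and costs below are invented for illustration, and a simple least-squares line via `numpy.polyfit` stands in for whatever regression specification a full study would use.

```python
import numpy as np
import pandas as pd

# Made-up shipments: distance to market and transport cost per tonne.
flows = pd.DataFrame({
    "sector": ["produce"] * 3 + ["electronics"] * 3,
    "distance_km": [500, 1000, 2000, 2000, 6000, 10000],
    "cost_per_tonne": [100.0, 150.0, 250.0, 60.0, 140.0, 220.0],
})

# Compare distances: average shipping distance for each sector.
avg_distance = flows.groupby("sector")["distance_km"].mean()

# Cost analysis: slope of cost on distance, fitted separately per sector.
slopes = {}
for sector, group in flows.groupby("sector"):
    slope, intercept = np.polyfit(
        group["distance_km"], group["cost_per_tonne"], 1
    )
    slopes[sector] = slope

print(avg_distance)
print(slopes)
```

A steeper slope means cost grows faster with distance in that sector, which is the quantity the comparison across sectors would focus on.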

Contributors

qblyqq, artemis20123
