Fintech540-Machine Learning For Fintech - Group_project:smiling_imp:

Description:

This is the repo for a machine learning model-building project that takes cryptocurrency data as input and uses supervised (regression, classification) and unsupervised (density estimation, clustering) machine learning algorithms to find interesting patterns in the dataset.

Architecture:

Regression (1 model, linear regression; 2-3 people):

Function: Serves as a benchmark to show that clustering performs better.
Tasks:
a. Identify factors that influence our cryptocurrencies' prices (S&P 500, etc.).
b. Choose three cryptocurrencies to present.
c. Finish in one week.
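A minimal sketch of what such a benchmark could look like: a one-factor ordinary least squares fit of a coin's returns on an external factor's returns. The function names and the numbers are made up for illustration; the project's actual factors and model setup are not specified here.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def r_squared(x, y, intercept, slope):
    """Share of the variance in y explained by the fitted line."""
    y_bar = mean(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

# Made-up daily returns: an external factor (e.g. an index) and one coin.
factor_returns = [0.01, -0.02, 0.005, 0.015, -0.01]
coin_returns = [0.03, -0.05, 0.01, 0.04, -0.02]

intercept, slope = fit_line(factor_returns, coin_returns)
r2 = r_squared(factor_returns, coin_returns, intercept, slope)
print(f"slope={slope:.3f} intercept={intercept:.4f} R^2={r2:.3f}")
```

The R^2 value gives a single number that later models can be compared against.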

Requirements

1. One model
2. Focus on one kind of cryptocurrency or the top 50/100 by market cap
3. Presentation deadline: 11/17/2022

Motivation:

Machine learning is now considered a key part of many financial services and applications, including managing assets, evaluating levels of risk, calculating credit scores, and even approving loans. Machine learning is a subset of data science in which systems learn and improve from experience without being explicitly programmed.
In this project, we explore how unsupervised learning algorithms (clustering) can be applied to cryptocurrency data, and we assess their performance to see whether they reveal any interesting patterns.

Take a peep into our dataset:

You can find our dataset here: web_link. The dataset has 1,243,590 entries and 12 columns:

time_open :

time_high : Time the cryptocurrency reaches its highest price.

time_low : Time the cryptocurrency reaches its lowest price.

quote.USD.open :

quote.USD.high :

quote.USD.low :

quote.USD.close :

quote.USD.volume :

quote.USD.market_cap : The total market value of a cryptocurrency's circulating supply. It is analogous to the free-float capitalization in the stock market.

quote.USD.timestamp :

symbol : The symbol of the cryptocurrency.

id : Together with symbol, uniquely identifies a cryptocurrency.

Additional Notes:

We might not try out all machine learning algorithms in the first stage. We might focus on unsupervised learning algorithms such as clustering.

Overall Progress:

Problems we're facing

Heyyy! Write your own progress here!👻

Patrick Duan

K-means finished
I did some EDA work and feature engineering on our data:
- extracted minute and second as new features from time_high and time_low
- dropped the other categorical columns
- standardized all numerical columns, since distance matters in our model
Remaining to do:
- result interpretation
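The feature engineering steps listed above can be sketched without any third-party packages. This is an illustration only: the timestamp format and the helper names `minute_second` and `standardize` are assumptions, and the project itself may have used pandas and scikit-learn for these steps.

```python
from datetime import datetime
from statistics import mean, pstdev

def minute_second(ts):
    """Split an ISO-8601 timestamp string into (minute, second)."""
    t = datetime.fromisoformat(ts)
    return t.minute, t.second

def standardize(column):
    """Rescale a numeric column to zero mean and unit variance."""
    mu, sigma = mean(column), pstdev(column)
    if sigma == 0:
        return [0.0 for _ in column]  # constant column carries no information
    return [(v - mu) / sigma for v in column]

# Assumed timestamp format; the real column may look different.
print(minute_second("2022-11-01T13:45:07"))

# Made-up values on very different scales, e.g. a price and a volume.
price = [1.0, 2.0, 3.0]
volume = [1_000_000.0, 3_000_000.0, 2_000_000.0]
z_price, z_volume = standardize(price), standardize(volume)
print(z_price, z_volume)
```

Standardizing matters for K-means because it groups points by Euclidean distance, so a column with large raw values (such as volume) would otherwise dominate the result.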

Chenxi Rong & Yiwei Cheng

What we mainly did were the steps before Stacey's fancy plots!
- checked the raw data and dropped the missing values after testing.
- added a new column combining the symbol and id.
- extracted the date from the timestamp.
- drew a rough plot and made a few assumptions about clustering.
Remaining to do:
- building new models
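A rough sketch of the cleaning steps above, using plain dictionaries in place of a data frame. The output column names `coin` and `date`, the timestamp format, and the sample rows are assumptions made for illustration.

```python
from datetime import datetime

def clean(rows):
    """Drop rows with missing values, then add a combined key and a date."""
    cleaned = []
    for row in rows:
        if any(value is None for value in row.values()):
            continue  # skip rows with missing values
        out = dict(row)
        # Combined key such as "BTC_1" (symbol plus id).
        out["coin"] = f"{row['symbol']}_{row['id']}"
        # Keep only the calendar date of the timestamp.
        ts = datetime.fromisoformat(row["quote.USD.timestamp"])
        out["date"] = ts.date().isoformat()
        cleaned.append(out)
    return cleaned

# Made-up sample rows; the second one has a missing close price.
raw = [
    {"symbol": "BTC", "id": 1, "quote.USD.close": 20000.0,
     "quote.USD.timestamp": "2022-11-01T23:59:59"},
    {"symbol": "ETH", "id": 1027, "quote.USD.close": None,
     "quote.USD.timestamp": "2022-11-01T23:59:59"},
]
cleaned = clean(raw)
print(cleaned)
```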

Yiwei Cheng

Worked with Stacey and Chenxi on EDA before clustering
Built GMM and DBSCAN models on the data frame produced by Patrick's feature engineering
- GMM package (model and probability)
- DBSCAN: found and visualized the best eps and min_samples
- DBSCAN result: with eps=1.5, min_samples=4, and data=df[0:10000], we get 3 clusters: cluster 0, cluster 1, and cluster 2
- DBSCAN result: cluster -1 is the noise
Wrote PowerPoint slides for the introduction, interpretation, and conclusion of the DBSCAN model and fixed some formatting problems in the presentation slides
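To make the roles of eps, min_samples, and the noise label -1 concrete, here is a small dependency-free sketch of the DBSCAN algorithm. It is not the project's code, and the toy points are made up; only the eps=1.5 and min_samples=4 values are taken from the result above.

```python
from math import dist

NOISE = -1

def dbscan(points, eps, min_samples):
    """Label each point with a cluster index, or -1 for noise."""
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbors(i):
        # Indices of all points within eps of point i (itself included).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = NOISE  # may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_samples:  # j is a core point
                queue.extend(reachable)
    return labels

# Two tight groups of four points and one isolated point (toy data).
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_samples=4)
print(labels)
```

In this toy data the isolated point has no neighbors within eps, so it never joins a cluster and keeps the label -1.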

Stacey Fang

- EDA for the whole dataset finished

Zhuo Yang

On the basis of Steven's multi-model fitting, random forest was selected for further optimization.
1. Pull BTC price data directly from the parquet file
2. Routine and targeted data processing
3. Group the data into seven-day windows and restructure it for feature extraction
4. Extract feature values using tsfresh
5. Use train_test_split to partition the data into training and testing sets
6. Train the model
7. Use the model to make predictions
8. Evaluate the models through numerical evaluation and visualization
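The windowing and feature extraction steps can be illustrated with a simplified sketch. Several things here are assumptions: hand-computed summary statistics stand in for the tsfresh features, the next day's price is used as the prediction target, the split is chronological and not shuffled, and the prices are made up. The random forest itself is left out.

```python
from statistics import mean, pstdev

def make_windows(prices, window=7):
    """Turn a price series into (features, target) pairs.

    Features summarize `window` consecutive days; the target is the
    price on the following day (an assumed setup).
    """
    rows = []
    for start in range(len(prices) - window):
        chunk = prices[start:start + window]
        features = {
            "mean": mean(chunk),
            "std": pstdev(chunk),
            "min": min(chunk),
            "max": max(chunk),
            "last": chunk[-1],
        }
        rows.append((features, prices[start + window]))
    return rows

# 17 made-up daily closing prices give 10 seven-day windows.
prices = [100.0 + day for day in range(17)]
rows = make_windows(prices)

# Chronological 80/20 split, so the test windows come after the training ones.
cut = int(len(rows) * 0.8)
train, test = rows[:cut], rows[cut:]
print(len(train), len(test), rows[0])
```

A chronological split avoids evaluating the model on days that precede its training data, which is one common choice for time series.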

Yiwei Cheng & Stacey Fang

Selected coins: "BTC_1", "ETH_1027", "BNB_1839", "ADA_2010"
