
TODO:

  • add viz
  • make it a proper Python package

Stock Embedding

This repo implements the methodology from the paper Learning Embedded Representation of the Stock Correlation Matrix using Graph Machine Learning.

The goal is to learn a vector representation of each ticker from the daily returns of a set of tickers. For example, AAPL is represented as a NumPy array of size 32, much as word2vec represents a word as a vector:

'AAPL': array([-0.5099518 , -0.06737634,  0.48559666, -1.4538945 ,  1.6606339 ,
         0.3644821 , -0.01464582, -0.78605974,  0.47729206, -0.5726769 ,
         0.07515597, -1.2660204 , -0.42653227,  0.5527185 ,  0.07249957,
        -1.4414234 ,  0.6071886 , -0.22588927,  0.27790403,  0.898469  ,
        -1.2963198 ,  0.5084794 , -0.5935085 , -0.10640397, -0.94240546,
        -0.03292902,  0.741543  ,  0.8465433 , -0.81097144,  0.28370696,
        -0.12066484,  0.22394317], dtype=float32)

Once the tickers are vectorized, the elements of each vector can be passed as features to downstream ML jobs.
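
On the input side, the embedding starts from the correlations between daily log returns. The following is a rough sketch of that preprocessing step, not the repo's own code: the function names and the toy prices are made up for illustration, and the sketch uses plain Python lists where the library fetches real prices via yfinance.

```python
import math


def log_returns(prices):
    """Daily log returns r_t = ln(P_t / P_{t-1}) for one price series."""
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]


def pearson(x, y):
    """Pearson correlation between two equal-length return series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(price_table):
    """Map every ordered ticker pair to the correlation of their log returns."""
    returns = {ticker: log_returns(p) for ticker, p in price_table.items()}
    tickers = sorted(returns)
    return {(a, b): pearson(returns[a], returns[b]) for a in tickers for b in tickers}


# Made-up closing prices for hypothetical tickers (not real market data).
toy_prices = {
    "AAA": [100.0, 101.0, 102.5, 101.5, 103.0],
    "BBB": [50.0, 50.6, 51.2, 50.9, 51.8],
    "CCC": [20.0, 19.8, 19.5, 19.9, 19.6],
}
corr = correlation_matrix(toy_prices)
```

In this toy data AAA and BBB move together while CCC moves against them, so `corr[("AAA", "BBB")]` is positive and `corr[("AAA", "CCC")]` is negative.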

Example

This repo replicates most of the results presented in the paper. Here are some examples from the notebook under src.

# stock data are fetched via the yfinance package; the log returns and correlations are calculated internally
se = StockEmbedding(start_date='2021-01-01', end_date='2022-08-04')

# hyperparameters are tuned by maximizing the average v-measure
param_lst = [{'l': 50, 'r': 10, 'p': 0.5, 'q': 2, 'w': 5, 'dim': 16},
             {'l': 100, 'r': 50, 'p': 2, 'q': 0.5, 'w': 5, 'dim': 16},
             {'l': 200, 'r': 10, 'p': 2, 'q': 0.5, 'w': 5, 'dim': 32}]

summary, opt_param = se.hyperparam_tuning(param_lst)

The result is similar to Table 3 of the original paper.
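
The tuning step above ranks parameter sets by average v-measure. This README does not spell out which labels are used or how the average is taken, so the sketch below rests on assumptions: it treats sector labels as the ground truth, averages the score over several clustering runs per candidate, and uses hypothetical names (`v_measure`, `average_v_measure`, `candidate_runs`) and made-up cluster assignments. The v-measure itself is the standard harmonic mean of homogeneity and completeness.

```python
import math
from collections import Counter


def _entropy(labels):
    """Shannon entropy H(labels) in nats."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def _conditional_entropy(target, given):
    """H(target | given) computed from two paired label lists."""
    n = len(target)
    joint = Counter(zip(target, given))
    given_counts = Counter(given)
    return -sum((c / n) * math.log(c / given_counts[g]) for (_, g), c in joint.items())


def v_measure(classes, clusters):
    """Harmonic mean of homogeneity and completeness."""
    h_c, h_k = _entropy(classes), _entropy(clusters)
    homogeneity = 1.0 if h_c == 0 else 1.0 - _conditional_entropy(classes, clusters) / h_c
    completeness = 1.0 if h_k == 0 else 1.0 - _conditional_entropy(clusters, classes) / h_k
    if homogeneity + completeness == 0:
        return 0.0
    return 2 * homogeneity * completeness / (homogeneity + completeness)


def average_v_measure(classes, runs):
    """Average v-measure over several cluster assignments (assumed averaging scheme)."""
    return sum(v_measure(classes, clusters) for clusters in runs) / len(runs)


# Hypothetical sector labels for six tickers, and made-up cluster assignments
# from two clustering runs for each of two candidate parameter sets.
sectors = ["Fin", "Fin", "Fin", "Health", "Health", "Health"]
candidate_runs = {
    "candidate_a": [[0, 0, 0, 1, 1, 1], [0, 0, 1, 1, 1, 1]],
    "candidate_b": [[0, 1, 0, 1, 0, 1], [0, 0, 1, 1, 0, 1]],
}
scores = {name: average_v_measure(sectors, runs) for name, runs in candidate_runs.items()}
best = max(scores, key=scores.get)
```

Here `candidate_a` wins because its clusters line up with the sector labels far more closely than those of `candidate_b`.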

The stock embedding lets us answer the following kinds of questions:

  • Stock similarity: find the tickers most similar to a given one

    se.find_similar_stocks(ticker='JPM')

    All of the top similar stocks are from the Financial sector, and most of them are from the Banks industry.

  • Analogical inference: answer questions like "JPM is to GS as JNJ is to ?"

    se.analogical_inference(ticker='AAPL', ticker_1='JPM', ticker_2='GS')

    The result is JPM is to GS as AAPL is to AMZN with a similarity of 0.9349383.

    se.analogical_inference(ticker='JNJ', ticker_1='JPM', ticker_2='GS')

    The result is JPM is to GS as JNJ is to GILD with a similarity of 0.8816782.

  • Not-match stock: answer questions like "Which of JPM, MS, GS, and GOOGL does not match the others?"

    se.identify_not_match_stock(['JPM', 'MS', 'GS', 'GOOGL'])

    The result is Does not match from JPM, MS, GS, GOOGL: GOOGL.

    se.identify_not_match_stock(['JNJ', 'BMY', 'PFE', 'HD'])

    The result is Does not match from JNJ, BMY, PFE, HD: HD.

    se.identify_not_match_stock(['UAL', 'AAL', 'DAL', 'TSLA'])

    The result is Does not match from UAL, AAL, DAL, TSLA: TSLA.
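
All three queries above can be expressed as cosine-similarity operations on the embedding vectors, in the same way they are usually done for word2vec. The sketch below is one plausible way to do that and is not taken from the repo: the function names, the hypothetical tickers, and the 3-dimensional toy vectors are all assumptions made for illustration, and the actual methods on `StockEmbedding` may work differently.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(emb, ticker, top_n=3):
    """Rank all other tickers by cosine similarity to `ticker`."""
    scores = [(other, cosine(emb[ticker], vec)) for other, vec in emb.items() if other != ticker]
    return sorted(scores, key=lambda item: item[1], reverse=True)[:top_n]


def analogy(emb, ticker, ticker_1, ticker_2):
    """Answer 'ticker_1 is to ticker_2 as ticker is to ?' via vector arithmetic."""
    # Target vector: emb[ticker] + emb[ticker_2] - emb[ticker_1].
    target = [c + b - a for a, b, c in zip(emb[ticker_1], emb[ticker_2], emb[ticker])]
    exclude = {ticker, ticker_1, ticker_2}
    candidates = [(t, cosine(target, v)) for t, v in emb.items() if t not in exclude]
    return max(candidates, key=lambda item: item[1])


def odd_one_out(emb, tickers):
    """Return the ticker least similar to the mean of the unit-normalized vectors."""
    unit = {}
    for t in tickers:
        norm = math.sqrt(sum(x * x for x in emb[t]))
        unit[t] = [x / norm for x in emb[t]]
    dim = len(unit[tickers[0]])
    mean = [sum(unit[t][i] for t in tickers) / len(tickers) for i in range(dim)]
    return min(tickers, key=lambda t: cosine(unit[t], mean))


# Made-up 3-d embeddings for hypothetical tickers (real embeddings have 16 or 32 dims).
toy_emb = {
    "BANK_A": [1.0, 0.0, 0.1],
    "BANK_B": [0.9, 0.1, 0.0],
    "BROKER_A": [1.0, 0.0, 0.6],
    "PHARMA_A": [0.0, 1.0, 0.1],
    "PHARMA_B": [0.1, 0.9, 0.0],
    "BIOTECH_A": [0.0, 1.0, 0.6],
}

similar = most_similar(toy_emb, "BANK_A", top_n=2)
match, similarity = analogy(toy_emb, ticker="PHARMA_A", ticker_1="BANK_A", ticker_2="BROKER_A")
outlier = odd_one_out(toy_emb, ["BANK_A", "BANK_B", "BROKER_A", "PHARMA_A"])
```

With these toy vectors, the two nearest neighbours of `BANK_A` are `BANK_B` and `BROKER_A`, the analogy resolves to `BIOTECH_A`, and the odd one out is `PHARMA_A`.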
