This is the first version and still work in progress.
Sentiment from news and deep learning (LSTM) is used to help predict stock prices. The functionality is exposed via Flask API and web page as a simple frontend. It is also possible to run it from console.
- A stock symbol and company name are entered followed by pressing "Get forecasst"
- a list of names separated by commas is also tolerated. The first word in the list will be used as a stock symbol and the rest as search terms for searching the articles for mentions of the company
- A 20 year history of the stock is downloaded and cached for further use
- A LSTM RNN is trained using Keras and a Tensorflow backend on the stock history, a model is then saved for further use. Adapted from Machine Learning for Trading.
- News scraping engine is run using https://scrapy.org/ (separate repo - fin-news-data-collection)
- news from the front page are first parsed
- each article URL is hashed and stored in a log so articles are downloaded only once
- new articles are downloaded and stored in a csv
- Articles for the past 6 days are filtered for the stock symbol
- Sentiment polarity is calculated for the article titles.
- The algorithm used is Vader Sentiment Analyzer with a customized financial lexicon adapted to stock market conversations in microblogging services. The lexicon was further improved by the Loughran/McDonald finance-specific dictionary
- The resulting stock price value prediction and sentiment polarity are displayed
- main - main console app
- bot_api - using Flask API to exposing the function as a web service and SocketIO to update the frontend in a more user friendly way
- ml_utils - LSTM RNN implementaion, adapted from Machine Learning for Trading
- sia_utils - Vader Sentiment Analyzer implementaion
- stock_utils - download, caching and preparation of stock history data-collection
- article_utils - caching and preparation of news articles
- spider_utils - scraping of news articles (using Scrapy and scraper defined in fin-news-data-collection)
- lexicon_data/ - directory containing the customized financial lexicon and the Loughran/McDonald finance-specific dictionary
- templates/ - directory containing the frontend page in HTML format
- static/ - directory containing the resources for the frontend page in HTML format
Python 3.6 and the following libraries are required: pandas scrapy regex matplotlib pandas_datareader nltk keras tensorflow sklearn flask_socketio
(Google Cloud g1-small (1 vCPU, 1.7 GB memory) machine is enough)
- Create a new directory and get the code
- fin-news-sa for the main app
- fin-news-data-collection for the scraper
- Add the scraper project under the Python path (example Ubuntu 18.04.1 LTS):
- SITEDIR=$(python3 -m site --user-site)
- mkdir -p "$SITEDIR" (if needed)
- echo "$HOME/dir-where-fin-news-data-collection-is" > "$SITEDIR/indianstock.pth"
- Configure the Tiingo API key and set the env var (example Ubuntu 18.04.1 LTS):
- register on Tiingo and get the API key
- export TiingoAPI="aaaa123456789bbbbbbbbbbbbbbbbbbbbbb"
- Configure the connection between the web page and web service:
- edit templates/try-bootstrap.html and change the URL in two places
- under AJAX request (search for "url :" or "$(document).ready(function() {")
- under SocketIO connect (search for "var socket = io.connect(")
- Run the Flask backend - "python3 bot-api.py" and hit the URL, you should see a simple dashboard open.