
finq's Introduction

🔎 Overview

The goal of finq is to provide an all-in-one Python library for quantitative portfolio analysis and optimization on historical and real-time financial data.

NOTE: Features are being determined and developed continuously. The repo is undergoing heavy modification and may introduce breaking changes up until the first major release. The current version is v0.4.0.

📦 Installation

Either clone this repository and perform a local install with poetry:

git clone https://github.com/wilhelmagren/finq.git
cd finq
poetry install

or install the most recent release from the Python Package Index (PyPI):

pip install pyfinq

🚀 Example usage

finq supports a large number of major world indices, which can be used through their respective Dataset implementation. To download all ticker OHLC data for a period and save it locally for the OMXS30 index, you can do the following:

from finq.datasets import OMXS30

dataset = OMXS30(save=True)
dataset.run("2y")

The .run(period) function is a convenient wrapper around three other functions. The alternative is to run:

dataset.fetch_data("2y").fix_missing_data().verify_data()

and for a full list of all implemented indices, please take a look at the following link.

You can also define custom datasets, which need not correspond to an already available major world index. To do that you use the CustomDataset and provide both a list of security names and their ticker symbols to include in your dataset. Furthermore, you need to specify what market the ticker symbols are traded on, e.g., NASDAQ or OMX. In code it can look like this:

from finq.datasets import CustomDataset

names = ["Alfa Laval", "Boliden", "SEB A", "Sv. Handelsbanken A"]
symbols = ["ALFA.ST", "BOL.ST", "SEB-A.ST", "SHB-A.ST"]

dataset = CustomDataset(names, symbols, market="OMX", save=False)
dataset = dataset.run("6mo")
...

The Dataset class supports visualizing any OHLC historical data for all considered tickers. Calling the .visualize(price_type="Close") function on the OMXS30 index would yield the following plot:

Show dataset plot

OMXS30 Visualization 6 months

and the majority of labels and styles for the plot can be customized with regular matplotlib styling. You can also visualize more advanced plots, but only for individual assets, through the mplfinance library. The image below is a candle plot of the BOL.ST ticker from the OMXS30 index, with three moving averages.

Show ticker plot

BOL.ST Candle plot


You can create a Portfolio in a number of ways. The most convenient way is to first initialize a Dataset as above and then pass that dataset to the Portfolio class constructor. The class also supports passing in a list of Asset objects, a np.ndarray, or a pd.DataFrame. For the latter three alternatives you also need to provide the security names and ticker symbols the data came from. All four methods are shown in the snippet below:

from finq import Portfolio

# All four alternatives are equally viable.
# The latter three are implemented for whenever
# you preferably only want to work with local data.
from_dataset = Portfolio(dataset)
from_assets = Portfolio(assets, names=["SEB-A.ST", ...], symbols=...)
from_numpy = Portfolio(np_arr, names=["SEB-A.ST", ...], symbols=...)
from_pandas = Portfolio(pd_df, names=["SEB-A.ST", ...], symbols=...)
...

To optimize your portfolio against some objective function, you use the optimize(...) function. Optimizing the mean-variance expression (maximizing the Sharpe ratio) requires you to specify the objective function to minimize and the initial portfolio weights, and optionally the bounds and constraints on the weights. You can do it in the following way:

from finq import Portfolio
from finq.datasets import OMXS30
from finq.formulas import mean_variance

dataset = OMXS30(save=True)
dataset = dataset.run("2y")

portfolio = Portfolio(dataset)
portfolio.initialize_random_weights(
    "lognormal",
    size=(len(dataset), 1),
)

risk_tolerance = 1

portfolio.set_objective_function(
    mean_variance,
    risk_tolerance * portfolio.daily_covariance(),
    portfolio.daily_returns_mean(),
)

portfolio.set_objective_bounds(
    [(0, 0.2) for _ in range(len(dataset))],
)

portfolio.optimize(
    method="COBYLA",
    options={"maxiter": 1000},
)

portfolio.plot_mean_variance(n_samples=10000, figsize=(8, 5))

The above code is taken directly from one of the scripts in the examples directory of this repo. Running it can yield the following plot comparing the Sharpe ratio of your optimized portfolio weights against randomly sampled portfolio weights.

Show sharpe ratio plot

OMXS30 sharpe ratio plot
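For intuition, the mean-variance objective being minimized can be sketched as below. This is a hypothetical implementation for illustration only; the actual finq.formulas.mean_variance may differ in signature and sign conventions.

```python
import numpy as np

def mean_variance(weights: np.ndarray, gamma_cov: np.ndarray, mu: np.ndarray) -> float:
    """Classic mean-variance objective to MINIMIZE: the risk term
    (weights through a risk-tolerance-scaled covariance) minus the
    expected portfolio return."""
    w = weights.reshape(-1)
    return float(w @ gamma_cov @ w - mu @ w)

# Toy example: two uncorrelated assets, risk_tolerance = 1.0.
cov = np.array([[0.04, 0.0], [0.0, 0.01]])  # daily covariance
mu = np.array([0.001, 0.0005])              # mean daily returns
w = np.array([0.5, 0.5])

value = mean_variance(w, 1.0 * cov, mu)  # 0.0125 - 0.00075 = 0.01175
```

An optimizer such as scipy's COBYLA then searches the weight space (subject to the bounds above) for the weights minimizing this value.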

📋 License

All code is released under the MIT license; please see LICENSE for details.

finq's People

Contributors: augustsoderlund, dependabot[bot], wilhelmagren


finq's Issues

Add CustomDataset functionality to try and fetch index from NASDAQ

Currently the CustomDataset can only be initialized with a list of symbols and a list of corresponding security names. We want to allow users to create a CustomDataset from a valid index name on NASDAQ.

Suggestion (something like this):

class CustomDataset(Dataset):
    """ """

    def __init__(self, *,
        names: Optional[List[str]] = None,
        symbols: Optional[List[str]] = None,
        nasdaq_index: Optional[str] = None,
        save_path: Union[str, Path] = ".data/CUSTOM/",
        **kwargs: Dict,
    ):
        """ """

        if all(map(lambda x: x is None, (names, symbols, nasdaq_index))):
            raise ValueError("all can't be None")
        
        if nasdaq_index:
            names, symbols = _fetch_names_and_symbols(nasdaq_index)

        super(CustomDataset, self).__init__(
            names,
            symbols,
            save_path=save_path,
            **kwargs,
        )

DOD: we can pass an index name to the CustomDataset and fetch its data.

Empty df from nasdaq on sundays, weekday diff is broken

With the current weekday diff logic in nasdaq_requests.py we do not get any valid data from NASDAQ.

weekday_diff = 7 - datetime.today().isoweekday()

will be 0 when running on Sundays ([link](https://pythontic.com/datetime/date/isoweekday))

Suggestion:

weekday_diff = max(datetime.today().isoweekday() - 5, 0)

which gives a diff of 1 (Saturday) or 2 (Sunday) back to Friday when datetime.today() falls on a weekend, and a diff of 0 on weekdays.
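The proposed fix can be sanity-checked for all seven weekdays with a quick sketch (using Python's ISO convention, Monday = 1 … Sunday = 7):

```python
from datetime import date, timedelta

# Monday 2023-10-09 through Sunday 2023-10-15.
monday = date(2023, 10, 9)
diffs = []
for i in range(7):
    d = monday + timedelta(days=i)
    # Proposed logic: 0 on weekdays, 1 on Saturday, 2 on Sunday,
    # i.e. the number of days back to the most recent Friday.
    diffs.append(max(d.isoweekday() - 5, 0))
```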

Improve logging behaviour

Current problem

The log captures the output from tqdm (progress bar). This produces extremely ugly and unreadable logs like the one below:

[2023-10-15 10:48:21,647] [finq.datasets.dataset] [DEBUG        ] verifying that stored data has no missing values...
Verifying `ABB.ST`       no missing values:   0%|          | 0/30 [00:00<?, ?iVerifying `ALFA.ST`      no missing values:   0%|          | 0/30 [00:00<?, ?itVerifying `ASSA-B.ST`    no missing values:   0%|          | 0/30 [00:00<?, ?it/s ... (interleaved progress bars for the remaining 27 tickers) ... Verifying `VOLV-B.ST`    no missing values: 100%|██████████| 30/30 [00:00<00:00, 1070.44it/s

Suggestions

We need to look into how to improve this behaviour so that the tqdm output is not handled by the log and simply prints to stdout. Possibly the suggestion in the link below might work:

We should also refactor all logs and make sure they actually provide valuable information to the user/developer. Currently there are a lot of debug and info logs that might be too verbose.

/W ๐Ÿˆ

Mock all tests to avoid hitting the rate limit on Yahoo! Finance

Problem

I have now hit the daily API limit twice by running tests locally, because the OMXS30 dataset is not mocked. Our testing workflows are unlikely to hit the limit, but we are still generating unnecessary web traffic.

Suggestion

One tricky aspect is mocking the requests to NASDAQ, since these fetch an .xlsx file and save it locally. To properly mock this we would have to first create the .xlsx file with appropriate content, then mock the request, and then read the file as usual. Make sure to clean everything up afterwards.

References

Yahoo Developer Network: https://developer.yahooinc.com/dsp/api/docs/traffic/info/rate-limits.html?guccounter=1

GET requests to NASDAQ should use the CachedRateLimiter as session

Currently the requests to NASDAQ that fetch index information are neither cached nor rate limited.
This should be solvable by passing the CachedRateLimiter session to the function _fetch_names_and_symbols and calling session.get(...) instead of requests.get(...).

def _fetch_names_and_symbols(
    index: str,
    session: requests.Session,
    *,
    query_params: Dict = {},
    headers: Dict = {},
    filter_symbols: Optional[Callable] = None,
) -> Union[Exception, Tuple[List[str], List[str]]]:
    """ """
    ...
    response = session.get(
        url,
        params=query_params,
        headers=headers,
    )
    ...

The above suggestion works (with only minimal testing so far). Try to implement it across the library.

/W ๐Ÿˆ

Nasdaq index weighting not working on weekends

The index weighting endpoint (try the OMXS30 weighting with the start date set to a Saturday or Sunday) does not produce any weights on weekends, consequently resulting in no ticker symbols being retrieved.

To fix this we should check whether today in nasdaq_requests.py is a weekday before continuing:

today = datetime.today().strftime("%Y-%m-%d")
params = {
    "tradeDate": f"{today}T00:00:00.000",
    "timeOfDay": "SOD",
}
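One way to sketch the check is to roll a weekend date back to the most recent Friday before building the request params. The helper name below is hypothetical; the real nasdaq_requests.py may structure this differently:

```python
from datetime import datetime, timedelta

def last_trading_date(today: datetime) -> datetime:
    """Roll Saturday/Sunday back to the most recent Friday;
    weekdays are returned unchanged."""
    weekday_diff = max(today.isoweekday() - 5, 0)
    return today - timedelta(days=weekday_diff)

# 2023-10-15 is a Sunday, so this rolls back to Friday 2023-10-13.
today = last_trading_date(datetime(2023, 10, 15))
params = {
    "tradeDate": f"{today.strftime('%Y-%m-%d')}T00:00:00.000",
    "timeOfDay": "SOD",
}
```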

classes and functions need documentation

Currently almost no classes or functions are documented. All classes and functions should be completely documented, preferably following the PEP 257 docstring conventions.

Someone please do this :)

Prepare PyPI release

Some open questions

what name do we want on PyPI?

  • currently finq is already taken
  • we could do py-finq ? pyfinq ? finqpy ? something else?
  • please comment with suggestion

Chores

  • update pyproject.toml
    • authors,
    • version,
    • description,
    • etc ?..
  • create setup.py (?)
  • do we want a CD workflow? deploy automatically to PyPI? might be a bit too much, we don't wanna do a lot of releases?... thoughts?

Create load function for Dataset

Currently there is functionality to optionally save the fetched ticker data, but no way to check for saved data and load it if it exists.

DOD: upon fetch attempt, looks locally for saved tickers and loads them instead of fetching them if they were previously saved.
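A minimal sketch of the desired behaviour, with hypothetical names and a JSON save format for illustration (the real Dataset storage format may differ): look in the save path for a previously saved ticker and load it instead of fetching.

```python
import json
import tempfile
from pathlib import Path

def load_or_fetch(symbol: str, save_path: str, fetch_fn):
    """Load locally saved ticker data if present; otherwise fetch and save it."""
    path = Path(save_path) / f"{symbol}.json"
    if path.exists():
        return json.loads(path.read_text())
    data = fetch_fn(symbol)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(data))
    return data

# Usage with a stub fetcher that records how often it is called:
calls = []
def fake_fetch(symbol):
    calls.append(symbol)
    return {"symbol": symbol, "close": [100.0, 101.5]}

save_dir = tempfile.mkdtemp()
first = load_or_fetch("BOL.ST", save_dir, fake_fetch)   # fetches and saves
second = load_or_fetch("BOL.ST", save_dir, fake_fetch)  # loads from disk
```

The second call never touches the fetcher, which is exactly the DOD behaviour.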

index tickers varies through time

The current implementation gets the index tickers from today and assumes they are the components of the index throughout the downloaded historical data. Instead, the index components should be retrieved for each point in time to ensure that the correct stocks are used as index components.

yfinance requests should be mocked in unit tests

Currently the unit tests perform real requests to Yahoo! Finance and NASDAQ to test functionality. These should be mocked. Otherwise running the unit tests will take too long and the GitHub Actions will potentially cost a lot of money.

Look into using unittest.mock to mock all get requests.
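A minimal sketch of the idea with unittest.mock, patching requests.get so the test generates no network traffic. The function and patch target here are illustrative; in practice you would patch the module path where finq actually calls requests.get:

```python
from unittest import mock

import requests

def fetch_index_symbols(url: str) -> list:
    """Toy stand-in for a finq function that performs a GET request."""
    response = requests.get(url, timeout=10)
    return response.json()["symbols"]

# Replace requests.get with a Mock whose .json() returns canned data.
with mock.patch("requests.get") as mocked_get:
    mocked_get.return_value.json.return_value = {"symbols": ["BOL.ST", "SEB-A.ST"]}
    symbols = fetch_index_symbols("https://example.com/index")
```

The same pattern applies to the NASDAQ .xlsx requests, with the mocked response body written to a temporary file first.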
