
finq's Introduction

🔎 Overview

The goal of finq is to provide an all-in-one Python library for quantitative portfolio analysis and optimization on historical and real-time financial data.

NOTE: Features are being determined and developed continuously. The repo is undergoing heavy modification and may introduce breaking changes up until the first major release. The current version is v0.4.0.

📦 Installation

Either clone this repository and perform a local install with poetry:

git clone https://github.com/wilhelmagren/finq.git
cd finq
poetry install

or install the most recent release from the Python Package Index (PyPI):

pip install pyfinq

🚀 Example usage

finq supports a large number of major world indices, which can be used through their respective Dataset implementation. To download all ticker OHLC data for a period and save it locally for the OMXS30 index, you can do the following:

from finq.datasets import OMXS30

dataset = OMXS30(save=True)
dataset.run("2y")

The .run(period) function is a convenient wrapper around three other functions. The alternative is to run:

dataset.fetch_data("2y").fix_missing_data().verify_data()

and for a full list of all implemented indices, please take a look at the following link.

You can also define custom datasets, which need not correspond to an already available major world index. To do that you use the CustomDataset and provide both a list of security names and their ticker symbols to include in your dataset. Furthermore, you need to specify what market the ticker symbols are traded on, e.g., NASDAQ or OMX. In code it can look like this:

from finq.datasets import CustomDataset

names = ["Alfa Laval", "Boliden", "SEB A", "Sv. Handelsbanken A"]
symbols = ["ALFA.ST", "BOL.ST", "SEB-A.ST", "SHB-A.ST"]

dataset = CustomDataset(names, symbols, market="OMX", save=False)
dataset = dataset.run("6mo")
...

The Dataset class supports visualizing any OHLC historical data for all considered tickers. Calling the .visualize(price_type="Close") function on the OMXS30 index would yield the following plot:

Show dataset plot

OMXS30 Visualization 6 months

and the majority of labels and styles for the plot can be customized with regular matplotlib styling. You can also visualize more advanced plots, but only for individual assets, through the mplfinance library. The image below is a candle plot of the BOL.ST ticker from the OMXS30 index, with three moving averages.

Show ticker plot

BOL.ST Candle plot


You can create a Portfolio in a number of ways. The most convenient way is to first initialize a Dataset as above and then pass that dataset to the Portfolio class constructor. The class also supports passing in a list of Asset objects, a np.ndarray, or a pd.DataFrame. For the latter three alternatives you also need to provide the security names and ticker symbols the data came from. All four methods are shown in the snippet below:

from finq import Portfolio

# All four alternatives are equally viable.
# The latter three are implemented for whenever
# you preferably only want to work with local data.
from_dataset = Portfolio(dataset)
from_assets = Portfolio(assets, names=["SEB-A.ST", ...], symbols=...)
from_numpy = Portfolio(np_arr, names=["SEB-A.ST", ...], symbols=...)
from_pandas = Portfolio(pd_df, names=["SEB-A.ST", ...], symbols=...)
...

To optimize your portfolio against some objective function, you use the optimize(...) function. Optimizing the mean-variance expression (maximizing the Sharpe ratio) requires you to specify the objective function to minimize and the initial portfolio weights, and optionally the bounds and constraints on the weights. You can do it in the following way:

from finq import Portfolio
from finq.datasets import OMXS30
from finq.formulas import mean_variance

dataset = OMXS30(save=True)
dataset = dataset.run("2y")

portfolio = Portfolio(dataset)
portfolio.initialize_random_weights(
    "lognormal",
    size=(len(dataset), 1),
)

risk_tolerance = 1

portfolio.set_objective_function(
    mean_variance,
    risk_tolerance * portfolio.daily_covariance(),
    portfolio.daily_returns_mean(),
)

portfolio.set_objective_bounds(
    [(0, 0.2) for _ in range(len(dataset))],
)

portfolio.optimize(
    method="COBYLA",
    options={"maxiter": 1000},
)

portfolio.plot_mean_variance(n_samples=10000, figsize=(8, 5))

The above code is taken directly from one of the scripts in the examples directory of this repo. Running it can yield the following plot comparing the Sharpe ratio of your optimized portfolio weights against randomly sampled portfolio weights.

Show sharpe ratio plot

OMXS30 sharpe ratio plot
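For intuition, the mean-variance objective being minimized can be sketched as below. This is a hypothetical implementation for illustration only; the actual finq.formulas.mean_variance may differ in signature and sign conventions.

```python
import numpy as np

def mean_variance(weights: np.ndarray, gamma_cov: np.ndarray, mu: np.ndarray) -> float:
    """Classic mean-variance objective to MINIMIZE: the risk term
    (weights through a risk-tolerance-scaled covariance) minus the
    expected portfolio return."""
    w = weights.reshape(-1)
    return float(w @ gamma_cov @ w - mu @ w)

# Toy example: two uncorrelated assets, risk_tolerance = 1.0.
cov = np.array([[0.04, 0.0], [0.0, 0.01]])  # daily covariance
mu = np.array([0.001, 0.0005])              # mean daily returns
w = np.array([0.5, 0.5])

value = mean_variance(w, 1.0 * cov, mu)  # 0.0125 - 0.00075 = 0.01175
```

An optimizer such as scipy's COBYLA then searches the weight space (subject to the bounds above) for the weights minimizing this value.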

📋 License

All code is released under the MIT license; please see LICENSE for details.

finq's People

Contributors: augustsoderlund, dependabot[bot], wilhelmagren


finq's Issues

Add CustomDataset functionality to try and fetch index from NASDAQ

Currently the CustomDataset can only be initialized with a list of symbols and a list of corresponding security names. We want to allow users to create a CustomDataset from a valid index name on NASDAQ.

Suggestion (something like this):

class CustomDataset(Dataset):
    """ """

    def __init__(self, *,
        names: Optional[List[str]] = None,
        symbols: Optional[List[str]] = None,
        nasdaq_index: Optional[str] = None,
        save_path: Union[str, Path] = ".data/CUSTOM/",
        **kwargs: Dict,
    ):
        """ """

        if all(map(lambda x: x is None, (names, symbols, nasdaq_index))):
            raise ValueError("all can't be None")
        
        if nasdaq_index:
            names, symbols = _fetch_names_and_symbols(nasdaq_index)

        super(CustomDataset, self).__init__(
            names,
            symbols,
            save_path=save_path,
            **kwargs,
        )

DOD: we can pass an index name to the CustomDataset and fetch its data.

Empty df from nasdaq on sundays, weekday diff is broken

With the current weekday diff logic in nasdaq_requests.py we do not get any valid data from NASDAQ.

weekday_diff = 7 - datetime.today().isoweekday()

will be 0 when running on Sundays ([link](https://pythontic.com/datetime/date/isoweekday))

Suggestion:

weekday_diff = max(datetime.today().isoweekday() - 5, 0)

which gives a diff of 1 (Saturday) or 2 (Sunday) back to Friday when datetime.today() falls on a weekend, and a diff of 0 on weekdays.
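The proposed fix can be sanity-checked for all seven weekdays with a quick sketch (using Python's ISO convention, Monday = 1 … Sunday = 7):

```python
from datetime import date, timedelta

# Monday 2023-10-09 through Sunday 2023-10-15.
monday = date(2023, 10, 9)
diffs = []
for i in range(7):
    d = monday + timedelta(days=i)
    # Proposed logic: 0 on weekdays, 1 on Saturday, 2 on Sunday,
    # i.e. the number of days back to the most recent Friday.
    diffs.append(max(d.isoweekday() - 5, 0))
```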

Improve logging behaviour

Current problem

The log captures the output from tqdm (progress bar). This produces extremely ugly and unreadable logs like the one below:

[2023-10-15 10:48:21,647] [finq.datasets.dataset] [DEBUG        ] verifying that stored data has no missing values...
Verifying `ABB.ST`       no missing values:   0%|          | 0/30 [00:00<?, ?iVerifying `ALFA.ST`      no missing values:   0%|          | 0/30 [00:00<?, ?itVerifying `ASSA-B.ST`    no missing values:   0%|          | 0/30 [00:00<?, ?it/s ... (interleaved progress bars for the remaining 27 tickers) ... Verifying `VOLV-B.ST`    no missing values: 100%|██████████| 30/30 [00:00<00:00, 1070.44it/s

Suggestions

We need to look into how to improve this behaviour so that the tqdm output is not handled by the log and simply prints to stdout. Possibly the suggestion in the link below might work:

We should also refactor all logs and make sure they actually provide valuable information to the user/developer. Currently there are a lot of debug and info logs that might be too verbose.

/W ๐Ÿˆ

Mock all tests to avoid hitting the rate limit on Yahoo! Finance

Problem

I have now hit the daily API limit twice by running tests locally, because the OMXS30 dataset is not mocked. Our testing workflows are unlikely to hit the limit, but we are still generating unnecessary web traffic.

Suggestion

One tricky aspect is mocking the requests to NASDAQ, since these fetch an .xlsx file and save it locally. To properly mock this we would have to first create the .xlsx file with appropriate content, then mock the request, and then read the file as usual. Make sure to clean everything up afterwards.

References

Yahoo Developer Network: https://developer.yahooinc.com/dsp/api/docs/traffic/info/rate-limits.html?guccounter=1

GET requests to NASDAQ should use the CachedRateLimiter as session

Currently the requests to NASDAQ that fetch index information are neither cached nor rate limited.
This should be solvable by passing the CachedRateLimiter session to the function _fetch_names_and_symbols and calling session.get(...) instead of requests.get(...).

def _fetch_names_and_symbols(
    index: str,
    session: requests.Session,
    *,
    query_params: Dict = {},
    headers: Dict = {},
    filter_symbols: Optional[Callable] = None,
) -> Union[Exception, Tuple[List[str], List[str]]]:
    """ """
    ...
    response = session.get(
        url,
        params=query_params,
        headers=headers,
    )
    ...

The above suggestion works (with only minimal testing so far). Try to implement it across the library.

/W ๐Ÿˆ

Nasdaq index weighting not working on weekends

The index weighting endpoint (try the OMXS30 weighting with the start date set to a Saturday or Sunday) does not produce any weights on weekends, consequently resulting in no ticker symbols being retrieved.

To fix this we should check whether today in nasdaq_requests.py is a weekday before continuing:

today = datetime.today().strftime("%Y-%m-%d")
params = {
    "tradeDate": f"{today}T00:00:00.000",
    "timeOfDay": "SOD",
}
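One way to sketch the check is to roll a weekend date back to the most recent Friday before building the request params. The helper name below is hypothetical; the real nasdaq_requests.py may structure this differently:

```python
from datetime import datetime, timedelta

def last_trading_date(today: datetime) -> datetime:
    """Roll Saturday/Sunday back to the most recent Friday;
    weekdays are returned unchanged."""
    weekday_diff = max(today.isoweekday() - 5, 0)
    return today - timedelta(days=weekday_diff)

# 2023-10-15 is a Sunday, so this rolls back to Friday 2023-10-13.
today = last_trading_date(datetime(2023, 10, 15))
params = {
    "tradeDate": f"{today.strftime('%Y-%m-%d')}T00:00:00.000",
    "timeOfDay": "SOD",
}
```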

classes and functions need documentation

Currently almost no classes or functions are documented. All classes and functions should be completely documented, preferably following the PEP 257 docstring conventions.

Someone please do this :)

Prepare PyPI release

Some open questions

what name do we want on PyPI?

  • currently finq is already taken
  • we could do py-finq ? pyfinq ? finqpy ? something else?
  • please comment with suggestion

Chores

  • update pyproject.toml
    • authors,
    • version,
    • description,
    • etc ?..
  • create setup.py (?)
  • do we want a CD workflow? deploy automatically to PyPI? might be a bit too much, we don't wanna do a lot of releases?... thoughts?

Create load function for Dataset

Currently there is functionality to optionally save the fetched ticker data, but no way to check for saved data and load it if it exists.

DOD: upon fetch attempt, looks locally for saved tickers and loads them instead of fetching them if they were previously saved.
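A minimal sketch of the desired behaviour, with hypothetical names and a JSON save format for illustration (the real Dataset storage format may differ): look in the save path for a previously saved ticker and load it instead of fetching.

```python
import json
import tempfile
from pathlib import Path

def load_or_fetch(symbol: str, save_path: str, fetch_fn):
    """Load locally saved ticker data if present; otherwise fetch and save it."""
    path = Path(save_path) / f"{symbol}.json"
    if path.exists():
        return json.loads(path.read_text())
    data = fetch_fn(symbol)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(data))
    return data

# Usage with a stub fetcher that records how often it is called:
calls = []
def fake_fetch(symbol):
    calls.append(symbol)
    return {"symbol": symbol, "close": [100.0, 101.5]}

save_dir = tempfile.mkdtemp()
first = load_or_fetch("BOL.ST", save_dir, fake_fetch)   # fetches and saves
second = load_or_fetch("BOL.ST", save_dir, fake_fetch)  # loads from disk
```

The second call never touches the fetcher, which is exactly the DOD behaviour.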

index tickers varies through time

The current implementation gets the index tickers from today and assumes they are the components of the index throughout the downloaded historical data. Instead, the index components should be retrieved for each point in time to ensure that the correct stocks are used as index components.

yfinance requests should be mocked in unit tests

Currently the unit tests perform real requests to Yahoo! Finance and NASDAQ to test functionality. These should be mocked. Otherwise running the unit tests will take too long and the GitHub Actions will potentially cost a lot of money.

Look into using unittest.mock to mock all get requests.
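A minimal sketch of the idea with unittest.mock, patching requests.get so the test generates no network traffic. The function and patch target here are illustrative; in practice you would patch the module path where finq actually calls requests.get:

```python
from unittest import mock

import requests

def fetch_index_symbols(url: str) -> list:
    """Toy stand-in for a finq function that performs a GET request."""
    response = requests.get(url, timeout=10)
    return response.json()["symbols"]

# Replace requests.get with a Mock whose .json() returns canned data.
with mock.patch("requests.get") as mocked_get:
    mocked_get.return_value.json.return_value = {"symbols": ["BOL.ST", "SEB-A.ST"]}
    symbols = fetch_index_symbols("https://example.com/index")
```

The same pattern applies to the NASDAQ .xlsx requests, with the mocked response body written to a temporary file first.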
