bsolomon1124 / pyfinance

Python package designed for general financial and security returns analysis.

Home Page: https://pypi.python.org/pypi/pyfinance

License: MIT License

Python 100.00%

python3 finance investment-analysis

pyfinance

pyfinance is a Python package built for investment management and analysis of security returns.

It is meant to be a complement to existing packages geared towards quantitative finance, such as pyfolio, pandas-datareader, and fecon235.

Supports	Python 3.5 | 3.6 | 3.7 | 3.8 | 3.9

pyfinance is best explored on a module-by-module basis:

Module	Description
`datasets.py`	Financial dataset download & assembly via `requests`.
`general.py`	General-purpose financial computations, such as active share calculation, returns distribution approximation, and tracking error optimization.
`ols.py`	Ordinary least-squares (OLS) regression, supporting static and rolling cases, built with a matrix formulation and implemented with NumPy.
`options.py`	Vectorized option calculations, including Black-Scholes Merton European option valuation, Greeks, and implied volatility, as well as payoff determination for common money-spread option strategies.
`returns.py`	Statistical analysis of financial time series through the CAPM framework, designed to mimic functionality of software such as FactSet Research Systems and Zephyr, with improved speed and flexibility.
`utils.py`	Utilities not fitting into any of the above.

Please note that returns and general are still in development; they are not thoroughly tested and have some NotImplemented features.

Installation

pyfinance is available via PyPI. Install with pip:

$ python3 -m pip install pyfinance

Note: pyfinance aims for compatibility with all minor releases of Python 3.x, but does not guarantee workability with Python 2.x.

Dependencies

pyfinance relies primarily on Python's scientific stack, including NumPy, Pandas, Matplotlib, Seaborn, Scikit-Learn, and StatsModels. Other dependencies include Beautiful Soup, Requests, xlrd, and xmltodict.

See setup.py for specific version threshold requirements.

Tutorial

This is a walkthrough of some of pyfinance's features.

The returns.py module provides statistical analysis of financial time series through the CAPM framework. It is designed to mimic the functionality of software such as FactSet Research Systems and Zephyr, with improved speed and flexibility.

Its main class is TSeries, a subclassed Pandas Series. The DataFrame equivalent, TFrame, is not yet implemented as of March 2018.

TSeries implements a collection of new methods that pertain specifically to investment management and the study of security returns and asset performance, such as cumulative return indices and drawdown.

Here's an example of construction:

>>> import numpy as np
>>> import pandas as pd
>>> from pyfinance import TSeries

>>> np.random.seed(444)

# Normally distributed with 0.08% daily drift term.
>>> s = np.random.randn(400) / 100 + 0.0008
>>> idx = pd.date_range(start='2016', periods=len(s))  # default daily freq.
>>> ts = TSeries(s, index=idx)

>>> ts.head()
2016-01-01    0.0044
2016-01-02    0.0046
2016-01-03    0.0146
2016-01-04    0.0126
2016-01-05   -0.0086
Freq: D, dtype: float64

And a few "new" methods:

>>> ts.max_drawdown()
-0.12374551561531844

# Downsample to quarterly compounded returns.
>>> ts.rollup('Q')
2016-03-31    0.0450
2016-06-30    0.1240
2016-09-30    0.0631
2016-12-31   -0.0081
2017-03-31    0.1925
Freq: Q-DEC, dtype: float64

>>> ts.anlzd_stdev()
0.16318780660107757

>>> ts.sharpe_ratio(ddof=1)
2.501797257311737
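To make the idea of a cumulative return index and drawdown concrete, here is a minimal pure-Python sketch. It is not pyfinance's implementation; the function names and the convention that the index starts from a base of 1.0 are assumptions made for illustration.

```python
from itertools import accumulate
import operator


def return_index(returns, base=1.0):
    """Cumulative return index: base times the running product of (1 + r)."""
    growth = accumulate((1.0 + r for r in returns), operator.mul)
    return [base * g for g in growth]


def max_drawdown(returns):
    """Largest peak-to-trough decline of the index, as a negative number."""
    peak = 1.0   # assumed starting level of the index (hypothetical convention)
    worst = 0.0
    for level in return_index(returns):
        peak = max(peak, level)
        worst = min(worst, level / peak - 1.0)
    return worst


def compound(returns):
    """Compounded return over the whole sample, as used when rolling up periods."""
    return return_index(returns)[-1] - 1.0


rets = [0.10, -0.50, 0.20]
print(return_index(rets))
print(max_drawdown(rets))
print(compound(rets))
```

Rolling up to a lower frequency amounts to applying `compound` to each group of periods, which is why the quarterly figures above are compounded rather than summed.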

Some statistics are benchmark-relative. For methods that take a benchmark parameter, benchmark can be another TSeries, a Pandas Series, or a 1d NumPy array.

>>> bmk = TSeries(np.random.randn(400) / 100 + .0005,
...               index=ts.index)
>>> ts.beta_adj(bmk)
0.3176455956603447

>>> ts.tracking_error(benchmark=bmk)
0.23506660057562254

With CAPM-related statistics such as alpha, beta, and R-squared, benchmark can also be a Pandas DataFrame or 2d NumPy array.

>>> multi_bmk = pd.DataFrame(np.random.randn(400, 2) / 100 + .0005,
...                          index=ts.index)

>>> # Multifactor model support.
>>> ts.alpha(multi_bmk)
0.0010849614688207107

TSeries comes with just one additional and optional argument, which must be passed as a keyword argument: freq (default None) allows for manual specification of the time-series frequency. It may be any frequency string or anchored offset string recognized by Pandas, such as 'D', '5D', 'Q', 'Q-DEC', or 'BQS-APR'.

>>> # This is okay as long as a frequency can be inferred.
>>> ts.freq is None
True

The purpose of this extra parameter is to create an annualization factor for statistics that are given on an annualized basis, such as standard deviation.

If no frequency is passed explicitly, pyfinance will attempt to infer an annualization factor from the Index; an exception is raised if neither approach yields a frequency.

>>> no_idx = TSeries(np.random.laplace(size=24) * .01 + .005,
...                  freq='M')

>>> no_idx.freq
'M'

>>> no_idx.anlzd_ret()
0.04975219957136123

freq can also be passed within some methods, which will override the class instance's .freq if it exists:

>>> no_idx.anlzd_ret(freq='W')  # Treat `no_idx` as weekly returns.
0.2341731795205313
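The role of the annualization factor can be sketched in a few lines of standard-library Python. This follows common conventions (geometric annualization of returns, square-root-of-time scaling of standard deviation) and is not taken from pyfinance's source; the `PERIODS_PER_YEAR` mapping and both function names are illustrative assumptions.

```python
import math
import statistics

# Illustrative periods-per-year mapping; pyfinance's own table may differ.
PERIODS_PER_YEAR = {"D": 252, "W": 52, "M": 12, "Q": 4, "A": 1}


def annualized_return(returns, freq):
    """Geometric annualization: total growth raised to (periods per year / n)."""
    ppy = PERIODS_PER_YEAR[freq]
    growth = math.prod(1.0 + r for r in returns)
    return growth ** (ppy / len(returns)) - 1.0


def annualized_stdev(returns, freq):
    """Sample standard deviation scaled by the square root of periods per year."""
    ppy = PERIODS_PER_YEAR[freq]
    return statistics.stdev(returns) * math.sqrt(ppy)


monthly = [0.01] * 12
print(annualized_return(monthly, "M"))   # treat the data as monthly
print(annualized_return(monthly, "W"))   # same data treated as weekly
```

Treating the same observations as weekly rather than monthly raises the annualized figure, which matches the direction of the change in the `no_idx.anlzd_ret(freq='W')` example.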

datasets.py provides financial dataset download and assembly via requests. It leverages sources including:

Ken French's data library (via pandas-datareader);
SEC.gov;
cboe.com;
AQR's dataset page;
fred.stlouisfed.org;
Robert Shiller's page at econ.yale.edu.

Below is a batch of examples.

Load SEC 13F filings:

# Third Point LLC June 2017 13F
>>> from pyfinance import datasets
>>> url = 'https://www.sec.gov/Archives/edgar/data/1040273/000108514617001787/form13fInfoTable.xml'  # noqa
>>> df = datasets.load_13f(url=url)
>>> df.head()
          nameOfIssuer   titleOfClass      cusip   value  votingAuthority
0  ALEXION PHARMACE...            COM  015351109  152088          1250000
1  ALIBABA GROUP HL...  SPONSORED ADS  01609W102  634050          4500000
2         ALPHABET INC   CAP STK CL A  02079K305  534566           575000
3           ANTHEM INC            COM  036752103  235162          1250000
4       BANCO MACRO SA     SPON ADR B  05961W105   82971           900000

Industry-portfolio monthly returns:

>>> from pyfinance import datasets
>>> ind = datasets.load_industries()
>>> ind.keys()
dict_keys([5, 10, 12, 17, 30, 38, 48])

# Monthly returns to 5 industry portfolios
>>> ind[5].head()
            Cnsmr  Manuf  HiTec  Hlth   Other
Date
1950-01-31   1.26   1.47   3.21   1.06   3.19
1950-02-28   1.91   1.29   2.06   1.92   1.02
1950-03-31   0.28   1.93   3.46  -2.90  -0.68
1950-04-30   3.22   5.21   3.58   5.52   1.50
1950-05-31   3.81   6.18   1.07   3.96   1.36

S&P 500 and interest rate data from Robert Shiller's website, 1871-present:

>>> from pyfinance import datasets
>>> shiller = datasets.load_shiller()
>>> shiller.iloc[:7, :5]
            sp50p  sp50d  sp50e      cpi  real_rate
date
1871-01-31   4.44   0.26    0.4  12.4641     5.3200
1871-02-28   4.50   0.26    0.4  12.8446     5.3233
1871-03-31   4.61   0.26    0.4  13.0350     5.3267
1871-04-30   4.74   0.26    0.4  12.5592     5.3300
1871-05-31   4.86   0.26    0.4  12.2738     5.3333
1871-06-30   4.82   0.26    0.4  12.0835     5.3367
1871-07-31   4.73   0.26    0.4  12.0835     5.3400

The ols.py module provides ordinary least-squares (OLS) regression, supporting static and rolling cases, and is built with a matrix formulation and implemented with NumPy.

First, let's load some data on currencies, interest rates, and commodities to generate a regression of changes in the trade-weighted USD against interest rate term spreads and copper.

>>> from pandas_datareader import DataReader

>>> syms = {
...     'TWEXBMTH': 'usd',
...     'T10Y2YM': 'term_spread',
...     'PCOPPUSDM': 'copper'
...     }

>>> data = DataReader(syms.keys(), data_source='fred',
...                   start='2000-01-01', end='2016-12-31')\
...     .pct_change()\
...     .dropna()\
...     .rename(columns=syms)

>>> y = data.pop('usd')

>>> data.head()
            term_spread  copper
DATE
2000-02-01      -1.4091 -0.0200
2000-03-01       2.0000 -0.0372
2000-04-01       0.5185 -0.0333
2000-05-01      -0.0976  0.0614
2000-06-01       0.0270 -0.0185

>>> y.head()
DATE
2000-02-01    0.0126
2000-03-01   -0.0001
2000-04-01    0.0056
2000-05-01    0.0220
2000-06-01   -0.0101

The OLS class implements "static" (single) linear regression, with the model being fit when the object is instantiated.

It is designed primarily for statistical inference, not out-of-sample prediction, and its attributes largely mimic the structure of StatsModels' RegressionResultsWrapper.

>>> from pyfinance import ols

>>> model = ols.OLS(y=y, x=data)

>>> model.alpha  # the intercept - a scalar
0.0012303204434167458

>>> model.beta  # the coefficients
array([-0.0006, -0.0949])

>>> model.fstat
33.42923069295481

# Residuals and predicted y values are NumPy arrays
# with the same shape as `y`.
>>> model.resids.shape
(203,)

>>> model.predicted.shape
(203,)
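For readers unfamiliar with the matrix formulation, the following NumPy sketch shows the same kind of static fit: a column of ones is prepended to x and the least-squares problem is solved directly. It illustrates the technique only and is not pyfinance's code; `ols_fit` and its return layout are hypothetical.

```python
import numpy as np


def ols_fit(y, x):
    """Least-squares fit of y on x with an intercept.

    Returns (alpha, beta, resids): scalar intercept, slope vector, residuals.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]                      # treat a 1d input as one regressor
    X = np.column_stack([np.ones(len(y)), x])
    # lstsq solves min ||X b - y||, equivalent to b = (X'X)^-1 X'y when X has full rank.
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    resids = y - X @ coefs
    return coefs[0], coefs[1:], resids


x = np.array([[1, 2], [2, 1], [3, 5], [4, 3], [5, 8]], dtype=float)
y = 1.0 + 2.0 * x[:, 0] - 3.0 * x[:, 1]
alpha, beta, resids = ols_fit(y, x)
print(alpha, beta, resids.shape)
```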

The module also supports rolling regression, that is, iterative regressions done on sliding windows over the data.

RollingOLS has methods that generate NumPy arrays as outputs.
PandasRollingOLS is a wrapper around RollingOLS and is meant to mimic the look of Pandas's deprecated MovingOLS class. It generates Pandas DataFrame and Series outputs.

Note: all solutions are generated through a matrix formulation, which takes advantage of NumPy's broadcasting capabilities to expand the classical matrix formulation to an additional dimension. This approach may be slow for significantly large datasets.

Also, note that windows are not "time-aware" in the way that Pandas time functionality is. Because of the NumPy implementation, specifying a window of 12 where the index contains one missing month would generate a regression over 13 months. To avoid this, simply reindex the input data to a set frequency.

# 12-month rolling regressions
# First entry would be the "12 months ending" 2001-01-01
>>> rolling = ols.PandasRollingOLS(y=y, x=data, window=12)

>>> rolling.beta.head()
            term_spread  copper
DATE
2001-01-01   9.9127e-05  0.0556
2001-02-01   4.7607e-04  0.0627
2001-03-01   1.4671e-03  0.0357
2001-04-01   1.6101e-03  0.0296
2001-05-01   1.5839e-03 -0.0449

>>> rolling.alpha.head()
DATE
2001-01-01    0.0055
2001-02-01    0.0050
2001-03-01    0.0067
2001-04-01    0.0070
2001-05-01    0.0048

>>> rolling.pvalue_alpha.head()
DATE
2001-01-01    0.0996
2001-02-01    0.1101
2001-03-01    0.0555
2001-04-01    0.0479
2001-05-01    0.1020
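The note above about extending the matrix formulation to an additional dimension can be illustrated as follows. This is a sketch under stated assumptions rather than pyfinance's implementation: it relies on `numpy.lib.stride_tricks.sliding_window_view` (NumPy 1.20 or later), and `rolling_ols` is a hypothetical name.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def rolling_ols(y, x, window):
    """Fit y = a + x @ b on every length-`window` slice.

    Returns an array of shape (n_windows, k + 1): intercept first, then slopes.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    X = np.column_stack([np.ones(len(y)), x])
    # The window axis is appended last, so swap it into the row position:
    # (n_windows, k + 1, window) -> (n_windows, window, k + 1)
    Xw = sliding_window_view(X, window, axis=0).swapaxes(1, 2)
    yw = sliding_window_view(y, window)[..., None]       # (n_windows, window, 1)
    XtX = Xw.swapaxes(1, 2) @ Xw                         # batched X'X
    Xty = Xw.swapaxes(1, 2) @ yw                         # batched X'y
    return np.linalg.solve(XtX, Xty)[..., 0]             # one solve per window


x = np.array([0.0, 1.0, 4.0, 9.0, 16.0, 25.0])
y = 2.0 + 3.0 * x
print(rolling_ols(y, x, window=3))
```

Because every window is solved in one batched call, the positions are purely positional: a gap in the index is not noticed, which is the "not time-aware" caveat described above. A window whose X'X is singular (for example, a constant regressor) would raise `LinAlgError` here.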

options.py is built for vectorized options calculations.

BSM encapsulates a European option and its associated value, Greeks, and implied volatility, using the Black-Scholes Merton model.

>>> from pyfinance.options import BSM
>>> op = BSM(S0=100, K=100, T=1, r=.04, sigma=.2)

>>> op.summary()
OrderedDict([('Value', 9.925053717274437),
             ('d1', 0.3),
             ('d2', 0.09999999999999998),
             ('Delta', 0.6179114221889526),
             ('Gamma', 0.019069390773026208),
             ('Vega', 38.138781546052414),
             ('Theta', -5.888521694670074),
             ('Rho', 51.86608850162082),
             ('Omega', 6.225774084360724)])

# What is the implied annualized volatility at P=10?
>>> op.implied_vol(value=10)
0.20196480875586834

# Vectorized - pass an array of strikes.
>>> import numpy as np
>>> ops = BSM(S0=100, K=np.arange(100, 110), T=1, r=.04, sigma=.2)

>>> ops.value()
array([9.9251, 9.4159, 8.9257, 8.4543, 8.0015, 7.567 , 7.1506, 6.7519,
       6.3706, 6.0064])

# Multiple array inputs are evaluated elementwise/zipped.
>>> ops2 = BSM(S0=np.arange(100, 110), K=np.arange(100, 110),
...            T=1, r=.04, sigma=.2)

>>> ops2
BSM(kind=call,
    S0=[100 101 102 103 104 105 106 107 108 109],
    K=[100 101 102 103 104 105 106 107 108 109],
    T=1,
    r=0.04,
    sigma=0.2)

>>> ops2.value()
array([ 9.9251, 10.0243, 10.1236, 10.2228, 10.3221, 10.4213, 10.5206,
       10.6198, 10.7191, 10.8183])
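The call value and implied volatility shown above can be reproduced with the textbook Black-Scholes-Merton formula for a non-dividend-paying European call. The sketch below uses only the standard library; the function names and the bisection solver are illustrative choices and are not claimed to match how `BSM` computes these quantities internally.

```python
import math
from statistics import NormalDist

_norm_cdf = NormalDist().cdf


def bsm_call(S0, K, T, r, sigma):
    """European call value under Black-Scholes-Merton, no dividends."""
    d1 = (math.log(S0 / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S0 * _norm_cdf(d1) - K * math.exp(-r * T) * _norm_cdf(d2)


def implied_vol(value, S0, K, T, r, lo=1e-6, hi=5.0, tol=1e-10):
    """Solve bsm_call(sigma) = value by bisection.

    Works because the call value is increasing in sigma; the bracket
    [lo, hi] is an assumed search range.
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bsm_call(S0, K, T, r, mid) < value:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


print(bsm_call(S0=100, K=100, T=1, r=0.04, sigma=0.2))
print(implied_vol(value=10, S0=100, K=100, T=1, r=0.04))
```

With these inputs d1 is 0.3 and d2 is 0.1, in line with the `op.summary()` output above.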

options.py also exports a handful of options strategies, such as Straddle, Strangle, BullSpread, and ShortButterfly, to name a few.

All of these inherit from a generic and customizable OpStrat class, which can be built from an arbitrary number of puts and/or calls.

Here is an example of constructing a bear spread, which is a combination of 2 puts or 2 calls (put is the default). Here, we are short a put at 1950 and long a put at 2050. Like the case of a single option, the instance methods are vectorized, so we can compute payoff and profit across a vector or grid:

>>> from pyfinance import options as op

>>> spread = op.BearSpread(St=np.array([2100, 2000, 1900]),
...                        K1=1950., K2=2050.,
...                        price1=56.01, price2=107.39)

>>> spread.payoff()
array([  0.,  50., 100.])

>>> spread.profit()
array([-51.38,  -1.38,  48.62])
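The arithmetic behind these numbers is simple enough to check by hand. The sketch below recomputes payoff and profit for the put bear spread from first principles; the helper names are hypothetical and the code is independent of the `OpStrat` classes.

```python
def put_payoff(St, K):
    """Payoff at expiry of one long put struck at K."""
    return max(K - St, 0.0)


def bear_put_spread(St, K1, K2, price1, price2):
    """Short the K1 put (receive price1), long the K2 put (pay price2), K1 < K2.

    Returns (payoff, profit) at terminal price St.
    """
    payoff = put_payoff(St, K2) - put_payoff(St, K1)
    net_cost = price2 - price1          # premium paid minus premium received
    return payoff, payoff - net_cost


for St in (2100, 2000, 1900):
    print(St, bear_put_spread(St, K1=1950.0, K2=2050.0, price1=56.01, price2=107.39))
```

The payoff is capped at K2 - K1 = 100, and the net premium of 107.39 - 56.01 = 51.38 is the maximum loss.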

The utils.py module contains odds-and-ends utilities.

>>> from pyfinance import utils

# Generate 7 unique 5-letter mutual fund tickers
>>> utils.random_tickers(length=5, n_tickers=7, endswith='X')
['JXNQX', 'DPTJX', 'WAKOX', 'DZIHX', 'MDYXX', 'HSKWX', 'IDMZX']

# Same for ETFs
>>> utils.random_tickers(3, 8)
['FIS', 'FNN', 'FZC', 'PWV', 'PBA', 'RDG', 'BKY', 'CDW']

# Five-asset portfolio leveraged 1.5x.
>>> utils.random_weights(size=5, sumto=1.5)
array([0.3263, 0.1763, 0.4703, 0.4722, 0.0549])

# Two 7-asset portfolios leveraged 1.0x and 1.5x, respectively.
>>> utils.random_weights(size=(2, 7), sumto=[1., 1.5])
array([[0.1418, 0.2007, 0.0255, 0.2575, 0.0929, 0.2272, 0.0544],
       [0.3041, 0.109 , 0.2561, 0.2458, 0.3001, 0.0333, 0.2516]])

>>> utils.random_weights(size=(2, 7), sumto=[1., 1.5]).sum(axis=1)
array([1. , 1.5])
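One simple way to produce weights that sum to a target is to draw positive random numbers and rescale them. The sketch below shows that idea with the standard library; it is an assumption about the general approach, and the sampling distribution used by `utils.random_weights` may differ. `sketch_random_weights` and its `seed` parameter are hypothetical.

```python
import random


def sketch_random_weights(size, sumto=1.0, seed=None):
    """Non-negative random weights rescaled so that they sum to `sumto`."""
    rng = random.Random(seed)
    draws = [rng.random() for _ in range(size)]
    total = sum(draws)
    return [sumto * d / total for d in draws]


weights = sketch_random_weights(size=5, sumto=1.5, seed=0)
print(weights, sum(weights))
```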

# Convert Pandas offset aliases to periods per year.
>>> from pyfinance import utils

>>> utils.get_anlz_factor('M')
12.0
>>> utils.get_anlz_factor('BQS-DEC')
4.0

API

For in-depth call syntaxes, see the source docstrings.

pyfinance's Issues

random minor fix

Hi,
I was testing pyfinance; I'm using 1.2.1.

I hope this will be helpful:
returns.py
line 617
diff = self.excess_ret(benchmark).anlzd_ret()
instead of
diff = self.excess_ret(benchmark).anlzd_return()

In utils.py I added the function convertfreq:

def convertfreq(freq):
    freq = freq.upper()

    if freq=='M':
        periods = 12
    elif freq=='A':
        periods = 1
    elif freq== 'Q':
        periods = 4
    else:
        raise ValueError('freq must be equal to M,A,Q')
        
    return periods

Installation via pip doesn't work

First - thanks for this package, it sounds amazing.

The issue I'm facing is actually in installation. I keep getting the following error when installing via pip in Anaconda:
!pip3 install pyfinance
Collecting pyfinance
Using cached https://files.pythonhosted.org/packages/74/20/7dd8767ec785cfcdce714e170a0a7b1c79feb7085e6217ec8dc43b4a74d3/pyfinance-1.1.1.tar.gz
Requirement already satisfied: beautifulsoup4>=4.6.0 in c:\users\mi113\appdata\local\continuum\anaconda3\lib\site-packages (from pyfinance) (4.6.0)
Requirement already satisfied: matplotlib>=1.1 in c:\users\mi113\appdata\local\continuum\anaconda3\lib\site-packages (from pyfinance) (2.1.0)
Requirement already satisfied: numpy>=0.7.0 in c:\users\mi113\appdata\local\continuum\anaconda3\lib\site-packages (from pyfinance) (1.13.3)
Exception:
Traceback (most recent call last):
  File "c:\users\mi113\appdata\local\continuum\anaconda3\lib\site-packages\pip\_internal\basecommand.py", line 141, in main
    status = self.run(options, args)
  File "c:\users\mi113\appdata\local\continuum\anaconda3\lib\site-packages\pip\_internal\commands\install.py", line 299, in run
    resolver.resolve(requirement_set)
  File "c:\users\mi113\appdata\local\continuum\anaconda3\lib\site-packages\pip\_internal\resolve.py", line 102, in resolve
    self._resolve_one(requirement_set, req)
  File "c:\users\mi113\appdata\local\continuum\anaconda3\lib\site-packages\pip\_internal\resolve.py", line 261, in _resolve_one
    check_dist_requires_python(dist)
  File "c:\users\mi113\appdata\local\continuum\anaconda3\lib\site-packages\pip\_internal\utils\packaging.py", line 46, in check_dist_requires_python
    feed_parser.feed(metadata)
  File "c:\users\mi113\appdata\local\continuum\anaconda3\lib\email\feedparser.py", line 175, in feed
    self._input.push(data)
  File "c:\users\mi113\appdata\local\continuum\anaconda3\lib\email\feedparser.py", line 103, in push
    self._partial.write(data)
TypeError: string argument expected, got 'NoneType'
You are using pip version 18.0, however version 18.1 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.

PandasRollingOLS reports ValueError

The following code does not work and reports a ValueError. Hope to see a better version soon, thanks.

import pandas as pd
from pyfinance import ols
data = {'A':[2,3,4,5,6],'B':[10,11,12,13,14]}
df = pd.DataFrame(data)
rolling = ols.RollingOLS(y=df['B'], x=df['A'], window=3,has_const=False,use_const=False)

File "C:\ProgramData\Anaconda3\lib\site-packages\pyfinance\utils.py", line 705, in rolling_windows
' a, {1}.'.format(window, a.shape[0]))

ValueError: Specified window length of 3 exceeds length of a, 1.

Should add option of minimum periods of none when there is a specific window size

Right now, when I use window = n, it requires at least n entries without empty values. I believe it would be better to use whatever data is available if the length is smaller than n. Right now it returns NaN.

KeyError for file: utils.py line: 555

A KeyError is raised by ts.anlzd_stdev() and ts.sharpe_ratio(ddof=1).

semi_stdev function is wrong

As you correctly said in the comments, the formula is

    Formula: `sqrt( sum([min(self - thresh, 0)] ** 2) / (n - ddof) )`

    Also known as: downside deviation.

So in the function (line 867 of returns.py) it should be:
ss = ((nansum(np.minimum(self - threshold, 0.) ** 2 )/ n) ** 0.5)
instead of:
ss = (nansum(np.minimum(self - threshold, 0.) ** 2) ** 0.5) / n
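The difference between the two expressions is where the division by n happens: inside the square root (as the quoted formula requires) or outside it. A small standard-library sketch makes the gap visible; `downside_deviation` and `misplaced_division` are hypothetical names used only for this comparison.

```python
import math


def downside_deviation(returns, threshold=0.0, ddof=0):
    """sqrt( sum(min(r - threshold, 0) ** 2) / (n - ddof) )"""
    n = len(returns)
    ss = sum(min(r - threshold, 0.0) ** 2 for r in returns)
    return math.sqrt(ss / (n - ddof))


def misplaced_division(returns, threshold=0.0):
    """Divides by n after taking the root, which shrinks the result."""
    n = len(returns)
    ss = sum(min(r - threshold, 0.0) ** 2 for r in returns)
    return math.sqrt(ss) / n


rets = [0.01, -0.02, 0.03, -0.04]
print(downside_deviation(rets))   # sqrt(0.002 / 4)
print(misplaced_division(rets))   # sqrt(0.002) / 4
```

For n observations the second form is smaller by a factor of sqrt(n), so the discrepancy grows with sample size.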

Can't get sample to work

The sample usage of PandasRollingOLS posted as an answer at https://stackoverflow.com/questions/37317727/deprecated-rolling-window-option-in-ols-from-pandas-to-statsmodels no longer works

I confirmed it wasn't a problem with my personal machine by running it on this online compiler
https://repl.it/@SamArthur/InfatuatedRubberyResources

Here's the error which was given on my personal machine as well as the online compiler

Traceback (most recent call last):
  File "main.py", line 29, in <module>
    model = PandasRollingOLS(y=y, x=x, window=window)
  File "/home/runner/.site-packages/pyfinance/ols.py", line 738, in __init__
    use_const=use_const)
  File "/home/runner/.site-packages/pyfinance/ols.py", line 377, in __init__
    self.solution = _rolling_lstsq(self.xwins, self.ywins)
  File "/home/runner/.site-packages/pyfinance/ols.py", line 36, in _rolling_lstsq
    return np.squeeze(np.matmul(np.linalg.inv(np.matmul(x.swapaxes(1, 2), x)),
  File "/usr/local/lib/python3.6/site-packages/numpy/linalg/linalg.py", line 532, in inv
    ainv = _umath_linalg.inv(a, signature=signature, extobj=extobj)
  File "/usr/local/lib/python3.6/site-packages/numpy/linalg/linalg.py", line 89, in _raise_linalgerror_singular
    raise LinAlgError("Singular matrix")
numpy.linalg.linalg.LinAlgError: Singular matrix

min_window option in RollingOLS

I would like to have a min_window option in RollingOLS, because with a window of 90 it does not perform OLS on the first 90 values. I would like to run an expanding OLS that starts once there are at least 12 observations (min_window) and grows until 90 observations are reached, then a rolling OLS with a window of 90.

Is it possible to implement that, or does someone have a quick workaround for this?

Thanks
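As one possible workaround outside of pyfinance, the requested behavior (expanding until `window` observations are available, then rolling) can be sketched with a plain loop over NumPy least-squares fits. This is not an existing pyfinance feature; `expanding_then_rolling_ols` and its parameters are hypothetical, and a loop like this will be slower than a batched approach on large inputs.

```python
import numpy as np


def expanding_then_rolling_ols(y, x, window, min_window):
    """Row t holds [intercept, slopes...] fitted on the last
    min(t + 1, window) observations; rows with fewer than
    `min_window` observations are left as NaN.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    X = np.column_stack([np.ones(len(y)), x])
    out = np.full((len(y), X.shape[1]), np.nan)
    for end in range(min_window, len(y) + 1):
        start = max(0, end - window)        # expanding while end <= window
        coefs, *_ = np.linalg.lstsq(X[start:end], y[start:end], rcond=None)
        out[end - 1] = coefs
    return out


x = np.arange(10, dtype=float) ** 2
y = 1.0 + 2.0 * x
print(expanding_then_rolling_ols(y, x, window=5, min_window=3))
```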

Testing... some suggestions

Hi Brad, I just looked through your options pricing module.
It might be helpful in general to add tests for your various
functions. This assures the user that, e.g., the condor
is written correctly for specific inputs, and also serves
as demonstration snippets of code.

Check out https://doc.pytest.org -- good starting point.

As you write the tests, the skeleton for documentation will
emerge. When you modify code, the tests will assure you
that everything works as intended. Take care!

Deprecation warning for file: returns.py line: 124

FutureWarning: Index.is_all_dates is deprecated, will be removed in a future version. check index.inferred_type instead
if self.index.is_all_dates:

Any thoughts on fixing this here?

Thanks in advance,

PandasRollingOLS does not actually work with use_const=False

Even if you pass in use_const=False, the regression still appends and uses a constant. Tried tinkering to fix this but ran into dimensionality issues - some help would be appreciated.

FreqGroup class seems to be in pandas._libs.tslibs.dtypes

I'm trying to use PandasRollingOLS, but the class FreqGroup is not found, and neither is get_freq_code.

I have looked at pandas and it seems that the class is in:

from pandas._libs.tslibs.dtypes import FreqGroup instead of from pandas._libs.tslibs.frequencies import FreqGroup, get_freq_code

at utils.py line 80

If I make this change, the library works. I'm just letting you know in case you want to change it; I'm not opening a PR because I don't know whether it affects something else, as I did this really quickly.

Cheers

it does not make sense when calling the property "rsq" and "rsq_adj"

First, thanks for fixing the problem and updating the package to 1.2.5.
When I use the newest version, I run into the following result, which seems to make no sense.

data = {'A':[2,3,4,5,6],'B':[10,11,12,13,14]}
df = pd.DataFrame(data)
rolling = ols.RollingOLS(y=df['B'], x=df['A'], window=3,has_const=False,use_const=False)
rolling.rsq
rolling.rsq_adj

The result is array([938.58741603, 868.32596507, 801.06451411])
array([-624.05827735, -577.21731005, -532.37634274])

It should be in the interval [0, 1].

Looking for new project maintainer

I periodically attempt to keep up with bugfixes, but I unfortunately do not have the time to properly maintain this project beyond that.

If anyone is interested in adopting this project, I would be glad to transfer it if that means giving it a more caring home.

function "get_anlz_factor" return error

Description:
With pandas version 1.3.4, there is an error when calling get_anlz_factor, which in turn causes errors when invoking anlzd_stdev and sharpe_ratio.

Cause:
In get_anlz_factor, the PERIODS_PER_YEAR map is not looked up with the correct key type (int); in pandas 1.3.4 the key is FreqGroup.FR_ANN.

Possible solution:
Use FreqGroup.value to get the correct key.
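The failure mode described here can be reproduced without pandas. In the sketch below, `FreqGroup` is a stand-in enum (not the pandas class), the integer codes are illustrative placeholders, and `anlz_factor_sketch` is a hypothetical helper; the point is only that a dict keyed by ints is not matched by a plain `Enum` member unless `.value` is used.

```python
from enum import Enum


class FreqGroup(Enum):
    """Stand-in for the pandas enum; the codes are illustrative."""
    FR_ANN = 1000
    FR_QTR = 2000
    FR_MTH = 3000


# Keyed by int, as the issue describes.
PERIODS_PER_YEAR = {1000: 1.0, 2000: 4.0, 3000: 12.0}


def anlz_factor_sketch(group):
    """Accept either an enum member or a raw int code."""
    key = group.value if isinstance(group, Enum) else group
    return PERIODS_PER_YEAR[key]


print(FreqGroup.FR_ANN in PERIODS_PER_YEAR)    # member itself is not a key
print(anlz_factor_sketch(FreqGroup.FR_MTH))
print(anlz_factor_sketch(1000))
```

Normalizing the key this way keeps the lookup working whether the caller passes an int code or an enum member.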

Bad ValueError raised in window-length check

Version: 1.2.4

Example:

>>> from pyfinance import ols
>>> data = {"A": [2, 3, 4, 5, 6], "B": [10, 11, 12, 13, 14]}
>>> import pandas as pd
>>> df = pd.DataFrame(data)
>>> rolling = ols.RollingOLS(y=df["B"], x=df["A"], window=3, has_const=False, use_const=False)

ValueError                                Traceback (most recent call last)
<ipython-input-6-b8d3bfa13f07> in <module>
----> 1 rolling = ols.RollingOLS(y=df["B"], x=df["A"], window=3, has_const=False, use_const=False)

.../pyfinance/ols.py in __init__(self, y, x, window, has_const, use_const)
    373                                            use_const=use_const)
    374         self.window = self.n = window
--> 375         self.xwins = utils.rolling_windows(self.x, window=window)
    376         self.ywins = utils.rolling_windows(self.y, window=window)
    377         self.solution = _rolling_lstsq(self.xwins, self.ywins)

.../pyfinance/utils.py in rolling_windows(a, window)
    703     if window > a.shape[0]:
    704         raise ValueError('Specified `window` length of {0} exceeds length of'
--> 705                          ' `a`, {1}.'.format(window, a.shape[0]))
    706     if isinstance(a, (Series, DataFrame)):
    707         a = a.values

ValueError: Specified `window` length of 3 exceeds length of `a`, 1.

install error

I used git to pull the code, then ran the command python -m pip install -e .
and got an error:

(quant) E:\Python\quantaxis\pyfinance>python -m pip install -e .
Looking in indexes: http://pypi.douban.com/simple/
Obtaining file:///E:/Python/XXX/pyfinance
    ERROR: Complete output from command python setup.py egg_info:
    ERROR: Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "E:\Python\XXX\pyfinance\setup.py", line 8, in <module>
        with open(os.path.join(here, 'README.rst'), encoding='utf-8') as f:
      File "E:\ProgramData\Anaconda3\envs\quant\lib\codecs.py", line 898, in open
        file = builtins.open(filename, mode, buffering)
    FileNotFoundError: [Errno 2] No such file or directory: 'E:\\Python\\XXX\\pyfinance\\README.rst'

So I changed 'README.rst' to 'README.md', and then it was OK.

RollingOLS unnecessarily implements LRUCache, which causes memory issues with larger objects.

I'm working with large datasets, and I'm running different RollingOLS classes multiple times on them. However, I've noticed that after running the class I soon start running out of memory, as each pvalue I calculate gets stored in the LRUCache. I believe this is unnecessary, or at least an option should be implemented to remove the LRUCache entirely.

Thanks!
