Comments (7)
BTW, the default UX will be like what is in the examples now, i.e.
results = MarketSimulator(['AAPL', 'GOOG', 'AMZN']).backtest([strategy1, strategy2], '2000-01-01')
print(results)
results.plot()
(This is not the syntax now.)
from cvxportfolio.
No, it's the other way round: right now only the methods inside marketsimulator are being used, and I'm building their replacements outside of it, with some improvements. Everything is tested (both old and new). I didn't touch the old pieces while building the new ones, so there is no disruption there. The new pieces (marketdata etc.) are 100% tested but not currently plugged into the main API. Git branches are overkill for this project at the moment.
Also, sorry, I didn't mean to sound dismissive of your comment; you're right that it's not great to have code duplication in the master branch (I wrote the new stuff on Monday and thought I would merge earlier, but it will probably have to wait for the weekend). I can share what I have in mind.

MarketSimulator.__init__ takes a list of cost functions (currently I wrote classes that implement a single method, but they will be simple callbacks instead) and **kwargs that are passed to them (e.g., linear_transaction_cost, cash_borrowing_spread, ...). MarketData holds and serves data to both the simulator and the policy, and is also initialized by the simulator with the optional **kwargs. MarketSimulator becomes a very thin, almost stateless class that implements the simulate and backtest methods. BacktestResult will also change and take on a few things that are currently done inside MarketSimulator.

The main issue is multiprocessing: it's very tricky to pass classes with lots of internal state through the interprocess communication system (they all get pickled and unpickled on the other end), so I need to make sure that classes have as little state as possible. In fact, MarketSimulator.backtest will only implement the logic to initialize a backtest (and break it up into smaller backtests as the universe changes with IPOs or bankruptcies, cf. #85) and then call a separate _backtest function, which is the one that is sent to multiprocessing.

Again, the goal is a final product that is easy to use (exposing a clear object model to the user, not a bunch of functions), easy to extend (user-defined extensions should be easy to write), and easy to read and debug. Also, since everything goes through multiprocessing, at the lower level it should ideally call functions or methods of near-stateless classes.
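A minimal sketch of the structure described above may help make it concrete. It is not the actual cvxportfolio code: the signatures, the dictionary-based data layout, the toy hold_policy, and the simplified accounting are all assumptions made for illustration. Only the names MarketSimulator, MarketData, _backtest, and linear_transaction_cost come from the description.

```python
from dataclasses import dataclass


def transaction_cost(trades, linear_transaction_cost=0.0, **_unused):
    """Hypothetical cost callback: a fixed fraction of traded dollar volume."""
    return linear_transaction_cost * sum(abs(u) for u in trades.values())


@dataclass(frozen=True)
class MarketData:
    """Holds per-period asset returns and serves them to simulator and policy."""
    returns: tuple  # one {asset: return} dict per period (illustrative layout)

    def serve(self, t):
        return self.returns[t]


def hold_policy(t, holdings, market_data):
    """Toy policy that never trades (an assumption; not a real strategy)."""
    return {asset: 0.0 for asset in holdings if asset != "cash"}


def _backtest(policy, market_data, costs, cost_kwargs, h_init):
    """Module-level function with no hidden state: the unit of work that
    could be shipped to a worker process."""
    h = dict(h_init)
    values = []
    for t in range(len(market_data.returns)):
        u = policy(t, h, market_data)                    # trades in dollars
        cost = sum(c(u, **cost_kwargs) for c in costs)   # total cost this period
        for asset, trade in u.items():
            h[asset] += trade
        h["cash"] -= sum(u.values()) + cost              # cash pays for trades and costs
        for asset, r in market_data.serve(t).items():
            h[asset] *= 1.0 + r                          # apply this period's returns
        values.append(sum(h.values()))
    return values


class MarketSimulator:
    """Thin wrapper: stores configuration and dispatches to _backtest."""

    def __init__(self, market_data, costs=(), **kwargs):
        self.market_data = market_data
        self.costs = tuple(costs)
        self.cost_kwargs = kwargs

    def backtest(self, policies, h_init):
        jobs = [(p, self.market_data, self.costs, self.cost_kwargs, h_init)
                for p in policies]
        # Run sequentially here; a parallel version could hand the same
        # argument tuples to multiprocessing.Pool.starmap.
        return [_backtest(*job) for job in jobs]


data = MarketData(returns=({"AAPL": 0.10}, {"AAPL": -0.50}))
simulator = MarketSimulator(data, costs=[transaction_cost],
                            linear_transaction_cost=0.001)
results = simulator.backtest([hold_policy], {"AAPL": 100.0, "cash": 0.0})
```

Because everything a backtest needs travels in one tuple of plain arguments, the only objects that would have to be pickled are the policy, the data container, and the cost callbacks, which is the motivation given above for keeping classes as stateless as possible.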
Ah, my bad then: I didn't realize that MarketData was new, and I should have checked the history, not just the current code.
It's just that things like MarketData being unused confused me.
I think the proposed class structure (MarketData servicing both the policy and the simulator) makes a lot of sense, and it can perhaps solve the problem with nan assets (#85) in one go.
I will continue to get up to speed with the codebase.
Follow-up question on multiprocessing: can you help me understand how you envision a fully multiprocessing-compatible backtest? It was my understanding that backtesting at time t requires state knowledge of the holdings at t-1.
Multiprocessing is used when doing multiple backtests (which will be largely automated once we make hyperparameter optimization part of the simulator logic). This was already implemented in old cvxportfolio (branch 0.0.X) and used in the example notebooks. It gets trickier with thicker classes, which is the reason why I'm refactoring the simulator logic now. Also, we used to use an external multiprocess library; I have instead moved to the standard library multiprocessing module, which is less forgiving (but better supported). The problem with multiprocessing is that it is hard to debug, hence the code needs to be crystal clear. You can't really parallelize a single backtest because it is path-dependent (even without tcosts). The whole cvx* stack is mostly single-threaded (apart from a few BLAS level 3 / LAPACK calls, which are very rare), so multiprocessing is a great fit. Also, I'm not closing some old issues because they contain stuff that needs to be processed (e.g., in this one there is a lot of text that needs to go in the docs).
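To illustrate the path-dependence point, here is a minimal sketch using only the standard library. The returns and the leverage hyperparameter are made up for the example and the function names are hypothetical; none of this is cvxportfolio's API. Each backtest is a sequential loop, because the value at time t is built from the value at t-1, while separate backtests share nothing and can be mapped over worker processes.

```python
import multiprocessing

# Made-up per-period returns, for illustration only.
RETURNS = [0.01, -0.02, 0.03, 0.005]


def run_one_backtest(leverage):
    """One backtest: inherently sequential, since each step starts from
    the portfolio value produced by the previous step."""
    value = 1.0
    for r in RETURNS:
        value *= 1.0 + leverage * r
    return leverage, value


def run_many_backtests(leverages, processes=2):
    """Independent backtests (one per hyperparameter value) run in parallel.

    The worker function lives at module level so that the standard library
    can pickle a reference to it and send it to the worker processes.
    """
    with multiprocessing.Pool(processes) as pool:
        return pool.map(run_one_backtest, leverages)
```

A call such as run_many_backtests([0.5, 1.0, 2.0]) should be placed under an `if __name__ == "__main__":` guard in a script, since the standard library module re-imports the main module in the workers on platforms that spawn new processes.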
BTW, the above is true for optimization-based strategies; for toy strategies, easier parallelization might work. Everything in cvxportfolio maps the content of the book; the dynamics are explained in Chapter 2. The dynamics and parallelization are the same as they have always been in cvxportfolio since the initial 2016 release.