Comments (10)
@wukan1986 thanks, it seems I need to further optimize the column mapping to reduce the memory footprint. I will release a fix soon.
Edit: @roelio just figured out how to allocate more virtual memory on Linux. The problem with Windows is that it does not overcommit memory, so you cannot allocate more memory than the RAM size plus the current pagefile size, as I found here. This prevents vectorbt from defining arbitrarily large empty arrays.
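A minimal way to observe this difference (a sketch; the shapes are arbitrary): `np.empty` only reserves address space without touching the pages, so on Linux a very large request tends to succeed thanks to overcommit, while Windows commits the memory up front and refuses anything beyond RAM plus pagefile.

```python
import numpy as np

def can_reserve(shape, dtype=np.int64):
    """Return True if the OS lets us reserve an empty array of this
    shape. np.empty does not touch the pages, so on Linux (which
    overcommits) even huge requests tend to succeed; Windows commits
    the memory up front and raises MemoryError instead."""
    try:
        np.empty(shape, dtype=dtype)
        return True
    except MemoryError:
        return False

print(can_reserve((1000, 1000)))  # modest request: True on any OS
```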
from vectorbt.
I cannot reproduce the example, it works fine both on my machine and Colab.
Change `num_tests = 2000` to `num_tests = 20000`; it only crashes on Windows.
On my machine (MacBook Air 2017, 2.2 GHz Dual-Core Intel Core i7, 8 GB RAM), `num_tests = 2000`, `num_tests = 20000`, and even `num_tests = 200000` run fine (although the last one runs pretty slowly because of memory swapping). If your machine runs 32-bit Windows, its address space is relatively small, and that's why you might get errors rather than just poor performance. See https://serverfault.com/a/75027 for general info. If that's the case, I have no other tip apart from running memory-expensive code on Colab, as high memory consumption is one of the general drawbacks of vectorbt. You may also want to disable caching globally or run your code in chunks, as illustrated in my article.
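The chunking idea can be sketched like this (a hypothetical helper, not part of vectorbt's API): split the hyperparameter columns into fixed-size chunks and run each one separately, so only one chunk's intermediate arrays are alive at a time.

```python
import numpy as np

def run_in_chunks(columns, chunk_size, func):
    """Hypothetical helper (not vectorbt API): process column indices
    in fixed-size chunks to cap peak memory, instead of simulating
    all parameter combinations at once."""
    results = []
    for start in range(0, len(columns), chunk_size):
        results.append(func(columns[start:start + chunk_size]))
    return np.concatenate(results)

# e.g. 20000 hyperparameter combinations, 1000 at a time,
# with a stand-in function in place of a real backtest:
out = run_in_chunks(np.arange(20000), 1000, lambda cols: cols * 2.0)
print(out.shape)  # (20000,)
```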
my computer:
AMD Ryzen 7 4800U with Radeon Graphics 1.80 GHz
16.0 GB RAM
Windows 10 Home x64
numba 0.51.2
numpy 1.19.3
pandas 1.0.3
python 3.8.3
Can you check whether you can run the following snippet:

```python
import numpy as np
from numba import njit

@njit
def col_map_nb(col_arr, n_cols):
    col_idxs_out = np.empty((n_cols, len(col_arr)), dtype=np.int_)
    print(col_idxs_out.shape)
    col_ns_out = np.full((n_cols,), 0, dtype=np.int_)
    for r in range(col_arr.shape[0]):
        col = col_arr[r]
        col_idxs_out[col, col_ns_out[col]] = r
        col_ns_out[col] += 1
    return col_idxs_out[:, :np.max(col_ns_out)], col_ns_out

col_map_nb(np.repeat(np.arange(10000), 50), 10000)
```
No crash:

```python
col_map_nb(np.repeat(np.arange(10000), 50), 10000)
```

```
(10000, 500000)
Process finished with exit code 0
```

Crash:

```python
col_map_nb(np.repeat(np.arange(10000), 50), 1)
```

```
(1, 500000)
Process finished with exit code -1073741819 (0xC0000005)
```
With numba disabled:

```
(1, 500000)
Traceback (most recent call last):
  File "D:/test_vectorbt/check_numba.py", line 23, in <module>
    col_map_nb(np.repeat(np.arange(10000), 50), 1)
  File "D:/test_vectorbt/check_numba.py", line 17, in col_map_nb
    col_idxs_out[col, col_ns_out[col]] = r
IndexError: index 1 is out of bounds for axis 0 with size 1
```
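The exit code 0xC0000005 is an access violation: the njit-compiled loop performs no bounds checking, so writing to row 1 of a one-row array scribbles over foreign memory. Interpreted NumPy checks indices and raises IndexError instead, which matches the numba-disabled run; compiling with `@njit(boundscheck=True)` would surface the same IndexError at a speed cost. A minimal reproduction without numba:

```python
import numpy as np

def bad_write(n_cols):
    # Same pattern as col_map_nb called with n_cols too small: row
    # index 1 is invalid when the array has only one row. Interpreted
    # NumPy raises IndexError; compiled code without bounds checks
    # can instead corrupt memory and crash the process.
    out = np.empty((n_cols, 4), dtype=np.int_)
    out[1, 0] = 0
    return out

try:
    bad_write(1)
except IndexError as e:
    print(e)  # index 1 is out of bounds for axis 0 with size 1
```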
With numba disabled, the `iloc` call needs a lot of memory:

```
Traceback (most recent call last):
  File "D:/test_vectorbt/demo_crash.py", line 71, in <module>
    rb_portfolio.iloc[0]
  File "D:\Users\Kan\miniconda3\envs\py38_vectorbt\lib\site-packages\vectorbt\base\indexing.py", line 24, in __getitem__
    return self._indexing_func(lambda x: x.iloc.__getitem__(key), **self._indexing_kwargs)
  File "D:\Users\Kan\miniconda3\envs\py38_vectorbt\lib\site-packages\vectorbt\portfolio\base.py", line 435, in _indexing_func
    new_order_records = self._orders._col_idxs_records(col_idxs)
  File "D:\Users\Kan\miniconda3\envs\py38_vectorbt\lib\site-packages\vectorbt\records\base.py", line 300, in _col_idxs_records
    self.values, self.col_mapper.col_map, to_1d(col_idxs))
  File "D:\Users\Kan\miniconda3\envs\py38_vectorbt\lib\site-packages\vectorbt\utils\decorators.py", line 215, in __get__
    val = self.func(instance)
  File "D:\Users\Kan\miniconda3\envs\py38_vectorbt\lib\site-packages\vectorbt\records\col_mapper.py", line 67, in col_map
    return nb.col_map_nb(self.col_arr, len(self.wrapper.columns))
  File "D:\Users\Kan\miniconda3\envs\py38_vectorbt\lib\site-packages\vectorbt\records\nb.py", line 99, in col_map_nb
    col_idxs_out = np.empty((n_cols, len(col_arr)), dtype=np.int_)
MemoryError: Unable to allocate 21.6 GiB for an array with shape (11000, 527993) and data type int32
```
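The reported size checks out: the preallocated buffer is `(n_cols, n_records)` of int32 (NumPy's default integer on 64-bit Windows), so:

```python
# Size of the (n_cols, n_records) buffer from the traceback:
n_cols, n_records = 11000, 527993
itemsize = 4                      # np.int_ is int32 on Windows
gib = n_cols * n_records * itemsize / 2**30
print(round(gib, 1))  # -> 21.6
```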
Check whether it can run now. Apart from core optimizations, `Portfolio.from_*` methods have two additional arguments: `max_orders` and `max_logs`. If you run into memory errors and any of them is caused by initializing the `order_records` or `log_records` arrays, you can set the maximum number of records you expect the simulation to generate. By default, if you have data of shape (1000, 1000), vectorbt will generate 1000 * 1000 = 1000000 empty records. But most of the time you need only a fraction of that (for example, for buy and hold you need only 1000 records, one per column), so this fraction can now be specified to consume less memory. If you don't need logging and you use `Portfolio.from_order_func`, set `max_logs` to 0.
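To illustrate the saving (the commented call below is a sketch: the `max_orders`/`max_logs` argument names come from the comment above, the rest of the call is assumed):

```python
# Default preallocation for data of shape (1000, 1000):
rows, cols = 1000, 1000
default_max_orders = rows * cols      # 1_000_000 empty records
buy_and_hold_orders = cols            # one order per column suffices

# Hypothetical vectorbt call using the arguments described above:
# pf = vbt.Portfolio.from_signals(close, entries, exits,
#                                 max_orders=cols, max_logs=0)

print(default_max_orders // buy_and_hold_orders)  # -> 1000x fewer records
```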
Thank you very much!