clementperroud / gym-trading-env
A simple, easy, customizable Gymnasium environment for trading.
Home Page: https://gym-trading-env.readthedocs.io/
License: MIT License
Hi Clement,
From what I understand reading the docs for the MultiDatasetTradingEnv:
If this is the case, are we introducing look-ahead bias? At the end of the first episode we have already seen into the future. Would it be more realistic to iterate over all the datasets at time t, advance to time t+1, and repeat until time T?
Do you have any thoughts on this?
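A toy sketch of the lockstep iteration described above (this is not the library's current behaviour; the series below are made-up stand-ins for datasets):

```python
# Toy stand-ins for several price series of equal length (assumption).
datasets = [
    [100, 101, 102, 103],   # dataset A
    [50, 49, 51, 52],       # dataset B
]

# Lockstep: visit every dataset at time t before advancing to t + 1,
# so no episode runs ahead of the others in calendar time.
T = min(len(d) for d in datasets)
visits = []
for t in range(T):
    for i, d in enumerate(datasets):
        visits.append((t, i, d[t]))
```

This guarantees that nothing observed at time t can leak information from any dataset's future beyond t.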
Best John.
Hi, on Colab TensorFlow 2.12 is the default, and Gym-Trading-Env has a numpy>=1.24.2 requirement. But this ends in dependency conflicts:
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tensorflow 2.12.0 requires numpy<1.24,>=1.22, but you have numpy 1.24.3 which is incompatible.
numba 0.56.4 requires numpy<1.24,>=1.18, but you have numpy 1.24.3 which is incompatible.
Question: Is numpy>=1.24.2 mandatory for Gym-Trading-Env?
In MultiDatasetTradingEnv, the dataset-selection randomization has a strong bias towards earlier datasets in the list. (I noticed the first 10% of my datasets getting selected far more often.)
The fix was to change line 387
From:
random_int = np.random.randint(potential_dataset_pathes.size)
To:
random_int = np.random.choice(potential_dataset_pathes)
After this change the datasets get hit very equally.
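A quick way to sanity-check the selection distribution with np.random.choice over a list of paths (the file names below are toy stand-ins, not the library's actual datasets):

```python
import collections
import numpy as np

# Toy stand-ins for dataset paths.
paths = np.array([f"data/set_{i}.pkl" for i in range(5)])

rng = np.random.default_rng(0)
counts = collections.Counter(rng.choice(paths) for _ in range(10_000))
# With uniform sampling, every path should land near 2000 hits.
```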
Hello,
Thank you for the awesome environment. I have a question: I'm trying to understand what I should use as positions if, at each timestep, I want to either sell my whole portfolio into USD, buy all BTC, or do nothing.
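Based on the positions convention used elsewhere in this thread (0 = out / all fiat, 1 = fully long), a minimal sketch of the three choices; the helper function is hypothetical, purely for illustration:

```python
# Assumed convention: position 0 = 100% USD, position 1 = 100% BTC.
positions = [0, 1]

def trade_fraction(current, target):
    """Fraction of the portfolio to move; 0 means no trade ("do nothing")."""
    return target - current

# Choosing the same position again executes no trade:
trade_fraction(1, 1)  # hold
trade_fraction(0, 1)  # buy all BTC
trade_fraction(1, 0)  # sell everything into USD
```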
To further reduce overfitting, it might also be useful to divide single records of pairs from an exchange into batches, so that the agent sees random time segments from a dataset in each episode when using MultiDatasetTradingEnv.
The goal would be to reduce the probability that the agent simply memorizes the long-term price trend: with batches, it only ever sees random sections of the dataset.
Does the thought process make sense? If yes, I would finish a PR extending the download function with an optional argument "batch_size".
Something like this:
async def _download_symbol(exchange, symbol, timeframe='5m',
                           since=int(datetime.datetime(year=2020, month=1, day=1).timestamp() * 1E3),
                           until=int(datetime.datetime.now().timestamp() * 1E3),
                           limit=1000, pause_every=10, pause=1, batch_size=None):
    # `dir` and `_ohlcv` are assumed from the surrounding module, as in the
    # original download helper.
    timedelta = int(pd.Timedelta(timeframe).to_timedelta64() / 1E6)
    tasks = []
    results = []
    batch_num = 1
    gathered_rounds = 0
    for step_since in range(since, until, limit * timedelta):
        tasks.append(
            asyncio.create_task(_ohlcv(exchange, symbol, timeframe, limit, step_since, timedelta))
        )
        if len(tasks) >= pause_every:
            results.extend(await asyncio.gather(*tasks))
            await asyncio.sleep(pause)
            tasks = []
            gathered_rounds += 1
            # Save every `batch_size` gather rounds. A separate counter is needed
            # here: batch_num only advances when a batch is written, so testing
            # batch_num % batch_size would never trigger for batch_size > 1.
            if batch_size is not None and gathered_rounds % batch_size == 0:
                final_df = pd.concat(results, ignore_index=True)
                final_df = final_df.loc[(since < final_df["timestamp_open"]) & (final_df["timestamp_open"] < until), :]
                del final_df["timestamp_open"]
                final_df.set_index('date_open', drop=True, inplace=True)
                final_df.sort_index(inplace=True)
                final_df.dropna(inplace=True)
                final_df.drop_duplicates(inplace=True)
                save_file = f"{dir}/{exchange.id}-{symbol.replace('/', '')}-{timeframe}-batch{batch_num}.pkl"
                final_df.to_pickle(save_file)
                print(f"{symbol} downloaded from {exchange.id} and stored at {save_file}")
                results = []
                batch_num += 1
    if len(tasks) > 0:
        results.extend(await asyncio.gather(*tasks))
    if len(results) > 0:
        final_df = pd.concat(results, ignore_index=True)
        final_df = final_df.loc[(since < final_df["timestamp_open"]) & (final_df["timestamp_open"] < until), :]
        del final_df["timestamp_open"]
        final_df.set_index('date_open', drop=True, inplace=True)
        final_df.sort_index(inplace=True)
        final_df.dropna(inplace=True)
        final_df.drop_duplicates(inplace=True)
        save_file = f"{dir}/{exchange.id}-{symbol.replace('/', '')}-{timeframe}-batch{batch_num}.pkl"
        final_df.to_pickle(save_file)
        print(f"{symbol} downloaded from {exchange.id} and stored at {save_file}")
This drove me crazy: there are many deprecation warnings with TensorFlow and NumPy, especially because you don't pin any versions or requirements. This line was the problem:
Gym-Trading-Env/src/gym_trading_env/environments.py
Lines 14 to 15 in 6d7b773
Overnight the code stopped working and I have no clue why it worked before. But removing the escalation of warnings to errors made it run again.
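A minimal sketch of that workaround, assuming the library escalates warnings to errors at import time (e.g. via warnings.filterwarnings("error")): resetting the filters after importing it restores the default behaviour.

```python
import warnings

# Undo any "treat warnings as errors" filter installed at import time
# (assumption: the library called warnings.filterwarnings("error")).
warnings.resetwarnings()
warnings.simplefilter("default")

# Deprecation warnings are now reported instead of raising:
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warnings.warn("numpy deprecation example", DeprecationWarning)
```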
Hi, can we add some functionality to add agents?
Thanks
I have a well-trained model. How should I switch the env to connect to Binance? Is there any support for this here?
Hi,
When I check the environment with windows > 0 via print(check_env(env)) from stable-baselines3, I get a warning that the observation vector has to be 1D, since with a window it is now 2D (like a picture). I am pretty new to RL; can I ignore that, or should I convert the observation, and if yes, how?
Thanks and best regards ste
I wanted to write my own environment that works a bit differently, and noticed the existing one is structured well enough to be easy to extend. In my case I want to create a portfolio that works with futures, which have somewhat different characteristics.
So what I did is:
I removed
trading_fees=0,
borrow_interest_rate=0,
portfolio_initial_value=1000,
initial_position='random',
from the environment and added it to the Portfolio class. Now the environment is decoupled from the asset class. I can implement a FuturePortfolio that has a tick_value and is based on the number of contracts traded rather than a balance between cash and assets. The environment just pulls the necessary info from the portfolio class and works the same way. With this I can also create different portfolios for different assets and train the same agent. For example, Nasdaq futures trade with a tick value of $5 and the S&P 500 with $12.50, etc.
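A minimal sketch of the decoupled design described above; the class and attribute names (Portfolio, FuturePortfolio, tick_value, tick_size) are assumptions for illustration, not the library's API:

```python
class Portfolio:
    """Base portfolio owning the parameters moved out of the environment."""
    def __init__(self, portfolio_initial_value=1000, trading_fees=0.0,
                 borrow_interest_rate=0.0, initial_position='random'):
        self.value = portfolio_initial_value
        self.trading_fees = trading_fees
        self.borrow_interest_rate = borrow_interest_rate
        self.initial_position = initial_position

class FuturePortfolio(Portfolio):
    """Futures variant: value moves in ticks per contract, not a cash/asset balance."""
    def __init__(self, tick_value, tick_size, contracts=0, **kwargs):
        super().__init__(**kwargs)
        self.tick_value = tick_value  # e.g. $5 per tick for Nasdaq futures
        self.tick_size = tick_size    # minimum price increment
        self.contracts = contracts

    def apply_price_change(self, price_change):
        # P&L = ticks moved * dollar value per tick * number of contracts
        ticks = price_change / self.tick_size
        self.value += ticks * self.tick_value * self.contracts
```

With this split, a Nasdaq portfolio (tick_value=5) and an S&P 500 portfolio (tick_value=12.5) can feed the same agent through the same environment.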
env = gym.make("TradingEnv",
    name="BTCUSD",
    df=df,  # Your dataset with your custom features
    positions=[-1, 0, 1],  # -1 (=SHORT), 0 (=OUT), +1 (=LONG)
    trading_fees=0.01/100,  # 0.01% per stock buy / sell (Binance fees)
    borrow_interest_rate=0.0003/100,  # 0.0003% per timestep (one timestep = 1h here)
)
How to keep a position?
Hi,
The _get_obs here only considers the dynamic features instead of including all features!
Shouldn't it include all the features?
Thanks
Hello,
Can you create an example of training and inference? Appreciate it.
(py39) C:\Users\leo\Gym-Trading-Env-main>python examples\example_environnement.py
Traceback (most recent call last):
File "C:\Users\leo\Gym-Trading-Env-main\examples\example_environnement.py", line 44, in <module>
env.add_metric('Position Changes', lambda history : np.sum(np.diff(history['position']) != 0) )
File "C:\Users\leo\anaconda3\envs\py39\lib\site-packages\gymnasium\core.py", line 297, in __getattr__
logger.warn(
File "C:\Users\leo\anaconda3\envs\py39\lib\site-packages\gymnasium\logger.py", line 55, in warn
warnings.warn(
UserWarning: WARN: env.add_metric to get variables from other wrappers is deprecated and will be removed in v1.0, to get this variable you can do env.unwrapped.add_metric
for environment variables or env.get_attr('add_metric')
that will search the reminding wrappers.
Fix needed: call env.unwrapped.add_metric(...) instead of env.add_metric(...).
When I run the environment with parallel environments on my Windows device, I get an overflow error ("Python int too large to convert to C long") after several iterations, in gymnasium\vector\vector_env.py, line 300, in _add_info:
info_array[env_num], array_mask[env_num] = info[k], True
On my Linux device this problem does not occur. After some research, I found out that on Windows the default C long is 32-bit instead of 64-bit.
I have found a workaround that converts the int type to int64. However, I am not sure whether this method is legitimate and whether it might introduce further errors. Nonetheless, I have modified _init_info_arrays in vector_env.py as follows. Additionally, I am not sure whether the source of the error lies in Gymnasium or in TradingEnv / MultiDatasetTradingEnv.
def _init_info_arrays(self, dtype: type) -> Tuple[np.ndarray, np.ndarray]:
    """Initialize the info array.

    If the dtype is numeric the info array will have the same dtype,
    otherwise it will be an array of `None`. Also, a boolean array of the
    same length is returned; it will be used for assessing which
    environment has info data.

    Args:
        dtype (type): data type of the info coming from the env.

    Returns:
        array (np.ndarray): the initialized info array.
        array_mask (np.ndarray): the initialized boolean array.
    """
    if dtype in [int, np.int64]:
        # Force 64-bit integers so large Python ints also fit on Windows
        array = np.zeros(self.num_envs, dtype=np.int64)
    elif dtype in [float, bool] or issubclass(dtype, np.number):
        array = np.zeros(self.num_envs, dtype=dtype)
    else:
        array = np.zeros(self.num_envs, dtype=object)
        array[:] = None
    array_mask = np.zeros(self.num_envs, dtype=bool)
    return array, array_mask