This repository contains Python code for optimizing the hyperparameters of an XGBoost classifier using the hyperopt
library. The model is trained to predict trading signals, which are categorized as long, short, or neutral, based on various technical indicators.
- Data Cleaning: Removes rows with missing values.
- Encoding: Encodes the categorical target column ('Signal').
- Hyperparameter Optimization: Uses Bayesian optimization with hyperopt to find the best hyperparameters for the XGBoost classifier.
- Model Evaluation: Evaluates the model's performance on a test set, providing a classification report, confusion matrix, accuracy, and ROC-AUC score.
- Feature Importance: Visualizes the importance of each feature used in the model.
Make sure you have the following Python packages installed:
numpy
matplotlib
xgboost
scikit-learn (imported as sklearn)
hyperopt
You can install them using pip:
pip install numpy matplotlib xgboost scikit-learn hyperopt
- Clone this repository.
- Navigate to the repository's root directory in your terminal.
- Run the code using the following command:
python xgb_hyperop.py
After running the code, you will get:
- A printout of the best hyperparameters discovered during the optimization.
- A bar chart visualizing the feature importances.
- A detailed performance report including accuracy, ROC-AUC score, classification report, and confusion matrix.
- An ROC Curve plot.
Upon evaluating the XGBoost model's performance using the test dataset, we observed the following:
- The confusion matrix shows that the model correctly predicted 218 'long' signals and 239 'short' signals. However, 54 'long' signals were misclassified as 'short' and 54 'short' signals were misclassified as 'long'.
- The precision, recall, and F1-score for both 'long' and 'short' signals are approximately 0.81, indicating balanced performance between the two classes.
- The overall accuracy of the model is 81%, meaning the model correctly predicted the trading signal for 81% of the test data.
- The ROC-AUC score is 0.89. A score closer to 1 indicates stronger discrimination between the positive and negative classes, while 0.5 corresponds to random guessing.
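As a quick sanity check, the per-class metrics and the accuracy can be recomputed from the confusion matrix counts reported above. The sketch below uses only those four counts; the dictionary layout and the helper name `per_class_metrics` are illustrative choices, not part of the repository.

```python
# Counts from the reported confusion matrix, keyed as (true label, predicted label).
confusion = {
    ("long", "long"): 218,
    ("long", "short"): 54,
    ("short", "long"): 54,
    ("short", "short"): 239,
}
labels = ["long", "short"]


def per_class_metrics(label):
    """Return (precision, recall, f1) for one class from the confusion counts."""
    tp = confusion[(label, label)]
    fn = sum(confusion[(label, pred)] for pred in labels if pred != label)
    fp = sum(confusion[(true, label)] for true in labels if true != label)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


total = sum(confusion.values())
accuracy = sum(confusion[(label, label)] for label in labels) / total

for label in labels:
    precision, recall, f1 = per_class_metrics(label)
    print(f"{label}: precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
print(f"accuracy={accuracy:.3f}")
```

With these counts, 'long' comes out at roughly 0.80 and 'short' at roughly 0.82 on all three metrics, and accuracy is 457 / 565, or about 0.81, which agrees with the rounded figures in the report.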
- Finding Features with XGBoost: Trains and evaluates an XGBoost classifier on the Bitcoin technical indicators dataset to predict trading signals ('long', 'short', or 'neutral') from the values of various indicators.
- Technical Analysis Repository: This repository fetches 120 days of hourly Bitcoin price data, calculates technical indicators, and analyzes the relations between these indicators.
Connect with me on LinkedIn.
For more insights into my work, check out my latest project: tafou.io.
I'm always eager to learn, share, and collaborate. If you have experiences, insights, or thoughts about RL, Prophet, XGBoost, SARIMA, ARIMA, or even simple Linear Regression in the domain of forecasting, please create an issue, drop a comment, or even better, submit a PR!
Let's learn and grow together! 🌱