This code implements a Deep Learning (DL) framework to deal with the autonomous guidance of a chaser spacecraft during a minimum-fuel, time-fixed, impulsive linear rendezvous maneuver, with fixed initial and final relative conditions. Specifically, the aim of the mission is to come in close proximity of a passive target body, starting from given initial conditions. Moreover, two type of operational constraints are considered: a spherical keep-out zone, representing the space occupied by the target, and a visibility cone the chaser must move through during the final part of the maneuver, which guarantees that the chaser spacecraft is always visible from the target. The chaser trajectory is divided into a number of segments, and the Hill-Clohessy-Wiltshire (HCW) equations are used to propagate the chaser relative motion between any two impulsive Dvs.
The environment can be deterministic or stochastic (i.e., with state, observation and/or control uncertainties, as well as with stochastic dynamical perturbations). The initial state of the chaser can be fixed or uniformly scattered around a reference value. The resulting (stochastic) optimal control problem is solved by using a Deep Neural Network (DNN) trained by either Behavioral Cloning (BC) or Reinforcement Learning (RL), or both.
The program is based on Stable Baselines, an open-source library containing a set of improved implementations of RL algorithms based on OpenAI Baselines. All necessary information can be found in the corresponding GitHub repository.
Moreover, the program uses OpenMPI message passing interface implementation to run different environment realizations in parallel on several CPU cores during training.
To correctly set up the code on Linux (Ubuntu), please follow the present instructions:
-
First, you need Python3 and the system packages CMake, OpenMPI and zlib. You can install all these packages with:
$ sudo apt-get update $ sudo apt-get install cmake openmpi-bin libopenmpi-dev python3-dev zlib1g-dev
-
After installing Anaconda, you can create a virtual environment with a specific Python version (3.6.10) by using conda:
$ conda create --name myenv python=3.6.10
where
myenv
is the name of the environment, and activate the environment with:$ conda activate myenv
When the conda environment is active, your shell prompt is prefixed with (myenv). The environment can be deactivated with:
$ conda deactivate
-
Install all required packages in the virtual environment by using pip and the requirement file in the current repository:
(myenv)$ cd /path-to-cloned-GitHub-repository/DL-rendezvous-guidance/ (myenv)$ pip install -r requirements.txt
-
Install the gym RL environment with pip:
(myenv)$ pip install -e gym-rendezvous
Now, you can verify that all packages were successfully compiled and installed by running the following command:
(myenv)$ python main_rendezvous_validate.py
If this command executes without any error, then your DL-rendezvous-guidance installation is ready for use.
The plots are realized by using gnuplot. For this reason, you may need to install gnuplot on your machine:
$ sudo apt-get install -y gnuplot
Eventually, a LaTeX distribution should be installed on the computer in order to correctly visualize all graphs and plots. As an example, to install Tex Live LaTeX distribution on Ubuntu use the following command.
$ sudo apt-get install texlive-full
The program is composed by 4 main Python scripts (main_rendezvous_(...).py
), the RL environment (in gym-rendezvous/
), created by using the OpenAI Gym toolkit, and a number of additional Python modules (in custom_modules/
) with useful functions.
Specifically:
-
main_rendezvous_RL.py
is the main file that must be called to train the DNN by RL.All the environment settings, the RL hyper-parameter values and the DNN configuration must be specified in an external text file with ad-hoc formatting, to be placed in folder
settings_files/
and given as input to the script. For example, to start training the agent by RL with the settings specified in filesettings_files/settingsRL.txt
, you can use the following command:(myenv)$ python main_rendezvous_RL.py --settings "settingsRL.txt"
The information that must be included in the settings file, together with the right formatting, are described in detail in
settings_files/README.md
. A sample settings file, namedsettingsRL.txt
, is already included in foldersettings_files/
for convenience; you can look at it to better understand how to prepare new "legal" settings files from scratch.The specific linear rendezvous mission to study must be specified in a .dat file to be placed in folder
missions/
. The file should contain at least three rows, the first containing the columns headers, and the other two specifying the time (t
), the relative chaser position (x
,y
,z
) and velocity (xdot
,ydot
,zdot
) in a RTN frame centered on the target body at the beginning and at the end of the mission, respectively. All quantities are assumed to be in s, km and km/s. The name of the mission file must be specified in the settings file. For example, fileMSR.dat
contains the initial and final chaser state for a typical Mars Sample Return mission.It is also possible to re-train by RL, with the same or different settings, a pre-trained model (i.e., DNN). In this case, the program must be called with the command:
(myenv)$ python main_rendezvous_RL.py --settings "settings-name.txt" --input_model_folder "relative-path-to-input-model-folder/"
where
settings-name.txt
is the name of the settings file which contains the new training settings, andreletive-path-to-input-model-folder/
is the relative path (i.e., starting from the cloned repo directory) of the folder which contains the pre-trained model, namedbest_model.zip
orfinal_model.zip
At the end of the training, the final RL-trained model (
final_model.zip
), the best RL-trained model found according to the evaluation callback (best_model.zip
), together with all other output files, are saved in directorysol_saved/(i)env_(j)nb_(k)Msteps_(l)/
, where numbers (i), (j) and (k) depend on training settings specified in the settings file, and number (l) depends on how many solutions of the same kind are already contained insol_saved/
folder. In presence of a pre-trained model, the outputs of training are instead saved in directorypath-to-input-model-folder/RL/(i)env_(j)nb_(k)Msteps_(l)/
. The output RL folder that is obtained by launching the script with settings filesettings_files/settingsRL.txt
can be already found in directorysol_saved/
. -
main_rendezvous_BC.py
is the main file that must be called to train the DNN by BC.All the environment settings, the BC hyper-parameter values and the DNN configuration must be specified in an external text file with ad-hoc formatting, to be placed in folder
settings_files/
and given as input to the script. For example, to start training the DNN by BC with the settings specified in filesettings_files/settingsBC.txt
, use the following command:(myenv)$ python main_rendezvous_BC.py --settings_BC "settingsBC.txt"
The information that must be included in the settings file, together with the right formatting, are described in detail in
settings_files/README.md
. A sample settings file, namedsettingsBC.txt
, is already included in foldersettings_files/
for convenience; you can look at it to better understand how to prepare new "legal" settings files from scratch. The training is performed by using the expert trajectories in the.npz
file contained in theexpert_traj/
folder, whose name is specified in the settings file. Two different training sets (cone20-S1.npz
,cone20-eyelash-S1.npz
) for the environment specified in filesettings_files/settingsBC.txt
are already included in theexpert_traj/
directory. Similar expert trajectories sets for the problem of interest can be directly created by the user by following this guide by Stable Baselines.It is also possible to pre-train a DNN by BC on a given environment, and, subsequently, re-train the same network with RL on the same (or a different) environment. In this case, the program must be called with the command:
(myenv)$ python main_rendezvous_BC.py --settings_BC "settings-BC-name.txt" --settings_RL "settings-RL-name.txt"
where
settings-BC-name.txt
andsettings-RL-name.txt
are the names of the settings file which contains the BC and RL training settings, respectively.At the end of the training, the final BC-trained model (
final_model.zip
), the best BC-trained model found according to the evaluation callback (best_model.zip
), together with all other output files, are saved in directorysol_saved/(i)traj_(j)nb_(k)epochs_(h)frac_(l)/
, where numbers (i), (j), (k) and (h) depend on training settings specified in the settings file, and number (l) depends on how many solutions of the same kind are already contained insol_saved/
folder. If the re-training by RL is also performed, the outputs of training are instead saved in directorysol_saved/BC+RL/(i)traj_(j)nb_(k)epochs_(h)frac_(l)/
. The output BC folder that is obtained by launching the script with settings filesettings_files/settingsBC.txt
can be already found in directorysol_saved/
. -
main_rendezvous_load.py
is the file that allows generating the output files and the plots referred to the nominal (robust) trajectory, obtained by running a trained policy in a deterministic version of the environment. For doing so, it is sufficient to run the command:(myenv)$ python main_rendezvous_load.py --folder "relative-path-to-trained-model-folder/"
All output files and graphs are saved in the same directory.
-
main_rendezvous_MC.py
is the file that allows performing a Monte Carlo simulation of a given policy in a stochastic environment. For doing so, it is sufficient to run the command:(myenv)$ python main_rendezvous_MC.py --folder "relative-path-to-trained-model-folder/"
The graphs and output files are saved in the same directory. The settings of the stochastic environment where the MC simulations are performed are taken from the
settings.txt
file already included in the input directory.
Both main_rendezvous_load.py
and main_rendezvous_MC.py
are, by default, called at the end of the training procedure triggered by both script main_rendezvous_RL.py
and main_rendezvous_BC.py
.
Enjoy the code!