Code repository for "Ensemble Policies for Diverse Query-Generation in Preference Alignment of Robot Navigation" by J. de Heuvel, F. Seiler, M. Bennewitz, in Proceedings of the IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), 2024.
Paper website: https://www.hrl.uni-bonn.de/publications/2024/deheuvel24roman
The repository consists of:
- a Python module that bundles all dependencies and a CLI tool for model training, visualization, and analysis => enquery_extensions
- examples for specific API use cases => examples/
The Python module comes with an API, which is described below. The enquery module contains the following files:
- enquery/analysis.py contains stream_plot logic
- enquery/cli.py contains the CLI initialization
- enquery/evaluation.py wraps the code for running evaluation episodes and calculating statistics (e.g., collision rate (CR), success rate (SR), ...)
- enquery/finetuning.py contains the finetuning loop
- enquery/reward_model.py contains all code to train reward models
- enquery/storage.py contains the DemonstrationStorage class which is used to generate queries for the user
- igibson_extensions/* are extensions to iGibson (i.e. tasks and rewards)
- sb3_extensions/* contains the feature extractor, Ensemble-architecture and saveable Reward Buffer
The next section describes all relevant functionality (and references examples/ where required).
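As an illustration of the evaluation statistics mentioned above, the sketch below computes episode-level rates, assuming CR and SR denote collision rate and success rate. This is a generic sketch, not the actual code from enquery/evaluation.py.

```python
from dataclasses import dataclass

@dataclass
class EpisodeResult:
    success: bool      # goal reached
    collision: bool    # episode ended in a collision

def evaluation_stats(results):
    """Compute success rate (SR) and collision rate (CR) over episodes."""
    n = len(results)
    sr = sum(r.success for r in results) / n
    cr = sum(r.collision for r in results) / n
    return sr, cr

results = [
    EpisodeResult(success=True,  collision=False),
    EpisodeResult(success=False, collision=True),
    EpisodeResult(success=True,  collision=False),
    EpisodeResult(success=False, collision=False),
]
sr, cr = evaluation_stats(results)
print(f"SR={sr:.2f} CR={cr:.2f}")  # SR=0.50 CR=0.25
```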
The following commands will install Miniconda3 and create a virtual environment. This step can be skipped if conda is already installed on the system.
# Install Miniconda
curl -LO http://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
rm Miniconda3-latest-Linux-x86_64.sh
echo "export PATH=$HOME/miniconda3/bin:$PATH" >> ~/.bashrc
conda update conda
# Create virtual environment
conda create -y -n <Name of environment> python=3.8
conda activate <Name of environment>
The iGibson package is a prerequisite and has to be installed on the system. Installation instructions can be found in the iGibson documentation.
The following commands will install all dependencies and required extensions. It will also install a CLI tool called "enquery" which can be used to train the raw model and the ensemble model.
cd enquery_extensions
pip install -r requirements.txt
pip install -e .
Note: All the following commands have to be executed from the enquery_extensions folder.
We will use the CLI tool installed above to train a raw_model and save the buffer and SB3 model.
enquery raw_model train --config ./config/turtlebot_nav.yaml --timesteps 50000 --lr 0.0001 --model_path ./outputs/model_raw/ --buffer_path ./outputs/buffer_raw/
Let's test the performance of the raw model and calculate the mean reward for n episodes.
enquery raw_model eval --config ./config/turtlebot_nav.yaml --model_path ./outputs/model_raw.zip --eval_episodes 5 --interactive True
Next, we train an ensemble of policies based on the raw model and its buffer.
enquery ensemble train --config ./config/turtlebot_nav.yaml --alpha 0.05 --ensemble_size 3 --timesteps 5000 --model_path ./outputs/model_raw.zip --buffer_path ./outputs/buffer_raw.pkl --ensemble_path ./outputs/ensembles/
We can generate demonstrations using the set of ensemble members. The demonstrations are saved to a file and can be used to analyze the ensemble behavior or to create queries for human feedback. An example of how to work with the saved "demonstrations_path" can be found at examples/analyse_buffer.ipynb. Note that the demonstrations should be filtered for loops and collisions before using them in a human feedback process. This is shown in examples/collect_demonstrations.ipynb.
enquery demonstrations generate --config ./config/turtlebot_nav.yaml --ensemble_path ./outputs/ensembles/ --demonstrations_path ./demonstrations.pkl --episodes 10
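As a minimal sketch of working with a saved demonstrations file, the snippet below loads a pickle and filters out episodes with loops or collisions. The per-episode record layout here is an assumption for illustration only; the actual format is shown in examples/analyse_buffer.ipynb and examples/collect_demonstrations.ipynb.

```python
import pickle

# Write a tiny stand-in for the file produced by
# `enquery demonstrations generate` so this sketch is self-contained.
# NOTE: the dict keys below are hypothetical, not the repository's format.
demos = [
    {"observations": [], "collision": False, "loop": False},
    {"observations": [], "collision": True,  "loop": False},
    {"observations": [], "collision": False, "loop": True},
]
with open("demonstrations.pkl", "wb") as f:
    pickle.dump(demos, f)

# Load the stored demonstrations and keep only clean episodes
# before using them as queries in a human feedback process.
with open("demonstrations.pkl", "rb") as f:
    demonstrations = pickle.load(f)

clean = [d for d in demonstrations if not d["collision"] and not d["loop"]]
print(f"{len(clean)} of {len(demonstrations)} demonstrations remain")  # 1 of 3
```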
This is implemented in the finetuning API. To see an example of how to use the finetuning API, please have a look at examples/training_api.ipynb.
The stream plot data for a single model can be generated by executing the following command. In the command, replace <finetuned_model> with the model name.
enquery stream_plot generate --config ./config/turtlebot_nav.yaml --model_path ./models/finetuned/<finetuned_model> --min_coordinate -6 --max_coordinate 6 --streamlines_data_path ./streamlines_data/ --heatmap_data_path ./heatmap_data/ --stream_plot_config_path ../examples/stream_plot_config.json
The plot data can also be generated for all finetuned models by executing the following command.
enquery stream_plot generate --config ./config/turtlebot_nav.yaml --model_path ./models/finetuned/ --min_coordinate -6 --max_coordinate 6 --streamlines_data_path ./streamlines_data/ --heatmap_data_path ./heatmap_data/ --stream_plot_config_path ../examples/stream_plot_config.json
The stream_plot generate command creates a file containing the direction vectors and a folder containing the heatmap data for all models in --model_path. The data can be plotted using matplotlib. An example can be found in examples/plot_streamplot.ipynb.
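As a rough sketch of plotting such data with matplotlib: the grid and direction vectors below are synthetic stand-ins generated on the spot, since the actual file layout is documented in examples/plot_streamplot.ipynb; only the coordinate range matches the --min_coordinate/--max_coordinate values above.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend, no display needed
import matplotlib.pyplot as plt

# Synthetic stand-in for the direction-vector data written by
# `enquery stream_plot generate`.
coords = np.linspace(-6, 6, 30)   # matches --min/--max_coordinate
X, Y = np.meshgrid(coords, coords)
U, V = -Y, X                      # dummy rotational direction field

fig, ax = plt.subplots()
ax.streamplot(X, Y, U, V, density=1.0)
ax.set_xlabel("x [m]")
ax.set_ylabel("y [m]")
fig.savefig("streamplot.png")
```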