HumanoidsBonn/EnQuery

Overview

Code repository for "Ensemble Policies for Diverse Query-Generation in Preference Alignment of Robot Navigation" by J. de Heuvel, F. Seiler, M. Bennewitz, in Proceedings of the IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), 2024.

Paper website: https://www.hrl.uni-bonn.de/publications/2024/deheuvel24roman

Code structure

The repository consists of:

  • a Python module, which bundles all dependencies and a CLI tool for model training, visualization, and analysis => enquery_extensions
  • examples for specific API use cases => examples/

The Python module comes with an API, which is described below. The enquery module contains the following files:

  • enquery/analysis.py contains stream_plot logic
  • enquery/cli.py contains the CLI initialization
  • enquery/evaluation.py wraps the code for running evaluation episodes and calculating statistics (e.g. collision rate (CR), success rate (SR), ..)
  • enquery/finetuning.py contains the finetuning loop
  • enquery/reward_model.py contains all code to train reward models
  • enquery/storage.py contains the DemonstrationStorage class which is used to generate queries for the user
  • igibson_extensions/* are extensions to iGibson (i.e. tasks and rewards)
  • sb3_extensions/* contains the feature extractor, Ensemble-architecture and saveable Reward Buffer

The next section describes all relevant functionality (and references examples/ where required).

Prerequisites for installation

Creating virtual environment

The following commands will install Miniconda3 and create a virtual environment. This step can be skipped if conda is already installed on the system.

# Install Miniconda
curl -LO http://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
rm Miniconda3-latest-Linux-x86_64.sh

echo "export PATH=$HOME/miniconda3/bin:$PATH" >> ~/.bashrc

conda update conda

# Create virtual environment
conda create -y -n <Name of environment> python=3.8
conda activate <Name of environment>

iGibson installation

The iGibson package is a prerequisite and has to be installed on the system. Refer to the official iGibson documentation for installation instructions.

enquery and CLI usage

Install the enquery_extensions package

The following commands install all dependencies and required extensions. They also install a CLI tool called "enquery", which can be used to train the raw model and the ensemble model.

cd enquery_extensions
pip install -r requirements.txt
pip install -e .

Note: All of the following commands have to be executed from the enquery_extensions folder.

Train raw model

We will use the CLI tool installed above to train a raw model and save the buffer and the SB3 model.

enquery raw_model train --config ./config/turtlebot_nav.yaml --timesteps 50000 --lr 0.0001 --model_path ./outputs/model_raw/ --buffer_path ./outputs/buffer_raw/

Evaluate raw model

Let's test the performance of the raw model and calculate the mean reward over n episodes.

enquery raw_model eval --config ./config/turtlebot_nav.yaml --model_path ./outputs/model_raw.zip --eval_episodes 5 --interactive True

Train ensemble

enquery ensemble train --config ./config/turtlebot_nav.yaml --alpha 0.05 --ensemble_size 3 --timesteps 5000  --model_path ./outputs/model_raw.zip --buffer_path ./outputs/buffer_raw.pkl --ensemble_path ./outputs/ensembles/

Generate demonstrations

We can generate demonstrations using the set of ensemble members. The demonstrations are saved to a file and can be used to analyse the ensemble behavior or to create queries for human feedback. An example of how to work with the saved "demonstrations_path" can be found in examples/analyse_buffer.ipynb. Note that the demonstrations should be filtered for loops and collisions before using them in a human feedback process; this is shown in examples/collect_demonstrations.ipynb.

enquery demonstrations generate --config ./config/turtlebot_nav.yaml --ensemble_path ./outputs/ensembles/ --demonstrations_path ./demonstrations.pkl --episodes 10
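The loop-and-collision filtering mentioned above can be sketched as follows. The record layout (a list of dicts with `collision` and `positions` fields) and the loop heuristic are assumptions for illustration only; the actual format is shown in examples/collect_demonstrations.ipynb.

```python
import pickle  # the real data would be loaded from demonstrations.pkl

def is_loop(positions, cell_size=0.5):
    """Heuristic: a trajectory loops if it revisits a grid cell."""
    visited = set()
    for x, y in positions:
        cell = (round(x / cell_size), round(y / cell_size))
        if cell in visited:
            return True
        visited.add(cell)
    return False

def filter_demonstrations(demos):
    """Keep only collision-free, loop-free demonstrations."""
    return [d for d in demos if not d["collision"] and not is_loop(d["positions"])]

# demos = pickle.load(open("./demonstrations.pkl", "rb"))  # real data
# Hypothetical stand-in records for illustration:
demos = [
    {"collision": False, "positions": [(0, 0), (1, 0), (2, 0)]},  # valid
    {"collision": True,  "positions": [(0, 0), (1, 0)]},          # collision
    {"collision": False, "positions": [(0, 0), (1, 0), (0, 0)]},  # loop
]
clean = filter_demonstrations(demos)
print(len(clean))  # 1
```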

Train reward model

Reward model training is implemented in the finetuning API. For an example of how to use it, have a look at examples/training_api.ipynb.

Finetune agent

Agent finetuning is also implemented in the finetuning API; see examples/training_api.ipynb for an example.
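Since the finetuning API itself is only shown in the notebook, here is a minimal self-contained sketch of the underlying technique: learning a reward from pairwise preferences with a Bradley-Terry model, as used in preference-based RL. The linear reward, the synthetic trajectories, and the training loop are illustrative assumptions; the repository's actual reward model in enquery/reward_model.py may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
obs_dim = 4

def traj_return(w, traj):
    """Trajectory return = sum of per-step linear rewards w . obs."""
    return (traj @ w).sum()

# Synthetic preference pair: a human preferred traj_a over traj_b
traj_a = rng.normal(size=(10, obs_dim))
traj_b = rng.normal(size=(10, obs_dim))

w = np.zeros(obs_dim)
lr = 0.1
for _ in range(200):
    d = traj_return(w, traj_a) - traj_return(w, traj_b)
    p = 1.0 / (1.0 + np.exp(-d))                          # P(a preferred | w)
    grad = -(1.0 - p) * (traj_a.sum(0) - traj_b.sum(0))   # d(-log p)/dw
    w -= lr * grad                                         # gradient descent on NLL

print(traj_return(w, traj_a) > traj_return(w, traj_b))  # True
```

After training, the learned reward ranks the preferred trajectory higher, which is the signal used to finetune the agent.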

Generate stream plot data

The stream plot data for a single model can be generated by executing the following command. In the command, replace <finetuned_model> with the model name.

enquery stream_plot generate --config ./config/turtlebot_nav.yaml --model_path ./models/finetuned/<finetuned_model> --min_coordinate -6 --max_coordinate 6 --streamlines_data_path ./streamlines_data/ --heatmap_data_path ./heatmap_data/ --stream_plot_config_path ../examples/stream_plot_config.json

The plot data can also be generated for all finetuned models by executing the following command.

enquery stream_plot generate --config ./config/turtlebot_nav.yaml --model_path ./models/finetuned/ --min_coordinate -6 --max_coordinate 6 --streamlines_data_path ./streamlines_data/ --heatmap_data_path ./heatmap_data/ --stream_plot_config_path ../examples/stream_plot_config.json

Generate stream plot

The generate stream plot command creates a file containing the direction vectors and a folder containing the heatmap data for all models in --model_path. The data can be plotted using matplotlib. An example can be found in examples/plot_streamplot.ipynb.
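A minimal sketch of such a plot: a synthetic vector field stands in for the saved direction vectors (the on-disk format is defined by the repository, not assumed here), and the coordinate range mirrors --min_coordinate/--max_coordinate from the commands above.

```python
import os
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless rendering
import matplotlib.pyplot as plt

# Grid matching the --min_coordinate -6 / --max_coordinate 6 range
xs = np.linspace(-6, 6, 30)
ys = np.linspace(-6, 6, 30)
X, Y = np.meshgrid(xs, ys)
U, V = -Y, X  # placeholder direction vectors (circular field)

fig, ax = plt.subplots()
# Heatmap underlay (here: vector magnitude) plus streamlines on top
ax.imshow(np.hypot(U, V), extent=[-6, 6, -6, 6], origin="lower", alpha=0.5)
ax.streamplot(X, Y, U, V, color="black", linewidth=0.7)
fig.savefig("streamplot.png")
print(os.path.exists("streamplot.png"))  # True
```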
