This repository was archived by the owner on Apr 25, 2023. It is now read-only.

A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)


skyhookadventure/soft_optim

 
 


Soft Optimization

Use

Setup

First, run the setup script (replace -j with the correct user):

sh ./setup.sh -j

Running a script

To run directly:

poetry run python soft_optim/fine_tune.py

To launch with Accelerate, use:

accelerate launch --config_file configs/deepspeed_configs/default_configs.yml examples/simulacra_tmp.py
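
The two launch modes above can be wrapped in a small helper. This is only a sketch: the function name launch_cmd and its mode names (direct, accelerate) are hypothetical and not part of this repository. It prints the command for the chosen mode rather than running it, so the command can be inspected first.

```shell
# Hypothetical helper (not part of this repo): print the launch command
# for a given mode instead of executing it.
launch_cmd() {
  case "$1" in
    direct)
      # Plain single-process run through Poetry, as shown above.
      echo "poetry run python soft_optim/fine_tune.py"
      ;;
    accelerate)
      # Launch through Accelerate with the DeepSpeed config, as shown above.
      echo "accelerate launch --config_file configs/deepspeed_configs/default_configs.yml examples/simulacra_tmp.py"
      ;;
    *)
      # Unknown mode: report on stderr and signal failure.
      echo "unknown mode: $1" >&2
      return 1
      ;;
  esac
}
```

Calling launch_cmd accelerate prints the Accelerate command. To actually run it from the repository root, one option is sh -c "$(launch_cmd accelerate)".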

