More portable conda environment #8

Open
wants to merge 308 commits into
base: main
308 commits
8917028
update setup.py
imgeorgiev Apr 10, 2023
1eecb31
More extensive gradient plots
imgeorgiev Apr 11, 2023
a58cd01
separated alg and env cfgs, added q-value critic to shac2, warp cartp…
krishpop Apr 12, 2023
9d37e7d
update grad plot and collect scripts with more info
imgeorgiev Apr 12, 2023
fd80d3c
correct multi dim gradient plotting
imgeorgiev Apr 12, 2023
175c0da
updated grad_collect to use env for env params, fix ag_return_body de…
krishpop Apr 12, 2023
811b935
update multistep script
imgeorgiev Apr 12, 2023
614a652
update envs
imgeorgiev Apr 12, 2023
9cf7dc9
Merge branch 'benchmarks' of github.com:krishpop/SHAC into iter-grad
imgeorgiev Apr 12, 2023
85117c4
update grad_collect_iter with rew.backward()
imgeorgiev Apr 13, 2023
585d464
merge?
imgeorgiev Apr 14, 2023
115cd5b
update conda env and add debug
imgeorgiev Apr 18, 2023
c39fc22
add debug
imgeorgiev Apr 18, 2023
a4d1e24
formatting
imgeorgiev Apr 18, 2023
41b9957
Revert cartpole rotation
imgeorgiev Apr 18, 2023
f3b4454
added mpc baseline
krishpop Apr 20, 2023
b918478
grad bounce working
imgeorgiev Apr 21, 2023
b81771f
add ground bounce example
imgeorgiev Apr 23, 2023
8f861d4
added datasetq and qcriticmlp for shac2
krishpop Apr 23, 2023
4079d37
reformatted, fixed imports
krishpop Apr 23, 2023
6e5e814
added shac2 implementation with q-function, and model-based/model-fre…
krishpop Apr 23, 2023
841a816
fixes to mpc loading of joints/bodies
krishpop Apr 23, 2023
99e7fdb
fixed ant yaml for shac2 with q-learning
krishpop Apr 23, 2023
4c51f71
fix shac2 memory leak
krishpop Apr 25, 2023
ea01361
Fixed memory leak in shac2, modified configs
krishpop Apr 26, 2023
e92e067
Added done | contact_changed bool to reset envs
krishpop Apr 26, 2023
2dfa547
added truncate horizon on contact to dflex hopper
krishpop Apr 26, 2023
d602088
add jacobians
imgeorgiev Apr 27, 2023
d3a03fd
split up bounce env
imgeorgiev Apr 28, 2023
2631177
update env configs
imgeorgiev Apr 28, 2023
3b22799
make dflex cartpole match warp cartpole params
imgeorgiev Apr 28, 2023
766ca58
update repo settings
imgeorgiev Apr 28, 2023
30d6667
added hopper_warp and updated cartpole cfgs
krishpop Apr 29, 2023
289135b
add grad_bounce optimization
imgeorgiev Apr 29, 2023
520691c
simplify optimization script
imgeorgiev May 1, 2023
968d92f
update grad_collect.py
imgeorgiev May 1, 2023
3b65ef3
Merge branch 'benchmarks' of github.com:krishpop/SHAC into iter-grad
imgeorgiev May 1, 2023
c6a57f3
updated hopper yaml
krishpop May 1, 2023
bbd4466
format load utils
imgeorgiev May 1, 2023
7b2f61d
Merge branch 'benchmarks' of github.com:krishpop/SHAC into iter-grad
imgeorgiev May 1, 2023
2298ee3
fixed cfg loading for train, and env configs
krishpop May 2, 2023
aaa5383
Updated ant and hopper_warp configs
krishpop May 2, 2023
a59b3b7
Added mpc claw env for correct instantiation
krishpop May 2, 2023
c36b338
add test jacobian function
imgeorgiev May 2, 2023
95d5fc9
add dynamic clearing of gradient tape
imgeorgiev May 2, 2023
d7e52ec
add unit test to test jacobian
imgeorgiev May 2, 2023
8fb1d9d
cleanup
imgeorgiev May 2, 2023
4a28d1d
Merge branch 'benchmarks' of github.com:krishpop/SHAC into iter-grad
imgeorgiev May 2, 2023
05c9e5e
Removed SHAC Ant and Hopper configurations and updated Hopper and Ant…
krishpop May 2, 2023
c7c622d
update claw and cartpole configs, save rewards in mpc
krishpop May 2, 2023
0adfdf3
hack to get jacobians in dflex
imgeorgiev May 3, 2023
dfee8b6
update jacobian functions
imgeorgiev May 3, 2023
efdddc7
Added contact truncation directly to dflex model/sim, testing on hopper
krishpop May 3, 2023
9d64593
Enable Wandb and add support for contact truncation in SHAC algorithm
krishpop May 3, 2023
f9d014a
added dflex utils and contact_truncation option
krishpop May 4, 2023
4e266e9
added contact ke/kd to hopper, multi run with slurm launcher
krishpop May 7, 2023
54762d5
added contact ke / kd sweep
krishpop May 8, 2023
07f7fff
add akk grad bounce experiments
imgeorgiev May 8, 2023
209f34d
fix ppo flags
krishpop May 8, 2023
1d9b4e4
stabalize shac and hopper env
imgeorgiev May 9, 2023
d40b128
stabalize configs
imgeorgiev May 9, 2023
e1a70c8
correct shac rollout len config
imgeorgiev May 9, 2023
85cfc9b
ahac v0
imgeorgiev May 9, 2023
4ebcbf0
ahac v0 cleanup
imgeorgiev May 9, 2023
6b07394
add grad bounce clipping notebook
imgeorgiev May 10, 2023
588c784
update ball env analysis
imgeorgiev May 10, 2023
99a2ffa
add jacobian compute to cheetah
imgeorgiev May 10, 2023
c3701d2
Add contact_truncation parameter to SHAC algorithm
krishpop May 11, 2023
78184c1
added contact ke/kd/truncation to ant
krishpop May 12, 2023
2c39cef
tweaks to contact kd in hopper, stiffness sweep
krishpop May 12, 2023
7ddd3bc
more robust jacobian computation
imgeorgiev May 13, 2023
2b270da
semi-working AHAC
imgeorgiev May 13, 2023
610bf40
AHAC value learning
imgeorgiev May 14, 2023
e9aa5c3
remove shac q-critic option
imgeorgiev May 14, 2023
3171406
add fps logging
imgeorgiev May 14, 2023
955648d
correct ahac actor loss
imgeorgiev May 14, 2023
87fd9fb
shac cleanup and bring closer to ahac
imgeorgiev May 14, 2023
8d0d9e9
[temp] semi-adaptive AHAC
imgeorgiev May 15, 2023
9a382a2
sweeps moved to hidden
krishpop May 16, 2023
ec1faf0
added doubleqcriticmlp and options for critic networks in shac.yaml
krishpop May 17, 2023
91ea192
updated shac2 and shac to conform to warp envs syntax
krishpop Jun 7, 2023
960c0d3
Merge pull request #1 from krishpop/warp-envs
krishpop Jun 7, 2023
85904e2
fixed shac2 q-critic
krishpop Jun 7, 2023
8666ec5
update ball env experiments
imgeorgiev Jun 21, 2023
a48d653
update envs with jacobian
imgeorgiev Jun 21, 2023
665acb8
restructure ball env
imgeorgiev Jun 21, 2023
7dd6bb0
minor ahac cleanup
imgeorgiev Jun 21, 2023
d150a4c
add ahac explanation header
imgeorgiev Jun 21, 2023
5111daf
remove runaway pdfs
imgeorgiev Jun 21, 2023
0531ffc
revert minor changes to not mess with PR
imgeorgiev Jun 21, 2023
b42f92c
update ball env to work standalone
imgeorgiev Jun 29, 2023
f90f7a8
update dflex/warp jac tests
imgeorgiev Jun 29, 2023
a59045f
update dflex and warp grad collect scripts
imgeorgiev Jun 29, 2023
dceb755
add hopper analysis notebook
imgeorgiev Jun 29, 2023
c276c46
Merge pull request #2 from krishpop/iter-grad
imgeorgiev Jun 29, 2023
e7659e6
Merge branch 'benchmarks' of github.com:krishpop/SHAC into stabalise
imgeorgiev Jul 6, 2023
ff5e7e6
merge changes
imgeorgiev Jul 6, 2023
a63779b
update AHAC
imgeorgiev Jul 7, 2023
8ae9103
cleanup train.py and simplify PPO instantiation
imgeorgiev Jul 7, 2023
3dd20d7
simplify configs
imgeorgiev Jul 7, 2023
e4757e2
update AHAC and SHAC2 to new format and patchup multirun
imgeorgiev Jul 11, 2023
5723058
ppo sweeping changes
imgeorgiev Jul 18, 2023
d3a1804
streamline environments
imgeorgiev Jul 21, 2023
5448028
slightly clean up shac
imgeorgiev Jul 25, 2023
c4981b4
fix wandb crashed runs
imgeorgiev Jul 27, 2023
b288142
shac simpler logging
imgeorgiev Jul 27, 2023
8b39514
blackify
imgeorgiev Jul 27, 2023
08bcea1
working AHAC for single env
imgeorgiev Jul 28, 2023
723e55d
fix rendering and simplify train
imgeorgiev Jul 31, 2023
3c1ed2e
add contact count and forces logging
imgeorgiev Aug 1, 2023
e885c9e
training continuation fixes
imgeorgiev Aug 1, 2023
d7cf894
fix wandb logging for multiruns
imgeorgiev Aug 3, 2023
2e03536
add jacobian normalization to envs
imgeorgiev Aug 3, 2023
d219c5a
default nan fix for humanoid
imgeorgiev Aug 3, 2023
38991bf
Merge pull request #3 from krishpop/stabalise
krishpop Aug 6, 2023
fee209f
better env inheretence and add reset_all option
imgeorgiev Aug 8, 2023
719baed
better term/trunc interface
imgeorgiev Aug 9, 2023
c2a2e73
working ahac for 1 env
imgeorgiev Aug 10, 2023
6cba2fc
Add critic grad norm to SHAC
imgeorgiev Aug 11, 2023
f4d7e5e
single env ahac cleanup
imgeorgiev Aug 11, 2023
c2c9aaf
update shac2 to new interface
imgeorgiev Aug 11, 2023
ea6103e
tech correct termination interface
imgeorgiev Aug 11, 2023
f75ebb5
:coffee:
imgeorgiev Aug 11, 2023
95d5372
jacobian logging + minor
imgeorgiev Aug 17, 2023
4806a2f
added hopper warp env
krishpop Aug 20, 2023
cb09ab2
cleanup AHAC-1
imgeorgiev Aug 26, 2023
ed3b420
AHAC2 v0.1
imgeorgiev Aug 26, 2023
030c26c
add ahac2 and ahac5
imgeorgiev Sep 7, 2023
d8db4b9
imported reward configs and warp repose env
krishpop Sep 7, 2023
8c1db3a
Merge branch 'ahac' of github.com:krishpop/SHAC into ahac
krishpop Sep 8, 2023
a46918f
SHAC small cleanup
imgeorgiev Sep 9, 2023
9efc9b8
increase num_envs
imgeorgiev Sep 9, 2023
c3b3512
add model.py comments
imgeorgiev Sep 9, 2023
7eb72ed
ahac2/5 double iterative critic & QoL
imgeorgiev Sep 9, 2023
51f0556
remove hydra_resolvers from shac.envs init
krishpop Sep 14, 2023
ddc9e4a
update humanoid_snu and ppo
imgeorgiev Sep 17, 2023
1273ce3
add SVG
imgeorgiev Sep 17, 2023
1f77962
update grad_bounce
imgeorgiev Sep 20, 2023
8a3ca3c
config updates + snu_humanoid fix
imgeorgiev Sep 20, 2023
99b7475
snuhumanoid fix
imgeorgiev Sep 20, 2023
cac23af
rlgames compat for dflexenv
krishpop Sep 20, 2023
d4b7a0e
update train for SVG
imgeorgiev Sep 21, 2023
adf96a0
update svg submodule
imgeorgiev Sep 21, 2023
7a5859d
add dmanip configs
imgeorgiev Sep 21, 2023
f54c6e6
Merge remote-tracking branch 'origin/ahac' into ahac-warp
krishpop Sep 22, 2023
3bcd473
add anymal env v0
imgeorgiev Sep 22, 2023
d7c51ad
updated repose yaml
krishpop Sep 22, 2023
a725d23
imported reward configs and warp repose env
krishpop Sep 7, 2023
06539b5
remove hydra resolvers
krishpop Sep 22, 2023
fdc087c
fix configs
krishpop Sep 22, 2023
99bf86e
review changes
krishpop Sep 22, 2023
6280625
Merge pull request #4 from krishpop/ahac-warp
imgeorgiev Sep 22, 2023
0378b25
:coffee:
imgeorgiev Sep 22, 2023
12b9c4a
stable animal env
imgeorgiev Sep 23, 2023
3b2a5e4
add SAC and tune some params
imgeorgiev Sep 26, 2023
8b3c5ef
ahac-1 fix
imgeorgiev Sep 26, 2023
916f5f9
ahac contact norm and config
imgeorgiev Sep 26, 2023
5caff28
add more svg baselines
krishpop Oct 17, 2023
03664b4
add heaviside examples
imgeorgiev Dec 14, 2023
1b600bf
blackify
imgeorgiev Jul 27, 2023
a74efa8
working AHAC for single env
imgeorgiev Jul 28, 2023
f8e25e3
fix rendering and simplify train
imgeorgiev Jul 31, 2023
2f2fdc3
add contact count and forces logging
imgeorgiev Aug 1, 2023
38134c0
training continuation fixes
imgeorgiev Aug 1, 2023
aa48602
fix wandb logging for multiruns
imgeorgiev Aug 3, 2023
24b911a
add jacobian normalization to envs
imgeorgiev Aug 3, 2023
6ce057a
default nan fix for humanoid
imgeorgiev Aug 3, 2023
8e1eca7
better env inheretence and add reset_all option
imgeorgiev Aug 8, 2023
ca7ccdc
better term/trunc interface
imgeorgiev Aug 9, 2023
27c329c
working ahac for 1 env
imgeorgiev Aug 10, 2023
117c0e6
Add critic grad norm to SHAC
imgeorgiev Aug 11, 2023
f1c44f9
single env ahac cleanup
imgeorgiev Aug 11, 2023
cf76550
update shac2 to new interface
imgeorgiev Aug 11, 2023
05f0a6e
tech correct termination interface
imgeorgiev Aug 11, 2023
248bd33
:coffee:
imgeorgiev Aug 11, 2023
7be599a
jacobian logging + minor
Jan 29, 2024
4166ff9
imported reward configs and warp repose env
krishpop Sep 7, 2023
a3e6e02
cleanup AHAC-1
imgeorgiev Aug 26, 2023
24f8cae
AHAC2 v0.1
imgeorgiev Aug 26, 2023
0733326
add ahac2 and ahac5
imgeorgiev Sep 7, 2023
5a0c668
remove hydra_resolvers from shac.envs init
krishpop Sep 14, 2023
3c67165
rlgames compat for dflexenv
krishpop Sep 20, 2023
cdfc32e
SHAC small cleanup
imgeorgiev Sep 9, 2023
5cb60fa
increase num_envs
imgeorgiev Sep 9, 2023
baf210f
add model.py comments
imgeorgiev Sep 9, 2023
7a6a1ed
ahac2/5 double iterative critic & QoL
imgeorgiev Sep 9, 2023
588cc98
update humanoid_snu and ppo
imgeorgiev Sep 17, 2023
8ed9d2c
add SVG
imgeorgiev Sep 17, 2023
f05b699
update grad_bounce
imgeorgiev Sep 20, 2023
6908fb9
config updates + snu_humanoid fix
imgeorgiev Sep 20, 2023
06462bc
snuhumanoid fix
imgeorgiev Sep 20, 2023
945499a
update train for SVG
imgeorgiev Sep 21, 2023
c814c3c
update svg submodule
imgeorgiev Sep 21, 2023
35a89c3
add dmanip configs
imgeorgiev Sep 21, 2023
ed37c08
updated repose yaml
krishpop Sep 22, 2023
2a79c62
imported reward configs and warp repose env
krishpop Sep 7, 2023
e0a658e
remove hydra resolvers
krishpop Sep 22, 2023
5b40fec
fix configs
krishpop Sep 22, 2023
b16ccea
review changes
krishpop Sep 22, 2023
bdd8269
add name arg to ahac/shac
Jan 29, 2024
e933e64
upkeep
krishpop Feb 12, 2024
12a5948
shac common hyper-parameters and upright reward tune
krishpop Feb 12, 2024
f5c1842
integrate dflex envs
Feb 28, 2024
99eeea2
Merge branch 'ahac' into main
Feb 28, 2024
5b36e17
Revert "Merge branch 'ahac' into main"
Feb 28, 2024
fb0e1e1
Revert "Revert "Merge branch 'ahac' into main""
Feb 28, 2024
f8f81fb
upgrade to urchin
imgeorgiev Mar 1, 2024
4998c41
add dflex dependenices
imgeorgiev Mar 1, 2024
66b19f5
upgrade dflex to python 3.9+
imgeorgiev Mar 1, 2024
8db5b95
include assets
imgeorgiev Mar 1, 2024
94749ad
take 2
imgeorgiev Mar 1, 2024
ce301d1
take 3
imgeorgiev Mar 1, 2024
4ca90c3
take 4
imgeorgiev Mar 1, 2024
fdef70b
take 5
imgeorgiev Mar 1, 2024
1b6523f
take 6
imgeorgiev Mar 1, 2024
5d4a807
take 7
imgeorgiev Mar 1, 2024
c8d4baa
I'm lost
imgeorgiev Mar 1, 2024
2b686c8
take 8
imgeorgiev Mar 1, 2024
8dccfbb
setup works! cleanup!
imgeorgiev Mar 1, 2024
02b7f26
support for td-mpc
imgeorgiev Mar 7, 2024
31ddb66
update other envs
imgeorgiev Mar 8, 2024
f7f28e3
update anymal env
imgeorgiev Mar 8, 2024
6e78459
add anymal assets
imgeorgiev Mar 8, 2024
952a13c
add anymal assets
imgeorgiev Mar 8, 2024
70a57e5
parametarize env rewards
imgeorgiev Mar 12, 2024
1755095
add primals and smooth rewards
imgeorgiev Mar 12, 2024
62a5132
add log-barrier-clip
imgeorgiev Mar 12, 2024
29ba65c
remove src/shac/envs
imgeorgiev Mar 26, 2024
a8bd600
remove examples folder
imgeorgiev Mar 26, 2024
1468b9b
remove outdated scripts and configs
imgeorgiev Mar 26, 2024
247f324
update conda env file
imgeorgiev Mar 26, 2024
0b696bd
update configs to use dflex configs
imgeorgiev Mar 26, 2024
2d4cab6
rl_games compatibility
imgeorgiev Mar 26, 2024
94f7ac6
remove rl_games fork
imgeorgiev Mar 26, 2024
e328c20
cleanup algorithms
imgeorgiev Mar 26, 2024
4666a3a
alg update
imgeorgiev Mar 26, 2024
6537d50
train script cleanup
imgeorgiev Mar 26, 2024
f666443
update readme
imgeorgiev Mar 26, 2024
9f20398
typo
imgeorgiev Mar 26, 2024
6b1dedb
add video
imgeorgiev Mar 26, 2024
53a3227
readme typo
imgeorgiev Mar 26, 2024
c73c095
update title
imgeorgiev Mar 26, 2024
8abefba
update heaviside example
imgeorgiev Mar 27, 2024
9b09828
add results and figures
imgeorgiev Mar 29, 2024
00b2671
add heaviside figures
imgeorgiev Mar 29, 2024
6bf5e2d
update cartpole to new format
imgeorgiev Apr 21, 2024
0410ec5
add double pendulum
imgeorgiev Apr 23, 2024
da35b49
add easy access to DoublePendulumEnv class
imgeorgiev Apr 23, 2024
d4c154f
double pendulum fix
imgeorgiev Apr 23, 2024
bb59db5
force_reset bugfix
imgeorgiev May 3, 2024
27 changes: 27 additions & 0 deletions .gitignore
@@ -0,0 +1,27 @@
**__pycache__/
**.ipynb_checkpoints/
*outputs/
*.swp
*.swo
tags
**.out
**.log
**.pdf

dflex/dflex/kernels*/
**logs/
jobs/

**data/
**.egg-info/

wandb/
checkpoints/
multirun/

scripts/sweeps/
scripts/outputs
scripts/runs/
good-shit

**.DS_Store
3 changes: 3 additions & 0 deletions .gitmodules
@@ -0,0 +1,3 @@
[submodule "externals/svg"]
path = externals/svg
url = git@github.com:imgeorgiev/svg.git
6 changes: 6 additions & 0 deletions .vscode/settings.json
@@ -0,0 +1,6 @@
{
"[python]": {
"editor.defaultFormatter": "ms-python.black-formatter"
},
"python.formatting.provider": "none",
}
110 changes: 33 additions & 77 deletions README.md
100755 → 100644
@@ -1,110 +1,66 @@
# SHAC
# Adaptive Horizon Actor Critic (AHAC)

This repository contains the implementation for the paper [Accelerated Policy Learning with Parallel Differentiable Simulation](https://short-horizon-actor-critic.github.io/) (ICLR 2022).
This repository contains the implementation for the paper [Adaptive Horizon Actor-Critic for Policy Learning in Contact-Rich Differentiable Simulation](https://adaptive-horizon-actor-critic.github.io/) (ICML 2024).

In this paper, we build on previous work in policy optimization with differentiable simulation to create Adaptive Horizon Actor Critic (AHAC). Our approach deals with gradient error arising from stiff contact by dynamically adapting its model-based horizon to fit one robot gait and avoid excessive contact. The result is an algorithm that performs better and is easier to use than its predecessor [Short Horizon Actor Critic (SHAC)](https://short-horizon-actor-critic.github.io/), while also outperforming PPO by 40% across a set of high-dimensional locomotion tasks.


In this paper, we present a GPU-based differentiable simulation and propose a policy learning method named SHAC leveraging the developed differentiable simulation. We provide a comprehensive benchmark set for policy learning with differentiable simulation. The benchmark set contains six robotic control problems for now as shown in the figure below.

<p align="center">
<img src="figures/envs.png" alt="envs" width="800" />
</p>
[![Watch the video](figures/envs.png)](https://adaptive-horizon-actor-critic.github.io/media/all_envs_trimmed.mp4)

## Installation

- `git clone https://github.com/NVlabs/DiffRL.git --recursive`

- The code has been tested on
- Operating System: Ubuntu 16.04, 18.04, 20.04, 21.10, 22.04
- Python Version: 3.7, 3.8
- GPU: TITAN X, RTX 1080, RTX 2080, RTX 3080, RTX 3090, RTX 3090 Ti

#### Prerequisites

- In the project folder, create a virtual environment in Anaconda:

```
conda env create -f diffrl_conda.yml
conda activate shac
```

- dflex
`git clone https://github.com/imgeorgiev/DiffRL --recursive`

```
cd dflex
pip install -e .
```

- rl_games, forked from [rl-games](https://github.com/Denys88/rl_games) (used for PPO and SAC training):

````
cd externals/rl_games
pip install -e .
````

- Install an older version of protobuf required for TensorboardX:
````
pip install protobuf==3.20.0
````

#### Test Examples

A test example can be found in the `examples` folder.
Set up this project with Anaconda:
```
conda env create -f environment.yml
conda activate diffrl
pip install -e dflex
pip install -e .
```

For reasons that are not yet understood, you need to symlink the CUDA libraries for ninja to work:
```
python test_env.py --env AntEnv
ln -s $CONDA_PREFIX/lib $CONDA_PREFIX/lib64
```
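The symlink step can also be scripted so that it is safe to re-run. The following is a minimal sketch, assuming the conda environment is active so that `CONDA_PREFIX` is set; the helper name `ensure_lib64_symlink` is hypothetical and not part of this repository.

```python
import os
from pathlib import Path


def ensure_lib64_symlink(prefix: str) -> Path:
    """Create <prefix>/lib64 -> <prefix>/lib unless lib64 already exists.

    Hypothetical helper for illustration; not part of the repository.
    """
    lib = Path(prefix) / "lib"
    lib64 = Path(prefix) / "lib64"
    if not lib.is_dir():
        raise FileNotFoundError(f"{lib} does not exist")
    # Leave an existing lib64 (real directory or symlink) untouched.
    if not (lib64.exists() or lib64.is_symlink()):
        os.symlink(lib, lib64, target_is_directory=True)
    return lib64
```

With the environment activated, it could be called as `ensure_lib64_symlink(os.environ["CONDA_PREFIX"])`.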

If the console outputs `Finish Successfully` in the last line, the code installation succeeds.
If you want SVG as a baseline:

```
pip install -e externals/svg
```

## Training

Running the following command in the `examples` folder trains Ant with SHAC.
```
python train_shac.py --cfg ./cfg/shac/ant.yaml --logdir ./logs/Ant/shac
python train.py alg=ahac env=ant
```

We also provide a one-line script, `examples/train_script.sh`, to replicate the results reported in the paper for both our method and the baseline methods. The results might differ slightly from the paper due to CUDA randomness and differences in operating system, GPU, and Python versions. The plots reported in the paper were produced with a TITAN X on Ubuntu 16.04.

#### SHAC (Our Method)
where you can change `alg` and `env` freely based on the provided Hydra configurations.

For example, running the following commands in the `examples` folder trains the Ant and SNU Humanoid (Humanoid MTU in the paper) environments with SHAC, each for 5 individual seeds.
The training script outputs TensorBoard logs by default. If you want to use wandb, add the flag `general.run_wandb=True` and specify `wandb.project=<name>` and `wandb.entity=<entity>`.
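When launching several runs, the override strings can be assembled programmatically. The sketch below only builds the command line from the flags named above; `build_train_command` is a hypothetical helper, and the project and entity values in the usage note are placeholders.

```python
import shlex
from typing import Optional


def build_train_command(
    alg: str,
    env: str,
    wandb_project: Optional[str] = None,
    wandb_entity: Optional[str] = None,
) -> str:
    """Compose a `python train.py ...` command from Hydra-style overrides.

    Hypothetical helper for illustration; not part of the repository.
    """
    args = ["python", "train.py", f"alg={alg}", f"env={env}"]
    if wandb_project is not None:
        # wandb logging is opt-in; TensorBoard remains the default.
        args += ["general.run_wandb=True", f"wandb.project={wandb_project}"]
        if wandb_entity is not None:
            args.append(f"wandb.entity={wandb_entity}")
    # shlex.join quotes any argument that needs it for a POSIX shell.
    return shlex.join(args)
```

For instance, `build_train_command("ahac", "ant", "my-project", "my-team")` yields a command with the wandb overrides appended, while `build_train_command("ahac", "ant")` reproduces the plain training command.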

```
python train_script.py --env Ant --algo shac --num-seeds 5
```
Note that dflex is not fully deterministic due to GPU acceleration and cannot reproduce the same results given the same seed.
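The README does not say where the nondeterminism comes from. One common cause on GPUs, offered here as an assumption rather than a statement about dflex, is that parallel reductions may accumulate floating-point values in a different order from run to run, and floating-point addition is not associative. A minimal CPU-only illustration:

```python
from typing import Tuple


def grouped_sums(a: float, b: float, c: float) -> Tuple[float, float]:
    """Return the same three-term sum under two different groupings."""
    return (a + b) + c, a + (b + c)


first, second = grouped_sums(0.1, 0.2, 0.3)
# The two groupings round differently in IEEE 754 double precision.
print(first == second)  # False
```

Small differences like this can compound over long simulated rollouts, which is why results are best compared across several seeds rather than from a single run.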

```
python train_script.py --env SNUHumanoid --algo shac --num-seeds 5
```

#### Baseline Algorithms
## Testing

For example, running the following command in the `examples` folder trains the Ant environment with PPO, as implemented in RL_games, for 5 individual seeds:
You can load a policy and evaluate it without training. This works only for the AHAC and SHAC algorithms.

```
python train_script.py --env Ant --algo ppo --num-seeds 5
python train.py alg=ahac env=ant train=False checkpoint=<policy_path>
```

## Testing
You can also control the number of eval episodes with `env.player.games_num=10`.

To test a trained policy, pass the policy checkpoint to the training script and add the `--play` flag to indicate that the run is for testing. For example, the following command tests a trained policy (assuming the policy is located at `logs/Ant/shac/policy.pt`):
## Generating rendering files

```
python train_shac.py --cfg ./cfg/shac/ant.yaml --checkpoint ./logs/Ant/shac/policy.pt --play [--render]
```
The `general.render` flag controls whether a video of the task execution is exported. When it is set, the exported video is encoded in `.usd` format and stored in the `examples/output` folder. To visualize the exported `.usd` file, refer to [USD at NVIDIA](https://developer.nvidia.com/usd).

The `--render` flag controls whether a video of the task execution is exported. When it is set, the exported video is encoded in `.usd` format and stored in the `examples/output` folder. To visualize the exported `.usd` file, refer to [USD at NVIDIA](https://developer.nvidia.com/usd).
```
python train.py alg=ahac env=ant general.train=False general.render=True general.checkpoint=<policy_path> env.config.stochastic_init=False env.player.games_num=1 env.player.num_actors=1 env.config.num_envs=1 alg.eval_runs=1
```
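The dotted keys in the command above address nested fields of the Hydra configuration, whereas `alg=ahac` and `env=ant` select config group options. As a rough mental model only (this is not Hydra's parser, and `overrides_to_nested` is a hypothetical name), each `a.b.c=value` override sets one leaf in a nested mapping:

```python
from typing import Any, Dict, List


def overrides_to_nested(overrides: List[str]) -> Dict[str, Any]:
    """Expand 'a.b.c=value' strings into a nested dict; values stay strings.

    Simplified illustration only; Hydra's real override grammar is richer.
    """
    config: Dict[str, Any] = {}
    for item in overrides:
        key, value = item.split("=", 1)
        *parents, leaf = key.split(".")
        node = config
        for part in parents:
            # Descend into the nested mapping, creating levels as needed.
            node = node.setdefault(part, {})
        node[leaf] = value
    return config


nested = overrides_to_nested(
    [
        "general.train=False",
        "general.render=True",
        "env.config.num_envs=1",
        "env.player.games_num=1",
    ]
)
```

Here `nested["env"]["config"]["num_envs"]` is the string `"1"`, which mirrors how `env.config.num_envs=1` targets a single field without touching the rest of the `env` configuration.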

## Citation
Once you have generated a rendering file, you can load it in USD Composer to produce an image or video render like the one above. To install Omniverse, follow the [Omniverse Install Page](https://www.nvidia.com/en-us/omniverse/download/). Then install [USD Composer](https://www.nvidia.com/en-us/omniverse/apps/create/) from the Omniverse GUI. Start USD Composer and load the `.usd` files generated by the script above.

If you find our paper or code is useful, please consider citing:
```kvk
@inproceedings{xu2021accelerated,
title={Accelerated Policy Learning with Parallel Differentiable Simulation},
author={Xu, Jie and Makoviychuk, Viktor and Narang, Yashraj and Ramos, Fabio and Matusik, Wojciech and Garg, Animesh and Macklin, Miles},
booktitle={International Conference on Learning Representations},
year={2021}
}
```
