MBRL #237

Draft: wants to merge 102 commits into base: main

102 commits
e043d2f
First commit to re-weight
qiaoting159753 Mar 24, 2024
0beef63
Bring weights to device
qiaoting159753 Mar 24, 2024
58e36f3
Dyna Name error
qiaoting159753 Mar 24, 2024
60279b0
a line of comment
qiaoting159753 Mar 28, 2024
ffbc94e
Merge branch 'main' of https://github.com/UoA-CARES/cares_reinforceme…
qiaoting159753 Mar 28, 2024
26a4e2e
Update DYNA_SAC_Reweight.py
qiaoting159753 Apr 3, 2024
3bd0bf2
Merge branch 'main' of https://github.com/UoA-CARES/cares_reinforceme…
qiaoting159753 Apr 11, 2024
89b9fa2
Merge branch 'main' of https://github.com/UoA-CARES/cares_reinforceme…
qiaoting159753 Apr 22, 2024
58f9b2c
nothing
qiaoting159753 Apr 22, 2024
1ab2793
Change the SAC configurations and alpha learning rate.
qiaoting159753 Apr 22, 2024
65901f4
Correct refer to the parameters for SAC and DynaSAC
qiaoting159753 Apr 22, 2024
ca21ca3
SAC sample to forward.
qiaoting159753 Apr 22, 2024
b1abac3
set G_model to float for now.
qiaoting159753 Apr 22, 2024
d726a48
Merge branch 'refactor/sac_alpha_lr' of https://github.com/UoA-CARES/…
qiaoting159753 Apr 22, 2024
24d0f29
set G_model to float for now.
qiaoting159753 Apr 22, 2024
e42994b
Changed sum tree position
qiaoting159753 Apr 22, 2024
a01b112
Change different way of training and using reward networks.
qiaoting159753 Apr 22, 2024
9badc15
Reweight
qiaoting159753 Apr 22, 2024
fcce509
network_factory.py add alpha_lr parameter for SAC.
qiaoting159753 Apr 23, 2024
157a9b7
Update network_factory.py
qiaoting159753 Apr 24, 2024
5e304e6
Merge branch 'main' of https://github.com/UoA-CARES/cares_reinforceme…
qiaoting159753 Apr 24, 2024
32bcb07
Use Ensmeble of network and simple reward approximator.
qiaoting159753 Apr 24, 2024
137c5f6
reward network to device
qiaoting159753 Apr 24, 2024
aa3f958
reward network to device
qiaoting159753 Apr 28, 2024
b1ed6da
reward network to device
qiaoting159753 Apr 29, 2024
6727cf1
test with simple R.
qiaoting159753 Apr 29, 2024
437c8a8
test with simple R.
qiaoting159753 Apr 29, 2024
e55dc23
test with simple R.
qiaoting159753 Apr 29, 2024
b700836
Merge remote-tracking branch 'origin/main' into dev/weighted_loss
qiaoting159753 May 16, 2024
cae6ef4
Many algorithms.
qiaoting159753 May 16, 2024
3521a28
full variance.
qiaoting159753 May 16, 2024
c268897
Exacerbate the variance difference.
qiaoting159753 May 16, 2024
f68c9c4
Exacerbate the variance difference.
qiaoting159753 May 17, 2024
e83c123
Maximize the variance rescale.
qiaoting159753 May 17, 2024
05a56b2
Ablation exp
qiaoting159753 May 23, 2024
52d7027
Adjust algorithm.
qiaoting159753 Jun 12, 2024
78464b9
Adjust algorithm.
qiaoting159753 Jun 12, 2024
3e75218
clean up
qiaoting159753 Jun 16, 2024
99dd56e
clean up
qiaoting159753 Jun 16, 2024
ec841e4
gamma square
qiaoting159753 Jun 16, 2024
4cdf0e7
reweight_actor
qiaoting159753 Jun 22, 2024
1e722e3
reweight_actor
qiaoting159753 Jun 22, 2024
9a59f3f
Add baselines.
qiaoting159753 Jun 22, 2024
3445129
Add baselines.
qiaoting159753 Jun 23, 2024
7626722
Add baselines.
qiaoting159753 Jun 23, 2024
c1017d1
typo
qiaoting159753 Jun 23, 2024
82c95c7
typo
qiaoting159753 Jun 23, 2024
f14639e
typo
qiaoting159753 Jun 23, 2024
6491f54
typo
qiaoting159753 Jun 24, 2024
ec3f836
ablation_actor
qiaoting159753 Jul 6, 2024
ad8437f
typo
qiaoting159753 Jul 6, 2024
46e1ed8
typo
qiaoting159753 Jul 7, 2024
455c60a
typo
qiaoting159753 Jul 8, 2024
b22ad96
typo
qiaoting159753 Jul 12, 2024
3c53a92
Distinguish predict reward with next state and with current state and…
qiaoting159753 Jul 14, 2024
e67eebe
Clean Up and Add the Combo
qiaoting159753 Jul 16, 2024
dc948ad
Clean Up and Add the Combo
qiaoting159753 Jul 24, 2024
36dea8b
Tidy up ensemble models.
qiaoting159753 Aug 2, 2024
5e41d3f
steve
qiaoting159753 Aug 6, 2024
158c6cf
to device
qiaoting159753 Aug 6, 2024
777745e
naming convention
qiaoting159753 Aug 8, 2024
5f90177
naming convention
qiaoting159753 Aug 8, 2024
3b3449d
naming convention
qiaoting159753 Aug 8, 2024
9a2f73a
typo
qiaoting159753 Aug 8, 2024
aefeef9
typo
qiaoting159753 Aug 9, 2024
68803dd
typo
qiaoting159753 Aug 9, 2024
2124c71
typo
qiaoting159753 Aug 9, 2024
c866d80
typo
qiaoting159753 Aug 9, 2024
65d7d4c
typo
qiaoting159753 Aug 9, 2024
3b2d367
naming
qiaoting159753 Aug 9, 2024
2641b36
naming issue
qiaoting159753 Aug 9, 2024
295bcf0
space
qiaoting159753 Aug 9, 2024
5024671
Merge branch 'main' into dev/weighted_loss
qiaoting159753 Nov 14, 2024
641e2e0
Merge remote-tracking branch 'origin/main' into dev/weighted_loss
qiaoting159753 Dec 20, 2024
0e7f6af
merge
qiaoting159753 Dec 21, 2024
b7b1963
merge
qiaoting159753 Dec 21, 2024
0842673
merge
qiaoting159753 Dec 21, 2024
53b578d
merge
qiaoting159753 Dec 24, 2024
671e18c
merge
qiaoting159753 Dec 27, 2024
503addb
merge
qiaoting159753 Dec 27, 2024
1f398ee
merge
qiaoting159753 Dec 27, 2024
f22a069
merge
qiaoting159753 Dec 27, 2024
bd87ed5
Auto-format code 🧹🌟🤖
github-actions[bot] Dec 28, 2024
e983732
merge
qiaoting159753 Dec 29, 2024
659d6cd
Merge remote-tracking branch 'origin/dev/weighted_loss' into dev/weig…
qiaoting159753 Dec 29, 2024
051d3f0
merge
qiaoting159753 Dec 29, 2024
7884942
Auto-format code 🧹🌟🤖
github-actions[bot] Dec 29, 2024
3e2127b
Fix reward learning
qiaoting159753 Dec 30, 2024
5e25ad4
Merge remote-tracking branch 'origin/dev/weighted_loss' into dev/weig…
qiaoting159753 Dec 30, 2024
9a4cefa
Fix exploration
qiaoting159753 Dec 30, 2024
e3cf728
Fix exploration
qiaoting159753 Dec 30, 2024
971884c
Fix bounded exploration
qiaoting159753 Jan 1, 2025
86d9730
Auto-format code 🧹🌟🤖
github-actions[bot] Jan 1, 2025
4d1da67
Fix iw
qiaoting159753 Jan 5, 2025
82e017c
Auto-format code 🧹🌟🤖
github-actions[bot] Jan 5, 2025
2f5fc92
Fix iw
qiaoting159753 Jan 5, 2025
e9cb5c1
Merge remote-tracking branch 'origin/dev/weighted_loss' into dev/weig…
qiaoting159753 Jan 5, 2025
9e223d2
Auto-format code 🧹🌟🤖
github-actions[bot] Jan 5, 2025
eae247c
Fix iw
qiaoting159753 Jan 9, 2025
12e03c0
add yaos' for bounded exploration
qiaoting159753 Jan 9, 2025
e253741
Auto-format code 🧹🌟🤖
github-actions[bot] Jan 9, 2025
ef4c2e9
add yaos' for bounded exploration
qiaoting159753 Jan 9, 2025
253 changes: 0 additions & 253 deletions cares_reinforcement_learning/algorithm/mbrl/DynaSAC.py

This file was deleted.
