I am having trouble finding a way to keep the checkpoint names produced by save_state in sync with the ones used by load_state.
What I mean is: when we call save_state with project_configuration.automatic_checkpoint_naming = True, a checkpoint folder is created at output_dir/checkpoint_0, and the accelerator object keeps track of the checkpoint iteration in the attribute self.project_configuration.iteration.
If I reinitialize the accelerator object and call load_state on, say, output_dir/checkpoint_5, self.project_configuration.iteration is initialized to 0 for this new accelerator object. Therefore, if I then call save_state, it saves to output_dir/checkpoint_0 instead of continuing from the loaded checkpoint. Is there a way to synchronize this attribute during load_state so that I don't have to specify the exact checkpoint iteration myself?
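For reference, the manual workaround I would like to avoid looks roughly like the sketch below. It assumes checkpoint folders are named checkpoint_<N> as described above, and that assigning to accelerator.project_configuration.iteration after loading is enough to change where the next save_state writes. The helper names checkpoint_iteration and resume_from are hypothetical, not part of the library.

```python
import re
from pathlib import Path

# Matches folder names of the form "checkpoint_<N>" (assumed naming scheme).
_CHECKPOINT_RE = re.compile(r"^checkpoint_(\d+)$")


def checkpoint_iteration(checkpoint_dir):
    """Return N for a path whose last component is 'checkpoint_N'."""
    name = Path(checkpoint_dir).name
    match = _CHECKPOINT_RE.match(name)
    if match is None:
        raise ValueError(f"not an automatically named checkpoint: {name!r}")
    return int(match.group(1))


def resume_from(accelerator, checkpoint_dir):
    """Hypothetical helper: load a checkpoint, then advance the iteration counter.

    Assumes `accelerator` exposes load_state() and a
    project_configuration.iteration attribute, as described in this issue.
    """
    accelerator.load_state(str(checkpoint_dir))
    # Continue numbering after the loaded checkpoint so the next
    # save_state does not write to checkpoint_0 again.
    accelerator.project_configuration.iteration = (
        checkpoint_iteration(checkpoint_dir) + 1
    )
```

With this, resume_from(accelerator, "output_dir/checkpoint_5") would leave the counter at 6, but it still relies on parsing the folder name by hand, which is what I am hoping load_state could do for me.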