[Bug Report] Obs and info semantics in PointMaze with continuing_task #258

younik · 2024-11-29T09:30:25Z

Describe the bug
With continuing_task=True, the observation and info returned by PointMaze on the step where the goal is reached are not updated: they still contain the old goal, even though the environment has already moved on to a new one. This breaks the usual semantics: in a common RL loop, the agent uses the returned observation to choose its next action, so it will act toward the old goal instead of the new one.

See related issue: Farama-Foundation/Minari#265
See:

Gymnasium-Robotics/gymnasium_robotics/envs/maze/point_maze.py

Lines 392 to 406 in 3719d9d

    
    def step(self, action):
        obs, _, _, _, info = self.point_env.step(action)
        obs_dict = self._get_obs(obs)

        reward = self.compute_reward(obs_dict["achieved_goal"], self.goal, info)
        terminated = self.compute_terminated(obs_dict["achieved_goal"], self.goal, info)
        truncated = self.compute_truncated(obs_dict["achieved_goal"], self.goal, info)
        info["success"] = bool(
            np.linalg.norm(obs_dict["achieved_goal"] - self.goal) <= 0.45
        )

        # Update the goal position if necessary
        self.update_goal(obs_dict["achieved_goal"])

        return obs_dict, reward, terminated, truncated, info

Code example
Reproducing this in the real environment requires an expert policy that actually reaches the goal; see the dataset generation script: https://github.com/Farama-Foundation/minari-dataset-generation-scripts/blob/main/scripts/pointmaze/create_pointmaze_dataset.py
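The ordering problem can also be illustrated without MuJoCo or an expert policy. The sketch below is a hypothetical 1-D toy, not the Gymnasium-Robotics code: `ToyContinuingMaze`, its deterministic goal list, and the `refresh_obs_after_goal_update` flag are all assumptions made for illustration. It only mirrors the order of operations in the quoted `step()` (build the observation, then call `update_goal`) and shows one possible reordering, which is not necessarily the fix the maintainers would choose.

```python
import numpy as np


class ToyContinuingMaze:
    """Hypothetical 1-D stand-in for a goal-conditioned env with a continuing task."""

    def __init__(self, goals, refresh_obs_after_goal_update=False):
        self._goals = list(goals)  # goals are handed out in order, for determinism
        self._next = 0
        self.refresh = refresh_obs_after_goal_update
        self.pos = np.zeros(1)
        self.goal = self._pop_goal()

    def _pop_goal(self):
        goal = np.array([self._goals[self._next % len(self._goals)]], dtype=float)
        self._next += 1
        return goal

    def _get_obs(self):
        return {
            "observation": self.pos.copy(),
            "achieved_goal": self.pos.copy(),
            "desired_goal": self.goal.copy(),
        }

    def update_goal(self, achieved_goal):
        # Continuing task: switch to a new goal once the current one is reached.
        if np.linalg.norm(achieved_goal - self.goal) <= 0.45:
            self.goal = self._pop_goal()

    def step(self, action):
        self.pos = self.pos + np.asarray(action, dtype=float)
        # Observation is built BEFORE the goal update, as in the quoted step().
        obs = self._get_obs()
        success = bool(np.linalg.norm(obs["achieved_goal"] - self.goal) <= 0.45)
        reward = float(success)  # reward is still computed against the old goal
        self.update_goal(obs["achieved_goal"])
        if self.refresh:
            # Possible reordering (assumption): rebuild the observation so that
            # "desired_goal" matches the goal the agent must pursue next.
            obs = self._get_obs()
        return obs, reward, False, False, {"success": success}


def goal_seen_after_success(refresh):
    """Return (goal in returned obs, goal stored in env, reward) after reaching a goal."""
    env = ToyContinuingMaze(goals=[1.0, 5.0], refresh_obs_after_goal_update=refresh)
    obs, reward, *_ = env.step([1.0])  # lands exactly on the first goal
    return float(obs["desired_goal"][0]), float(env.goal[0]), reward


stale = goal_seen_after_success(refresh=False)
fresh = goal_seen_after_success(refresh=True)
print(stale)  # (1.0, 5.0, 1.0): the observation still carries the old goal
print(fresh)  # (5.0, 5.0, 1.0): the observation matches the new goal
```

In the first case a policy conditioned on `obs["desired_goal"]` would keep heading for a goal the environment has already replaced, which is the behaviour described above.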


younik changed the title ~~[Bug Report] Info semantics in PointMaze with continuing_task~~ [Bug Report] Obs and info semantics in PointMaze with continuing_task Nov 29, 2024
