arXiv:2605.07278

Predictive but Not Plannable

RC-aux makes reconstruction-free latent world models more useful for long-horizon planning by teaching the representation what is reachable under a finite action budget.

Hokkaido University

Core idea

A latent model can predict locally and still fail the planner globally.

Latent world models are often trained with local predictive supervision, then deployed for goal-directed search over longer horizons. RC-aux targets that mismatch with lightweight auxiliary supervision that preserves the LeWorldModel (LeWM) backbone while adding planning-aligned structure.

The objective combines multi-horizon open-loop prediction, budget-conditioned reachability supervision, and temporal hard negatives. At test time, the learned reachability signal can also guide the planner toward trajectories that are goal-directed and attainable.

Method

Two training corrections and a planning signal

01

Time axis

Multi-horizon open-loop prediction trains the latent dynamics beyond one-step consistency, exposing the model to the rollout structure used by MPC-style search.
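As a rough illustration of the time-axis idea, the sketch below rolls a dynamics function forward open-loop and averages the squared latent error at several horizons. The function names, the plain-list latent vectors, the squared-error loss, and the toy dynamics are all assumptions made for illustration, not the repository's implementation.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def squared_error(a: Vector, b: Vector) -> float:
    """Sum of squared differences between two latent vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def multi_horizon_open_loop_loss(
    z0: Vector,
    actions: Sequence[Vector],
    targets: Sequence[Vector],
    dynamics: Callable[[Vector, Vector], Vector],
    horizons: Sequence[int],
) -> float:
    """Average squared latent error at the requested horizons.

    targets[k - 1] is the encoded latent k steps after z0. The rollout is
    open-loop: each prediction is fed back into the dynamics and is never
    corrected by a fresh observation.
    """
    wanted = set(horizons)
    max_h = max(wanted)
    if max_h > len(actions) or max_h > len(targets):
        raise ValueError("not enough actions or targets for the largest horizon")
    z = z0
    total = 0.0
    for k in range(1, max_h + 1):
        z = dynamics(z, actions[k - 1])
        if k in wanted:
            total += squared_error(z, targets[k - 1])
    return total / len(wanted)


# Hypothetical toy: the true dynamics are z + a, the learned model is slightly off.
def learned(z: Vector, a: Vector) -> Vector:
    return [zi + 0.9 * ai for zi, ai in zip(z, a)]


z0 = [0.0]
actions = [[1.0], [1.0], [1.0]]
targets = [[1.0], [2.0], [3.0]]
one_step = multi_horizon_open_loop_loss(z0, actions, targets, learned, [1])
multi = multi_horizon_open_loop_loss(z0, actions, targets, learned, [1, 2, 3])
```

In this toy the one-step error is small, but it compounds along the rollout, so the multi-horizon loss is larger and exposes the drift that a planner would actually encounter.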

02

Space axis

Budget-conditioned reachability supervision encourages the latent space to distinguish states that are only eventually reachable from states that are reachable within the current planning budget.
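One plausible way to build such supervision from logged trajectories is sketched below. The labeling rule (a later state counts as reachable within budget b if it was observed at most b steps ahead), the `margin` parameter, and the tuple layout are assumptions for illustration; the paper's actual construction may differ. Negatives are drawn from just past the budget on the same trajectory, which is one reading of "temporal hard negatives".

```python
from typing import List, Tuple

# (start_index, goal_index, budget, label); hypothetical layout
Example = Tuple[int, int, int, int]


def reachability_examples(traj_len: int, budgets: List[int], margin: int) -> List[Example]:
    """Build budget-conditioned reachability labels from one trajectory."""
    examples: List[Example] = []
    for t in range(traj_len):
        for b in budgets:
            # Positives: the goal state was observed within b steps of t.
            for k in range(1, b + 1):
                if t + k < traj_len:
                    examples.append((t, t + k, b, 1))
            # Temporal hard negatives: same trajectory, just past the budget.
            for k in range(b + 1, b + margin + 1):
                if t + k < traj_len:
                    examples.append((t, t + k, b, 0))
    return examples


examples = reachability_examples(traj_len=6, budgets=[2], margin=2)
```

Labels built this way are noisy: a state observed more than b steps later might still be reachable within b steps along a shorter route, so the negatives are best read as "not shown to be reachable within budget".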

03

Planning signal

A reachability-aware planner can use the learned signal as an auxiliary cost, favoring candidates that stay both goal-directed and feasible under the available budget.
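A minimal sketch of how a reachability signal could enter candidate scoring is shown below. The additive form of the cost, the function names, and the toy candidates are assumptions; only the flag names `use_reachability_cost` and `reachability_cost_weight` (with the value 0.85) come from the evaluation command on this page, and how the repository combines the terms is not stated here.

```python
import math
from typing import List, Sequence, Tuple

# (name, final predicted latent, predicted probability of reaching the goal in budget)
Candidate = Tuple[str, Sequence[float], float]


def plan_cost(final_latent: Sequence[float], goal_latent: Sequence[float],
              reach_prob: float, weight: float) -> float:
    """Goal distance plus a penalty for rollouts judged unreachable in budget."""
    goal_cost = math.dist(final_latent, goal_latent)
    return goal_cost + weight * (1.0 - reach_prob)


def pick_candidate(candidates: List[Candidate], goal: Sequence[float], weight: float) -> str:
    """Return the name of the lowest-cost candidate."""
    best = min(candidates, key=lambda c: plan_cost(c[1], goal, c[2], weight))
    return best[0]


goal = [1.0, 0.0]
candidates = [
    ("shortcut", [0.9, 0.0], 0.05),  # close in latent L2, judged unreachable
    ("corridor", [0.6, 0.0], 0.95),  # farther in latent L2, judged reachable
]
without_reach = pick_candidate(candidates, goal, weight=0.0)
with_reach = pick_candidate(candidates, goal, weight=0.85)
```

With a weight of zero the planner prefers the latent shortcut; with the auxiliary term it switches to the candidate that is judged reachable, even though that candidate ends farther from the goal in latent distance.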

Figure: Overview of RC-aux (pipeline stages: pixels, latent state, open-loop rollout, reachability head, planner). Training aligns rollout time and reachability space, while planning favors rollouts that are both close to the goal and reachable.

Visualizations

Why Euclidean latent distance is not enough

The figures move from the latent failure case to qualitative rollouts.

Figure: A blocked shortcut in state space and a misleading shortcut in latent space. Latent shortcuts can look close under L2 distance while being unreachable within the action budget.
Figure: Trajectory overlays comparing LeWM and RC-aux on Wall and TwoRoom examples. RC-aux reduces shortcut-like plans and produces paths that better respect obstacles and goals.
Figure: Temporal comparison of LeWM and RC-aux rollouts on Wall tasks. RC-aux moves through reachable corridors instead of taking latent shortcuts through obstacles.
Figure: Temporal comparison of LeWM and RC-aux rollouts on Cube manipulation tasks, from start through intermediate states to the goal.
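The gap between straight-line distance and reachability can be reproduced on a toy grid. The sketch below is an illustration only: it uses grid coordinates as a stand-in for a latent space, and the grid, the budget of 4, and the function name are made up for the example.

```python
import math
from collections import deque
from typing import List, Optional, Tuple

Cell = Tuple[int, int]

# '#' is a wall; start is the top-left cell, goal is the top-right cell.
GRID = [
    "S#G",
    ".#.",
    ".#.",
    "...",
]


def shortest_steps(grid: List[str], start: Cell, goal: Cell) -> Optional[int]:
    """Breadth-first search for the minimum number of 4-connected moves."""
    rows, cols = len(grid), len(grid[0])
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        (r, c), dist = queue.popleft()
        if (r, c) == goal:
            return dist
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] != "#" and (nr, nc) not in seen:
                seen.add((nr, nc))
                queue.append(((nr, nc), dist + 1))
    return None  # goal is not reachable at all


start, goal = (0, 0), (0, 2)
euclidean = math.dist(start, goal)         # straight-line distance: 2.0
steps = shortest_steps(GRID, start, goal)  # path must go around the wall: 8 moves
budget = 4                                 # hypothetical planning budget
reachable_in_budget = steps is not None and steps <= budget
```

The two cells are only 2 units apart in a straight line, yet the shortest feasible path takes 8 moves, so a planner with a budget of 4 that trusts the straight-line distance would chase a goal it cannot reach.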

Reported results

Gains across goal-conditioned pixel-control tasks

Success rates (%) are reported as mean +/- std over five fixed evaluation groups of 50 episodes each. Matched deltas compare RC-aux against LeWM-cont, except on Wall, where the comparison is against local LeWM.
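For readers who want to reproduce this style of summary, the sketch below computes a mean and standard deviation over per-group success rates. The success counts are invented for illustration and are not results from the paper, and the use of the sample standard deviation is an assumption, since the page does not say whether sample or population std is reported.

```python
import statistics
from typing import List, Tuple


def summarize_groups(successes_per_group: List[int],
                     episodes_per_group: int = 50) -> Tuple[float, float]:
    """Return (mean, sample std) of per-group success rates in percent."""
    rates = [100.0 * s / episodes_per_group for s in successes_per_group]
    return statistics.mean(rates), statistics.stdev(rates)


# Hypothetical counts of successful episodes in five groups of 50 (not real data).
hypothetical_counts = [40, 42, 45, 41, 42]
mean_rate, std_rate = summarize_groups(hypothetical_counts)
```

With 50 episodes per group, each group's success rate moves in steps of 2 percentage points, which is worth keeping in mind when reading small differences between methods.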

Wall: +33.2 (50.4 to 83.6)
TwoRoom: +9.2 (88.8 to 98.0)
Reacher: +4.4 (82.8 to 87.2)
Cube: +3.2 (72.8 to 76.0)

Figure: Success-rate comparison of RC-aux against DINO-WM, PLDM, GCBC, IQL, IVL, LeWM, and LeWM-cont across TwoRoom, Reacher, Push-T, Wall, and Cube. RC-aux improves the LeWM-style planning baseline most strongly on obstacle-heavy Wall.
Task      LeWM           LeWM-cont      RC-aux         Matched delta
TwoRoom   88.8 +/- 3.0   88.8 +/- 3.0   98.0 +/- 1.4   +9.2
Reacher   81.2 +/- 7.9   82.8 +/- 7.2   87.2 +/- 6.4   +4.4
Push-T    90.4 +/- 3.0   91.2 +/- 3.9   90.8 +/- 3.3   -0.4
Wall      50.4 +/- 6.5   --             83.6 +/- 3.6   +33.2
Cube      72.4 +/- 5.9   72.8 +/- 5.2   76.0 +/- 7.5   +3.2
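The matched deltas follow directly from the reported means. The short check below recomputes them from the table, using LeWM-cont as the baseline everywhere except Wall, where the LeWM column is used as stated above; the dictionary layout and function name are just for illustration.

```python
# (baseline mean, RC-aux mean) in percent, copied from the table above.
REPORTED = {
    "TwoRoom": (88.8, 98.0),
    "Reacher": (82.8, 87.2),
    "Push-T": (91.2, 90.8),
    "Wall": (50.4, 83.6),  # baseline is local LeWM; no LeWM-cont entry is reported
    "Cube": (72.8, 76.0),
}


def matched_delta(baseline: float, rc_aux: float) -> float:
    """Difference in mean success rate, rounded to one decimal place."""
    return round(rc_aux - baseline, 1)


deltas = {task: matched_delta(base, ours) for task, (base, ours) in REPORTED.items()}
```

Every recomputed delta agrees with the "Matched delta" column, including the small negative change on Push-T.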

Code

Train, evaluate, and extend RC-aux

The repository includes pixel-control training, MPC evaluation, RC-aux objectives, configs, result summaries, and a LIBERO-Goal extension.

python eval.py --config-name=tworoom.yaml \
  cache_dir="$STABLEWM_HOME" \
  policy=tworoom_rcaux/rcaux_tworoom \
  +planner_override.use_reachability_cost=true \
  +planner_override.reachability_cost_weight=0.85

Citation

Reference the paper

@article{li2026predictive,
  title={Predictive but Not Plannable: RC-aux for Latent World Models},
  author={Li, Wenyuan and Li, Guang and Maeda, Keisuke and Ogawa, Takahiro and Haseyama, Miki},
  journal={arXiv preprint arXiv:2605.07278},
  year={2026}
}