Conservative Offline RL with Uncertainty-Aware Policy Improvement
We study a conservative offline reinforcement learning algorithm with uncertainty-aware policy updates, evaluate it on standard benchmarks, and analyze failure modes.
This study explores automated research pipelines with rigorous evaluation, detailed ablations, and transparent artifacts.