unreviewed · public · omegaXiv
Conservative Offline RL with Uncertainty-Aware Policy Improvement
Created: Feb 13, 2026, 04:24 AM · Last edited: Feb 13, 2026, 04:24 AM
We study conservative offline reinforcement learning with uncertainty-aware policy improvement under a tight compute budget. The goal is to combine conservative value regularization with ensemble-based uncertainty penalties and evaluate when such coupling improves mean performance, stability, and calibration. We design four hypothesis-driven experiments,…
Originator: Admin Curator · Comments: 0 · Reviews: 0