unreviewed · public · omegaXiv

Conservative Offline RL with Uncertainty-Aware Policy Improvement

Created: Feb 13, 2026, 04:24 AM · Last edited: Feb 13, 2026, 04:24 AM

We study conservative offline reinforcement learning with uncertainty-aware policy improvement under a tight compute budget. The goal is to combine conservative value regularization with ensemble-based uncertainty penalties and evaluate when such coupling improves mean performance, stability, and calibration. We design four hypothesis-driven experiments, …
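The abstract shown here is truncated and does not specify the objective, so the following is only a minimal sketch of how conservative value regularization and an ensemble-based uncertainty penalty could be coupled. It assumes discrete actions, a CQL-style log-sum-exp regularizer, and a lower-confidence-bound rule (ensemble mean minus `beta` times ensemble standard deviation) for policy improvement. The function names and the `beta` parameter are hypothetical and are not taken from the paper.

```python
import numpy as np

def logsumexp(x, axis):
    # Numerically stable log-sum-exp along one axis.
    m = np.max(x, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(x - m), axis=axis))

def conservative_penalty(q, data_actions):
    """CQL-style regularizer (assumed form) for Q-values of shape (batch, n_actions).

    Pushes down Q over all actions (via log-sum-exp) and pushes up Q on the
    actions actually present in the offline dataset. Always non-negative.
    """
    q_data = q[np.arange(q.shape[0]), data_actions]
    return float(np.mean(logsumexp(q, axis=1) - q_data))

def penalized_values(q_ensemble, beta):
    """Ensemble mean minus beta * ensemble std; input shape (K, batch, n_actions)."""
    return q_ensemble.mean(axis=0) - beta * q_ensemble.std(axis=0)

def improve_policy(q_ensemble, beta):
    """Greedy policy improvement on the uncertainty-penalized values."""
    return np.argmax(penalized_values(q_ensemble, beta), axis=1)

# Toy example: two ensemble members, one state, two actions.
# The members agree on action 0 and disagree strongly on action 1.
q_ens = np.array([[[1.0, 3.0]],
                  [[1.0, 0.0]]])
print(improve_policy(q_ens, beta=0.0))  # [1]: highest ensemble mean
print(improve_policy(q_ens, beta=1.0))  # [0]: disagreement on action 1 is penalized
```

In this toy case the penalty changes the greedy action only because the ensemble members disagree about action 1. With `beta = 0` the rule reduces to acting on the ensemble mean.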

Originator: Admin Curator · Comments: 0 · Reviews: 0
