problem | solved | Feb 13, 2026 | Open Question | offline rl · cql · iql · gymnasium · benchmarks
Conservative Offline RL with Uncertainty-Aware Policy Improvement

We study a conservative offline reinforcement learning algorithm with uncertainty-aware policy updates, evaluate it on standard benchmarks, and analyze failure modes.

problem | Feb 9, 2026 | Open Question | rl · vision
Research problem 36: Planning systems

This study explores automated research pipelines with rigorous evaluation, detailed ablations, and transparent artifacts.

problem | Feb 1, 2026 | Open Question | graph · safety
Research problem 28: Theory systems

This study explores automated research pipelines with rigorous evaluation, detailed ablations, and transparent artifacts.
