Research

  Publications: Reinforcement Learning

    • Does “Do Differentiable Simulators Give Better Policy Gradients?” Give Better Policy Gradients?

      Ku Onoda, Paavo Parmas, Manato Yaguchi, Yutaka Matsuo

      International Conference on Learning Representations 2026 (ICLR2026)

    • Self-Harmony: Learning to Harmonize Self-Supervision and Self-Play in Test-Time Reinforcement Learning

      Ru Wang, Wei Huang, Qi Cao, Yusuke Iwasawa, Yutaka Matsuo, Jiaxian Guo

      International Conference on Learning Representations 2026 (ICLR2026)

    • Provably Efficient RL under Episode-Wise Safety in Constrained MDPs with Linear Function Approximation

      Toshinori Kitamura, Arnob Ghosh, Tadashi Kozuno, Wataru Kumagai, Kazumi Kasaura, Kenta Hoshino, Yohei Hosoe, Yutaka Matsuo.

      Advances in Neural Information Processing Systems (NeurIPS 2025, Spotlight)

    • Near-Optimal Policy Identification in Robust Constrained Markov Decision Processes via Epigraph Form

      Toshinori Kitamura, Tadashi Kozuno, Wataru Kumagai, Kenta Hoshino, Yohei Hosoe, Kazumi Kasaura, Masashi Hamaya, Paavo Parmas, Yutaka Matsuo

      International Conference on Learning Representations (ICLR 2025)

    • Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice

      Toshinori Kitamura, Tadashi Kozuno, Yunhao Tang, Nino Vieillard, Michal Valko, Wenhao Yang, Jincheng Mei, Pierre Ménard, Mohammad Gheshlaghi Azar, Remi Munos, Olivier Pietquin, Matthieu Geist, Csaba Szepesvari, Wataru Kumagai, Yutaka Matsuo

      International Conference on Machine Learning (ICML 2023). July 2023.

    • Generalized Decision Transformer for Offline Hindsight Information Matching

      Hiroki Furuta, Yutaka Matsuo, and Shixiang Shane Gu

      International Conference on Learning Representations 2022 (ICLR2022, Spotlight).

    • Co-Adaptation of Algorithmic and Implementational Innovations in Inference-based Deep Reinforcement Learning

      Hiroki Furuta, Tadashi Kozuno, Tatsuya Matsushima, Yutaka Matsuo, and Shixiang Shane Gu.

      Advances in Neural Information Processing Systems 2021 (NeurIPS2021). December 2021.

    • Policy Information Capacity: Information-Theoretic Measure for Task Complexity in Deep Reinforcement Learning

      Hiroki Furuta, Tatsuya Matsushima, Tadashi Kozuno, Yutaka Matsuo, Sergey Levine, Ofir Nachum, and Shixiang Shane Gu

      International Conference on Machine Learning 2021 (ICML2021).

    • Identifying Co-Adaptation of Algorithmic and Implementational Innovations in Deep Reinforcement Learning: Taxonomy of Inference-based Algorithms

      Hiroki Furuta, Tadashi Kozuno, Tatsuya Matsushima, Yutaka Matsuo, Shixiang Shane Gu.

      International Conference on Machine Learning 2021 (ICML2021).

    • Reward and Optimality Empowerments: Information-Theoretic Measures for Task Complexity in Deep Reinforcement Learning

      Hiroki Furuta, Tatsuya Matsushima, Tadashi Kozuno, Yutaka Matsuo, Sergey Levine, Ofir Nachum, and Shixiang Shane Gu.

      International Conference on Machine Learning 2021 (ICML2021). July 2021.