Yuki Kadokawa

Sim-to-Real Reinforcement Learning for Robotics

I am a researcher in reinforcement learning (RL), deep learning, and robotics. My research focuses on Sim-to-Real transfer — how robots can learn in simulation and operate in the real world — using methods such as domain randomization, policy distillation, and sample-efficient reinforcement learning. I have worked on real-world robotic tasks including manipulation, legged locomotion, multi-robot systems, and field and industrial robotics.

Cross-appointment

Researcher, Fukushima Institute for Research, Education and Innovation (F-REI)
Project Assistant Professor, Nara Institute of Science and Technology (NAIST) — Robot Learning Lab

Google Scholar GitHub

News & Highlights

2026

NEW: Gave an invited talk at the SICE Symposium on Decentralized Autonomous Systems (自律分散システム・シンポジウム) on Sim-to-Real RL for real-world robot tasks.

2026

DAPPER was accepted to IEEE Robotics & Automation Magazine (RAM) and presented at ICRA 2026 — joint work with ETH Zürich.

2026

Our paper on Distilled Iterative Value Conversion for neurochip-driven edge robots was published in IEEE Access.

2025

Progressive-Resolution Policy Distillation (PRPD) was published in IEEE T-ASE and featured in the Nikkan Kogyo Shimbun.

2025

Learning Quiet Walking for a Small Home Robot was presented at ICRA 2025 — collaboration with ETH Zürich and Sony.

2024

Completed my Doctor of Engineering at NAIST and began as a Project Assistant Professor in the Robot Learning Lab.

Publications

International Journal

Figure 1 of Distilled Iterative Value Conversion

IEEE Access 2026

Distilled Iterative Value Conversion: Reducing FPNN-to-SNN Conversion Errors via Distillation in Reinforcement Learning for Neurochip-Driven Edge Robots

Yuki Kadokawa, Tomoya Yamanokuchi, Alonso Ramos Fernandez, Takanori Homma, and Takamitsu Matsubara

IEEE Access, 2026

Paper Video

IEEE RAM 2026

DAPPER: Discriminability-Aware Policy-to-Policy Preference-Based Reinforcement Learning for Query-Efficient Robot Skill Acquisition

Yuki Kadokawa, Jonas Frey, Takahiro Miki, Takamitsu Matsubara, and Marco Hutter

IEEE Robotics & Automation Magazine (RAM), 2026

Paper Video

IEEE Access 2026

Prolonging Tool Life: Learning Skillful Use of General-purpose Tools through Lifespan-guided Reinforcement Learning

Po-Yen Wu, Cheng-Yu Kuo, Yuki Kadokawa, and Takamitsu Matsubara

IEEE Access, 2026

Paper Project Page

Figure 1 of Progressive-Resolution Policy Distillation

IEEE T-ASE 2025

Progressive-Resolution Policy Distillation: Leveraging Coarse-Resolution Simulations for Time-Efficient Fine-Resolution Policy Learning

Yuki Kadokawa, Hirotaka Tahara, and Takamitsu Matsubara

IEEE Transactions on Automation Science and Engineering (T-ASE), 2025

Paper Project Page Video Newspaper

Figure 1 of Robust Iterative Value Conversion

RAS 2024

Robust Iterative Value Conversion: Deep Reinforcement Learning for Neurochip-driven Edge Robots

Yuki Kadokawa, Tomohito Kodera, Yoshihisa Tsurumine, Shinya Nishimura, and Takamitsu Matsubara

Robotics and Autonomous Systems (RAS), 2024

Paper Video

RAS 2023

Cyclic Policy Distillation: Sample-Efficient Sim-to-Real Reinforcement Learning with Domain Randomization

Yuki Kadokawa, Lingwei Zhu, Yoshihisa Tsurumine, and Takamitsu Matsubara

Robotics and Autonomous Systems (RAS), 2023

Paper Video

IEEE RA-L 2021

Binarized P-Network: Deep Reinforcement Learning of Robot Control from Raw Images on FPGA (Student Paper Award)

Yuki Kadokawa, Yoshihisa Tsurumine, and Takamitsu Matsubara

IEEE Robotics and Automation Letters (RA-L), vol. 6, no. 4, pp. 8545–8552, 2021

Paper Video

Under Review

ViSA: Visited-State Augmentation for Generalized Goal-Space Contrastive Reinforcement Learning

Issa Nakamura, Tomoya Yamanokuchi, Yuki Kadokawa, Jia Qu, Shun Otsubo, Shotaro Miwa, and Takamitsu Matsubara

Under review

Paper

Figure 1 of Autonomous Obstacle Removal for Excavators

Under Review

Autonomous Obstacle Removal for Excavators through Policy Learning with Particle Simulation

Yuki Kadokawa, Sandro M. Alcantara Tacora, Taro Abe, Daisuke Endo, Genki Yamauchi, Takeshi Hashimoto, and Takamitsu Matsubara

Under review

Paper Video

Under Review

Bridged SBI: Correcting Biased Low-Fidelity Posteriors for Cost-Efficient High-Fidelity Inference

Gahee Kim, Yuki Kadokawa, Sandro M. Alcantara Tacora, Taro Abe, Daisuke Endo, Genki Yamauchi, Takeshi Hashimoto, and Takamitsu Matsubara

Under review

Paper

International Conference

ICRA 2026

DAPPER: Discriminability-Aware Policy-to-Policy Preference-Based Reinforcement Learning for Query-Efficient Robot Skill Acquisition

Yuki Kadokawa, Jonas Frey, Takahiro Miki, Takamitsu Matsubara, and Marco Hutter

IEEE International Conference on Robotics and Automation (ICRA, RAM option), 2026

Paper Video

ICRA 2026

Progressive-Resolution Policy Distillation: Leveraging Coarse-Resolution Simulations for Time-Efficient Fine-Resolution Policy Learning

Yuki Kadokawa, Hirotaka Tahara, and Takamitsu Matsubara

IEEE International Conference on Robotics and Automation (ICRA, T-ASE option), 2026

Paper Project Page Video Newspaper

Figure 1 of Robust Sim-to-Real Cloth Untangling

CASE 2026

Robust Sim-to-Real Cloth Untangling through Reduced-Resolution Observations via Adaptive Force-Difference Quantization

Yoshihisa Tsurumine, Yuki Kadokawa, Kohei Hayashi, Christian Diehm, and Takamitsu Matsubara

IEEE International Conference on Automation Science and Engineering (CASE), 2026

Paper Video

Figure 1 of Learning Quiet Walking for a Small Home Robot

ICRA 2025

Learning Quiet Walking for a Small Home Robot

Ryo Watanabe, Takahiro Miki, Fan Shi, Yuki Kadokawa, Filip Bjelonic, Kento Kawaharazuka, Andrei Cramariuc, and Marco Hutter

IEEE International Conference on Robotics and Automation (ICRA), 2025

Paper Project Page Video

AROB 2025

Scalable Domain Randomized Reinforcement Learning for Sim-to-Real Policy Transfer in Complex Robot Tasks

Yuki Kadokawa and Takamitsu Matsubara

International Symposium on Artificial Life and Robotics (AROB), 2025

Conference

Figure 1 of Learning Robotic Powder Weighing from Simulation

IROS 2023

Learning Robotic Powder Weighing from Simulation for Laboratory Automation

Yuki Kadokawa, Masashi Hamaya, and Kazutoshi Tanaka

IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023

Paper Project Page Video

ICRA 2023

Binarized P-Network: Deep Reinforcement Learning of Robot Control from Raw Images on FPGA

Yuki Kadokawa, Yoshihisa Tsurumine, and Takamitsu Matsubara

IEEE International Conference on Robotics and Automation (ICRA, RA-L option), 2023

Paper Video

Under Review

DeReCo: Decoupling Representation and Coordination Learning for Object-Adaptive Decentralized Multi-Robot Cooperative Transport

Kazuki Shibata, Ryosuke Sota, Shandil Dhiresh Bosch, Yuki Kadokawa, Yoshihisa Tsurumine, and Takamitsu Matsubara

Under review

Paper Video

Domestic Conference

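SICE SI 2025

Automatic Estimation of Particle Parameters Reproducing Soil Behavior Using Progressive-Resolution Simulation (段階的解像度シミュレーションを用いた土砂の挙動を再現する粒子パラメータ自動推定)

Gahee Kim, Yuki Kadokawa, Sandro Manuel Alcantara Tacora, Taro Abe, Daisuke Endo, Genki Yamauchi, Takeshi Hashimoto, and Takamitsu Matsubara

SICE System Integration Division Annual Conference (計測自動制御学会システムインテグレーション部門講演会), 2025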

Conference

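RSJ 2025

Realizing Obstacle Removal from Soil with a Hydraulic Excavator through Domain-Randomized Reinforcement Learning (ドメインランダム化強化学習による油圧ショベルの土砂中の障害物除去タスク実現)

Yuki Kadokawa, Sandro Manuel Alcantara Tacora, and Takamitsu Matsubara

Annual Conference of the Robotics Society of Japan (日本ロボット学会学術講演会), 2025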

Conference

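RSJ 2025

Generalizing Contrastive Reinforcement Learning through Visited-State Augmentation (到達状態拡張による対照強化学習の汎化)

Issa Nakamura, Tomoya Yamanokuchi, Yuki Kadokawa, Jia Qu, Shun Otsubo, 宮本健, Shotaro Miwa, and Takamitsu Matsubara

Annual Conference of the Robotics Society of Japan (日本ロボット学会学術講演会), 2025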

Conference

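RSJ 2025

Asymmetric Actor-Critic Multi-Agent Reinforcement Learning for Flexible Cooperative Transport by Robot Swarms (柔軟な群ロボット協調輸送のための非対称アクター・クリティック型マルチエージェント強化学習)

Ryosuke Sota, Kazuki Shibata, Yoshihisa Tsurumine, Yuki Kadokawa, and Takamitsu Matsubara

Annual Conference of the Robotics Society of Japan (日本ロボット学会学術講演会), 2025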

Conference

RSJ 2024

Deep Reinforcement Learning with FPNN-to-SNN Policy Distillation for Neurochip-driven Robots (Best Presentation Award Finalist)

Alonso Ramos Fernandez, Yuki Kadokawa, Yoshihisa Tsurumine, and Takamitsu Matsubara

Annual Conference of the Robotics Society of Japan, 2024

Conference

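ROBOMECH 2024

Dynamics-Randomized Contrastive Reinforcement Learning for Sim-to-Real Policy Transfer (Sim-to-Real 方策転移のためのダイナミクスランダム化対照強化学習)

Issa Nakamura, Tomoya Yamanokuchi, Yuki Kadokawa, Jia Qu, Shun Otsubo, Shotaro Miwa, and Takamitsu Matsubara

JSME Conference on Robotics and Mechatronics (ロボティクス・メカトロニクス講演会), 2024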

Conference

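SICE MSCS 2023

Edge-Server Deep Reinforcement Learning of Quantized Policies Suited to Neurochip Implementation (ニューロチップ実装に適した量子化方策のエッジサーバー深層強化学習)

Tomohito Kodera, Yuki Kadokawa, Yoshihisa Tsurumine, and Takamitsu Matsubara

SICE Multi-Symposium on Control Systems (計測自動制御学会制御部門マルチシンポジウム), 2023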

Conference

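RSJ 2020

Binary P-Network: A Deep Reinforcement Learning Method for Real-Time Robot Control Using FPGA (FPGAを用いた実時間ロボット制御のための深層強化学習手法 Binary P-Network の提案)

Yuki Kadokawa, Yoshihisa Tsurumine, and Takamitsu Matsubara

Annual Conference of the Robotics Society of Japan (日本ロボット学会学術講演会), 2020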

Conference

Invited Talk

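SICE DAS 2026

Sim-to-Real Reinforcement Learning for Real-World Robot Tasks: The Trade-off between Computational Cost and Accuracy in Particle Simulation (実世界ロボットタスクにおけるSim-to-Real強化学習: パーティクルシミュレーションにおける計算量と精度のトレードオフ)

Yuki Kadokawa and Takamitsu Matsubara

SICE Symposium on Decentralized Autonomous Systems (計測自動制御学会自律分散システム・シンポジウム), 2026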

Conference

Honors

Awards

Outstanding Student Award, Symposium of University Fellowship, NAIST, 2022 [Award Page]
Student Paper Award, IEEE Kansai Section, 2022 [NAIST] [IEEE]
Best Student of the Year, Toyama Prefectural University, 2019
Grand Prize, Monozukuri in Toyama, Toyama Mechanical and Electrical Industries Association, 2018
Best Presentation Award, Design Competition, Japan Society for Design Engineering, 2018
3rd Place (National Finals 3/10), Design Competition, Japan Society for Design Engineering, 2018
Best Student Presentation Award, Research Presentation, Japan Society for Design Engineering, 2018
Best Presentation Award, Design Competition, Japan Society for Design Engineering, 2017
Best Student of the Year, Toyama Prefectural Takaoka Kogei High School, 2015

Scholarship & Grant

Senju Monju Project (Research Grant), NAIST, 2024–2026
Scholarship (Full Repayment Exemption), Japan Student Services Organization, 2023
Research Fellowship for Young Scientists: DC2 (Funding & Salary), JSPS, 2023–2025
Research Fellowship (Research Grant & Salary), Japan Science and Technology Agency, 2022–2023 [Interview]
Scholarship (Half Repayment Exemption), Japan Student Services Organization, 2021
Outstanding Student Scholarship (Tuition Exemption), NAIST, 2021

Experience & Education

Work History

Apr 2024 – Mar 2026

Project Assistant Professor

Robot Learning Lab, NAIST

Aug 2023 – Oct 2023

Visiting Researcher

Robotic Systems Lab, ETH Zürich

Apr 2022 – Mar 2023

Research Internship

OMRON SINIC X Corporation

Project Page Video

Dec 2018 – Mar 2019

Part-time, 3D-CAD Modeling

Aluminum Factory Corporation

Education

2024

Doctor of Engineering

Program of Information Science and Engineering, NAIST, Japan

2021

Master of Engineering

Program of Information Science and Engineering, NAIST, Japan

2019

Bachelor of Engineering

Department of Intelligent Robotics, Toyama Prefectural University, Japan

Research Field

Reinforcement Learning, Deep Learning, Robotics, Sim-to-Real, Domain Randomization, Edge Robots, FPGA, Neurochip

Contact

E-mail

kadokawa.yuki [at] naist.ac.jp

Address

8916-5 Takayama-cho, Ikoma, Nara 630-0192, Japan

Affiliation

Nara Institute of Science and Technology, Robot Learning Laboratory
