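Under its mission of "creating intelligence," the MATSUO-IWASAWA Laboratory works across a broad range of research areas: the development of deep learning, including world models, and of foundational technologies that go beyond it; robotics; large language models; and the real-world validation of algorithms.
To further expand these activities, we held a research internship program, in which 15 people took part.
▼ Research internship overview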
https://weblab.t.u-tokyo.ac.jp/news/20240417/
▼ Article introducing the internship themes and mentors
https://weblab.t.u-tokyo.ac.jp/2024-04-26/
In this article, we share a report written by one of the members who took part in the research internship.
About Myself
I’m Shih-Min Yang, a Ph.D. student at Örebro University in Sweden. My research focuses on minimizing human supervision and enhancing learning efficiency in robotic manipulation, enabling robots to adapt to diverse tasks and move toward general-purpose capabilities. In pursuit of this goal, I focus on Hierarchical Learning, Unsupervised Reward-based Exploration, and Robotic Foundation Models. As robotic systems become increasingly integrated into real-world applications, improving their adaptability and autonomy is crucial for their long-term viability. My research aims to push the boundaries of robotic learning by exploring novel methods that reduce dependence on extensive human-labeled data while improving generalization across different tasks.
In 2024, I attended the IEEE International Conference on Robotics and Automation (ICRA) in Yokohama, Japan, one of the best-known conferences in the field of robotics. The conference provided a great opportunity to learn about the latest advancements in robotics research, interact with leading researchers, and present my own work. On the last day of the conference, I had the opportunity to visit the MATSUO-IWASAWA Laboratory (松尾・岩澤研究室), where I engaged in insightful discussions with Yusuke Iwasawa and Tatsuya Matsushima regarding the challenges in current robotic foundation models.
The MATSUO-IWASAWA Laboratory actively conducts research in artificial intelligence, foundation reinforcement learning, large language models, and robotic foundation models. Researchers in the lab explore various topics, including but not limited to post-training improvements in robotic foundation models, integration with contact force/tactile sensors, robot teleoperation, and enhancing robotic foundation models through improved planning algorithms. During my visit, I had the opportunity to observe the lab’s ongoing projects and discuss their implications for the broader field of robotic learning.
This experience further deepened my interest in exploring new methodologies for improving robotic foundation models. I believe that collaborating with researchers at the MATSUO-IWASAWA Laboratory would be highly beneficial, particularly in developing more efficient and adaptable robotic foundation models, which closely align with my research direction.
About Research
During my time at the MATSUO-IWASAWA Laboratory, I worked under the guidance of Tatsuya Matsushima, focusing on improving foundation models for robotic manipulation tasks. Our research aimed to explore the limitations of existing robotic foundation models and propose methods to enhance their effectiveness. Over the course of multiple discussions, both Tatsuya Matsushima and Yusuke Iwasawa provided valuable insights and recommended several relevant papers, which greatly helped my understanding of the key challenges in this field. These discussions helped refine our research hypothesis and experimental design.
One of the major limitations of current robotic foundation models is their reliance on pre-trained vision-language models (VLMs), which often lack an intuitive understanding of physics. For example, a VLM may fail to recognize that objects cannot teleport in space and time but must move along continuous paths. Similarly, it may not inherently understand that objects cannot spontaneously change their size, shape, or color, or that they get displaced when hit by moving objects. This shortcoming poses challenges in robotic manipulation tasks that require physical understanding. Unlike human cognition, which naturally incorporates an understanding of physics through experience and observation, pre-trained VLMs often struggle to grasp concepts such as object dynamics and environmental interactions.
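To make the "objects cannot teleport" intuition concrete, here is a minimal Python sketch of a continuity check on a tracked object's positions. It is only an illustration of the kind of physical constraint discussed above, not the method explored during the internship; the function names, the `max_speed` bound, and the sample trajectories are all assumptions made up for this example.

```python
from math import dist


def max_step(positions):
    """Largest distance the object moves between two consecutive frames."""
    return max(
        (dist(a, b) for a, b in zip(positions, positions[1:])),
        default=0.0,  # fewer than two frames: no movement to measure
    )


def violates_continuity(positions, max_speed, dt):
    """True if any per-frame displacement exceeds what max_speed allows in dt.

    max_speed and dt are hypothetical parameters: an assumed upper bound on
    object speed and the time between frames.
    """
    return max_step(positions) > max_speed * dt


# Hypothetical 2D trajectories, sampled every 0.1 s.
smooth = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0)]
teleport = [(0.0, 0.0), (0.1, 0.0), (2.0, 0.0)]

print(violates_continuity(smooth, max_speed=2.0, dt=0.1))
print(violates_continuity(teleport, max_speed=2.0, dt=0.1))
```

A check like this treats the prior as an explicit rule. The goal described in this report is different: to have the model pick up such regularities implicitly rather than through hand-written constraints.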
Our goal is to implicitly introduce physics priors into robotic foundation models so that they can learn more efficiently from demonstration data. To address this challenge, we explored integrating reinforcement learning with LLaMA-2 7B, a pre-trained large language model. As a preliminary step toward incorporating physics priors into robotic foundation models, I set up training and evaluation environments using the LIBERO simulation framework, allowing us to test our ideas on a smaller scale.
About the Experience
The MATSUO-IWASAWA Laboratory creates a collaborative research environment. One of the lab’s notable aspects is its monthly poster session, where researchers present their current work, engage in discussions, and contribute to each other’s projects. These sessions provide an excellent platform for knowledge exchange, allowing researchers to receive constructive feedback and refine their ideas. During my stay, I participated in these sessions and found them highly beneficial. I received interesting questions and valuable insights on potential issues in my research approach from the audience.
I appreciate the opportunity to collaborate with the researchers at the MATSUO-IWASAWA Laboratory. Their guidance and support significantly enriched my research experience. Yusuke Iwasawa and Tatsuya Matsushima not only shared relevant research papers but also provided valuable insights during the method design phase, helping shape our approach to addressing key limitations. Petr Khrapchenkov offered invaluable assistance in setting up the development workflow and optimizing the server infrastructure for training, ensuring a smooth research process. Collaboration with the lab members made the technical challenges of research much easier to navigate. Whether I was discussing novel methods, debugging code, optimizing training procedures, or analyzing experimental results, I could count on support from my peers. I also had the opportunity to share my expertise in reinforcement learning and deep learning in robotics, contributing to discussions and ongoing research.
Beyond research, I am especially grateful to Rie Matsukawa for her support in managing the logistical aspects of my stay in Japan, including travel arrangements, transportation, and accommodation. Staying in a foreign country can be challenging, but her assistance made the experience much more manageable. My time in Japan also provided a wonderful opportunity to explore the local culture in Tokyo. I had the chance to experience traditional Japanese cuisine, visit historical landmarks, and enjoy breathtaking views.
Overall, my time at the MATSUO-IWASAWA Laboratory was meaningful. I gained not only valuable research insights but also a glimpse of Japan’s collaborative research culture and daily life. This experience has been a milestone in my academic journey, and I am deeply grateful for the opportunity to engage with such a talented and supportive research community. The discussions, mentorship, and technical support I received during my stay helped me refine my approach and shape my research direction.
We hope you enjoyed this report.
The Matsuo Lab is actively recruiting researchers. If you are interested, please see the link below!
https://weblab.t.u-tokyo.ac.jp/joinus/career/