Hongjie (Tony) Fang 方泓杰

I am a fourth-year Ph.D. student in Computer Science in the Wu Wenjun Honorable Class at Shanghai Jiao Tong University (SJTU) & Shanghai Artificial Intelligence Laboratory, advised by Prof. Cewu Lu. Previously, I received my B.Eng. degree in Computer Science and Engineering and my B.Ec. degree in Finance from SJTU in 2022.

My research interests mainly lie in robotics, specifically robotic manipulation (including contact-rich, dexterous, and long-horizon manipulation), robot learning (including imitation learning, multimodal learning, data collection methods, in-context learning, and reinforcement learning), and grasping. I am currently a member of the SJTU Machine Vision and Intelligence Group (MVIG). My ultimate goal is to enable robots to perform diverse tasks in the real world under any circumstances, improving the quality of human life.


Photo @ İstanbul, Türkiye 🇹🇷
Credit to Jingjing Chen

News
  • Apr. 2026: Force Policy is accepted by RSS 2026. See you in Sydney!
  • Jan. 2026: Three papers (HistRISE, DQ-RISE and MBA) are accepted by ICRA 2026. See you in Vienna!
  • Aug. 2025: AirExo-2 is accepted by CoRL 2025.
  • Jun. 2025: Three papers (FoAR, SIME, and KDIL) are accepted by IROS 2025.
  • Apr. 2025: FoAR is accepted by RA-L.
  • Mar. 2025: AirExo-2 is released! Check our website for more details.
  • Jan. 2025: Two papers (S2I and CAGE) are accepted by ICRA 2025.
  • Jun. 2024: RISE is accepted by IROS 2024.
  • Jan. 2024: Four papers (AirExo, RH20T, Open X-Embodiment and AnyGrasp) are accepted by ICRA 2024.
  • Oct. 2023: Open X-Embodiment is released! Proud of this wonderful collaboration in the robotics community.
  • Sept. 2023: AirExo is released! Check our website for more details.
  • Apr. 2023: AnyGrasp is accepted by T-RO.
  • Jun. 2022: TransCG is accepted by RA-L.
Publications

* denotes equal contribution. # denotes corresponding author(s).

Multimodal Learning Imitation Learning Contact-Rich Manipulation Force/Torque Generalization
RSS 2026
Force Policy: Learning Hybrid Force-Position Control Policy under Interaction Frame for Contact-Rich Manipulation

We introduce a physically grounded interaction frame that decouples motion and force control axes from demonstrations. By combining a global vision policy and a high-frequency local policy with hybrid force-position control, Force Policy improves contact stability, force regulation, and generalization on real contact-rich tasks.

Human Video Imitation Learning Generalization
arXiv 2026
LIDEA: Human-to-Robot Imitation Learning via Implicit Feature Distillation and Explicit Geometry Alignment

LIDEA transfers human demonstrations through dual-stage 2D feature distillation and embodiment-agnostic 3D geometry alignment. This cross-embodiment design makes human-to-robot imitation more reliable and improves generalization to new setups.

Imitation Learning Multimodal Learning History/Memory
ICRA 2026
History-Aware Visuomotor Policy Learning via Point Tracking

We introduce an object-centric history representation built upon point tracks, compressing long-horizon observations into task-relevant object memory for diverse visuomotor policies. This efficient design consistently outperforms both Markovian and prior history-based baselines, improving decision quality and task success.

Imitation Learning Dexterous Manipulation Teleoperation
ICRA 2026
Learning Dexterous Manipulation with Quantized Hand State

DQ-RISE quantizes dexterous hand states and couples them with arm diffusion through a continuous relaxation for structured arm-hand learning. This balances the action space and yields more efficient learning in dexterous manipulation.

Dexterous Grasping Dexterous Manipulation
arXiv 2025
AnyDexGrasp: General Dexterous Grasping for Different Hands with Human-Level Learning Efficiency

We introduce AnyDexGrasp, a data-efficient dexterous grasping method that transfers across different robotic hands, built upon intermediate contact-centric grasp representations. It achieves high real-world success in cluttered scenes with over 150 novel objects, demonstrating scalable cross-hand grasp generalization.

In-the-Wild Collection Imitation Learning Generalization 3D Perception
CoRL 2025 Oral
AirExo-2: Scaling up Generalizable Robotic Imitation Learning with Low-Cost Exoskeletons

We develop AirExo-2 for low-cost, large-scale in-the-wild collection and convert human demonstrations into pseudo-robot data. Together with the generalizable visuomotor policy RISE-2 that integrates 3D perception and 2D visual foundation models, this pipeline reaches strong performance without teleoperated data.

Imitation Learning Generalization
IROS 2025
Knowledge-Driven Imitation Learning: Enabling Generalization Across Diverse Conditions

We formulate object-centric knowledge as a semantic keypoint graph template and use a coarse-to-fine matching strategy to inject it into policy learning. This design improves category-level abstraction and boosts generalization across objects.

Imitation Learning
IROS 2025
SIME: Enhancing Policy Self-Improvement with Modal-Level Exploration

We propose modal-level exploration to generate diverse multi-modal interaction data, then learn from the most informative trials and segments. This self-improvement loop raises data efficiency and steadily strengthens policy capability over time.

Imitation Learning Action Generation
ICCV 2025
Dense Policy: Bidirectional Autoregressive Learning of Actions

We propose a bidirectionally expanded action head that unfolds action sequences in a coarse-to-fine manner. This design preserves the capability of the policy backbone while enabling logarithmic-time inference for faster manipulation control.

Multimodal Learning Contact-Rich Manipulation Force/Torque
RA-L 2025 & IROS 2025
FoAR: Force-Aware Reactive Policy for Contact-Rich Robotic Manipulation

FoAR is a force-aware policy that fuses vision with high-frequency force/torque sensing using a future-contact-guided gating module. This enables phase-adaptive control and delivers more accurate, robust contact-rich manipulation.

Imitation Learning Object Pose Action Generation
RA-L 2025 & ICRA 2026
Motion Before Action: Diffusing Object Motion as Manipulation Condition

MBA is a plug-and-play module that cascades action diffusion for object motion generation and motion-guided robot action generation. Integrated into existing policies, it consistently improves manipulation performance in various tasks.

Imitation Learning Generalization
ICRA 2025
CAGE: Causal Attention Enables Data-Efficient Generalizable Robotic Manipulation

CAGE is a data-efficient generalizable policy utilizing visual foundation models and causal attention. With 50 demonstrations in a single domain, it generalizes to unseen backgrounds, objects, and viewpoints while outperforming prior methods.

Imitation Learning Data Quality
ICRA 2025
Towards Effective Utilization of Mixed-Quality Demonstrations in Robotic Manipulation via Segment-Level Selection and Optimization

S2I is a segment-level selection and optimization framework for mixed-quality demonstrations that plugs into existing policies. Using only a few expert references, it improves downstream performance and makes suboptimal data more usable.

Imitation Learning 3D Perception Generalization
IROS 2024
RISE: 3D Perception Makes Real-World Robot Imitation Simple and Effective

RISE is an end-to-end imitation policy that predicts continuous actions directly from single-view point clouds. With only 50 demonstrations per task, it outperforms representative 2D and 3D baselines in accuracy, efficiency, and generalization.

Manipulation Dataset
ICRA 2024 Best Paper
Open X-Embodiment: Robotic Learning Datasets and RT-X Models
Open X-Embodiment Collaboration, [...], Hongjie Fang, [...] (194 authors)

We contribute Open X-Embodiment, a 1M+ trajectory real-robot dataset spanning 22 embodiments, plus large RT-X models trained at scale. This breadth enables strong cross-embodiment co-training gains and advances robotic foundation models.

In-the-Wild Collection Teleoperation
ICRA 2024
AirExo: Low-Cost Exoskeletons for Learning Whole-Arm Manipulation in the Wild

AirExo is a low-cost portable dual-arm exoskeleton for joint-level teleoperation and in-the-wild demonstration collection. Pre-training with scalable in-the-wild data improves sample efficiency and robustness.

Manipulation Dataset Teleoperation
ICRA 2024
RH20T: A Comprehensive Robotic Dataset for Learning Diverse Skills in One-Shot

RH20T is a real-world dataset of 110k+ sequences across diverse skills, robots, viewpoints, and contexts with synchronized visual, force, audio, tactile, and action signals. Its scale and multimodal quality make it a great training source for one-shot and generalizable manipulation.

Dynamic Grasping
IROS 2023
Flexible Handover with Real-Time Robust Dynamic Grasp Trajectory Generation

We propose a flexible handover framework with real-time robust grasp-trajectory generation and future grasp prediction. This improves adaptability to dynamic handover scenes and raises success on moving-object grasps.

Dynamic Grasping
CVPR 2023
Target-Referenced Reactive Grasping for Dynamic Objects

We reformulate reactive grasping around target-referenced semantic consistency rather than only temporal smoothness. Tracking in generated grasp spaces improves grasp reliability for dynamic objects.

General Grasping
T-RO 2023 & ICRA 2024
AnyGrasp: Robust and Efficient Grasp Perception in Spatial and Temporal Domains

AnyGrasp is a unified model for static and dynamic general grasping that predicts accurate dense full-DoF grasps efficiently. It remains robust under severe depth noise, improving real-world deployment reliability.

Perception Dataset Grasping
RA-L 2022 & ICRA 2023
TransCG: A Large-Scale Real-World Dataset for Transparent Object Depth Completion and a Grasping Baseline

TransCG is a large-scale real-world benchmark for transparent object depth completion. We also propose DFNet, a lightweight baseline for depth completion of transparent objects. This closes a key sensing gap and improves perception for transparent objects.

General Grasping 3D Perception
ICCV 2021
Graspness Discovery in Clutters for Fast and Accurate Grasp Detection

We propose graspness, a geometry-driven grasp quality measure for identifying graspable regions in clutter via look-ahead search. A learned graspness predictor enables fast, accurate grasp detection in practice.

Selected Projects
research project
EasyRobot
Hongjie Fang

Provides an easy and unified interface for robots, grippers, sensors, and pedals.

Course project for the SJTU undergraduate course "Mobile Internet"
Oh-My-Papers

Proposes learning "jargon" terms such as "ResNet" and "YOLO" from academic citation information, treating citations as the search results for the corresponding term. For example, a search for "ResNet" should return "Deep Residual Learning for Image Recognition" rather than papers that merely contain the word "ResNet" in their titles, as current scholarly search engines commonly do.

Academic Services

Reviewer for Conferences:

  • IEEE International Conference on Robotics and Automation (ICRA), 2023, 2024, 2025, 2026
  • IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023, 2024, 2025, 2026
  • Conference on Robot Learning (CoRL), 2025, 2026
  • International Conference on Learning Representations (ICLR), 2025
  • Advances in Neural Information Processing Systems (NeurIPS), 2025, 2026
  • IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026

Reviewer for Journals:
  • IEEE Robotics and Automation Letters (RA-L)
  • IEEE Transactions on Cybernetics (T-CYB)
  • IEEE/ASME Transactions on Mechatronics (T-MECH)
  • IEEE Transactions on Automation Science and Engineering (T-ASE)

Talks

Collaboration & Mentoring

I collaborate closely with Hao-Shu Fang @ MIT, Chenxi Wang @ Noematrix, Shangning Xia @ Noematrix, Lixin Yang @ SJTU, Jun Lv @ Noematrix, and Shiquan Wang @ Flexiv. I welcome opportunities for discussions and potential collaborations, and I am particularly interested in working with highly motivated undergraduate and master's students. Please feel free to contact me via email. I'm fortunate to work with the following students:

Course Notes

I share some of my notes from the courses I took in graduate school on this page. More notes from my undergraduate studies can be found in this repository.