
Vaibhav Unhelkar
Assistant Professor of Computer Science · Rice University · Computer Science
Active 2014–2026
About
Professor Vaibhav Unhelkar is an Assistant Professor of Computer Science at Rice University, leading the Human-Centered AI and Robotics (HCAIR) Group within the Department of Computer Science. His research envisions a future where robots and AI systems act as assistants, teammates, and trainers to humans, enhancing human capability through interactive AI systems. The group focuses on developing computational foundations for these systems by creating machine learning and decision-making algorithms, prototyping them, and evaluating their effectiveness with human users. Their work is grounded in both near-term and futuristic applications, with ongoing projects in healthcare and disaster response. Collaborating with experts in human factors, team science, and medicine, Professor Unhelkar's research aims to model human behavior, train human-robot teams, and explain AI system behavior to improve human-AI collaboration. His research areas include interactive machine learning, explainable AI, human modeling and prediction, robot learning, and human-robot interaction.
Research topics
- Artificial Intelligence
- Computer Science
- Machine Learning
- Human–computer interaction
- Engineering
- Data science
- Programming language
Selected publications
Sampling-Based Motion Planning With Scene Graphs Under Perception Constraints
IEEE Robotics and Automation Letters · 2026-03-13
article
It will be increasingly common for robots to operate in cluttered human-centered environments such as homes, workplaces, and hospitals, where the robot is often tasked to maintain perception constraints, such as monitoring people or multiple objects, for safety and reliability while executing its task. However, existing perception-aware approaches typically focus on low-degree-of-freedom (DOF) systems or only consider a single object in the context of high-DOF robots. This motivates us to consider the problem of perception-aware motion planning for high-DOF robots that accounts for multi-object monitoring constraints. We employ a scene graph representation of the environment, offering a great potential for incorporating long-horizon task and motion planning thanks to its rich semantic and spatial information. However, it does not capture perception-constrained information, such as the viewpoints the user prefers. To address these challenges, we propose MOPS-PRM, a roadmap-based motion planner, that integrates the perception cost of observing multiple objects or humans directly into motion planning for high-DOF robots. The perception cost is embedded to each object as part of a scene graph, and used to selectively sample configurations for roadmap construction, implicitly enforcing the perception constraints. Our method is extensively validated in both simulated and real-world experiments, achieving more than ~36% improvement in the average number of detected objects and ~17% better track rate against other perception-constrained baselines, with comparable planning times and path lengths.
Cognitive Digital Twins in Surgery: Insights from an International Multidisciplinary Workshop
2026-02-06
article · Open access
A multidisciplinary workshop, 'Toward Cognitive Digital Twins in Surgery,' was held during the Hamlyn Symposium in London, UK on Friday, June 27, 2025. This work summarizes the main insights that emerged from the discussion, including the potential benefits and limitations of Cognitive Digital Twins (CogDT) in surgery. The CogDT framework is designed to capture the entire clinical system, including patients and healthcare providers. We introduce the novel concept of the Surgical Cognitive Digital Twin (S-CogDT), highlighting its potential for advancement in surgical care. The primary identified benefits of S-CogDTs include creating dynamic, individualized profiles for personalized provider training, establishing real-time early-warning systems to detect and mitigate surgeon cognitive overload, and streamlining organizational performance within the surgical environment. However, the effective adoption of this technology faces significant challenges. These include the difficulty of measuring and validating dynamic cognitive states, the considerable hurdles in achieving user acceptance and trust, and critical questions surrounding data governance, privacy, and regulatory pathways. These results inform future research in this novel, multidisciplinary field by clearly showing the potential of S-CogDTs and mapping the potential challenges required for their successful implementation.
Sampling-Based Motion Planning with Scene Graphs Under Perception Constraints
Open MIND · 2026-03-03
preprint
Open MIND · 2026-02-20
preprint · Senior author
When training artificial intelligence (AI) to perform tasks, humans often care not only about whether a task is completed but also how it is performed. As AI agents tackle increasingly complex tasks, aligning their behavior with human-provided specifications becomes critical for responsible AI deployment. Reward design provides a direct channel for such alignment by translating human expectations into reward functions that guide reinforcement learning (RL). However, existing methods are often too limited to capture nuanced human preferences that arise in long-horizon tasks. Hence, we introduce Hierarchical Reward Design from Language (HRDL): a problem formulation that extends classical reward design to encode richer behavioral specifications for hierarchical RL agents. We further propose Language to Hierarchical Rewards (L2HR) as a solution to HRDL. Experiments show that AI agents trained with rewards designed via L2HR not only complete tasks effectively but also better adhere to human specifications. Together, HRDL and L2HR advance the research on human-aligned AI agents.
arXiv (Cornell University) · 2026-02-20
article · Open access · Senior author
Sampling-Based Motion Planning with Scene Graphs Under Perception Constraints
ArXiv.org · 2026-03-03
article · Open access
Hierarchical Imitation Learning of Team Behavior from Heterogeneous Demonstrations
ArXiv.org · 2025-02-24
preprint · Open access · Senior author
Successful collaboration requires team members to stay aligned, especially in complex sequential tasks. Team members must dynamically coordinate which subtasks to perform and in what order. However, real-world constraints like partial observability and limited communication bandwidth often lead to suboptimal collaboration. Even among expert teams, the same task can be executed in multiple ways. To develop multi-agent systems and human-AI teams for such tasks, we are interested in data-driven learning of multimodal team behaviors. Multi-Agent Imitation Learning (MAIL) provides a promising framework for data-driven learning of team behavior from demonstrations, but existing methods struggle with heterogeneous demonstrations, as they assume that all demonstrations originate from a single team policy. Hence, in this work, we introduce DTIL: a hierarchical MAIL algorithm designed to learn multimodal team behaviors in complex sequential tasks. DTIL represents each team member with a hierarchical policy and learns these policies from heterogeneous team demonstrations in a factored manner. By employing a distribution-matching approach, DTIL mitigates compounding errors and scales effectively to long horizons and continuous state representations. Experimental results show that DTIL outperforms MAIL baselines and accurately models team behavior across a variety of collaborative scenarios.
Hierarchical Imitation Learning of Team Behavior from Heterogeneous Demonstrations
2025-05-28
article · Senior author
The Need for Human-AI Collaborative Methods for Conducting Audits of Machine Learning Models
Proceedings of the AAAI Symposium Series · 2025-05-28
article · Open access · Senior author
Conducting application audits of ML models is essential for ensuring their safe and responsible deployment, particularly in high-stakes applications. However, the auditing of ML models deployed in domain-specific applications remains largely a manual process, relying on domain experts to identify model errors. The manual nature of the process limits scalability of audits and hinders the discovery of problematic model behaviors. We posit that a human-AI collaborative paradigm is key to conducting effective application audits. In this abstract, we propose a research agenda to develop Human-AI collaborative methods for conducting application audits of ML models.
Spatiotemporally Controlled Soft Robotics with Optically Responsive Liquid Crystal Elastomers
Advanced Intelligent Systems · 2025-04-15 · 3 citations
article · Open access
Light‐responsive materials enable the development of soft robots that are controlled remotely in 3D space and time without the need for cumbersome wires, onboard batteries, or altering the local environment. Azobenzene liquid crystal polymer networks are one such material that can move and deform in response to light actuation. Previous works have demonstrated azo‐based soft robotic grippers and transporters that are remotely powered by light. However, highly adaptive, automated spatiotemporal optical control over these materials has not yet been realized. Herein, a system for an azobenzene liquid crystal elastomer soft robotic arm is created by dynamically patterning light for independently maneuverable joints. The nonlinear material response to optical actuation is characterized, and the broad actuation space is explored with diverse arm configurations. A neural network is trained on the arm configurations and corresponding laser pattern to automate the pattern generation for a desired configuration. Finally, the azobenzene liquid crystal elastomer arm demonstrates complex targeted motion, marking an important step toward optically actuated soft robotics with applications ranging from optomechanics to biomedical tools.
Frequent coauthors
- Julie Shah · 27 shared
- Quirin Tyroller · BMW (Germany) · 14 shared
- Ho Chit Siu · Worcester Polytechnic Institute · 14 shared
- Stefan Bartscher · BMW (Germany) · 13 shared
- Johannes Bix · 13 shared
- James C. Boerkoel · Harvey Mudd College · 12 shared
- Przemyslaw A. Lasota · Massachusetts Institute of Technology · 12 shared
- Jorge Perez · Massachusetts Institute of Technology · 12 shared
Labs
Human-Centered AI and Robotics Group · PI
Developing computational foundations of artificially intelligent systems that enhance human capability.
Education
- 2020 · Ph.D. (Autonomous Systems) · Massachusetts Institute of Technology
- 2012 · B. Tech., M. Tech., Aerospace Engineering · Indian Institute of Technology Bombay