Byung-Cheol Min

Associate Professor and University Faculty Scholar · Verified

Purdue University · Department of Computer and Information Technology

Active 1996–2026

h-index: 18

Citations: 1.4k

Papers: 176 (94 in the last 5 years)

Funding: $500k



About

Byung-Cheol (B.C.) Min is a Professor in the Department of Computer Science and the Department of Intelligent Systems Engineering at Indiana University Bloomington, a position he assumed in Fall 2025. He also serves as an Adjunct Professor in the School of Applied and Creative Computing at Purdue University. For the most current information about his work and academic activities, Professor Min directs visitors to his new webpage and to the website of his research laboratory, the SMART Lab.

Research topics

Computer Science
Engineering
Computer Security
Political Science
Artificial Intelligence
Systems engineering
Operations research
Risk analysis (engineering)
Simulation
Software engineering
Operations management
Human–computer interaction
Database
Business
Law

Selected publications

PrefMoE: Robust Preference Modeling with Mixture-of-Experts Reward Learning
arXiv (Cornell University) · 2026-05-01
preprint · Open access · Senior author
Preference-based reinforcement learning offers a scalable alternative to manual reward engineering by learning reward structures from comparative feedback. However, large-scale preference datasets, whether collected from crowdsourced annotators or generated by synthetic teachers, often contain heterogeneous and partially conflicting supervision, including disagreement across annotators and inconsistency within annotators. Existing reward learning methods typically fit a single reward model to such data, forcing it to average incompatible signals and thereby limiting robustness. To solve this, we propose PrefMoE, a mixture-of-experts reward learning framework for robust preference modeling. PrefMoE learns multiple specialized reward experts and uses trajectory-level soft routing to combine them adaptively, enabling the model to capture diverse latent preference patterns under noisy and heterogeneous preference supervision. A load-balancing regularizer further stabilizes training by preventing expert collapse. Across locomotion benchmarks from D4RL and manipulation tasks from MetaWorld, PrefMoE improves preference prediction robustness and leads to more reliable downstream policy learning than strong single-model baselines.
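The abstract above names three ingredients: several reward experts, trajectory-level soft routing, and a load-balancing regularizer. The following is a minimal sketch of how such pieces could fit together, not the paper's implementation. The linear experts, the mean-feature gate, the Bradley-Terry preference model, and the squared-usage penalty are all assumptions chosen for illustration, as are all function and parameter names.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def routing_weights(gate_params, segment):
    """One soft routing vector per trajectory segment (assumed: gate sees mean features)."""
    dim = len(segment[0])
    mean_feat = [sum(step[i] for step in segment) / len(segment) for i in range(dim)]
    return softmax([dot(g, mean_feat) for g in gate_params])


def segment_return(expert_params, gate_params, segment):
    """Gate-weighted mix of each expert's summed reward (assumed: linear experts)."""
    gate = routing_weights(gate_params, segment)
    per_expert = [sum(dot(w, step) for step in segment) for w in expert_params]
    return dot(gate, per_expert)


def preference_prob(expert_params, gate_params, seg_a, seg_b):
    """Bradley-Terry probability that seg_a is preferred over seg_b (assumed model)."""
    diff = (segment_return(expert_params, gate_params, seg_a)
            - segment_return(expert_params, gate_params, seg_b))
    return 1.0 / (1.0 + math.exp(-diff))


def load_balance_penalty(gates):
    """K * sum_k usage_k^2: equals 1 for uniform usage and K if one expert takes everything."""
    k = len(gates[0])
    usage = [sum(g[i] for g in gates) / len(gates) for i in range(k)]
    return k * sum(u * u for u in usage)


# Toy data: 3 experts, 4-dimensional step features, 5-step segments.
rng = random.Random(0)
K, D, T = 3, 4, 5
experts = [[rng.uniform(-1, 1) for _ in range(D)] for _ in range(K)]
gate_w = [[rng.uniform(-1, 1) for _ in range(D)] for _ in range(K)]
seg_a = [[rng.uniform(-1, 1) for _ in range(D)] for _ in range(T)]
seg_b = [[rng.uniform(-1, 1) for _ in range(D)] for _ in range(T)]

p_ab = preference_prob(experts, gate_w, seg_a, seg_b)
p_ba = preference_prob(experts, gate_w, seg_b, seg_a)
penalty = load_balance_penalty([routing_weights(gate_w, seg_a),
                                routing_weights(gate_w, seg_b)])
```

In a training loop, a sketch like this would typically minimize the cross-entropy between `p_ab` and the preference label, plus a small multiple of `penalty`; the weighting is left open here because the abstract does not specify it.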
Pre-Execution Safety Gate & Task Safety Contracts for LLM-Controlled Robot Systems
arXiv (Cornell University) · 2026-04-07
preprint · Open access · Senior author
Large Language Models (LLMs) are increasingly used to convert task commands into robot-executable code; however, this pipeline lacks validation gates to detect unsafe and defective commands before they are translated into robot code. Furthermore, even commands that appear safe at the outset can produce unsafe state transitions during execution in the absence of continuous constraint monitoring. In this research, we introduce SafeGate, a neurosymbolic safety architecture that prevents unsafe natural language task commands from reaching robot execution. Drawing on the ISO 13482 safety standard, SafeGate extracts structured safety-relevant properties from natural language commands and applies a deterministic decision gate to authorize or reject execution. In addition, we introduce Task Safety Contracts, which decompose commands that pass through the gate into invariants, guards, and abort conditions to prevent unsafe state transitions during execution. We further incorporate Z3 SMT solving to enforce constraint checking derived from the Task Safety Contracts. We evaluate SafeGate against existing LLM-based robot safety frameworks and baseline LLMs across 230 benchmark tasks, 30 AI2-THOR simulation scenarios, and real-world robot experiments. Results show that SafeGate significantly reduces the acceptance of defective commands while maintaining a high acceptance of benign tasks, demonstrating the importance of pre-execution safety gates for LLM-controlled robot systems.
Personalization in Human-Robot Interaction Through Preference-Based Action Representation Learning
2025-05-19 · 3 citations
article · Senior author
Preference-based reinforcement learning (PbRL) has shown significant promise for personalization in human-robot interaction (HRI) by explicitly integrating human preferences into the robot learning process. However, existing practices often require training a personalized robot policy from scratch, resulting in inefficient use of human feedback. In this paper, we propose preference-based action representation learning (PbARL), an efficient fine-tuning method that decouples common task structure from preference by leveraging pre-trained robot policies. Instead of directly fine-tuning the pre-trained policy with human preference, PbARL uses it as a reference for an action representation learning task that maximizes the mutual information between the pre-trained source domain and the target user preference-aligned domain. This approach allows the robot to personalize its behaviors while preserving original task performance and eliminates the need for extensive prior information from the source domain, thereby enhancing efficiency and practicality in real-world HRI scenarios. Empirical results on the Assistive Gym benchmark and a real-world user study (N=8) demonstrate the benefits of our method compared to state-of-the-art approaches. Website at https://sites.google.com/view/pbarl.
Multi-Agent LLM Actor-Critic Framework for Social Robot Navigation
ArXiv.org · 2025-03-12 · 1 citation
preprint · Open access · Senior author
Recent advances in robotics and large language models (LLMs) have sparked growing interest in human-robot collaboration and embodied intelligence. To enable the broader deployment of robots in human-populated environments, socially-aware robot navigation (SAN) has become a key research area. While deep reinforcement learning approaches that integrate human-robot interaction (HRI) with path planning have demonstrated strong benchmark performance, they often struggle to adapt to new scenarios and environments. LLMs offer a promising avenue for zero-shot navigation through commonsense inference. However, most existing LLM-based frameworks rely on centralized decision-making, lack robust verification mechanisms, and face inconsistencies in translating macro-actions into precise low-level control signals. To address these challenges, we propose SAMALM, a decentralized multi-agent LLM actor-critic framework for multi-robot social navigation. In this framework, a set of parallel LLM actors, each reflecting distinct robot personalities or configurations, directly generate control signals. These actions undergo a two-tier verification process via a global critic that evaluates group-level behaviors and individual critics that assess each robot's context. An entropy-based score fusion mechanism further enhances self-verification and re-query, improving both robustness and coordination. Experimental results confirm that SAMALM effectively balances local autonomy with global oversight, yielding socially compliant behaviors and strong adaptability across diverse multi-robot scenarios. More details and videos about this work are available at: https://sites.google.com/view/SAMALM.
Few-Shot Demonstration-Driven Task Coordination and Trajectory Execution for Multi-Robot Systems
ArXiv.org · 2025-10-17
preprint · Open access · Senior author
In this paper, we propose a novel few-shot learning framework for multi-robot systems that integrates both spatial and temporal elements: Few-Shot Demonstration-Driven Task Coordination and Trajectory Execution (DDACE). Our approach leverages temporal graph networks for learning task-agnostic temporal sequencing and Gaussian Processes for spatial trajectory modeling, ensuring modularity and generalization across various tasks. By decoupling temporal and spatial aspects, DDACE requires only a small number of demonstrations, significantly reducing data requirements compared to traditional learning from demonstration approaches. To validate our proposed framework, we conducted extensive experiments in task environments designed to assess various aspects of multi-robot coordination, such as multi-sequence execution, multi-action dynamics, complex trajectory generation, and heterogeneous configurations. The experimental results demonstrate that our approach successfully achieves task execution under few-shot learning conditions and generalizes effectively across dynamic and diverse settings. This work underscores the potential of modular architectures in enhancing the practicality and scalability of multi-robot systems in real-world applications. Additional materials are available at https://sites.google.com/view/ddace.
ZeroCAP: Zero-Shot Multi-Robot Context Aware Pattern Formation via Large Language Models
2025-05-19 · 2 citations
article · Senior author
Incorporating language comprehension into robotic operations unlocks significant advancements in robotics, but also presents distinct challenges, particularly in executing spatially oriented tasks like pattern formation. This paper introduces ZeroCAP, a novel system that integrates large language models with multi-robot systems for zero-shot context-aware pattern formation. Grounded in the principles of language-conditioned robotics, ZeroCAP leverages the interpretative power of language models to translate natural language instructions into actionable robotic configurations. This approach combines vision-language models, cutting-edge segmentation techniques, and shape descriptors, enabling the realization of complex, context-driven pattern formations in multi-robot coordination. Through extensive experiments, we demonstrate the system's proficiency in executing complex context-aware pattern formations across a spectrum of tasks, from surrounding and caging objects to infilling regions. This not only validates the system's capability to interpret and implement intricate context-driven tasks but also underscores its adaptability and effectiveness across varied environments and scenarios. The experimental videos and additional information about this work can be found at https://sites.google.com/view/zerocap/home.
PrefCLM: Enhancing Preference-Based Reinforcement Learning With Crowdsourced Large Language Models
IEEE Robotics and Automation Letters · 2025-01-15 · 9 citations
article · Senior author
Preference-based reinforcement learning (PbRL) is emerging as a promising approach to teaching robots through human comparative feedback without complex reward engineering. However, the substantial volume of human feedback required hinders broader applications. In this work, we introduce PrefCLM, a novel framework that utilizes crowdsourced large language models (LLMs) as synthetic teachers in PbRL. We utilize Dempster-Shafer Theory to fuse individual preference beliefs from multiple LLM agents at the score level, efficiently leveraging their diversity and collective intelligence. We also introduce a human-in-the-loop pipeline, enabling iterative and collective refinements that adapt to the nuanced and individualized preferences inherent to human-robot interaction (HRI) scenarios. Experimental results across various general RL tasks show that PrefCLM achieves competitive performance compared to expert-engineered scripted teachers and excels in facilitating more natural and efficient behaviors. A real-world user study (N = 10) further demonstrates its capability to tailor robot behaviors to individual user preferences, enhancing user satisfaction in HRI scenarios.
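The abstract above mentions fusing preference beliefs from several LLM agents with Dempster-Shafer Theory. As a minimal sketch of that idea, the code below applies the standard Dempster rule of combination to one preference query. The two-outcome frame ("A" or "B" preferred), the example mass values, and all names are assumptions for illustration; the paper's actual score-level formulation may differ.

```python
from itertools import product

# Frame of discernment for a single query: segment "A" is preferred, or segment "B" is.
A = frozenset({"A"})
B = frozenset({"B"})
EITHER = A | B  # mass placed on the whole frame expresses "undecided"


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions (dict: frozenset -> mass)."""
    fused = {}
    conflict = 0.0
    for (s1, w1), (s2, w2) in product(m1.items(), m2.items()):
        overlap = s1 & s2
        if overlap:
            fused[overlap] = fused.get(overlap, 0.0) + w1 * w2
        else:
            conflict += w1 * w2  # mass assigned to contradictory hypotheses
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    # Renormalize so the surviving masses sum to one.
    return {s: w / (1.0 - conflict) for s, w in fused.items()}


def fuse_all(masses):
    """Fold Dempster's rule over a list of mass functions."""
    result = masses[0]
    for m in masses[1:]:
        result = combine(result, m)
    return result


# Hypothetical beliefs from three agents about the same pair of segments.
agents = [
    {A: 0.6, B: 0.1, EITHER: 0.3},
    {A: 0.5, B: 0.2, EITHER: 0.3},
    {A: 0.2, B: 0.5, EITHER: 0.3},
]
fused = fuse_all(agents)
label = "A" if fused[A] > fused[B] else "B"
```

With these example masses, two agents lean toward A and one toward B, so the fused belief favors A while still keeping some mass on the undecided set.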
Multimodal Audio-Based Disease Prediction With Transformer-Based Hierarchical Fusion Network
IEEE Transactions on Audio Speech and Language Processing · 2025-01-01 · 2 citations
article · Senior author
Audio-based disease prediction is emerging as a promising supplement to traditional medical diagnosis methods, facilitating early, convenient, and non-invasive disease detection and prevention. Multimodal fusion, which integrates features from various domains within or across bio-acoustic modalities, has proven effective in enhancing diagnostic performance. However, most existing methods in the field employ unilateral fusion strategies that focus solely on either intra-modal or inter-modal fusion. This approach limits the full exploitation of the complementary nature of diverse acoustic feature domains and bio-acoustic modalities. Additionally, the inadequate and isolated exploration of latent dependencies within modality-specific and modality-shared spaces curtails their capacity to manage the inherent heterogeneity in multimodal data. To fill these gaps, we propose a transformer-based hierarchical fusion network designed for general multimodal audio-based disease prediction. Specifically, we seamlessly integrate intra-modal and inter-modal fusion in a hierarchical manner and proficiently encode the necessary intra-modal and inter-modal complementary correlations, respectively. Comprehensive experiments demonstrate that our model achieves state-of-the-art performance in predicting three diseases: COVID-19, Parkinson's disease, and pathological dysarthria, showcasing its promising potential in a broad context of audio-based disease prediction tasks. Additionally, extensive ablation studies and qualitative analyses highlight the significant benefits of each main component within our model.

Recent grants

CAREER: Adaptive Human Multi-Robot Systems
NSF · $500k · 2019–2025

Frequent coauthors

Wonse Jo · Purdue University West Lafayette · 39 shared
Eric T. Matson · Purdue University System · 24 shared
Shyam Sundar Kannan · 24 shared
Richard M. Voyles · 22 shared
Tamzidul Mina · Sandia National Laboratories · 21 shared
Donghan Kim · 17 shared
Ruiqi Wang · Start Making A Reader Today · 14 shared
Dezhong Zhao · 14 shared

Education

Postdoc, Robotics Institute · Carnegie Mellon University · 2015
Ph.D., Computer and Information Technology · Purdue University · 2014

Awards & honors

NSF CAREER Award (2019)
Purdue PPI Outstanding Faculty Award in Discovery (2019)
Purdue CIT Outstanding Faculty Award in Discovery (2019)
Purdue CIT Outstanding Graduate Mentor Award (2019)
Purdue Focus Award (2020)
