Lui Sha
Donald B. Gillies Chair in Computer Science · University of Illinois Urbana-Champaign · Computer Science
Active 1983–2026
About
Lui Sha received a Ph.D. in Electrical and Computer Engineering from Carnegie Mellon University in 1985. He worked at the Software Engineering Institute from 1986 to 1998 before joining the University of Illinois Urbana-Champaign in 1998 as a full professor. He currently holds the positions of Donald B. Gillies Chair Professor of Computer Science and Daniel C. Drucker Eminent Faculty at UIUC's College of Engineering. His research has significantly impacted real-time and embedded computing technologies, leading to revisions of IEEE standards that are now widely adopted in systems such as airplanes, robots, cars, ships, trains, and medical devices. His work on real-time and safety-critical system integration has influenced major high-technology programs including GPS, Space Station, and Mars Pathfinder. In recent years, his research has focused on autonomous vehicles, physics-informed deep neural networks, and medical guidance systems aimed at reducing preventable medical errors. Sha is a fellow of IEEE and ACM, and a recipient of the IEEE Simon Ramo Medal for exceptional achievement in systems engineering and systems science. His contributions include leading research on trustworthy autonomous systems and provably safe AI-enhanced cyber-physical systems, and developing models that combine physical laws with deep learning to ensure stability and physical consistency.
Research topics
- Artificial Intelligence
- Computer Science
- Machine Learning
- Programming language
- Distributed computing
- Geology
- Engineering
- Mathematics
- Seismology
- Mathematical optimization
- Theoretical computer science
Selected publications
Understanding the Theoretical Foundations of Deep Neural Networks through Differential Equations
ArXiv.org · 2026-03-18 · article · Open access
Deep neural networks (DNNs) have achieved remarkable empirical success, yet the absence of a principled theoretical foundation continues to hinder their systematic development. In this survey, we present differential equations as a theoretical foundation for understanding, analyzing, and improving DNNs. We organize the discussion around three guiding questions: i) how differential equations offer a principled understanding of DNN architectures, ii) how tools from differential equations can be used to improve DNN performance in a principled way, and iii) what real-world applications benefit from grounding DNNs in differential equations. We adopt a two-fold perspective spanning the model level, which interprets the whole DNN as a differential equation, and the layer level, which models individual DNN components as differential equations. From these two perspectives, we review how this framework connects model design, theoretical analysis, and performance improvement. We further discuss real-world applications, as well as key challenges and opportunities for future research.
Bayesian Data Augmentation and Training for Perception DNN in Autonomous Aerial Vehicles
2025-01-03 · 1 citation · article
Learning-based solutions have enabled incredible capabilities for autonomous systems. Autonomous vehicles, both aerial and ground, rely on Deep Neural Networks (DNN) for various integral tasks, including perception. The efficacy of supervised learning solutions, such as the DNN used for perception tasks, hinges on the quality of the training data. Discrepancies between training data and operating conditions result in faults that can lead to catastrophic incidents. However, collecting and labeling vast amounts of context-sensitive data, with broad coverage of possible variations in the operating environment, is prohibitively difficult. To overcome this limitation, synthetic data generation techniques for DNN training emerged, allowing for the easy exploration of diverse scenarios. While significant synthetic data generation solutions exist for ground vehicles, for aerial vehicles such support is still lacking. This work presents a data augmentation framework for aerial vehicle perception training, leveraging photorealistic simulation seamlessly integrated with high-fidelity vehicle dynamics, control, and planning algorithms. Safe landing in urban environments is a crucial challenge in the development of autonomous air taxis, and therefore, the landing maneuver is chosen as the focus of this work. With repeated simulations of landing maneuvers in scenarios with varying vehicle states, weather conditions, and time of day, we assess the landing performance of the VTOL (Vertical Take-Off and Landing) type UAV and gather valuable data. The landing performance is used as the objective function to optimize the DNN through retraining. Given the high computational cost of DNN retraining, we incorporated Bayesian Optimization in our framework that systematically explores the data augmentation parameter space to retrain the best-performing models. The framework allowed us to identify high-performing data augmentation parameters that are consistently effective across different landing scenarios. Utilizing the capabilities of this data augmentation framework, we obtained a robust perception model. The model consistently improved the perception-based landing success rate by at least 20% under different lighting and weather conditions.
Real-DRL: Teach and Learn in Reality
ArXiv.org · 2025-10-30 · preprint · Open access · Senior author
This paper introduces the Real-DRL framework for safety-critical autonomous systems, enabling runtime learning of a deep reinforcement learning (DRL) agent to develop safe and high-performance action policies in real plants (i.e., real physical systems to be controlled), while prioritizing safety! The Real-DRL consists of three interactive components: a DRL-Student, a PHY-Teacher, and a Trigger. The DRL-Student is a DRL agent that innovates in the dual self-learning and teaching-to-learn paradigm and the real-time safety-informed batch sampling. On the other hand, PHY-Teacher is a physics-model-based design of action policies that focuses solely on safety-critical functions. PHY-Teacher is novel in its real-time patch for two key missions: i) fostering the teaching-to-learn paradigm for DRL-Student and ii) backing up the safety of real plants. The Trigger manages the interaction between the DRL-Student and the PHY-Teacher. Powered by the three interactive components, the Real-DRL can effectively address safety challenges that arise from the unknown unknowns and the Sim2Real gap. Additionally, Real-DRL notably features i) assured safety, ii) automatic hierarchy learning (i.e., safety-first learning and then high-performance learning), and iii) safety-informed batch sampling to address the learning experience imbalance caused by corner cases. Experiments with a real quadruped robot, a quadruped robot in NVIDIA Isaac Gym, and a cart-pole system, along with comparisons and ablation studies, demonstrate the Real-DRL's effectiveness and unique features.
ACM Transactions on Cyber-Physical Systems · 2025-06-10 · 1 citation · article
This paper proposes the runtime learning machine for safety-critical learning-enabled cyber-physical systems (CPS). The learning machine has three interactive components: a high-performance (HP)-Student, a high-assurance (HA)-Teacher, and a Coordinator. The HP-Student is a high-performance but not fully verified Phy-DRL (physics-regulated deep reinforcement learning) agent that performs runtime learning in real CPS, using real-time sensor data from real-time physical environments. On the other hand, HA-Teacher is a verified but simplified design, focusing on safety-critical functions only. As a complement, HA-Teacher's novelty lies in its real-time patch for two missions: i) correcting unsafe learning of HP-Student, and ii) backing up safety. The Coordinator manages the interaction between HP-Student and HA-Teacher. Powered by the three interactive components, the runtime learning machine notably features i) assuring lifetime safety (i.e., safety guarantee in any runtime learning stage), ii) tolerating unknown unknowns, iii) addressing the Sim2Real gap, and iv) automatic hierarchy learning (i.e., safety-first learning, and then high-performance learning). Experiments involving a cart-pole system and two quadruped robots, as well as comparisons with state-of-the-art safe DRL, fault-tolerant DRL, and approaches for addressing the Sim2Real gap, demonstrate the learning machine's effectiveness and unique features.
ArXiv.org · 2025-10-29 · preprint · Open access · Senior author
We present VISAT, a novel open dataset and benchmarking suite for evaluating model robustness in the task of traffic sign recognition with the presence of visual attributes. Built upon the Mapillary Traffic Sign Dataset (MTSD), our dataset introduces two benchmarks that respectively emphasize robustness against adversarial attacks and distribution shifts. For our adversarial attack benchmark, we employ the state-of-the-art Projected Gradient Descent (PGD) method to generate adversarial inputs and evaluate their impact on popular models. Additionally, we investigate the effect of adversarial attacks on attribute-specific multi-task learning (MTL) networks, revealing spurious correlations among MTL tasks. The MTL networks leverage visual attributes (color, shape, symbol, and text) that we have created for each traffic sign in our dataset. For our distribution shift benchmark, we utilize ImageNet-C's realistic data corruption and natural variation techniques to perform evaluations on the robustness of both base and MTL models. Moreover, we further explore spurious correlations among MTL tasks through synthetic alterations of traffic sign colors using color quantization techniques. Our experiments focus on two major backbones, ResNet-152 and ViT-B/32, and compare the performance between base and MTL models. The VISAT dataset and benchmarking framework contribute to the understanding of model robustness for traffic sign recognition, shedding light on the challenges posed by adversarial attacks and distribution shifts. We believe this work will facilitate advancements in developing more robust models for real-world applications in autonomous driving and cyber-physical systems.
Verification and Validation of a Vision-Based Landing System for Autonomous VTOL Air Taxis
2025-01-03 · 1 citation · article
Autonomous air taxis are poised to revolutionize urban mass transportation. A key challenge inhibiting their adoption is ensuring the safety and reliability of the autonomy solutions that will control these vehicles. Validating these solutions on full-scale air taxis in the real world presents complexities, risks, and costs that further complicate the challenge of ensuring safety and reliability of these autonomous vehicles. Verification and Validation (V&V) frameworks play a crucial role in the design and development of highly reliable systems by formally verifying safety properties and validating algorithm behavior across diverse operational scenarios. Advancements in high-fidelity simulators have significantly enhanced their capability to emulate real-world conditions, encouraging their use for validating autonomous air taxi solutions, especially during early development stages. This evolution underscores the growing importance of simulation environments, not only as complementary tools to real-world testing but as essential platforms for evaluating algorithms in a controlled, reproducible, and scalable manner. This work presents a V&V framework for a vision-based landing system for air taxis with vertical take-off and landing (VTOL) capabilities. Specifically, we use Verse, a tool for formal verification, to model and verify the safety of the system by obtaining and analyzing the reachable sets. To conduct this analysis, we utilize a photorealistic simulation environment. The simulation environment, built on Unreal Engine, provides realistic terrain, weather, and sensor characteristics to emulate real-world conditions with high fidelity. To validate the safety analysis results, we conduct extensive scenario-based testing to assess the reachability set and robustness of the landing algorithm in various conditions. This approach showcases the representativeness of high-fidelity simulators, offering an effective means to analyze and refine algorithms before real-world deployment.
arXiv (Cornell University) · 2025-01-13 · preprint · Open access
End-to-end deep neural networks have achieved remarkable success across various domains but are often criticized for their lack of interpretability. While post hoc explanation methods attempt to address this issue, they often fail to accurately represent these black-box models, resulting in misleading or incomplete explanations. To overcome these challenges, we propose an inherently transparent model architecture called Neural Probabilistic Circuits (NPCs), which enable compositional and interpretable predictions through logical reasoning. In particular, an NPC consists of two modules: an attribute recognition model, which predicts probabilities for various attributes, and a task predictor built on a probabilistic circuit, which enables logical reasoning over recognized attributes to make class predictions. To train NPCs, we introduce a three-stage training algorithm comprising attribute recognition, circuit construction, and joint optimization. Moreover, we theoretically demonstrate that an NPC's error is upper-bounded by a linear combination of the errors from its modules. To further demonstrate the interpretability of NPC, we provide both the most probable explanations and the counterfactual explanations. Empirical results on four benchmark datasets show that NPCs strike a balance between interpretability and performance, achieving results competitive even with those of end-to-end black-box models while providing enhanced interpretability.
AirTaxiSim: A Simulator for Autonomous Air Taxis
2025-07-16 · 2 citations · article · Senior author
The rapid advancements in air mobility vehicles are paving the way for air taxis to become a viable mode of public transportation. The next technological frontier for air taxis is fully autonomous operation. Developing safe and efficient autonomous control for air taxis presents greater challenges than for ground vehicles due to the inherent instability of aerial vehicles. Therefore, simulation solutions for autonomous air taxis will play a crucial role in accelerating their development and eventual safe deployment. This paper introduces AirTaxiSim, an end-to-end simulation framework for autonomous air taxis. AirTaxiSim is designed to model and analyze the complexities of autonomous air taxi operations in dynamic and cluttered urban environments. AirTaxiSim integrates high-fidelity physical models of vertical take-off and landing air vehicles in photo-realistic urban environments. The primary purpose of AirTaxiSim is to evaluate the safety, performance, and efficiency of autonomous air taxi services across a variety of scenarios, including dangerous edge cases. AirTaxiSim also provides methods for generating datasets and establishing benchmarks for autonomous air taxis. This paper describes the simulator’s construction, functionalities, and some of the use cases, providing critical information to facilitate its use in advancing autonomy in aerial vehicles.
Bayesian Data Augmentation and Training for Perception DNN in Autonomous Aerial Vehicles
arXiv (Cornell University) · 2024-12-10 · preprint · Open access
Learning-based solutions have enabled incredible capabilities for autonomous systems. Autonomous vehicles, both aerial and ground, rely on DNN for various integral tasks, including perception. The efficacy of supervised learning solutions hinges on the quality of the training data. Discrepancies between training data and operating conditions result in faults that can lead to catastrophic incidents. However, collecting vast amounts of context-sensitive data, with broad coverage of possible operating environments, is prohibitively difficult. Synthetic data generation techniques for DNN allow for the easy exploration of diverse scenarios. However, synthetic data generation solutions for aerial vehicles are still lacking. This work presents a data augmentation framework for aerial vehicle perception training, leveraging photorealistic simulation integrated with high-fidelity vehicle dynamics. Safe landing is a crucial challenge in the development of autonomous air taxis; therefore, the landing maneuver is chosen as the focus of this work. With repeated simulations of landing in varying scenarios, we assess the landing performance of the VTOL type UAV and gather valuable data. The landing performance is used as the objective function to optimize the DNN through retraining. Given the high computational cost of DNN retraining, we incorporated Bayesian Optimization in our framework that systematically explores the data augmentation parameter space to retrain the best-performing models. The framework allowed us to identify high-performing data augmentation parameters that are consistently effective across different landing scenarios. Utilizing the capabilities of this data augmentation framework, we obtained a robust perception model. The model consistently improved the perception-based landing success rate by at least 20% under different lighting and weather conditions.
Recent grants
NSF · $900k · 2019–2023
SGER: Stability of Real Time Software Systems
NSF · $198k · 2006–2008
NSF · $220k · 2018–2022
CSR EHS: Formal Model Based Health and Medical System Composition
NSF · $750k · 2007–2010
CSR-EHCS(CPS),TM: Architecture for the Safe Composition of Complex Medical Systems
NSF · $962k · 2008–2013
Frequent coauthors
- Marco Caccamo · Technical University of Munich · 61 shared
- John P. Lehoczky · Carnegie Mellon University · 36 shared
- Naira Hovakimyan · 34 shared
- Tarek Abdelzaher · 29 shared
- Yu Jiang · Tsinghua University · 29 shared
- Po-Liang Wu · Disaster Prevention & Water Environment Research Center · 25 shared
- Richard B. Berlin · Carle Foundation Hospital · 25 shared
- Heechul Yun · 24 shared
Education
- 1985 · Ph.D., ECE · Carnegie Mellon University
Awards & honors
- IEEE Simon Ramo Medal