Sharad Mehrotra

Distinguished Professor

University of California, Irvine · Computer Science

Active 1991–2026

h-index: 53

Citations: 17.7k

Papers: 426 (88 in the last 5 years)

Funding: $13.1M



About

Sharad Mehrotra is a Distinguished Professor in the School of Information and Computer Sciences at the University of California, Irvine, and Director of the Center for Emergency Response Technologies (CERT) at UCI. He is also the Director and Principal Investigator of the RESCUE project (Responding to Crisis and Unexpected Events), which is funded by the NSF through its large ITR program and spans seven schools with 60 members. Mehrotra is associated with the Cal-IT2 institute, a multidisciplinary research facility spanning UC Irvine and UC San Diego.

His research expertise lies in data management and distributed systems. His pioneering contributions include the concept of 'database as a service' and the use of information retrieval techniques, particularly relevance feedback, in multimedia search. His work has earned numerous awards and nominations, including the SIGMOD Best Paper award in 2001 and a best paper award at DASFAA 2004. His current research focuses on building sentient spaces using multimodal sensors, on data privacy, and on data quality, with recent efforts emphasizing situational awareness from multimodal input such as conversational speech data. Many of his research contributions have been incorporated into software used at various first responder sites.

Research topics

Computer Science
Data Mining
Computer Security
Artificial Intelligence
Information Retrieval
Medicine
Programming language
Database
Mathematics
Engineering
Virology
Data science
World Wide Web
Operating system

Selected publications

OctoSelector: Efficient and Effective Batch-Aware Model Selection for Large Language Models
Proceedings of the ACM on Management of Data · 2026-05-18
article · Open access · Senior author
Large Language Models (LLMs) vary significantly in metrics such as accuracy, latency, and cost, making it challenging for users and applications to decide which model to invoke for each query. This paper presents OctoSelector, a framework for LLM selection that satisfies user-defined objectives and constraints across multiple metrics. In the pre-processing phase, OctoSelector learns difficulty-aware representations of queries based on both input and output complexity, clustering them into similar difficulty groups to enable efficient performance estimation across multiple LLMs. During inference, OctoSelector supports LLM selection for batched workload, formulating it as an Integer Linear Programming (ILP) problem that optimizes a user-defined objective (e.g., minimizing cost or latency, or maximizing accuracy) while enforcing constraints on other metrics. We evaluate OctoSelector on two types of tasks: NL2SQL using the Spider and BIRD benchmarks, and sentiment analysis using the IMDb benchmark. When optimizing for cost under accuracy and latency constraints, OctoSelector achieves up to a 67.7% cost reduction on NL2SQL tasks for batched workloads compared to state-of-the-art approaches.
DIM-SUM: Dynamic IMputation for Smart Utility Management
Proceedings of the VLDB Endowment · 2025-07-01 · 2 citations
article
Time series imputation models have traditionally been developed using complete datasets with artificial masking patterns to simulate missing values. However, in real-world infrastructure monitoring, practitioners often encounter datasets where large amounts of data are missing and follow complex, heterogeneous patterns. We introduce DIM-SUM, a preprocessing framework for training robust imputation models that bridges the gap between artificially masked training data and real missing patterns. DIM-SUM combines pattern clustering and adaptive masking strategies with theoretical learning guarantees to handle diverse missing patterns actually observed in the data. Through extensive experiments on over 2 billion readings from California water districts, electricity datasets, and benchmarks, we demonstrate that DIM-SUM outperforms traditional methods by reaching similar accuracy with lower processing time and significantly less training data. When compared against a large pre-trained model, DIM-SUM averages 2x higher accuracy with significantly less inference time.
Search over Secret-Shared Datasets
2025-01-01
book-chapter · 1st author · Corresponding
Modeling Inhabited Smart Spaces to Support Interoperable IoT-Based Applications
2025-06-02
article · Open access
IoT deployments in smart spaces can enable the development of useful services for their inhabitants. However, the diversity of smart spaces and their sensor infrastructures makes it challenging to develop space-agnostic applications. Moreover, existing schemas addressing interoperability challenges often lack the vocabulary needed to represent the integration of smart space systems and their inhabitants. We present a schema to annotate inhabited smart spaces in support of inhabitant-oriented applications. Our schema integrates well-known ontologies to represent inhabitants, events/activities, and the space itself, along with their interconnections. It also supports the representation of uncertain information from IoT and mobile sensors (e.g., a person's location or occupancy/attendance at an event). Additionally, we introduce an annotation tool that uses an easy-to-use GUI to describe a smart space based on our schema. We demonstrate the potential of our approach through a series of SPARQL queries and a system deployed at the UCI campus that annotates sensor data to support a space-agnostic occupancy monitoring application.
Search over Encrypted Data
2025-01-01
book-chapter
Meaningful Data Erasure in the Presence of Dependencies
Proceedings of the VLDB Endowment · 2025-06-01 · 1 citation
article · Open access
Data regulations like GDPR require systems to support data erasure but leave the definition of "erasure" open to interpretation. This ambiguity makes compliance challenging, especially in databases where data dependencies can lead to erased data being inferred from remaining data. We formally define a precise notion of data erasure that ensures any inference about deleted data, through dependencies, remains bounded to what could have been inferred before its insertion. We design erasure mechanisms that enforce this guarantee at minimal cost. Additionally, we explore strategies to balance cost and throughput, batch multiple erasures, and proactively compute data retention times when possible. We demonstrate the practicality and scalability of our algorithms using both real and synthetic datasets.
Graph structure prompt learning: A novel methodology to improve performance of graph neural networks
Applied Intelligence · 2025-11-01
article · Senior author
Towards Secure Data Management using Multi-Cryptographic Solutions (Invited)
2025-06-22
article
Several secure data outsourcing systems incorporate various cryptographic techniques to balance security, functionalities, and efficiency. However, their security properties can be ad hoc and sometimes obscure. Our recent work, Secure Normal Form (SNF) [ICDE’24], presents a principled approach that allows data owners to define acceptable leakages of nonsensitive aspects of their data. This approach enables efficient processing of queries while ensuring no unintended leakage of sensitive information. In this paper, we discuss the benefits and challenges of implementing SNF within advanced computational environments and modern data management architectures. We argue that its applicability may extend beyond merely offloading secure query execution to the cloud.
Privacy of Outsourced Data
2025-01-01
book-chapter
DIM-SUM: Dynamic IMputation for Smart Utility Management
ArXiv.org · 2025-06-24
preprint · Open access

Recent grants

Information Technology Research (ITR): Responding to the Unexpected
NSF · $9.5M · 2003–2010
ITR: Privacy in Database-As-A-Service (DAS) Model
NSF · $595k · 2002–2007
RAPID: An Organizational Scale Approach to Privacy-Enabled Contact Tracing in COVID-19
NSF · $100k · 2020–2022
III: Small: Query and Goal Driven Entity Resolution Framework
NSF · $569k · 2011–2015
III: Small: EnrichDB - Supporting Enrichment in Database Systems
NSF · $532k · 2020–2025

Frequent coauthors

Nalini Venkatasubramanian
184 shared
Roberto Yus
University of Maryland, Baltimore County
68 shared
Shantanu Sharma
59 shared
Andrew Chio
University of California, Irvine
55 shared
Daokun Jiang
University of California, Irvine
54 shared
Dmitri V. Kalashnikov
Voronezh State University
42 shared
Peeyush Gupta
University of California, Irvine
40 shared
Daniela Nicklas
University of Bamberg
36 shared

Awards & honors

Outstanding Graduate Student Mentor Award (2005)
C. W. Gear Outstanding Junior Faculty Award
SIGMOD Best Paper award (2001)
Best of VLDB 1994 submissions
Best Paper award at DASFAA (2004)
