
Joy Arulraj
Associate Professor · Georgia Institute of Technology · Computer Science
Active 2010–2026
About
Joy Arulraj is an associate professor in the School of Computer Science at Georgia Institute of Technology. His research interests include database management systems, specifically large-scale data analytics, main memory systems, machine learning, and big code analytics. He is a member of the Database group at Georgia Tech, contributing to advancements in these areas.
Research topics
- Computer Science
- Database
- Information Retrieval
- Data Mining
- Programming language
- Operating system
- Software engineering
- Computer hardware
- Embedded system
Selected publications
Halo: Domain-Aware Query Optimization for Long-Context Question Answering
arXiv (Cornell University) · 2026-03-18
preprint · Open access · Senior author
Long-context question answering (QA) over lengthy documents is critical for applications such as financial analysis, legal review, and scientific research. Current approaches, such as processing entire documents via a single LLM call or retrieving relevant chunks via RAG, have two drawbacks. First, as context size increases, response quality can degrade, impacting accuracy. Second, iteratively processing hundreds of input documents can incur prohibitively high costs in API calls. To improve response quality and reduce the number of iterations needed to get the desired response, users tend to add domain knowledge to their prompts. However, existing systems fail to systematically capture and use this knowledge to guide query processing. Domain knowledge is treated as prompt tokens alongside the document: the LLM may or may not follow it, there is no reduction in computational cost, and when outputs are incorrect, users must manually iterate. We present Halo, a long-context QA framework that automatically extracts domain knowledge from user prompts and applies it as executable operators across a multi-stage query execution pipeline. Halo identifies three common forms of domain knowledge - where in the document to look, what content to ignore, and how to verify the answer - and applies each at the pipeline stage where it is most effective: pruning the document before chunk selection, filtering irrelevant chunks before inference, and ranking candidate responses after generation. To handle imprecise or invalid domain knowledge, Halo includes a fallback mechanism that detects low-quality operators at runtime and selectively disables them. Our evaluation across finance, literature, and scientific datasets shows that Halo achieves up to 13% higher accuracy and 4.8x lower cost compared to baselines, and enables a lightweight open-source model to approach frontier LLM accuracy at 78x lower cost.
PRISM: Navigating Cost–Accuracy Trade-offs for NL2SQL
Proceedings of the ACM on Management of Data · 2026-04-02
article · Open access · Senior author
Large language models (LLMs) have achieved strong performance on natural language to SQL (NL2SQL) tasks, but their practical effectiveness depends on tuning a complex pipeline of interacting components. Real-world deployments must navigate a critical trade-off between execution accuracy and monetary cost, a factor that has been largely overlooked by prior work focused primarily on maximizing accuracy. Navigating this trade-off is non-trivial: the ideal configuration of components (e.g., LLM, prompting strategy, schema linking) is not only interdependent but also highly sensitive to the target database schema. This creates a challenging, schema-aware configuration tuning problem that lacks a systematic solution. We present PRISM, a framework that systematically identifies high-accuracy, cost-efficient NL2SQL configurations tailored to each schema. Adopting an optimize-then-deploy strategy, PRISM first uses cost-aware Bayesian Optimization in an offline phase to efficiently explore the configuration space and curate a pool of high-performing pipelines. In an online phase, it deploys these configurations either as a single, cost-effective candidate or as an ensemble to maximize accuracy. Experiments on the BIRD benchmark demonstrate that PRISM achieves 69.48% execution accuracy in the single-candidate setting, improving accuracy by 2.34% over the strongest baseline while reducing cost by 92%. In the ensemble setting, PRISM boosts accuracy further to 74.9%.
Buffer Management for Out-of-GPU LLM Execution
2025-06-22
article · Open access
The rapid advancement of large language models (LLMs) has caused their parameter sizes to grow beyond the memory capacity of a single GPU. Although distributed inference across multiple GPUs is a solution in enterprise settings, it remains inaccessible for most non-commercial users. Thus, there is a growing demand to run LLMs on a single GPU when the model does not fit entirely in GPU memory. A common approach is to offload parts of the model from the GPU to the CPU during inference. However, repeatedly transferring parameters between these devices incurs significant overhead. To address this challenge, we propose a new buffer management policy, LIRS-M, which maximizes buffer hits and minimizes data transfer. Experimental results show that our approach achieves a 2.0× speedup compared to state-of-the-art offloading techniques while delivering robust buffer-hit performance.
TRACER: Efficient Object Re-Identification in Networked Cameras through Adaptive Query Processing
ArXiv.org · 2025-07-13
preprint · Open access · Senior author
Efficiently re-identifying and tracking objects across a network of cameras is crucial for applications like traffic surveillance. Spatula is the state-of-the-art video database management system (VDBMS) for processing Re-ID queries. However, it suffers from two limitations. First, its spatio-temporal filtering scheme has limited accuracy on large camera networks due to localized camera history. Second, it is not suitable for critical video analytics applications that require high recall, because it lacks support for adaptive query processing. In this paper, we present Tracer, a novel VDBMS for efficiently processing Re-ID queries using an adaptive query processing framework. Tracer selects the optimal camera to process at each time step by training a recurrent network to model long-term historical correlations. To accelerate queries under a high recall constraint, Tracer incorporates a probabilistic adaptive search model that processes camera feeds in incremental search windows and dynamically updates the sampling probabilities using an exploration-exploitation strategy. To address the paucity of benchmarks for the Re-ID task due to privacy concerns, we present a novel synthetic benchmark for generating multi-camera Re-ID datasets based on real-world traffic distribution. Our evaluation shows that Tracer outperforms the state-of-the-art cross-camera analytics system by 3.9x on average across diverse datasets.
Aero: Adaptive Query Processing of ML Queries
Proceedings of the ACM on Management of Data · 2025-06-17
article · Open access
Query optimization is critical in relational database management systems (DBMSs) for ensuring efficient query processing. The query optimizer relies on precise selectivity and cost estimates to generate optimal query plans for execution. However, this static query optimization approach falls short for DBMSs handling machine learning (ML) queries. ML-centric DBMSs face distinct challenges in query optimization. First, performance bottlenecks shift to user-defined functions (UDFs), often encapsulating deep learning models, making it difficult to estimate UDF statistics without profiling the query. Second, optimal query plans for ML queries are data-dependent, requiring dynamic plan adjustments during execution. To address these challenges, we introduce Aero, an ML-centric DBMS that utilizes adaptive query processing (AQP) for efficiently processing ML queries. Aero optimizes the evaluation of UDF-based query predicates by dynamically adjusting predicate evaluation order and enhancing UDF execution scalability. By integrating AQP, Aero continuously monitors UDF statistics, routes data to predicates in an optimal order, and dynamically allocates resources for evaluating predicates. Aero achieves up to 6.4x speedup compared to a state-of-the-art ML-centric DBMS across four diverse use cases, with no impact on accuracy.
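The idea of monitoring predicate statistics and reordering predicates during execution can be illustrated with a small sketch. This is not Aero's implementation: the `Predicate` class, the `run_adaptive` function, the per-row `cost` values, the batch size, and the smoothed pass-rate estimate are all assumptions made for illustration. The ordering rule is the textbook rank heuristic (cost divided by the fraction of rows rejected), and the statistics are simplified because later predicates only observe rows that earlier ones let through.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Predicate:
    name: str
    fn: Callable[[int], bool]
    cost: float      # assumed per-row cost, arbitrary units
    seen: int = 0    # rows evaluated so far
    passed: int = 0  # rows that satisfied the predicate so far

    def selectivity(self) -> float:
        # Smoothed pass rate; equals 0.5 before any row is observed.
        return (self.passed + 1) / (self.seen + 2)

    def rank(self) -> float:
        # Lower rank runs first: cheap predicates that reject many rows.
        return self.cost / (1.0 - self.selectivity())


def run_adaptive(rows: List[int], preds: List[Predicate],
                 batch_size: int = 4) -> Tuple[List[int], float]:
    out: List[int] = []
    total_cost = 0.0
    for start in range(0, len(rows), batch_size):
        # Re-plan at every batch boundary using the statistics gathered so far.
        order = sorted(preds, key=lambda p: p.rank())
        for row in rows[start:start + batch_size]:
            keep = True
            for p in order:
                p.seen += 1
                total_cost += p.cost
                if p.fn(row):
                    p.passed += 1
                else:
                    keep = False
                    break  # short-circuit: later predicates are skipped
            if keep:
                out.append(row)
    return out, total_cost


# Hypothetical predicates standing in for UDF-based filters.
rows = list(range(20))
preds = [
    Predicate("expensive_is_even", lambda r: r % 2 == 0, cost=10.0),
    Predicate("cheap_is_small", lambda r: r < 3, cost=1.0),
]
out, total_cost = run_adaptive(rows, preds)
```

The result set does not depend on the evaluation order, since the predicates are combined conjunctively; only the accumulated cost changes as the order adapts.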
A Framework For Inferring Properties of User-Defined Functions
2024-04-12
article · Open access
User-defined functions (UDFs) are widely used to enhance the capabilities of DBMSs. However, using UDFs comes with a significant performance penalty because DBMSs treat UDFs as black boxes, which hinders their ability to optimize queries that invoke such UDFs. To mitigate this problem, in this paper we present LAMBDA, a technique and framework for improving DBMSs' performance in the presence of UDFs. The core idea of LAMBDA is to statically infer properties of UDFs that facilitate UDF processing. Taking one such property as an example, if DBMSs know that a UDF is pure, that is, it returns the same result given the same arguments, they can leverage a cache to avoid repetitive UDF invocations that have the same call arguments.
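The caching benefit of purity can be shown with a minimal sketch. It does not reflect how LAMBDA infers purity or how any DBMS integrates the cache; the UDF `normalize_city` and the sample rows are hypothetical, and purity is simply assumed here so that Python's standard `functools.lru_cache` can memoize the calls.

```python
from functools import lru_cache

# Instrumentation only: counts how often the UDF body actually executes.
call_count = 0


@lru_cache(maxsize=None)
def normalize_city(name: str) -> str:
    """Hypothetical UDF assumed to be pure: its result depends only on `name`."""
    global call_count
    call_count += 1
    return name.strip().title()


# A column with repeated values, as is common in real tables.
rows = ["atlanta ", "boston", "atlanta ", "boston", "atlanta "]
results = [normalize_city(r) for r in rows]

# The body ran once per distinct argument; the other calls were cache hits.
print(call_count)                        # 2
print(normalize_city.cache_info().hits)  # 3
```

Caching is only safe because the function is pure. A UDF that reads the clock, a random source, or mutable external state could return stale results under the same scheme.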
SketchQL Demonstration: Zero-shot Video Moment Querying with Sketches
arXiv (Cornell University) · 2024-05-28 · 1 citation
preprint · Open access
In this paper, we present SketchQL, a video database management system (VDBMS) for retrieving video moments with a sketch-based query interface. This novel interface allows users to specify object trajectory events with simple mouse drag-and-drop operations. Users can use trajectories of single objects as building blocks to compose complex events. Using a pre-trained model that encodes trajectory similarity, SketchQL achieves zero-shot video moment retrieval by performing similarity searches over the video to identify clips that are the most similar to the visual query. In this demonstration, we introduce the graphical user interface of SketchQL and detail its functionalities and interaction mechanisms. We also demonstrate the end-to-end usage of SketchQL from query composition to video moment retrieval using real-world scenarios.
Hydro: Adaptive Query Processing of ML Queries
arXiv (Cornell University) · 2024-03-22
preprint · Open access
Query optimization in relational database management systems (DBMSs) is critical for fast query processing. The query optimizer relies on precise selectivity and cost estimates to effectively optimize queries prior to execution. While this strategy is effective for relational DBMSs, it is not sufficient for DBMSs tailored for processing machine learning (ML) queries. In ML-centric DBMSs, query optimization is challenging for two reasons. First, the performance bottleneck of the queries shifts to user-defined functions (UDFs) that often wrap around deep learning models, making it difficult to accurately estimate UDF statistics without profiling the query. This leads to inaccurate statistics and sub-optimal query plans. Second, the optimal query plan for ML queries is data-dependent, requiring the DBMS to adapt the query plan on the fly during execution. A static query plan is therefore not sufficient for such queries. In this paper, we present Hydro, an ML-centric DBMS that utilizes adaptive query processing (AQP) for efficiently processing ML queries. Hydro is designed to quickly evaluate UDF-based query predicates by ensuring optimal predicate evaluation order and improving the scalability of UDF execution. By integrating AQP, Hydro continuously monitors UDF statistics, routes data to predicates in an optimal order, and dynamically allocates resources for evaluating predicates. We demonstrate Hydro's efficacy through four illustrative use cases, delivering up to 11.52x speedup over a baseline system.
SketchQL Demonstration: Zero-Shot Video Moment Querying with Sketches
Proceedings of the VLDB Endowment · 2024-08-01 · 1 citation
article
In this paper, we present SketchQL, a video database management system (VDBMS) for retrieving video moments with a sketch-based query interface. This novel interface allows users to specify object trajectory events with simple mouse drag-and-drop operations. Users can use trajectories of single objects as building blocks to compose complex events. Using a pre-trained model that encodes trajectory similarity, SketchQL achieves zero-shot video moment retrieval by performing similarity searches over the video to identify clips that are the most similar to the visual query. In this demonstration, we introduce the graphical user interface of SketchQL and detail its functionalities and interaction mechanisms. We also demonstrate the end-to-end usage of SketchQL from query composition to video moment retrieval using real-world scenarios.
Recent grants
III: Small: Automatic Detection and Resolution of Anti-Patterns in Database Applications
NSF · $500k · 2019–2024
CAREER: Data Management for Exploratory Video Analytics
NSF · $484k · 2023–2028
Frequent coauthors
- Andrew Pavlo · Carnegie Mellon University · 25 shared
- Jiashen Cao · 11 shared
- Pramod Chunduri · 9 shared
- Jaeho Bang · Georgia Institute of Technology · 6 shared
- Shan Lu · Microsoft (United States) · 6 shared
- Hyesoon Kim · Georgia Institute of Technology · 6 shared
- Guoliang Jin · City University of Hong Kong · 6 shared
- Gaurav Tarlok Kakkar · 6 shared