Boon Thau Loo
Assistant Professor · University of Pennsylvania · Computer and Information Science
Active 2001–2026
Research topics
- Computer Science
- Operating system
- Distributed computing
- Computer network
- Software engineering
- Database
- Engineering
- Real-time computing
- Embedded system
- World Wide Web
- Telecommunications
Selected publications
ScaleGANN: Accelerate Large-Scale ANN Indexing by Cost-effective Cloud GPUs
arXiv (Cornell University) · 2026-05-11
preprint · Open access · Senior author
Graph-based approximate nearest neighbor search (ANNS) algorithms have gained increasing research interest and market adoption due to their efficiency and accuracy in retrieval. Existing approaches primarily rely on CPUs for graph index construction and retrieval, but this often requires significant time, especially for large-scale and high-dimensional datasets. Some studies have explored GPU-based solutions. However, GPUs are costly and their limited memory makes handling large datasets challenging. In this paper, we propose ScaleGANN, a novel end-to-end system that enables users to efficiently construct graph indexes for large-scale, high-dimensional datasets by leveraging low-cost spot GPU resources in a distributed cloud system. ScaleGANN uses the idea of divide-and-merge, with an optimized vector partitioning algorithm that further improves indexing time and space efficiency while guaranteeing good index quality. Its novel resource allocation strategy realizes multi-GPU indexing parallelism and overall cost-effectiveness for both build and query. We also design a task scheduler and cost model for better spot instance management and evaluation. We tested our system on large real-world datasets. Experimental results show that our approach speeds up index building by up to 9x at a 6x lower price compared with the state-of-the-art extendable ANNS benchmark DiskANN.
Multiplexed Heterogeneous LLM Serving via Stage-Aligned Parallelism
2025-11-19
article · Open access
Modern LLM serving workloads are increasingly heterogeneous, involving a growing portfolio of models with vastly different compute and memory requirements. Existing approaches to model serving, ranging from static GPU partitioning to dynamic reconfiguration and GPU multiplexing, fail to effectively support heterogeneity.
BFTGym: An Interactive Playground for BFT Protocols
Proceedings of the VLDB Endowment · 2024-08-01 · 3 citations
article · Senior author
Byzantine Fault Tolerant (BFT) protocols serve as a fundamental yet intricate component of distributed data management systems in untrustworthy environments. BFT protocols exhibit different design principles and performance characteristics under varying workloads and fault scenarios. The proliferation of BFT protocols and their growing complexity have made it increasingly challenging to analyze the performance and possible application scenarios of each protocol. This demonstration showcases BFTGym, an interactive platform that allows audience members to (1) evaluate, compare, and gather insights into the performance of various BFT protocols under a wide range of conditions, and (2) prototype new BFT protocols rapidly.
Towards Truly Adaptive Byzantine Fault-Tolerant Consensus
ACM SIGOPS Operating Systems Review · 2024-08-14 · 4 citations
article
To achieve maximum performance, Byzantine fault-tolerant (BFT) systems must be manually tuned when hardware, network, or workload properties change. This paper presents our vision for a reinforcement learning (RL) based BFT system that adjusts effectively in real time to changing fault scenarios and workloads. We identify several variables that can impact the performance of a BFT protocol, and show how these variables can serve as features in an RL engine in order to choose the context-dependent best-performing BFT protocol in real time. We further outline a decentralized RL approach capable of tolerating adversarial data pollution, where nodes share local metering values and reach the same learning output by consensus.
Verifying Declarative Smart Contracts
2024-04-12 · 2 citations
article · Open access · Senior author
Smart contracts now manage a large number of digital assets. Bugs in these contracts have led to significant financial loss. Verifying the correctness of smart contracts is, therefore, an important task. This paper presents an automated safety verification tool, DCV, that targets declarative smart contracts written in DeCon, a logic-based domain-specific language for smart contract implementation and specification. DCV proves safety properties by mathematical induction and can automatically infer inductive invariants using heuristic patterns, without annotations from the developer. Our evaluation on 23 benchmark contracts shows that DCV is effective in verifying smart contracts adapted from public repositories, and can verify contracts not supported by other tools. Furthermore, DCV significantly outperforms baseline tools in verification time.
2024-02-29 · 1 citation
book-chapter · Senior author
Utilizing big-data analytics for crowdfunding platforms (e.g., AngelList and Crunchbase) and social media sites (e.g., Facebook and Twitter), this study investigates the impact of social media marketing on start-up fundraising success through the lens of social capital theory. The results show that the cognitive, structural, and relational dimensions of social capital served as predictors of fundraising for start-ups. Specifically, shared values (e.g., the number of followers, the number of investors of start-up companies) and attention (e.g., product/service descriptions and videos), which account for the cognitive dimension, led to increased investor funding. Trust (e.g., the quality rating, the number of rounds of funding, and the number of investors), as a subconstruct of the relational dimension, was a determinant of fundraising. Social interaction ties (e.g., the number of likes, the number of social media followers), an aspect of the structural dimension, were found to increase the amount of funding. A further analysis demonstrates a process of social capital formation by examining the effect of social interaction ties on the amount of funding as mediated by shared values and trust. The study extends knowledge of start-up communication aligned with a resource-driven view of strategy.
Distributed Transaction Processing in Untrusted Environments
2024-05-23 · 1 citation
article · Senior author
Byzantine Fault-Tolerant (BFT) protocols have recently been extensively used by distributed and decentralized data management systems with non-trustworthy infrastructures to establish consensus on the order of transactions. BFT protocols cover a broad spectrum of design dimensions from infrastructure settings, such as the communication topology, to more technical features, such as commitment strategy and even fundamental social choice properties like order-fairness. The proliferation of different protocols has made it difficult to navigate the BFT landscape, let alone determine the protocol that best meets application needs. In this tutorial, we discuss BFT protocols that are used in modern large-scale data management systems, present a design space consisting of a set of design dimensions, and explore several design choices that capture the trade-offs between different design space dimensions. The presented design space and its design choices will help developers analyze BFT protocols, understand how different protocols are related to each other, and find the protocol that best fits their needs.
BFTBrain: Adaptive BFT Consensus with Reinforcement Learning
arXiv (Cornell University) · 2024-08-12 · 1 citation
preprint · Open access
This paper presents BFTBrain, a reinforcement learning (RL) based Byzantine fault-tolerant (BFT) system that provides significant operational benefits: it is a plug-and-play system suitable for a broad set of hardware and network configurations, and it adjusts effectively in real time to changing fault scenarios and workloads. BFTBrain adapts to system conditions and application needs by switching between a set of BFT protocols in real time. Two main advances contribute to BFTBrain's agility and performance. First, BFTBrain is based on a systematic, thorough modeling of metrics that correlate the performance of the studied BFT protocols with varying fault scenarios and workloads. These metrics are fed as features to BFTBrain's RL engine in order to choose the best-performing BFT protocols in real time. Second, BFTBrain coordinates RL in a decentralized manner that is resilient to adversarial data pollution, where nodes share local metering values and reach the same learning output by consensus. As a result, in addition to providing significant operational benefits, BFTBrain improves throughput over fixed protocols by 18% to 119% under dynamic conditions and outperforms state-of-the-art learning-based approaches by 44% to 154%.
Towards Full Stack Adaptivity in Permissioned Blockchains
Proceedings of the VLDB Endowment · 2024-01-01 · 2 citations
article · Senior author
This paper articulates our vision for a learning-based untrustworthy distributed database. We focus on permissioned blockchain systems as an emerging instance of untrustworthy distributed databases and argue that as novel smart contracts, modern hardware, and new cloud platforms arise, future-proof permissioned blockchain systems need to be designed with full-stack adaptivity in mind. At the application level, a future-proof system must adaptively learn the best-performing transaction processing paradigm and quickly adapt to new hardware and unanticipated workload changes on the fly. Likewise, the Byzantine consensus layer must dynamically adjust itself to the workloads, faulty conditions, and network configuration while maintaining compatibility with the transaction processing paradigm. At the infrastructure level, cloud providers must enable cross-layer adaptation, which identifies performance bottlenecks and possible attacks, and determines at runtime the degree of resource disaggregation that best meets application requirements. Within this vision of the future, our paper outlines several research challenges together with some preliminary approaches.
Recent grants
CAREER: Towards a Unified Declarative Platform for Composable Verifiable Networks
NSF · $450k · 2009–2016
NeTS: Medium: Collaborative Research: DEFIND: DEclarative Formal Interative Network Design
NSF · $400k · 2015–2021
FIND: Wireless Knowledge Infrastructure (WiKI)
NSF · $235k · 2007–2009
NGNI-Small: Declarative Secure Networked Information Systems
NSF · $449k · 2008–2013
NSF · $199k · 2011–2016
Frequent coauthors
- 65 shared
Wenchao Zhou
Alibaba Group (China)
- 44 shared
Ion Stoica
- 41 shared
Joseph M. Hellerstein
University of California, Berkeley
- 30 shared
Andreas Haeberlen
University of Pennsylvania
- 27 shared
Anduo Wang
Temple University
- 27 shared
Micah Sherr
Georgetown University
- 23 shared
Andre Scedrov
- 22 shared
Changbin Liu
Xinjiang Academy of Agricultural and Reclamation Science
Labs
Penn Engineering's TeamPI