
Andreas Haeberlen
Associate Professor · University of Pennsylvania · Computer and Information Science
Active 2000–2026
Research topics
- Computer Science
- Distributed computing
- Computer network
- Data Mining
- Theoretical computer science
- Database
- World Wide Web
- Algorithm
- Programming language
- Operating system
- Embedded system
- Parallel computing
- Mathematics
Selected publications
Running Distributed Systems like Clockwork
Leibniz-Zentrum für Informatik (Schloss Dagstuhl) · 2026-01-01
Article · Open access
Distributed systems are commonly built using a set of standard assumptions: we assume that message delays are unbounded, that any packet can be lost in the network, and that clocks cannot be closely synchronized. On the one hand, these conservative assumptions result in robust systems that can operate reliably in a wide variety of conditions. On the other hand, they also force the system to do a lot of complex ad-hoc coordination and thus limit the performance it can achieve. In this paper, we take a look at what lies beyond this standard model. We observe that, on modern hardware in a single-tenant data center, distributed systems are able to closely coordinate and essentially "run like clockwork" with very little effort. If we are willing to additionally rule out some worst-case failure scenarios, this results in a large performance improvement, both in practice and even in theory. We demonstrate this effect using state-machine replication (SMR) as a case study: our SMR protocol, Watchmaker, exceeds the throughput of state-of-the-art algorithms by two orders of magnitude, and it requires only half as many replicas to tolerate the same number of faults.
RoboRebound: Multi-Robot System Defense with Bounded-Time Interaction
2025-03-26 · 1 citation
Article
Byzantine Fault Tolerance (BFT) is a classic technique for defending distributed systems against a wide range of faults and attacks. However, existing solutions are designed for systems where nodes can interact only by exchanging messages. They are not directly applicable to systems where nodes have sensors and actuators and can also interact in the physical world - perhaps by blocking each other's path or by crashing into each other.
2025-11-17
Article
Recently, there has been increasing concern about a new failure mode in data-center systems: when there is an external shock, such as a sudden load spike or some machine failures, systems will sometimes respond with reduced throughput - but, in contrast to a traditional overload situation, the throughput does not recover once the external shock disappears, and remains permanently degraded. This phenomenon has been called a metastable failure.
2023-10-30 · 5 citations
Article · Open access · 1st author · Corresponding
We present a vision for the future of an emerging category of cloud service: the metaverse of 3D virtual worlds. Today, hundreds of millions of users are active daily in such worlds, but they are partitioned into small groups of at most a few hundred players. Each group joins a different virtual world instance, and players can only interact in 3D with other players in the same group during that session. Current platforms are designed in ways that simply cannot scale much further, and solutions from other cloud services do not generalize to the more interactive, bidirectional, and latency-sensitive interactive 3D domain. We outline some of the technical challenges that currently stand in the way of a metaverse without inherent technical limitations on the number of users in a shared experience. We argue that, although these obviously touch on many other areas of Computer Science such as computer graphics and numerical simulation, the core challenges lie squarely within the systems domain.
Arboretum: A Planner for Large-Scale Federated Analytics with Differential Privacy
2023-10-03 · 3 citations
Article · Senior author
Federated analytics is a way to answer queries over sensitive data that is spread across multiple parties, without sharing the data or collecting it in a single place. Prior work has developed solutions that can scale to large deployments with millions of devices but, due to the distributed nature of federated analytics, these solutions can support only a limited class of queries - typically various forms of numerical queries, which can be answered with lightweight cryptographic primitives. Supporting richer queries, such as categorical queries, requires heavier cryptography, whose cost can quickly exceed even the resources of a powerful data center.
2021 · 19 citations
Senior author · Corresponding
- Computer Science
- Theoretical computer science
This paper introduces Mycelium, the first system to process differentially private queries over large graphs that are distributed across millions of user devices. Such graphs occur, for instance, when tracking the spread of diseases or malware. Today, the only practical way to query such graphs is to upload them to a central aggregator, which requires a great deal of trust from users and rules out certain types of studies entirely. With Mycelium, users' private data never leaves their personal devices unencrypted, and each user receives strong privacy guarantees. Mycelium does require the help of a central aggregator with access to a data center, but the aggregator merely facilitates the computation by providing bandwidth and computation power; it never learns the topology of the graph or the underlying data. Mycelium accomplishes this with a combination of homomorphic encryption, a verifiable secret redistribution scheme, and a mix network based on telescoping circuits. Our evaluation shows that Mycelium can answer a range of different questions from the medical literature with millions of devices.
2021-04-21 · 4 citations
Article · Open access
This paper shows how to use bounded-time recovery (BTR) to defend distributed systems against non-crash faults and attacks. Unlike many existing fault-tolerance techniques, BTR does not attempt to completely mask all symptoms of a fault; instead, it ensures that the system returns to the correct behavior within a bounded amount of time. This weaker guarantee is sufficient, e.g., for many cyber-physical systems, where physical properties - such as inertia and thermal capacity - prevent quick state changes and thus limit the damage that can result from a brief period of undefined behavior.
Do Not Overpay for Fault Tolerance!
2021-05-01 · 6 citations
Article · Senior author
In this paper, we argue that distributed real-time and embedded systems sometimes “overpay” for fault tolerance, by using a protocol that is more powerful than what is actually needed, or by failing to take advantage of unique features in these systems. As a result, these systems sometimes perform more computation or communication than is strictly necessary, or they can be unnecessarily complex, and thus more difficult to analyze. We take a look at the design space for two common problems, broadcast and consensus, and we show that, in a number of scenarios that would be common in real-time systems, these problems have trivial solutions. We then examine two solutions from the literature and propose alternatives that are substantially simpler, less expensive, and more reliable.
DNA: Dynamic Resource Allocation for Soft Real-Time Multicore Systems
2021 · 15 citations
Senior author · Corresponding
- Computer Science
- Distributed computing
Modern latency-sensitive and real-time systems often use multi-core platforms; thus, tasks on different cores share certain hardware resources, such as the memory bus and certain cache levels. This has two undesirable consequences: (1) tasks can interfere with each other, causing high latency for the system as a whole, and (2) it becomes difficult to meet deadlines, since the worst-case timing of a given task depends on all the tasks it might have to compete with. Static partitioning isolates tasks from each other by allocating a certain fraction of the resources to each; however, many tasks execute in different phases (e.g., memory-intensive and CPU-intensive) that have different requirements. Thus, system designers are left with a choice between overprovisioning, based on the most demanding phase, or suboptimal performance. In this paper, we propose a pair of techniques, called DNA and DADNA, to address the above challenge. DNA increases throughput and decreases latency, by building an execution profile of each task to identify the phases, and then dynamically allocating resources based on which task can benefit the most; DADNA further adds support for soft real-time workloads by taking deadlines into account. We have built a prototype of both techniques in the Xen hypervisor; our experimental results show that, compared to a state-of-the-art solution, DNA and DADNA can substantially improve schedulability, reduce job deadline miss ratios, and cut latencies by more than a factor of two even in extremely overloaded situations.
Bounded-time recovery for distributed real-time systems
2020-04-01 · 8 citations
Article · Senior author
This paper explores bounded-time recovery (BTR), a new approach to making cyber-physical systems robust to crash faults. Rather than trying to mask the symptoms of a fault with massive redundancy, BTR detects faults at runtime and enables the system to recover from them – e.g., by transferring tasks to other nodes that are still working correctly. When a fault does occur, there is a brief period of instability during which the system can produce incorrect outputs. However, many cyber-physical systems have physical properties – such as inertia or thermal capacity – that limit the rate at which the state of the system can change; thus, a very brief outage is often acceptable, as long as its duration can be bounded, to perhaps a few milliseconds. BTR has some interesting properties: for instance, it has a much lower overhead than Paxos, and, unlike Paxos, it can take useful actions even when the system partitions or a majority of the nodes fails. However, it also poses a very unusual scheduling problem that involves creating sets of interrelated schedules for different failure modes. We present a scheduling algorithm called Cascade that can quickly find suitable schedules. Using a prototype implementation, we show that Cascade scales far better than a baseline algorithm and reduces the scheduling time from hours to a few seconds, without sacrificing quality.
Recent grants
CAREER: Evidence in Federated Distributed Systems
NSF · $509k · 2011–2018
NSF · $846k · 2011–2017
CNS Core: Medium: The Synchronous Data Center
NSF · $1.2M · 2020–2026
Frequent coauthors
- Boon Thau Loo · 30 shared
- Peter Druschel · 30 shared
- Wenchao Zhou · Alibaba Group (China) · 25 shared
- Alan Mislove · Northeastern University · 16 shared
- Krishna P. Gummadi · 15 shared
- Micah Sherr · Georgetown University · 14 shared
- Ang Chen · Jiangsu University · 14 shared
- Marcel Dischinger · Max Planck Institute for Software Systems · 14 shared
Labs
Penn Engineering's TeamPI
Education
- PhD, Computer Science · Rice University · 2009