Abraham Matta

Professor · Verified

Boston University · Computer Science

Active 1991–2025

h-index: 36

Citations: 7.4k

Papers: 222 (19 in the last 5 years)

Funding: $858k



About

Abraham Matta is a Professor in the Computer Science Department at Boston University, where he served as Chair of the department from 2018 to 2024. He received his Ph.D. in computer science from the University of Maryland at College Park in 1995. His research focuses on the design of network protocols and architectures based on principles such as inter-process communication, decomposition, and recursion, and draws on mathematical techniques including probabilistic analysis, queuing theory, optimization, and control theory. His work also uses performance evaluation tools such as simulation and emulation, with application domains including the Internet; wireless, mobile, sensor, and disruption-tolerant networks; and cloud and distributed systems. He has published over 150 peer-reviewed technical papers. His honors include the NSF CAREER award in 1997 and several best-paper awards for work on wireless ad hoc and sensor networks, cloud computing, and experiments on the GENI and FABRIC testbeds; he was also granted a patent in 2011. He has been actively involved in projects such as GENI since 2013, contributing to outreach, education, and collaboration efforts in cyberinfrastructure. He serves on various scientific advisory boards and has held leadership roles on technical program committees and organizing committees for major conferences. He is a senior member of the ACM and IEEE, leads the DASNet group, and is an Associate Editor for IEEE Networking Letters.

Research signals

Five dimensions sourced from public faculty / publication signals.

Research topics

Computer Science
Artificial Intelligence
Computer Security
Machine Learning
Distributed computing
Embedded system
Operating system

Selected publications

Design and Modeling of a New File Transfer Architecture to Reduce Undetected Errors Evaluated in the FABRIC Testbed
ACM SIGMETRICS Performance Evaluation Review · 2025-06-16
article
Ensuring data integrity for petabyte-scale file transfers is critical for scientific applications. As packet sizes increase, so does the likelihood of undetected errors. Multi-Level Error Detection (MLED) is a recursive architecture that leverages in-network resources to reduce undetected error probability (UEP) in file transfer. MLED organizes communication in layers at different levels, with each layer implementing configurable policies. Through experimentation on the FABRIC testbed, we show that, under an adversarial error model, traditional error detection (at the transport and data link layers) results in corrupt file transfers requiring retransmission, whereas MLED detects and corrects these errors at intermediate levels. MLED thus achieves a 100% gain in goodput, reaching over 800 Mbps on a single connection with no appreciable increase in delay.
CAPSys: Contention-aware task placement for data stream processing
2025-03-26 · 4 citations
article · Open access · Senior author
In the context of streaming dataflow queries, the task placement problem aims to identify a mapping of operator tasks to physical resources in a distributed cluster. We show that task placement not only significantly affects query performance but also the convergence and accuracy of auto-scaling controllers. We propose CAPSys, an adaptive resource controller for dataflow stream processors that considers auto-scaling and task placement in concert. CAPSys relies on Contention-Aware Placement Search (CAPS), a new placement strategy that ensures compute-intensive, I/O-intensive, and network-intensive tasks are balanced across available resources.
SERFLOW: A Cross-Service Cost Optimization Framework for SLO-Aware Dynamic ML Inference
ArXiv.org · 2025-10-31
preprint · Open access · Senior author
Dynamic offloading of Machine Learning (ML) model partitions across different resource orchestration services, such as Function-as-a-Service (FaaS) and Infrastructure-as-a-Service (IaaS), can balance processing and transmission delays while minimizing costs of adaptive inference applications. However, prior work often overlooks real-world factors such as Virtual Machine (VM) cold starts and requests under long-tail service time distributions. To tackle these limitations, we model each ML query (request) as traversing an acyclic sequence of stages, wherein each stage constitutes a contiguous block of sparse model parameters ending in an internal or final classifier where requests may exit. Since input-dependent exit rates vary, no single resource configuration suits all query distributions. IaaS-based VMs become underutilized when many requests exit early, yet rapidly scaling to handle request bursts reaching deep layers is impractical. SERFLOW addresses this challenge by leveraging FaaS-based serverless functions (containers) and using stage-specific resource provisioning that accounts for the fraction of requests exiting at each stage. By integrating this provisioning with adaptive load balancing across VMs and serverless functions based on request ingestion, SERFLOW reduces cloud costs by over 23% while efficiently adapting to dynamic workloads.
Design and Modeling of a New File Transfer Architecture to Reduce Undetected Errors Evaluated in the FABRIC Testbed
Proceedings of the ACM on Measurement and Analysis of Computing Systems · 2025-05-27 · 2 citations
article · Open access
Ensuring the integrity of petabyte-scale file transfers is essential for the data gathered from scientific instruments. As packet sizes increase, so does the likelihood of errors, resulting in a higher probability of undetected errors in the packet. This paper presents a Multi-Level Error Detection (MLED) framework that leverages in-network resources to reduce undetected error probability (UEP) in file transmission. MLED is based on a configurable recursive architecture that organizes communication in layers at different levels, decoupling network functions such as error detection, routing, addressing, and security. Each layer L_ij at level i implements a policy P_ij that governs its operation, including the error detection mechanism used, specific to the scope of that layer. MLED can be configured to mimic the error detection mechanisms of existing large-scale file transfer protocols. Analysis of MLED's recursive structure shows that adding levels of error detection reduces the overall UEP. An adversarial error model is designed to introduce errors into files that evade detection by multiple error detection policies. Experiments on the FABRIC testbed show that the traditional approach, with transport- and data-link-layer error detection, results in a corrupt file transfer requiring retransmission of the entire file. Using its recursive structure, an implementation of MLED detects and corrects these adversarial errors at intermediate levels inside the network, avoiding file retransmission under non-zero error rates. MLED therefore achieves a 100% gain in goodput over the traditional approach, reaching a goodput of over 800 Mbps on a single connection with no appreciable increase in delay.
Design and Modeling of a New File Transfer Architecture to Reduce Undetected Errors Evaluated in the FABRIC Testbed
2025-06-04
article
[SoK] Systematizing Inference Placement For Deep Learning Across Edge And Cloud Platforms: A Multi-Objective Optimization Perspective
Journal of Systems Research · 2025-12-30
article · Open access · Senior author
Edge intelligent applications like VR/AR and language-model-based chatbots have become widespread with the rapid expansion of IoT and mobile devices. However, constrained edge devices often cannot serve the increasingly large and complex deep learning (DL) models. To mitigate these challenges, researchers have proposed optimizing and offloading partitions of DL models among user devices, edge servers, and the cloud. In this setting, users can take advantage of different services to support their intelligent applications. For example, edge resources offer low response latency, whereas cloud platforms provide low-cost computation resources for computation-intensive workloads. However, communication between DL model partitions can introduce transmission bottlenecks and pose risks of data leakage. Recent research aims to balance accuracy, computation delay, transmission delay, and privacy concerns, addressing these issues with model compression, model distillation, transmission compression, and model architecture adaptations, including internal classifiers. This survey contextualizes the state-of-the-art model offloading methods and model adaptation techniques by studying their implications for a multi-objective optimization comprising inference latency, data privacy, and resource monetary cost.
PraxiPaaS: A Decomposable Machine Learning System for Efficient Container Package Discovery
2024-09-24 · 1 citation
article
Due to the increasing complexity of cloud architectures, automatically tracking and inspecting container packages in Platform-as-a-Service (PaaS) clusters are challenging tasks. This introspection capability, however, is critical to identify vulnerable packages and compile an accurate Software Bill of Materials (SBOM). Motivated by introspection frameworks focusing on virtual machine (VM) settings and ML methods for software discovery, we design PraxiPaaS as a framework to inspect PaaS container images with a highly scalable ML inference pipeline by scanning file changes during package installations. Our ML pipeline includes a structured collection of word2vec encoders and a corresponding structured ML model to achieve short incremental training time for incorporating additional packages while maintaining a high F1-score in generating the SBOM. Our evaluation shows that our structured ML pipeline provides an exponential drop in incremental training time from 2.8 hours to 8.6 s with 32 CPU cores, while maintaining an F1-score of 0.82, compared to the traditional monolithic model design. We deploy a prototype of PraxiPaaS in the New England Research Cloud (NERC) OpenShift cluster and evaluate the inference time comparing structured versus monolithic model design.
Privacy and Efficiency of Communications in Federated Split Learning
ArXiv.org · 2023-01-04 · 3 citations
preprint · Open access · Senior author
Every day, large amounts of sensitive data are distributed across mobile phones, wearable devices, and other sensors. Traditionally, these enormous datasets have been processed on a single system, with complex models being trained to make valuable predictions. Distributed machine learning techniques such as Federated and Split Learning have recently been developed to better protect user data and privacy while ensuring high performance. Both of these distributed learning architectures have advantages and disadvantages. In this paper, we examine these tradeoffs and suggest a new hybrid Federated Split Learning architecture that combines the efficiency and privacy benefits of both. Our evaluation demonstrates how our hybrid Federated Split Learning approach can lower the amount of processing power required by each client running a distributed learning system and reduce training and inference time while maintaining similar accuracy. We also discuss the resiliency of our approach to deep learning privacy inference attacks and compare our solution to other recently proposed benchmarks.
Inverse Response Time Ratio Scheduler: Optimizing Throughput and Response Time for Serverless Computing
2023-12-04
article · Senior author
We explore the problem of scheduling in a distributed multi-cloud serverless scenario, in the case where requests to function instances contend over the same resources. For this case, we present an efficient scheduling algorithm that leverages function profiling to detect resource contention, and improves the response time of requests, as well as the overall completion time of all requests. We compare our work to other schedulers such as a simple random scheduler, a simple round-robin scheduler, and schedulers that either load balance requests for each function across clouds, choose the cloud with the best profile, or select the cloud with the most available resources. Besides simulations, we have created a simple experiment of our scheduler running against two OpenWhisk serverless instances over the FABRIC testbed. We show that our inverse response time ratio scheduling algorithm can yield an improvement in average response time of around 32% over the best of the other schedulers when the execution time of a function on a given cloud is twice as much as on another. We also show that the improvement increases as the dispersion of the execution time across the distributed environment increases.
Configuration and Placement of Serverless Applications Using Statistical Learning
IEEE Transactions on Network and Service Management · 2023-03-08 · 22 citations
article
In the last decade, serverless computing emerged as a new compelling paradigm for the deployment of cloud applications and services. It represents an evolution of cloud computing with a simplified programming model that aims to abstract away most operational concerns. Running serverless applications requires users to configure multiple parameters, such as memory, CPU, cloud provider, etc. While relatively simpler, configuring such parameters correctly while minimizing cost and meeting delay constraints is not trivial. In this paper, we present COSE, a framework that uses Bayesian Optimization to find the optimal resource configuration and placement for functions in a serverless application. COSE uses statistical learning techniques to intelligently collect samples and predict the cost and execution time of a serverless function across unseen configuration values. Our framework uses the predicted cost and execution time on available locations to select the “best” configuration parameters and placement for running a serverless application while satisfying customer objectives. We evaluate COSE on AWS Lambda with real-world applications consisting of multiple functions (both linear chains and service graphs), where we successfully found optimal/near-optimal configurations. We also evaluate COSE over a wide range of simulated distributed cloud environments that confirm the efficacy of our approach.

Recent grants

CNS-NeTS:Medium: A Recursive Internet Architecture
NSF · $559k · 2010–2015
CNS Core: Small: Collaborative Research: HEECMA: A Hybrid Elastic Edge-Cloud Application Management Architecture
NSF · $299k · 2019–2023

Frequent coauthors

Azer Bestavros
57 shared
Flavio Esposito
Saint Louis University
25 shared
Nabeel Akhtar
Akamai (United States)
23 shared
Yuefeng Wang
Akamai (United States)
20 shared
A. Udaya Shankar
18 shared
Liang Guo
17 shared
Hany Morcos
Boston University
15 shared
Mina Guirguis
Texas State University
14 shared

Education

Ph.D., Computer Science
University of Maryland at College Park
1995

Awards & honors

NSF CAREER award (1997)
Two best-paper awards (2008 and 2010)
Best-paper award for cloud computing work (2021)
Awards for experimental work on the GENI testbed (2018)
Awards for work on the FABRIC testbed (2023)
