
Matthew Farrens
Professor · University of California, Davis · Computer Science
Active 1986–2025
About
Professor Matthew Farrens is a faculty member in the Computer Science / ECE Department at UC Davis, where he is involved in research within the Computer Architecture Research Laboratory. His work focuses on various aspects of computer design and performance, including applications and architectures for novel technologies, photonic interconnects for communication, mechanisms to enhance processor functionality and usability, increasing the information content of data streams, optimizing the use of billions of transistors, reducing memory access penalties, and investigating optimal multiprocessor cache configurations. His research aims to address fundamental challenges in computer architecture to improve system performance and efficiency.
Research topics
- Computer Science
- Operating system
- Computer network
- Artificial Intelligence
- Distributed computing
- Parallel computing
- Telecommunications
- Embedded system
Selected publications
2025-05-12
article
Precision rehabilitation, therapy tailored to patient-specific impairments, is hindered by the low resolution of clinical assessments and the limited uptake of high-resolution technological solutions that are too cumbersome for regular use. To address this, we developed C-MoRe, a phone-based system that applies computer vision to clinical assessments to produce quantitative metrics of upper extremity motor function. We tested C-MoRe in 7 chronic stroke participants performing the Box and Block Test (BBT), which scores the number of blocks participants can transfer over a divider in 1 minute. We used ML models to identify assessment and hand landmarks, and developed a custom algorithm to autoscore the assessment and quantify movement duration, amplitude, and velocity during grasping and transfer movements. Our algorithm had high fidelity in block counting (98.4% accuracy) and identifying task movement phases (ICC $>0.99$) compared to human raters. Movement velocity, grasp, and transfer duration were sensitive to functional differences between limbs, and grasp duration was significantly related to finger proprioception, a known predictor of therapy outcomes. Thus, C-MoRe can provide meaningful, quantitative measures of movement quality in BBT, and its simplicity may enable widespread use and the creation of large databases needed for predictive modeling.
Understanding and Leveraging Cluster Heterogeneity for Efficient Execution of Cloud Services
2021 · 2 citations
Senior author · Corresponding
- Computer Science
- Distributed computing
Cloud warehouses are becoming increasingly heterogeneous by introducing different types of processors of varying speed and energy-efficiency. Developing an optimal strategy for distributing latency-critical service (LC-service) requests across multiple instances in a heterogeneous cluster is non-trivial. In this paper, we present a detailed analysis of the impact of cluster heterogeneity on the achieved server utilization and energy footprint to meet the required service-level latency bound (SLO) of LC-services. We develop cluster-level control plane strategies to address two forms of cluster heterogeneity - capacity and energy-efficiency. First, we propose Maximum-SLO-Guaranteed Capacity (MSG-Capacity) proportional load balancing for LC-Services to address the capacity heterogeneity and show that it can achieve higher utilization than naive performance-based heterogeneity awareness. Then, we present Efficient-First (E-First) heuristic-based Instance Scaling to address the efficiency heterogeneity. Finally, to address the bi-dimensional (capacity and energy-efficiency) heterogeneity, we superimpose the two approaches to propose Energy-efficient and MSG-Capacity (E2MC) based control-plane strategy that maximizes utilization while minimizing the energy footprint.
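The two heuristics named in this abstract can be illustrated with a minimal sketch. It assumes each instance has a known maximum request rate it can serve within the SLO (its MSG-Capacity) and a scalar energy-efficiency score. The function names, dictionary fields, and numbers below are hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def msg_capacity_weights(msg_capacity: Dict[str, float]) -> Dict[str, float]:
    """Share of incoming load per instance, proportional to the request
    rate each instance can sustain within the SLO (assumed to be known)."""
    total = sum(msg_capacity.values())
    if total <= 0:
        raise ValueError("total capacity must be positive")
    return {name: cap / total for name, cap in msg_capacity.items()}


def efficient_first(instances: List[dict], load: float) -> List[str]:
    """Activate instances in decreasing energy-efficiency order until
    their combined SLO-guaranteed capacity covers the offered load."""
    active: List[str] = []
    covered = 0.0
    for inst in sorted(instances, key=lambda i: i["efficiency"], reverse=True):
        if covered >= load:
            break
        active.append(inst["name"])
        covered += inst["capacity"]
    if covered < load:
        raise ValueError("cluster cannot serve this load within the SLO")
    return active


# Hypothetical cluster: capacities in requests/s, efficiency in requests/J.
weights = msg_capacity_weights({"fast": 300.0, "slow": 100.0})
chosen = efficient_first(
    [
        {"name": "a", "capacity": 100.0, "efficiency": 2.0},
        {"name": "b", "capacity": 300.0, "efficiency": 1.0},
        {"name": "c", "capacity": 100.0, "efficiency": 3.0},
    ],
    load=150.0,
)
```

Superimposing the two, as the abstract describes for E2MC, would amount to computing the weights only over the instances that the efficiency-ordered selection activates. That combination is an interpretation of the abstract, not a description of the authors' implementation.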
Leveraging Network Delay Variability to Improve QoE of Latency Critical Services
2021 · 2 citations
Senior author · Corresponding
- Computer Science
- Computer network
Even as cloud providers offer strict guarantees on the intra-cloud delay of requests for Latency-Critical (LC) Services, a high external network delay can result in a large end-to-end delay, causing a low user Quality of Experience (QoE). Furthermore, due to the variability in the external network delay, there is a disconnect between the user’s QoE and the cloud-guaranteed service level objective (SLO). Specifically, a request that meets the SLO can have a high or low QoE depending on the external network delay. In this work we propose a user-centric End-to-end Service Level Objective (ESLO), an extension of the traditional cloud-centric SLO, that guarantees stricter bounds on end-to-end delay, thereby achieving a higher QoE. We show how the variability in the external network delay can be both addressed and leveraged to meet the ESLO and improve server utilization. We propose ESLO-aware extensions to the Kubernetes infrastructure that use information about the external network delay and its distribution (a) to reduce the number of QoE-violating responses by using deadline-based scheduling at the service instances, and (b) to appropriately scale service instances with load. We implement the ESLO-aware framework on the NSF Chameleon cloud testbed and present experimental results demonstrating the benefit of the proposed paradigm.
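As a rough illustration of deadline-based scheduling at a service instance, the sketch below assumes that each request carries an estimate of its total external network delay, and that its server-side deadline is the arrival time plus the ESLO minus that delay. The deadline formula, class name, and values are assumptions made for illustration and are not the paper's implementation.

```python
import heapq
from typing import List, Tuple


def request_deadline(arrival_ms: float, eslo_ms: float, network_delay_ms: float) -> float:
    """Latest server-side completion time that still meets the end-to-end
    bound, assuming network_delay_ms covers the whole external path."""
    return arrival_ms + eslo_ms - network_delay_ms


class DeadlineQueue:
    """Earliest-deadline-first queue of pending requests."""

    def __init__(self) -> None:
        self._heap: List[Tuple[float, int, str]] = []
        self._seq = 0  # tie-breaker that keeps insertion order stable

    def push(self, request_id: str, arrival_ms: float,
             eslo_ms: float, network_delay_ms: float) -> None:
        deadline = request_deadline(arrival_ms, eslo_ms, network_delay_ms)
        heapq.heappush(self._heap, (deadline, self._seq, request_id))
        self._seq += 1

    def pop(self) -> str:
        """Return the request with the earliest deadline."""
        return heapq.heappop(self._heap)[2]


queue = DeadlineQueue()
queue.push("near-user", arrival_ms=0.0, eslo_ms=100.0, network_delay_ms=10.0)
queue.push("far-user", arrival_ms=5.0, eslo_ms=100.0, network_delay_ms=60.0)
first = queue.pop()
```

Here the request from the distant user is served first even though it arrived later, because its larger network delay leaves less slack at the server.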
HCAPP: Scalable Power Control for Heterogeneous 2.5D Integrated Systems
2020 · 1 citation
- Computer Science
- Embedded system
Package pin allocation is becoming a key bottleneck in the capabilities of designs due to the increased bandwidth requirements. 2.5D integration compounds these package-level requirements while introducing an increased number of compute units within the package. We propose a decentralized power control implementation called Heterogeneous Constant Average Power Processing (HCAPP) to maintain the power limit while maximizing the efficiency of the package pins allocated for power. HCAPP uses a hardware-based decentralized design to handle fast power limits, maintain scalability and enable simplified control for heterogeneous systems while maximizing performance. As extensions, we evaluate a software interface and the impact of different accelerator designs. Overall, HCAPP achieves 7% speedup over a RAPL-like implementation. The power utilization improves from 79.7% (RAPL-like) to 93.9% (HCAPP) with this design. A priority-based static software control methodology alongside HCAPP provides average speedups of 8.3% (CPU), 5.4% (GPU), and 12% (Accelerator) for the prioritized component compared to the unprioritized version.
Model-Driven Joint Optimization of Power and Latency Guarantee in Data center Applications
SN Computer Science · 2019-10-19 · 3 citations
article
Co-optimizing Latency and Energy for IoT services using HMP servers in Fog Clusters
2019-06-01 · 5 citations
article
Senior author
Fog computing has the potential to be an energy-efficient alternative to cloud computing for guaranteeing latency requirements of Latency-critical (LC) IoT services. However, even in fog computing low energy-efficiency of homogeneous multi-core server processors can be a major contributor to energy wastage. Recent studies have shown that Heterogeneous Multi-core Processors (HMPs) can improve energy efficiency of servers by adapting to dynamic load changes of LC-services. However, proposed approaches optimize energy only at a single server level. In our work, we demonstrate that optimization at the cluster-level across many HMP-servers can offer much greater energy savings through optimal work distribution across the HMP-servers while still guaranteeing the Service Level Objectives (SLO) of LC-services. In this paper, we present Greeniac, a cluster-level task manager that employs Reinforcement Learning to identify optimal configurations at the server- and cluster-levels for different workloads. We develop a server-level service scheduler and a cluster-level load balancing module to assign services and distribute tasks across HMP servers based on the learned configurations. In addition to meeting the required SLO targets, Greeniac achieves up to 28% energy saving compared to best-case cluster scheduling techniques with local HMP-aware scheduling on a 4-server fog cluster, with potentially larger savings in a larger cluster.
2019-08-01 · 3 citations
article
Senior author
Latency-Critical (LC) cloud applications pose three important challenges: 1) meeting tail latency Service-Level Objective (SLO), 2) attaining predictable tail latency, and 3) achieving high energy efficiency. In this paper we consider multicore end-systems (leaf nodes) and we study how the two important workload-dependent latency sources in network I/O processing, namely, interrupts and queuing, contribute to these problems. Firstly, we show that frequency-scaled centralized interrupt processing can be as energy efficient and achieve more predictable latency compared to traditional distributed interrupt processing. And secondly, we show that a controlled dynamic frequency scaling approach that adapts to socket buffer queue length can mitigate tail latency problems due to queuing. We design and implement a Runtime Engine that employs the proposed techniques through online monitoring of the workload and dynamically allocates resources to meet the tail latency performance. Finally, we present a study for six LC applications with different latency and service characteristics. The study shows that our proposed scaling approach saves up to 16% more energy compared to Linux on-demand frequency scaling governor that adapts based on the CPU utilization.
Position Paper: A case for exposing extra-architectural state in the ISA
2018-05-25 · 11 citations
article
The recent Meltdown and Spectre attacks took the community by surprise. Rather than exploiting an incorrect implementation of the ISA, these attacks leverage the undocumented implementation-specific speculation behavior of high-performance microarchitectures to affect the extra-architectural state of the machine (e.g., caches).
Improving Provisioned Power Efficiency in HPC Systems with GPU-CAPP
2018-12-01 · 6 citations
article
In this paper we propose a microarchitectural technique called GPU Constant Average Power Processing (GPU-CAPP) that improves the power utilization of power provisioning-limited systems by using provisioned power as much as possible to accelerate computation on parallel workloads. GPU-CAPP uses a flexible, decentralized control to ensure fast response times and the scalability required for increasingly parallel GPU designs. We use GPGPU-Sim and GPUWattch to simulate GPU-CAPP and evaluate its capabilities on a subset of the Rodinia benchmark suite. Overall, GPU-CAPP enables speedup by an average of 26% and 12% over equivalent fixed frequency systems at two power targets.
A case for exposing extra-architectural state in the ISA: position paper.
2018-01-01 · 1 citation
article
Frequent coauthors
- 25 shared
Christopher Nitta
- 24 shared
Venkatesh Akella
University of California, Davis
- 20 shared
Gary Tyson
Florida State University
- 15 shared
Dipak Ghosal
University of California, Davis
- 14 shared
Andrew R. Pleszkun
University of Colorado Boulder
- 9 shared
Arvin Park
University of California, Davis
- 9 shared
Vishal Ahuja
- 8 shared
Brian Tierney
Labs
Awards & honors
- Mohammad Sadoghi and Matt Farrens Named ACM senior members (…