Mary Lou Soffa

Verified

University of Virginia · Computer Science

Active 1977–2024

h-index: 51
Citations: 10.7k
Papers: 282 (11 in the last 5 years)
Funding: $1.2M
About

Mary Lou Soffa is a professor with a distinguished career in computer science, focusing on areas related to software testing, performance optimization, and computer architecture. Her research encompasses the development of testing frameworks for neural networks using deep generative models, as well as addressing processor over-provisioning on large-scale multi-core platforms. Throughout her career, she has supervised numerous students and post-doctoral researchers, contributing to advancements in dynamic binary parallelization, resource contention mitigation in warehouse-scale computers, and fault detection frameworks. Her work is characterized by a strong emphasis on improving the reliability, performance, and efficiency of computing systems, and she has been actively involved in mentoring the next generation of computer scientists.

Research topics

  • Computer Science
  • Artificial Intelligence
  • Data Mining
  • Machine Learning
  • Operating system
  • Embedded system
  • Programming language

Selected publications

  • CIT4DNN: Generating Diverse and Rare Inputs for Neural Networks Using Latent Space Combinatorial Testing

    2024-04-12 · 12 citations

    article · Open access · Senior author

    Deep neural networks (DNN) are being used in a wide range of applications including safety-critical systems. Several DNN test generation approaches have been proposed to generate fault-revealing test inputs. However, the existing test generation approaches do not systematically cover the input data distribution to test DNNs with diverse inputs, and none of the approaches investigate the relationship between rare inputs and faults. We propose cit4dnn, an automated black-box approach to generate DNN test sets that are feature-diverse and that comprise rare inputs. cit4dnn constructs diverse test sets by applying combinatorial interaction testing to the latent space of generative models and formulates constraints over the geometry of the latent space to generate rare and fault-revealing test inputs. Evaluation on a range of datasets and models shows that cit4dnn generated tests are more feature diverse than the state-of-the-art, and can target rare fault-revealing testing inputs more effectively than existing methods.

  • Input Distribution Coverage: Measuring Feature Interaction Adequacy in Neural Network Testing

    ACM Transactions on Software Engineering and Methodology · 2022 · 20 citations

    Senior author · Corresponding
    • Computer Science
    • Machine Learning

    Testing deep neural networks (DNNs) has garnered great interest in the recent years due to their use in many applications. Black-box test adequacy measures are useful for guiding the testing process in covering the input domain. However, the absence of input specifications makes it challenging to apply black-box test adequacy measures in DNN testing. The Input Distribution Coverage (IDC) framework addresses this challenge by using a variational autoencoder to learn a low dimensional latent representation of the input distribution, and then using that latent space as a coverage domain for testing. IDC applies combinatorial interaction testing on a partitioning of the latent space to measure test adequacy. Empirical evaluation demonstrates that IDC is cost-effective, capable of detecting feature diversity in test inputs, and more sensitive than prior work to test inputs generated using different DNN test generation methods. The findings demonstrate that IDC overcomes several limitations of white-box DNN coverage approaches by discounting coverage from unrealistic inputs and enabling the calculation of test adequacy metrics that capture the feature diversity present in the input space of DNNs.

  • Message from the Program Chairs

    2021-02-27

    article · Open access · 1st author · Corresponding

    We are pleased to welcome you to CGO 2021, the first virtual CGO Conference. In addition, the Program Committee was virtual due to the worldwide infection rate of the coronavirus. On behalf of the Program Committee, we are pleased to present an exciting and stimulating program for the 2021 International Symposium on Code Generation and Optimization Conference.

  • Artifact: Distribution-Aware Testing of Neural Networks Using Generative Models

    2021-05-01

    article · Senior author

    The artifact used for the experimental evaluation of Distribution-Aware Testing of Neural Networks Using Generative Models is publicly available on GitHub and it is reusable. The artifact consists of python scripts, trained deep neural network model files and data required for running the experiments. It is also provided as a VirtualBox VM image for reproducing the paper results. Users should be familiar with using VirtualBox software and Linux platform to reproduce or reuse the artifact.

  • Distribution-Aware Testing of Neural Networks Using Generative Models

    2021-05-01 · 3 citations

    preprint · Open access · Senior author

    The reliability of software that has a Deep Neural Network (DNN) as a component is urgently important today given the increasing number of critical applications being deployed with DNNs. The need for reliability raises a need for rigorous testing of the safety and trustworthiness of these systems. In the last few years, there have been a number of research efforts focused on testing DNNs. However the test generation techniques proposed so far lack a check to determine whether the test inputs they are generating are valid, and thus invalid inputs are produced. To illustrate this situation, we explored three recent DNN testing techniques. Using deep generative model based input validation, we show that all the three techniques generate significant number of invalid test inputs. We further analyzed the test coverage achieved by the test inputs generated by the DNN testing techniques and showed how invalid test inputs can falsely inflate test coverage metrics. To overcome the inclusion of invalid inputs in testing, we propose a technique to incorporate the valid input space of the DNN model under test in the test generation process. Our technique uses a deep generative model-based algorithm to generate only valid inputs. Results of our empirical studies show that our technique is effective in eliminating invalid tests and boosting the number of valid test inputs generated.

  • Testing deep neural networks (keynote)

    2020-11-15

    article · 1st author · Corresponding

    The reliability of software that has a Deep Neural Network (DNN) as a component is urgently important today given the increasing number of critical applications being deployed with DNNs. The need for reliability raises a need for rigorous testing of the safety and trustworthiness of these systems. In the last few years, there have been a number of research efforts focused on testing DNNs. However, the test generation techniques proposed so far lack a check to determine whether the test inputs they are generating are valid, and thus invalid inputs are produced. To illustrate this situation, we explored three recent DNN testing techniques. Using deep generative model based input validation, we show that all the three techniques generate significant number of invalid test inputs. We further analyzed the test coverage achieved by the test inputs generated by the DNN testing techniques and showed how invalid test inputs can falsely inflate test coverage metrics. To overcome the inclusion of invalid inputs in testing, we propose a technique to incorporate the valid input space of the DNN model under test in the test generation process. Our technique uses a deep generative model-based algorithm to generate only valid inputs. Results of our empirical studies show that our technique is effective in eliminating invalid tests and boosting the number of valid test inputs generated.

  • A Language for Autonomous Vehicles Testing Oracles

    arXiv (Cornell University) · 2020-06-17 · 1 citation

    preprint · Open access

    Testing autonomous vehicles (AVs) requires complex oracles to determine if the AVs' behavior conforms with specifications and humans' expectations. Available open-source oracles are tightly embedded in the AV simulation software and are developed and implemented in an ad hoc way. We propose a domain-specific language that enables defining oracles independent of the AV solutions and the simulator. A testing analyst can encode safety, liveness, timeliness and temporal properties in our language. To show the expressiveness of our language, we implement three different types of available oracles. We find that the same AV solutions may be ranked significantly differently across existing oracles; thus, existing oracles do not evaluate AVs in a consistent manner.

  • Is Rust used safely by software developers?

    2020 · 57 citations

    Senior author · Corresponding
    • Computer Science
    • Operating system

    Rust, an emerging programming language with explosive growth, provides a robust type system that enables programmers to write memory-safe and data-race free code. To allow access to a machine's hardware and to support low-level performance optimizations, a second language, Unsafe Rust, is embedded in Rust. It contains support for operations that are difficult to statically check, such as C-style pointers for access to arbitrary memory locations and mutable global variables. When a program uses these features, the compiler is unable to statically guarantee the safety properties Rust promotes. In this work, we perform a large-scale empirical study to explore how software developers are using Unsafe Rust in real-world Rust libraries and applications. Our results indicate that software engineers use the keyword unsafe in less than 30% of Rust libraries, but more than half cannot be entirely statically checked by the Rust compiler because of Unsafe Rust hidden somewhere in a library's call chain. We conclude that although the use of the keyword unsafe is limited, the propagation of unsafeness offers a challenge to the claim of Rust as a memory-safe language. Furthermore, we recommend changes to the Rust compiler and to the central Rust repository's interface to help Rust software developers be aware of when their Rust code is unsafe.

  • ESEC/FSE 2019 - A Statistics-based Performance Testing Methodology for Cloud Applications

    Figshare · 2019-01-01

    article · Open access · Senior author

    These are the experiment result data sets for the ESEC/FSE paper “A Statistics-based Performance Testing Methodology for Cloud Applications”, including source code and dataset. For details, please refer to Install and Readme.

  • A statistics-based performance testing methodology for cloud applications

    2019-08-09 · 60 citations

    article · Senior author

    The low cost of resource ownership and flexibility have led users to increasingly port their applications to the clouds. To fully realize the cost benefits of cloud services, users usually need to reliably know the execution performance of their applications. However, due to the random performance fluctuations experienced by cloud applications, the black box nature of public clouds and the cloud usage costs, testing on clouds to acquire accurate performance results is extremely difficult. In this paper, we present a novel cloud performance testing methodology called PT4Cloud. By employing non-parametric statistical approaches of likelihood theory and the bootstrap method, PT4Cloud provides reliable stop conditions to obtain highly accurate performance distributions with confidence bands. These statistical approaches also allow users to specify intuitive accuracy goals and easily trade between accuracy and testing cost. We evaluated PT4Cloud with 33 benchmark configurations on Amazon Web Service and Chameleon clouds. When compared with performance data obtained from extensive performance tests, PT4Cloud provides testing results with 95.4% accuracy on average while reducing the number of test runs by 62%. We also propose two test execution reduction techniques for PT4Cloud, which can reduce the number of test runs by 90.1% while retaining an average accuracy of 91%. We compared our technique to three other techniques and found that our results are much more accurate.

Frequent coauthors

  • Rajiv Gupta

    University of California, Riverside

    74 shared
  • Bruce R. Childers

    University of Pittsburgh

    30 shared
  • Jack W. Davidson

    19 shared
  • Jason Mars

    17 shared
  • David A. Berson

    Intel (United Kingdom)

    14 shared
  • Wei Wang

    14 shared
  • Atif M. Memon

    Apple (United States)

    14 shared
  • Rastislav Bodík

    Google (United States)

    13 shared

Labs

  • Mary Lou Soffa's Lab · PI

    Research in software engineering, computer architecture, and parallel computing

Awards & honors

  • Fellow of the Association for Computing Machinery (ACM)
  • Fellow of The Institute of Electrical and Electronic Enginee…
  • Ken Kennedy Award (2012)
  • Anita Borg Technical Leadership Award (2011)
  • ACM SIGSOFT Influential Educator Award (2014)