
Wil Robertson
Verified · Northeastern University · Electrical and Energy Engineering
Active 1811–2026
About
Wil Robertson is an interdisciplinary faculty member affiliated with the Khoury College of Computer Sciences and the College of Engineering at Northeastern University. His research focuses on protecting systems from attacks, with specific interests in web security, anomaly detection, program analysis, and electronic voting security. He is part of the International Secure Systems Lab, a collaborative computer security research effort co-directed by his Northeastern colleague Engin Kirda. Prior to his tenure at Northeastern, Robertson was a postdoctoral researcher at UC Berkeley in the computer security research group and a research assistant at UC Santa Barbara SecLab. He also co-founded WebWise Security, Inc., where he served as chief technology officer and contributed to the development of a high-speed anomaly-based web application firewall.
Research topics
- Computer Science
- Artificial Intelligence
- Computer Security
- Programming language
- Operating system
- Embedded system
- Software engineering
Selected publications
ACE: A Security Architecture for LLM-Integrated App Systems
2026-01-01
article · Open access
LLM-integrated app systems extend the utility of Large Language Models (LLMs) with third-party apps that are invoked by a system LLM using interleaved planning and execution phases to answer user queries. These systems introduce new attack vectors where malicious apps can cause integrity violation of planning or execution, availability breakdown, or privacy compromise during execution. In this work, we identify new attacks impacting the integrity of planning, as well as the integrity and availability of execution in LLM-integrated apps, and demonstrate them against IsolateGPT, a recent solution designed to mitigate attacks from malicious apps. We propose Abstract-Concrete-Execute (ACE), a new secure architecture for LLM-integrated app systems that provides security guarantees for system planning and execution. Specifically, ACE decouples planning into two phases by first creating an abstract execution plan using only trusted information, and then mapping the abstract plan to a concrete plan using installed system apps. We verify that the plans generated by our system satisfy user-specified secure information flow constraints via static analysis on the structured plan output. During execution, ACE enforces data and capability barriers between apps, and ensures that the execution is conducted according to the trusted abstract plan. We show experimentally that ACE is secure against attacks from the INJECAGENT and Agent Security Bench benchmarks for indirect prompt injection, and our newly introduced attacks. We also evaluate the utility of ACE in realistic environments, using the Tool Usage suite from the LangChain benchmark. Our architecture represents a significant advancement towards hardening LLM-based systems using system security principles.
MUZZLE: Adaptive Agentic Red-Teaming of Web Agents Against Indirect Prompt Injection Attacks
arXiv (Cornell University) · 2026-02-09
article · Open access
Large language model (LLM) based web agents are increasingly deployed to automate complex online tasks by directly interacting with web sites and performing actions on users' behalf. While these agents offer powerful capabilities, their design exposes them to indirect prompt injection attacks embedded in untrusted web content, enabling adversaries to hijack agent behavior and violate user intent. Despite growing awareness of this threat, existing evaluations rely on fixed attack templates, manually selected injection surfaces, or narrowly scoped scenarios, limiting their ability to capture realistic, adaptive attacks encountered in practice. We present MUZZLE, an automated agentic framework for evaluating the security of web agents against indirect prompt injection attacks. MUZZLE utilizes the agent's trajectories to automatically identify high-salience injection surfaces, and adaptively generate context-aware malicious instructions that target violations of confidentiality, integrity, and availability. Unlike prior approaches, MUZZLE adapts its attack strategy based on the agent's observed execution trajectory and iteratively refines attacks using feedback from failed executions. We evaluate MUZZLE across diverse web applications, user tasks, and agent configurations, demonstrating its ability to automatically and adaptively assess the security of web agents with minimal human intervention. Our results show that MUZZLE effectively discovers 37 new attacks on 4 web applications with 10 adversarial objectives that violate confidentiality, availability, or privacy properties. MUZZLE also identifies novel attack strategies, including 2 cross-application prompt injection attacks and an agent-tailored phishing scenario.
MUZZLE: Adaptive Agentic Red-Teaming of Web Agents Against Indirect Prompt Injection Attacks
Open MIND · 2026-02-09
preprint
An Enumerative Embedding of the Python Type System in ACL2s
Electronic Proceedings in Theoretical Computer Science · 2025-07-24
article · Open access · Senior author
Python is a high-level interpreted language that has become an industry standard in a wide variety of applications. In this paper, we take a first step towards using ACL2s to reason about Python code by developing an embedding of a subset of the Python type system in ACL2s. The subset of Python types we support includes many of the most commonly used type annotations as well as user-defined types comprised of supported types. We provide ACL2s definitions of these types, as well as defdata enumerators that are customized to provide code coverage and identify errors in Python programs. Using the ACL2s embedding, we can generate instances of types that can then be used as inputs to fuzz Python programs, which allows us to identify bugs in Python code that are not detected by state-of-the-art Python type checkers. We evaluate our work against four open-source repositories, extracting their type information and generating inputs for fuzzing functions with type signatures that are in the supported subset of Python types. Note that we only use the type signatures of functions to generate inputs and treat the bodies of functions as black boxes. We measure code coverage, which ranges from about 68% to more than 80%, and identify code patterns that hinder coverage such as complex branch conditions and external file system dependencies. We conclude with a discussion of the results and recommendations for future work.
Model-free data assimilation in embedded space
SSRN Electronic Journal · 2025-01-01
preprint · Open access
Hypervisor Dissociative Execution: Programming Guests for Monitoring, Management, and Security
2024-12-09
article · Senior author
Both cloud providers and users wish to manage, monitor, and secure virtualized guest systems. This is typically accomplished with custom agent programs that run inside a guest or complex virtual machine introspection (VMI) systems that operate outside a guest. Agents are limited by the need to install and maintain them in each guest, while VMI systems are limited by the need to understand guest kernel internals. We introduce Hypervisor Dissociative Execution, or HyDE, a new approach that operates between these extremes to avoid their limitations and provide a robust and flexible mechanism to examine and modify a guest from the outside. In the HyDE model, developers assemble programs that mix out-of-guest logic with in-guest system calls. These programs are launched from outside a guest where they are able to co-opt the execution of guest processes. We present an open-source prototype HyDE implementation paired with 10 HyDE programs that address a wide range of user needs from password resets and guest process enumeration to dynamically generating a software bill of materials. We evaluate the utility, robustness, and performance of HyDE by executing the example programs while concurrently running standard benchmarks within multiple guest systems. Our results show that HyDE maintains system stability and incurs negligible overhead for one-off analyses or modifications. In persistent operation, HyDE incurs overhead as low as 7% in a multi-node cloud application benchmark.
Data-driven dynamic modal bias analysis and correction for Earth system models
2024-10-16
preprint · Open access
Predicting Earth systems weeks or months into the future is an important yet challenging problem due to the high dimensionality, chaotic behaviour, and coupled dynamics of the ocean, atmosphere, and other subsystems of the Earth. Numerical models invariably contain model error due to incomplete domain knowledge, limited capabilities of representation, and unresolved processes due to finite spatial resolution. Hybrid modeling, the pairing of a physics-driven model with a data-driven component, has shown promise in outperforming both purely physics-driven and data-driven approaches in predicting complex systems. Here we demonstrate two new hybrid methods that combine temporal or spatiotemporal models with a data-driven component that may be modally decomposed to give insight into model bias, or used to correct the bias of model projections. These techniques are demonstrated on a simulated chaotic system and two empirical Ocean variables: coastal sea surface elevation and sea surface temperature, which highlight that the inclusion of the data-driven components increases the skill of predicting their short-term evolution. Our work demonstrates that these hybrid approaches may prove valuable for: improving models during model development, creating novel methods for data assimilation, and enhancing the predictive accuracy of forecasts when available models have significant structural error.
Modal error analysis and prediction compensation for Earth system models
2024-06-21
preprint · Open access
Predicting Earth systems is an important yet challenging problem due to the high dimensionality, chaotic behaviour, and coupled dynamics of the ocean, atmosphere, and other subsystems of the Earth. Numerical models derived to predict these systems invariably contain model error due to incomplete domain knowledge, limited capabilities of representation, and unresolved processes due to spatial resolution. Hybrid modeling, the pairing of a physics-driven model with a data-driven component, has shown promise in outperforming both purely physics-driven and data-driven approaches in predicting complex systems. Here we demonstrate two new hybrid methods that combine temporal or spatiotemporal models with a data-driven component that may be modally decomposed to give insight into model error, or used to compensate a model during prediction. These techniques are demonstrated on two Earth system variables: coastal sea surface elevation and sea surface temperature, which highlight that the inclusion of the data-driven components increases the skill of predicting their evolution. Our work demonstrates that this hybrid approach may prove valuable for: improving models during model development, creating novel methods for data assimilation, and enhancing predictive accuracy when available models have significant structural error.
A Viewpoint: Safer Heaps With Practical Architectural Security Primitives
IEEE Security & Privacy · 2024-07-01
article · 1st author · Corresponding
In this viewpoint, we argue that architectural security primitives are a promising basis for fast and secure program heaps. We discuss MPKAlloc, a recent research effort demonstrating the concrete benefits of this approach using Intel MPK to harden a production allocator. We end by discussing promising future directions for the field.
Black-box Attacks Against Neural Binary Function Detection
2023-10-03 · 4 citations
preprint · Open access · Senior author
Binary analyses based on deep neural networks (DNNs), or neural binary analyses (NBAs), have become a hotly researched topic in recent years. DNNs have been wildly successful at pushing the performance and accuracy envelopes in the natural language and image processing domains. Thus, DNNs are highly promising for solving binary analysis problems that are hard due to a lack of complete information resulting from the lossy compilation process. Despite this promise, it is unclear that the prevailing strategy of repurposing embeddings and model architectures originally developed for other problem domains is sound given the adversarial contexts under which binary analysis often operates.
Recent grants
SaTC: CORE: Medium: Collaborative: Taming Memory Corruption with Security Monitors
NSF · $400k · 2019–2023
Frequent coauthors
- Engin Kirda · Northeastern University · 74 shared
- William Phillips · Dalhousie University · 65 shared
- Giovanni Vigna · University of California, Santa Barbara · 42 shared
- Christopher Kruegel · University of California, Santa Barbara · 33 shared
- Nauman Aslam · 23 shared
- S. Sivakumar · 20 shared
- Alireza Nafarieh · Dalhousie University · 17 shared
- Sajjad Arshad · Florida International University · 17 shared
Education
- 2011 · Postdoctoral Scholar, Electrical Engineering and Computer Science · UC Berkeley
- 2009 · PhD, Computer Science · UC Santa Barbara
- 2002 · BS, Computer Science · UC Santa Barbara
Awards & honors
- $4.8M NSF renewal CyberCorps® Scholarship for Service Grant…
- $500K NSF grant for analysis tool (2014)
- NSF grant for cybersecurity workforce training (2012)