
Pablo Parrilo
Verified · Massachusetts Institute of Technology · Electrical Engineering & Computer Science
Active 1996–2025
About
Pablo Parrilo is the Joseph F. and Nancy P. Keithley Professor in Electrical Engineering at MIT. His research areas include Artificial Intelligence + Machine Learning, Information Science and Systems, Optimization and Game Theory, and Systems Theory, Control, and Autonomy. He is involved in developing techniques for the analysis and synthesis of systems that interact with the external world through perception, communication, and action, while also learning, making decisions, and adapting to changing environments. His work leverages computational, theoretical, and experimental tools to address challenges in sensing, processing, energy transduction, and physical substrates for computation.
Research topics
- Mathematics
- Combinatorics
- Computer science
- Mathematical optimization
- Applied mathematics
Selected publications
A New Semidefinite Relaxation for Linear and Piecewise-Affine Optimal Control with Time Scaling
ArXiv.org · 2025-04-17
preprint · Open access
We introduce a semidefinite relaxation for optimal control of linear systems with time scaling. These problems are inherently nonconvex, since the system dynamics involves bilinear products between the discretization time step and the system state and controls. The proposed relaxation is closely related to the standard second-order semidefinite relaxation for quadratic constraints, but we carefully select a subset of the possible bilinear terms and apply a change of variables to achieve empirically tight relaxations while keeping the computational load light. We further extend our method to handle piecewise-affine (PWA) systems by formulating the PWA optimal-control problem as a shortest-path problem in a graph of convex sets (GCS). In this GCS, different paths represent different mode sequences for the PWA system, and the convex sets model the relaxed dynamics within each mode. By combining a tight convex relaxation of the GCS problem with our semidefinite relaxation with time scaling, we can solve PWA optimal-control problems through a single semidefinite program.
Convex Ternary Quartics Are SOS-Convex
SIAM Journal on Optimization · 2025-08-26
article · Senior author
Mixed Discrete and Continuous Planning using Shortest Walks in Graphs of Convex Sets
ArXiv.org · 2025-07-15
preprint · Open access
We study the Shortest-Walk Problem (SWP) in a Graph of Convex Sets (GCS). A GCS is a graph where each vertex is paired with a convex program, and each edge couples adjacent programs via additional costs and constraints. A walk in a GCS is a sequence of vertices connected by edges, where vertices may be repeated. The length of a walk is given by the cumulative optimal value of the corresponding convex programs. To solve the SWP in GCS, we first synthesize a piecewise-quadratic lower bound on the problem's cost-to-go function using semidefinite programming. Then we use this lower bound to guide an incremental-search algorithm that yields an approximate shortest walk. We show that the SWP in GCS is a natural language for many mixed discrete-continuous planning problems in robotics, unifying problems that typically require specialized solutions while delivering high performance and computational efficiency. We demonstrate this through experiments in collision-free motion planning, skill chaining, and optimal control of hybrid systems.
Multi-query Shortest-Path Problem in Graphs of Convex Sets
Springer proceedings in advanced robotics · 2025-10-30
preprint · Open access
A New Semidefinite Relaxation for Linear and Piecewise-Affine Optimal Control with Time Scaling
2025-05-19
article
Beyond RLHF and NLHF: Population-Proportional Alignment under an Axiomatic Framework
ArXiv.org · 2025-06-05
preprint · Open access · Senior author
Conventional preference learning methods often prioritize opinions held more widely when aggregating preferences from multiple evaluators. This may result in policies that are biased in favor of some types of opinions or groups and susceptible to strategic manipulation. To address this issue, we develop a novel preference learning framework capable of aligning aggregate opinions and policies proportionally with the true population distribution of evaluator preferences. Grounded in social choice theory, our approach infers the feasible set of evaluator population distributions directly from pairwise comparison data. Using these estimates, the algorithm constructs a policy that satisfies foundational axioms from social choice theory, namely monotonicity and Pareto efficiency, as well as our newly-introduced axioms of population-proportional alignment and population-bounded manipulability. Moreover, we propose a soft-max relaxation method that smoothly trades off population-proportional alignment with the selection of the Condorcet winner (which beats all other options in pairwise comparisons). Finally, we validate the effectiveness and scalability of our approach through experiments on both tabular recommendation tasks and large language model alignment.
arXiv (Cornell University) · 2024-05-20
preprint · Open access · Senior author
Inverse Reinforcement Learning (IRL) and Reinforcement Learning from Human Feedback (RLHF) are pivotal methodologies in reward learning, which involve inferring and shaping the underlying reward function of sequential decision-making problems based on observed human demonstrations and feedback. Most prior work in reward learning has relied on prior knowledge or assumptions about decision or preference models, potentially leading to robustness issues. In response, this paper introduces a novel linear programming (LP) framework tailored for offline reward learning. Utilizing pre-collected trajectories without online exploration, this framework estimates a feasible reward set from the primal-dual optimality conditions of a suitably designed LP, and offers an optimality guarantee with provable sample efficiency. Our LP framework also enables aligning the reward functions with human feedback, such as pairwise trajectory comparison data, while maintaining computational tractability and sample efficiency. We demonstrate that our framework potentially achieves better performance compared to the conventional maximum likelihood estimation (MLE) approach through analytical examples and numerical experiments.
Towards Tight Convex Relaxations for Contact-Rich Manipulation
2024-07-15 · 11 citations
article · Open access
We present a novel method for global motion planning of robotic systems that interact with the environment through contacts. Our method directly handles the hybrid nature of such tasks using tools from convex optimization. We formulate the motion-planning problem as a shortest-path problem in a graph of convex sets, where a path in the graph corresponds to a contact sequence and a convex set models the quasi-static dynamics within a fixed contact mode. For each contact mode, we use semidefinite programming to relax the nonconvex dynamics that results from the simultaneous optimization of the object's pose, contact locations, and contact forces. The result is a tight convex relaxation of the overall planning problem, that can be efficiently solved and quickly rounded to find a feasible contact-rich trajectory. As an initial application for evaluating our method, we apply it on the task of planar pushing. Exhaustive experiments show that our convex-optimization method generates plans that are consistently within a small percentage of the global optimum, without relying on an initial guess, and that our method succeeds in finding trajectories where a state-of-the-art baseline for contact-rich planning usually fails. We demonstrate the quality of these plans on a real robotic system.
Towards Tight Convex Relaxations for Contact-Rich Manipulation
arXiv (Cornell University) · 2024-02-15
preprint · Open access
Acceleration by Stepsize Hedging: Multi-Step Descent and the Silver Stepsize Schedule
Journal of the ACM · 2024-12-13 · 4 citations
article · Open access · Senior author
Can we accelerate the convergence of gradient descent without changing the algorithm, just by judiciously choosing stepsizes? Surprisingly, we show that the answer is yes. Our proposed Silver Stepsize Schedule optimizes strongly convex functions in \(\kappa^{\log_{\rho} 2} \approx \kappa^{0.7864}\) iterations, where \(\rho = 1 + \sqrt{2}\) is the silver ratio and \(\kappa\) is the condition number. This is intermediate between the textbook unaccelerated rate \(\kappa\) and the accelerated rate \(\kappa^{1/2}\) due to Nesterov in 1983. The non-strongly convex setting is conceptually identical, and standard black-box reductions imply an analogous partially accelerated rate \(\varepsilon^{-\log_{\rho} 2} \approx \varepsilon^{-0.7864}\). We conjecture and provide partial evidence that these rates are optimal among all stepsize schedules. The Silver Stepsize Schedule is constructed recursively in a fully explicit way. It is non-monotonic, fractal-like, and approximately periodic of period \(\kappa^{\log_{\rho} 2}\). This leads to a phase transition in the convergence rate: initially super-exponential (acceleration regime), then exponential (saturation regime). The core algorithmic intuition is hedging between individually suboptimal strategies (short steps and long steps), since bad cases for the former are good cases for the latter, and vice versa. Properly combining these stepsizes yields faster convergence due to the misalignment of worst-case functions. The key challenge in proving this speedup is enforcing long-range consistency conditions along the algorithm’s trajectory. We do this by developing a technique that recursively glues constraints from different portions of the trajectory, thus removing a key stumbling block in previous analyses of optimization algorithms.
More broadly, we believe that the concepts of hedging and multi-step descent have the potential to be powerful algorithmic paradigms in a variety of contexts in optimization and beyond. This article publishes and extends the first author’s 2018 master’s thesis (advised by the second author), which established for the first time that judiciously choosing stepsizes can enable acceleration in convex optimization. Prior to this thesis, the only such result was for the special case of quadratic optimization, due to Young in 1953.
Recent grants
Novel Game-Theoretic Tools and Solution Concepts with Applications to Network Dynamics and Control
NSF · $390k · 2010–2015
AF: Large: Collaborative Research: Algebraic Proof Systems, Convexity, and Algorithms
NSF · $2.1M · 2016–2022
Optimization and Control of Stochastic Wireless
NSF · $240k · 2006–2009
FRG: Collaborative Research: Semidefinite optimization and convex algebraic geometry
NSF · $240k · 2008–2012
Frequent coauthors
- Rekha R. Thomas · 86 shared
- João Gouveia · 83 shared
- James Saunderson · 76 shared
- Hamza Fawzi · 72 shared
- Alan S. Willsky · 63 shared
- Asuman Ozdaglar · 54 shared
- Amir Ali Ahmadi · 39 shared
- Venkat Chandrasekaran · California Institute of Technology · 35 shared
Labs
MIT EECS Communication Lab · PI
Education
PhD, Control and Dynamical Systems
California Institute of Technology
Awards & honors
- 2025-26 EECS Faculty Award Roundup