Ceyhun Eksin

Associate Professor, Industrial & Systems Engineering · Corrie and Jim Furber '64 Faculty Fellow · Affiliated Faculty, Electrical & Computer Engineering

Texas A&M University · Industrial & Systems Engineering

Active 2008–2025

h-index: 18

Citations: 1.3k

Papers: 118 (54 in the last 5 years)

Funding: $1.3M (1 active grant)

About

Ceyhun Eksin is an Associate Professor in the Department of Industrial & Systems Engineering at Texas A&M University, where he also holds the Corrie and Jim Furber '64 Faculty Fellowship. He received his Ph.D. in Electrical & Systems Engineering from the University of Pennsylvania in 2015. His research focuses on the analysis and design of networked multi-agent systems, with particular interest in distributed optimization, game theory, evolutionary game theory, networks, autonomous systems, energy systems, and epidemics. His work studies how agents in these systems interact in order to improve efficiency and robustness, with applications such as energy markets and disease dynamics.

Research topics

Computer Science
Physics
Medicine
Economics
Psychology
Econometrics
Biology

Selected publications

A Lagrangian Framework for Safe Cooperative Reinforcement Learning
2025-12-09
article · Senior author
We consider the problem of safe cooperative multiagent reinforcement learning (MARL) within the framework of a constrained multiagent Markov decision process (MDP). Agents share a common value function and learn to coordinate their actions to maximize a joint objective while adhering to system-level constraints. These constraints can enforce safety, reliability, or additional regulatory requirements governing the evolution of the multiagent system. We propose a Lagrangian-based approach, where agents iteratively solve a relaxed Lagrangian MDP using a joint learning mechanism. During execution, agents independently follow their policies, accumulating constraint violations over an epoch, which are then used to update the Lagrange multipliers. We show that continuous execution of this primal-dual algorithm produces episodes which are feasible almost surely. Further, we prove that the sequence of policies generated by the algorithm yields a nonstationary approximately optimal solution for the safe cooperative MARL problem.
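The multiplier update mentioned in the abstract above can be illustrated with a generic projected dual-ascent rule. This is only a sketch under stated assumptions: the function name `update_multiplier`, the per-step `budget`, and the `step_size` are hypothetical and are not taken from the paper.

```python
def update_multiplier(lam, epoch_costs, budget, step_size):
    """Projected dual ascent on one scalar Lagrange multiplier.

    epoch_costs: constraint costs observed during one execution epoch.
    budget: allowed cost per step (an assumed form of the constraint).
    """
    # Accumulate the violation over the epoch (positive means over budget).
    violation = sum(cost - budget for cost in epoch_costs)
    # Raise the multiplier when the constraint is violated, lower it
    # otherwise, and project back onto the nonnegative reals.
    return max(0.0, lam + step_size * violation)
```

In this sketch a larger multiplier makes constraint cost more expensive in the relaxed problem that agents solve in the next epoch.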
The Lagrangian Method for Solving Constrained Markov Games
ArXiv.org · 2025-03-13
preprint · Open access · Senior author
We propose the concept of a Lagrangian game to solve constrained Markov games. Such games model scenarios where agents face cost constraints in addition to their individual rewards that depend on both agent joint actions and the evolving environment state over time. Constrained Markov games form the formal mechanism behind safe multiagent reinforcement learning, providing a structured model for dynamic multiagent interactions in a multitude of settings, such as autonomous teams operating under local energy and time constraints. We develop a primal-dual approach in which agents solve a Lagrangian game associated with the current Lagrange multiplier, simulate cost and reward trajectories over a fixed horizon, and update the multiplier using accrued experience. This update rule generates a new Lagrangian game, initiating the next iteration. Our key result shows that the sequence of solutions to these Lagrangian games yields a nonstationary Nash solution for the original constrained Markov game.
Fictitious Play in Product Markov Games with Kullback-Leibler Control Cost
2025-10-26
article
We present and analyze fictitious play for a new class of product Markov games with a Kullback-Leibler (KL) control cost. In a product Markov game, state transitions are the product of n Markov transition functions, where each agent controls its own local state transition dynamics given the common state and incurs a KL control cost for its effort. Fictitious play entails each agent best-responding to minimize its discounted sum of instantaneous costs, which depend on a KL control cost and a state cost, given local beliefs about other agents’ policies. Agents update their beliefs about other agents’ policies upon observation of the realized states. We show that fictitious play converges asymptotically to a Nash equilibrium of a product Markov potential game. Simulation results on a multi-agent cloud radio access network confirm the convergence result for the game with non-identical payoffs and demonstrate the speed of convergence.
Learning graph-Fourier spectra of textured surface images for defect localization
Manufacturing Letters · 2024-10-01 · 2 citations
article · Open access
In industrial manufacturing, product inspection remains a significant bottleneck, with only a small fraction of manufactured items undergoing inspection for surface defects. Advances in imaging systems and AI can allow automated full inspection of manufactured surfaces. However, even the most contemporary imaging and machine learning methods perform poorly at detecting defects in images with highly textured backgrounds that stem from diverse manufacturing processes. This paper introduces an approach based on graph Fourier analysis to automatically identify defective images, as well as the crucial graph Fourier coefficients that inform the defects in the images. The approach thereby facilitates precise localization and characterization of defects amidst highly textured backgrounds. A convolutional neural network model (1D-CNN) was trained with the coefficients of the graph Fourier transform of the images as the input to identify whether an image contains a defect, with a classification accuracy of 99.4%. An explainable AI method using SHAP (SHapley Additive exPlanations) was used to further analyze the trained 1D-CNN model to discern important spectral coefficients for each image. This approach sheds light on the crucial contribution of low-frequency graph eigen waveforms to precisely localizing surface defects in images, thereby advancing the realization of zero-defect manufacturing.
Analyzing the Use of Non-Pharmaceutical Personal Protective Measures Through Self-Interest and Social Optimum for the Control of an Emerging Disease
SSRN Electronic Journal · 2024-01-01
preprint · Open access
Simulation-Based Optimistic Policy Iteration For Multi-Agent MDPs with Kullback-Leibler Control Cost
arXiv (Cornell University) · 2024-10-19
preprint · Open access
This paper proposes an agent-based optimistic policy iteration (OPI) scheme for learning stationary optimal stochastic policies in multi-agent Markov Decision Processes (MDPs), in which agents incur a Kullback-Leibler (KL) divergence cost for their control efforts and an additional cost for the joint state. The proposed scheme consists of a greedy policy improvement step followed by an m-step temporal difference (TD) policy evaluation step. We use the separable structure of the instantaneous cost to show that the policy improvement step follows a Boltzmann distribution that depends on the current value function estimate and the uncontrolled transition probabilities. This allows agents to compute the improved joint policy independently. We show that both the synchronous (entire state space evaluation) and asynchronous (a uniformly sampled set of substates) versions of the OPI scheme with finite policy evaluation rollout converge to the optimal value function and an optimal joint policy asymptotically. Simulation results on a variant of the Stag-Hare game, cast as a multi-agent MDP with KL control cost, validate the scheme's performance in terms of minimizing the cost return.
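The Boltzmann form of the policy improvement step in the abstract above can be sketched for the textbook single-agent case. The assumptions here are mine, not the paper's: the one-step objective is KL(pi || p) + gamma * E_pi[V], the discount sits on the value term, and the names `boltzmann_improvement` and `one_step_cost` are hypothetical. The multi-agent scheme itself is not reproduced.

```python
import numpy as np

def boltzmann_improvement(P, V, gamma=0.9):
    """Greedy policy for the one-step cost KL(pi || p) + gamma * E_pi[V].

    P: (S, S) row-stochastic uncontrolled transition matrix.
    V: (S,) current value (cost-to-go) estimate.
    Returns pi with pi[s, s'] proportional to P[s, s'] * exp(-gamma * V[s']).
    """
    # Shifting V by its minimum only rescales each row and avoids overflow.
    weights = P * np.exp(-gamma * (V - V.min()))
    return weights / weights.sum(axis=1, keepdims=True)

def one_step_cost(pi_row, p_row, V, gamma=0.9):
    """KL control cost plus discounted expected next-state value."""
    mask = pi_row > 0  # terms with pi = 0 contribute nothing to the KL sum
    kl = np.sum(pi_row[mask] * np.log(pi_row[mask] / p_row[mask]))
    return kl + gamma * (pi_row @ V)
```

Because the minimizer has this closed form, the improved policy needs only the value estimate and the uncontrolled transition probabilities, which matches the dependence the abstract describes.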
Learning Nash in Constrained Markov Games With an α-Potential
IEEE Control Systems Letters · 2024-01-01 · 1 citation
article · Senior author
We develop a best-response algorithm for solving constrained Markov games assuming limited violations for the potential game property. The limited violations of the potential game property mean that changes in value function due to unilateral policy alterations can be measured by the potential function up to an error $\alpha$. We show the existence of stationary $\epsilon$-approximate constrained Nash policy whenever the set of feasible stationary policies is non-empty. Our setting has agents accessing an efficient probably approximately correct solver for a constrained Markov decision process which they use for generating best-response policies against the other agents’ former policies. For an accuracy threshold $\epsilon > 4\alpha$, the best-response dynamics generate provable convergence to $\epsilon$-Nash policy in finite time with probability at least $1-\delta$ at the expense of polynomial bounds on sample complexity that scales with the reciprocal of $\epsilon$ and $\delta$.
Almost Sure Convergence of Networked Policy Gradient over Time-Varying Networks in Markov Potential Games
arXiv (Cornell University) · 2024-10-26
preprint · Open access · Senior author
We propose networked policy gradient play for solving Markov potential games with continuous and/or discrete state-action pairs. During the game, agents use parametrized and differentiable policies that depend on the current state and the policy parameters of other agents. During training, agents update their policy parameters following stochastic gradients. The gradient estimation involves two consecutive episodes, generating unbiased estimators of reward and policy score functions. In addition, it involves keeping estimates of others' parameters using consensus steps given local estimates received through a time-varying communication network. In Markov potential games, there exists a potential value function among agents with gradients corresponding to the gradients of local value functions. Using this structure, we prove almost sure convergence to a stationary point of the potential value function with rate $O(1/\epsilon^2)$. Compared to previous works, our results do not require bounded policy gradients or initial agreement on the values of individual policy parameters. Numerical experiments on a dynamic multi-agent newsvendor problem verify the convergence of local beliefs and gradients. They further show that networked policy gradient play converges as fast as independent policy gradient updates, while collecting higher rewards.
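As a rough illustration of consensus steps over a time-varying network, the sketch below runs plain average consensus on a hypothetical 3-agent network whose single active link alternates between rounds. The weight matrices and the names `consensus_step` and `run_consensus` are assumptions for illustration; the paper's estimate-tracking scheme is not reproduced here.

```python
import numpy as np

def consensus_step(estimates, weights):
    """One consensus round: each agent averages the estimates it receives.

    estimates: (n_agents, dim) array of local estimates.
    weights: (n_agents, n_agents) doubly stochastic matrix for this round;
             weights[i, j] > 0 only if agent i hears from agent j.
    """
    return weights @ estimates

# Hypothetical network: link 0-1 is active on even rounds, link 1-2 on odd.
W_EVEN = np.array([[0.5, 0.5, 0.0], [0.5, 0.5, 0.0], [0.0, 0.0, 1.0]])
W_ODD = np.array([[1.0, 0.0, 0.0], [0.0, 0.5, 0.5], [0.0, 0.5, 0.5]])

def run_consensus(estimates, n_rounds):
    """Apply alternating consensus rounds and return the final estimates."""
    for t in range(n_rounds):
        estimates = consensus_step(estimates, W_EVEN if t % 2 == 0 else W_ODD)
    return estimates
```

Neither round's graph is connected on its own, but the union over two rounds is, and that is enough for the local estimates to agree on the average in this example.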
Analyzing the use of non-pharmaceutical personal protective measures through self-interest and social optimum for the control of an emerging disease
Mathematical Biosciences · 2024-07-04
article
Average Submodularity of Maximizing Anticoordination in Network Games
SIAM Journal on Control and Optimization · 2024-09-20
article · Senior author

Recent grants

CAREER: Evolutionary Games in Dynamic and Networked Environments for Modeling and Controlling Large-Scale Multi-agent Systems
NSF · $503k · 2023–2028
CIF: Small: Communication-Aware Decentralized Game-Theoretic Learning Algorithms for Networked Systems with Uncertainty
NSF · $361k · 2020–2024
Modeling and Control of Coevolutionary Network Formation with Applications to Finishing Processes for 3D Printed Components
NSF · $430k · 2020–2024

Frequent coauthors

Alejandro Ribeiro
University of Pennsylvania
34 shared
Joshua S. Weitz
University of Maryland, College Park
23 shared
Keith Paarporn
21 shared
Armita Nourmohammad
19 shared
Sarper Aydın
17 shared
Ali Jadbabaie
15 shared
Pooya Molavi
Northwestern University
14 shared
Furkan Sezer
12 shared
