Stephen Guy

Associate Professor · Verified

University of Minnesota · Computer Science and Engineering

Active 1998–2026

h-index 32

Citations 5.8k

Papers 102 · 14 last 5y

Funding $170k

About

Stephen Guy is an Associate Professor in the Department of Computer Science & Engineering at the University of Minnesota Twin Cities. He joined the department as an assistant professor in 2012 and was promoted to associate professor in 2018. He also serves as the director of graduate studies. His educational background includes a B.S. in Computer Engineering from the University of Virginia (2006), and both an M.S. (2009) and a Ph.D. (2012) in Computer Science from the University of North Carolina at Chapel Hill. His research focuses on motion planning, predictive simulations, and human behavioral analysis, with particular emphasis on approaches that leverage large-scale data analysis, optimization, and machine learning. He is interested in cross-disciplinary applications of his work to domains such as robotics, video games, virtual reality, and medicine. Stephen Guy has received recognition for his teaching, including the Charles E. Bowers Faculty Teaching Award in 2018, and is a member of professional organizations such as ACM, IEEE, the IEEE Computer Society, and the IEEE Robotics and Automation Society.

Research topics

Computer Science
Artificial Intelligence
Machine Learning
Data Mining
Statistics
Human–computer interaction
Mathematics
Distributed computing
Real-time computing
Algorithm

Selected publications

Least-effort trajectories lead to emergent crowd behaviors
UNC Libraries · 2026-04-03
article · Open access
Pedestrian crowds often have been modeled as many-particle systems, usually using computer models known as multiagent simulations. The key challenge in modeling crowds is to develop rules that guide how the particles or agents interact with each other in a way that faithfully reproduces paths and behaviors commonly seen in real human crowds. Here, we propose a simple and intuitive formulation of these rules based on biomechanical measurements and the principle of least effort. We present a constrained optimization method to compute collision-free paths of minimum caloric energy for each agent, from which collective crowd behaviors can be reproduced. We show that our method reproduces common crowd phenomena, such as arching and self-organization into lanes. We also validate the flow rates and paths produced by our method and compare them to those of real-world crowd trajectories.
Design Principles Guide Meaningful Play by Improving Ease and Intentionality of Game Design
International Journal of Social Science Research · 2026-03-15
article · Open access
This project examines the role of design principles and how they impact the experience of players using a digital minigame created and inspired by the CLUE board game. We examine the role of design principles in guiding users toward finding hidden clues and whether or not users are drawn to spots with design principles predetermined and embedded. We investigate this by examining if pre-selected spots with design principles are perceived as more intentional than a baseline of spots identified by random points, mediated through the GroundingDINO object detector. Results showed that players were twice as likely to find design principles spots as randomly selected spots. Players found the design principles spots more intuitive and perceived them as more intentional than the random baseline. In addition, results showed varying effects from different design principles, which can improve game design and allow designers to control the difficulty of the game experience.
Regularized Multilevel Multinomial Regression for Select-All-That-Apply Responses and High-Dimensional Predictors with Applications to Perception of Facial Expressions
Psychometrika · 2026-04-22
article · Open access
This article develops an analysis pipeline for quantifying and relating mouth shape variation to the emotions perceived from facial expressions. We use open-source data that contains ratings from 802 fairgoers on 27 smile-like expressions. Each rater was given a list of seven emotions (happy, sad, anger, contempt, fear, surprise, and disgust) and asked to select all of the words that best described the facial expression. To develop a generalizable method for quantifying mouth shape variation, we leverage statistical shape analysis techniques to parameterize each mouth's shape in terms of 30 systematically placed landmarks that outline the upper and lower lips. Furthermore, we demonstrate that a three-dimensional representation of these landmark coordinates produces an interpretable feature set that outperforms the original and full-dimensional feature sets in terms of predictive performance. To connect the mouth shape features to the emotion ratings, we develop a nonparametric multinomial regression model that is capable of shrinkage and selection with high-dimensional predictors. Our results demonstrate that the proposed method can produce easily interpretable model predictions that enhance our understanding of the nature in which subtle variations in mouth shape affect the perception of a facial expression.
Simultaneous Localization and Affordance Prediction of Tasks from Egocentric Video
2025-05-19
article · Senior author
Vision-Language Models (VLMs) have shown great success as foundational models for downstream vision and natural language applications in a variety of domains. However, these models are limited to reasoning over objects and actions currently visible on the image plane. We present a spatial extension to the VLM, which leverages spatially-localized egocentric video demonstrations to augment VLMs in two ways - through understanding spatial task-affordances, i.e. where an agent must be for the task to physically take place, and the localization of that task relative to the egocentric viewer. We show our approach outperforms the baseline of using a VLM to map similarity of a task's description over a set of location-tagged images. Our approach has less error both on predicting where a task may take place and on predicting what tasks are likely to happen at the current location. The resulting representation will enable robots to use egocentric sensing to navigate to, or around, physical regions of interest for novel tasks specified in natural language.
Comparing Robotic and Computer Vision Assessments of Unilateral and Bilateral Reaching in Healthy Adults
bioRxiv (Cold Spring Harbor Laboratory) · 2025-08-07
preprint · Open access
Assessment of reaching is foundational to upper limb neurorehabilitation. Current neurorehabilitation needs have increased the demand for quantitative clinical assessments of bilateral coordination. Robotics and computer vision for motion tracking are two means to provide relevant quantitative metrics but have many differences including the dimensionality of reaching movements (planar versus three-dimensional) and data acquisition. We do not know how consistent measures of bilateral coordination performance are between these different assessments. In this study, we examined how one robotic and one computer vision method can identify differences between symmetrical and asymmetrical reaching, and the correlations in movement time and hand lag between these two approaches. Thirty healthy young adults completed four reaching games using the Kinarm exoskeleton robot and a custom developed augmented reality assessment using computer vision. We found that both approaches were able to detect well-established movement time and hand lag differences between symmetrical and asymmetrical reaching, with the differences between symmetrical and asymmetrical being larger with the computer vision approach. Moderate correlations were found between approaches for unilateral and symmetric reaching in both movement time and hand lags; however, no significant correlations were found between approaches for asymmetric reaching. Our results show that reaching task performance differs between robotic and computer vision-based assessment; however, both approaches provide quantitative metrics of unilateral and bilateral reaching that are consistent with prior research. There are benefits and tradeoffs to each approach, and this study informs how clinicians and researchers can consider the methodological differences when determining which assessment method to use.
Comparing robotic and computer vision assessments of unilateral and bilateral reaching in healthy adults
Smart Health · 2025-10-01
article
Reaching Motion Characterization Across Childhood via Augmented Reality Games
ArXiv.org · 2025-02-20
preprint · Open access · Senior author
While performance in coordinated motor tasks has been shown to improve in children as they age, the characterization of children's movement strategies has been underexplored. In this work, we use upper-body motion data collected from an augmented reality reaching game, and show that short (13 second) sections of motion are sufficient to reveal arm motion differences across child development. To explore what drives this trend, we characterize the movement patterns across different age groups by analyzing (1) directness of path, (2) maximum speed, and (3) progress towards the reaching target. We find that although maximum arm velocity decreases with age (p = 0.02), their paths to goal are more direct (p = 0.03), allowing for faster time to goal overall. We also find that older children exhibit more anticipatory reaching behavior, enabling more accurate goal-reaching (i.e. no overshooting) compared to younger children. The resulting analysis has potential to improve the realism of child-like digital characters and advance our understanding of motor skill development.
A Picture is Worth a Thousand Clinical Assessments: Use of Video-Based Pose Estimation to Augment Post-Stroke Upper Limb Assessment of Bilateral Tasks
Restorative Neurology and Neuroscience · 2025-12-22
article · Open access · Senior author
Clinical assessments of the post-stroke upper limbs have several limitations in that they focus primarily on unilateral movements, rely on observer-based ordinal scales, and give limited insight into movement quality. Human pose estimation uses computer vision to extract motion data from videos, making it a clinically feasible tool to assess movement and overcome many challenges of traditional clinical assessments. Our objective of this work was to demonstrate the use of video-based pose estimation to enhance the assessment of bilateral tasks in individuals post-stroke through visualizations and quantitative metrics. Using single camera video recordings of the Chedoke Hand and Arm Activity Inventory in two individuals with chronic stroke and one neurologically intact individual, we demonstrate differences in movement patterns including increased compensatory movements of proximal joints and asymmetries. We were able to detect differences that the traditional assessment scoring could not, demonstrating the potential of computer vision to enhance clinical assessment.
SENT Map -- Semantically Enhanced Topological Maps with Foundation Models
ArXiv.org · 2025-11-05
preprint · Open access
We introduce SENT-Map, a semantically enhanced topological map for representing indoor environments, designed to support autonomous navigation and manipulation by leveraging advancements in foundational models (FMs). Through representing the environment in a JSON text format, we enable semantic information to be added and edited in a format that both humans and FMs understand, while grounding the robot to existing nodes during planning to avoid infeasible states during deployment. Our proposed framework employs a two stage approach, first mapping the environment alongside an operator with a Vision-FM, then using the SENT-Map representation alongside a natural-language query within an FM for planning. Our experimental results show that semantic-enhancement enables even small locally-deployable FMs to successfully plan over indoor environments.
Characterizing masticatory motion of dogs using optical and electromagnetic motion tracking
Frontiers in Veterinary Science · 2025-07-03
article · Open access
Introduction: Accurate knowledge of masticatory motion across a variety of food materials is essential for ex-vivo testing and simulation of the food-teeth interaction. Yet, the masticatory motion has never been fully characterized in the domestic dog (Canis lupus), limiting our ability for ex-vivo modelling. Objective: The aim of this study was to characterize masticatory motion among a variety of different foods in beagle dogs using optical and electromagnetic motion tracking. Results: We confirmed that the masticatory pattern in the beagle is a hinge motion with no clinically meaningful horizontal motion of the mandible. The mouth opening was not significantly different among different food and treat types regardless of food stiffness and force to fracture of the food, with a mean and standard deviation of 2.51 ± 0.33 (range 1.93–2.95) cm between the canine teeth during chewing. Conversely, frequency of chewing was influenced by food type, with kibbles having a significantly higher peak mean chewing frequency (2.93 Hz) compared to other feeds. Frequency of chewing was linearly correlated to the force to fracture of the food material (p = 0.03, R² = 0.56), while stiffness of food did not significantly affect peak chewing frequency. Conclusion: Data from this study can guide ex-vivo modelling of the feed-teeth interaction for product design and testing, especially those that focus on prevention of periodontal disease and dentoalveolar trauma.

Recent grants

EAGER: Uncertainty-aware Planning for Robot Navigation in Human Environments
NSF · $170k · 2017–2020

Frequent coauthors

Dinesh Manocha
University of Maryland, College Park
44 shared
Ming C. Lin
31 shared
Ioannis Karamouzas
University of California, Riverside
25 shared
Nick Sohre
University of Minnesota System
17 shared
Jur van den Berg
14 shared
Sujeong Kim
12 shared
Sofía Lyford-Pike
University of Minnesota
11 shared
Nathaniel E. Helwig
University of Minnesota
11 shared

Labs

Applied Motion Lab · PI

Awards & honors

2018: College of Science and Engineering - Charles E. Bowers…
