Jon E. Froehlich

Professor · Verified

University of Washington · Computer Science & Engineering

Active 1967–2026

h-index: 48

Citations: 10.8k

Papers: 200 (70 in the last 5 years)

Funding: $4.5M (1 active grant)

About

Jon E. Froehlich is the Lab Director of the Makeability Lab at the University of Washington's Computer Science department. He earned his PhD in 2011 from the University of Washington with a dissertation titled 'Sensing and Feedback of Everyday Activities to Promote Environmental Behaviors.' His research focuses on designing, building, and evaluating new interactive tools and techniques to address pressing societal challenges. The Makeability Lab emphasizes both technological innovations that enable new human abilities and an educational mission to help students gain new skills through research, invention, and human-centered design.

Research topics

Mathematics
Engineering
Computer Science
Mathematics education
World Wide Web
Epistemology
Psychology
Multimedia
Human–computer interaction
Geography

Selected publications

CapNav: Benchmarking Vision Language Models on Capability-conditioned Indoor Navigation
Open MIND; arXiv (Cornell University) · 2026-02-20
preprint · Open access · Senior author
Vision-Language Models (VLMs) have shown remarkable progress in Vision-Language Navigation (VLN), offering new possibilities for navigation decision-making that could benefit both robotic platforms and human users. However, real-world navigation is inherently conditioned by the agent's mobility constraints. For example, a sweeping robot cannot traverse stairs, while a quadruped can. We introduce Capability-Conditioned Navigation (CapNav), a benchmark designed to evaluate how well VLMs can navigate complex indoor spaces given an agent's specific physical and operational capabilities. CapNav defines five representative human and robot agents, each described with physical dimensions, mobility capabilities, and environmental interaction abilities. CapNav provides 45 real-world indoor scenes, 473 navigation tasks, and 2365 QA pairs to test if VLMs can traverse indoor environments based on agent capabilities. We evaluate 13 modern VLMs and find that current VLMs' navigation performance drops sharply as mobility constraints tighten, and that even state-of-the-art models struggle with obstacle types that require reasoning on spatial dimensions. We conclude by discussing the implications for capability-aware navigation and the opportunities for advancing embodied spatial reasoning in future VLMs. The benchmark is available at https://github.com/makeabilitylab/CapNav
Unseen City Canvases: Exploring Blind and Low Vision People's Perspectives on Urban and Public Art Accessibility
arXiv (Cornell University) · 2026-03-27
preprint · Open access
Public art can hold cultural, social, political, and aesthetic significance, enriching urban environments and promoting well-being. However, a majority of urban art is inaccessible to blind and low vision (BLV) people. Most art access research has focused on private and curated settings (e.g., museums, galleries) and most urban access work has centered on outdoor navigation, leaving urban and public art accessibility largely understudied. We conducted semi-structured interviews with 16 BLV participants, using design probes featuring AI-generated descriptions and real-time AI interactions to investigate preferences for both discovering and engaging with urban art. We found that BLV people valued spontaneous art exploration, multisensory (e.g., tactile, auditory, olfactory) engagement, and detailed descriptions of culturally significant artwork. Participants also highlighted challenges distinct to urban art contexts: safety took precedence over art exploration, multisensory access measures could be disruptive to others in the public space, and inaccurate AI descriptions could lead to cultural erasure. Our contributions include empirical insights on BLV preferences for urban art discovery and engagement, seven design dimensions for public art access solutions, and implications for expanding HCI urban accessibility research beyond navigation.
ARGaze: Autoregressive Transformers for Online Egocentric Gaze Estimation
Open MIND; ArXiv.org · 2026-02-04
preprint · Open access
Online egocentric gaze estimation predicts where a camera wearer is looking from first-person video using only past and current frames, a task essential for augmented reality and assistive technologies. Unlike third-person gaze estimation, this setting lacks explicit head or eye signals, requiring models to infer current visual attention from sparse, indirect cues such as hand-object interactions and salient scene content. We observe that gaze exhibits strong temporal continuity during goal-directed activities: knowing where a person looked recently provides a powerful prior for predicting where they look next. Inspired by vision-conditioned autoregressive decoding in vision-language models, we propose ARGaze, which reformulates gaze estimation as sequential prediction: at each timestep, a transformer decoder predicts current gaze by conditioning on (i) current visual features and (ii) a fixed-length Gaze Context Window of recent gaze target estimates. This design enforces causality and enables bounded-resource streaming inference. We achieve state-of-the-art performance across multiple egocentric benchmarks under online evaluation, with extensive ablations validating that autoregressive modeling with bounded gaze history is critical for robust prediction. We will release our source code and pre-trained models.
GeoVisA11y: An AI-based Geovisualization Question-Answering System for Screen-Reader Users
Open MIND; ArXiv.org · 2026-03-08
preprint · Open access · Senior author
Geovisualizations are powerful tools for communicating spatial information, but are inaccessible to screen-reader users. To address this limitation, we present GeoVisA11y, an LLM-based question-answering system that makes geovisualizations accessible through natural language interaction. The system supports map reading, analysis, interpretation and navigation by handling analytical, geospatial, visual and contextual queries. Through user studies with 12 screen-reader users and sighted participants, we demonstrate that GeoVisA11y effectively bridges accessibility gaps while revealing distinct interaction patterns between user groups. We contribute: (1) an open-source, accessible geovisualization system, (2) empirical findings on query and navigation differences, and (3) a dataset of geospatial queries to inform future research on accessible data visualization.
BikeButler: A Personalized, Context-sensitive Bike Routing Tool using Open Data and VLM-based Analyses of Street View Imagery
2026-04-13 · 1 citation
article · Open access · Senior author
Urban cycling benefits personal wellbeing, public health, and global sustainability. While current tools such as Google and Apple Maps provide bike route recommendations, they do not account for a person’s dynamic context (e.g., commuting, recreation). We introduce BikeButler, a personalized, context-sensitive bicycle route generation tool that enables users to generate, compare, virtually preview, and iteratively customize bike routes via custom profiles that encode seven bikeability features, including bike lane existence, slope, vegetation, and surface quality—fusing data from OpenStreetMap, open government data, and a custom VLM-based analysis of Street View images. To design BikeButler, we employed a human-centered, iterative approach starting with formative interviews and culminating in a user study (N=16). Our findings demonstrate that bike routing preferences change as a function of context, that BikeButler enables users to quickly create and iterate context-sensitive routes, and that generated routes differ significantly from Google Maps bike routing, reinforcing the importance of personalization.
Towards Human-AI Accessibility Mapping in India: VLM-Guided Annotations and POI-Centric Analysis in Chandigarh
ArXiv.org · 2026-02-09
article · Open access
Project Sidewalk is a web-based platform that enables crowdsourced, city-scale assessment of sidewalk accessibility by having users virtually walk through city streets in Google Street View. The tool has been used in 40 cities across the world, including in the US, Mexico, Chile, and Europe. In this paper, we describe adaptation efforts to enable deployment in Chandigarh, India, including modifying annotation types and provided examples and integrating VLM-based mission guidance, which adapts instructions based on street scene and metadata analysis. Our evaluation with 3 annotators indicates the utility of AI mission guidance, with an average score of 4.66. Using this adapted Project Sidewalk tool, we conduct a Points of Interest (POI)-centric accessibility analysis for three sectors in Chandigarh with very different land uses (residential, commercial, and institutional), covering about 40 km of sidewalks. Across the 40 km of roads audited in the three sectors and around 230 POIs, we identified 1,644 of 2,913 locations where infrastructure improvements could enhance accessibility.

Recent grants

EXP: BodyVis: Advancing New Science Learning and Inquiry Experiences via Custom Designed Wearable On-Body Sensing and Visualization
NSF · $550k · 2014–2019
HCC: Medium: Combining Crowdsourcing and Computer Vision for Street-level Accessibility
NSF · $1.2M · 2013–2019
CAREER: A Tangible-Graphical Approach to Engage Young Children in Wearable Design
NSF · $541k · 2017–2024
SCC-IRG Track 1: Crowd+AI Tools to Map, Analyze, and Visualize Sidewalk Accessibility for Inclusive Cities
NSF · $2.0M · 2021–2026
CAREER: A Tangible-Graphical Approach to Engage Young Children in Wearable Design
NSF · $227k · 2017–2018

Frequent coauthors

Leah Findlater · Apple (United States) · 72 shared
Liang He · 24 shared
Lee Stearns · University of Maryland, College Park · 20 shared
Dhruv Jain · 19 shared
James A. Landay · Stanford University · 17 shared
Michael Saugstad · University of Washington · 17 shared
Manaswi Saha · Accenture (Switzerland) · 16 shared
Uran Oh · Ewha Womans University · 14 shared

Labs

Makeability Lab · PI
An advanced research lab in Human-Computer Interaction and AI

Education

Ph.D., Computer Science
University of Washington
2011
M.S., Computer Science
University of Washington
2004
B.S., Computer Science and Engineering
University of California, San Diego
2002

Awards & honors

21 Best Paper and Honorable Mention awards
Sloan Fellowship
UW Distinguished Dissertation Award
Google Faculty Research Awards
UW College of Engineering Outstanding Faculty Award (2021)
