Emily Kyle Fox

Teaching Associate Professor · Verified

Virginia Tech · Computer Science

Active 1958–2025

h-index: 60

Citations: 14.8k

Papers: 782 (112 last 5y)

Funding: $2.5M



About

Emily Kyle Fox is a Teaching Associate Professor at the Siebel School of Computing and Data Science at the University of Illinois Urbana-Champaign. She earned her B.S. (2008), M.S. (2010), and Ph.D. (2013) in Computer Science, all from the University of Illinois Urbana-Champaign. Her research interests include algorithms, computational geometry and topology, and graph algorithms. Her publications address clustering with faulty centers, minimum cuts in hypergraphs and surface graphs, geometric edit distance, and approximation schemes for geometric transportation. She received the Best Teacher award at the University of Texas at Dallas for the 2019-2020 academic year, and her research has been supported by a National Science Foundation CAREER Award. Before joining the University of Illinois Urbana-Champaign, she held positions at the University of Texas at Dallas.

Research topics

Artificial Intelligence
Computer Science
Information Retrieval
Natural Language Processing
Data Mining
Machine Learning
Database
Data science
Linguistics
Psychology
Law

Selected publications

From Data Deficient to Big Data in Shark Conservation
Fish and Fisheries · 2025-08-11 · 1 citation
article · Open access
Citizen science is increasingly harnessed worldwide to gather data otherwise requiring a prohibitive investment of funding and time. Meanwhile, the revolution in digital communication offers opportunities from crowdsourcing, big data approaches and social network mining to quickly and cost‐effectively fill major gaps in knowledge necessary to protect endangered populations. Sharks are among the most endangered and data‐poor vertebrates in the ocean. Mainly due to overfishing, many shark populations are declining worldwide, while most species lack basic abundance, distribution and life‐history data. Hence, filling knowledge gaps across taxa, ecosystems, and regions is urgently needed to increase our understanding of their ecology, develop effective conservation actions and reverse their loss. Here, we introduce a novel citizen science and crowdsourcing approach for conservation through sharkPulse, a new platform automating data ingestion and organisation to build the largest database of shark occurrence records to date. Designed to complement and extend similar biodiversity monitoring tools relying heavily on user submissions, sharkPulse aims to source large streams of online shark images and transform them into occurrence records, filling knowledge gaps in shark ecology and biology. This platform offers a blueprint to leverage AI and big data approaches, social network data mining and participatory science to efficiently and continuously source visual media materials and transform the monitoring of data‐limited marine and terrestrial animal populations.
AI-Facilitated Episodic Future Thinking For Adults with Obesity
ArXiv.org · 2025-03-08
preprint · Open access · Senior author
Episodic Future Thinking (EFT) involves vividly imagining personal future events and experiences in detail. It has shown promise as an intervention to reduce delay discounting (the tendency to devalue delayed rewards in favor of immediate gratification) and to promote behavior change in a range of maladaptive health behaviors. We present EFTeacher, an AI chatbot powered by the GPT-4-Turbo large language model, designed to generate EFT cues for users with lifestyle-related conditions. To evaluate the feasibility and usability of EFTeacher, we conducted a mixed-methods study that included usability assessments, user evaluations based on content characteristics questionnaires, and semi-structured interviews. Qualitative findings indicate that participants perceived EFTeacher as communicative and supportive through an engaging dialogue. The chatbot facilitated imaginative thinking and reflection on future goals. Participants appreciated its adaptability and personalization features, though some noted challenges such as repetitive dialogue and verbose responses. Our findings underscore the potential of large language model-based chatbots in EFT interventions targeting maladaptive health behaviors.
Don’t let food poisoning crash your picnic – six tips to keep your spread safe
2025-07-11
preprint · 1st author · Corresponding
What’s in a cue?: Using natural language processing to quantify content characteristics of episodic future thinking in the context of overweight and obesity
Health Psychology and Behavioral Medicine · 2025-06-02
article · Open access
Episodic future thinking (EFT), an intervention in which participants vividly imagine their future, has been explored as a cognitive intervention to reduce delay discounting and decrease engagement in harmful health behaviors. In these studies, participants generate text descriptions of personally meaningful future events. The content of these text descriptions, or cues, is heterogeneous and can vary along several dimensions (e.g. references to health, celebrations, family; vividness; emotional valence). However, little work has quantified this heterogeneity or potential importance for EFT's efficacy. To better understand the potential impact of EFT content in the context of health behavior change (e.g. diet) among people with or at risk for obesity and related conditions, we used data from 19 prior EFT studies, including 1705 participants (mean body mass index = 33.1) who generated 9714 cues. We used natural language processing to classify EFT content and examined whether EFT content moderated effects on delay discounting. Cues most commonly involved recreation, food, and spending time with family, and least commonly involved references to health and self-improvement. Cues were generally classified as highly vivid, episodic, and positively valent (consistent with the intervention's design). In multivariate regression with model selection, EFT content did not significantly moderate the effect of the episodic thinking intervention. Thus, we find no evidence that any of the content characteristics we examined were important moderators of the efficacy of EFT in reducing delay discounting. This suggests that EFT's efficacy is robust against variability in these characteristics. However, note that in all studies, EFT methods were designed to generate high levels of vividness, episodicity, and emotional valence, potentially resulting in a ceiling effect in these content areas. Moreover, EFT content was not experimentally manipulated, limiting causal inference. 
Future studies should experimentally examine these and other content characteristics and evaluate their possible role in EFT's efficacy.
Toward Robust URL Extraction for Open Science: A Study of arXiv File Formats and Temporal Trends
ArXiv.org · 2025-09-05
preprint · Open access
In this work, we study how URL extraction results depend on input format. We compiled a pilot dataset by extracting URLs from 10 arXiv papers and used the same heuristic method to extract URLs from four formats derived from the PDF files or the source LaTeX files. We found that accurate and complete URL extraction from any single format or a combination of multiple formats is challenging, with the best F1-score of 0.71. Using the pilot dataset, we evaluate extraction performance across formats and show that structured formats like HTML and XML produce more accurate results than PDFs or Text. Combining multiple formats improves coverage, especially when targeting research-critical resources. We further apply URL extraction on two tasks, namely classifying URLs into open-access datasets and software and the others, and analyzing the trend of URLs usage in arXiv papers from 1992 to 2024. These results suggest that using a combination of multiple formats achieves better performance on URL extraction than a single format, and the number of URLs in arXiv papers has been steadily increasing since 1992 to 2014 and has been drastically increasing from 2014 to 2024. The dataset and the Jupyter notebooks used for the preliminary analysis are publicly available at https://github.com/lamps-lab/arxiv-urls
Five Years of Ublituximab in Relapsing Multiple Sclerosis: Additional Results from Open-label Extension of ULTIMATE I and II Studies (P11-1.006)
Neurology · 2025-04-07
article
To evaluate the long-term clinical efficacy and safety of ublituximab.
VTechAGP: An Academic-to-General-Audience Text Paraphrase Dataset and Benchmark Models
2025-01-01
article · Open access
Ming Cheng, Jiaying Gong, Chenhan Yuan, William A Ingram, Edward Fox, Hoda Eldardiry. Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers). 2025.
Analyzing Quality, Bias, and Performance in Text-to-Image Generative Models
arXiv (Cornell University) · 2024-06-28 · 2 citations
preprint · Open access · Senior author
Advances in generative models have led to significant interest in image synthesis, demonstrating the ability to generate high-quality images for a diverse range of text prompts. Despite this progress, most studies ignore the presence of bias. In this paper, we examine several text-to-image models not only by qualitatively assessing their performance in generating accurate images of human faces, groups, and specified numbers of objects but also by presenting a social bias analysis. As expected, models with larger capacity generate higher-quality images. However, we also document the inherent gender or social biases these models possess, offering a more complete understanding of their impact and limitations.
Multi-dimensional Edge-Embedded GCNs for Arabic Text Classification
Lecture notes in computer science · 2024-01-01
book-chapter · Senior author
Automating Chapter-Level Classification for Electronic Theses and Dissertations
2024-12-15 · 2 citations
article · Senior author
Traditional archival practices for describing electronic theses and dissertations (ETDs) rely on broad, high-level metadata schemes that fail to capture the depth, complexity, and interdisciplinary nature of these long scholarly works. The lack of detailed, chapter-level content descriptions impedes researchers’ ability to locate specific sections or themes, thereby reducing discoverability and overall accessibility. By providing chapter-level metadata information, we improve the effectiveness of ETDs as research resources. This makes it easier for scholars to navigate them efficiently and extract valuable insights. The absence of such metadata further obstructs interdisciplinary research by obscuring connections across fields, hindering new academic discoveries and collaboration. In this paper, we propose a machine learning and AI-driven solution to automatically categorize ETD chapters. This solution is intended to improve discoverability and promote understanding of chapters. Our approach enriches traditional archival practices by providing context-rich descriptions that facilitate targeted navigation and improved access. We aim to support interdisciplinary research and make ETDs more accessible. By providing chapter-level classification labels and using them to index in our developed prototype system, we make content in ETD chapters more discoverable and usable for a diverse range of scholarly needs. Implementing this AI-enhanced approach allows archives to serve researchers better, enabling efficient access to relevant information and supporting deeper engagement with ETDs. This will increase the impact of ETDs as research tools, foster interdisciplinary exploration, and reinforce the role of archives in scholarly communication within the data-intensive academic landscape.

Recent grants

SGER: DL-VT416: A Digital Library Testbed for Research Related to 4/16/2007 at Virginia Tech
NSF · $212k · 2007–2009
Collaborative Project: Ensemble: Enriching Communities and Collections to Support Education in Computing
NSF · $510k · 2008–2014
III: Small: Integrated Digital Event Archiving and Library (IDEAL)
NSF · $500k · 2013–2017
I-Corps: Automated Summarization Technology
NSF · $50k · 2019–2021
III:Small:Integrated Digital Library Support for Crisis, Tragedy, and Recovery
NSF · $500k · 2009–2013

Frequent coauthors

Krzysztof Selmaj · University of Warmia and Mazury in Olsztyn · 252 shared
Robert J. Fox · 245 shared
J. Theodore Phillips · 196 shared
Eva Havrdová · General University Hospital in Prague · 175 shared
Michael Hutchinson · University College Dublin · 161 shared
Mariko Kita · Virginia Mason Medical Center · 161 shared
Robert Herndon · California State Polytechnic University · 147 shared
Ralf Gold · Ruhr University Bochum · 147 shared

Education

Ph.D., Computer Science
University of Illinois at Urbana-Champaign
2005
M.S., Computer Science
University of Illinois at Urbana-Champaign
2001
B.S., Computer Science
University of Illinois at Urbana-Champaign
1999

Awards & honors

Best Teacher in Computer Science, Erik Jonsson School of Eng…
National Science Foundation CAREER Award (CCF-1942597) (2020…
