Natasha Holmes

Ann S. Bowers Associate Professor · Astronomy, CDER, Physics

Cornell University · Physics

Active 2009–2025

h-index: 21

Citations: 1.7k

Papers: 129 · 80 in the last 5 years

Funding: $787k



About

Natasha Holmes is an Ann S. Bowers Associate Professor in the Department of Physics at Cornell University. Her research focuses on teaching and learning in physics and other STEM courses: how students acquire skills and content knowledge, and how different course environments influence their motivation, persistence, and understanding of science and measurement. Her work uses both qualitative and quantitative methods to evaluate the variables that affect student learning and experiences in physics and STEM education. Holmes's primary research area is the efficacy of hands-on laboratory courses. She investigates assessment methods to determine what labs achieve and how teaching methods can improve learning outcomes. Her team has developed instruments to assess critical thinking and reasoning about uncertainty and measurement, and evaluates the impact of pedagogical strategies such as course redesigns and research experiences. Her research also explores student experiences, including group work dynamics, social biases, and the relationship between coursework and undergraduate research experiences, with the aim of informing more effective teaching practices in physics education.

Research topics

Computer Science
Mathematics education
Psychology
Political Science
Sociology
Artificial Intelligence
Social Science
Multimedia
Library science
Medicine
Human–computer interaction
Medical education

Selected publications

Invest in science education research to make science open to all
Nature Physics · 2025-08-25
article · 1st author · Corresponding
Bias in physics peer recognition does not explain gaps in perceived peer recognition
Nature Physics · 2025-03-05 · 5 citations
article · Senior author
Structuring groups for gender equitable equipment usage in labs
Physical Review Physics Education Research · 2025-06-03 · 1 citation
article · Open access · Senior author
Previous research has found gender inequitable equipment usage across various lab course contexts. Few studies, however, have tested possible remediation strategies. In this work, we use hierarchical linear modeling to compare men and women’s lab equipment usage in two group work structures across three course contexts. In one in-person course, students formed their own groups in class and rotated into new groups every unit. In the other two courses, one in person and one remote, students were assigned groups formed by the instructor and worked with the same group all semester. In line with former studies, we found gender inequitable equipment usage in the course with in-class formed, rotated groups. We did not observe gender inequitable equipment usage, however, in the course with instructor-assigned, fixed groups. Analyzing equipment usage across the semester within each course, our results suggest that this improvement comes from a combination of both instructor-assigned groups and keeping groups fixed for the semester. Our findings present many opportunities for subsequent controlled studies to probe these practices.
Perceptions of interdisciplinary critical thinking among biology and physics undergraduates
Physical Review Physics Education Research · 2025-04-09 · 4 citations
article · Open access · Senior author
There is a growing need for more effective interdisciplinary science instruction across undergraduate degree programs. In addition to supporting students’ connections between disciplinary concepts, interdisciplinary learning can develop students’ critical thinking skills and allow them to evaluate scientific investigations and claims between diverse topics. Physics Education Research literature has particularly focused on introductory physics courses for life sciences students, in part because students majoring in life sciences represent one of the largest demographics enrolled in physics courses. This literature has primarily focused on students’ development of conceptual understanding, modeling skills, and perspectives of the two fields. In this study, we explored how biology and physics undergraduates approach and perceive critical thinking between the two disciplines. We conducted structured think-aloud interviews with biology and physics students, asking students to first complete portions of established biology and physics critical thinking assessments and then respond to several follow-up questions about critical thinking more generally. Using thematic analysis to inductively code interview responses into emergent themes, we found that most students, regardless of major, described different approaches to evaluating biology and physics experiments. However, physics students provided similar definitions of critical thinking in the two disciplines, while biology students provided similar and different definitions in almost equal numbers. The exception was related to the use of quantitative methods solely being associated with critical thinking in physics, despite both critical thinking assessments involving quantitative data analysis. When looking across constructs, we saw no clear trends or relationships between individual students’ responses to each of the interview questions. 
We also explored students’ broader perspectives on the two fields and found that physics students assume that physics is needed to understand biology but not vice versa, which did not align with their perspectives on critical thinking between disciplines. We use this complexity to motivate future work to understand the impact of biology and physics instruction, as well as other STEM disciplines, on developing students’ critical thinking skills and perceptions.
Comparing large language models for supervised analysis of students’ lab notes
Physical Review Physics Education Research · 2025-03-31 · 6 citations
article · Open access · Senior author
Recent advancements in large language models (LLMs) hold significant promise for improving physics education research that uses machine learning. In this study, we compare the application of various models for conducting a large-scale analysis of written text grounded in a physics education research classification problem: identifying skills in students’ typed lab notes through sentence-level labeling. Specifically, we use training data to fine-tune two different LLMs, BERT and LLaMA, and compare the performance of these models to both a traditional bag-of-words approach and a few-shot LLM (without fine-tuning). We evaluate the models based on their resource use, performance metrics, and research outcomes when identifying skills in lab notes. We find that higher-resource models often, but not necessarily, perform better than lower-resource models. We also find that all models report similar trends in research outcomes, although the absolute values of the estimated measurements are not always within uncertainties of each other. We use the results to discuss relevant considerations for education researchers seeking to select a model type for use as a classifier.
Dynamics of productive confirmation framing in an introductory lab
Physical Review Physics Education Research · 2024-08-23 · 3 citations
article · Open access
In introductory physics laboratory instruction, students often expect to confirm or demonstrate textbook physics concepts. This expectation is largely undesirable: labs that emphasize confirmation of textbook physics concepts are generally unsuccessful at teaching those concepts, and even in contexts that do not emphasize confirmation, such expectations can lead to students disregarding or manipulating their data in order to obtain the expected result. In other words, when students expect their lab activities to confirm a known result, they may relinquish epistemic agency and violate disciplinary practices. We present a contrasting case where, we claim, confirmatory expectations can actually support productive disciplinary engagement. In this case study, we analyze the complex dynamics of students’ epistemological framing in a lab where students’ confirmatory expectations support and even generate epistemic agency and disciplinary practices, including developing original ideas, measures, and apparatuses to apply to the material world. Published by the American Physical Society 2024
Method to assess the trustworthiness of machine coding at scale
Physical Review Physics Education Research · 2024-03-06 · 3 citations
article · Open access · Senior author
Physics education researchers are interested in using the tools of machine learning and natural language processing to make quantitative claims from natural language and text data, such as open-ended responses to survey questions. The aspiration is that this form of machine coding may be more efficient and consistent than human coding, allowing much larger and broader datasets to be analyzed than is practical with human coders. Existing work that uses these tools, however, does not investigate norms that allow for trustworthy quantitative claims without full reliance on cross-checking with human coding, which defeats the purpose of using these automated tools. Here we propose a four-part method for making such claims with supervised natural language processing: evaluating a trained model, calculating statistical uncertainty, calculating systematic uncertainty from the trained algorithm, and calculating systematic uncertainty from novel data sources. We provide evidence for this method using data from two distinct short response survey questions with two distinct coding schemes. We also provide a real-world example of using these practices to machine code a dataset unseen by human coders. We offer recommendations to guide physics education researchers who may use machine-coding methods in the future.
Applying machine learning models in multi-institutional studies can generate bias
2024-09-12
article · Open access · Senior author
There is increasing interest in deploying machine learning models at scale for multi-institutional studies in physics education research. Here we investigate the efficacy of applying machine learning models to institutions outside of their training set, using natural language processing to code open-ended survey responses. We find that, in general, changing institutional contexts can affect machine learning estimates of code frequencies: either previously documented sources of uncertainty increase in magnitude, new unknown sources of uncertainty emerge, or both. We also find an example where uncertainties do not change between the institution used in the training data and an institution not in the training data. Results suggest that attention to uncertainty is critical, especially when making measurements of student writing across multi-institutional data sets.
What topics of peer interactions correlate with student performance in physics courses?
European Journal of Physics · 2024-03-19 · 2 citations
article · Open access · Senior author · Corresponding
Research suggests that interacting with more peers about physics course material is correlated with higher student performance. Some studies, however, have demonstrated that different topics of peer interactions may correlate with their performance in different ways, or possibly not at all. In this study, we probe both the peers with whom students interact about their physics course and the particular aspects of the course material about which they interacted in six different introductory physics courses: four lecture courses and two lab courses. Drawing on social network analysis methods, we replicate prior work demonstrating that, on average, students who interact with more peers in their physics courses have higher final course grades. Expanding on this result, we find that students discuss a wide range of aspects of course material with their peers: concepts, small-group work, assessments, lecture, and homework. We observe that in the lecture courses, interacting with peers about concepts is most strongly correlated with final course grade, with smaller correlations also arising for small-group work and homework. In the lab courses, on the other hand, small-group work is the only interaction topic that significantly correlates with final course grade. We use these findings to discuss how course structures (e.g. grading schemes and weekly course schedules) may shape student interactions and add nuance to prior work by identifying how specific types of student interactions are associated (or not) with performance.
Comparing large language models for supervised analysis of students' lab notes
arXiv (Cornell University) · 2024-12-13
preprint · Open access · Senior author
Recent advancements in large language models (LLMs) hold significant promise in improving physics education research that uses machine learning. In this study, we compare the application of various models to perform large-scale analysis of written text grounded in a physics education research classification problem: identifying skills in students' typed lab notes through sentence-level labeling. Specifically, we use training data to fine-tune two different LLMs, BERT and LLaMA, and compare the performance of these models to both a traditional bag of words approach and a few-shot LLM (without fine-tuning). We evaluate the models based on their resource use, performance metrics, and research outcomes when identifying skills in lab notes. We find that higher-resource models often, but not necessarily, perform better than lower-resource models. We also find that all models estimate similar trends in research outcomes, although the absolute values of the estimated measurements are not always within uncertainties of each other. We use the results to discuss relevant considerations for education researchers seeking to select a model type to use as a classifier.

Recent grants

Collaborative Research: Student Thinking About Measurements Across the Physics Curriculum
NSF · $160k · 2018–2023
Collaborative Research: Investigating How to Better Prepare Undergraduate Students for Physics Labs that Focus on Experimental Science
NSF · $292k · 2020–2023
Studying Equity in Undergraduate Physics Labs
NSF · $335k · 2019–2024

Frequent coauthors

Emily M. Smith
University of Nevada, Reno
23 shared
Emily M. Stump
Cornell University
23 shared
Meagan Sundstrom
Cornell University
21 shared
Cole Walsh
18 shared
Carl Wieman
Stanford University
17 shared
Gina Passante
California State University, Fullerton
14 shared
Ashley B. Heim
Cornell University
11 shared
Katherine N. Quinn
Princeton University
10 shared

Education

PhD, Physics and Astronomy
University of British Columbia
2014
M.Sc., Physics and Astronomy
University of British Columbia
2011
B.Sc.(Hons), Physics
University of Guelph
2009

Awards & honors

Endowed professorships (23 faculty)
NSF-funded postdocs to research education across disciplines
Provost’s seminar celebrates innovation in teaching
