
Cayelan Carey
Professor · Virginia Tech · Biology
Active 2002–2026
About
Cayelan Carey is a professor in the Department of Biological Sciences at Virginia Tech and serves as Co-Director of the Center for Ecosystem Forecasting. Her work lies at the intersection of freshwater ecosystem science and data science, with a focus on nutrient and carbon cycling in freshwater ecosystems and the feedbacks between aquatic biogeochemical cycles and plankton food webs. She is particularly interested in exploring anthropogenic effects on inland waters and how local communities value and manage water resources, which has implications for water quality. At Virginia Tech, her research has expanded into near-term ecological forecasting, where she collaborates with computer scientists, environmental engineers, decision scientists, and water utilities to integrate high-frequency sensor data and ecosystem models. Her lab group generates daily water quality forecasts to predict freshwater ecosystem services for lakes and reservoirs across the U.S., studying how managers use these forecasts to control hypoxia and algal blooms, thereby influencing biogeochemical cycling and greenhouse gas dynamics. She is also dedicated to advancing undergraduate training in environmental data science through the NSF-supported Macrosystems EDDIE program, which develops teaching modules that incorporate high-frequency sensor datasets into ecology curricula to enhance students' ecological understanding and quantitative skills.
Research topics
- Environmental science
- Ecology
- Geology
- Oceanography
- Biology
- Computer Science
- Atmospheric sciences
- Meteorology
- Botany
- Physics
- Environmental resource management
- Geography
- Engineering
- Geotechnical engineering
- Operations research
Selected publications
Developing scenario-based, near-term iterative forecasts to inform water management
2026-02-23
article · Open access · Senior author
Near-term, iterative ecosystem forecasts with scenarios representing alternate management decisions have high potential for providing valuable insights on how decisions may impact future ecological conditions. Scenario-based forecasts may be particularly useful for informing the decision-making around freshwater ecosystems managed for multiple objectives. We applied a near-term, iterative forecasting system at Lake Alexandrina, Australia, which is located at the nexus of one of the world’s largest catchments and a protected coastal wetland. Lake Alexandrina is highly managed through multi-gate barrages to optimize catchment, lake, and outflow targets, providing an unparalleled opportunity to investigate the value of scenario-based probabilistic forecasting for decision-makers. We co-developed the forecasting system with managers to generate automated 1-30 day-ahead barrage operation scenario forecasts. Over ~2 years, forecasts accurately predicted lake water level, water temperature, and salinity and outperformed null baseline models up to 10-30 days ahead. Alternative barrage scenarios were particularly important during drier hydrological conditions and resulted in substantive changes to lake level and salt export from the lake. Lake temperatures and salinity were less sensitive to barrage operations and generally more dependent on variability in lake inflow discharge. Overall, our study provides a framework for anticipatory decision-making to meet lake water quantity and quality targets.
2026-05-08
article
Phytoplankton blooms pose escalating risks to drinking-water security, yet translating high-frequency monitoring data into timely, reliable, and interpretable forecasts remains challenging. Here, we present a robust phytoplankton forecasting framework that specifies uncertainty in one to seven day-ahead predictions. The workflow systematically integrates an eXtreme Gradient Boosting (XGBoost) machine learning model with Optuna-driven automated hyperparameter optimization and bootstrap ensemble uncertainty quantification. By incorporating global and local horizon-specific explainability through SHapley Additive exPlanations (SHAP), the framework translates black-box machine learning predictions into mechanistically grounded, management-oriented insights. Evaluated on out-of-sample, high-frequency in situ chlorophyll-a observations spanning multiple years, our forecasting model with 10 ensemble members achieved the best probabilistic skill across all model configurations, with continuous ranked probability scores (CRPS) ranging from 1.2 µg/L at one day-ahead to 2.7 µg/L at seven days-ahead. SHAP attribution revealed that initial chlorophyll-a dominated one day-ahead predictions (70% feature importance), with its influence declining to 50% by seven days-ahead as the role of thermal stratification and organic matter progressively gained predictive prominence. Our forecasting framework is transferable to other monitored freshwater systems and supports proactive, risk-informed water quality management.
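For readers unfamiliar with the CRPS metric reported in this abstract, the following is a minimal sketch of the standard empirical estimator for a single ensemble forecast. It is not the authors' code: the function name `crps_ensemble` and the example values are hypothetical, and the paper's exact scoring procedure may differ.

```python
from itertools import product


def crps_ensemble(members, observation):
    """Empirical CRPS for one forecast: E|X - y| - 0.5 * E|X - X'|.

    `members` holds the ensemble predictions and `observation` the value
    that was later observed, both in the same units (e.g. ug/L).
    Lower scores are better; a score of 0 is a perfect, certain forecast.
    """
    n = len(members)
    if n == 0:
        raise ValueError("need at least one ensemble member")
    # Mean absolute error of the members against the observation.
    accuracy = sum(abs(x - observation) for x in members) / n
    # Mean absolute difference over all ordered pairs of members (spread).
    spread = sum(abs(a - b) for a, b in product(members, repeat=2)) / (n * n)
    return accuracy - 0.5 * spread


# Hypothetical two-member ensemble scored against a hypothetical observation.
score = crps_ensemble([1.0, 3.0], 2.0)
```

With a single ensemble member the spread term vanishes and the score reduces to the absolute error, which is why CRPS is often described as a probabilistic generalization of mean absolute error.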
Developing scenario-based, near-term iterative forecasts to inform water management
2025-06-27
preprint · Open access · Senior author
Near-term, iterative ecosystem forecasts with scenarios representing alternate management decisions have high potential for providing valuable insights on how different decisions may impact future ecological conditions. For freshwater ecosystems, which are often managed for multiple objectives, scenario-based forecasting has yet to be commonly applied for water quality management, despite a rich history in water quantity forecasting (e.g., floods, droughts). We applied a near-term, iterative water quality forecasting system at Lake Alexandrina, Australia, which is located at the nexus of one of the world’s largest catchments and a protected coastal wetland. Lake Alexandrina is highly managed through a series of multi-gate barrages to optimize multiple catchment, lake, and outflow targets, providing an unparalleled opportunity to investigate the value of scenario-based forecasting for decision makers. We co-developed the forecasting system with the lake managers to generate automated 1-30 day-ahead barrage operation scenario forecasts to inform management decisions. Over ~2 years, forecasts were able to accurately predict lake water level, water temperature, and salinity and outperformed null baseline models up to 10-30 days ahead. Alternative barrage management scenarios were particularly important during drier hydrological conditions and resulted in substantive changes to lake level and export of salt from the lake. In-situ lake temperatures and salinity were less sensitive to barrage operations and generally more dependent on variability in lake inflow discharge. Overall, our study helps provide a framework for anticipatory decision-making to meet water quantity and quality targets in a highly-managed system increasingly experiencing climate stressors.
Warming air temperatures alter the timing and magnitude of reservoir zooplankton biomass
Ecological Modelling · 2025-07-31
article · Senior author
2025-10-31
article · Open access · Senior author
Dissolved organic matter (DOM) plays an important role in aquatic carbon cycling and is a valuable metric of ecosystem functioning and water quality in freshwater ecosystems. Despite its importance for biogeochemical cycling and water quality, no near-term iterative forecasts have previously been developed for freshwater DOM concentrations. To advance both our understanding of freshwater DOM dynamics and management, we developed 1-34 day-ahead forecasts of fluorescent DOM (fDOM) in three drinking water reservoirs. These temperate reservoirs are co-located in Virginia, USA and experience variable DOM dynamics (range: 5-27 quinine sulfate units, QSU). We developed six different forecasting models to predict fDOM in each reservoir. Three models were time series models based on forecasted drivers (water temperature and meteorology) that were updated daily from high-frequency fDOM sensors. The other forecast models included a neural network machine learning model and two baseline reference models (climatology and persistence). Altogether, our forecasts were able to capture observed dynamics over a year in all three reservoirs, with one time series model outperforming the baseline models across the full 34-day forecast horizon. Aggregated across reservoirs and models over a year, forecast RMSE increased from 0.7 to 4.1 QSU over the 1-34 day-ahead forecast horizon. Forecast skill varied substantially across seasons, with greatest accuracy in the spring and winter compared to the summer and fall across reservoirs. These forecasts can help improve our understanding of the predictability of DOM and inform management in freshwater ecosystems as carbon dynamics become more variable due to global change.
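The two baseline reference models named in this abstract, persistence and climatology, can be illustrated with a short sketch, along with the RMSE used to compare forecasts against observations. This shows one common way to define these baselines and is not the authors' implementation: the function names, the day-of-year lookup structure, and the example values are all hypothetical.

```python
import math


def persistence_forecast(last_observation, horizon_days):
    """Persistence baseline: repeat the latest observation at every lead time."""
    return [last_observation] * horizon_days


def climatology_forecast(history_by_day_of_year, start_day_of_year, horizon_days):
    """Climatology baseline: mean of past observations for each target day.

    `history_by_day_of_year` maps a day of year (1-365) to a list of values
    observed on that day in previous years (an assumed data layout).
    """
    forecast = []
    for lead in range(1, horizon_days + 1):
        # Wrap around the end of the year so day 365 is followed by day 1.
        day = (start_day_of_year + lead - 1) % 365 + 1
        past = history_by_day_of_year[day]
        forecast.append(sum(past) / len(past))
    return forecast


def rmse(predicted, observed):
    """Root-mean-square error between paired predictions and observations."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical fDOM values in QSU, for illustration only.
observed = [10.0, 11.0, 12.0]
baseline = persistence_forecast(10.0, horizon_days=3)
baseline_error = rmse(baseline, observed)
```

A candidate model is usually said to have skill at a given lead time when its error is lower than that of both baselines at that lead time.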
LakeBeD-US: a benchmark dataset for lake water quality time series and vertical profiles
Earth System Science Data · 2025-07-02 · 1 citation
article · Open access · Corresponding
Water quality in lakes is an emergent property of complex biotic and abiotic processes that differ across spatial and temporal scales. Water quality is also a determinant of ecosystem services that lakes provide and is thus of great interest to ecologists. Machine learning and other computer science techniques are increasingly being used to predict water quality dynamics as well as to gain a greater understanding of water quality patterns and controls. To benefit the sciences of both ecology and computer science, we have created a benchmark dataset of lake water quality time series and vertical profiles. LakeBeD-US contains over 500 million unique observations of lake water quality collected by multiple long-term monitoring programs across 17 water quality variables from 21 lakes in the United States. There are two published versions of LakeBeD-US: the “Ecology Edition” published in the Environmental Data Initiative repository (https://doi.org/10.6073/pasta/c56a204a65483790f6277de4896d7140, McAfee et al., 2024) and the “Computer Science Edition” published in the Hugging Face repository (https://doi.org/10.57967/hf/3771, Pradhan et al., 2024). Each edition is formatted in a manner conducive to inquiries and analyses specific to each domain. For ecologists, LakeBeD-US: Ecology Edition provides an opportunity to study the spatial and temporal dynamics of several lakes with varying water quality, ecosystem, and landscape characteristics. For computer scientists, LakeBeD-US: Computer Science Edition acts as a benchmark dataset that enables the advancement of machine learning for water quality prediction.
Water Research · 2025-10-03 · 1 citation
article
2025-10-06
article · Open access · Senior author
Ecosystem states are often influenced by both concurrent and antecedent environmental drivers. However, the relative importance of antecedent conditions varies within and among ecosystems. Here, we analyzed long-term depth-profile data from 38
2025-10-13
peer-review · Senior author
Recent grants
NSF · $846k · 2018–2024
NSF · $450k · 2024–2028
NSF · $720k · 2020–2024
NSF · $21k · 2016–2018
NSF · $1000k · 2018–2021
Frequent coauthors
- Caryn C. Vaughn, Oklahoma Biological Survey (100 shared)
- Peter B. McIntyre (100 shared)
- Carla L. Atkinson, University of Alabama (100 shared)
- Amanda T. Rugenski (100 shared)
- Colden V. Baxter, Idaho State University (100 shared)
- Weston H. Nowlin (100 shared)
- Kate S. Boersma (100 shared)
- Krista A. Capps, Savannah River National Laboratory (100 shared)
Education
Ph.D., Ecology & Evolutionary Biology
Cornell University
Awards & honors
- 2024 Robert and Maude Gledden Visiting Fellowship, Universit…
- 2024 Outstanding Faculty Service Award, Department of Biolog…
- 2023 SORTEE Open Science Researcher Award Finalist
- 2022 Earth Leadership Fellowship, Future Earth and Stanford…
- 2022 Fulbright Future Fellowship, Australian-U.S. Fulbright…