Iftekhar Ahmed
· Assistant Professor of Informatics; Computer Science · University of California, Irvine · Political Science
Active 1998–2026
About
Professor Iftekhar Ahmed is an Associate Professor at UC Irvine's Donald Bren School of Information & Computer Sciences. His research focuses on software engineering, particularly on how to combine software testing, analysis, and data mining to ensure software safety and high quality. He emphasizes the importance of identifying problems before software hits the market by studying factors that lead to bad code, including technical issues like bloated program constructs and socio-technical factors such as merge conflicts. Using this knowledge, he develops prediction models to reveal bug-prone areas in new software projects. Professor Ahmed is also working on providing developers with efficient tools and techniques for testing increasingly complex systems, including non-deterministic and machine learning systems. His work on applying these techniques to the Linux kernel has helped identify critical bugs and address testing gaps. He stresses the importance of confidence in software systems, especially in safety-critical applications like autonomous vehicles. His approach involves techniques that focus on problem areas, such as prioritizing images that were not correctly classified during testing, to reduce errors and improve system safety and accuracy.
Research topics
- Computer Science
- Internet privacy
- Economic growth
- Data science
- Psychology
- Medicine
- Software engineering
- Applied psychology
- Operating system
- Human–computer interaction
- World Wide Web
Selected publications
Evolving with AI: A Longitudinal Analysis of Developer Logs
arXiv (Cornell University) · 2026-01-15
preprint · Open access · Senior author
AI-powered coding assistants are rapidly becoming fixtures in professional IDEs, yet their sustained influence on everyday development remains poorly understood. Prior research has focused on short-term use or self-reported perceptions, leaving open questions about how sustained AI use reshapes actual daily coding practices in the long term. We address this gap with a mixed-method study of AI adoption in IDEs, combining longitudinal two-year fine-grained telemetry from 800 developers with a survey of 62 professionals. We analyze five dimensions of workflow change: productivity, code quality, code editing, code reuse, and context switching. Telemetry reveals that AI users produce substantially more code but also delete significantly more. Meanwhile, survey respondents report productivity gains and perceive minimal changes in other dimensions. Our results offer empirical insights into the silent restructuring of software workflows and provide implications for designing future AI-augmented tooling.
Automated Repair of Alloy Specifications in the Era of Large Language Models
IEEE Transactions on Software Engineering · 2026-01-01
article
Test smell: A parasitic energy consumer in software testing
Information and Software Technology · 2025-02-03 · 2 citations
article · Open access · Senior author
Traditionally, energy efficiency research has focused on reducing energy consumption at the hardware level and, more recently, in the design and coding phases of the software development life cycle. However, software testing’s impact on energy consumption has not received attention from the research community. Specifically, how test code design quality and test smell (e.g., sub-optimal design and bad practices in test code) impact energy consumption has not been investigated yet. This study aims to examine open-source software projects to analyze the association between test smell and its effects on energy consumption in software testing. We conducted a mixed-method empirical analysis from two perspectives: software (data mining in 12 Apache projects) and developers’ views (a survey of 62 software practitioners). Our findings show that: (1) test smell is associated with energy consumption in software testing. Specifically, the smelly part of a test case consumes more energy compared to the non-smelly part. (2) certain test smells are more energy-hungry than others, (3) refactored test cases tend to consume less energy than their smelly counterparts, and (4) most developers (45% of the survey respondents) lack knowledge about test smells’ impact on energy consumption. Based on the results, we emphasize raising developers’ awareness regarding the impact of test smells on energy consumption. Additionally, we present several observations that can direct future research and developments.
What Makes a Great Software Quality Assurance Engineer?
IEEE Transactions on Software Engineering · 2025-02-17 · 1 citation
article
Software Quality Assurance (SQA) Engineers play a critical role in evaluating products throughout the software development lifecycle to ensure that the outcomes of each phase and the final product possess the desired quality standards. In general, a great SQA engineer requires a different set of abilities from development engineers to effectively oversee the entire product development process. While recent empirical studies have explored the attributes of software engineers and managers, the quality assurance role is overlooked. As software quality gains increasing priority in the development cycles, both employers seeking skilled professionals and new graduates aspiring to excel in Software Quality Assurance (SQA) roles face a critical question: What makes a great SQA Engineer? To address this gap, we conducted 25 semi-structured interviews and surveyed 363 SQA engineers from diverse companies worldwide. We use the data collected from these activities to derive a comprehensive set of attributes for great SQA Engineers, categorized into five key areas: personal, social, technical, management, and decision-making attributes. Among these, curiosity, effective communication, and critical thinking emerged as defining characteristics of great SQA engineers. These findings offer valuable insights for future research with SQA practitioners, contextual considerations, and practical implications for research and practice.
2025-04-24 · 2 citations
article · Open access
Evaluating LLMs Effectiveness in Detecting and Correcting Test Smells: An Empirical Study
ArXiv.org · 2025-06-09
preprint · Open access
Test smells indicate poor development practices in test code, reducing maintainability and reliability. While developers often struggle to prevent or refactor these issues, existing tools focus primarily on detection rather than automated refactoring. Large Language Models (LLMs) have shown strong potential in code understanding and transformation, but their ability to both identify and refactor test smells remains underexplored. We evaluated GPT-4-Turbo, LLaMA 3 70B, and Gemini-1.5 Pro on Python and Java test suites, using PyNose and TsDetect for initial smell detection, followed by LLM-driven refactoring. Gemini achieved the highest detection accuracy (74.35% Python, 80.32% Java), while LLaMA was lowest. All models could refactor smells, but effectiveness varied, sometimes introducing new smells. Gemini also improved test coverage, unlike GPT-4 and LLaMA, which often reduced it. These results highlight LLMs' potential for automated test smell refactoring, with Gemini as the strongest performer, though challenges remain across languages and smell types.
ArXiv.org · 2025-01-16
preprint · Open access · Senior author
Commit messages are crucial in software development, supporting maintenance tasks and communication among developers. While Large Language Models (LLMs) have advanced Commit Message Generation (CMG) using various software contexts, some contexts developers consider are often missed by CMG techniques and can't be easily retrieved or even retrieved at all by automated tools. To address this, we propose Commit Message Optimization (CMO), which enhances human-written messages by leveraging LLMs and search-based optimization. CMO starts with human-written messages and iteratively improves them by integrating key contexts and feedback from external evaluators. Our extensive evaluation shows CMO generates commit messages that are significantly more Rational, Comprehensive, and Expressive while outperforming state-of-the-art CMG methods and human messages 88.2%-95.4% of the time.
Context Conquers Parameters: Outperforming Proprietary LLM in Commit Message Generation
2025-04-26 · 1 citation
article
Commit messages provide descriptions of the modifications made in a commit using natural language, making them crucial for software maintenance and evolution. Recent developments in Large Language Models (LLMs) have led to their use in generating high-quality commit messages, such as the Omniscient Message Generator (OMG). This method employs GPT-4 to produce state-of-the-art commit messages. However, the use of proprietary LLMs like GPT-4 in coding tasks raises privacy and sustainability concerns, which may hinder their industrial adoption. Considering that open-source LLMs have achieved competitive performance in developer tasks such as compiler validation, this study investigates whether they can be used to generate commit messages that are comparable with OMG. Our experiments show that an open-source LLM can generate commit messages comparable to those produced by OMG. In addition, through a series of contextual refinements, we propose OMEGA, a commit message generation approach that uses a 4-bit quantized 8B open-source LLM. OMEGA produces state-of-the-art commit messages, surpassing the performance of GPT-4 in practitioners' preference.
ArXiv.org · 2025-06-05
preprint · Open access · Senior author
A growing variety of prompt engineering techniques has been proposed for Large Language Models (LLMs), yet systematic evaluation of each technique on individual software engineering (SE) tasks remains underexplored. In this study, we present a systematic evaluation of 14 established prompt techniques across 10 SE tasks using four LLM models. As identified in the prior literature, the selected prompting techniques span six core dimensions (Zero-Shot, Few-Shot, Thought Generation, Ensembling, Self-Criticism, and Decomposition). They are evaluated on tasks such as code generation, bug fixing, and code-oriented question answering, to name a few. Our results show which prompting techniques are most effective for SE tasks requiring complex logic and intensive reasoning versus those that rely more on contextual understanding and example-driven scenarios. We also analyze correlations between the linguistic characteristics of prompts and the factors that contribute to the effectiveness of prompting techniques in enhancing performance on SE tasks. Additionally, we report the time and token consumption for each prompting technique when applied to a specific task and model, offering guidance for practitioners in selecting the optimal prompting technique for their use cases.
Recent grants
VOSS: Research on the Process of Virtual Research Environment
NSF · $249k · 2012–2017
Frequent coauthors
- 25 shared · Eduardo Santana de Almeida
- 19 shared · Carlos Jensen · DNV (United Kingdom)
- 17 shared · Marshall Scott Poole
- 13 shared · Alex Groce · Northern Arizona University
- 10 shared · Rahul Gopinath · University of Sydney
- 9 shared · Mohammad Amin Alipour · University of Houston
- 8 shared · Umme Ayda Mannan · Oregon State University
- 8 shared · Jiawei Li