Ernest Davis

Professor of Computer Science · Verified

New York University · Computer Science

Active 1948–2024

h-index: 31
Citations: 5.1k
Papers: 160 (32 in the last 5 years)
Funding: $329k

About

Ernest Davis is a professor in the Department of Computer Science at New York University, affiliated with the Courant Institute of Mathematical Sciences. His research focuses on automated commonsense reasoning, with particular interest in benchmarks and datasets for this area. He has contributed to surveys of the state of the art in commonsense reasoning, including the Winograd Schema Challenge and related topics. He teaches courses on artificial intelligence and maintains a website with resources and notes on AI topics. His publications include books, research papers, surveys, essays, and reports, and he has written for a general audience about artificial intelligence.

Research topics

  • Artificial Intelligence
  • Computer Science
  • Mathematics
  • Psychology
  • Cognitive psychology
  • Physics
  • Epistemology
  • Programming language
  • Philosophy
  • Linguistics
  • Mathematics education
  • Cognitive science
  • Mathematical economics
  • Theoretical computer science
  • Pure mathematics

Selected publications

  • The Defeat of the Winograd Schema Challenge (Abstract Reprint)

    Proceedings of the AAAI Conference on Artificial Intelligence · 2024-03-24

    article · Open access

    The Winograd Schema Challenge—a set of twin sentences involving pronoun reference disambiguation that seem to require the use of commonsense knowledge—was proposed by Hector Levesque in 2011. By 2019, a number of AI systems, based on large pre-trained transformer-based language models and fine-tuned on these kinds of problems, achieved better than 90% accuracy. In this paper, we review the history of the Winograd Schema Challenge and discuss the lasting contributions of the flurry of research that has taken place on the WSC in the last decade. We discuss the significance of various datasets developed for WSC, and the research community's deeper understanding of the role of surrogate tasks in assessing the intelligence of an AI system.

  • Mathematics, word problems, common sense, and artificial intelligence

    Bulletin of the American Mathematical Society · 2024 · 72 citations

    1st author · Corresponding
    • Computer Science
    • Artificial Intelligence

    The paper discusses the capacities and limitations of current artificial intelligence (AI) technology to solve word problems that combine elementary mathematics with commonsense reasoning. No existing AI systems can solve these reliably. We review three approaches that have been developed, using AI natural language technology: outputting the answer directly, outputting a computer program that solves the problem, and outputting a formalized representation that can be input to an automated theorem verifier. We review some benchmarks that have been developed to evaluate these systems and some experimental studies. We discuss the limitations of the existing technology at solving these kinds of problems. We argue that it is not clear whether these kinds of limitations will be important in developing AI technology for pure mathematical research, but that they will be important in applications of mathematics, and may well be important in developing programs capable of reading and understanding mathematical content written by humans.

  • The defeat of the Winograd Schema Challenge

    Artificial Intelligence · 2023-07-11 · 30 citations

    article

  • Benchmarks for Automated Commonsense Reasoning: A Survey

    arXiv (Cornell University) · 2023-02-09 · 6 citations

    preprint · Open access · 1st author · Corresponding

    More than one hundred benchmarks have been developed to test the commonsense knowledge and commonsense reasoning abilities of artificial intelligence (AI) systems. However, these benchmarks are often flawed and many aspects of common sense remain untested. Consequently, we do not currently have any reliable way of measuring to what extent existing AI systems have achieved these abilities. This paper surveys the development and uses of AI commonsense benchmarks. We discuss the nature of common sense; the role of common sense in AI; the goals served by constructing commonsense benchmarks; and desirable features of commonsense benchmarks. We analyze the common flaws in benchmarks, and we argue that it is worthwhile to invest the work needed to ensure that benchmark examples are consistently high quality. We survey the various methods of constructing commonsense benchmarks. We enumerate 139 commonsense benchmarks that have been developed: 102 text-based, 18 image-based, 12 video-based, and 7 simulated physical environments. We discuss the gaps in the existing benchmarks and aspects of commonsense reasoning that are not addressed in any existing benchmark. We conclude with a number of recommendations for future development of commonsense AI benchmarks.

  • Benchmarks for Automated Commonsense Reasoning: A Survey

    ACM Computing Surveys · 2023-09-11 · 55 citations

    review · 1st author · Corresponding

    More than one hundred benchmarks have been developed to test the commonsense knowledge and commonsense reasoning abilities of artificial intelligence (AI) systems. However, these benchmarks are often flawed, and many aspects of common sense remain untested. Consequently, there is currently no reliable way of measuring to what extent existing AI systems have achieved these abilities. This article surveys the development and uses of AI commonsense benchmarks. It enumerates 139 commonsense benchmarks that have been developed: 102 text-based, 18 image-based, 12 video-based, and 7 based in simulated physical environments. It gives more detailed descriptions of twelve of these, three from each category. It surveys the various methods used to construct commonsense benchmarks. It discusses the nature of common sense, the role of common sense in AI, the goals served by constructing commonsense benchmarks, desirable features of commonsense benchmarks, and flaws and gaps in existing benchmarks. It concludes with a number of recommendations for future development of commonsense AI benchmarks; most importantly, that the creators of benchmarks invest the work needed to ensure that benchmark examples are consistently high quality.

  • Testing GPT-4 with Wolfram Alpha and Code Interpreter plug-ins on math and science problems

    arXiv (Cornell University) · 2023-08-10 · 9 citations

    preprint · Open access · 1st author · Corresponding

    This report describes a test of the large language model GPT-4 with the Wolfram Alpha and the Code Interpreter plug-ins on 105 original problems in science and math, at the high school and college levels, carried out in June-August 2023. Our tests suggest that the plug-ins significantly enhance GPT's ability to solve these problems. Having said that, there are still often "interface" failures; that is, GPT often has trouble formulating problems in a way that elicits useful answers from the plug-ins. Fixing these interface failures seems like a central challenge in making GPT a reliable tool for college-level calculation problems.

  • Mathematics, word problems, common sense, and artificial intelligence

    arXiv (Cornell University) · 2023-01-23 · 8 citations

    preprint · Open access · 1st author · Corresponding

    The paper discusses the capacities and limitations of current artificial intelligence (AI) technology to solve word problems that combine elementary knowledge with commonsense reasoning. No existing AI systems can solve these reliably. We review three approaches that have been developed, using AI natural language technology: outputting the answer directly, outputting a computer program that solves the problem, and outputting a formalized representation that can be input to an automated theorem verifier. We review some benchmarks that have been developed to evaluate these systems and some experimental studies. We discuss the limitations of the existing technology at solving these kinds of problems. We argue that it is not clear whether these kinds of limitations will be important in developing AI technology for pure mathematical research, but that they will be important in applications of mathematics, and may well be important in developing programs capable of reading and understanding mathematical content written by humans.

  • Physical Reasoning in an Open World

    arXiv (Cornell University) · 2022-01-22

    preprint · Open access · Senior author

    Most work on physical reasoning, both in artificial intelligence and in cognitive science, has focused on closed-world reasoning, in which it is assumed that the problem specification specifies all relevant objects and substance, all their relations in an initial situation, and all exogenous events. However, in many situations, it is important to do open-world reasoning; that is, making valid conclusions from very incomplete information. We have implemented in Prolog an open-world reasoner for a toy microworld of containers that can be loaded, unloaded, sealed, unsealed, carried, and dumped.

  • The Defeat of the Winograd Schema Challenge

    arXiv (Cornell University) · 2022-01-07 · 4 citations

    preprint · Open access

    The Winograd Schema Challenge - a set of twin sentences involving pronoun reference disambiguation that seem to require the use of commonsense knowledge - was proposed by Hector Levesque in 2011. By 2019, a number of AI systems, based on large pre-trained transformer-based language models and fine-tuned on these kinds of problems, achieved better than 90% accuracy. In this paper, we review the history of the Winograd Schema Challenge and discuss the lasting contributions of the flurry of research that has taken place on the WSC in the last decade. We discuss the significance of various datasets developed for WSC, and the research community's deeper understanding of the role of surrogate tasks in assessing the intelligence of an AI system.

  • Limits of an AI program for solving college math problems

    arXiv (Cornell University) · 2022 · 41 citations

    1st author · Corresponding
    • Computer Science
    • Artificial Intelligence

    Drori et al. (2022) report that "A neural network solves, explains, and generates university math problems by program synthesis and few-shot learning at human level ... [It] automatically answers 81% of university-level mathematics problems." The system they describe is indeed impressive; however, the above description is very much overstated. The work of solving the problems is done, not by a neural network, but by the symbolic algebra package Sympy. Problems of various formats are excluded from consideration. The so-called "explanations" are just rewordings of lines of code. Answers are marked as correct that are not in the form specified in the problem. Most seriously, it seems that in many cases the system uses the correct answer given in the test corpus to guide its path to solving the problem.

Frequent coauthors

  • Gary Marcus

    35 shared
  • Leora Morgenstern

    Palo Alto Research Center

    18 shared
  • Yanjun Ma

    China Southern Power Grid (China)

    16 shared
  • Kenneth Church

    16 shared
  • Valia Kordoni

    Humboldt-Universität zu Berlin

    16 shared
  • Zeyu Chen

    Soochow University

    16 shared
  • Thomas Lukasiewicz

    9 shared
  • Vid Kocijan

    7 shared