
Caleb Belth
Assistant Professor · Verified · University of Utah · Linguistics
Active 2017–2024
Research topics
- Natural Language Processing
- Information Retrieval
- Artificial Intelligence
- Machine Learning
- Computer Science
- Theoretical computer science
Selected publications
A Learning-Based Account of Phonological Tiers
Linguistic Inquiry · 2024-03-07 · 1 citation
article · 1st author · Corresponding
I propose a learning-based account of phonological-tier-like representations. I argue that humans show a proclivity for tracking adjacent dependencies, and propose a learning algorithm that incorporates this by tracking only adjacent dependencies. The model changes representations in response to being unable to predict the surface form of alternating segments, a decision governed by the Tolerance Principle, which allows for learning despite the sparsity and exceptions inevitable in naturalistic data. Tier-like representations emerge from the algorithm, which achieves high-accuracy learning over natural language data, handles crosslinguistic complexities like neutral segments and blockers, and makes correct experimental predictions about human behavior.
A learning-based account of local phonological processes
Phonology · 2023-02-01 · 1 citation
article · Open access · 1st author · Corresponding
Phonological processes tend to involve local dependencies, an observation that has been expressed explicitly or implicitly in many phonological theories, such as the use of minimal symbols in SPE and the inclusion of primarily strictly local constraints in Optimality Theory. I propose a learning-based account of local phonological processes, providing an explicit computational model. The model is grounded in experimental results that suggest children are initially insensitive to long-distance dependencies and that as their ability to track non-adjacent dependencies grows, learners still prefer local generalisations to non-local ones. The model encodes these results by constructing phonological processes starting around an alternating segment and expanding outward to incorporate more phonological context only when surface forms cannot be predicted with sufficient accuracy. The model successfully constructs local phonological generalisations and exhibits the same preference for local patterns that humans do, suggesting that locality can emerge as a computational consequence of a simple learning procedure.
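The "expand outward only when needed" idea in the abstract above can be illustrated with a toy sketch. This is not the published model: the function names, the majority-vote predictor, the `min_accuracy` cutoff, and the toy data are all assumptions made for illustration. The sketch widens a symmetric window around an alternating segment one step at a time and stops at the smallest window whose contexts predict the surface form accurately enough.

```python
from collections import Counter, defaultdict


def context(word, i, k):
    """Return the k segments on each side of position i, with '_' marking the target."""
    left = word[max(0, i - k):i]
    right = word[i + 1:i + 1 + k]
    return left + "_" + right


def smallest_sufficient_radius(examples, min_accuracy=0.9, max_radius=3):
    """Find the smallest window radius whose contexts predict the surface form.

    examples: list of (word, index_of_alternating_segment, surface_value).
    min_accuracy and max_radius are hypothetical parameters, not values from the paper.
    """
    if not examples:
        raise ValueError("need at least one example")
    for k in range(max_radius + 1):
        # Count surface values seen in each context of radius k.
        by_context = defaultdict(Counter)
        for word, i, surface in examples:
            by_context[context(word, i, k)][surface] += 1
        # Predict the majority surface value for each context.
        correct = sum(c.most_common(1)[0][1] for c in by_context.values())
        if correct / len(examples) >= min_accuracy:
            return k
    return None  # no window up to max_radius was accurate enough


# Toy data: an abstract final segment "Z" surfaces as "s" or "z".
toy = [("katZ", 3, "s"), ("dogZ", 3, "z"), ("kapZ", 3, "s"), ("dabZ", 3, "z")]
radius = smallest_sufficient_radius(toy)
```

On this toy data, a window of radius 0 cannot separate the two surface forms, while radius 1 (the immediately preceding segment) can, so the search stops there without ever considering wider context.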
Towards an Algorithmic Account of Phonological Rules and Representations
Deep Blue (University of Michigan) · 2023-01-01 · 1 citation
article · Open access · 1st author · Corresponding
The development of computer science in the middle of the twentieth century provided a valuable tool for the study of language as a cognitive system, by allowing linguistic theories to be stated in computational terms. The resulting theories have traditionally placed emphasis on describing the space of possible human languages, and viewed this delineated space as antecedent to a theory of how such a language might be learned from linguistic data. In the domain of phonology, the study of the structure of linguistic sound, this dissertation takes steps approaching the problem from the opposite direction, by framing the problem as that of identifying the learning procedure(s) by which humans construct a language in response to linguistic exposure. The object of study is shifted from the investigation of how a learner will discover a supposed target grammar, to the investigation of the ontogenetic process by which humans develop computational, phonological systems. The proposed algorithmic approach identifies independently-established psychological mechanisms available to a learner, and then uses these as the components of a hypothesized learning procedure. The dissertation includes an algorithmic account of how graph-based representations of words, which render long-distance dependencies as local in that graph structure and are known as phonological tiers, arise naturally from a learning algorithm sensitive to only adjacent dependencies. The dissertation also proposes an algorithmic account of when abstract representations of morphemes are needed for effective generalization to unseen words in the face of the sparsity of linguistic input, and how rules can be constructed to map between these abstract representations and their concrete realizations. Stated in explicit, computational terms, the proposed learning system is evaluated on realistic natural language data, and makes precise, testable predictions.
The learner constructs accurate linguistic generalizations from naturalistic data: across languages evaluated, the learner achieves, on average, 0.96 accuracy on held-out test words, and never lower than 0.92. These results are achieved with training data of no more than a thousand words. Moreover, the models' predictions are consistently borne out in developmental predictions and experimental settings, including a novel experiment carried out to directly test this model. When compared to a prominent alternative learning-based model—neural networks—the proposed model achieves higher accuracy, while producing comparatively interpretable outputs, and—critically—providing an intelligible algorithm, which brings greater understanding to the mechanisms underlying phonological development.
A hidden challenge of link prediction: which pairs to check?
Knowledge and Information Systems · 2022-02-18 · 1 citation
article · 1st author · Corresponding
The Greedy and Recursive Search for Morphological Productivity
arXiv (Cornell University) · 2021-05-12 · 38 citations
preprint · Open access · 1st author · Corresponding
As children acquire the knowledge of their language's morphology, they invariably discover the productive processes that can generalize to new words. Morphological learning is made challenging by the fact that even fully productive rules have exceptions, as in the well-known case of English past tense verbs, which features the -ed rule against the irregular verbs. The Tolerance Principle is a recent proposal that provides a precise threshold of exceptions that a productive rule can withstand. Its empirical application so far, however, requires the researcher to fully specify rules defined over a set of words. We propose a greedy search model that automatically hypothesizes rules and evaluates their productivity over a vocabulary. When the search for broader productivity fails, the model recursively subdivides the vocabulary and continues the search for productivity over narrower rules. Trained on psychologically realistic data from child-directed input, our model displays developmental patterns observed in child morphology acquisition, including the notoriously complex case of German noun pluralization. It also produces responses to nonce words that, despite receiving only a fraction of the training data, are more similar to those of human subjects than current neural network models' responses are.
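The threshold mentioned in the abstract above is usually stated as follows: a rule applying to N items is productive if its number of exceptions does not exceed N / ln N. A minimal sketch of that check is given here; the function names are illustrative only, and the sketch covers just the threshold test, not the greedy or recursive search the paper proposes.

```python
import math


def tolerance_threshold(n: int) -> float:
    """Maximum number of exceptions a rule over n items can tolerate: n / ln(n)."""
    if n < 2:
        raise ValueError("n must be at least 2 so that ln(n) > 0")
    return n / math.log(n)


def is_productive(n: int, exceptions: int) -> bool:
    """True if a rule over n items with this many exceptions stays under the threshold."""
    return exceptions <= tolerance_threshold(n)


# Hypothetical vocabulary of 100 verbs: the threshold is about 21.7 exceptions.
print(round(tolerance_threshold(100), 1))
print(is_productive(100, 21), is_productive(100, 22))
```

Because N / ln N grows more slowly than N, the tolerated proportion of exceptions shrinks as the vocabulary grows, which is why smaller sub-vocabularies can support rules that fail over the full set.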
What is Normal, What is Strange, and What is Missing in a Knowledge Graph
Figshare · 2020-04-01 · 2 citations
article · 1st author · Corresponding
Knowledge graphs (KGs) store highly heterogeneous information about the world in the structure of a graph, and are useful for tasks such as question answering and reasoning. However, they often contain errors and are missing information. Vibrant research in KG refinement has worked to resolve these issues, tailoring techniques to either detect specific types of errors or complete a KG. In this work, we introduce a unified solution to KG characterization by formulating the problem as unsupervised KG summarization with a set of inductive, soft rules, which describe what is normal in a KG, and thus can be used to identify what is abnormal, whether it be strange or missing. Unlike first-order logic rules, our rules are labeled, rooted graphs, i.e., patterns that describe the expected neighborhood around a (seen or unseen) node, based on its type and information in the KG. Stepping away from the traditional support/confidence-based rule mining techniques, we propose KGist, Knowledge Graph Inductive SummarizaTion, which learns a summary of inductive rules that best compress the KG according to the Minimum Description Length principle, a formulation that we are the first to use in the context of KG rule mining. We apply our rules to three large KGs (NELL, DBpedia, and Yago), and tasks such as compression, various types of error detection, and identification of incomplete information. We show that KGist outperforms task-specific, supervised and unsupervised baselines in error detection and incompleteness identification (identifying the location of up to 93% of missing entities, over 10% more than baselines), while also being efficient for large knowledge graphs.
2020 · 38 citations
1st author · Corresponding
- Computer Science
- Information Retrieval
Author: Belth, Caleb et al.; Genre: Conference Paper; Published online: 2020; Title: What is Normal, What is Strange, and What is Missing in a Knowledge Graph: Unified Characterization via Inductive Summarization
2020-11-01
article · Open access
Personalized Knowledge Graph Summarization: From the Cloud to Your Pocket
2019-11-01 · 47 citations
article · Open access
The increasing scale of encyclopedic knowledge graphs (KGs) calls for summarization as a way to help users efficiently access and distill world knowledge. Motivated by the disparity between individuals' limited information needs and the massive scale of KGs, in this paper we propose a new problem called personalized knowledge graph summarization. The goal is to construct compact "personal summaries" of KGs containing only the facts most relevant to individuals' interests. Such summaries can be stored and utilized on-device, allowing individuals private, anytime access to the information that interests them most. We formalize the problem as one of constructing a sparse graph, or summary, that maximizes a user's inferred "utility" over a given KG, subject to a user- and device-specific constraint on the summary's size. To solve it, we propose GLIMPSE, a summarization framework that provides theoretical guarantees on the summary's utility and is linear in the number of edges in the KG. In an evaluation with real user queries to open-source, encyclopedic KGs of up to one billion triples, we show that GLIMPSE efficiently creates summaries that outperform strong baselines by up to 19% in query answering F1 score.
When to remember where you came from
2019-08-27 · 6 citations
article · Open access · 1st author · Corresponding
For trajectory data that tend to have beyond first-order (i.e., non-Markovian) dependencies, higher-order networks have been shown to accurately capture details lost with the standard aggregate network representation. At the same time, representation learning has shown success on a wide range of network tasks, removing the need to hand-craft features for these tasks. In this work, we propose a node representation learning framework called EVO or Embedding Variable Orders, which captures non-Markovian dependencies by combining work on higher-order networks with work on node embeddings. We show that EVO outperforms baselines in tasks where high-order dependencies are likely to matter, demonstrating the benefits of considering high-order dependencies in node embeddings. We also provide insights into when it does or does not help to capture these dependencies. To the best of our knowledge, this is the first work on representation learning for higher-order networks.
Frequent coauthors
- 25 shared
Sharareh Alipour
- 25 shared
Steven Skiena
- 25 shared
Anar Amgalan
University of Southern California
- 7 shared
Danai Koutra
- 2 shared
Jordan Kodner
- 2 shared
Deniz Beser
- 2 shared
Jilles Vreeken
Helmholtz Center for Information Security
- 2 shared
Alican Büyükçakır
University of Michigan–Ann Arbor
Education
- 2023
PhD, Computer Science and Engineering
University of Michigan