Michael Bender

Research Assistant Professor · Verified

Stony Brook University · Computer Science

Active 1961–2026

h-index: 51
Citations: 9.6k
Papers: 385 · 75 last 5y
Funding: $4.6M · 1 active

About

Michael A. Bender is the John L. Hennessy Chaired Professor of Computer Science at Stony Brook University. He has a distinguished background in algorithms, data structures, cache and I/O-efficient computing, parallel computing, databases, storage, and scheduling. Bender is the founder and former Chief Scientist of Tokutek, Inc., an enterprise database company acquired by Percona in 2014. His research encompasses both pure and applied aspects of algorithms and data structures, with over 200 publications and involvement as PI or co-PI in 40 grants. He has received numerous awards for his contributions to research and education, including fellowships in the IEEE, AAAS, and EATCS, as well as awards for teaching excellence and distinguished papers.

Research topics

  • Computer Science
  • Mathematics
  • Algorithm
  • Theoretical computer science
  • Computer vision
  • Programming language
  • Computer network
  • Parallel computing
  • Operating system
  • Combinatorics
  • Computer hardware
  • Discrete mathematics

Selected publications

  • Writes Wrought Right, and Other Adventures in File System Optimization

    UNC Libraries · 2026-04-09

    article · Open access

    File systems that employ write-optimized dictionaries (WODs) can perform random writes, metadata updates, and recursive directory traversals orders of magnitude faster than conventional file systems. However, previous WOD-based file systems have not obtained all of these performance gains without sacrificing performance on other operations, such as file deletion, file or directory renaming, or sequential writes. Using three techniques, late-binding journaling, zoning, and range deletion, we show that there is no fundamental trade-off in write-optimization. These dramatic improvements can be retained while matching conventional file systems on all other operations. BetrFS 0.2 delivers order-of-magnitude better performance than conventional file systems on directory scans and small random writes and matches the performance of conventional file systems on rename, delete, and sequential I/O. For example, BetrFS 0.2 performs directory scans 2.2× faster, and small random writes over two orders of magnitude faster, than the fastest conventional file system. But unlike BetrFS 0.1, it renames and deletes files commensurate with conventional file systems and performs large sequential I/O at nearly disk bandwidth. The performance benefits of these techniques extend to applications as well. BetrFS 0.2 continues to outperform conventional file systems on many applications, such as rsync, git-diff, and tar, but improves git-clone performance by 35% over BetrFS 0.1, yielding performance comparable to other file systems.

  • History-Independent Dynamic Partitioning with Applications to B-Trees, Skip Lists and Fusion Trees

    ACM Transactions on Database Systems · 2026-04-25

    article · Open access · 1st author · Corresponding

    A data structure is history independent if its internal representation reveals nothing about the history of operations beyond what can be determined from the current contents of the data structure. History independence is typically viewed as a security or privacy guarantee, with the intent being to minimize risks incurred by a security breach or audit. Despite widespread advances in history independence, there is an important data-structural primitive that previous work has been unable to replace with an equivalent history-independent alternative: dynamic partitioning. In dynamic partitioning, we are given a dynamic set S of ordered elements and a size-parameter B, and the objective is to maintain a partition of S into ordered groups, each of size Θ(B). Dynamic partitioning is important throughout computer science, with applications to B-tree rebalancing, write-optimized dictionaries, log-structured merge trees, other external-memory indexes, geometric and spatial data structures, cache-oblivious data structures, and order-maintenance data structures. The lack of a history-independent dynamic-partitioning primitive has meant that designers of history-independent data structures have had to resort to complex alternatives. In this paper, we achieve history-independent dynamic partitioning. Our algorithm runs asymptotically optimally against an oblivious adversary, processing each insert/delete with O(1) operations in expectation and O(B log N / log log N) with high probability in set size N. We also use our dynamic partitioning scheme to build a history-independent B-tree, history-independent fusion tree, and external-memory skip list.

  • The Case for External Graph Sketching

    Society for Industrial and Applied Mathematics eBooks · 2025-01-01

    book-chapter · Open access · 1st author · Corresponding

    Algorithms in the data stream model use O(polylog(N)) space to compute some property of an input of size N, and many of these algorithms are implemented and used in practice. However, sketching algorithms in the graph semi-streaming model use O(V polylog(V)) space for a V-vertex graph, and the fact that implementations of these algorithms are not used in the academic literature or in industrial applications may be because this space requirement is too large for RAM on today’s hardware.

  • Contention resolution with message deadlines

    Distributed Computing · 2025-07-12

    article

  • Optimal Non-oblivious Open Addressing

    2025-06-15

    article · 1st author · Corresponding

  • The Case for External Graph Sketching

    ArXiv.org · 2025-04-24

    preprint · Open access · 1st author · Corresponding

    Algorithms in the data stream model use $O(polylog(N))$ space to compute some property of an input of size $N$, and many of these algorithms are implemented and used in practice. However, sketching algorithms in the graph semi-streaming model use $O(V polylog(V))$ space for a $V$-vertex graph, and the fact that implementations of these algorithms are not used in the academic literature or in industrial applications may be because this space requirement is too large for RAM on today's hardware. In this paper we introduce the external semi-streaming model, which addresses the aspects of the semi-streaming model that limit its practical impact. In this model, the input is in the form of a stream and $O(V polylog(V))$ space is available, but most of that space is accessible only via block I/O operations as in the external memory model. The goal in the external semi-streaming model is to simultaneously achieve small space and low I/O cost. We present a general transformation from any vertex-based sketch algorithm to one which has a low sketching cost in the new model. We prove that this automatic transformation is tight or nearly (up to a $O(\log(V))$ factor) tight via an I/O lower bound for the task of sketching the input stream. Using this transformation and other techniques, we present external semi-streaming algorithms for connectivity, bipartiteness testing, $(1+ε)$-approximating MST weight, testing k-edge connectivity, $(1+ε)$-approximating the minimum cut of a graph, computing $ε$-cut sparsifiers, and approximating the density of the densest subgraph. These algorithms all use $O(V poly(\log(V), ε^{-1}, k))$ space. For many of these problems, our external semi-streaming algorithms outperform the state-of-the-art algorithms in both the sketching and external-memory models.

  • History-Independent Concurrent Hash Tables

    ArXiv.org · 2025-03-26

    preprint · Open access

    A history-independent data structure does not reveal the history of operations applied to it, only its current logical state, even if its internal state is examined. This paper studies history-independent concurrent dictionaries, in particular, hash tables, and establishes inherent bounds on their space requirements. This paper shows that there is a lock-free history-independent concurrent hash table, in which each memory cell stores two elements and two bits, based on Robin Hood hashing. Our implementation is linearizable, and uses the shared memory primitive LL/SC. The expected amortized step complexity of the hash table is $O(c)$, where $c$ is an upper bound on the number of concurrent operations that access the same element, assuming the hash table is not overpopulated. We complement this positive result by showing that even if we have only two concurrent processes, no history-independent concurrent dictionary that supports sets of any size, with wait-free membership queries and obstruction-free insertions and deletions, can store only two elements of the set and a constant number of bits in each memory cell. This holds even if the step complexity of operations on the dictionary is unbounded.

  • Time To Replace Your Filter: How Maplets Simplify System Design

    ArXiv.org · 2025-10-07

    preprint · Open access · 1st author · Corresponding

    Filters such as Bloom, quotient, and cuckoo filters are fundamental building blocks providing space-efficient approximate set membership testing. However, many applications need to associate small values with keys, functionality that filters do not provide. This mismatch forces complex workarounds that degrade performance. We argue that maplets (space-efficient data structures for approximate key-value mappings) are the right abstraction. A maplet provides the same space benefits as filters while natively supporting key-value associations with one-sided error guarantees. Through detailed case studies of SplinterDB (LSM-based key-value store), Squeakr (k-mer counter), and Mantis (genomic sequence search), we identify the common patterns and demonstrate how a unified maplet abstraction can lead to simpler designs and better performance. We conclude that applications benefit from defaulting to maplets rather than filters across domains including databases, computational biology, and networking.

  • Fast and Compact Sketch-Based Dynamic Connectivity

    ArXiv.org · 2025-09-17

    preprint · Open access

    We study the dynamic connectivity problem for massive, dense graphs. Our goal is to build a system for dense graphs that simultaneously answers connectivity queries quickly, maintains a fast update throughput, and uses a small amount of memory. Existing systems at best achieve two of these three performance goals at once. We present a parallel dynamic connectivity algorithm using graph sketching techniques that has space complexity $O(V \log^3 V)$ and query complexity $O(\log V/\log\log V)$. Its updates are fast and parallel: in the worst case, it performs updates in $O(\log^2 V)$ depth and $O(\log^4 V)$ work. For updates which don't change the spanning forests maintained by our data structure, the update complexity is $O(\log V)$ depth and $O(\log^2 V)$ work. We also present CUPCaKE (Compact Updating Parallel Connectivity and Sketching Engine), a dynamic connectivity system based on our parallel algorithm. It uses an order of magnitude less memory than the best lossless systems on dense graph inputs, answers queries with microsecond latency, and ingests millions of updates per second on dense graphs.

  • Adaptive and Scalable Data Structures (Dagstuhl Seminar 25191)

    Leibniz-Zentrum für Informatik (Schloss Dagstuhl) · 2025-01-01

    report · Open access · 1st author · Corresponding

    This report documents the program and the outcomes of Dagstuhl Seminar 25191 "Adaptive and Scalable Data Structures". Data structures govern the organization and manipulation of data in computing systems across a broad range of applications. The efficiency and scalability of data structures have profound implications, motivating continued research on the entire spectrum from theoretical to practical. As the size and complexity of data sets increase and as the underlying computing infrastructure changes, data structures need to be continually redesigned with scalability in mind. Classical data structures also need reevaluation to better fit the requirements of modern applications. Adaptivity offers a way to design data structures that automatically take advantage of features of the underlying hardware, specific structure and biases in their usage, or side-information, and the limits of data structure adaptivity pose deep research questions. The goal of this seminar was to reflect on these complementary aspects of data structure research and to identify promising research questions. The program provides a snapshot of the current state of research and establishes possible future directions for the field.
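
The history-independence idea in the "History-Independent Concurrent Hash Tables" abstract above can be illustrated with a small sequential sketch. This is not the paper's lock-free algorithm: it is an insert-only, single-threaded Robin Hood table, and the `home` function (key modulo table size), the tie-break by smaller key, and the names `rh_insert` and `build` are all assumptions made for illustration. The point it tries to show is that the final memory layout depends only on the set of keys, not on the order in which they were inserted.

```python
from itertools import permutations


def home(key, m):
    """Hypothetical hash function: the key's preferred slot in a table of size m."""
    return key % m


def rh_insert(table, key):
    """Insert a distinct integer key using Robin Hood linear probing.

    The element that has travelled farther from its home slot keeps the slot;
    ties are broken in favour of the smaller key (an assumed canonical rule).
    """
    m = len(table)
    if key in table:
        return
    if None not in table:
        raise ValueError("table is full")
    cur, pos, dist = key, home(key, m), 0
    while table[pos] is not None:
        occupant = table[pos]
        occupant_dist = (pos - home(occupant, m)) % m
        if dist > occupant_dist or (dist == occupant_dist and cur < occupant):
            # Evict the occupant and keep probing on its behalf.
            table[pos], cur, dist = cur, occupant, occupant_dist
        pos, dist = (pos + 1) % m, dist + 1
    table[pos] = cur


def build(keys, m=8):
    """Build a fresh table by inserting keys in the given order."""
    table = [None] * m
    for k in keys:
        rh_insert(table, k)
    return table


keys = [3, 11, 19, 4, 12]  # three keys collide on slot 3, two on slot 4
layouts = {tuple(build(order)) for order in permutations(keys)}
print(len(layouts), layouts)
```

Every one of the 120 insertion orders should produce the same layout, so an observer who inspects the array learns the key set but not the insertion history. Supporting deletions and concurrency, which the abstract addresses, would need considerably more machinery than this sketch has.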
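
The maplet abstraction described in the "Time To Replace Your Filter" abstract above can be sketched as a toy. This is an assumption-laden illustration, not the paper's data structure: `ToyMaplet`, its methods, and the choice of a truncated BLAKE2b digest as the fingerprint are hypothetical, and a Python dict is used for clarity rather than space efficiency. It stores only short fingerprints instead of full keys, so a lookup always includes the true value for a stored key but may also return extra values when fingerprints collide.

```python
import hashlib
from collections import defaultdict


class ToyMaplet:
    """Toy approximate key-value map: keeps fingerprints, not keys."""

    def __init__(self, fingerprint_bytes=2):
        self.fingerprint_bytes = fingerprint_bytes
        self._slots = defaultdict(list)  # fingerprint -> candidate values

    def _fingerprint(self, key):
        # Truncated hash of the key; distinct keys may share a fingerprint.
        digest = hashlib.blake2b(
            key.encode("utf-8"), digest_size=self.fingerprint_bytes
        )
        return digest.digest()

    def put(self, key, value):
        self._slots[self._fingerprint(key)].append(value)

    def get(self, key):
        # One-sided error: the true value is always among the candidates,
        # but colliding keys (stored or not) can contribute extra ones.
        return list(self._slots.get(self._fingerprint(key), []))


counts = ToyMaplet(fingerprint_bytes=1)  # tiny fingerprints force collisions
for i in range(1000):
    counts.put(f"kmer-{i}", i)

print(len(counts.get("kmer-42")), 42 in counts.get("kmer-42"))
```

With a one-byte fingerprint there are at most 256 distinct slots, so each lookup returns several candidates. A larger fingerprint trades space for fewer spurious values.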

Awards & honors

  • Fellow of the Institute of Electrical and Electronics Engine…
  • SPAA Distinguished Paper Award, 2025
  • ACM SIGMOD Research Highlight Award, 2024
  • Fellow of the American Association for the Advancement of Sc…
  • Fellow of the European Association of Theoretical Computer S…