
Augustin Chaintreau

Associate Professor · Verified

Columbia University · Computer Science

Active 2000–2025

h-index: 35
Citations: 7.1k
Papers: 133 (21 in the last 5 years)
Funding: $540k

About

Augustin Chaintreau is an Associate Professor in the Computer Science Department at Columbia University. His research focuses on designing algorithms and conducting mathematical analysis of networks, with the goal of balancing the benefits of leveraging personal data and social networks with a commitment to fairness and privacy. His recent work addresses transparency in personalization, fairness in personal data markets, efficiency of crowdsourced content curation, and user privacy across various domains. He has contributed to understanding social media behaviors, data sharing, and privacy risks, and his research has been featured in media outlets including the Washington Post, Fortune, New Scientist, and the New York Times. Chaintreau has also been active in the research community, serving as PC Co-chair for ACM SIGMETRICS, General Chair for the Data Transparency Lab Conference 2016, and area editor for IEEE Transactions on Mobile Communication. He has served on program committees for over thirty conferences and has held editorial and leadership roles within the academic community. In addition to his research, Chaintreau teaches courses on social networks and computer networks at Columbia University. His work aims to advance understanding of networked systems, social information sharing, and privacy, contributing both theoretical insights and practical solutions.

Research topics

  • Computer Science
  • Machine Learning
  • Data science
  • Artificial Intelligence
  • Programming language
  • Mathematics
  • Telecommunications
  • World Wide Web
  • Mathematical optimization

Selected publications

  • Information Loss and Disparate Effects in Network Embeddings

    ArXiv.org · 2025-09-15

    preprint · Open access · Senior author

    An extensive line of work studies fairness interventions for network embeddings, but less is known about their baseline behavior. In this work, we ask: how do baseline embeddings (without fairness interventions) produce disparate effects at the representation level? We analyze the asymptotic behavior of low-dimensional embeddings on stochastic block model (SBM) graphs, which encode both homophily and group structure. We characterize exact conditions under which embeddings cause information loss, showing that the amount of information loss depends directly on the graph's density and assortativity. Notably, very different graphs can produce identical embeddings in the limit, and this non-invertibility disproportionately affects smaller and sparser communities. As a result, simple downstream tasks, such as link prediction, introduce higher error rates for these communities, helping explain disparities widely observed in practice.

  • The Cost of Balanced Training-Data Production in an Online Data Market

    2025-04-22

    article · Open access · 1st author · Corresponding

    Many ethical issues in machine learning are connected to the training data. Online data markets are an important source of training data, facilitating both production and distribution. Recently, a trend has emerged of for-profit "ethical" participants in online data markets. This trend raises a fascinating question: Can online data markets sustainably and efficiently address ethical issues in the broader machine-learning economy? In this work, we study this question in a stylized model of an online data market. We investigate the effects of intervening in the data market to achieve balanced training-data production. The model reveals the crucial role of market conditions. In small and emerging markets, an intervention can drive the data producers out of the market, so that the cost of fairness is maximal. Yet, in large and established markets, the cost of fairness can vanish (as a fraction of overall welfare) as the market grows. Our results suggest that "ethical" online data markets can be economically feasible under favorable market conditions, and motivate more models to consider the role of data production and distribution in mediating the impacts of ethical interventions.

  • The Cost of Balanced Training-Data Production in an Online Data Market

    ArXiv.org · 2025-01-31

    preprint · Open access · 1st author · Corresponding

    Many ethical issues in machine learning are connected to the training data. Online data markets are an important source of training data, facilitating both production and distribution. Recently, a trend has emerged of for-profit "ethical" participants in online data markets. This trend raises a fascinating question: Can online data markets sustainably and efficiently address ethical issues in the broader machine-learning economy? In this work, we study this question in a stylized model of an online data market. We investigate the effects of intervening in the data market to achieve balanced training-data production. The model reveals the crucial role of market conditions. In small and emerging markets, an intervention can drive the data producers out of the market, so that the cost of fairness is maximal. Yet, in large and established markets, the cost of fairness can vanish (as a fraction of overall welfare) as the market grows. Our results suggest that "ethical" online data markets can be economically feasible under favorable market conditions, and motivate more models to consider the role of data production and distribution in mediating the impacts of ethical interventions.

  • Network Fairness Ambivalence: When Does Social Network Capital Mitigate or Amplify Unfairness?

    ACM SIGMETRICS Performance Evaluation Review · 2024-06-11

    article · Senior author

    What are the necessary and sufficient conditions under which multi-hop dissemination strategies decrease rather than increase inequity within social networks? Our analysis of various strategies suggests that this largely depends on a limit related to the degree of homophily in the network.

  • Network Fairness Ambivalence: When does social network capital mitigate or amplify unfairness?

    Proceedings of the ACM on Measurement and Analysis of Computing Systems · 2024-05-21

    article · Open access · Senior author

    Social networks inherit societal biases present across lines of gender, race, socioeconomic status, and other factors. Networks can structurally perpetuate unequal access to information and opportunities through homophilous dynamics. While there is substantial knowledge about inequity in the diffusion of opportunities in a network where nodes seek them from their immediate neighbors, much less is known when considering beyond that first hop. In this paper, we leverage recent mathematical analysis of network fairness to prove that enabling simple multi-hop dissemination can reduce inequity towards a minority group in the network as long as homophily is sufficiently weak. Otherwise, our necessary and sufficient condition proves that multi-hop dissemination strategies amplify the bias already found amongst considering direct neighbors. We empirically validate these results on four social network datasets as well as present an example of a key application of our findings with a scenario of individuals who leverage their personal network to seek job referrals. Our results suggest that online platforms designing algorithms to promote opportunities to multi-hop connections must carefully take into account network metrics measuring group size and homophily in order to avoid amplifying bias against marginalized groups on their platforms.

  • Fairness Rising from the Ranks: HITS and PageRank on Homophilic Networks

    2024-05-08 · 5 citations

    article · Open access · Senior author

    In this paper, we investigate the conditions under which link analysis algorithms prevent minority groups from reaching high ranking slots. We find that the most common link-based algorithms using centrality metrics, such as PageRank and HITS, can reproduce and even amplify bias against minority groups in networks. Yet, their behavior differs: on the one hand, we empirically show that PageRank mirrors the degree distribution for most of the ranking positions and it can equalize representation of minorities among the top ranked nodes; on the other hand, we find that HITS amplifies pre-existing bias in homophilic networks through a novel theoretical analysis, supported by empirical results. We find the root cause of bias amplification in HITS to be the level of homophily present in the network, modeled through an evolving network model with two communities. We illustrate our theoretical analysis on both synthetic and real datasets and we present directions for future work.

  • Network Fairness Ambivalence: When Does Social Network Capital Mitigate or Amplify Unfairness?

    2024-06-01

    article · Senior author

    What are the necessary and sufficient conditions under which multi-hop dissemination strategies decrease rather than increase inequity within social networks? Our analysis of various strategies suggests that this largely depends on a limit related to the degree of homophily in the network.

  • Longitudinal study of exposure to radio frequencies at population scale

    Environment International · 2022-03-24 · 14 citations

    article · Open access

    Evaluating exposure to radio frequencies (RF) at population-scale is important for conducting sound epidemiological studies about possible health impact of RF radiations. Numerous studies reported population exposure to RF radiations used in wireless telecommunication technologies, but used very small population samples. In this context, the real exposure of the population at scale remains poorly understood. Here, to the best of our knowledge, we report the largest crowd-based measurement of population exposure to RF produced by cellular antennas, Wi-Fi access points, and Bluetooth devices for 254,410 unique users in 13 countries from January 2017 to December 2020. First, we present methods to assess the population exposure to RF radiations using smartphone measurements obtained using the ElectroSmart Android app. Then, we use these methods to evaluate and characterize the evolution of RF exposure. We show that total exposure has been multiplied by 2.3 in the four-year period considered, with Wi-Fi as the largest contributor. The cellular exposure levels are orders of magnitude lower than regulation limits and are not correlated to national regulation policies. The population tends to be more exposed at home; for half of the study subjects, personal Wi-Fi routers and Bluetooth devices contributed to more than 50% of their total exposure. In this work, we showcase how crowdsource-based data allow large-scale and long-term assessment of population exposure to RF radiations.

  • Non-Existence of Stable Social Groups in Information-Driven Networks

    Theory of Computing Systems · 2022-07-06

    article · Open access · 1st author · Corresponding

  • “I Don’t Have a Photograph, But You Can Have My Footprints.” – Revealing the Demographics of Location Data

    Proceedings of the International AAAI Conference on Web and Social Media · 2021-08-03

    article · Open access

    High accuracy location data are routinely available to a plethora of mobile apps and web services. The availability of such data lead to a better general understanding of human mobility. However, as location data are usually not associated with demographic information, little work has been done to understand the differences in human mobility across demographics. In this study we begin to fill the void. In particular, we explore how the growing number of geotagged footprints that social network users create can reveal demographic attributes and how these footprints enable the understanding of mobility at a demographic level. Our methodology gives rise to novel opportunities in the study of mobility. We leverage publicly available geotagged photographs from a popular photosharing network to build a dataset on demographic mobility patterns. Our analysis of this dataset not only reproduces previous results on mobility behavior at various geographical levels but further extends the existing picture: it allows for the refinement of mobility modeling from entire populations to specific demographic groups. Our analysis suggests the existence of regional variations in mobility and reveals statistically significant differences in mobility between genders and ethnicities.

Frequent coauthors

  • François Baccelli

    École Normale Supérieure - PSL

    23 shared
  • Christophe Diot

    Google (United States)

    23 shared
  • Guillaume Ducoffe

    National Institute for Research & Development in Informatics

    14 shared
  • Arthi Ramachandran

    14 shared
  • Daniel Hsu

    13 shared
  • Fabrizio Dell’Acqua

    12 shared
  • Nakul Verma

    12 shared
  • Emmanuelle Lebhar

    Université Paris Cité

    12 shared

Education

  • Ph.D., Computer Science

    École Normale Supérieure - PSL

    2006
  • DEA [M. Sc.] Probability and Applications, Mathematics Department

    Université de Paris

    2002
  • Magistère, Mathematics and Computer Science

    École Normale Supérieure - PSL

    2001

Awards & honors

  • PC Co-chair for ACM SIGMETRICS
  • General Chair for the Data Transparency Lab Conference 2016
  • Area editor for IEEE Transactions on Mobile Communication