Gianluca Stringhini

Assistant Professor – Electrical & Computer Engineering · Affiliated Faculty – Computer Science · Verified

Boston University · Computer Science

Active 2010–2026

h-index: 48
Citations: 8.5k
Papers: 300 (135 in the last 5 years)
Funding: $1.6M (1 active)

About

Gianluca Stringhini is an Assistant Professor in the Department of Electrical and Computer Engineering at Boston University. His research applies a data-driven approach to better understand malicious activity on the Internet. Through the collection and analysis of large-scale datasets, he develops novel and robust mitigation techniques to make the Internet a safer place. His work involves a mix of quantitative analysis, some qualitative analysis, machine learning, crime science, and systems design. Recently, he has investigated the spread of alternative news and memes on online social networks, raids organized by trolls against other Internet users, cyberbullying, ransomware, online dating scams, money laundering schemes linked to cybercrime, malware delivery networks, and online social network compromises.

Research topics

  • Computer Science
  • Sociology
  • Political Science
  • World Wide Web
  • Social Science
  • Internet privacy
  • Geography
  • Law
  • Computer Security
  • Cartography
  • Business
  • Advertising
  • Criminology
  • Biology
  • Public relations

Selected publications

  • The Cost of Convenience: Identifying, Analyzing, and Mitigating Predatory Loan Applications on Android

    ArXiv.org · 2026-01-19

    article · Open access · Senior author

    Digital lending applications, commonly referred to as loan apps, have become a primary channel for microcredit in emerging markets. However, many of these apps demand excessive permissions and misuse sensitive user data for coercive debt-recovery practices, including harassment, blackmail, and public shaming that affect both borrowers and their contacts. This paper presents the first cross-country measurement of loan app compliance against both national regulations and Google's Financial Services Policy. We analyze 434 apps drawn from official registries and app markets from Indonesia, Kenya, Nigeria, Pakistan, and the Philippines. To operationalize policy requirements at scale, we translate policy text into testable permission checks using LLM-assisted policy-to-permission mapping and combine this with static and dynamic analyses of loan apps' code and runtime behavior. Our findings reveal pervasive non-compliance among approved apps: 141 violate national regulatory policy and 147 violate Google policy. Dynamic analysis further shows that several apps transmit sensitive data (contacts, SMS, location, media) before user signup or registration, undermining informed consent and enabling downstream harassment of borrowers and third parties. Following our disclosures, Google removed 93 flagged apps from Google Play, representing over 300M cumulative installs. We advocate for adopting our methodology as a proactive compliance-monitoring tool and offer targeted recommendations for regulators, platforms, and developers to strengthen privacy protections. Overall, our results highlight the need for coordinated enforcement and robust technical safeguards to ensure that digital lending supports financial inclusion without compromising user privacy or safety.

  • Praxium: Diagnosing Cloud Anomalies with AI-based Telemetry and Dependency Analysis

    ArXiv.org · 2026-03-25

    article · Open access

    As the modern microservice architecture for cloud applications grows in popularity, cloud services are becoming increasingly complex and more vulnerable to misconfiguration and software bugs. Traditional approaches rely on expert input to diagnose and fix microservice anomalies, which lacks scalability in the face of the continuous integration and continuous deployment (CI/CD) paradigm. Microservice rollouts, containing new software installations, have complex interactions with the components of an application. Consequently, this added difficulty in attributing anomalous behavior to any specific installation or rollout results in potentially slower resolution times. To address the gaps in current diagnostic methods, this paper introduces Praxium, a framework for anomaly detection and root cause inference. Praxium aids administrators in evaluating target metric performance in the context of dependency installation information provided by a software discovery tool, PraxiPaaS. Praxium continuously monitors telemetry data to identify anomalies, then conducts root cause analysis via causal impact on recent software installations, in order to provide site reliability engineers (SRE) relevant information about an observed anomaly. In this paper, we demonstrate that Praxium is capable of effective anomaly detection and root cause inference, and we provide an analysis on effective anomaly detection hyperparameter tuning as needed in a practical setting. Across 75 total trials using four synthetic anomalies, anomaly detection consistently performs at >0.97 macro-F1. In addition, we show that causal impact analysis reliably infers the correct root cause of anomalies, even as package installations occur at increasingly shorter intervals.

  • Loki: Proactively discovering online scams by mining toxic search queries

    2026-01-01

    article · Open access · Senior author

    Online e-commerce scams, ranging from shopping scams to pet scams, globally cause millions of dollars in financial damage every year. In response, the security community has developed highly accurate detection systems able to determine if a website is fraudulent. However, finding candidate scam websites that can be passed as input to these downstream detection systems is challenging: relying on user reports is inherently reactive and slow, and proactive systems issuing search engine queries to return candidate websites suffer from low coverage and do not generalize to new scam types. In this paper, we present LOKI, a system designed to identify search engine queries likely to return a high fraction of fraudulent websites. LOKI implements a keyword scoring model grounded in Learning Under Privileged Information (LUPI) and feature distillation from Search Engine Result Pages (SERPs). We rigorously validate LOKI across 10 major scam categories and demonstrate a 20.58 times improvement in discovery over both heuristic and data-driven baselines across all categories. Leveraging a small seed set of only 1,663 known scam sites, we use the keywords identified by our method to discover 52,493 previously unreported scams in the wild. Finally, we show that LOKI generalizes to previously-unseen scam categories, highlighting its utility in surfacing emerging threats.

  • Revealing The Secret Power: How Algorithms Can Influence Content Visibility on Social Media

    2026-01-01

    article · Open access · Senior author

    In recent years, the opaque design and the limited public understanding of social networks' recommendation algorithms have raised concerns about potential manipulation of information exposure. Reducing content visibility, aka shadow banning, may help limit harmful content; however, it can also be used to suppress dissenting voices. This prompts the need for greater transparency and a better understanding of this practice. In this paper, we investigate the presence of visibility alterations through a large-scale quantitative analysis of two Twitter/X datasets comprising over 40 million tweets from more than 9 million users, focused on discussions surrounding the Ukraine-Russia conflict and the 2024 US Presidential Elections. We use view counts to detect patterns of reduced or inflated visibility and examine how these correlate with user opinions, social roles, and narrative framings. Our analysis shows that the algorithm systematically penalizes tweets containing links to external resources, reducing their visibility by up to a factor of eight, regardless of the ideological stance or source reliability. Rather, content visibility may be penalized or favored depending on the specific accounts producing it, as observed when comparing tweets from the Kyiv Independent and RT.com or tweets by Donald Trump and Kamala Harris. Overall, our work highlights the importance of transparency in content moderation and recommendation systems to protect the integrity of public discourse and ensure equitable access to online platforms.

  • Lessons Learned from Anomaly Detection in Chameleon Cloud

    2025-09-23

    article

    Cloud computing has become integral to modern technology infrastructure, supporting a wide range of services from e-commerce to AI applications. Chameleon is a large-scale, configurable testbed designed to enable edge-to-cloud research through full bare-metal provisioning, virtualization, and diverse hardware resources, which is built on a leading open source cloud platform OpenStack. However, monitoring Chameleon’s heterogeneous infrastructure is challenging, particularly across OpenStack services and hardware components. Traditional threshold-based alerting methods struggle to keep up with the scale and complexity of such environments. In this work, we present an anomaly detection framework for OpenStack services in the Chameleon Cloud. We curate and publish the first dataset of resource usage metrics collected from OpenStack control plane services. We evaluate four state-of-the-art unsupervised multivariate time series models, namely TranAD, Prodigy, USAD, and OmniAnomaly, on this dataset and share key insights from deploying them. Our findings indicate that for our use case, while all models achieve high F1 scores, training with three days of healthy data effectively balances training cost and detection accuracy.

  • Timeliness Matters: Leveraging Reinforcement Learning on Social Media Data to Prioritize High-Risk Conversations for Promoting Youth Online Safety

    Proceedings of the International AAAI Conference on Web and Social Media · 2025-06-07 · 1 citation

    article · Open access

    Ensuring the online safety of youth has motivated research towards the development of machine learning (ML) methods capable of accurately detecting social media risks after-the-fact. However, for these detection models to be effective, they must proactively identify high-risk scenarios (e.g., sexual solicitations, cyberbullying) to mitigate harm. This 'real-time' responsiveness is a recognized challenge within the risk detection literature. Therefore, this paper presents a novel two-level framework that first uses reinforcement learning to identify conversation stop points to prioritize messages for evaluation. Then, we optimize state-of-the-art deep learning models to accurately categorize risk priority (low, high). We apply this framework to a time-based simulation using a rich dataset of 23K private conversations with over 7 million messages donated by 194 youth (ages 13-21). We conducted an experiment comparing our new approach to a traditional conversation-level baseline. We found that the timeliness of conversations significantly improved from over 2 hours to approximately 16 minutes with only a slight reduction in accuracy (0.88 to 0.84). This study advances real-time detection approaches for social media data and provides a benchmark for future training reinforcement learning that prioritizes the timeliness of classifying high-risk conversations.

  • From CVE Entries to Verifiable Exploits: An Automated Multi-Agent Framework for Reproducing CVEs

    ArXiv.org · 2025-09-01

    preprint · Open access · Senior author

    High-quality datasets of real-world vulnerabilities and their corresponding verifiable exploits are crucial resources in software security research. Yet such resources remain scarce, as their creation demands intensive manual effort and deep security expertise. In this paper, we present CVE-GENIE, an automated, large language model (LLM)-based multi-agent framework designed to reproduce real-world vulnerabilities, provided in Common Vulnerabilities and Exposures (CVE) format, to enable creation of high-quality vulnerability datasets. Given a CVE entry as input, CVE-GENIE gathers the relevant resources of the CVE, automatically reconstructs the vulnerable environment, and (re)produces a verifiable exploit. Our systematic evaluation highlights the efficiency and robustness of CVE-GENIE's design and successfully reproduces approximately 51% (428 of 841) CVEs published in 2024-2025, complete with their verifiable exploits, at an average cost of $2.77 per CVE. Our pipeline offers a robust method to generate reproducible CVE benchmarks, valuable for diverse applications such as fuzzer evaluation, vulnerability patching, and assessing AI's security capabilities.

  • Mirror Mirror on the Wall, which APK Mirror Site is the Largest of Them All?

    2025-10-28

    article

Frequent coauthors

  • Jeremy Blackburn

    157 shared
  • Emiliano De Cristofaro

    University of California, Riverside

    128 shared
  • Savvas Zannettou

    Delft University of Technology

    92 shared
  • Michael Sirivianos

    Cyprus University of Technology

    49 shared
  • Nicolas Kourtellis

    Telefonica Research and Development

    45 shared
  • Ilias Leontiadis

    Meta (United Kingdom)

    42 shared
  • Enrico Mariconti

    31 shared
  • Adam Doupé

    31 shared

Education

  • Ph.D., Computer Science

    University of California, San Diego

    2012
  • M.S., Computer Science

    University of California, San Diego

    2009
  • B.S., Computer Science

    University of Rome 'La Sapienza'

    2007

Awards & honors

  • Facebook Secure the Internet Grant (2018)
  • Google Faculty Research Award (2015)
  • Symantec Research Labs Fellowship (2012)
  • Multiple best paper awards