
David Choffnes
Professor, Executive Director - Cybersecurity and Privacy Institute · Northeastern University · Cybersecurity and Information Systems
Active 2003–2026
About
David Choffnes is a Professor and the Executive Director of the Cybersecurity and Privacy Institute at Khoury. He leads interdisciplinary efforts in cybersecurity and privacy, working with Khoury to advance research and education in these fields.
Research topics
- Computer Science
- Computer Security
- Internet privacy
- World Wide Web
- Political Science
- Human–computer interaction
- Sociology
- Psychology
- Operating system
- Computer network
- Speech recognition
- Multimedia
- Database
- Linguistics
- Data science
Selected publications
Beyond the Hype: Empirical Analysis of Matter Standard's Security and Privacy
Zenodo (CERN European Organization for Nuclear Research) · 2026-10-12
article · Open access
This paper presents an empirical analysis of the security and privacy properties of the Matter IoT standard. Moving beyond theoretical design claims, the authors evaluate real-world implementations to identify practical weaknesses and deployment challenges. The study examines device onboarding, communication protocols, and access control mechanisms, highlighting gaps between the specification and actual behavior. It also explores privacy risks related to metadata exposure and ecosystem interoperability. The findings reveal that while Matter introduces meaningful security improvements, inconsistencies in implementation and ecosystem complexity can undermine its guarantees. The paper concludes with recommendations to strengthen security practices and improve privacy protections in future deployments.
Dataset: Catalogues Matter and non-Matter IoT devices, including bulbs, plugs, bridges, controllers, and miscellaneous devices, across multiple brands, enabling comparison of functionality, compatibility, and ecosystem diversity in smart home deployments. Network traces have been limited to 300MB per device.
Scripts: Rorating-Device-ID-analysis-script: function to analyze all PCAPs in the specified directory, extract RI values, count occurrences, plot a histogram, and use a different color for each key (k1, k2, …).
SPHERE CPS Enclave: A Reconfigurable Testbed for Industrial Control System Security Experimentation
2025-05-06
article
Cyber-physical systems (CPS) increasingly face security threats that can disrupt critical infrastructure operations. The SPHERE CPS enclave is a modular, remotely accessible industrial control system (ICS) testbed designed to support security experimentation on programmable logic controllers (PLCs), industrial networks, and digital twin simulations. It enables researchers to investigate cyber-physical attacks, anomaly detection, and intrusion resilience strategies. Unlike general cybersecurity testbeds, SPHERE's CPS enclave provides a configurable, realistic environment for studying adversarial scenarios that bridge cyber and physical domains. The infrastructure offers controlled, reproducible experiments with customizable network topologies and hardware-in-the-loop validation. This poster presents the design philosophy, community-driven experimental goals, and deployment considerations of the SPHERE CPS enclave, demonstrating its potential for advancing CPS security research.
Empirically Measuring Data Localization in the EU
Proceedings on Privacy Enhancing Technologies · 2025-05-19 · 1 citations
article · Open access · Senior author
EU data localization regulations limit data transfers to non-EU countries with the GDPR. However, BGP, DNS and other Internet protocols were not designed to enforce jurisdictional constraints, so implementing data localization is challenging. Despite initial research on the topic, little is known about if or how companies currently operate their server infrastructure to comply with the regulations. We close this knowledge gap by empirically measuring the extent to which servers and routers that process EU requests are located outside of the EU (and a handful of 'adequate' non-EU countries). The key challenge is that both browser measurements (to infer relevant endpoints) and data-plane measurements (to infer relevant IP addresses) are needed, but no large-scale public infrastructure allows both. We build a novel methodology that combines BrightData (browser) and RIPE Atlas (data-plane) probes, with joint measurements from over 1,000 networks in 20 EU countries. We find that, on average, 2.2% of servers serving users in each EU country are located in non-adequate destination countries (1.4% of known trackers). Our findings suggest that data localization policies are largely being followed by content providers, though there are exceptions.
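The headline figure in the abstract above is an average of per-country shares. The following is a minimal sketch of that arithmetic, not the authors' pipeline: the record layout, the function names, the `ALLOWED` set (an illustrative subset), and the placeholder country code `XX` are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

# Assumption: an illustrative subset of EU members plus 'adequate' countries.
# The real list is longer and is not given in the abstract.
ALLOWED = {"DE", "FR", "NL", "IE", "CH", "JP"}

# Hypothetical observations: (vantage country, server IP, server country).
# "XX" stands in for any non-adequate destination country.
SAMPLE = [
    ("DE", "192.0.2.1", "DE"),
    ("DE", "192.0.2.2", "XX"),
    ("DE", "192.0.2.3", "FR"),
    ("DE", "192.0.2.4", "NL"),
    ("FR", "198.51.100.1", "FR"),
    ("FR", "198.51.100.1", "FR"),  # repeated sighting of the same server
    ("FR", "198.51.100.2", "IE"),
]


def non_adequate_share(observations):
    """Return, per vantage country, the fraction of distinct servers
    located outside the ALLOWED set."""
    servers = defaultdict(dict)
    for vantage, ip, country in observations:
        servers[vantage][ip] = country  # count each server once per vantage country
    return {
        vantage: sum(c not in ALLOWED for c in by_ip.values()) / len(by_ip)
        for vantage, by_ip in servers.items()
    }


def average_share(shares):
    """Unweighted mean across vantage countries."""
    return mean(shares.values())


if __name__ == "__main__":
    shares = non_adequate_share(SAMPLE)
    print(shares, average_share(shares))
```

Averaging the per-country shares, rather than pooling all servers, gives each country equal weight regardless of how many servers were observed from it; whether the paper weights countries this way is an assumption here.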
Promises, Promises: Understanding Claims Made in Social Robot Consumer Experiences
2025-04-24 · 1 citations
article · Open access
Social robots are a class of emerging smart consumer electronics devices that promise sophisticated experiences featuring emotive capabilities, artificial intelligence, conversational interaction, and more. With unique risk factors like emotional attachment, little is known on how social robots communicate these promises to consumers and whether they adequately deliver upon them within their overall product experiences prior to and during user interaction. Animated by a consumer protection lens, this paper systematically investigates manufacturer claims made for four commercially available social robots, evaluating these claims against the provided user experience and consumer reviews. We find that social robots vary widely in the manner and extent to which they communicate intelligent features and the supposed benefits of these features, while consumer perspectives similarly include a wide range of perceptions on robot and AI performance, capabilities, and product frustrations. We conclude by discussing social robots’ unique propensities for consumer risk, and consider implications for regulators, developers, and researchers of social robots.
Echoes of Privacy: Uncovering the Profiling Practices of Voice Assistants
Proceedings on Privacy Enhancing Technologies · 2025-03-07
article · Open access · Senior author
Many companies, including Google, Amazon, and Apple, offer voice assistants as a convenient solution for answering general voice queries and accessing their services. These voice assistants have gained popularity and can be easily accessed through various smart devices such as smartphones, smart speakers, smartwatches, and an increasing array of other devices. However, this convenience comes with potential privacy risks. For instance, while companies vaguely mention in their privacy policies that they may use voice interactions for user profiling, it remains unclear to what extent this profiling occurs and whether voice interactions pose greater privacy risks compared to other interaction modalities. In this paper, we conduct 1171 experiments involving 24530 queries with different personas and interaction modalities during 20 months to characterize how the three most popular voice assistants profile their users. We analyze factors such as labels assigned to users, their accuracy, the time taken to assign these labels, differences between voice and web interactions, and the effectiveness of profiling remediation tools offered by each voice assistant. Our findings reveal that profiling can happen without interaction, can be incorrect and inconsistent at times, may take several days or weeks to change, and is affected by the interaction modality.
MLCerts Docker Images (ICSE 2026)
Zenodo (CERN European Organization for Nuclear Research) · 2025-12-08
other · Open access · Senior author
Licensed under a Creative Commons Attribution 4.0 International License.
Auxiliary material, up-to-date documentation, and issue tracking are available at: https://github.com/rub-softsec/MLCerts
The Datasets and Language Models are available at: https://zenodo.org/records/15971208
This archive contains a Docker image for the Differential Testing Framework, an image with a patched Transcert implementation, and an image for generating synthetic certificates using a pre-trained model.

MLCerts Differential Testing Framework
Run the image with the corresponding data directory mounted. The outputs of the testing framework will appear in the /attached_dir/testing-results and /attached_dir/coverage directories.
    docker load -i mlcerts_export.tar
    docker run -it -v ./attached_dir_export:/attached_dir mlcerts_export bash
    conda deactivate
    cd /attached_dir
    ./mlcerts/run_testing.sh . ./cert_data_pem/v3-experiments/ ./mlcerts/ ./LIBS/ ./customCA/cacert.pem

Transcert
Run the image with the corresponding data directory mounted to find the patched Transcert source code (based on https://github.com/joky27/transcert_related).
    docker load -i transcert_export.tar
    docker run -it -v ./attached_dir_export:/attached_dir transcert_export bash
    conda activate transcert
    cd /attached_dir
The key modifications are to:
- Use fastcov instead of lcov.
- Use gmtime_adj* functions instead of set* due to a pyOpenSSL bug (https://github.com/pyca/pyopenssl/issues/311) that has not been fixed due to API deprecation.

Language Models
LM code requires installation of CUDA drivers specific to the GPUs available. For a simple demonstration, we release a container that relies on the main model used in the paper and uses CPU to generate certificates. The data directory to attach to the container needs to be downloaded: llm-code-mlcerts-export.zip from https://zenodo.org/records/15971208.
Run the image with the corresponding data directory mounted:
    docker load -i mlcerts-llm-cpu-demo.tar
    docker run -it -v ./MLcerts-EXPORT:/MLcerts-EXPORT mlcerts-llm-cpu-demo /bin/bash
Then, to generate synthetic certificates using the final model used in the paper (IPv4/RNN-Medium with Temperature = 1.5):
    cd /MLcerts-EXPORT/Char-RNN-PyTorch
    conda activate py39
    python3 generate.py zmap-data-1024-3-0.0002lr-0.1dropout-epoch3-step300000 1024 3 1.5 zmap-data testZmap1M
The synthetic ASN outputs will appear in the ./outputCerts directory. Due to the reliance on CPU, it may take ~10 minutes per output.
Finally, to convert ASN outputs to usable PEM formats:
    cd /MLcerts-EXPORT/
    conda activate myenv
    python3 asn1_to_pem.py <asn_file_path> <output_dir_path> 2
For example:
    python3 asn1_to_pem.py Char-RNN-PyTorch/outputCerts/zmap-data-1024-3-0.0002lr-0.1dropout-epoch3-step300000testZmap1M/fbbff4ee-67f0-423b-8647-5e11754ebdf3.asn . 2
An output.XYZ.pem file is generated, using CA information from the customCA/ directory.

BibTeX
Please cite our paper if you rely on our artifacts for your work.
    @inproceedings{icse2026-hallucinating-certificates,
      title = {{Hallucinating Certificates: Differential Testing of TLS Certificate Validation Using Generative Language Models}},
      author = {Paracha, Talha and Posluns, Kyle and Borgolte, Kevin and Lindorfer, Martina and Choffnes, David},
      booktitle = {Proceedings of the 48th IEEE/ACM International Conference on Software Engineering (ICSE)},
      date = {2026-04},
      edition = {48},
      editor = {Mezini, Mira and Zimmermann, Thomas},
      location = {Rio de Janeiro, Brazil},
      publisher = {Association for Computing Machinery (ACM)/Institute of Electrical and Electronics Engineers (IEEE)}
    }
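The generation step in the MLCerts record sets Temperature = 1.5 for a character-level RNN. As a rough illustration of what a sampling temperature usually means, here is a minimal sketch of temperature-scaled softmax sampling. It is not taken from generate.py; that the script follows this standard convention is an assumption, and the function names and example logits are hypothetical.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities after dividing by the temperature.

    Temperatures above 1 flatten the distribution (more varied output);
    temperatures below 1 sharpen it.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_index(logits, temperature, rng=random):
    """Draw one index (e.g. the next character id) from the scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    logits = [2.0, 1.0, 0.1]  # hypothetical scores for three characters
    print(softmax_with_temperature(logits, 1.0))
    print(softmax_with_temperature(logits, 1.5))
    print(sample_index(logits, 1.5, random.Random(0)))
```

Under this convention, a value of 1.5 makes less likely characters more probable than they would be at 1.0, which would yield more unusual certificates for differential testing.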
Dark Patterns as Disloyal Design
SSRN Electronic Journal · 2025-01-01
preprint · Open access
Empirically Measuring Data Localization in the EU
ArXiv.org · 2025-04-12
preprint · Open access · Senior author
EU data localization regulations limit data transfers to non-EU countries with the GDPR. However, BGP, DNS and other Internet protocols were not designed to enforce jurisdictional constraints, so implementing data localization is challenging. Despite initial research on the topic, little is known about if or how companies currently operate their server infrastructure to comply with the regulations. We close this knowledge gap by empirically measuring the extent to which servers and routers that process EU requests are located outside of the EU (and a handful of 'adequate' non-EU countries). The key challenge is that both browser measurements (to infer relevant endpoints) and data-plane measurements (to infer relevant IP addresses) are needed, but no large-scale public infrastructure allows both. We build a novel methodology that combines BrightData (browser) and RIPE Atlas (data-plane) probes, with joint measurements from over 1,000 networks in 19 EU countries. We find that, on average, 2.3% of servers serving users in each EU country are located in non-adequate destination countries (1.4% of known trackers). Our findings suggest that data localization policies are largely being followed by content providers, though there are exceptions.
Recent grants
CI-New: Collaborative Research: An Open Platform for Internet Routing Experiments
NSF · $361k · 2015–2018
SaTC: Frontiers: Collaborative: Protecting Personal Data Flow on the Internet
NSF · $1.7M · 2020–2026
TWC: Small: Efficient Traffic Analysis Resistance for Anonymity Networks
NSF · $508k · 2016–2020
NSF · $507k · 2019–2022
NeTS: Small: A Principled Approach to Enabling Policy Transparency for Mobile Networks
NSF · $308k · 2016–2020
Frequent coauthors
- Ashwin Rao · University of Helsinki · 45 shared
- Martina Lindorfer · TU Wien · 44 shared
- Narseo Vallina-Rodriguez · 43 shared
- Álvaro Feal · Universidad del Noreste · 38 shared
- Amogh Pradeep · Boston University · 38 shared
- Julien Gamba · Universidad Carlos III de Madrid · 37 shared
- Alan Mislove · Northeastern University · 34 shared
- Fabián E. Bustamante · Northwestern University · 30 shared
Education
- 1996
Ph.D., Computer Science
Massachusetts Institute of Technology
- 1993
M.S., Computer Science
Massachusetts Institute of Technology
- 1989
B.S., Computer Science
University of California, Berkeley