Juan Pablo Bello

Professor of Computer Science and Engineering · Verified

New York University · Computer Science

Active 1998–2026

h-index: 45
Citations: 9.3k
Papers: 239 (69 in the last 5 years)
Funding: $7.3M

About

Juan Pablo Bello is a Professor of Music Technology, Computer Science & Engineering, Electrical & Computer Engineering, and Urban Science at New York University. He received a Bachelor of Engineering in Electronics from the Universidad Simón Bolívar in Caracas, Venezuela in 1998, and earned a doctorate in Electronic Engineering at Queen Mary, University of London in 2003. His expertise lies in digital signal processing, applied machine learning, and their applications in machine listening and music information retrieval. He has published more than 150 papers and articles in books, journals, and conference proceedings. Since 2016, he has served as the director of the Music and Audio Research Lab (MARL), a research center within NYU's Steinhardt School of Education, Culture and Human Development. From 2019 to 2022, he was also the director of the Center for Urban Science and Progress (CUSP), a research center at NYU's Tandon School of Engineering. His work has received support from various public and private institutions including the NSF, DARPA, IMLS, Bosch, Adobe, Google, and iHeartRadio. He is a recipient of an NSF CAREER award and a Fulbright scholar grant for multidisciplinary studies in France.

Research topics

  • Artificial Intelligence
  • Computer Science
  • Machine Learning
  • Speech recognition
  • Data Mining
  • Telecommunications
  • Real-time computing
  • Natural Language Processing
  • Computer Security
  • Multimedia
  • Acoustics
  • Geography
  • World Wide Web
  • Engineering
  • Computer network

Selected publications

  • Evaluating Compositional Structure in Audio Representations

    ArXiv.org · 2026-03-14

    article · Open access · Senior author

    We propose a benchmark for evaluating compositionality in audio representations. Audio compositionality refers to representing sound scenes in terms of constituent sources and attributes, and combining them systematically. While central to auditory perception, this property is largely absent from current evaluation protocols. Our framework adapts ideas from vision and language to audio through two tasks: A-COAT, which tests consistency under additive transformations, and A-TRE, which probes reconstructibility from attribute-level primitives. Both tasks are supported by large synthetic datasets with controlled variation in acoustic attributes, providing the first benchmark of compositional structure in audio embeddings.

  • Controllable Embedding Transformation for Mood-Guided Music Retrieval

    2026-04-21

    article · Open access

    Music representations are the backbone of modern recommendation systems, powering playlist generation, similarity search, and personalized discovery. Yet most embeddings offer little control for adjusting a single musical attribute, e.g., changing only the mood of a track while preserving its genre or instrumentation. In this work, we address the problem of controllable music retrieval through embedding-based transformation, where the objective is to retrieve songs that remain similar to a seed track but are modified along one chosen dimension. We propose a novel framework for mood-guided music embedding transformation, which learns a mapping from a seed audio embedding to a target embedding guided by mood labels, while preserving other musical attributes. Because mood cannot be directly altered in the seed audio, we introduce a sampling mechanism that retrieves proxy targets to balance diversity with similarity to the seed. We train a lightweight translation model using this sampling strategy and introduce a novel joint objective that encourages transformation and information preservation. Extensive experiments on two datasets show strong mood transformation performance while retaining genre and instrumentation far better than training-free baselines, establishing controllable embedding transformation as a promising paradigm for personalized music retrieval.

  • Comparative analysis of SVM and logistic regression for classifying diagnostic microRNA signatures in colorectal cancer

    2025-09-20

    article · Open access

    Early and accurate classification of gene signatures is critical for improving colorectal cancer (CRC) diagnosis. While previous studies have applied machine learning to microRNA datasets, few have combined feature selection and extraction methods in a unified diagnostic pipeline. This study proposes a novel integration of Genetic Algorithm (GA) and Independent Component Analysis (ICA) for selecting and extracting relevant features from high-dimensional microRNA data. GA is used as a wrapper-based feature selection method to reduce the original 2457 features to 52, while ICA further transforms these into 12 uncorrelated components. These components are then classified using Support Vector Machine (SVM) and Logistic Regression (LR) models. Using the GA–ICA–SVM pipeline, we achieved an AUC of 0.8347, outperforming the LR model, which achieved an AUC of 0.7318. This approach demonstrates improved performance and efficiency in detecting CRC-related biomarkers and offers a reproducible framework for biomarker-based cancer diagnosis.

  • Balancing Information Preservation and Disentanglement in Self-Supervised Music Representation Learning

    ArXiv.org · 2025-07-30

    preprint · Open access · Senior author

    Recent advances in self-supervised learning (SSL) methods offer a range of strategies for capturing useful representations from music audio without the need for labeled data. While some techniques focus on preserving comprehensive details through reconstruction, others favor semantic structure via contrastive objectives. Few works examine the interaction between these paradigms in a unified SSL framework. In this work, we propose a multi-view SSL framework for disentangling music audio representations that combines contrastive and reconstructive objectives. The architecture is designed to promote both information fidelity and structured semantics of factors in disentangled subspaces. We perform an extensive evaluation on the design choices of contrastive strategies using music audio representations in a controlled setting. We find that while reconstruction and contrastive strategies exhibit consistent trade-offs, when combined effectively, they complement each other; this enables the disentanglement of music attributes without compromising information integrity.

  • Latent Acoustic Mapping for Direction of Arrival Estimation: A Self-Supervised Approach

    ArXiv.org · 2025-07-08

    preprint · Open access · Senior author

    Acoustic mapping techniques have long been used in spatial audio processing for direction of arrival estimation (DoAE). Traditional beamforming methods for acoustic mapping, while interpretable, often rely on iterative solvers that can be computationally intensive and sensitive to acoustic variability. On the other hand, recent supervised deep learning approaches offer feedforward speed and robustness but require large labeled datasets and lack interpretability. Despite their strengths, both methods struggle to consistently generalize across diverse acoustic setups and array configurations, limiting their broader applicability. We introduce the Latent Acoustic Mapping (LAM) model, a self-supervised framework that bridges the interpretability of traditional methods with the adaptability and efficiency of deep learning methods. LAM generates high-resolution acoustic maps, adapts to varying acoustic conditions, and operates efficiently across different microphone arrays. We assess its robustness on DoAE using the LOCATA and STARSS benchmarks. LAM achieves comparable or superior localization performance to existing supervised methods. Additionally, we show that LAM's acoustic maps can serve as effective features for supervised models, further enhancing DoAE accuracy and underscoring its potential to advance adaptive, high-performance sound localization systems.

  • Balancing Information Preservation and Disentanglement in Self-Supervised Music Representation Learning

    2025-10-12

    article · Senior author

  • Latent Multi-view Learning for Robust Environmental Sound Representations

    ArXiv.org · 2025-10-02

    preprint · Open access · Senior author

    Self-supervised learning (SSL) approaches, such as contrastive and generative methods, have advanced environmental sound representation learning using unlabeled data. However, how these approaches can complement each other within a unified framework remains relatively underexplored. In this work, we propose a multi-view learning framework that integrates contrastive principles into a generative pipeline to capture sound source and device information. Our method encodes compressed audio latents into view-specific and view-common subspaces, guided by two self-supervised objectives: contrastive learning for targeted information flow between subspaces, and reconstruction for overall information preservation. We evaluate our method on an urban sound sensor network dataset for sound source and sensor classification, demonstrating improved downstream performance over traditional SSL techniques. Additionally, we investigate the model's potential to disentangle environmental sound attributes within the structured latent space under varied training configurations.

  • Latent Acoustic Mapping for Direction of Arrival Estimation: A Self-Supervised Approach

    2025-10-12

    article · Senior author

  • Towards Few-Shot Training-Free Anomaly Sound Detection

    2025-08-17

    article · Senior author

Frequent coauthors

  • Justin Salamon

    124 shared
  • Vincent Lostanlen

    Centre National de la Recherche Scientifique

    72 shared
  • Andrew Farnsworth

    Cornell University

    57 shared
  • Mark Cartwright

    New York University

    56 shared
  • Rachel Bittner

    55 shared
  • Magdalena Fuentes

    31 shared
  • Ho-Hsiang Wu

    Robert Bosch (United States)

    30 shared
  • Brian McFee

    29 shared

Awards & honors

  • Frontier Award from the National Science Foundation
  • CAREER Award from the National Science Foundation
  • Fulbright Scholar Grant