
Yifan Sun
Assistant Professor · Verified · Stony Brook University · Mathematics
Active 2013–2026
About
Yifan Sun is an Assistant Professor in the Department of Computer Science at Stony Brook University. She received her PhD in Electrical Engineering from the University of California, Los Angeles in 2015, where her research focused on convex optimization and semidefinite programming. Following her doctoral studies, she worked at Technicolor Research and Innovation, concentrating on machine learning and data science applications. She has also completed two postdoctoral positions, one at the University of British Columbia in Vancouver, Canada, and another at L'Institut National de Recherche en Informatique et en Automatique (INRIA) in Paris, France.
Research topics
- Computer Science
- Artificial Intelligence
- Political Science
- Business
- Data Mining
- Engineering
- Psychology
- Computer Security
- Sociology
- Social Science
- Public relations
- Medicine
- Parallel computing
- Economics
- Internet privacy
- Marketing
- Social psychology
- Management science
- Microeconomics
- Industrial organization
- Geography
- Human–computer interaction
- Mathematics
- Data science
Selected publications
HMS-BERT: Hybrid Multi-Task Self-Training for Multilingual and Multi-Label Cyberbullying Detection
arXiv (Cornell University) · 2026-03-13
preprint · Open access
Cyberbullying on social media is inherently multilingual and multi-faceted, where abusive behaviors often overlap across multiple categories. Existing methods are commonly limited by monolingual assumptions or single-task formulations, which restrict their effectiveness in realistic multilingual and multi-label scenarios. In this paper, we propose HMS-BERT, a hybrid multi-task self-training framework for multilingual and multi-label cyberbullying detection. Built upon a pretrained multilingual BERT backbone, HMS-BERT integrates contextual representations with handcrafted linguistic features and jointly optimizes a fine-grained multi-label abuse classification task and a three-class main classification task. To address labeled data scarcity in low-resource languages, an iterative self-training strategy with confidence-based pseudo-labeling is introduced to facilitate cross-lingual knowledge transfer. Experiments on four public datasets demonstrate that HMS-BERT achieves strong performance, attaining a macro F1-score of up to 0.9847 on the multi-label task and an accuracy of 0.6775 on the main classification task. Ablation studies further verify the effectiveness of the proposed components.
Curvature on Graphs with Negative Edge Weights
IEEE Transactions on Network Science and Engineering · 2026-01-01
article · Open access
graph and of graph frustration, to capture antagonistic effects of signed edge weights modeling promotion (+) or inhibition (-); a balanced graph is one where every cycle has an even number of negative edge weights, and frustration quantifies the degree of deviation from a balanced graph. Based on these concepts, we introduce modified Ollivier-Ricci-inspired fragility indices that point to pathways that magnify frustration in unbalanced graphs. We study two types of networks, gene regulatory and social networks, to demonstrate the utility of the fragility indices to impede or enhance functionality with respect to graph frustration. Our results demonstrate that, indeed, these new indices better identify critical edges, as quantified by several global measures, than other commonly used indices.
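The balance condition quoted in this abstract (every cycle contains an even number of negative edges) can be illustrated with a short sketch. It is not the paper's fragility index and is not taken from the paper. It relies on the standard equivalent characterization that a signed graph is balanced exactly when its nodes can be split into two groups with positive edges inside groups and negative edges between them. The function name `is_balanced` and the `(u, v, sign)` edge format are assumptions made for this example.

```python
from collections import deque


def is_balanced(num_nodes, signed_edges):
    """Return True if the signed graph has no cycle with an odd number of negative edges.

    signed_edges is a list of (u, v, sign) tuples (hypothetical format),
    where sign > 0 means promotion and sign < 0 means inhibition.
    """
    adjacency = [[] for _ in range(num_nodes)]
    for u, v, sign in signed_edges:
        adjacency[u].append((v, sign))
        adjacency[v].append((u, sign))

    side = [None] * num_nodes  # group label (0 or 1) per node
    for start in range(num_nodes):
        if side[start] is not None:
            continue
        side[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v, sign in adjacency[u]:
                # Positive edges keep the group, negative edges flip it.
                expected = side[u] if sign > 0 else 1 - side[u]
                if side[v] is None:
                    side[v] = expected
                    queue.append(v)
                elif side[v] != expected:
                    return False  # a cycle with an odd number of negative edges
    return True


if __name__ == "__main__":
    # Triangle with two negative edges: balanced.
    print(is_balanced(3, [(0, 1, 1), (1, 2, -1), (0, 2, -1)]))
    # Triangle with one negative edge: not balanced.
    print(is_balanced(3, [(0, 1, 1), (1, 2, 1), (0, 2, -1)]))
```

The check only answers yes or no; measuring how far a graph is from balance (frustration) is a harder problem and is not attempted here.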
The Ultrasound Journal · 2025-10-14 · 1 citation
article · Open access
PURPOSE: Utilizing ultrasonic imaging technology, this study assessed and compared the thickness and elasticity features of the abdominal and spinal back muscles in patients with idiopathic scoliosis to those of healthy individuals. The objective was to elucidate the mechanical adaptations in spinal muscles among IS patients. METHODS: This cross-sectional study included 38 patients diagnosed with idiopathic scoliosis and 33 healthy controls. Outcome measures comprised the Cobb angle, spinal curvature, muscle thickness, and muscle elasticity. Ultrasound elastography imaging was employed to assess the thickness and elasticity of the erector spinae, rectus abdominis, external oblique, and transverse abdominis muscles bilaterally at corresponding spinal levels. The objective was to document and compare the ultrasonic imaging characteristics of these muscles in individuals with idiopathic scoliosis and in the normal population. RESULTS: The study findings indicated that idiopathic scoliosis patients had significantly lower body weight than the control group, with C7-CSVL notably greater in the idiopathic scoliosis group than in healthy individuals. Muscle thickness was substantially reduced on both the concave and convex sides at T6, T10, and L3 levels of the erector spinae, as well as in the rectus abdominis (RA) and transverse abdominis (TrA) muscles, relative to the normal cohort. Additionally, idiopathic scoliosis patients exhibited increased elasticity in the erector spinae muscle on the convex side at T6, while the elasticity of the erector spinae muscle on the concave side at L3 was significantly lower compared to healthy individuals.
CONCLUSIONS: This study, utilizing ultrasound elastography imaging technology, unveiled distinct features in individuals with mild idiopathic scoliosis, including decreased muscle thickness in the erector spinae at T6, T10, and L3 levels, as well as heightened elasticity in the thoracic region and reduced elasticity in the lumbar region. The findings presented in this study provide insights for diagnostic strategies in individuals with early-stage scoliosis.
Pseudo Label Learning for Partial Point Cloud Registration
IEEE Transactions on Visualization and Computer Graphics · 2025-08-19
article
Partial point cloud registration plays a crucial role in computer vision and has widespread applications in 3D map construction, pose estimation, and high-precision localization. However, the collected point clouds often contain missing data due to hardware limitations and complex environments. Various partial registration algorithms have been proposed, most of which rely on estimating overlap regions. However, a significant proportion of these algorithms rely heavily on ground truth labels. Manual labeling is both time-consuming and labor-intensive, whereas algorithmic automatic labeling lacks sufficient accuracy. To tackle this issue, we present PSEudo Label learning for unsupervised partial point cloud registration (PSEL). This method utilizes complementary tasks to learn reliable pseudo labels for overlap regions and correspondences without depending on ground truth labels. The key idea is to use the complementarity between overlap estimation and registration to generate two types of pseudo labels based on the nearest points in pairs of aligned point clouds. These pseudo labels are then employed to supervise the learning of overlap regions and correspondences, gradually enhancing their accuracy throughout the learning process and ultimately establishing an unsupervised learning framework. PSEL consists of an overlap estimation module and a correspondence filtering module. The pseudo labels generated after registration are used to supervise both modules. Notably, the correspondence filtering module has two pipelines. The similarity and difference of the corresponding point features are used to eliminate false correspondences during the training and inference stages, respectively, with only the latter being optimized with pseudo labels. To validate the effectiveness of our registration method, we conducted experiments using the synthetic dataset ModelNet40, the indoor dataset 3DMatch, and the outdoor dataset KITTI.
Prototype-Enhanced Few-Shot Relation Extraction Method Based on Cluster Loss Optimization
Symmetry · 2025-10-07
article · Open access
The purpose of few-shot relation extraction (RE) is to recognize the relationship between specific entity pairs in text when there are a limited number of labeled samples. A few-shot RE method based on a prototype network, which constructs relation prototypes by relying on the support set to assign labels to query samples, inherently leverages the symmetry between support and query processing. Although these methods have achieved remarkable results, they still face challenges such as the misjudging of noisy samples or outliers, as well as distinguishing semantic similarity relations. To address the aforementioned challenges, we propose a novel semantic enhanced prototype network, which can integrate the semantic information of relations more effectively to promote more expressive representations of instances and relation prototypes, so as to improve the performance of the few-shot RE. Firstly, we design a prompt encoder to uniformly process different prompt templates for instance and relation information, and then utilize the powerful semantic understanding and generation capabilities of large language models (LLMs) to obtain precise semantic representations of instances, their prototypes, and conceptual prototypes. Secondly, graph attention learning techniques are introduced to effectively extract specific-relation features between conceptual prototypes and isomorphic instances while maintaining structural symmetry. Meanwhile, a prototype-level contrastive learning strategy with bidirectional feature symmetry is proposed to predict query instances by integrating the interpretable features of conceptual prototypes and the intra-class shared features captured by instance prototypes. In addition, a clustering loss function was designed to guide the model to learn a discriminative metric space with improved relational symmetry, effectively improving the accuracy of the model’s relationship recognition.
Finally, the experimental results on the FewRel1.0 and FewRel2.0 datasets show that the proposed approach delivers improved performance compared to existing advanced models in the task of few-shot RE.
ArXiv.org · 2025-11-13
preprint · Open access
Spatio-temporal graphs are powerful tools for modeling complex dependencies in traffic time series. However, the distributed nature of real-world traffic data across multiple stakeholders poses significant challenges in modeling and reconstructing inter-client spatial dependencies while adhering to data locality constraints. Existing methods primarily address static dependencies, overlooking their dynamic nature and resulting in suboptimal performance. In response, we propose Federated Spatio-Temporal Graph with Dynamic Inter-Client Dependencies (FedSTGD), a framework designed to model and reconstruct dynamic inter-client spatial dependencies in federated learning. FedSTGD incorporates a federated nonlinear computation decomposition module to approximate complex graph operations. This is complemented by a graph node embedding augmentation module, which alleviates performance degradation arising from the decomposition. These modules are coordinated through a client-server collective learning protocol, which decomposes dynamic inter-client spatial dependency learning tasks into lightweight, parallelizable subtasks. Extensive experiments on four real-world datasets demonstrate that FedSTGD achieves superior performance over state-of-the-art baselines in terms of RMSE, MAE, and MAPE, approaching that of centralized baselines. Ablation studies confirm the contribution of each module in addressing dynamic inter-client spatial dependencies, while sensitivity analysis highlights the robustness of FedSTGD to variations in hyperparameters.
Luthier: A Dynamic Binary Instrumentation Framework Targeting AMD GPUs
2025-05-11
article
Dynamic binary instrumentation (DBI) is a widely used technique for collecting detailed, fine-grained information from program execution without requiring recompilation or access to the program's source code. DBI provides several benefits over static instrumentation, including full code discovery and the ability to selectively toggle profiling during runtime. Luthier is an open-source DBI framework targeting AMD GPUs, designed to integrate and run seamlessly on the ROCm software stack. During runtime, Luthier allows inspection of loaded GPU code objects and carries out instrumentation by either manually modifying instructions or inserting calls to special device functions (i.e., 'hooks') at user-specified locations in the program. Luthier hooks allow inspection and modification of the device visible state, and can communicate with the host via host-accessible device memory buffers. Luthier also supports switching between instrumented and un-instrumented versions of a kernel. In this paper, we describe some of the key design challenges we encountered when developing this open-source DBI framework. We then showcase Luthier's user-facing APIs and internal components, providing example use cases implemented using our framework. While Luthier incurs a 50X runtime overhead (on average) when running an instrumented application, this overhead is 10 times lower as compared to the state-of-the-art GPU-based DBI framework, when running equivalent tools on the same workload written in CUDA.
TrioSim: A Lightweight Simulator for Large-Scale DNN Workloads on Multi-GPU Systems
2025-06-20 · 2 citations
article · Open access · Senior author
Journal of Environmental Management · 2025-05-15 · 2 citations
article
Frequent coauthors
- David Kaeli, Northeastern University (35 shared)
- Yixuan Zhang (18 shared)
- John Kim (18 shared)
- Trinayan Baruah (16 shared)
- José Luis Abellán, Universidad de Murcia (16 shared)
- Ajay Joshi (14 shared)
- Saiful A. Mojumder (12 shared)
- Andrea G. Parker, Georgia Institute of Technology (12 shared)
Labs
Institute for Advanced Computational Science · PI
Education
- 2005: Ph.D., Computer Science, University of California, Los Angeles
- 2001: M.S., Computer Science, University of California, Los Angeles
- 1998: B.S., Computer Science, University of Science and Technology of China