
About
Yuke Wang is an assistant professor in the Department of Computer Science at Rice University, starting in Fall 2025. He earned his Ph.D. in Computer Science from the University of California, Santa Barbara (UCSB), where he was advised by Professor Yufei Ding. Before that, he received his Bachelor of Engineering in Software Engineering from the University of Electronic Science and Technology of China (UESTC) in 2018. His research focuses on systems for deep learning and parallel programming, with recent projects emphasizing generative AI applications such as large language models (LLMs) and diffusion models, and their acceleration on heterogeneous platforms including GPUs and TPUs. He aims to make deep learning efficient, scalable, and secure, with work spanning deep learning systems, scalable architectures, and secure deep learning frameworks. He has received the NVIDIA Graduate Fellowship (2022-2023), which is awarded to the top 10 global applicants, as well as multiple faculty research awards from Google and Amazon.
Research topics
- Machine Learning
- Computer Science
- Artificial Intelligence
- Computer Security
- Particle physics
- Nuclear physics
- Astronomy
- Engineering
- Computer network
- Distributed computing
- Physics
Selected publications
SCoRE: Benchmarking Long-Chain Reasoning in Commonsense Scenarios
ArXiv.org · 2025-03-08
preprint · Open access
Currently, long-chain reasoning remains a key challenge for large language models (LLMs) because natural texts lack sufficient explicit reasoning data. However, existing benchmarks suffer from limitations such as narrow coverage, short reasoning paths, or high construction costs. We introduce SCoRE (Scenario-based Commonsense Reasoning Evaluation), a benchmark that synthesizes multi-hop questions from scenario schemas of entities, relations, and logical rules to assess long-chain commonsense reasoning. SCoRE contains 100k bilingual (Chinese-English) multiple-choice questions whose reasoning chains span 2-11 hops and are grouped into various difficulty levels. Each question is accompanied by fine-grained knowledge labels, explicit reasoning chains, and difficulty levels for diagnostic evaluation. Evaluation results on cutting-edge LLMs such as o3-mini and DeepSeek R1 show that even the best model attains only 69.78% accuracy on SCoRE (and only 47.91% on the hard set), with errors often stemming from rare knowledge, logical inconsistency, and over-interpretation of simple questions. SCoRE offers a scalable, extensible framework for evaluating and diagnosing the long-chain commonsense reasoning abilities of LLMs and guiding future advances in model design and training.
A Selective Learning Method for Temporal Graph Continual Learning
ArXiv.org · 2025-03-03
preprint · Open access
Node classification is a key task in temporal graph learning (TGL). Real-life temporal graphs often introduce new node classes over time, but existing TGL methods assume a fixed set of classes. This assumption brings limitations, as updating models with full data is costly, while focusing only on new classes results in forgetting old ones. Graph continual learning (GCL) methods mitigate forgetting using old-class subsets but fail to account for their evolution. We define this novel problem as temporal graph continual learning (TGCL), which focuses on efficiently maintaining up-to-date knowledge of old classes. To tackle TGCL, we propose a selective learning framework, Learning Towards the Future (LTF), that substitutes the old-class data with its subsets. We derive an upper bound on the error caused by such replacement and transform it into objectives for selecting and learning subsets that minimize classification error while preserving the distribution of the full old-class data. Experiments on three real-world datasets validate the effectiveness of LTF on TGCL.
S-IRIS: Statistical CSI based Optimization for Irregular RIS using Reinforcement Learning
2025-06-08 · 1 citation
article
In this paper, we investigate the transmission design for irregular reconfigurable intelligent surface (IRIS) downlink communications in 6G multiuser wireless systems. We formulate it as an optimization problem based on the statistical channel state information (CSI), where we jointly optimize the precoding at the base station (BS) and the antenna element selection and phase shifts at the IRIS. In doing so, we derive a closed-form ergodic achievable sum rate expression by leveraging the channel statistical properties. Aiming to maximize this rate, we further propose a deep reinforcement learning (DRL) method via deep deterministic policy gradient (DDPG) for the joint design of antenna activation, beamforming, and power allocation, while properly incorporating the physical constraints on the total transmit power budget. Simulation results verify that our DRL-based algorithm learns an optimal policy from its environment, and outperforms the state-of-the-art benchmarks in real situations, where the statistical CSI varies less and is more stable to obtain than the instantaneous CSI.
Proceedings of the Institution of Mechanical Engineers Part D Journal of Automobile Engineering · 2025-11-27
article
Traffic flow forecasting is a fundamental task in intelligent transportation systems, directly supporting traffic control, congestion mitigation, and urban mobility planning. However, prediction remains difficult owing to nonlinear dynamics, rapidly changing spatiotemporal dependencies, and the integration of heterogeneous data. This paper proposes ASISTGCRN, an attention-based spatiotemporal graph convolutional recurrent network that introduces a tri-cycle segmentation strategy to capture recent, daily, and weekly periodic patterns. Central to the framework is a spatiotemporal block that integrates multi-head temporal and spatial attention with Dynamic Time Warping to account for both dynamic local dependencies and remote functional similarities. Adaptive graph convolution and gated recurrent units are employed to jointly model spatial structures and temporal sequences, while Transformer- and Informer-based attention layers are further applied to capture long-range dependencies, yielding two model variants, T-ASISTGCRN and I-ASISTGCRN. Extensive experiments on four widely used benchmark datasets (PEMS03, PEMS04, PEMS07M, PEMS08) demonstrate that ASISTGCRN consistently outperforms 17 baseline approaches across MAE, RMSE, and MAPE metrics. Ablation studies further verify the contribution of the tri-cycle segmentation, spatiotemporal block, and attention modules. These results indicate that the proposed framework offers improved robustness and accuracy in traffic flow prediction and has practical relevance for the design of advanced traffic management and planning strategies in complex urban environments.
Physics‐guided graph neural network for cable deployment optimization in frame structures
Computer-Aided Civil and Infrastructure Engineering · 2025-11-07 · 4 citations
article
A vortex‐induced vibration warning method based on ensemble‐learning‐embedded neural network
Computer-Aided Civil and Infrastructure Engineering · 2025-11-26
article
Spectrum Prediction via Graph Structure Learning
2024-10-07 · 4 citations
article
With the rapid development of machine learning technologies, data-driven spectrum prediction enables intelligent dynamic spectrum access to alleviate the bottleneck of spectrum resource scarcity and congestion. However, spectrum prediction still faces some key challenges, including how to exploit the implicit but crucial multi-band correlations in wideband spectrum data, and how to capture the temporal dynamics across different bands. Because they ignore these crucial features inherent in spectrum occupancy patterns, existing learning-based spectrum prediction methods suffer from inaccurate predictions. To fill this gap, this paper develops a novel graph convolutional regression neural network (GCRNN) model that introduces efficient graph structure learning (GSL-GCRNN) for dynamic multi-band spectrum prediction. The proposed GSL-GCRNN model is designed to adaptively learn both the multi-band and temporal correlations in dynamic wideband spectrum scenarios. Empowered by the graph structure estimator, graph convolutional networks effectively extract the correlations in the frequency domain, followed by gated recurrent unit networks that further extract the temporal correlations of each band. Notably, the graph structure estimator also enables learning the multi-band correlations across different time periods on the fly, enhancing the accuracy of wideband spectrum prediction in dynamic environments. Simulation results verify that our GSL-GCRNN approach outperforms the benchmark methods.
Value of texture analysis based on R2* map for predicting early recurrence of HCC after hepatectomy
Proceedings on CD-ROM - International Society for Magnetic Resonance in Medicine. Scientific Meeting and Exhibition/Proceedings of the International Society for Magnetic Resonance in Medicine, Scientific Meeting and Exhibition · 2024-08-14
article
Hepatectomy is an important therapeutic method for hepatocellular carcinoma (HCC). The overall survival rate of patients with early recurrence is often lower than that of patients without early recurrence. Therefore, this work aimed to investigate the value of texture analysis based on R2* map for predicting early recurrence of HCC after hepatectomy. Eleven optimal texture features were obtained to predict early recurrence of HCC. This research suggested that R2* map texture analysis had certain predictive value for early recurrence of HCC after hepatectomy, which is valuable for noninvasive, preoperative, and accurate prediction of prognostic factors in clinical practice.
GANFed: GAN-Based Federated Learning with Non-IID Datasets in Edge IoTs
2024-06-09 · 3 citations
article
Federated learning (FL) is a promising distributed learning framework in terms of privacy protection and communication saving. Most existing FL techniques are developed for independent-and-identically-distributed (IID) datasets, but suffer from performance degradation under Non-IID datasets. To cope with this issue, most existing work designs solutions from data perspectives (e.g., sharing some data samples between local devices) to eliminate the heterogeneity of distributed datasets, which causes extra communication overhead and may expose user privacy, contradicting FL's original intention. Unlike the existing data-based methods, we propose a generative adversarial network (GAN) based FL, named GANFed, which is designed from a feature perspective. Specifically, we embed a discriminator into the FL network, which works with the shallow layers as a generator to form a GAN in FL. By incorporating such a GAN, the output of the shallow layers tends to present more IID features compared with the original Non-IID input data. These extracted features from the shallow layers are then used to train the deep layers of the FL network. In this way, the proposed GANFed reduces the weight divergence of the local models, and hence improves the performance of FL. Without data exchange, our GANFed avoids the leakage of user privacy and reduces the communication overhead. Experimental results show that our GANFed outperforms the standard FedAvg on Non-IID datasets in terms of test accuracy.
2024-02-01 · 1 citation
article · 1st author · Corresponding
Deep Reinforcement Learning (DRL) enhances the efficiency of Autonomous Vehicles (AV), but also makes them susceptible to backdoor attacks that can result in traffic congestion or collisions. Backdoor functionality is typically incorporated by contaminating training datasets with covert malicious data to maintain high precision on genuine inputs while inducing the desired (malicious) outputs for specific inputs chosen by adversaries. Our proposed method adds well-designed noise to the input to neutralize backdoors. The approach involves learning an optimal smoothing (noise) distribution to preserve the normal functionality of genuine inputs while neutralizing backdoors. By doing so, the resulting model is expected to be more resilient against backdoor attacks while maintaining high accuracy on genuine inputs. The effectiveness of the proposed method is verified on a simulated traffic system based on a microscopic traffic simulator, where experimental results show that the smoothed traffic controller can neutralize all trigger samples while maintaining its performance in relieving traffic congestion.
Frequent coauthors
- 18 shared
Yingyan Lin
- 13 shared
Zhangyang Wang
- 11 shared
Chaojian Li
Georgia Institute of Technology
- 11 shared
Yang Zhao
- 9 shared
Yonggan Fu
Georgia Institute of Technology
- 6 shared
Haoran You
- 6 shared
Pengfei Xu
- 5 shared
Xiaohan Chen
Awards & honors
- NVIDIA Graduate Fellowship 2022