
Kunpeng Zhang
University of Maryland, College Park · Decision, Operations & Information Technologies
Active 2007–2026
About
Kunpeng Zhang is an Associate Professor at the Robert H. Smith School of Business. He received his Ph.D. in Computer Science from the McCormick School of Engineering at Northwestern University in 2013. Prior to his current position, he worked as an assistant professor of information and decision science at the University of Illinois, Chicago from 2013 to 2015. His research focuses on applying scalable machine learning, natural language processing, and social network analysis techniques to big data problems in business and healthcare. His work has been published in business journals and computer science conferences, and he has presented at notable conferences such as INFORMS and ICIS. He teaches Python programming in the undergraduate program and big data analytics in the graduate program.
Research topics
- Artificial Intelligence
- Data Mining
- Computer Science
- World Wide Web
- Mathematics
- Data Science
- Statistics
- Advertising
- Business
Selected publications
Corrigendum: Score-based Graph Learning for Urban Flow Prediction
ACM Transactions on Intelligent Systems and Technology · 2026-05-14
Article · Open access
This is a corrigendum for the article "Score-based Graph Learning for Urban Flow Prediction" published in ACM Trans. Intell. Syst. Technol. 15, 3, Article 59 (May 2024), 25 pages.
Data and Methods for Identifying Artificial Intelligence-Related Patents
SSRN Electronic Journal · 2026-01-01
Preprint · Open access · Senior author
INFORMS Journal on Computing · 2025-08-01
Article
Community search (CS) is a fundamental problem in graph mining, where the goal is to find communities that are relevant to a given query. Traditional CS methods often struggle with scalability and effectiveness, especially when dealing with large graphs or complex queries.
Divide and Contrast: A Text-Based Method for Firm Market Risk Prediction
INFORMS Journal on Computing · 2025-04-25
Article · Senior author
Forecasting the market risk for publicly traded companies is a critical task for market participants. Financial economics research demonstrates that the textual information contained in corporate disclosures, such as earnings conference call transcripts, can effectively predict a firm’s future risk. This finding has inspired a growing body of research focused specifically on transcript-based approaches to risk forecasting. However, earnings transcripts are typically long documents with thousands of words. Prior transcript-based risk forecasting studies that represent the entire transcript as one text sequence often fail to capture risk-relevant information and fall short in risk forecasting. In this work, we propose a novel divide-and-contrast machine learning method for predicting risks from earnings conference call transcripts. We exploit the semistructured nature of an earnings transcript and decompose it into several semantically coherent conversation units, ranging from the finest grained question–answer pair level to the coarsest grained transcript level. We then propose contrastive learning objectives as an auxiliary task to the risk forecasting objective, facilitating the learning of risk-relevant information from the earnings transcripts. We conduct experiments on a data set of U.S. market earnings call transcripts. The experimental results show that our proposed divide-and-contrast method substantially outperforms state-of-the-art methods by significantly reducing errors in risk forecasting. This paper sheds light on extracting informative insights from lengthy financial documents to support informed decision making.
History: Accepted by Ram Ramesh, Area Editor for Data Science & Machine Learning.
Supplemental Material: The software that supports the findings of this study is available within the paper and its Supplemental Information (https://pubsonline.informs.org/doi/suppl/10.1287/ijoc.2023.0195) as well as from the IJOC GitHub software repository (https://github.com/INFORMSJoC/2023.0195). The complete IJOC Software and Data Repository is available at https://informsjoc.github.io/.
ACM Transactions on Intelligent Systems and Technology · 2025-08-26 · 1 citation
Article · Open access
Most start-ups fail, and early-stage ventures face even lower survival rates. Identifying high-potential start-ups remains a critical challenge for venture capital (VC) investors and policymakers. While predictive models exist, the evolving relationships between VC investors, start-ups, and management teams in dynamic networks are underexplored. We propose a method to predict whether a start-up will succeed within 5 years of its first funding round. Using a 40-year global VC dataset, we model the VC ecosystem as a dynamic bipartite network linking start-ups to individuals (investors/managers). Our approach incrementally updates graph embeddings through unsupervised self-attention to incorporate new nodes, edges, and their neighbors. Node embeddings are further fine-tuned via link prediction and classification tasks, while temporal dependencies are captured to form sequential representations. The model identifies early-stage start-ups with twice the success likelihood of those chosen by professional investors. Key factors including networking and education align with VC literature. Additionally, we provide model complexity analysis and open source our implementation to support practical applications and future research.
Progressive Dependency Representation Learning for Stock Ranking in Uncertain Risk Contrasting
2025-04-04 · 2 citations
Article
Self-correcting convolution segmentation of infected lung regions algorithm
Journal of Computational Methods in Sciences and Engineering · 2025-06-17
Article · Senior author
This paper introduces a novel self-correcting convolution module that significantly differs from traditional convolution techniques by enabling multi-scale feature extraction through heterogeneous kernel processing and feature calibration. The module uniquely combines down-sampling and up-sampling operations within a single convolution block to capture both local and global contexts, addressing key limitations in existing methods. In addition, an improved Dice loss function is proposed, which integrates both under- and over-segmentation penalties through an mDice loss. This is combined with a cross-entropy loss based on classification to optimize segmentation performance. The proposed self-correcting convolution segmentation algorithm demonstrates superior accuracy in segmenting lung infection regions compared to existing methods, particularly the AMSU-Net network. Experimental results indicate that the inclusion of multi-scale spatial information and refined loss functions significantly enhances segmentation precision. The novelty of this research lies in the introduction of a self-correcting convolution module that improves the receptive field and the diversity of extracted features. Furthermore, the enhanced mDice loss function, integrating segmentation penalties, contributes to improved model performance. This method offers a promising advancement in lung infection segmentation using deep learning techniques.
Food Chemistry X · 2025-06-16
Article · Open access · Corresponding
The physicochemical properties, chemical compositions, antioxidant activities, and volatile profiles of fragrant Semen Trichosanthis oil (FSTO) extracted by cold pressing (CP), solvent extraction (SE), subcritical n-butane extraction (SBE), and supercritical CO₂ extraction (SPE) were systematically investigated. Unsaturated fatty acids predominated in the oils (89.57–92.73 %), primarily trichosanic acid, linoleic acid, and oleic acid, with SBE oil containing the highest proportion (92.73 %). CP oil showed the highest phytosterol content (434.72 mg/100 g) and total flavonoid content, alongside the strongest antioxidant activity. In contrast, SBE oil had the greatest total phenolic content (62.08 μg GAE/g) and oxidative stability (1.24 h). All oils exhibited Newtonian flow behavior, with SBE and SPE oils exhibiting the lowest activation energy. A total of 108 volatile organic compounds were identified by gas chromatography-ion mobility spectrometry, with SBE and SPE oils displaying more complex and abundant profiles, while CP oil better retained natural aroma compounds. These results demonstrate that extraction method significantly influences the composition, functionality, and volatile profile of FSTOs. SBE achieved a favorable balance between oil yield, quality, and sustainability.
- Four extraction methods for fragrant Semen Trichosanthis oils (FSTOs) were compared.
- FSTOs are rich in UFAs, particularly trichosanic acid and linoleic acid.
- Cold pressing oil exhibited the highest phytosterol content and antioxidant activity.
- Subcritical n-butane extracted oil had the highest phenolics and oxidative stability.
- Gas chromatography-ion mobility spectrometry identified 108 volatile compounds.
Beyond Isolated Investor: Predicting Startup Success via Roleplay-Based Collective Agents
arXiv (Cornell University) · 2025-12-27
Preprint · Open access
Due to the high value and high failure rates of startups, predicting their success is a critical challenge. Existing approaches typically model startup success from a single decision-maker's perspective, overlooking the collective dynamics that dominate real-world venture capital (VC) decision-making. We propose SimVC-CAS, a collective agent system that simulates VC decisions as a multi-agent interaction process. By designing role-playing agents and a GNN-based supervised interaction module, we reformulate startup financing prediction as a group decision-making task, capturing both enterprise fundamentals and investor network dynamics. Each agent represents an investor with distinct traits and preferences, enabling heterogeneous evaluations and realistic information exchange over a graph-structured co-investment network. Using both proprietary and public VC data with strict anti-leakage controls, we show that SimVC-CAS significantly improves predictive performance, achieving approximately 25% relative improvement in average precision@10, while exhibiting consistency with real investor decisions. The interaction mechanism is particularly effective for network-central startups, confirming the importance of the network in VC decision-making. Analysis of agents' reasoning for decision changes further reveals how the network environment influences decision quality, demonstrating the system's interpretability. Our approach may generalize to broader group decision-making scenarios.
Disentangling Inter- and Intra-Cascades Dynamics for Information Diffusion Prediction
IEEE Transactions on Knowledge and Data Engineering · 2025-05-09 · 14 citations
Article
Information diffusion prediction is a vital component for a wide range of social applications, including viral marketing identification and precise recommendation. Prior methods focus on modeling contextual information from a single cascade, ignoring rich collaborative information behind historical interactions across various cascades and future data within the cascade. Leveraging such interactions can substantially enhance diffusion prediction performance but presents two major challenges: (1) user intents are usually entangled behind historical interactions; and (2) utilizing future data may introduce severe training-inference discrepancies. We present MIM, a novel information diffusion model merging multi-scale interactions for improving user intent learning and behavior retrieval. Specifically, we convert cascades and social relations into multi-channel hypergraphs, where each channel depicts a common fine-grained user intent behind historical interactions across cascades. By aggregating embeddings learned through multiple channels, we obtain comprehensive intent representations. Second, we decouple past- and future-level temporal influences within a cascade via a dual temporal network. Then we implement past-future knowledge transferring to enhance the knowledge learnt from the dual network via hierarchical knowledge distillation. Extensive experiments conducted on four datasets demonstrate that MIM significantly outperforms various benchmarks.
Frequent coauthors
- Fan Zhou · 60 shared
- Ting Zhong · University of Electronic Science and Technology of China · 47 shared
- Goce Trajcevski · Iowa State University · 44 shared
- Jitao Wang · Hebei Medical University · 14 shared
- Qiang Gao · Southwestern University of Finance and Economics · 14 shared
- Xovee Xu · University of Electronic Science and Technology of China · 13 shared
- Yongan Chen · Chinese People's Liberation Army · 12 shared
- Harry Jiannan Wang · University of Delaware · 9 shared
Education
- 2013 · Ph.D., Computer Science · Northwestern University