
Mallesham Dasari
Verified · Northeastern University · Electrical and Computer Engineering
Active 2014–2026
About
Mallesham Dasari is an Assistant Professor in the Department of Electrical and Computer Engineering at Northeastern University, which he joined in January 2024. He holds a PhD in Computer Science from Stony Brook University, earned in 2021. His research focuses on networked systems and optimization for immersive Extended Reality (XR) and spatial computing. He leads the Spatial Intelligence Research Group (SINRG), which develops systems and algorithms for spatial intelligence in computing, covering areas such as advanced 3D content compression, real-time capture and distribution of 4D scenes, and the design of open-source XR hardware. His work aims to advance the entire pipeline of XR technologies, contributing to the development of immersive experiences and spatial computing systems.
Research topics
- Computer Science
- Artificial Intelligence
- Real-time computing
- Computer network
- Multimedia
- Embedded system
- Operating system
- Computer vision
Selected publications
ISM: Intelligent Multi-Path Scheduler for Multi-Camera Networked Systems
2026-04-04
article · Senior author
N4MC: Neural 4D Mesh Compression
arXiv (Cornell University) · 2026-02-23
article · Open access · Senior author
We present N4MC, the first 4D neural compression framework to efficiently compress time-varying mesh sequences by exploiting their temporal redundancy. Unlike prior neural mesh compression methods that treat each mesh frame independently, N4MC takes inspiration from inter-frame compression in 2D video codecs, and learns motion compensation in long mesh sequences. Specifically, N4MC converts consecutive irregular mesh frames into regular 4D tensors to provide a uniform and compact representation. These tensors are then condensed using an auto-decoder, which captures both spatial and temporal correlations for redundancy removal. To enhance temporal coherence, we introduce a transformer-based interpolation model that predicts intermediate mesh frames conditioned on latent embeddings derived from tracked volume centers, eliminating motion ambiguities. Extensive evaluations show that N4MC outperforms state-of-the-art in rate-distortion performance, while enabling real-time decoding of 4D mesh sequences. The implementation of our method is available at: https://github.com/frozzzen3/N4MC.
SceneHub4D: A Dataset and Evaluation Framework for 6-DoF 4D VR Scenes
IEEE Transactions on Visualization and Computer Graphics · 2026-03-30
article
Volumetric video and 6-DoF scene capture are becoming central to immersive applications such as telepresence and mixed reality content delivery. However, existing volumetric datasets are often short in duration, restricted to studio-captured human subjects, and provide only limited geometric representations. Consequently, evaluating real-world immersive applications in full-scene contexts often necessitates custom capture and 3D reconstruction setups, creating high practical barriers and ultimately hindering reproducibility. To this end, we present SceneHub4D, a new dataset and evaluation framework. Our dataset captures long, dynamic sequences across diverse real-world indoor environments with synchronized multi-view RGB-D streams, calibrated camera poses, and high-resolution background geometry reconstructed via photogrammetry and LiDAR. We provide multiple 3D representations, including point clouds, textured meshes, and Gaussian splats, along with a software toolkit for format conversion, rendering, and metric evaluation. To support structured comparison and perceptual analysis, we provide supplementary metrics including Geometry Complexity Score and Volumetric Temporal Information, and evaluate rendering performance across desktop GPUs and VR headsets. By lowering the practical barriers to capture, reconstruction, and evaluation, SceneHub4D enables researchers to study immersive 3D streaming and rendering systems without requiring custom hardware setups or complex data collection pipelines. We expect it will serve as a useful foundation for advancing volumetric media research.
4DGStream: Variable Bitrate Dynamic Gaussian Splatting Streaming
IEEE Transactions on Multimedia · 2026-01-01
article
While 3D Gaussian Splatting (3DGS) has revolutionized static scene representation, the extension to dynamic scenes, i.e., 3DGS video (GSV), faces challenges related to reconstruction quality, rendering speed, and storage requirements. The substantial data volume of current GSV poses significant hurdles for streaming applications, particularly in the realm of AR, VR and MR. To tackle these challenges, we introduce 4DGStream, a novel framework that integrates an efficient GSV compression method, Light4D, and a bitrate adaptation streaming strategy, QoSmooth, to ensure smooth playback while maintaining high visual quality. Light4D employs a binarization-assisted spatiotemporal deformation network to model the deformation of Gaussian primitive attributes over time, while a spatiotemporal-aware masking module prunes trivial Gaussians, further enhancing long-term reconstruction quality. To reduce storage, Light4D uses a binary hash grid to model the entropy of attributes for arithmetic coding, with its binary nature allowing efficient entropy modeling via a Bernoulli distribution. These components enable Light4D to improve the FPS/Storage metric by up to 12.4× over SpacetimeGS and 26.4× over 4DGS on the Neu3D dataset, with performance gains exceeding 3× orders of magnitude compared to other NeRF-based state-of-the-art (SOTA) methods. Here, FPS/Storage reflects the balance between rendering speed and data storage. Despite significant model size reductions, Light4D maintains or surpasses the reconstruction quality of 4DGS. Furthermore, QoSmooth provides effective rate control to enhance playback smoothness, reducing bitrate level switches by 61.6% and increasing time-average utility by 26.2%. All these improvements make 4DGStream highly suited for GSV streaming, improving QoE by 36.7% compared to SOTA methods.
LMG: Efficient Streaming of Layered Mesh-Gaussian 3D Scenes
2026-04-04
article
2026-04-04
article · Open access
The scalability of multi-user SLAM is fundamentally limited by the constrained network and computational resources. Existing approaches either focus on SLAM for single-user scenarios or overload networks and servers by streaming dense, uniform camera data and treating all users equally. This results in poor pose estimation accuracy or slow updates to multiple users. Our key insight is that the sparsity and heterogeneity of user activity reveal that not all users or frames contribute equally to the shared map. Building on this, we propose SPARC - Proximity-aware Scheduling of AR Mapping and a blur-aware adaptive Cloud-based GenAI sampling method, which together form a cloud-native framework for efficient multi-user SLAM. On the client side, adaptive, context-aware frame transmission selectively forwards high-value frames. On the server side, generative AI (GenAI)-based upsampling reconstructs dense scene features from sparse inputs, while a proximity-aware scheduler prioritizes updates for users with higher drift or critical interactions. Together, these components reduce redundant transmission, improve resource allocation, and enable fairness without sacrificing accuracy. We show through extensive experimentation that our method reduces the latency by 2× to 4× compared to state-of-the-art while maintaining similar or better tracking accuracy. More broadly, this work reimagines SLAM as a cloud-native service, paving the way for scalable, real-time AR/VR applications where many users seamlessly interact in shared environments.
N4MC: Neural 4D Mesh Compression
Open MIND · 2026-02-23
preprint · Senior author
Invited Talk: Time-Varying Mesh Compression
2025-12-01
article · Senior author
The proliferation of immersive applications such as telepresence, AR/VR streaming, and 3D digital twins demands efficient capture, compression, and delivery of dynamic 3D scenes. Time-varying meshes (TVMs), sequences of meshes with evolving geometry and topology, are a compact and expressive format for volumetric video, but their high data rates and irregular structure make real-time streaming challenging. This paper presents an overview of the compression of TVMs. The talk also includes two complementary contributions from our group: TVMC, a time-varying mesh compression framework for deformable objects utilizing volume-tracked references, and an extension designed for full, unbounded scene meshes with both static and dynamic parts. Together, these systems demonstrate high compression ratios, low decoding latencies, and robustness to real-world scene complexity, paving the way for scalable, low-bitrate 3D video streaming.
TVMC: Time-Varying Mesh Compression Using Volume-Tracked Reference Meshes
2025-03-26 · 4 citations
article · Open access · Senior author
Time-varying meshes (TVMs), characterized by their varying connectivity and number of vertices, hold significant potential in AR/VR applications. However, their practical use is challenging due to their large file sizes and the complexity of time-varying topology. Many time-varying mesh compression methods have attempted to exploit redundancy between consecutive meshes to compress TVMs more efficiently; however, most face difficulties in establishing stable vertex and surface correspondence between the frames of a TVM. We propose TVMC, a novel TVM compression method that leverages volume tracking and extracts high-quality reference meshes for inter-frame prediction. Specifically, we use as-rigid-as-possible volume tracking to align consecutive TVMs and track volume centers, followed by multidimensional scaling to refine reference centers. This allows us to precisely deform a group of frames to the reference space and extract the reference mesh, which is then deformed to approximate each mesh in the group to get displacement fields for TVM compression. Extensive experiments show that TVMC outperforms state-of-the-art methods (e.g., Google Draco and V-DMC 4.0), with bitrates of 4-6 Mbps compared to 9-12 Mbps for Draco and 10-15 Mbps for V-DMC 4.0. It reduces the decoding time by 66.1% compared to Draco and enables an increased group of frames (up to 15) without significant distortion.
XRFab: Immersive Cleanroom Training with Digital Twins and XR for Semiconductor Manufacturing
2025-10-19
article · Senior author
Semiconductor manufacturing requires high-precision workflows performed in controlled cleanroom environments, making operator training both essential and logistically challenging. Traditional training approaches like classroom/video lectures or equipment simulations often fail to capture the spatial complexity, safety constraints, and decision-making properties of cleanroom tasks like silicon wafer handling. In this paper, we present XRFab, an immersive training system that integrates Extended Reality (XR) with Digital Twin modeling to provide a data-driven, interactive cleanroom simulation. XRFab enables users to experience and manipulate cleanroom workflows under dynamic environmental conditions. We have built a prototype of XRFab using the Meta Quest 3 headset and the Unity platform with haptic feedback. Our system lays the groundwork for safe and low-cost training tools for semiconductor fabrication and highlights a promising direction for merging XR and Digital Twin technologies in industrial education.
Frequent coauthors
- Samir R. Das · Stony Brook University · 13 shared
- Arani Bhattacharya · 8 shared
- Anthony Rowe · 8 shared
- Himanshu Sindhwal · 4 shared
- Pranjal Sahu · Stony Brook University · 4 shared
- Aruna Balasubramanian · Stony Brook University · 4 shared
- Santiago Vargas · Stony Brook University · 3 shared
- Karthikeyan Sundaresan · Georgia Institute of Technology · 3 shared
Labs
Spatial Intelligence Research Group · PI
Education
- 2021 · PhD, Computer Science · Stony Brook University
Awards & honors
- ACM Multimedia Systems Conference (MMSys) Best Reproducible…
- IEEE Transactions on Multimedia (2025)
- Journal of Entertainment Computing (2025)
- Spring 2026 PEAK Experiences Awardees for Undergrad Research
- Spring 2025 PEAK Experiences Awardees for Undergrad Research