Jierui Peng

PhD Student

PhD student in Computer Science at Case Western Reserve University, working on embodied AI at the intersection of computer vision and machine learning, with a focus on world modeling and real-time intelligent systems. Research Assistant at University Hospitals, exploring clinically grounded AI for perception, reasoning, and decision-making in real-world environments.

Education

B.S., Brandeis University, 2021
M.S., New York University, 2023
Ph.D., Case Western Reserve University, current

I am a PhD student in Computer Science at Case Western Reserve University, working at the intersection of embodied AI, computer vision, and machine learning. I build intelligent systems that can perceive, reason, and act in the real world, with an emphasis on world modeling and efficient real-time inference. I am a member of the VU Lab, co-advised by Prof. Vipin Chaudhary and Prof. Yu Yin, and a Research Assistant at University Hospitals, supervised by Dr. Sanjay Rajagopalan.

My research aims to bridge the gap between high-level understanding and real-world execution. I develop systems that integrate structured reasoning, language grounding, causal understanding, and world modeling to make perception–action pipelines more efficient, reliable, interpretable, and generalizable in real-world environments.

Current Projects

NEBULA: Diagnostic and Robust Evaluation for Vision-Language-Action Systems

This project introduces NEBULA, a unified ecosystem for evaluating Vision-Language-Action (VLA) agents through a dual-axis framework that disentangles capability and robustness. It addresses the limitations of traditional end-task success metrics by proposing fine-grained capability tests with controlled variable isolation and systematic stress tests for reliability assessment. In addition, NEBULA provides a standardized data format, unified API, and large-scale aggregated dataset to enable reproducible cross-dataset training and benchmarking, revealing critical failure modes in modern embodied agents.

Related Publications:

CLAIRE: Causally Explainable AI for EKG-based Risk Prediction

This project presents CLAIRE, a causally explainable AI framework for predicting mortality and major adverse cardiovascular events (MACE) from structured EKG data. The system integrates large language models with structured clinical features to enable both high predictive performance and interpretable reasoning. A two-stage pipeline combines end-to-end prediction with feature attribution and causal graph generation, linking EKG abnormalities to physiological mechanisms. The framework achieves strong accuracy while providing clinically validated explanations, bridging the gap between black-box prediction and mechanistic understanding in medical AI.

Related Publications:

RT-LTP: Real-Time Latent Trajectory Prediction with Efficient Online Adaptation

This project proposes RT-LTP, an efficient trajectory prediction framework designed for real-time online learning under distribution shift. The method reformulates trajectory forecasting as a latent-space alignment problem, predicting future motion in a compact, semantically consistent latent space. It incorporates a lightweight low-rank adaptation module to enable fast test-time learning without full model updates, significantly reducing optimization latency. The approach improves both prediction accuracy and computational efficiency, enabling robust deployment in high-speed dynamic environments such as autonomous driving.

Related Publications:

NEBULA: Do We Evaluate Vision-Language-Action Agents Correctly?

2026

Jierui Peng, Yanyan Zhang, Tuo Liang

arXiv preprint arXiv:2510.16263

This paper introduces NEBULA, a unified ecosystem for evaluating Vision-Language-Action (VLA) agents beyond coarse end-task success metrics. It proposes a novel dual-axis evaluation framework that combines fine-grained capability tests for skill-specific diagnosis with systematic stress tests to measure robustness under real-world perturbations. In addition, NEBULA standardizes fragmented embodied AI datasets through a unified data format and API, enabling reproducible cross-dataset training and benchmarking. Experimental results reveal that state-of-the-art VLA models exhibit significant hidden weaknesses in critical capabilities such as spatial reasoning and dynamic adaptation, highlighting the need for more interpretable and reliability-aware evaluation.

Artificial Intelligence

BibTeX Citation

@misc{peng2025nebulaevaluatevisionlanguageactionagents,
      title={NEBULA: Do We Evaluate Vision-Language-Action Agents Correctly?}, 
      author={Jierui Peng and Yanyan Zhang and Yicheng Duan and Tuo Liang and Vipin Chaudhary and Yu Yin},
      year={2025},
      eprint={2510.16263},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2510.16263},
}

Mentors

Vipin Chaudhary, PhD

Kevin J. Kranzusch Chair, Computer and Data Sciences,
Center for PEATAI, Case School of Engineering

Yu Yin, PhD

Assistant Professor, Department of Computer and Data Sciences, Case School of Engineering

Collaborators

Yanyan Zhang

PhD Student