Education
- PhD in Computer Science, Case Western Reserve University, Spring 2023 – present
Awards and Honors
- Department Outstanding Graduate Research Award, Case Western Reserve University, 2024
Current Projects
Representation Typicality Estimation
Estimating the typical set of a distribution in representation space to detect outliers and unsafe inputs. This line of work began with Forte (ICLR 2025), which uses self-supervised representations and manifold statistics for OOD detection, and extends to Trust The Typical (ICLR 2026), which reframes LLM safety as an OOD problem and integrates into vLLM with under 6% overhead. K^4 applies the same approach to log anomaly detection on supercomputers.
Proof of Thought Ecosystem
Bridging probabilistic LLM outputs with deterministic formal verification. Proof of Thought (NeurIPS'24 Workshop) translates LLM reasoning into First-Order Logic for theorem provers; Grammars of Formal Uncertainty (NeurIPS 2025) characterizes when LLM-generated formal artifacts can be trusted via PCFG-based uncertainty quantification.
Trust the Typical
2026
Current approaches to LLM safety rely on a brittle pattern of identifying and blocking known threats via guardrails. This paper introduces Trust The Typical (T3), a framework that reframes safety as an out-of-distribution detection problem, learning the distribution of acceptable prompts in a semantic space and flagging significant deviations as potential threats. Unlike prior methods, T3 requires no training on harmful examples yet achieves state-of-the-art performance across 18 benchmarks spanning toxicity, jailbreaking, multilingual harms, and over-refusal, reducing false positive rates by up to 40× relative to specialized safety models. A single model trained on safe English text transfers effectively to over 14 languages without retraining.
K^4-Serve: Robust Streaming Log Anomaly Detection for HPC & AI Infrastructure
2026
K^4-Serve operationalizes the K^4 framework for streaming anomaly detection on production HPC and AI infrastructure logs. It combines Kafka-based ingestion, versioned normalization, sliding-window scoring, retraining, and observability features to support robust real-world deployment. The system achieves stable deployment on real HPC logs with near-perfect event-level detection and only one false alert in the reported study. The work bridges anomaly-detection methodology and production cyberinfrastructure practice.
Grammars of Formal Uncertainty: When to Trust LLMs in Automated Reasoning Tasks
2025
Large language models show remarkable promise for automated reasoning by generating formal specifications, but a fundamental tension exists between their probabilistic nature and the deterministic guarantees required by formal verification. This paper comprehensively investigates failure modes and uncertainty quantification in LLM-generated formal artifacts, revealing that SMT-based autoformalization has highly domain-specific accuracy impacts ranging from +34.8% on logical tasks to −44.5% on factual ones. A probabilistic context-free grammar (PCFG) framework is introduced to model LLM outputs and yield a refined uncertainty taxonomy, finding that uncertainty signals are task-dependent; for example, grammar entropy for logic achieves AUROC > 0.93.
Forte: Finding Outliers with Representation Typicality Estimation
2025
Generative models can now produce photorealistic synthetic data virtually indistinguishable from real training data, challenging OOD detectors that rely on generative model likelihoods due to likelihood misestimation and typicality issues. This paper introduces Forte, which hypothesizes that estimating typical sets using self-supervised learners leads to better OOD detection, using representation learning and informative summary statistics based on manifold estimation to address these issues. Forte outperforms other unsupervised approaches and achieves state-of-the-art performance on established challenging benchmarks as well as new synthetic data detection tasks, requiring no class labels.
Mentors
Collaborators
Alan Luo
PhD Student
Andrew Yu
PhD Student
Biyao Zhang
PhD Student
Chaoda Song
PhD Student
Mohsen Hariri
AI Scientist
Nahal Shahani
PhD Student
Nengbo Wang
PhD Student
Shouren Wang
PhD Student
Srihari Sankar
PhD Student
Vikash Singh
PhD Student
Vinooth Kulkarni
PhD Student
Wang (Van) Yang
PhD Student
Weicong Chen
AI Scientist
Xinpeng Li
PhD Student
Yanyan Zhang
PhD Student
Yu Yin, PhD
Assistant Professor, Department of Computer and Data Sciences, Case School of Engineering
Yunlai Zhou
PhD Student
Zahra Rahmani
PhD Student