Debargha Ganguly

PhD Student

PhD student at Case Western Reserve University working on robust, reliable, and scalable AI. Collaborator with Microsoft Research, Google DeepMind, and Amazon Science.

Education

PhD in Computer Science, Case Western Reserve University, Spring 2023 – present

Awards and Honors

Department Outstanding Graduate Research Award, Case Western Reserve University, 2024

I am a PhD student at Case Western Reserve University, advised by Dr. Vipin Chaudhary. My research focuses on building AI systems that are trustworthy in practice by combining representation learning, formal methods, and principled uncertainty quantification.

Much of my work has been shaped through collaborations with researchers at Microsoft Research, Google DeepMind, Amazon Science, and national laboratories, and several results have been deployed in production systems at scale.

Trustworthy AI: out-of-distribution detection, neurosymbolic reasoning, and uncertainty quantification for LLMs.

Current Projects

Representation Typicality Estimation

Estimating the typical set of a distribution in representation space to detect outliers and unsafe inputs. This line of work began with Forte (ICLR 2025), which uses self-supervised representations and manifold statistics for OOD detection, and extends to Trust The Typical (ICLR 2026), which reframes LLM safety as an OOD problem and integrates into vLLM with under 6% overhead. K^4 operationalizes this for log anomaly detection on supercomputers.

Proof of Thought Ecosystem

Bridging probabilistic LLM outputs with deterministic formal verification. Proof of Thought (NeurIPS 2024 Workshop) translates LLM reasoning into first-order logic for theorem provers; Grammars of Formal Uncertainty (NeurIPS 2025) characterizes when LLM-generated formal artifacts can be trusted via PCFG-based uncertainty quantification.

Trust the Typical

2026

Debargha Ganguly, Sreehari Sankar, Biyao Zhang, Vikash Singh, Kanan Gupta, Harshini Kavuru, Alan Luo, Weicong Chen, Warren Morningstar, Raghu Machiraju, Vipin Chaudhary

14th International Conference on Learning Representations (ICLR), April 23-27, 2026, Rio de Janeiro, Brazil

Current approaches to LLM safety rely on a brittle pattern of identifying and blocking known threats via guardrails. This paper introduces Trust The Typical (T3), a framework that reframes safety as an out-of-distribution detection problem, learning the distribution of acceptable prompts in a semantic space and flagging significant deviations as potential threats. Unlike prior methods, T3 requires no training on harmful examples yet achieves state-of-the-art performance across 18 benchmarks spanning toxicity, jailbreaking, multilingual harms, and over-refusal—reducing false positive rates by up to 40× relative to specialized safety models. A single model trained on safe English text transfers effectively to over 14 languages without retraining.

Trustworthy AI Artificial Intelligence

arXiv

BibTeX Citation

@inproceedings{ganguly2026trust,
  title={Trust The Typical},
  author={Debargha Ganguly and Sreehari Sankar and Biyao Zhang and Vikash Singh and Kanan Gupta and Harshini Kavuru and Alan Luo and Weicong Chen and Warren Richard Morningstar and Raghu Machiraju and Vipin Chaudhary},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026},
  url={https://openreview.net/forum?id=vfbeleLBWv}
}

K^4-Serve: Robust Streaming Log Anomaly Detection for HPC & AI Infrastructure

2026

W. Chen, V. Singh, Z. Rahmani, D. Ganguly, Mohsen Hariri, S. Maxwell, S. Gajurel, E. Dragowsky, H. Djohari, Vipin Chaudhary

ACM PEARC 2026 (under review)

K^4-Serve operationalizes the K^4 framework for streaming anomaly detection on production HPC and AI infrastructure logs. It combines Kafka-based ingestion, versioned normalization, sliding-window scoring, retraining, and observability features to support robust real-world deployment. The system achieves stable deployment on real HPC logs with near-perfect event-level detection and only one false alert in the reported study. The work bridges anomaly-detection methodology and production cyberinfrastructure practice.
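The sliding-window scoring step mentioned above can be illustrated with a minimal sketch. This is not the K^4-Serve implementation: the function name `sliding_window_alerts`, the mean-over-window rule, and the `window` and `threshold` parameters are all assumptions chosen for illustration, and the per-event anomaly scores are taken as given.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, Tuple


def sliding_window_alerts(
    scores: Iterable[float],
    window: int = 4,
    threshold: float = 0.5,
) -> Iterator[Tuple[int, float]]:
    """Yield (index, window_mean) whenever the mean of the most recent
    `window` scores exceeds `threshold`.

    Hypothetical alert rule: the source does not specify how window
    scores are aggregated, so a plain mean is assumed here.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    recent: Deque[float] = deque(maxlen=window)  # oldest score is evicted automatically
    for index, score in enumerate(scores):
        recent.append(score)
        if len(recent) < window:
            continue  # wait until the first full window is available
        mean = sum(recent) / window
        if mean > threshold:
            yield index, mean


# Usage: per-event anomaly scores arriving from a log stream (toy values).
stream = [0.1, 0.1, 0.9, 0.9, 0.9, 0.1]
alerts = list(sliding_window_alerts(stream, window=3, threshold=0.6))
alert_indices = [index for index, _ in alerts]
```

Averaging over a window rather than alerting on single events is one simple way to trade detection latency for fewer false alerts; a production system would likely tune both parameters per log source.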

HPC Artificial Intelligence

BibTeX Citation

@misc{chen2026k4serve,
  title={K^4-Serve: Robust Streaming Log Anomaly Detection for HPC \& AI Infrastructure},
  author={W. Chen and V. Singh and Z. Rahmani and D. Ganguly and Mohsen Hariri and S. Maxwell and S. Gajurel and E. Dragowsky and H. Djohari and Vipin Chaudhary},
  year={2026},
  note={Under review at ACM PEARC 2026}
}

Grammars of Formal Uncertainty: When to Trust LLMs in Automated Reasoning Tasks

2025

Debargha Ganguly, Vikash Singh, Sreehari Sankar, Biyao Zhang, Xuecen Zhang, Srinivasan Iyengar, Xiaotian Han, Amit Sharma, Shivkumar Kalyanaraman, Vipin Chaudhary

39th Conference on Neural Information Processing Systems (NeurIPS 2025), December 2025

Large language models show remarkable promise for automated reasoning by generating formal specifications, but a fundamental tension exists between their probabilistic nature and the deterministic guarantees required by formal verification. This paper comprehensively investigates failure modes and uncertainty quantification in LLM-generated formal artifacts, revealing that SMT-based autoformalization has highly domain-specific accuracy impacts ranging from +34.8% on logical tasks to −44.5% on factual ones. A probabilistic context-free grammar (PCFG) framework is introduced to model LLM outputs and yield a refined uncertainty taxonomy, finding that uncertainty signals are task-dependent—for example, grammar entropy for logic achieves AUROC > 0.93.
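As a rough illustration of what an entropy signal over a PCFG can look like, the sketch below estimates rule probabilities from observed productions and computes the Shannon entropy of the expansion choices for each nonterminal. It is a simplified assumption, not the paper's metric definition: the function name `rule_entropies`, the `Rule` type, and the toy productions are hypothetical.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

# A production rule: (left-hand side, right-hand side symbols).
Rule = Tuple[str, Tuple[str, ...]]


def rule_entropies(rules: Iterable[Rule]) -> Dict[str, float]:
    """Per-nonterminal Shannon entropy (in bits) of the empirical
    distribution over right-hand sides.

    Rule probabilities are maximum-likelihood estimates from counts,
    which is an assumption of this sketch.
    """
    counts: Dict[str, Counter] = defaultdict(Counter)
    for lhs, rhs in rules:
        counts[lhs][rhs] += 1

    entropies: Dict[str, float] = {}
    for lhs, rhs_counts in counts.items():
        total = sum(rhs_counts.values())
        # H = sum_r p(r) * log2(1 / p(r))
        entropies[lhs] = sum(
            (c / total) * math.log2(total / c) for c in rhs_counts.values()
        )
    return entropies


# Usage: toy productions, as if collected from parses of several sampled outputs.
observed = [
    ("S", ("A",)), ("S", ("A",)), ("S", ("B",)), ("S", ("B",)),
    ("A", ("a",)),
]
entropy_by_symbol = rule_entropies(observed)
```

In this toy case the model splits evenly between two expansions of `S`, giving one bit of entropy, while `A` is always expanded the same way and has zero entropy. High entropy at a nonterminal suggests the sampled outputs disagree structurally at that point.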

Artificial Intelligence Trustworthy AI

arXiv

BibTeX Citation

@misc{ganguly2025grammarsformaluncertaintytrust,
  title={Grammars of Formal Uncertainty: When to Trust LLMs in Automated Reasoning Tasks},
  author={Debargha Ganguly and Vikash Singh and Sreehari Sankar and Biyao Zhang and Xuecen Zhang and Srinivasan Iyengar and Xiaotian Han and Amit Sharma and Shivkumar Kalyanaraman and Vipin Chaudhary},
  year={2025},
  eprint={2505.20047},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2505.20047}
}

Forte: Finding Outliers with Representation Typicality Estimation

2025

Debargha Ganguly, Warren Morningstar, Andrew Yu, Vipin Chaudhary

13th International Conference on Learning Representations (ICLR), April 24-28, 2025, Singapore

Generative models can now produce photorealistic synthetic data virtually indistinguishable from real training data, challenging OOD detectors that rely on generative model likelihoods due to likelihood misestimation and typicality issues. This paper introduces Forte, which hypothesizes that estimating typical sets using self-supervised learners leads to better OOD detection, using representation learning and informative summary statistics based on manifold estimation to address these issues. Forte outperforms other unsupervised approaches and achieves state-of-the-art performance on established challenging benchmarks as well as new synthetic data detection tasks, requiring no class labels.
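The general idea of scoring typicality in representation space can be sketched with a much simpler stand-in than Forte's manifold statistics: flag an embedding as atypical when its distance to the k-th nearest reference embedding exceeds a threshold calibrated on the reference set itself. Everything here is an assumption for illustration, including the k-NN distance score, the leave-one-out calibration, and the names `knn_distance`, `fit_threshold`, and `is_atypical`; embeddings are assumed to come from some pretrained encoder.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def knn_distance(x: Vector, reference: Sequence[Vector], k: int) -> float:
    """Euclidean distance from x to its k-th nearest neighbour in reference."""
    if not 1 <= k <= len(reference):
        raise ValueError("k must be between 1 and len(reference)")
    distances = sorted(math.dist(x, r) for r in reference)
    return distances[k - 1]


def fit_threshold(reference: Sequence[Vector], k: int, quantile: float = 0.95) -> float:
    """Calibrate a threshold from leave-one-out k-NN distances on reference data."""
    scores: List[float] = []
    for i, x in enumerate(reference):
        others = [r for j, r in enumerate(reference) if j != i]
        scores.append(knn_distance(x, others, k))
    scores.sort()
    index = min(len(scores) - 1, int(quantile * len(scores)))
    return scores[index]


def is_atypical(x: Vector, reference: Sequence[Vector], k: int, threshold: float) -> bool:
    """True when x lies farther from the reference set than the calibrated threshold."""
    return knn_distance(x, reference, k) > threshold


# Usage: toy 2-D "embeddings" standing in for encoder outputs.
reference_set = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
tau = fit_threshold(reference_set, k=1)
inlier_flag = is_atypical((0.5, 0.5), reference_set, k=1, threshold=tau)
outlier_flag = is_atypical((5.0, 5.0), reference_set, k=1, threshold=tau)
```

Like the approach described above, this sketch needs no class labels and no examples of outliers, only embeddings of in-distribution data. The brute-force neighbour search is quadratic in the reference set size and would be replaced by an approximate index at scale.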

Trustworthy AI Artificial Intelligence

arXiv

BibTeX Citation

@misc{ganguly2024fortefindingoutliers,
  title={Forte: Finding Outliers with Representation Typicality Estimation},
  author={Debargha Ganguly and Warren Morningstar and Andrew Yu and Vipin Chaudhary},
  year={2024},
  eprint={2410.01322},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2410.01322}
}