Case Western Reserve University
Vikash Singh

PhD Student

Education

  • Ph.D. in Computer Science, Case Western Reserve University, Cleveland, OH, USA (May 2024 – Present)
  • M.S. in Computer Science, Case Western Reserve University, Cleveland, OH, USA (Aug 2023 – May 2024)
  • B.Tech. in Civil Engineering, Minor in Computer Science, Indian Institute of Technology Mandi, Himachal Pradesh, India (Aug 2019 – May 2023)

Trust the Typical

2026

14th International Conference on Learning Representations (ICLR), April 23-27, 2026, Rio de Janeiro, Brazil

Current approaches to LLM safety rely on a brittle pattern of identifying and blocking known threats via guardrails. This paper introduces Trust The Typical (T3), a framework that reframes safety as an out-of-distribution detection problem, learning the distribution of acceptable prompts in a semantic space and flagging significant deviations as potential threats. Unlike prior methods, T3 requires no training on harmful examples yet achieves state-of-the-art performance across 18 benchmarks spanning toxicity, jailbreaking, multilingual harms, and over-refusal—reducing false positive rates by up to 40× relative to specialized safety models. A single model trained on safe English text transfers effectively to over 14 languages without retraining.
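The out-of-distribution framing above can be illustrated with a minimal sketch. This is not the paper's implementation: the k-nearest-neighbour distance score, the threshold, and all function names are illustrative assumptions standing in for whatever typicality model T3 actually learns.

```python
import numpy as np

def typicality_score(safe_embeddings, query, k=5):
    """Mean distance from a query's embedding to its k nearest known-safe
    embeddings; large values suggest the prompt is out-of-distribution."""
    d = np.linalg.norm(np.asarray(safe_embeddings) - np.asarray(query), axis=1)
    return float(np.sort(d)[:k].mean())

def flag_prompt(safe_embeddings, query, threshold, k=5):
    # Flag only significant deviations from the learned "typical" region;
    # everything near the safe distribution passes through untouched.
    return typicality_score(safe_embeddings, query, k) > threshold
```

Note that such a detector is fit only on acceptable prompts, mirroring the paper's setup of requiring no harmful training examples.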

Trustworthy AI, Artificial Intelligence

K4: Online Log Anomaly Detection via Unsupervised Typicality Learning

2025

IEEE International Conference on High Performance Computing, Data, and Analytics (HiPC 2025), December 17-20, 2025, Hyderabad, India

Existing log anomaly detection methods are often slow, dependent on error-prone parsing, and use unrealistic evaluation protocols. This paper introduces K4 (Knowing the Unknown by Knowing only the Known), a fully unsupervised, parser-independent framework that transforms arbitrary log embeddings into compact four-dimensional descriptors—Precision, Recall, Density, Coverage—using efficient k-nearest neighbor statistics. Under a realistic online chunk-based evaluation protocol, K4 achieves state-of-the-art AUROC of 0.995–0.999 across HDFS, BGL, and Thunderbird datasets, with training under 4 seconds and inference as low as 4 μs.
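The k-nearest-neighbour statistics behind descriptors like Density can be sketched with a toy example. This is illustrative only, not K4's code: the choice of k, the metric, and the names are assumptions.

```python
import numpy as np

def knn_radii(train, k=3):
    # Distance from each training embedding to its k-th nearest neighbour
    # (each sorted row begins with the zero self-distance, so index k is
    # the k-th true neighbour).
    d = np.linalg.norm(train[:, None, :] - train[None, :, :], axis=-1)
    return np.sort(d, axis=1)[:, k]

def density(train, radii, query):
    # Fraction of training kNN-balls that contain the query; values near
    # zero mark the query as atypical, i.e. a candidate anomaly.
    d = np.linalg.norm(train - query, axis=1)
    return float((d <= radii).mean())
```

Because the descriptor is computed from neighbour distances over raw embeddings, no log parser or labels are needed, which is the property the abstract emphasizes.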

Trustworthy AI, HPC, Artificial Intelligence

Grammars of Formal Uncertainty: When to Trust LLMs in Automated Reasoning Tasks

2025

39th Conference on Neural Information Processing Systems (NeurIPS 2025), December 2025

Large language models show remarkable promise for automated reasoning by generating formal specifications, but a fundamental tension exists between their probabilistic nature and the deterministic guarantees required by formal verification. This paper comprehensively investigates failure modes and uncertainty quantification in LLM-generated formal artifacts, revealing that SMT-based autoformalization has highly domain-specific accuracy impacts ranging from +34.8% on logical tasks to −44.5% on factual ones. A probabilistic context-free grammar (PCFG) framework is introduced to model LLM outputs and yield a refined uncertainty taxonomy, finding that uncertainty signals are task-dependent—for example, grammar entropy for logic achieves AUROC > 0.93.
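As a toy illustration of the grammar-entropy signal, one can estimate per-nonterminal Shannon entropy from production counts observed across sampled outputs. The rule representation and the averaging step here are assumptions, not the paper's PCFG machinery.

```python
import math
from collections import Counter, defaultdict

def mean_rule_entropy(productions):
    """productions: iterable of (nonterminal, expansion) pairs observed
    across an LLM's sampled outputs. Returns the mean Shannon entropy
    (in bits) of the expansion distribution per nonterminal; higher
    values indicate a more variable, less certain output grammar."""
    counts = defaultdict(Counter)
    for lhs, rhs in productions:
        counts[lhs][rhs] += 1
    entropies = []
    for c in counts.values():
        n = sum(c.values())
        entropies.append(-sum((v / n) * math.log2(v / n) for v in c.values()))
    return sum(entropies) / len(entropies)
```

A model that always expands each nonterminal the same way scores zero entropy; splitting an expansion evenly two ways scores one bit, giving a simple scalar that can be thresholded as an uncertainty signal.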

Artificial Intelligence, Trustworthy AI

Efficient Fine-Grained GPU Performance Modeling for Distributed Deep Learning of LLM

2025

IEEE International Conference on High Performance Computing, Data, and Analytics (HiPC 2025), December 17-20, 2025, Hyderabad, India

Training large language models is one of the most compute-intensive tasks in HPC, and predicting end-to-end training time for multi-billion parameter models across hundreds of GPUs is challenging due to complex interactions between transformer components, parallelism strategies, and multi-tier communication. This paper addresses this by decomposing LLMs into core computational primitives and modeling them with operator-level decomposition, lightweight hardware-aware prediction models for key operations, and an end-to-end prediction system integrating these across complex parallelization strategies. The resulting framework enables accurate distributed LLM training performance prediction without costly full-scale sampling.
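The operator-level idea can be sketched with a roofline-style toy model. Everything here is an illustrative assumption, not the paper's fitted predictors: the fixed efficiency constant, the FLOP counts, and the tensor-parallel sharding rule.

```python
def matmul_flops(m, n, k):
    # A dense (m x k) @ (k x n) matmul performs 2*m*n*k FLOPs.
    return 2 * m * n * k

def op_time_s(flops, peak_flops, efficiency=0.4):
    # Predicted wall time assuming the kernel sustains a fixed fraction
    # of peak throughput (a real model fits this per operator and GPU).
    return flops / (peak_flops * efficiency)

def mlp_block_time_s(batch, seq, hidden, peak_flops, tp=1):
    # Transformer MLP block: up-projection (h -> 4h) and down-projection
    # (4h -> h), with work split across tp tensor-parallel ranks.
    tokens = batch * seq
    flops = matmul_flops(tokens, 4 * hidden, hidden) \
          + matmul_flops(tokens, hidden, 4 * hidden)
    return op_time_s(flops / tp, peak_flops)
```

Composing such per-operator estimates over a layer schedule, plus communication terms for each parallelism axis, yields an end-to-end prediction without running the full job, which is the decomposition the abstract describes.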

HPC, Artificial Intelligence