LongRoPE2: Near-Lossless LLM Context Window Scaling
Ning Shang, Li Lyna Zhang, Siyuan Wang, Gaokai Zhang, Gilsinia Lopez, Fan Yang, and 2 more authors
International Conference on Machine Learning, 2025
Large Language Models (LLMs) with extended context windows are essential for complex long-document tasks. We present LongRoPE2, a novel method that extends LLM context windows to 128K tokens while retaining over 98.5% of short-context accuracy. Our approach rescales rotary position embeddings (RoPE) so that the extended positions remain in-distribution, enabling near-lossless context extension.
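For readers unfamiliar with the mechanism, the sketch below illustrates per-dimension RoPE rescaling, the general idea behind LongRoPE-style context extension: each rotary frequency is divided by a per-dimension scale factor so that positions beyond the original training window map back into the rotation range seen during pre-training. The function names and the linearly spaced `scale_factors` here are illustrative assumptions, not the authors' released code; LongRoPE2 obtains its factors via search rather than a fixed schedule.

```python
import numpy as np

def rope_frequencies(head_dim: int, base: float = 10000.0) -> np.ndarray:
    """Standard RoPE inverse frequencies, one per rotary dimension pair."""
    return 1.0 / (base ** (np.arange(0, head_dim, 2) / head_dim))

def rescaled_rope_angles(positions: np.ndarray,
                         head_dim: int,
                         scale_factors: np.ndarray) -> np.ndarray:
    """Rotation angles with per-dimension rescaling.

    scale_factors[i] >= 1 stretches dimension i so that positions beyond the
    original context window fall back into the angle range the model saw in
    pre-training (factors are hypothetical; LongRoPE2 searches for them).
    """
    inv_freq = rope_frequencies(head_dim) / scale_factors
    return np.outer(positions, inv_freq)  # shape: (num_positions, head_dim // 2)

# Example: scale the higher (lower-frequency) RoPE dimensions more aggressively,
# a common pattern in RoPE-rescaling methods (schedule here is purely illustrative).
head_dim = 64
scale_factors = np.linspace(1.0, 32.0, head_dim // 2)
angles = rescaled_rope_angles(np.arange(8192), head_dim, scale_factors)
cos, sin = np.cos(angles), np.sin(angles)  # applied to queries/keys as in standard RoPE
```

In practice the per-dimension factors are chosen per model and target length (in LongRoPE2, by an evolutionary-style search), rather than by a fixed linear schedule as in this sketch.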