Dual Tuning for Reasoning Efficacy-Driven Data Curation in Multimodal LLM Training

Ruobing Zheng*, Tianqi Li*, Jianing Li, Qingpei Guo, Yi Yuan, Jingdong Chen
*Equal Contribution.
Ant Group.
zhengruobing.zrb@antgroup.com, shijian.ltq@antgroup.com

Abstract

Reasoning post-training improves Large Language Models (LLMs) on complex tasks such as mathematics and coding, but its benefits across diverse multimodal tasks remain uncertain. The trend of leading teams releasing parallel "Instruct" and "Thinking" models is both resource-intensive and user-unfriendly. Prior work finds that the gains from reasoning training are influenced by multiple factors, such as base model capabilities, task characteristics, and Chain-of-Thought (CoT) data quality. However, principled criteria for determining when reasoning post-training is beneficial and which data should support it are still lacking. In this paper, we propose Dual Tuning, a reasoning efficacy-driven data curation framework for multimodal LLM training. Given a target task and a base model, Dual Tuning jointly evaluates whether the training data is beneficial and whether reasoning training with the current CoT content yields positive gains over non-reasoning alternatives. We apply Dual Tuning across spatial, mathematical, and multi-disciplinary tasks, and further analyze how reinforcement learning and thinking patterns affect reasoning efficacy. The Dual Tuning results guide data curation by identifying data that benefit reasoning training, data better suited to direct-answer training, and data that are detrimental under both training modes. Our work provides quantitative criteria for selecting appropriate training data and matching post-training strategies.
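
The three-way outcome described in the abstract can be read as a comparison of three scores on the target task. The sketch below is only an illustration of that reading, not the paper's actual criteria: the function `curate`, the `Verdict` labels, the `margin` parameter, and the example scores are all hypothetical, and it assumes a single scalar benchmark score where higher is better.

```python
from enum import Enum


class Verdict(Enum):
    REASONING = "use for reasoning (CoT) training"
    DIRECT = "use for direct-answer training"
    DETRIMENTAL = "no gain under either training mode"


def curate(base: float, direct: float, reasoning: float, margin: float = 0.0) -> Verdict:
    """Classify one dataset from three task scores (higher is better).

    base      -- score of the untuned base model (hypothetical input)
    direct    -- score after tuning on direct answers only
    reasoning -- score after tuning on the same data with CoT
    margin    -- assumed noise tolerance; not taken from the paper
    """
    helps_direct = direct > base + margin
    helps_reasoning = reasoning > base + margin

    # Neither training mode improves on the base model.
    if not helps_direct and not helps_reasoning:
        return Verdict.DETRIMENTAL
    # Reasoning training helps and is the better (or only) helpful mode.
    if helps_reasoning and (not helps_direct or reasoning > direct):
        return Verdict.REASONING
    # Otherwise direct-answer training is at least as good.
    return Verdict.DIRECT
```

For example, with made-up scores, `curate(50, 52, 58)` falls into the reasoning bucket, `curate(50, 57, 53)` into the direct-answer bucket, and `curate(50, 48, 47)` into the detrimental bucket.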

Validation of the Dual Tuning & Domain-level Comparison

Fine-Grained Task Analysis

Further Exploration

BibTeX

@article{zheng2026thethinkingboundary,
  title={Dual Tuning for Reasoning Efficacy-Driven Data Curation in Multimodal LLM Training},
  author={Zheng, Ruobing and Li, Tianqi and Li, Jianing and Guo, Qingpei and Yuan, Yi and Chen, Jingdong},
  journal={arXiv preprint arXiv:2603.04415},
  year={2026}
}