2026-05-22 | agents, scaling, data, code

SkillOpt: Executive Strategy for Self-Evolving Agent Skills

Yifan Yang, Ziyang Gong, Weiquan Huang, Qihao Yang, Ziwei Zhou, Zisu Huang, Yan Li, Xuemei Gao, Qi Dai, Bei Liu, Kai Qiu, Yuqing Yang, Dongdong Chen, Xue Yang, Chong Luo

Key claim

SkillOpt improves agent skill performance across multiple benchmarks.

SkillOpt is an optimizer for agent skills that applies controlled optimization in text space. The summary reports accuracy gains over a no-skill baseline across several models and execution environments, including +9.6 on SearchQA, +38.9 on SpreadsheetBench, and +12.4 on DocVQA.

Novelty

8.0/10

SkillOpt introduces a systematic approach to optimizing agent skills using a text-space optimizer.

Reliability

8.0/10

The methodology is solid, with extensive evaluation across multiple benchmarks and models.

Deep reliability assessment

The methodology supports the claim that SkillOpt can optimize skills for frozen agents across the evaluated benchmarks and models. Any claim that it generalizes to all tasks and models is overstated without further validation.

Reproducibility

Yes, the paper provides a code link: https://aka.ms/SkillOpt.

Discussion questions

How does SkillOpt handle tasks with subjective or multi-dimensional success criteria where automatic validation is challenging?
What are the practical implications for deploying SkillOpt in real-world applications with limited computational resources?
What specific scenarios or benchmarks would falsify the claim that SkillOpt is the best or tied-best method across all evaluated cells?

Key figure

Figure 1 provides an overview of SkillOpt, showing how the optimizer model converts trajectories into skill edits and uses a validation gate to ensure improvements.
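The loop described for Figure 1 can be illustrated with a minimal sketch. This is not the authors' implementation: `optimize_skill`, `toy_run_agent`, `toy_propose_edit`, the task names, and the strict-improvement acceptance rule are all hypothetical stand-ins, assuming only what the summary states (trajectories are turned into skill edits, and a validation gate decides whether an edit is kept).

```python
from typing import Callable, List, Tuple

# A trajectory is reduced here to (task, succeeded); real trajectories are richer.
Trajectory = Tuple[str, bool]


def optimize_skill(
    skill: str,
    run_agent: Callable[[str, List[str]], List[Trajectory]],
    propose_edit: Callable[[str, List[Trajectory]], str],
    train_tasks: List[str],
    val_tasks: List[str],
    steps: int = 3,
) -> str:
    """Hypothetical text-space loop: propose a skill edit, keep it only if validation improves."""

    def val_score(candidate: str) -> float:
        results = run_agent(candidate, val_tasks)
        return sum(ok for _, ok in results) / len(results)

    best = val_score(skill)
    for _ in range(steps):
        trajectories = run_agent(skill, train_tasks)   # frozen agent, current skill
        candidate = propose_edit(skill, trajectories)  # optimizer model writes an edit
        score = val_score(candidate)
        if score > best:                               # validation gate (assumed: strict improvement)
            skill, best = candidate, score
    return skill


# Toy stand-ins so the sketch runs without any model calls.
def toy_run_agent(skill: str, tasks: List[str]) -> List[Trajectory]:
    # The "agent" succeeds on a task only if the skill text mentions it.
    return [(task, task in skill) for task in tasks]


def toy_propose_edit(skill: str, trajectories: List[Trajectory]) -> str:
    # The "optimizer" appends a hint for the first failed task, if any.
    failed = [task for task, ok in trajectories if not ok]
    return skill + "\n- handle " + failed[0] if failed else skill


final_skill = optimize_skill(
    "General tips:",
    toy_run_agent,
    toy_propose_edit,
    train_tasks=["dates", "units"],
    val_tasks=["dates", "units", "currency"],
)
print(final_skill)
```

In this toy setup the gate accepts the two edits that raise validation accuracy and rejects the third step, where the proposed skill is unchanged and the score does not improve. An edit that lowers the validation score would be discarded the same way.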

Benchmark results

SearchQA accuracy: 87.3 (+9.6 vs. no skill; SOTA)

SpreadsheetBench accuracy: 80.7 (+38.9 vs. no skill; SOTA)

DocVQA accuracy: 91.2 (+12.4 vs. no skill; SOTA)
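The no-skill baselines are not listed directly, but they can be backed out from the reported scores and gains. The sketch below assumes each gain is an absolute difference in accuracy points rather than a relative improvement, which this summary does not state. `implied_baseline` is a hypothetical helper name.

```python
# Reported (accuracy with SkillOpt, gain vs. no skill) from the results above.
results = {
    "SearchQA": (87.3, 9.6),
    "SpreadsheetBench": (80.7, 38.9),
    "DocVQA": (91.2, 12.4),
}


def implied_baseline(score: float, gain: float) -> float:
    # Assumes the gain is absolute (percentage points), not relative.
    return round(score - gain, 1)


baselines = {name: implied_baseline(score, gain) for name, (score, gain) in results.items()}
for name, base in baselines.items():
    print(f"{name}: implied no-skill accuracy {base}")
```

If the gains were relative instead, the implied baselines would differ, so these values should be checked against the paper's tables before being quoted.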

Code link

aka.ms/SkillOpt (official)
