About Me

Xiyuan Yang (杨希渊) is an undergraduate (sophomore) in the School of Artificial Intelligence at Shanghai Jiao Tong University (SJTU-SAI). During his freshman year, he ranked second out of 62 students by grades and received several honors, including the National Scholarship (5 scholarships in total). He is a member of the MAGIC research group at SJTU, supervised by Prof. Siheng Chen.

My research interest lies in continuously pushing the capability boundaries of language intelligence, enabling it to accomplish increasingly complex and valuable tasks. I focus on two key directions: agentic harness construction, which builds workflows that maximize the potential of base language models through agentic tool-calling and agentic memory; and agentic training-data construction, which synthesizes high-quality agent trajectories and feeds them back into the model's post-training and mid-training stages, so that the model intrinsically acquires these agentic capabilities.

Topics I'm currently interested in:

  • Self-Evolving Agents for Autonomous LLM Post-Training
  • Agentic Trajectory Data Construction
  • Agentic Tool Calling Benchmarks

Download my CV

Education

Shanghai Jiao Tong University

B.S. in Artificial Intelligence, School of Artificial Intelligence

2024 - 2028

GPA: 4.1/4.3 (ranked 2nd out of 62)

Score: 94.0/100

Scholarships: National Scholarship (top 3%), "Han Ying Ju Hua" Scholarship (15 awarded per year), Zhiyuan Honor Scholarships (top 50%), SJTU Undergraduate Excellence Scholarship

Top-Graded Courses:

  • Comprehensive Programming Practice: 100/100
  • Probability and Statistics (Honor): 100/100
  • Algorithm Design and Analysis: 100/100
  • Numerical Analysis: 100/100
  • Linear Algebra (Honor): 98/100
  • Fundamentals of Programming (Honor): 98/100
  • Fundamentals of ML, DL and RL: 98/100

Experience

MAGIC Lab

Undergraduate Research Assistant

Advisor: Prof. Siheng Chen

2024 - Present

Publications & Tech Projects

DataMaster: Data-Centric Autonomous AI Research

Yaxin Du*, Xiyuan Yang*, Zhifan Zhou, Wanxu Liu, Zixing Lei, Zimeng Chen, Fenyi Liu, Haotian Wu, Yuzhu Cai, Zexi Liu, Xinyu Zhu, Wenhao Wang, Linfeng Zhang, Chen Qian, Siheng Chen

We introduce DataMaster, a data-agent framework for task-conditioned autonomous data engineering. Given a fixed learning algorithm, DataMaster improves downstream performance by searching for external data, selecting and composing datasets, and applying cleaning or transformation. Its DataTree, shared Data Pool, and Global Memory coordinate branch exploration and reuse evidence, improving medal rate by 32.27% on MLE-Bench Lite and surpassing the instruct model on GPQA.
ICML 2026 Regular

MCP-Persona: Benchmarking LLM Agents on Real-World Personal Applications via Environment Simulation

Wenhao Wang*, Peizhi Niu*, Gongyi Zou*, Xiyuan Yang*, Jingxing Wang*, Haoting Shi, Yaxin Du, Jingyi Chai, Xianghe Pang, Shuo Tang, Yanfeng Wang, Siheng Chen

We introduce MCP-Persona, a benchmark for evaluating LLM agents on personalized MCP tools in realistic social and productivity environments. It simulates account- and database-grounded tasks across Reddit, Xiaohongshu, Lark, and Slack, exposing challenges beyond generic information seeking. Experiments with SOTA agents reveal substantial gaps in personalized tool use, highlighting MCP-Persona as a practical testbed for personal-application agents.
ICLR 2026 Poster

InfoMosaic-Bench: Evaluating Multi-Source Information Seeking in Tool-Augmented Agents

Yaxin Du, Yuanshuo Zhang, Xiyuan Yang, Yifan Zhou, Cheng Wang, Gongyi Zou, Xianghe Pang, Wenhao Wang, Menglan Chen, Shuo Tang, Zhiyu Li, Feiyu Xiong, Siheng Chen

We introduce InfoMosaic-Bench, the first benchmark dedicated to multi-source information seeking in tool-augmented agents. Covering 6 representative domains (medicine, finance, maps, video, web, and multi-domain integration), InfoMosaic-Bench requires agents to combine general-purpose search with domain-specific tools. Tasks are synthesized with InfoMosaic-Flow, a scalable pipeline that grounds task conditions in verified tool outputs, enforces cross-source dependencies, and filters out shortcut cases solvable by trivial lookup.

AppCopilot: Toward General, Accurate, Long-Horizon, and Efficient Mobile Agent

Jingru Fan, Yufan Dang, Jingyao Wu, Huatao Li, Runde Yang, Xiyuan Yang, Yuheng Wang, Chen Qian

We introduce AppCopilot, a multimodal, multi-agent mobile agent designed for seamless cross-app operation. It implements a complete end-to-end pipeline encompassing data collection, model training, fine-tuning, efficient inference, and deployment across PC and mobile platforms. At the model level, it integrates multimodal foundation models with robust bilingual (Chinese-English) support. The reasoning and control layer employs chain-of-thought reasoning, hierarchical task decomposition, and multi-agent collaboration.

Open Source Projects

Active GitHub committer with 30+ open-source repositories and 1500+ commits.

Open-Source Projects
IntelliSearch V3.1: Unifying Search, Empowering Action for tool-calling autonomous agents.
SAI Community: The first open-source forum for SAIers, covering courses, careers, and the future.
Technical Blog: 120+ articles on CS and AI, 450k+ words.
LLM Infra Docs: Core source code analysis of large-scale open-source LLM infrastructure.
Agent Codebase: Generalized codebase for agentic scientific research pipelines.
PaperFlow: Never use Overleaf again!
SyncFlash: Lightning-fast configuration of your server.
Course Labs
LLM Reasoning: Enhancing the reasoning abilities of LLMs using reinforcement learning.
Clustering: Clustering of high-dimensional data with intrinsic low rank.
Image Scaling: 2D image scaling based on classical interpolation algorithms.
CTR-Press: Compress-then-Refine for training-free KV cache compression.
L-PHYM: Long-horizon language-driven physics-based motion control.
Data Structure: Course notes and source code for data structures.
CSAPP: Computer Systems: A Programmer's Perspective (labs and notes).
LLM: Stanford CS336 language models learning notes.

Technical Blogs

Maintainer of Xiyuan Yang's Technical Blog. I regularly publish technical content on computer science and AI. To date, I have written over 120 articles totaling more than 450,000 words.

GitHub Repository / Blog Website

Skills

Programming Languages
Python, C++, Rust, CSS, JavaScript, HTML
Tools
Git, LaTeX, Shell, Docker
ML & DL & LLM
PyTorch, vLLM, CUDA, Pandas, FastMCP
Languages
    English (CET-6: 584)

Awards

🏅 2026-01: "Han Ying Ju Hua" Scholarship (15 awarded per year)
🏅 2025-12: SJTU Undergraduate Excellence Scholarship
🥇 2025-10: Zhiyuan Honor Scholarships (top 50%)
🏆 2025-10: Challenge Cup Defender (Grand Prize, AI Track Champion)
🥇 2025-09: National Scholarship (top 3%)
🏅 2025-01: Honorable Mention in MCM 2025
🥇 2024-10: Zhiyuan Honor Scholarships (top 50%)