LLM Evaluation Algorithm Engineer

Ecosystem Platform Department | Full-time | Engineering Track | Shanghai

2026-04-16

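Responsibilities

- Build the evaluation system and evaluation platform for large models and Agents
- Design Agent / LLM benchmarks, task environments, and evaluation metrics
- Develop automated evaluation strategies, including LLM-as-a-judge and execution-based evaluation
- Analyze Agent trajectories and task execution processes to assess task success rates and failure modes
- Support model comparison analysis and leaderboard construction, and drive the continued evolution of the evaluation system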

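Requirements

- Master's degree or above in computer science, artificial intelligence, or a related field
- Solid algorithmic foundation and engineering skills; proficient in Python
- Familiar with the large model or Agent technology ecosystem (LLM, Tool Use, Agent Framework, etc.)
- Interest in, or relevant experience with, model evaluation, benchmark construction, or automated evaluation systems

Preferred qualifications:
- Participation in projects related to LLM evaluation, benchmarks, or evaluation platforms
- Familiarity with common evaluation suites or platforms, such as HumanEval, SWE-bench, WebArena, and Chatbot Arena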
