collaborating with a leading AI research lab to support the evaluation of advanced machine learning systems. We are seeking experienced machine learning engineers and researchers to contribute to the design of high-quality evaluation suites that measure AI performance on real-world machine learning engineering tasks. The work focuses on translating practical ML research and engineering workflows into structured benchmarks for frontier models. This is a project-based, remote opportunity suited for experts with hands-on ML research experience.
Key responsibilities
Design and write detailed evaluation suites for machine learning engineering tasks
Assess AI-generated solutions across areas such as model training, debugging, optimization, and experimentation
Ideal qualifications
3+ years of experience in machine learning engineering or applied ML research
Hands-on experience with model development, experimentation, and evaluation
Background in ML research (industry lab or academic setting strongly preferred)
Strong ability to reason about ML system design choices and tradeoffs
Clear written communication and high attention to technical detail
We consider all qualified applicants without regard to legally protected characteristics and provide reasonable accommodations upon request.
Contract and payment terms