ML Systems Designer - Fully Remote

Job summary

United States
Software Developer

Work model

Fully remote
Only US
2 days ago
Job description

About The Job

Mercor connects elite creative and technical talent with leading AI research labs. Headquartered in San Francisco, our investors include Benchmark, General Catalyst, Peter Thiel, Adam D'Angelo, Larry Summers, and Jack Dorsey.

Position

ML Systems Designer

Type

Full-time or Part-time Contract Work

Compensation

$60–$100/hour

Location

US-Based and Non-US-Based

Role Responsibilities

  • Evaluate LLM-generated responses to coding and software engineering queries for accuracy, reasoning, clarity, and completeness.
  • Conduct fact-checking using trusted public sources and authoritative references.
  • Execute code and validate outputs using appropriate tools for accuracy testing.
  • Annotate model responses by identifying strengths, areas of improvement, and factual or conceptual inaccuracies.
  • Assess code quality, readability, algorithmic soundness, and explanation quality.
  • Ensure model responses align with expected conversational behavior and system guidelines.
  • Apply consistent evaluation standards by following clear taxonomies, benchmarks, and detailed evaluation guidelines.

Qualifications

Must-Have

  • BS, MS, or PhD in Computer Science or a closely related field
  • 3 years of real-world experience in software engineering or related technical roles
  • Expertise in at least two relevant programming languages (e.g., Python, Java, C++, C, JavaScript, Go, Rust, Ruby, SQL, PowerShell, Bash, Swift, Kotlin, R, TypeScript, HTML/CSS)
  • Ability to solve HackerRank or LeetCode Medium- and Hard-level problems independently
  • Experience contributing to well-known open-source projects, including merged pull requests
  • Significant experience using LLMs while coding, with an understanding of their strengths and failure modes
  • Strong attention to detail and comfort evaluating complex technical reasoning, including identifying subtle bugs or logical flaws

Preferred

  • Prior experience with RLHF, model evaluation, or data annotation work
  • Track record in competitive programming
  • Experience reviewing code in production environments
  • Familiarity with multiple programming paradigms or ecosystems
  • Experience explaining complex technical concepts to non-expert audiences

Application Process (Takes 20–30 minutes to complete)

  • Upload resume
  • AI interview based on your resume
  • Submit form

Resources & Support

  • For details about the interview process and platform information, please check: https://talent.docs.mercor.com/welcome
  • For any help or support, reach out to: [email protected]

PS: Our team reviews applications daily. Please complete your AI interview and application steps to be considered for this opportunity.