Machine Learning Engineer - LLM Evaluation & Automation at TEKsystems c/o Allegis Group

Overview

We are seeking a Machine Learning Engineer to join a high-impact team focused on advancing LLM evaluation, NLP, and AI-driven automation. This role centers on designing scalable evaluation frameworks, optimizing prompt strategies, and building systems that ensure high-quality, consistent model outputs across product domains. You will partner closely with product, engineering, and research teams to drive measurable improvements in AI performance. This is a hands-on role with a strong emphasis on LLM evaluation systems, prompt engineering, and data-driven model optimization.

Job Details

Location: Culver City, CA (Hybrid with 3 days a week onsite)
Pay Rate: $60-70/hr (W-2)
Job Type: Contract
Contract Length: 6 months
Experience Level: Mid-level to Senior

Key Responsibilities

Design and build LLM-based evaluation frameworks, including automated scoring pipelines and rubric-based grading systems
Build and maintain data pipelines for evaluation datasets using Python, SQL, and scalable processing tools
Translate complex evaluation results into clear, actionable insights for technical and non-technical stakeholders
Implement automation workflows and agentic evaluation systems to improve efficiency and reduce manual effort
Develop prompt engineering strategies to evaluate output quality, accuracy, and consistency
Create and maintain metrics, KPIs, and dashboards to track and communicate model performance
Conduct error analysis, root-cause investigations, and quality deep dives to guide model improvements
Partner cross-functionally to define evaluation methodologies and integrate them into production workflows

Must-Have Qualifications

5+ years of experience in ML engineering, NLP, or AI/ML automation
Strong programming skills in Python and SQL
Deep understanding of machine learning concepts with a focus on NLP and advanced LLM capabilities (e.g., Chain-of-Thought, agentic workflows)
Experience working with large-scale datasets and data pipelines
Strong experience with LLM evaluation, prompt engineering, or auto-grading systems
Experience developing metrics and KPIs to measure model output quality and consistency

Nice-to-Have

Experience with LLM-as-judge systems or human + model evaluation frameworks
Background in inter-rater reliability, evaluation calibration, or judge-system design
Experience with PySpark or distributed data processing tools
Exposure to building dashboards or visualization tools for model performance tracking

Technical Skills

Python, SQL, NLP, LLM Evaluation, Prompt Engineering, Machine Learning, Data Pipelines, Automation Systems

Pay and Benefits

The pay range for this position is $60.00 - $70.00/hr. Eligibility requirements apply to some benefits. If eligible, benefits may include:

Medical, dental & vision
Critical Illness, Accident, and Hospital
401(k) Retirement Plan
Life Insurance
Short- and long-term disability
Health Savings Account (HSA)
Transportation benefits
Employee Assistance Program
Time Off/Leave (PTO, Vacation or Sick Leave)

About TEKsystems

We're partners in transformation. We help clients activate ideas and solutions to take advantage of a new world of opportunity. We are a team 80,000 strong, working with over 6,000 clients, including 80% of the Fortune 500, across North America, Europe, and Asia.

Note: The company is an equal opportunity employer.
