Staff Machine Learning Engineer, AI Serving

Jobgether · US

Company: Jobgether

Location: US

Type: Full Time

Job Description

Team: IT

This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Staff Machine Learning Engineer, AI Serving, in the United States.

This role sits at the core of a large-scale machine learning infrastructure organization focused on powering real-time recommendations, content discovery, and generative AI systems at massive scale. You will be responsible for designing and evolving high-performance inference systems that support millions of queries per second with strict latency and reliability requirements. The position combines deep systems engineering with advanced ML deployment, spanning GPU-based model serving, Kubernetes orchestration, and distributed cloud infrastructure. You will play a key role in shaping how large models and LLMs are served efficiently in production environments. Working in a highly collaborative and technically advanced team, you will influence platform architecture that directly impacts user experience, ranking systems, and AI-driven features. This is a high-impact engineering role where scalability, performance, and reliability are central to success.

Accountabilities:

  • Lead the design, development, and maintenance of a large-scale ML inference platform supporting low-latency, high-throughput model serving for search, ranking, and generative AI workloads.
  • Architect and implement GPU-based serving systems capable of handling millions of queries per second with strong reliability and performance guarantees.
  • Build and optimize end-to-end inference pipelines, including routing, caching, batching, and feature processing systems.
  • Develop and maintain model export frameworks to convert trained models into optimized formats for efficient GPU inference.
  • Design and improve observability systems for real-time monitoring of model performance, system health, and feature behavior.
  • Lead efforts in benchmarking, performance tuning, and scalability improvements across multi-cluster cloud environments.
  • Collaborate with cross-functional ML, infrastructure, and product teams to support production deployment of large-scale ML and LLM systems.

Requirements:

  • 7+ years of experience in Machine Learning Engineering, AI Platform Engineering, or large-scale distributed systems development.
  • Strong experience operating and scaling Kubernetes-based infrastructure in production environments.
  • Deep knowledge of ML serving systems, inference pipelines, and production-grade AI deployment.
  • Strong programming skills in Python and/or Go, with experience in building scalable backend or ML systems.
  • Hands-on experience with modern ML/AI frameworks and tooling such as PyTorch, Triton, vLLM, or similar technologies.
  • Experience with cloud platforms (AWS, GCP) and infrastructure tooling such as Terraform or equivalent.
  • Strong understanding of observability, monitoring, and performance tuning for real-time systems.
  • Ability to communicate complex technical concepts clearly to both technical and non-technical stakeholders.
  • Strong ownership mindset with a focus on scalability, reliability, and developer experience.

Benefits:

  • Competitive compensation package with base salary, equity (RSUs), and potential performance-based incentives.
  • Comprehensive healthcare coverage including medical, dental, and vision insurance.
  • Retirement plan with employer matching contributions.
  • Flexible remote-first work environment.
  • Generous paid time off, including vacation, holidays, and volunteer days.
  • Paid parental leave and family support programs.
  • Mental health support, coaching, and wellness resources.
  • Learning and development support for professional growth.
  • Additional benefits covering workspace support, caregiving, and family planning.

Date Posted: 05/08/2026