Machine Learning Operations (MLOps) Engineer

Jobgether · US

Company

Jobgether

Location

US

Type

Full Time

Job Description

Team: IT

This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Machine Learning Operations (MLOps) Engineer in the United States.

This role sits at the core of building and scaling a modern machine learning platform that powers production-grade AI systems. You will be responsible for designing and operating the infrastructure that enables seamless model training, deployment, and monitoring across high-impact products. Working at the intersection of software engineering, DevOps, and machine learning, you will help define how ML systems are built and operated at scale. This is a highly hands-on engineering role focused on reliability, performance, and automation of end-to-end ML workflows. You will collaborate closely with machine learning engineers and data teams to improve developer experience and accelerate delivery of AI-driven solutions. The environment is fast-paced, highly technical, and centered on building scalable systems that support real-world production AI use cases.

Accountabilities:

In this role, you will design, build, and maintain the infrastructure and tooling that supports the full machine learning lifecycle, from training and experimentation to deployment and monitoring in production environments.

  • Design and implement scalable ML infrastructure to support training, evaluation, deployment, and inference workflows
  • Develop and maintain containerized systems using Docker and Kubernetes for distributed and scalable workloads
  • Build and orchestrate distributed training pipelines and workflow automation systems
  • Implement and maintain ML lifecycle tools such as MLflow for experiment tracking, versioning, and reproducibility
  • Own and optimize production inference systems, including low-latency and high-availability model serving architectures
  • Develop and maintain CI/CD pipelines for machine learning models, including automated deployment, version control, and rollback strategies
  • Build and manage data pipelines integrated with platforms such as Snowflake and related data systems
  • Implement observability solutions including monitoring, logging, and alerting for model performance, drift detection, and system health
  • Collaborate closely with ML engineers to improve platform usability, reliability, and overall developer experience

Requirements:

This role requires strong software engineering expertise combined with hands-on experience building and operating machine learning infrastructure at scale. The ideal candidate is highly technical, automation-driven, and comfortable working across distributed systems.

  • Bachelor’s or Master’s degree in Computer Science or Engineering, or equivalent practical experience
  • 5+ years of experience in software engineering, DevOps, or MLOps roles
  • Strong proficiency in Python and experience building production-grade distributed systems
  • Hands-on experience with Docker, Kubernetes, and cloud-based infrastructure
  • Proven experience designing and maintaining CI/CD pipelines for production systems
  • Familiarity with ML lifecycle tools such as MLflow or equivalent platforms
  • Experience working with data platforms such as Snowflake or similar cloud data warehouses
  • Strong understanding of system design, microservices, APIs, and scalable architectures
  • Excellent debugging and troubleshooting skills across complex distributed environments
  • Strong collaboration skills and ability to work effectively with ML engineers and data teams

Benefits:

  • Fully remote work opportunity
  • Unlimited vacation policy, sick time, and paid holidays
  • Comprehensive healthcare coverage including medical, dental, and vision plans
  • 401(k) retirement savings plan
  • Paid parental leave and supportive time-off policies
  • Startup environment with strong focus on innovation and engineering impact
  • Opportunity to work on cutting-edge machine learning infrastructure at scale
  • Collaborative, engineering-driven culture focused on automation and continuous improvement

Date Posted

04/14/2026
