Manager, Next-Gen AI Cluster Validation

Jobgether · US

Company

Jobgether

Location

US

Type

Full Time

Job Description

Team: IT

This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Manager, Next-Gen AI Cluster Validation in the United States.

This role offers the opportunity to lead the development and validation of next-generation AI supercomputing systems at scale. You will manage a high-performing technical team responsible for integrating compute, networking, storage, and software systems into large-scale AI and HPC clusters. The position involves collaborating closely with internal teams, external partners, and customers to ensure successful deployment and performance of cutting-edge systems. You will design and implement tools, processes, and documentation to support cluster development, automation, and performance engineering. This role blends strategic leadership with hands-on execution in a fast-paced, remote-friendly environment. It is ideal for a technical leader passionate about AI, HPC, and supercomputing innovation.

Accountabilities:

  • Lead a distributed engineering team designing and validating next-generation AI and HPC clusters
  • Integrate new compute, networking, storage, and software systems for high-performance applications
  • Develop platforms for system automation, software development, and performance optimization
  • Build tools and documentation to support large-scale supercomputing system deployment and operations
  • Collaborate with internal teams on cluster architecture, integration, and at-scale bring-up
  • Partner with external collaborators and customers to support validation of clusters based on reference architectures
  • Ensure the team delivers high-quality, scalable, and reliable AI computing solutions

Requirements:

  • BS in Applied Science or Engineering; advanced degrees preferred
  • 8+ years of experience in high-performance computing, AI, or machine learning environments
  • 3+ years of experience in technical leadership roles managing engineering teams
  • Proficiency in software development and system automation with languages and tools such as Go, Python, or Ansible
  • Proven ability to lead distributed, high-performing teams and foster collaboration
  • Strong problem-solving skills and creative thinking in complex technical environments
  • Comfortable working in a remote-friendly environment across multiple locations
  • Excellent teamwork, communication, and collaboration skills
  • Familiarity with AI/ML workloads, cluster architectures, or HPC systems is strongly preferred

Benefits:

  • Competitive base salary range: $224,000 – $356,500, plus equity opportunities
  • Comprehensive healthcare, dental, and vision coverage
  • Flexible paid time off and parental leave programs
  • Retirement savings and matching plans
  • Professional development and learning opportunities
  • Remote-friendly work environment with global collaboration
  • Exposure to cutting-edge AI and HPC technologies and high-impact projects

Date Posted

04/02/2026
