Technical Support Engineer, Linux and HPC Admin

NVIDIA · Remote

Company

NVIDIA

Location

Remote

Type

Full Time

Job Description

NVIDIA has been redefining computer graphics, PC gaming, and accelerated computing for over 25 years. It's a unique legacy of innovation fueled by great technology and dynamic people. Today, we're tapping into the unlimited potential of AI to define the next era of computing, an era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what's never been done before takes vision, innovation, and the world's best talent. NVIDIANs immerse themselves in a diverse, supportive environment that encourages everyone to do their best work. Join the team and see how you can make a lasting impact on the world.

NVIDIA Base Command Manager powers thousands of clusters worldwide, ranging from a few nodes to several thousand, and streamlines cluster provisioning, workload management, and infrastructure monitoring. It provides all the tools you need to deploy and run an AI data center. We take great pride in providing excellent, comprehensive support to our customers! The Technical Support Engineer in this role will contribute significantly to the success of both external customers running their clusters with NVIDIA solutions and internal clusters used for research, operations, and next-generation projects.

What you'll be doing:

  • Support our internal and external customers using our Linux-based cluster management software product, ensuring everyone receives the help they require to support their clusters.
  • Collaborate with developers to collect the correct information and escalate issues to the appropriate development team.
  • Become and serve as a subject-matter expert in several areas.
  • Carry out research and development tasks for customers or for internal use by our development team.
  • Participate in proactive discussions with internal stakeholders to ensure BCM best practices are widely communicated.
  • Work with the latest hardware (e.g. GPUs, AI accelerators, high-speed interconnects) and software technologies such as parallel filesystems (e.g. Lustre, GPFS, WekaIO), Jupyter, various ML frameworks and tools, Spark, Kubernetes, and Ceph.

What we need to see:

  • BS degree or equivalent experience in Electrical Engineering or related field.
  • 5 years of relevant, aligned experience providing support in the HPC realm, ideally in a customer-facing role.
  • Proven research skills and interest in assisting customers to achieve their goals.
  • Experience in a technical customer-facing role.
  • Eagerness to learn and become an authority on our product.
  • Excellent written communication skills, with the ability to distill complex technical information into consumable summaries.
  • In-depth knowledge of Linux.
  • Familiarity with typical Linux installations and their most common software elements.

Ways to stand out from the crowd:

  • Experience with high-performance computing and system administration is an asset.
  • Previous experience as a system admin running BCM/Bright Cluster Manager/Base Command Manager clusters is a definite plus.

Date Posted

10/31/2024
