Sr Software Engineer - HPC Clusters, SLURM, ML
Job Description
Location : 100% Remote
Duration : One year (possibility of extensions)
Pay rate : $75 (W2) plus benefits
Key Skills : HPC clusters, cluster administration, cluster tech stack, SLURM, BeeGFS, ML training (GPU+CPU), AI-centric workloads, simulation, 3D graphics.
Job Description :
The client is looking for an experienced cluster administrator to manage HPC clusters. The right candidate will have experience with SLURM and related technologies and will be familiar with machine learning training and inference workloads (GPU and CPU).
Job Responsibilities :
• Serve as the primary contact for a GPU+CPU cluster
• Collect team feedback and relay it to the support team (schedule downtimes/maintenance, propose changes to the cluster, etc.)
• Perform capacity planning to help determine the team's compute/storage needs moving forward
• Serve as the owner of the SLURM job scheduler, defining the configuration that best fits the team and developing/enabling advanced features
• Serve as the owner of the team's datasets (manage the datasets that live in the cluster and how people access them)
• Help the team optimize/troubleshoot complex jobs/pipelines (AI-centric, simulation, 3D graphics, etc.)
• Educate the team on how to use the cluster (SLURM, BeeGFS, datasets, etc.), enabling fast ramp-up of new scientists and engineers (via tutorials, presentations, wiki docs, etc.)
Required skills and experience :
• Experience designing and managing large clusters with heterogeneous HW (CPUs, GPUs, etc.)
• User-centric and results-oriented. You can learn from data what the needs of our scientists/engineers will be and can produce a cluster growth plan to fulfill these needs.
• Power user. You are willing to extensively test the different workflows that run in the cluster and help optimize them.
• Cluster tech stack. You are an expert on cluster orchestration and management, familiar with technologies such as SLURM, BeeGFS, Docker, etc. (or you are willing to learn them quickly).
• Good communication skills. You can effectively communicate with a variety of stakeholders, including presenting plans to higher management and having technical discussions with engineers/scientists.
Minimum Educational Requirement : BS degree or higher
Website: www.mavensoft.com
Date Posted
10/13/2022