Site Reliability Engineer - Engineering Platforms

CoreWeave · Other US Location

Type

Full Time

Job Description

CoreWeave is a specialized cloud provider, delivering a massive scale of GPU compute resources on top of the industry’s fastest and most flexible infrastructure. CoreWeave builds cloud solutions for compute-intensive use cases (VFX and rendering, machine learning and AI, batch processing, and Pixel Streaming) that are up to 35 times faster and 80% less expensive than the large, generalized public clouds. Learn more at www.coreweave.com.

About the role:

The Engineering Platforms Team functions as the lubricant that keeps CoreWeave’s gears of innovation turning fast and friction-free. This team is responsible for the development, integration, and operation of platforms central to the engineering experience with the ultimate objective of enabling engineers across CoreWeave to do more, better. Central to the Engineering Platforms mission is the operation of our observability, CI/CD, and service catalog systems which leverage CoreWeave’s deep investment in the Kubernetes ecosystem. Engineers on this team will endeavor to discover and remove engineer friction across CoreWeave’s engineering teams through the development of boilerplate, integrations, automation and the operation of shared platforms.

We are seeking a Site Reliability Engineer who can help us execute on the mission of making developers’ lives easier. This individual will work with a team of 8-10 mixed-specialization engineers and have the opportunity to work on the full gamut of rewarding challenges that come with the business of building a cloud in a communicative, supportive, and high-performing environment. As a member of the Engineering Platforms Team you would have the opportunity to:

  • Design and implement services and tools to reduce friction and toil in the lives of our engineering and operations teams.

  • Improve the performance, security, reliability, and scalability of our observability, CI/CD, and related services and participate in the Engineering Platforms on-call rotation.

  • Create and maintain Kubernetes operators, custom controllers, and other tools to intelligently scale our operational capability.

  • Develop dashboards, alerts, and insights into the customer experience using Grafana-ecosystem tools such as Mimir and Loki.

  • Enable and evangelize the practice of reliability engineering across CoreWeave’s engineering teams.

  • Grow, change, invest in your teammates, be invested-in, share your ideas, listen to others, be curious, have fun, and, above all, be yourself.

Wondering if you’re a good fit? We believe in investing in our people, and value candidates who can bring their own diversified experiences to our teams – even if you aren't a 100% skill or experience match. Here are some qualities we’ve found compatible with our team. If a portion of this resonates with you, we’d love to talk. 

  • You have one or more years of experience in the software or infrastructure engineering industry.

  • You enjoy helping your colleagues achieve more with less effort.

  • You’re comfortable with the idea of using Go as your primary programming language.

  • You’re familiar with how to containerize applications and/or have experience using Kubernetes to manage deployments.

  • You have experience deploying services in production and are interested in learning reliability-at-scale engineering concepts such as the different types of testing, progressive deployments, error budgets, the role of observability, and fault-tolerant design.

  • You’ve done some Linux shell scripting and/or can navigate a *nix-based operating system (with the right cheat sheet, if required).

  • You’re excited about being part of a team of diverse perspectives and backgrounds that believe in tackling challenges, growing hand in hand, and winning together.


Why CoreWeave?

At CoreWeave, we work hard, have fun, and move fast!  We’re in an exciting stage of hyper-growth that you will not want to miss out on. We’re not afraid of a little chaos, and we’re constantly learning. Our team cares deeply about how we build our product and how we work together, which is represented through our core values: 

  • Be Curious at your Core
  • Act like an Owner
  • Empower Employees
  • Deliver Best In-Class Client Experience 
  • Achieve More Together

We support and encourage an entrepreneurial outlook and independent thinking. We foster an environment that encourages collaboration and provides the opportunity to develop innovative solutions to complex problems. As we get set for takeoff, the growth opportunities within the organization are constantly expanding. You will be surrounded by some of the best talent in the industry, who will want to learn from you, too. Come join us!

Benefits

We offer a competitive salary and benefits, including:

  • Medical, dental and vision insurance - 100% paid for the employee
  • Life Insurance 
  • Short and long-term disability insurance 
  • Flexible Spending Account
  • Flexible, full-service childcare support with Kinside
  • 401(k) with a generous employer match
  • Flexible PTO
  • Catered lunch each day in our offices
  • Weekly massages in NJ office
  • A casual work environment
  • Work culture focused on innovative disruption

California Consumer Privacy Act - California applicants only

CoreWeave is an equal opportunity employer, committed to diversity and inclusiveness. We will consider all qualified applicants without regard to race, color, nationality, gender, gender identity or expression, sexual orientation, religion, disability or age.


Date Posted

06/07/2023
