Staff Software Engineer - ML Training Platform

Reddit • USA

Company

Reddit

Location

USA

Type

Full Time

Job Description

Location: This role is completely remote-friendly. If you happen to live close to one of our physical office locations, our doors are open for you to come into the office as often as you'd like.

Who We Are: The Machine Learning Platform team at Reddit is a high-impact team that owns the infrastructure powering recommendations, content discovery, and user and content quantification, while directly impacting other teams such as Growth, Ads, Feeds, and Core Machine Learning.

What You’ll Do: As a Staff Software Engineer, Training Platform, you will work on our wider Machine Learning Platform team and be instrumental in architecting, implementing, and maintaining foundational ML infrastructure that powers Feeds Ranking, Content Understanding, Recommendations, and much more, to fulfill Reddit’s mission of bringing community and belonging to everyone in the world. You will build systems and tools that enable machine learning engineers (MLEs) and data scientists (DSs) and continuously improve the ML software development lifecycle. You will deliver a self-service ML platform that enables the continuous iteration and improvement of systems that use ML techniques including Deep Learning, Natural Language Processing, Recommendation Systems, Representation Learning, and Computer Vision.

  • Lead the building, testing, and maintenance of ML infrastructure at Reddit

  • Propose, design, and implement high-performance ML platform solutions that significantly advance the deployment of models serving millions of redditors and provide a seamless experience for MLEs

  • Play a pivotal role in designing, building, and optimizing the infrastructure and tooling required to support large-scale machine learning workflows

  • Design and implement solutions that significantly advance the architecture of the ML Platform

  • Analyze bottlenecks in distributed systems and optimize for performance and cost-efficiency

  • Work with management on team goal setting, planning, and de-risking project execution

  • Mentor other team members in adopting a rigorous DevOps approach to maintain and/or improve the health and quality of ML platform components and services

Who You Might Be:

  • 8+ years of work experience in a production software development environment or building data systems, plus a degree in ML Engineering, Computer Science, or other relevant discipline

  • Experience with the design and architecture of large-scale ML systems

  • Experience with ML frameworks such as TensorFlow, PyTorch, or JAX

  • Experience with training workflows, hyperparameter tuning, and resource optimization on CPU and GPU

  • Experience with MLOps practices and tools such as Ray and MLflow

  • Hands-on experience with Kubernetes, Docker, or other container orchestration systems

  • Experience building production-quality code that incorporates testing, evaluation, and monitoring, using object-oriented programming in Python and/or Golang

  • Comfortable with distributed systems, big data (petabyte scale), and data-intensive systems

Benefits:

  • Comprehensive Healthcare Benefits

  • 401k Matching

  • Workspace benefits for your home office

  • Personal & Professional development funds

  • Family Planning Support

  • Flexible Vacation (please use them!) & Reddit Global Wellness Days

  • 4+ months paid Parental Leave

  • Paid Volunteer time off


Date Posted

01/29/2025

