Site Reliability Engineer

Company

Rockset

Location

Peninsula

Type

Full Time

Job Description

ABOUT ROCKSET


At Rockset, we’ve built the real-time analytics database for the world's data applications. Our team and technology come from a rich heritage, rooted in the experience of building massive-scale data systems at the world’s leading companies. We created Rockset to make those kinds of powerful data platforms available to real-time application developers everywhere. We are creating a world where developers can go effortlessly from complex data sets to fast, interactive applications and analysis.


We’re a fast-growing company that values curiosity, diversity, and open-mindedness. You will solve interesting problems, surrounded by exceptional people, while making customers happy. We work hard, but also take our personal lives and experiences seriously. We are backed by Greylock Partners and Sequoia Capital and headquartered in San Mateo, CA, with offices in Boston, MA, and London, UK, and remote employees throughout the US.



As a site reliability engineer, you will be responsible for the automation, stability, security, configuration, monitoring, alerting, and capacity planning of Rockset's network, systems, and infrastructure. You will also build tools that help the rest of the engineering team be more productive, including the ones that Rockset engineers use to deploy and manage their services. You will have a foundational impact on shaping the team and the systems we create. The on-call pager is shared by most of the engineering team, not just SRE.

 

Our infrastructure is hosted entirely in Amazon Web Services. We use a variety of homegrown, open source, and commercial tools, including Kubernetes, Docker, Kafka, Zookeeper, Prometheus, Grafana, Salt, Terraform, Phacility, and Buildkite. We try to deploy new code to our production environment twice a week, but as an SRE you can expect to make production changes on a daily basis.

 

You should expect to collaborate with all other engineering teams to develop solutions that meet reliability, security, and business requirements. Lastly, you will diagnose, triage, and build solutions for complex technical issues at scale.


The US base salary range for this full-time position is $140,000/year to $215,000/year + equity + benefits. The actual pay may vary based on factors such as location, experience, and skills. Final salary will be commensurate with the candidate’s level and location. This range represents base salary only.


You'd be a great fit if you are:

  • Passionate about distributed systems, database technologies, and highly scalable services
  • Poised under fire and willing to share an on-call rotation with the rest of the team
  • A self-starter who thrives in a fast-paced environment
  • Willing to learn new skills and technologies
  • Attentive to details and comfortable with ambiguity

It would be even more awesome if you also have:

  • Bachelor's or Master's degree in Computer Science or a related field, or relevant work experience
  • 3+ years of experience as an SRE
  • Experience building and operating public-facing 24x7 web applications at scale
  • Experience working with cloud infrastructure and patterns (AWS preferred)
  • Strong programming skills in a scripting language (Python, Ruby, Bash)
  • Experience with Kubernetes, Mesos, Swarm, or similar container orchestration tools
  • Experience with Terraform, Salt, Chef, Packer, or similar infrastructure-as-code and configuration management tools
  • Experience with Grafana, Prometheus, Datadog, or similar monitoring tools
  • Experience with Azure is a plus

OUR COMMITMENT TO DIVERSITY


We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.

Date Posted

12/06/2023
