Database Reliability Engineer (DRE)
Job Description
Airtable enables any team to create apps on top of shared data and power their most critical and unique workflows. Teams at more than 300,000 organizations, including 80% of the Fortune 100, rely on the Airtable Connected Apps Platform to connect their people and data and achieve their most important goals. Founded in 2013 and headquartered in San Francisco, Airtable ranks #6 on the Forbes Cloud 100 and has raised $1.36 billion to date.
The Storage team at Airtable is looking for Database Reliability Engineers (DREs) to own the reliability, scalability, and performance of our MySQL database infrastructure. DREs at Airtable combine the deep database expertise of database administrators with the engineering methodology and operational discipline of site reliability engineers. We run petabyte-scale MySQL clusters that serve hundreds of thousands of queries per second, and DREs are critical to Airtable’s success as we continue to scale.
As a DRE at Airtable, you will be entrusted with identifying and leading large projects to enhance our database infrastructure. You will work closely with software engineers and site reliability engineers to execute these projects, with a particular focus on aspects like reliability, observability, and operational ease-of-use.
As an example of what you might work on, we blogged about our team’s work upgrading our database infrastructure from MySQL 5.6 to MySQL 8.0.
Some potential projects on the horizon include:
- Implementing a zero-downtime failover capability for MySQL, to improve both mean-time-to-recovery and our own operational capabilities. This may involve reworking our ProxySQL architecture, or using open-source tools like Orchestrator.
- Developing self-service tooling and frameworks that allow product developers to independently provision and operate new database clusters.
- Optimizing the cost of our database infrastructure. This includes improvements like automated processes for deleting and archiving cold data, increasing the storage density of our databases, and exploring alternative database instance types.
- Creating a testing framework that can validate large MySQL infrastructural changes with production-level load.
- Evaluating next-generation distributed databases like Vitess, TiDB, and CockroachDB, with a focus on aspects like fault tolerance, backup/restore, observability, and operational capabilities.
As a member of the Storage team, you will participate in an on-call rotation for our transactional storage systems. DREs are exemplary incident responders: they apply rigorous thinking and strong debugging skills to quickly remediate problems in high-pressure situations, and afterwards lead blameless postmortems to understand and address root causes.
Finally, as one of the first DREs at Airtable, you will have the opportunity to influence the Storage team’s vision and strategy, and create and evangelize best practices, processes, and tooling related to our database infrastructure.
What you'll do
- Own all aspects of the reliability, scalability, and performance of Airtable’s MySQL database infrastructure, using a data-driven approach (metrics, SLOs) to prioritize improvements.
- Work with technologies like MySQL, RDS, Terraform, NodeJS, TypeScript, Datadog, ELK, and Sentry.
- Partner with SWEs and SREs to identify and lead projects to enhance Airtable’s MySQL database infrastructure, working collaboratively to gather requirements, align stakeholders, and develop and execute project plans.
- Build tools to automate operational processes, with a focus on reducing manual work and improving developer velocity.
- Participate in an on-call rotation to maintain high-scale, mission-critical databases that are essential to Airtable.
- Influence the vision, strategy, and culture of a rapidly growing team.
- You have 4+ years of experience operating high-scale production database clusters (MySQL, Percona Server, MariaDB, Vitess, TiDB, etc.).
- You are curious, driven, and thorough. Whether it's scripting bulletproof automation, testing scripts, writing recovery playbooks, or digging into logs to address an incident, you value thoroughly understanding the root cause of a problem, solving it, and solving it well.
- You enjoy writing clean, maintainable, well-tested code. In some situations it’s the right decision to move fast and incur technical debt, but with the understanding that it needs to be paid back later.
- We have your medical, dental, and vision insurance 100% covered (and your dependents covered at 80%)
- High deductible health plan available with health savings account contribution
- Complimentary One Medical membership for individuals and dependents
- Monthly “Lifestyle Wallet” to use for benefits ranging from personal fitness (e.g., gym memberships, fitness equipment) to pet care, nutrition coaching, and more
- Complimentary mental health support via Modern Health
- Family planning support via Carrot (fertility, adoption, and surrogacy)
- Flexible and generous time off and sick time benefits
- 16 weeks of parental leave
- Annual Learning & Development wallet to support your career development
- Emergency backup care for dependents
- Access to financial planning and legal support
- Supplemental reimbursement for Gender Affirmation procedures and services
- Access to pre-tax Transportation & Commuter Benefits
Please see our Privacy Notice for details regarding Airtable’s collection and use of personal information relating to the application and recruitment process.
Date Posted
05/24/2023