Intermediate Site Reliability Engineer - Database Operations
Company: GitLab
Location: EMEA
Type: Full Time
Job Description
An overview of this role
You will join our Database Operations team as an Intermediate Site Reliability Engineer, keeping GitLab.com (one of the largest single-tenancy open source SaaS platforms on the internet) running smoothly and reliably. In this role, you'll take ownership of the PostgreSQL database infrastructure that serves millions of developers worldwide: automating operational tasks, improving system performance and reliability, and designing solutions that scale to support hundreds of thousands of concurrent users. You'll work at a unique scale, where your decisions directly impact the experience of our customers and the feedback you generate informs product development across GitLab. Over your first year, you'll establish expertise in a core area of database operations, mentor junior team members, and drive projects that deliver measurable improvements to system efficiency and reliability.
You bring both pragmatic operational discipline and software craftsmanship to this role. You're not just responding to incidents; you're designing systems and building automation that prevent them. You'll partner with engineering teams across GitLab to review database changes, optimize performance, and help others succeed through self-service tooling and knowledge sharing. This is hands-on infrastructure work at scale, where your contributions directly shape how reliably and securely GitLab serves the entire platform.
Some examples of projects you could work on:
- Design and implement mature automation for database provisioning, replication, and backup testing using tools like Terraform and Ansible.
- Develop self-service tools and dashboards that empower other teams to manage their own database resources.
- Lead capacity planning and scalability initiatives to ensure GitLab.com continues growing reliably.
- Participate in production incident response and help implement systemic improvements to prevent recurrence.
What you’ll do
- Automate operational tasks across all environments, from package updates and configuration changes to provisioning of user-facing services, so manual effort becomes the exception, not the rule.
- Design and maintain PostgreSQL database infrastructure components that allow GitLab.com to scale reliably while supporting hundreds of thousands of concurrent users.
- Respond to production incidents and platform emergencies, working with peer SREs to diagnose and resolve database-related issues quickly and thoroughly.
- Build observability systems that monitor database health, predict capacity needs based on usage patterns, and alert on symptoms rather than outages.
- Develop and ship database performance solutions in collaboration with product and engineering teams, including query optimization, migration reviews, and infrastructure recommendations.
- Create self-service tools and automation (using Terraform, Ansible, Chef, and GitLab ChatOps) that empower engineering teams to manage their own database interactions safely.
- Document decisions, learnings, and operational procedures so that knowledge becomes repeatable actions and eventually automation.
- Participate in regularly scheduled on-call rotations to ensure GitLab.com remains operational during off-hours and weekends when necessary.
What you’ll bring
- Hands-on experience running PostgreSQL in large, high-growth production environments, including both self-managed infrastructure and database-as-a-service platforms.
- Expertise with infrastructure automation and configuration management tools such as Ansible, Terraform, Chef, or Puppet to automate operational tasks and drive system reliability.
- Solid understanding of SQL, PL/pgSQL, data modeling, and data structure design; ability to analyze PostgreSQL internals to troubleshoot and optimize systems.
- Experience working in large-scale distributed SaaS production environments where you've managed reliability, performance, and scalability challenges.
- Strong written communication skills and commitment to documentation; you thrive in remote asynchronous environments and share knowledge effectively across your team.
- Proactive, hands-on approach where you identify issues, take ownership of solutions, and contribute improvements to infrastructure and code.
- Capability to mentor junior team members and develop deep expertise in your domain areas, then share that knowledge to help others grow.
- Backend engineering experience with languages such as Ruby or Go, and/or familiarity with OLAP databases like ClickHouse.
About the team
We are responsible for building, running, and evolving the entire lifecycle of the PostgreSQL database engine that powers GitLab.com. You’ll be part of a team focused on owning the reliability, scalability, performance, and security of our database infrastructure and supporting services. GitLab.com is one of the largest single-tenancy open source SaaS sites on the internet, which means your work directly impacts hundreds of thousands of concurrent users worldwide. We operate in a fully distributed, asynchronous environment across multiple regions, collaborating on everything from database automation and infrastructure design to incident response and capacity planning. You’ll be solving novel challenges at scale, from implementing observability stacks that predict capacity needs to designing the infrastructure components that allow GitLab to scale reliably. We continuously seek to reduce complexity and improve efficiency by leveraging cloud vendor managed products and services where appropriate, ensuring GitLab.com remains a best-in-class production environment. For more on how we operate, see the Database Operations Team Handbook page.
Date Posted: 11/12/2025