Staff Software Engineer - Grafana Databases, Managed Services at Jobgether - UK

Team: IT

This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Staff Software Engineer – Grafana Databases, Managed Services in the United Kingdom.

In this role, you will operate at the intersection of large-scale distributed systems, streaming infrastructure, and cloud database platforms, helping power mission-critical observability services used globally. You will be responsible for the reliability, scalability, and performance of multi-cloud infrastructure that underpins high-throughput metrics, logs, and traces systems. Working in a deeply technical, remote-first engineering environment, you will influence architecture decisions while remaining hands-on in production systems. Your work will directly impact the stability and efficiency of large-scale data pipelines operating across hundreds of clusters. This is a high-autonomy role where you will partner with platform and database teams to solve complex distributed systems challenges. You will also play a key role in shaping operational excellence, reliability practices, and long-term system evolution across global infrastructure.

Accountabilities

In this role, you will take ownership of large-scale streaming and database infrastructure, ensuring reliability, scalability, and performance across hundreds of production clusters while driving architectural improvements and operational excellence.

Operate and evolve large-scale multi-cloud streaming and database infrastructure across production environments
Diagnose and resolve complex cross-layer failures involving storage, compute, networking, and control-plane systems
Design and implement safe rollout, upgrade, and migration strategies across distributed systems at scale
Improve observability, automation, and operational tooling to reduce system toil and increase reliability
Define and evolve SLOs, error budgets, and reliability standards for shared infrastructure systems
Partner with engineering teams to optimize query performance, data partitioning, and system scalability
Serve as a primary escalation point for high-severity incidents and lead deep root cause analysis efforts
Drive long-term architectural improvements to reduce systemic risks across multi-cluster environments
Mentor engineers and contribute to best practices in distributed systems engineering and operational excellence

Requirements

You bring deep expertise in distributed systems, infrastructure engineering, or platform engineering, with strong experience operating high-scale production systems in cloud environments. You are highly technical, autonomous, and comfortable leading complex initiatives across global teams.

8+ years of software engineering experience in SRE, platform engineering, infrastructure, or distributed systems roles
Strong experience with large-scale streaming or database systems (e.g., Kafka, Redpanda, ClickHouse, Cassandra, or similar)
Hands-on expertise with Kubernetes in AWS, GCP, or Azure environments
Proficiency in infrastructure-as-code tools such as Terraform, Helm, or similar
Strong programming skills in systems-oriented languages (Go preferred)
Deep understanding of distributed systems behavior, failure modes, and performance trade-offs
Experience with observability, incident response, and writing post-incident reviews
Strong knowledge of Linux internals, networking, storage systems, and cloud architecture
Proven ability to lead technical initiatives and influence architectural decisions without formal authority
Excellent communication skills with the ability to work effectively in remote, cross-functional teams

Benefits

Competitive compensation package including base salary, bonus (where applicable), and equity (RSUs)
Remote-first working model with global collaboration across distributed teams
30 days of annual leave, including designated shutdown days for full disconnection
Equity ownership in the company’s long-term success through RSU participation
Access to modern AI development tools with company-supported usage budgets
Strong emphasis on autonomy, trust, and an outcome-driven engineering culture
Career growth opportunities in a fast-scaling global infrastructure organization
Exposure to cutting-edge distributed systems and large-scale observability platforms
Inclusive, transparent, and highly collaborative engineering environment

