Movable Ink

Lead Site Reliability Engineer

Reposted 3 Hours Ago

Hiring Remotely in Ontario, CA, USA

Remote or Hybrid

Senior level

Artificial Intelligence • Marketing Tech • Software

The magic behind your marketing.

The Role

Lead technical reliability initiatives across a multi-cloud, multi-region, active-active content platform. Architect and evolve core services, observability and logging, automation, and capacity planning. Mentor engineers, drive cross-team reliability projects, define standards (IaC, SLOs, on-call), and proactively improve platform scalability and incident outcomes.

Summary Generated by Built In

Movable Ink scales content personalization for marketers through data-activated content generation and AI decisioning. The world’s most innovative brands rely on Movable Ink to maximize revenue, simplify workflow, and boost marketing agility. Headquartered in New York City with close to 600 employees, Movable Ink serves its global client base with operations throughout North America, Central America, Europe, Australia, and Japan.

As one of our Lead Site Reliability Engineers, you will combine hands-on technical expertise with strategic technical leadership across infrastructure and software development. You will own the design and evolution of major systems within our multi-cloud, multi-region, active-active content serving platform, which serves upwards of 25 billion requests daily. Through a combination of architectural vision, cross-team collaboration, and mentorship, you will help drive the reliability initiatives and define the technical strategy that scales our platform to 50 billion requests per day and beyond.

Responsibilities:

Define and drive the automation strategy for infrastructure tooling, establishing standards that minimize manual work, increase performance, and reduce the frequency and severity of incidents
Own the design, reliability, and evolution of core platform applications, mentoring team members on best practices and ensuring systems meet long-term business objectives
Architect and lead the logging platform strategy, driving its design and balancing availability, retention, and cost optimization
Establish capacity planning and performance management frameworks, proactively identifying scaling opportunities and guiding teams through complex troubleshooting scenarios
Lead cross-functional reliability initiatives with SRE and service engineering teams, influencing architectural decisions and championing practices that ensure resilient service delivery
Demonstrate a high level of autonomy in anticipating, identifying, and addressing systemic weaknesses and opportunities for platform improvement without direct supervision

Qualifications:

Proven track record in Site Reliability or Software Engineering designing, building, and owning scalable, resilient services with a focus on long-term reliability strategy
Deep expertise in architecting and operating complex distributed systems such as Apache Pulsar, Apache Kafka, Grafana Loki, and ScyllaDB/Cassandra, with the ability to guide teams through distributed system challenges
Experience designing and owning automation strategies to manage services at scale, with expertise in establishing performance analysis frameworks and mentoring others on diagnostics and resolution
Deep hands-on experience (6+ years) in Site Reliability or Software Engineering, specifically leading and shaping multi-cloud architecture and strategy (AWS and GCP)
Experience architecting and leading large-scale observability platforms, including defining observability standards and SLO frameworks. We use Prometheus and Thanos with Grafana Alloy, Loki, and Tempo
Experience leading on-call excellence, including driving improvements to monitoring and alerting strategies, automating runbooks, and mentoring team members on incident response best practices. Every member of the SRE team does a week-long on-call rotation
Expert-level proficiency with infrastructure as code, including defining IaC standards and patterns across teams. We use Terraform and Chef
Advanced Kubernetes expertise, including cluster architecture design, multi-tenancy strategies, and guiding teams on container orchestration best practices. We use EKS and GKE
Proficiency in multiple programming languages, with the ability to design and review code that meets reliability standards. We use NodeJS, Golang, Ruby, Python, and shell scripting
Advanced Linux systems expertise, with the ability to diagnose complex system-level issues and mentor others on performance tuning and troubleshooting

The base pay range for this position is $154,000-$200,000 CAD/year, which can include an additional bonus depending on the position ultimately offered, in addition to a full range of medical, financial, and/or other benefits. The base pay offered may vary depending on job-related knowledge, skills, and experience.

Studies have shown that women, communities of color, and historically underrepresented people are less likely to apply to jobs unless they meet every single qualification. We are committed to building a diverse and inclusive culture where all Inkers can thrive. If you’re excited about the role but don’t meet all of the abovementioned qualifications, we encourage you to apply. Our differences bring a breadth of knowledge and perspectives that makes us collectively stronger.

We welcome and employ people regardless of race, color, gender identity or expression, religion, genetic information, parental or pregnancy status, national origin, sexual orientation, age, citizenship, ethnicity, family or marital status, physical and mental ability, political affiliation, disability, Veteran status, or other protected characteristics. We are proud to be an equal opportunity employer.

The Company

HQ: New York, NY

600 Employees

Year Founded: 2010

What We Do

Movable Ink personalizes every customer engagement through automation and artificial intelligence. The world’s most innovative brands rely on Movable Ink to maximize revenue, simplify workflow, and achieve the optimal customer experience. Headquartered in New York City with 600 employees, Movable Ink serves its global client base with operations throughout North America, Central America, Europe, and Australia.

Why Work With Us

Look closely at any Inker and you will find that our values remain heartfelt and timeless. We seek out knowledge and cultivate our intuition. We understand that communication starts by listening, understanding, and caring about others' success. We set a high personal bar, believe nothing is impossible, and commit ourselves fully to the goal.

Movable Ink Offices

Hybrid Workspace

Employees engage in a combination of remote and on-site work.

Typical time on-site: Flexible

HQ: New York, NY

London, GB

Munich, DE

Toronto, Ontario

Waltham, MA
