Job Description
Lob was founded in 2013 by technical co-founders with a vision to connect the world, one mailbox at a time. Today, we're transforming the way businesses use direct mail and bringing the power of technology to a traditionally manual channel.
Our modern logistics and fulfillment engine helps businesses build and scale high-quality, personalized direct mail programs without the operational burden. As we grow to meet the evolving needs of our customers and expand our product offerings, we’re building a team to shape the future of direct mail.
About The Role
We are looking for a Senior Platform Engineer to help scale and improve the reliability, observability, performance, and cost efficiency of our platform infrastructure.
This role is focused on observability engineering and infrastructure optimization across AWS environments. The ideal candidate has deep hands-on experience with Datadog, OpenTelemetry, and HashiCorp Nomad, and understands how to build highly visible, scalable, and operationally efficient systems while actively reducing unnecessary infrastructure spend.
You will work closely with engineering teams to improve telemetry, monitoring, performance testing, platform reliability, and cloud infrastructure efficiency across a fast-moving distributed environment, including leveraging modern AI-driven tooling and operational workflows where appropriate.
What You’ll Work On
- Building and improving observability across distributed systems and services
- Designing dashboards, alerting, metrics, tracing, and telemetry pipelines
- Improving operational visibility using Datadog and OpenTelemetry
- Helping evolve and mature the organization’s observability strategy and tooling
- Supporting and improving HashiCorp Nomad orchestration environments
- Identifying and implementing AWS cost-saving opportunities across compute, storage, and platform infrastructure
- Improving infrastructure utilization and operational efficiency across Nomad workloads
- Optimizing S3 storage utilization, lifecycle management, and storage costs
- Designing and maintaining performance testing environments and tooling
- Running load and performance tests to identify bottlenecks and scalability issues
- Managing and tuning Elasticsearch/OpenSearch environments
- Troubleshooting production performance issues across services, infrastructure, and databases
- Partnering with engineering teams to improve platform reliability, scalability, and infrastructure efficiency
Responsibilities
- Lead observability initiatives across infrastructure and applications
- Design and maintain monitoring, telemetry, dashboards, tracing, and alerting systems
- Build actionable visibility into platform health, reliability, and performance
- Improve incident detection, troubleshooting, and operational response capabilities
- Define observability standards and best practices across engineering teams
- Drive infrastructure cost optimization initiatives across AWS services and platform environments
- Analyze infrastructure utilization and recommend performance and cost efficiency improvements
- Maintain and improve infrastructure-as-code standards and workflows
- Design, build, and maintain scalable performance testing environments and tooling
- Execute and analyze load/performance testing initiatives
- Support and improve Nomad-based orchestration environments
- Troubleshoot complex production and infrastructure issues across distributed systems
- Collaborate closely with engineering teams to improve scalability, reliability, operational visibility, and infrastructure efficiency
- Create and maintain operational documentation and platform best practices
Qualifications
- 7+ years of experience in platform engineering, infrastructure engineering, or site reliability engineering
- Strong hands-on experience with HashiCorp Nomad
- Deep expertise with Datadog
- Strong experience implementing and operating observability platforms using OpenTelemetry and modern monitoring tooling
- Experience with Grafana or similar visualization and observability platforms
- Strong understanding of distributed tracing, metrics, logging, and monitoring best practices
- Experience building dashboards, alerts, telemetry pipelines, and operational visibility tooling
- Strong experience identifying and implementing AWS cost optimization strategies in production environments
- Strong knowledge of S3 optimization, lifecycle management, and storage cost reduction
- Experience building and running performance/load testing environments
- Strong troubleshooting and performance analysis skills across distributed systems
- Strong experience operating infrastructure in AWS environments
- Strong experience with Terraform and infrastructure-as-code practices
- Experience balancing platform reliability, observability, and infrastructure cost efficiency at scale
- Experience working with distributed and event-driven architectures using technologies such as Redis, SQS, or Temporal
- Experience managing and tuning Elasticsearch or OpenSearch clusters
- Experience working in fast-paced engineering environments
- Strong communication and collaboration skills
Nice to Have
- Exposure to PostgreSQL RDS-to-Aurora migrations
- Experience with Kubernetes
- Experience with CI/CD systems and deployment automation
- Experience with Go, Python, or TypeScript
Since great engineers come from a variety of backgrounds, it doesn’t particularly matter if you have a specific degree. We want to hear about your contributions in a real-world setting.
Compensation information
The compensation for this role consists of a base salary + additional RSUs.
Annual Base Salary: $160,000 - $177,500
“Lob’s salary ranges are based on market data relative to our size, industry, and stage of growth. Salary is one part of total compensation, which also includes equity, perks, and competitive benefits. Salary decisions are based on many factors, including geographic location, qualifications for the role, skillset, proficiency, and experience level. Lob reasonably expects to pay candidates who are offered roles within the provided salary ranges.”
We offer remote working opportunities in AZ, CA, CO, DC, FL, GA, IA, IL, MA, MD, MI, MN, MT, NE, NC, NH, NJ, NV, NY, OH, OR, PA, RI, TN, TX, UT, and WA, unless specified otherwise in the job description above.
If you are looking for a progressive, fun-spirited, and mentally stimulating environment, come join us at Lob!
Our Commitment to Diversity
Lob is an equal opportunity employer and values diversity of backgrounds and perspectives to cultivate an environment of understanding to have greater impact on our business and customers. We encourage under-represented groups to apply and do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, disability status, or criminal history, in accordance with local, state, and/or federal laws, including San Francisco’s Fair Chance Ordinance.
Recent awards
#88 on BuiltIn's Best Remote Midsize Companies to Work For in 2025
BuiltIn Best Remote Midsize Companies to Work For in 2024
BuiltIn Best Midsize Companies to Work For 2022
Lob Compensation & Benefits Highlights
- Healthcare Strength: Medical, dental, and vision coverage begins on day one, with multiple plan options and substantial employer contributions toward premiums. Mental-health support and HSA funding further bolster overall coverage depth.
- Parental & Family Support: 16 weeks of paid parental leave and fertility/family-planning support through Carrot are available to all parents, with adoption assistance included. Additional family-friendly measures include temporary childcare stipends and compassionate leave that extends to pets.
- Equity Value & Accessibility: All benefits-eligible employees receive equity as RSUs with four-year vesting, with additional performance grants available. Broad-based participation increases access to ownership across roles.
Why Work With Us
We believe automation is a catalyst for business success, and we’re looking for unique people like you to bring to life our vision of increasing the connectivity between the offline and online worlds. We are proud to be a carbon-neutral business!
Lob Offices
Remote Workspace
Employees work remotely.
We are a remote-first company! We celebrate flexibility in our working environment and sponsor co-working space use.
Date Posted
05/15/2026