Job Description
Over the past 15 years, we have seen a shift in the focus of business models across every industry – from selling physical products via one-time transactions to monetizing services via ongoing customer (aka subscriber) relationships. This is the “Subscription Economy,” a phrase coined by our CEO, Tien Tzuo, who even wrote the book on it: Subscribed.
Companies have realized that the path to growth going forward is to establish direct, digital relationships with their customers, and monetize these relationships through an ever-growing set of digital services.
Our vision is simple: we call it “The World Subscribed.” It’s the idea that one day every company will join the Subscription Economy, a $1.5 trillion opportunity by 2025 according to UBS.
Our mission: to power the world’s best companies to win in the Subscription Economy.
THE TEAM
The Site Reliability Operations Engineer at Zuora plays a critical and visible role in delivering and supporting our platform. We are responsible for scaling and optimizing the reliability, availability, and performance of our infrastructure and platform services, and partnering with Engineering teams to build highly available and performant services. We work with amazing developer teams in the design, provisioning, integration, configuration, monitoring, and incident response of large-scale distributed applications and platform services. We deliver awesome SaaS.
Responsible For:
- Service Operations & Impacting Issue Restoration
- Driving Command Center Incident Bridges for customer issues to resolution
- Responding to Observability Alerts/Alarms
- Responding to escalated issues from Customer Support
- Writing and automating runbooks, and reducing alerts, incidents, and service requests through automation
- Being a liaison for a service and partnering with the service owner to make the service rock solid and efficient
WHAT YOU’LL ACHIEVE
As an SRO, you will be a member of a team that understands the configuration, technical dependencies, and overall behavioral characteristics of production services. In partnership with developers, you have the responsibility to ensure services are designed and delivered with a focus on security, resiliency, scale, and performance. SROs are the ultimate authority and are accountable for end-to-end performance and operability of the services they own.
Champion service reliability operations and incident prevention
- You will be part of the team whose mission is the shared ownership of a collection of services and technology areas, in partnership with developer teams.
- You are a key escalation point for issues that have been documented as Standard Operating Procedures (SOPs) or issues that need in-depth troubleshooting and analysis. You will help maintain up-to-date documentation on deployments, processes, and SOP runbooks.
- You are a key escalation point in leading incidents and working with Subject Matter Experts (SMEs) to perform real-time incident handling tasks in support of operations. You will help develop and implement the incident management process.
- You will have the deep understanding of service topology and dependencies required to troubleshoot issues and define mitigations. Once you have expertly mitigated an incident, you will immediately work with SMEs on how to resolve the issue more quickly next time, with the goal of preventing the problem from recurring. You will help develop and implement the problem management process.
- You will manage the full lifecycle of infrastructure and change management, including planned maintenance and standard, normal, and emergency changes. You will help develop and implement change management processes to ensure developers and SROs can easily manage system configurations, deploy new code quickly, and fix incidents faster.
Service design and implementation
- You will partner with development SCRUM teams in defining and implementing improvements to service architecture, both current and future. You will be an expert at articulating technical characteristics of services and their dependencies, and guide development teams to engineer highly reliable and performant services.
- You will frequently partner with developer SCRUM teams and actively participate in the execution of tasks required to meet milestones and deliverables set by the team throughout a release cycle.
Operations Engineering
- You will work with a large-scale centralized monitoring and logging system to help maintain uptime and troubleshoot problems. You will understand and be able to communicate the capacity, scale, security, and performance attributes and requirements of the services you own. You will lead the design and implementation of monitoring, alerts, and responses for all infrastructure and applications.
- You will implement SRO automation, develop automation across the service reliability operation, and optimize operations hours by reducing manual operations. You will apply an engineering mindset and development skills to site reliability operations.
- You will take part in a shared on-call rotation that won’t cripple your life or kill your soul.
Job Involves:
- Resolution of complex and critical issues, and participation in major incidents as an SME
- Serving as a service expert, ensuring that expertise is reflected in SOP documentation and shared
- Instrumentation and metrics that clearly describe the service behaviors
- Scaling requirements and patterns
- Resiliency and recoverability, ensuring that backup / restore and disaster recovery capabilities are implemented, tested and maintained
- Driving and escalating gaps in automation, solutions and documentation
WHAT YOU’LL NEED TO BE SUCCESSFUL
SROs are a rare mix of sysadmins and development engineers, and as such you have the ability to understand and explain the effect of product architecture decisions on the ability to run as distributed systems. You are driven by professional curiosity and a desire to develop a deep understanding of the services and the technologies they depend upon.
You demonstrate competence in shell scripting, high-level programming languages, and infrastructure-as-code tools such as Bash, Python, Ansible, and Terraform, as well as low-code / no-code languages and solutions such as Google Apps Script, Jenkins Pipeline Groovy scripts, Jira Automation, and Rundeck.
You are proactive, self-motivated, customer-focused, organized, and a good communicator.
You have over 4 years of experience running large-scale, customer-facing web services with a solid understanding of:
- REST APIs
- Linux/Unix system internals
- Load balancing technologies, including L7 routing, DNS, and CDN
- Networking and TCP/IP
- Off-the-shelf observability (monitoring, metrics, alerting, tracing) solutions (Grafana, LogicMonitor, Pingdom) or open-source ones (Prometheus)
- Log analysis and troubleshooting using Kibana
- Standard Internet services, such as DNS, HTTP, etc.
- Cloud computing patterns
- Configuration management using Puppet, Chef, Ansible, or similar
- IT Security and compliance
- Container-based orchestration platforms such as Kubernetes/EKS/AKS and ECS at scale
- CI/CD pipelines using tools such as Git, Jenkins, Spinnaker, Terraform, and Ansible
- RDBMS and messaging fundamentals (MySQL, Oracle, and Kafka preferred)
- Programming with Python
You demonstrate practical knowledge of various aspects of distributed service design, including messaging protocols, caching strategies, persistence technologies, and queuing.
You have experience with AWS services such as EC2, ELB, ElastiCache, DynamoDB, SQS, SNS, RDS, and S3.
You are passionate about automation.
Your head is full of customer-delighting ideas for the next hackathon.
An ideal candidate will also have experience with:
- Container and Container Management technologies, such as Docker and Kubernetes
- Databases and big data stores.
- Defining and documenting technical architecture of complex and highly scalable products.
- ITIL-based incident, problem, and change management.
- Working with large global teams and coordinating well within and across various development teams.
Benefits*
- Competitive compensation, company equity, and retirement programs
- Medical, dental and vision insurance
- Paid holidays, “wellness” days, and a company-wide winter break
- Generous, flexible time off
- 6 months of fully paid parental leave
- Learning & Development stipend
- Opportunities to volunteer and give back, including charitable donation match
- Free resources and support for your mental wellbeing
*Specific benefits offerings may vary by country
About Zuora
As the Subscription Economy leader, Zuora empowers today’s innovative companies to nurture and monetize direct, digital relationships. Our award-winning multi-product portfolio now includes Zuora Revenue, Zuora Collect, and Zuora Central Platform. More recently, we’ve added subscription experience platform Zephr to our family, further expanding our capabilities to serve as an intelligent hub that monetizes the complete quote-to-cash and revenue recognition process at scale.
Through our combination of technology and expertise, Zuora (NYSE: ZUO) helps more than 1,000 companies around the world, including BMC Software, Box, Caterpillar, General Motors, Penske Media Corporation, Schneider Electric, Siemens, and Zoom, nurture and monetize direct, digital customer relationships. Headquartered in Silicon Valley, Zuora operates offices around the world in the U.S., EMEA, APAC, and LATAM.
“ZEO” Culture
At Zuora, we’re building an inclusive, high-performance culture that every ZEO wants to subscribe to. We want ZEOs at every level to feel valued, included, and inspired to innovate, connect, and collaborate authentically as we pioneer the Subscription Economy. You’ll be empowered to think like an owner and take initiative, and together with your team you’ll push each other to the next level and help transform business models everywhere.
To learn more, visit www.zuora.com.
Zuora is proud to be an Equal Employment Opportunity Employer.
Think, be and do you! At Zuora, different perspectives, experiences and contributions matter. Everyone counts. Zuora is proud to be an Equal Opportunity Employer committed to creating an inclusive environment for all.
Zuora does not discriminate on the basis of, and considers individuals seeking employment with Zuora without regard to, race, religion, color, national origin, sex (including pregnancy, childbirth, reproductive health decisions, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, genetic information, political views or activity, or other applicable legally protected characteristics.
We encourage candidates from all backgrounds to apply. Applicants in need of special assistance or accommodation during the interview process or in accessing our website may contact us by sending an email to [email protected].
Date Posted
04/18/2023