Staff Operations Engineer
Job Description
Responsibilities
- Design, implement, and maintain monitoring systems and infrastructure to ensure the health, performance, and availability of our distributed monitoring platform.
- Make changes to the monitoring system according to the company's needs and processes, using the configuration management toolset.
- Collaborate with development and operations teams to integrate monitoring solutions into the software development lifecycle and operational processes.
- Guide teammates to raise their level of expertise across the team's solutions.
- Stay ahead of capacity requirements in a growing environment.
- Work actively with the team's codebase to extend system integrations and automate routine tasks.
- Conduct regular audits and assessments of monitoring systems to ensure adherence to best practices and industry standards.
- Represent the team in global incident resolution and participate in the on-call rotation.
Skills
- 5+ years of proven experience as an Operations Engineer or in a similar role, with a focus on designing and implementing monitoring solutions.
- In-depth knowledge of Linux.
- Excellent problem-solving and troubleshooting skills.
- Knowledge of one of the programming languages (see Preferable technology stack).
- Deep understanding of the monitoring domain and SaaS approaches.
- Experience in implementing and operating monitoring systems in large-scale, heterogeneous, and fast-growing environments.
- Proficiency in cloud platforms.
- Thorough knowledge of one or more of the configuration management tools.
- Familiarity with ITIL or other IT service management frameworks.
- Experience with one of the agile development approaches.
- Ability to work in a diverse multicultural environment, communicating with globally distributed teams.
- Customer-centric mindset.
- Team player who is also a self-starter.
- Fluent in spoken and written English.
Preferable technology stack
- OS: Linux (CentOS/RedHat/Oracle Linux).
- Programming languages in order of preference: Go, Python, PHP, Perl.
- Cloud: AWS, GCP.
- Containerization: Kubernetes.
- Distributed Log: Kafka, ELK stack.
- Monitoring: Zabbix, Prometheus, CloudWatch, Grafana.
- DBs: VictoriaMetrics, MongoDB, PostgreSQL, MySQL.
- Configuration management: Ansible, Terraform, ArgoCD, Spinnaker.
- VCS: GitLab.
- HA: Keepalived, HAProxy.
Qualification
- B.S. in Computer Engineering, Computer Science, or a related field, with 5 years of related experience.
Date Posted
05/12/2023