Site Reliability Engineer
Company
Ventures Lab
Location
Malta
Type
Full Time
Job Description
The purpose of the Site Reliability Engineer (SRE) role is to ensure the stability, scalability, and performance of production systems while driving improvements in overall system reliability and operational efficiency. By bridging the gap between development and operations, the SRE role focuses on creating a resilient infrastructure through automation, monitoring, and proactive incident management.
The SRE is responsible for designing and implementing tools and processes that enhance the reliability of applications, reduce downtime, and optimize system performance. They work to establish best practices for high availability, incident response, and continuous improvement, ensuring seamless user experiences and aligning system operations with business objectives. The SRE plays a critical role in both preventing and rapidly resolving issues, contributing to a stable, scalable, and reliable technology ecosystem.
- Design, implement, and maintain highly available infrastructure, focusing on failover strategies, redundancy, and scalability.
- Develop and maintain Infrastructure as Code (IaC) scripts using tools like Terraform, Ansible, or CloudFormation.
- Set up and manage monitoring and alerting systems to proactively detect issues (using tools like Prometheus, Grafana, or Datadog).
- Automate repetitive tasks, deployments, and infrastructure provisioning to improve efficiency and reduce human error.
- Conduct performance tuning and optimizations across infrastructure, applications, and databases to improve responsiveness and reduce latency.
- Work closely with security teams to ensure compliance with regulatory standards, address vulnerabilities promptly, and implement security best practices across infrastructure and applications to protect systems and data.
- Collaborate with development teams to optimize applications and integrate reliability into the software development lifecycle.
- Partner with DevOps to improve CI/CD pipelines, streamline releases, and enhance build and deployment automation.
- Advocate for Site Reliability Engineering principles and educate teams on reliability best practices, monitoring, and error handling.
- Implement and track SLAs, SLOs, and error budgets, continuously assessing and improving reliability.
- Infrastructure as Code (IaC): Proficiency with IaC tools such as Terraform, Ansible, CloudFormation, or similar for automating infrastructure provisioning.
- Cloud Platforms: Strong experience with cloud providers (Azure) and services such as Kubernetes (EKS/GKE/AKS).
- Monitoring and Alerting: Hands-on experience with monitoring and alerting tools (Prometheus, Grafana, Datadog, New Relic, or similar).
- Scripting and Automation: Proficiency in scripting languages like Python, Bash, or PowerShell for automation and tooling.
- CI/CD and DevOps: Familiarity with CI/CD pipelines and tools (Azure DevOps, Bamboo, or Octopus), and experience implementing continuous delivery and deployment practices.
- Incident Management: Experience with troubleshooting, root cause analysis, and leading incident response efforts.
- Strong performance optimization skills
- Ability to analyze complex systems
- Understanding of security practices
- Competitive salary commensurate with skills and experience
- Performance bonus structure dependent on achievement of set targets and personal performance
- Consultancy contract (B2B) offering paid time off

Date Posted
12/05/2024