The SRE position is critical for our Client to ensure the availability of its applications on the GCP and Azure cloud platforms.
The Client also requires availability to provide planned on-call support.
SRE Responsibilities include:
- Ensure the reliability and uptime of cloud solutions and services aligned with user needs
- Support pre-launch activities, including system design consulting, platform development, capacity planning, and launch reviews
- Monitor and enhance live services by tracking availability, latency, and overall system health
- Scale systems sustainably through automation and drive improvements in reliability and delivery velocity
- Assess and optimize existing infrastructure within geoscience workflows
- Collaborate with network and security teams to ensure secure and reliable application operations
- Develop and document best practices for new projects and services
- Leverage service management systems to share lessons learned and best practices across the technical community
- Participate in incident response and conduct blameless postmortems
Key technical skills required:
- Proficiency in GCP and Microsoft Azure
- Experience with observability tools such as Grafana, Prometheus, Thanos, and Loki
- Knowledge of Google Stackdriver / Azure Monitor
- Azure CI/CD pipeline expertise
- Strong scripting and automation capabilities
- Familiarity with microservice architecture
- Experience with Azure/GCP PostgreSQL
- Experience with cloud storage, such as Azure and Google Cloud storage solutions
- Experience with container registries
- Very good English communication skills (at least B2 level)