Job Description
As a Corporate Site Reliability Engineer (SRE) at Dropbox, you will help lead the infrastructure strategy and technical direction of one of the most innovative technology companies globally. Successful candidates will have a growth mindset and strong accountability, and will be passionate about designing, building, and securing scalable infrastructure services in a dynamic environment. You will drive improvement projects in automation and observability and handle incidents promptly but in a measured way. In this role, you’ll serve as a technical lead for programs related to monitoring, metrics, alerting, and reliability throughout the IT Services organization, and contribute to the evolution of our world-class infrastructure while ensuring security and scalability.
Responsibilities
- Ensure the reliability, scalability, and performance of Dropbox's infrastructure and services
- Collaborate with cross-functional teams to develop and maintain best practices for monitoring, logging, and incident response
- Build, implement, and maintain automation and infrastructure-as-code tooling, specifically Terraform, Ansible, and GitHub Actions, as well as custom code platforms
- Utilize container orchestration platforms, such as Kubernetes, Amazon ECS, and Red Hat OpenShift, to manage containers at scale
- Manage and optimize monitoring and logging pipelines using tools like Datadog and Cribl LogStream
- Drive improvement projects related to service health and visibility for our stakeholders, ranging from developers to business service owners to C-level executives
- Develop and maintain custom tooling and automation scripts in Bash, Python and other scripting languages
- Perform periodic on-call duty to manage incidents related to the reliability of Dropbox corporate services
- Participate in on-call rotations and provide support during and after incident response calls
Requirements
- 5+ years of experience in site reliability engineering or a similar engineering role with hands-on coding experience
- Strong knowledge of AWS services, including EC2, S3, RDS, R53, Lambda, and others
- Strong knowledge of Linux administration, internals, filesystems, volume management, and specific distros such as Ubuntu and RHEL, as well as DNS and DHCP
- Experience with monitoring and logging tools such as Datadog, and logging pipeline tools such as Vector or Cribl LogStream
- Experience driving one or more transformational programs related to metrics and observability
- Experience with scripting in a higher-level language (Python preferred)
- Experience developing automation to solve infrastructure-related tasks with tools such as Chef/Ansible/Terraform
- Experience with log analysis and building metrics, alerts and visuals from log data
- Strong proficiency in infrastructure-as-code tools, such as Terraform
- Strong proficiency in configuration management tools, specifically Ansible Automation Platform and Chef
- Experience with containerization technologies, such as Docker, and container orchestration platforms like Kubernetes or Amazon ECS
- Knowledge of LDAP, REST APIs, and current authentication and authorization practices
- Familiarity with GitHub and Git-based workflows
- Understanding of RDS databases and network security technologies, such as WAF
- Strong problem-solving skills and the ability to work well in a fast-paced, collaborative environment
- Excellent written and verbal communication skills
Date Posted
08/27/2023