Site Reliability Engineer

AddShoppers · Charlotte, NC

Company

AddShoppers

Location

Charlotte, NC

Type

Full Time

Job Description

 

AddShoppers is searching for a Site Reliability Engineer (SRE) with deep Linux and automation knowledge to join our remote team on a full-time basis. As we continue to scale our fully distributed operations, you will help ensure the reliability, scalability, and performance of our infrastructure. Your primary responsibility will be to keep that infrastructure available and stable by proactively monitoring, troubleshooting, and resolving incidents. You will collaborate closely with cross-functional teams, including software engineers, operations, and product managers, to optimize our systems and enhance their resilience.


Principal Responsibilities:

  • Monitor and maintain the health, performance, and availability of our production systems, ensuring proactive identification and resolution of potential issues
  • Troubleshoot and resolve incidents and outages promptly and effectively, minimizing downtime and impact on end-users
  • Develop and implement monitoring and alerting solutions to proactively detect and address performance bottlenecks, security vulnerabilities, and other issues
  • Collaborate with software engineering teams to design and implement scalable, reliable, and highly available systems
  • Own the development pipeline stages for cloud applications, from planning through environments and structures to logging and monitoring solutions
  • Lead infrastructure and platform deployments across cloud environments
  • Plan and perform project work aimed at increasing the availability and scalability of all components of our infrastructure
  • Perform end-to-end proofs of concept (POCs) on new tools or technologies, and help adopt and implement new DevOps tools and processes
  • Troubleshoot and debug issues using tools like tcpdump, nmap, and netstat, and interpret their output to direct development teams on what fixes are needed
  • Stay updated with industry trends and emerging technologies, applying them to enhance our infrastructure and SRE practices

 

Experience:

  • 3-5+ years of experience with DevOps and cloud engineering overall, including 3+ years of hands-on engineering with related tools
  • Bachelor's degree in computer science, engineering, or mathematics
  • Excellent skills in automating, deploying, and configuring infrastructure tools, and in developing cloud-native distributed systems using automation and IaC (infrastructure as code) tools like Ansible, Puppet, and Terraform
  • GCP is the preferred platform but we realize skills are transferable
  • Experience developing container orchestration platforms, microservices, and serverless architecture
  • Solid understanding of Internet protocols (IP, TCP, HTTP, DNS, SSL/TLS)
  • Strong background in build and deployment processes, CI/CD, and application configuration management
  • Track record of efficiently migrating legacy workloads onto scalable platforms
  • Exposure to security concepts, best practices and policies for cloud-based deployments
  • Familiarity with our tech stack (or related technologies), which includes GCP, Python, MongoDB, Bash, and Elasticsearch
  • Knowledge of logging and performance management tools such as Splunk, Dynatrace, and AppDynamics
  • Experience with Web application firewall (WAF) concepts and technologies
  • Expertise in communication protocols, particularly the details of HTTP and HTTP server implementations
  • Knowledge of network configuration, including subnets, VPNs, DNS, and DHCP

Date Posted

06/27/2023
