Site Reliability Engineer C-477

SMASH · South Bay

Company

SMASH

Location

South Bay

Type

Full Time

Job Description


SMASH: Who We Are

We are agents for tech professionals in Costa Rica and Colombia, helping them build careers in the United States.

  • We believe in long-lasting relationships with our talent. We invest time getting to know them as individuals and understanding what they are looking for as their professional next step. 
  • We aim to find the perfect match. As agents, we pair our talent with our US clients not only for their technical skills but also for cultural fit. Our core competency is finding the right talent, fast.
  • We purposefully move away from the “contractor” or “outsourcing” type of relationship. Our clients don’t want contractors or “just a service.” Neither does our talent.

 Our Benefits

  • Work from Home
  • English Academy for Employees and Relatives
  • Business Skills Coach – Certifications
  • Discounts with Tech Universities
  • Events and additional Perks

Job Description 
The Site Reliability Engineer (SRE) is responsible for keeping all member-facing and internal production systems running smoothly. As an SRE, you will work with multiple teams to encourage SRE principles, maintain the availability and reliability of systems, establish SLIs/SLOs, and develop tools and monitoring for operational visibility. SREs are members of the scrum teams and work closely with quality and software engineers to support services prior to general availability through activities such as launch reviews, performance reviews, and validating logging in dev environments. The SRE is also responsible for ensuring quality releases to production environments and participates in an on-call rotation, working with internal and vendor teams to manage, troubleshoot, and resolve production issues.

To be effective, an individual must be able to perform each job duty successfully.

  • Keep current with emerging testing techniques and technologies, as well as emerging development practices.

  • Assist in diagnosing, finding the root cause of, reporting, and tracking production and non-production issues.

  • Continually research new ways of improving and scaling systems and services.

  • Lead initiatives to improve the reliability, scalability, and availability of production applications.

  • Build out tools, platforms, and processes to enable these goals.

  • Lead and contribute to the design, development, and improvement of SRE practices and procedures.

  • Create and maintain health dashboards, identify and measure health indicators and SLIs/SLOs, and provide tools for operational visibility of production systems.

  • Participate in and contribute to improving our incident response, acting as an escalation point for production incidents.

  • Perform root cause analysis (RCA), troubleshoot, and debug issues across our applications and services to identify and fix the root cause.

  • Enhance and maintain the software release procedures and processes.

  • A strong desire and aptitude for system automation to eliminate manual work in day-to-day operations.

  • Skilled with application monitoring practices and tools (New Relic, Azure Monitor, Datadog, Splunk, etc.).

  • Understanding of and experience with SRE and DevOps principles. Demonstrated experience working in Agile teams leveraging Scrum, Kanban, or other methodologies and/or understanding of Agile development concepts.

  • Meet the needs of the end user in a quality, consistent, and professional manner, using independent judgment where appropriate.

  • Mentor less experienced engineers.

  • Excellent communication skills (verbal and written) are critical, along with exceptional problem-solving skills and exceptionally professional behavior when interacting with and responding to other technical teams throughout the organization.

  • Take part in an on-call rotation.

  • Perform additional duties and responsibilities as assigned.

Experience 

  • Minimum 4 years of professional experience in site reliability engineering, software development, or systems administration.

  • Experience monitoring or troubleshooting web applications.

  • Experience with Scrum and associated tools such as Azure DevOps or Jira

  • Experience with some of the following tool sets:

    • Application monitoring tools (New Relic, Datadog, Splunk, etc.)
    • Automation tools (Pega, Microsoft Power Platform, Logic Apps, etc.)
    • API tools (Rest#, Postman, Swagger, etc.)
    • Front end tools (Selenium, Page Object Model, etc.)
    • Backend tools (SQL Server, Entity Framework, Dapper, etc.)
    • Build tools (Node, Docker, Azure Pipelines, etc.)
    • Infrastructure as Code (Terraform, Ansible, Chef, etc.)
  • Experience with automating, monitoring, and/or alerting on some of the following:
    • Web applications in Angular and React
    • Internal support tools
    • 3rd party integrations
    • Database and API connections (REST and SOAP)
    • Cloud Solutions (AWS, Azure, or others)
  • Experience working in an agile CI/CD or rapid software testing environment.

  • Experience with and understanding of Git and source control concepts.

Date Posted

04/24/2024
