Job Description
SMASH: Who We Are
We are agents for tech professionals in Costa Rica and Colombia, helping them build careers in the United States.
- We believe in long-lasting relationships with our talent. We invest time getting to know them as individuals and understanding what they are looking for in their next professional step.
- We aim to find the perfect match. As agents, we pair our talent with our US clients not only on technical skills but also on cultural fit. Our core competency is finding the right talent, fast.
- We purposefully move away from the “contractor” or “outsourcing” type of relationship. Our clients don’t want contractors or “just a service.” Neither does our talent.
Our Benefits
- Work from Home
- English Academy for Employees and Relatives
- Business Skills Coach – Certifications
- Discounts with Tech Universities
- Events and additional Perks
Job Description
The Site Reliability Engineer (SRE) is responsible for keeping all member-facing and internal production systems running smoothly. As an SRE, you will work with multiple teams to encourage SRE principles, maintain the availability and reliability of systems, establish SLIs/SLOs, and develop tools and monitoring for operational visibility. SREs are members of the scrum teams and work closely with quality and software engineers to support services prior to general availability through activities such as launch reviews, performance reviews, and validating logging in dev environments. The SRE is also responsible for ensuring quality releases to production environments and participates in an on-call rotation, working with internal and vendor teams to manage, troubleshoot, and resolve production issues.
To be effective, an individual must be able to perform each job duty successfully.
- Keep current with emerging testing techniques and technologies, as well as emerging development practices.
- Assist in diagnosing, finding the root cause of, reporting, and tracking production and non-production issues.
- Continually research new ways of improving and scaling systems and services.
- Lead initiatives to improve the reliability, scalability, and availability of production applications.
- Build out tools, platforms, and processes to enable these goals.
- Lead and contribute to the design, development, and improvement of SRE practices and procedures.
- Create and maintain health dashboards, identifying and measuring health indicators and SLIs/SLOs, and providing tools for operational visibility of production systems.
- Participate in and contribute to improving our incident response, acting as an escalation point for production incidents.
- Perform root cause analysis (RCA), troubleshoot, and debug issues across our applications and services to identify and fix the root cause.
- Enhance and maintain the software release procedures and processes.
A strong desire and aptitude for system automation to eliminate manual work with day-to- day operations.
-
Skilled with application monitoring practices and tools (New Relic, Azure Monitor, DataDog, Splunk, etc.)
-
Understanding of and experience with SRE and DevOps principles. Demonstrated experience working in Agile teams leveraging Scrum, Kanban, or other methodologies and/or understanding of Agile development concepts.
-
Meets the needs of the end user in a quality, consistent, and professional manner, using independent judgment where appropriate.
-
Mentors less experienced engineers.
-
Excellent communication skills (verbal and written) are critical, along with exceptional problem-solving skills, and exceptionally professional behavior when interacting and responding with other technical teams throughout the organization.
-
Take part in an on-call rotation.
-
Performs additional duties and responsibilities as assigned.
Experience
- Minimum 4 years of professional experience in site reliability engineering, software development, or systems administration.
- Experience monitoring or troubleshooting web applications.
- Experience with Scrum and associated tools such as Azure DevOps or Jira.
- Experience with some of the following tool sets:
  - Application monitoring tools (New Relic, DataDog, Splunk, etc.)
  - Automation tools (Pega, Microsoft Power Platform, Logic Apps, etc.)
  - API tools (Rest#, Postman, Swagger, etc.)
  - Front end tools (Selenium, Page Object Model, etc.)
  - Backend tools (SQL Server, Entity Framework, Dapper, etc.)
  - Build tools (Node, Docker, Azure Pipelines, etc.)
  - Infrastructure as Code (Terraform, Ansible, Chef, etc.)
- Experience with automating, monitoring, and/or alerting on some of the following:
  - Web applications in Angular and React
  - Internal support tools
  - 3rd party integrations
  - Database and API connections (REST and SOAP)
  - Cloud solutions (AWS, Azure, or others)
- Experience working in an agile CI/CD or rapid software testing environment.
- Understanding of Git and source control concepts.
Date Posted
04/24/2024