Site Reliability Engineer

QGenda · Atlanta GA

Company

QGenda

Location

Atlanta GA

Type

Full Time

Job Description

Title: Site Reliability Engineer

Classification (FLSA): Exempt

Position type: Full time

Reports to: Director, Site Reliability Engineering

Summary / Objective:

QGenda is a fast-growing, Atlanta-based healthcare software company with an amazing corporate culture, where we strive to be the best place to be a customer. Thousands of hospital departments around the world use our software to automatically generate optimized physician work schedules that accommodate complex business rules and accurately schedule the appropriate medical provider based on skill level, specialty, availability, and preferences.

As a Site Reliability Engineer, you will work with our product development teams to increase the scalability, reliability, and performance of our systems. You’ll build and extend existing automation for configuration and monitoring of our AWS-hosted applications. You’ll evaluate new AWS services and tools to determine whether they could be used in our environments. You’ll bring a focus on platform health and monitoring so that we can deliver the best possible experience for our customers.

Key Responsibilities:

Assist in Development Operations
- Partner with software engineering teams to make sure scalability and reliability are designed into and implemented in new features and products
- Promote fundamentals of site reliability across the Product Development department and the organization as a whole
- Work closely with development and operations teams to build highly available, cost-effective systems
Build and Maintain Infrastructure
- Write automation code for provisioning and operating infrastructure
- Oversee and provision infrastructure for customer-facing applications hosted in AWS within production and pre-production environments
- Maintain an understanding of new cloud computing capabilities on Amazon Web Services and look for opportunities to utilize those capabilities for our products
Ensure Application Uptime and Performance
- Use extensive metrics to identify issues before they impact our customers
- Establish end-to-end monitoring and alerting on all critical aspects of the system to ensure SLAs are met and to get proactive notifications of possible issues for all systems
- Design platforms for extremely high uptime metrics and ensure that our production SLAs are measured, monitored and maintained
- Identify underlying root causes and provide recommendations or solutions for long-term, permanent fixes to critical production issues
- Participate in service capacity planning and demand forecasting, software performance analysis and system tuning
Assure High Security Across the Application and Organization
- Troubleshoot problems across the entire cloud-based stack (network, databases, and application) and build automation to prevent problem recurrence
- Develop effective documentation, tooling, and alerts to both identify and address reliability risks
Participate in the on-call rotation with other team members on the Development Team

Knowledge, Skills and Abilities:

Advanced proficiency with at least one scripting or programming language
Solid Windows administration experience; experience with Active Directory is a plus
Strong experience supporting .NET web applications
Experience with Nginx, Apache, Docker or similar technologies
Hands-on experience building infrastructure and supporting applications in AWS using services such as Lambda, EC2, ECS, S3, SNS, SQS, RDS, Redshift, and ElastiCache
Strong understanding of networking and DNS
Familiarity with configuration management and infrastructure as code (IaC) tools such as Ansible, Terraform, or CloudFormation
Availability for off-hours deployment and upgrades of production systems during release and maintenance windows
Firm understanding of and experience with Agile and Scrum SDLC processes
Experience using a distributed version control system (Git preferred) for code check-in, branching, merging, pull requests, code review, etc.
Knowledge of CI/CD best practices and tools such as AWS CodeBuild, Jenkins and TeamCity
Experience designing and delivering secure, high-performance, and highly available cloud services
Experience working with stakeholders to define and track SLIs, SLOs and SLAs using metrics and monitoring to ensure the objectives are met or exceeded

Education / Professional Certifications or Licenses Required:

Bachelor's degree from an accredited college or university or equivalent industry experience

Work Environment / Physical Demands / Travel Requirements:

Computer-based work environment
Sitting and standing for extended periods
Lifting of 5 to 10 pounds

Date Posted

09/01/2022
