OCI Site Reliability Engineer
Job Description
The Company
BetaNXT is a provider of frictionless wealth management enterprise solutions, supported by decades of experience and powered by real-time data capabilities. Our integrated approach empowers our clients to deliver a comprehensive, end-to-end advisor and investor experience. Leading the way has been our mission from the start.
About the Role
BetaNXT is seeking an OCI Cloud SRE. The person in this role must exercise good judgment and business acumen in selecting methods and techniques to connect our cloud applications to cloud services or on-prem applications. The role also leads continuous support and enhancement of shared application frameworks while collaborating closely with development, infrastructure, and operations teams to develop, monitor, and tune security controls that support BetaNXT's commitment to Zero Trust Architecture.
Responsibilities
Responsible for the operation of production environments, including systems and databases, supporting critical business operations. Install, monitor, maintain, support, and optimize all production server hardware and software. Perform administration and analysis for multiple production environments and recommend novel solutions to improve availability, performance, incident resolution, observability, and supportability. This is an opportunity to combine deep technical knowledge with administration and analysis knowledge of Oracle Cloud Infrastructure to provide escalation support for a wide range of complex production problems related to immense growth, scaling, leveraging the cloud, extremely high performance, and high availability requirements. Coordinate escalated support cases and lead the appropriate internal technical resources and/or third-party vendors to resolution. Provide on-call support. Take responsibility for Oracle production environments; assist with server operating system and application upgrades, bug fixes, and patching; and work on standardization projects for both hardware and software under the Oracle technology stack while providing the consistent system uptime expected in a cloud environment.
Specific Duties
- Experience deploying code within change management procedures.
- Experience participating in or running incident recovery calls.
- Experience with and thorough knowledge of OCI administration, with a strong understanding of tenancies, VCNs, compartments, network layers, firewalls, subnets, and storage options (FSS, Block Volume, Object Storage).
- OCI infra-administration including installation, configuration, migrations, tuning, patching, administration & monitoring.
- Hands-on experience writing and modifying Terraform scripts for deployments on OCI and for managing OCI infrastructure.
- Hands-on experience with performance tuning, hardware upgrades, and resource optimization. Configure CPU, memory, and disk partitions as required and ensure high availability of infrastructure.
- Experience applying OS patches and upgrades on a regular basis and upgrading administrative tools and utilities. Configure or add new services as necessary.
- Experience handling OS- or application-related security vulnerabilities is an added advantage.
- Hands-on experience provisioning storage, compute nodes, and networks, and understanding the requirements for each.
- Experience with Certificate renewals on OCI.
- Strong understanding of HA and DR concepts, including setting up infrastructure for a DR site.
- Experience with IaaS solutions: virtual machines/networks, on-premises/hybrid cloud computing, cloud identity, security models, cloud monitoring, logging, local and cloud storage.
- Well versed in ITIL-based service delivery for Incident Management, Change Management, Problem Management, Capacity Management, and Configuration Management.
Qualifications
- 6+ years of experience with cloud platforms such as Azure, AWS, or Google Cloud.
- 2+ years of experience with Oracle cloud implementation and support.
- Experience in defining and implementing a comprehensive cloud reliability and observability strategy with hands-on experience with containers and container orchestration preferred.
- Strong understanding of Kubernetes concepts and ecosystem, including deploying applications. Proficiency with CI/CD tools, especially GitLab Runners and Jenkins.
- Experience with the software development lifecycle in an agile environment.
- Experience with database replication using Oracle GoldenGate.
- Proven experience with infrastructure-as-code tools such as Terraform and solid experience with CI/CD tooling.
- Proficiency in scripting languages such as Python, PowerShell, or Bash is preferred.
- Experience migrating on-premises workloads to Oracle Cloud.
- Optimize OCI infrastructure for performance, cost, and security.
- Troubleshoot and resolve issues related to OCI.
- Monitor OCI infrastructure performance and implement improvements as needed.
- Use of modern practices to increase the speed of delivering software (e.g., Agile, DevOps, test automation).
- Excellent analytical and problem-solving skills.
- Customer obsession and a passion for delighting customers.
- Proven ability to quickly learn new technical domains and then train others.
Preferred Qualifications
- BS or MS in Computer Science or equivalent.
- Oracle OCI certifications.
- Experience with DevOps practices such as CI/CD, version control, and testing.
- Familiarity with AWS and Azure is a plus.
Date Posted
11/22/2023