Senior AI/ML Ops Engineer
Job Description
This position focuses on the end-to-end workflow for our AI/ML models and data. It will require our team members to wear different hats as we scale our deployments, our maturity and our organization.
- Data engineering - work with our cloud system data engineers and our data scientists to ensure high data quality and reliability in our data warehouse. Design and implement our feature store to support repeatable, high-quality model builds.
- AI/MLOps - design and implement an improved model workflow including EDA, feature engineering, training, evaluation, deployment and monitoring. Implement versioning, observability and reporting. Monitor efficiency, cost, performance and reliability.
- DevOps - we are a product-oriented company, so you will need to work with software and DevOps engineers on our teams to build out GitOps pipelines for new APIs, clusters and tools that we develop. This includes deployment tools, security, observability and alerting.
Our process is highly iterative, high velocity and focused on quality and operational excellence. We use small, product-focused teams that own features and products end-to-end, so you will improve infrastructure and implement strategic changes as part of the product creation and release process. We emphasize general software skills and the ability to work interactively with an LLM to experiment and implement infrastructure over experience with specific technologies.
In your first year at Tarana, you will take ownership of the end-to-end model lifecycle, and help us to consistently deliver high-quality, high-reliability, cost-effective machine learning and AI systems as part of various products. You will guide our strategic decisions around technology and implementation by working with other engineers and executives to develop business-focused designs based on experience and empirical data.
This is a hands-on role in a low-overhead team of builders.
Required Skills & Experience:
- BS or higher in Computer Science, MS preferred
- 5-12 years of experience building large-scale ML/AI models and systems
Knowledge, Skills and Abilities:
- Strong understanding of software systems and programming, with and without assistance from an LLM.
- Strong Python, Spark, Pandas and SQL skills. Scala and Rust are a bonus.
- Strong knowledge of cloud platforms such as AWS, Azure or Google Cloud and experience with infrastructure-as-code tools like Terraform or CloudFormation.
- Proficiency in containerization technologies such as Docker and container orchestration platforms like Kubernetes.
- Experience with CI/CD tools such as GitLab CI/CD, GitHub Actions or CircleCI.
- Familiarity with schema design and data warehouse architectures.
- Familiarity with machine learning frameworks and libraries such as PyTorch, TensorFlow and scikit-learn.
- Understanding of DevOps/Agile/Lean core principles and how to apply them.
- Strong problem-solving and troubleshooting skills for complex systems.
- Experience with ML workflow tools such as MLflow, Metaflow and/or Kubeflow.
- Experience with monitoring, metrics, and logging for model performance and data pipelines (Prometheus, Grafana, etc.).
The salary range for this position is $180,000 to $240,000.
Compensation will be determined based on several factors including, but not limited to: skill set, years of experience and the employee's geographic location.
Tarana provides competitive benefits to employees in this role, including medical, dental and vision benefits, 401(k) match, flexible time off and stock options.
Date Posted
09/13/2024