Are you passionate about building scalable data solutions and working with cutting-edge technologies? We're looking for a Data Engineer with deep expertise in Databricks to help us design and optimize modern data processing frameworks.
In this role, you'll work in one of our IBM Consulting Client Innovation Centers (Delivery Centers), where we deliver deep technical and industry expertise to a wide range of public and private sector clients around the world. Our delivery centers offer our clients locally based skills and technical expertise to drive innovation and adoption of new technology.
Your responsibilities will include:
- Design and build streaming and batch data pipelines using Apache Spark on Databricks (a brief illustrative sketch follows this list)
- Optimize performance with Delta Lake Z-ordering and liquid clustering
- Work hands-on with PySpark and SQL, and develop our modern Data Lakehouse architecture
- Leverage the latest Databricks features: Structured Streaming, Lakeflow, Unity Catalog, and DBSQL
- Support AI/ML workflows and implement CI/CD pipelines with tools like GitHub Actions or Azure DevOps
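For a flavor of the work, here is a minimal sketch of the kind of pipeline this role involves, written in PySpark. It is illustrative only, not our production code: the paths, the bronze.events table, and the customer_id column are hypothetical assumptions.

```python
# Minimal, illustrative sketch: ingest raw JSON events into a Delta table on
# Databricks, add a streaming variant, and optimize the file layout.
# All paths, table names, and columns here are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()  # already provided on a Databricks cluster

# Batch ingestion into a bronze Delta table
raw = spark.read.format("json").load("/mnt/raw/events/")   # hypothetical landing zone
(raw.withColumn("ingested_at", F.current_timestamp())
    .write.format("delta").mode("append")
    .saveAsTable("bronze.events"))                          # hypothetical table

# Streaming variant of the same ingestion, with checkpointing for fault tolerance
stream = (spark.readStream.format("json")
          .schema(raw.schema)                               # file streams need an explicit schema
          .load("/mnt/raw/events/"))
(stream.withColumn("ingested_at", F.current_timestamp())
       .writeStream.format("delta")
       .option("checkpointLocation", "/mnt/checkpoints/events/")
       .toTable("bronze.events"))

# Co-locate frequently filtered rows to speed up downstream reads
spark.sql("OPTIMIZE bronze.events ZORDER BY (customer_id)")  # assumes a customer_id column
```

On newer Databricks runtimes, liquid clustering (ALTER TABLE ... CLUSTER BY) is the incremental alternative to Z-ordering mentioned above.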
To succeed in this role, you'll bring:
- 3+ years in data/software engineering with large-scale distributed systems.
- Strong Python skills (OOP, pytest/unittest) and good experience with PySpark, SQLGlot, and Pydantic
- Solid understanding of medallion architecture, data ingestion, and big data optimization
- Familiarity with Git workflows and platforms like GitHub, GitLab, or Azure DevOps
- Experience with cloud platforms (Azure preferred) and Databricks.
- Certifications in Databricks or Azure Data Engineering
- Experience with real-time data processing (e.g., Kafka, Spark Streaming)
- Exposure to AI/ML workflows and tools like MLflow.