A career in IBM Software means you’ll be part of a team that transforms our customers’ challenges into solutions.
Seeking new possibilities and always staying curious, we are a team dedicated to creating the world’s leading AI-powered, cloud-native software solutions for our customers. Our renowned legacy creates endless global opportunities for our IBMers, so the door is always open for those who want to grow their careers.
We are seeking a skilled Software Developer to join our IBM Software team. As part of our team, you will be responsible for developing and maintaining high-quality software products, working with a variety of technologies and programming languages.
IBM’s product and technology landscape includes Research, Software, and Infrastructure. Entering this domain positions you at the heart of IBM, where growth and innovation thrive.
We are seeking an experienced and highly skilled Spark Scala Developer. The ideal candidate will have a deep understanding of distributed computing, data pipelines, and both real-time and batch data processing architectures.
Key Responsibilities:
- Design, develop, and optimize big data applications using Apache Spark and Scala.
- Architect and implement scalable data pipelines for both batch and real-time processing.
- Collaborate with data engineers, analysts, and architects to define data strategies.
- Optimize Spark jobs for performance and cost-effectiveness on distributed clusters.
- Build and maintain reusable code and libraries for future use.
- Work with various data storage systems, such as HDFS, Hive, HBase, Cassandra, Kafka, and Parquet.
- Implement data quality checks, logging, monitoring, and alerting for ETL jobs.
- Mentor junior developers and lead code reviews to ensure best practices.
- Ensure security, governance, and compliance standards are adhered to in all data processes.
- Troubleshoot and resolve performance issues and bugs in big data solutions.
Required Skills and Experience:
- 12+ years of total software development experience.
- A minimum of 5 years of hands-on experience with Apache Spark and Scala.
- Strong experience with distributed computing, parallel data processing, and cluster computing frameworks.
- Proficiency in Scala, with deep knowledge of functional programming.
- Solid understanding of Spark tuning, including partitions, joins, broadcast variables, and performance optimization techniques.
- Experience with cloud platforms such as AWS, Azure, or GCP (especially EMR, Databricks, or HDInsight).
- Hands-on experience with Kafka, Hive, HBase, NoSQL databases, and data lake architectures.
- Familiarity with CI/CD pipelines, Git, Jenkins, and automated testing.
- Strong problem-solving skills and the ability to work independently or as part of a team.
- Experience with Databricks Delta Lake or Apache Iceberg.
- Exposure to machine learning pipelines using Spark MLlib or integration with ML frameworks.
- Experience with data governance tools (e.g., Apache Atlas, Collibra).
- Contributions to open-source big data projects are a plus.
- Excellent communication and leadership skills.