Data Engineer - SQL, Spark, Python, Scala

Mavensoft Technologies · Portland, OR

Company

Mavensoft Technologies

Location

Portland, OR

Type

Full Time

Job Description

Job title: Data Engineer - SQL, Spark, Python, Scala (Remote)

Duration: 4 months (contract)

Key Skills: Big Data/Hadoop, Delta Lake, Python, Scala, SQL, Cloudera, Apache Hadoop, Hortonworks, MongoDB, Java, Apache Cassandra, Apache Hive, Hadoop Distributed File System (HDFS), Cloudera Impala, Apache Kafka, NoSQL databases, MapReduce, etc.

Role responsibilities:

  • Design and build reusable components, frameworks and libraries at scale to support analytics products.
  • Design and implement product features in collaboration with business and technology stakeholders
  • Identify and solve data management issues to improve data quality
  • Clean, prepare, and optimize data for ingestion and consumption
  • Collaborate on the implementation of new data management projects and the restructuring of the current data architecture
  • Implement automated workflows and routines using workflow scheduling tools
  • Build continuous integration, test-driven development, and production deployment frameworks
  • Collaboratively review designs, code, test plans, and dataset implementations produced by other data engineers to maintain data engineering standards
  • Analyze and profile data for designing scalable solutions
  • Troubleshoot data issues and perform root cause analysis to proactively resolve product and operational issues
  • Develop architecture and design patterns to process and store high volume data sets
  • Participate in an Agile/Scrum methodology to deliver high-quality software releases every two weeks through sprints
  • Requires experience with Cloudera, Apache Hadoop, Hortonworks, MongoDB, Java, Apache Cassandra, Apache Hive, Hadoop Distributed File System (HDFS), Cloudera Impala, Apache Kafka, NoSQL databases, MapReduce, etc.

The following qualifications and technical skills will position you well for this role:

  • 5+ years of experience with detailed knowledge of data warehouse technical architectures, infrastructure components, ETL/ELT, and reporting/analytics tools
  • 3+ years' experience in Big Data stack environments (Hadoop, Spark, Hive, and data lakes)
  • 3+ years' experience working with multiple file formats (Parquet, Avro, Delta Lake) and APIs
  • 3+ years' experience in cloud environments such as AWS (serverless technologies like AWS Lambda and API Gateway; NoSQL stores like DynamoDB; EMR and S3)
  • Experience with relational and non-relational databases
  • Strong coding experience in languages such as Python, Scala, and Java
  • Experience building real-time streaming data pipelines
  • Experience with pub/sub systems such as Kafka
  • Strong understanding of data structures and algorithms
  • Experience building lambda, kappa, microservice, and batch architectures
  • Experience with CI/CD processes and source control tools such as GitHub, and related development processes
  • A passion for data solutions and a willingness to pick up new programming languages, technologies, and frameworks

These are the characteristics that we strive for in our own work. We would love to hear from candidates who embody the same:

  • Desire to work collaboratively with your teammates to come up with the best solution to a problem
  • Demonstrated experience and ability to deliver results on multiple projects in a fast-paced, agile environment
  • Excellent problem-solving and interpersonal communication skills
  • Strong desire to learn and share knowledge with others

Date Posted

02/04/2023

