Job Description
- Communicate the data needs of data scientists and other teams, and design efficient, GDPR-compliant ETL processes.
- Support other teams by providing guidance on data usage and processing, and on how they can best leverage the platform.
- Build scalable data pipelines to ingest data from a variety of data sources (Relational DBs and Data Lakes), identify critical data elements and define data quality rules.
- Leverage Spark and Kafka knowledge to design and develop features that improve change data capture and the real-time capabilities of the pipeline.
- Provide insights on areas of improvement, including data governance, best practices, and large-scale processing.
- Support bug fixing, performance analysis, and data validation and quality assurance along the data pipeline.
Job Requirements
- 3+ years of experience as a software engineer, with strong skills in at least one programming language (preferably Scala, Java, or Python).
- 1+ years of experience with Spark on Hadoop, EMR, etc.
- Experience with real-time data processing using Kafka, Spark Streaming, or similar technologies.
- Experience with distributed systems and design/implementation for reliability, availability, scalability and performance
- Proven experience with AWS technologies such as S3, EMR, and CloudFormation.
- Creative and innovative approach to problem-solving
- Experience with Kubernetes is a big plus