Data Engineer AWS
Job Description:
This role is responsible for gathering and analyzing data from several internal and external sources, designing a cloud-focused data platform for analytics and business intelligence, and reliably providing data to our analysts.
This role requires significant understanding of data mining and analytical techniques. An ideal candidate will have strong technical capabilities, business acumen, and the ability to work effectively with cross-functional teams.
Responsibilities include but are not limited to:
Work with data architects to understand current data models and build pipelines for data ingestion and transformation.
Design, build, and maintain a framework for pipeline observation and monitoring, focusing on the reliability and performance of jobs.
Surface data integration errors to the proper teams, focusing on:
o ensuring timely processing of new data
o performance of data pipelines
o integrity and quality of source data
Hands-on experience building data-lake-style infrastructure using streaming data technologies (particularly Apache Kafka)
Qualifications:
Bachelor's degree in computer science, data science, or a related technical field, or equivalent practical experience
Experience building and maintaining AWS-based data pipelines; we currently use AWS Lambda, Docker / ECS, MSK, Airflow, Databricks, and Unity Catalog
Development experience utilizing two or more of the following:
o Python (Pandas/NumPy, Boto3, SimpleSalesforce)
o Databricks (PySpark, pySQL, DLT)
o Apache Spark
o Kafka and the Kafka Connect ecosystem (Schema Registry and Avro)
o Terraform (or another infrastructure-as-code platform)
Enthusiasm for working directly with customer teams (business units and internal IT)
Databricks certification
Preferred Qualifications:
Proven experience with relational and NoSQL databases (e.g. Postgres, Redshift, MongoDB)
Experience with version control (git) and peer code reviews
Familiarity with data visualization techniques using tools such as Grafana, Power BI, Amazon QuickSight, and Excel