Sr. Data Engineer

  • Buenos Aires, Ciudad Autónoma de Buenos Aires, Argentina
  • Full-Time
  • Remote

Job Description:

Only candidates from Argentina, Paraguay, Bolivia, or Colombia will be considered.

We are seeking a Senior Data Engineer or Data Architect to lead the design and implementation of a modern AWS-based Lakehouse for a data-rich enterprise.
This role combines deep technical ownership with hands-on delivery and requires a strong ability to collaborate with business stakeholders to catalog data, define an end-state medallion architecture, and build production-grade ingestion and transformation pipelines.

The successful candidate will help the organization transition toward a governed, high-quality, and analyst-friendly platform that supports downstream analytics and machine learning initiatives.

What You'll Bring (Required Skills & Experience):

  • 7+ years of experience building and operating large-scale data platforms and ETL pipelines in production environments.
  • Strong hands-on expertise with AWS data services, including S3, Glue, EMR, DMS, MWAA, Athena, and Redshift.
  • Deep working knowledge of Spark (PySpark or Scala) and migration of Hive HQL workloads to Spark.
  • Practical experience with open table formats (Apache Iceberg or Delta Lake) and Glue Data Catalog registration.
  • Proven ability to design and manage Airflow DAGs and integrate Airflow with AWS services.
  • Strong SQL skills and experience with dbt or equivalent SQL modeling frameworks.
  • Hands-on implementation of data quality frameworks (Deequ, Great Expectations, or Glue Data Quality).
  • Familiarity with data lineage and observability practices (e.g., OpenLineage).
  • Solid understanding of security and compliance for consumer data, including IAM, KMS, and Lake Formation.
  • Excellent communication skills, with experience in stakeholder interviews and executive-level technical presentations.

Preferred Qualifications

  • Prior experience migrating from on-prem Hadoop/Hive/HDFS to S3-based lakehouses.
  • Familiarity with enterprise data catalogs such as AWS DataZone or Collibra.
  • Hands-on exposure to SageMaker MLOps, feature stores, and model monitoring.
  • Experience using AppFlow, Transfer Family, or other managed connectors for SaaS and SFTP ingestion.
  • Background in Infrastructure as Code (IaC) using Terraform, CloudFormation, or AWS CDK, with CI/CD automation for data workloads.
  • Leadership experience mentoring engineers and contributing to team structure and hiring decisions.

Education & Experience

  • Bachelor's degree in Computer Science, Engineering, or Data Science (Master's degree preferred but not required).
  • Typically, 7–15 years of relevant professional experience.
  • Exceptional candidates with fewer years but demonstrable impact and technical depth will be considered.

What You'll Do (Responsibilities):

  • Inventory and catalog all data sources across cloud and on-prem systems, capturing metadata and ownership, and maintaining a canonical data catalog.
  • Map and reverse-engineer existing data pipelines and logic, including Airflow DAGs, Hive HQL, Scala jobs, notebooks, and stored procedures.
  • Design and document a target medallion architecture (Bronze/Silver/Gold) on AWS using open table formats and governed metadata.
  • Build and operate ingestion pipelines leveraging AWS services such as Glue, DMS, AppFlow, Transfer Family, and structured S3 landing zones.
  • Implement transformation pipelines using Spark (Glue or EMR), converting data to Iceberg or Delta Lake and registering assets under Lake Formation governance.
  • Develop and maintain workflow orchestration using Apache Airflow (MWAA), including retries, alerts, and SLA monitoring.
  • Define and enforce data quality and profiling using Deequ, Great Expectations, or Glue Data Quality, publishing metrics to the catalog.
  • Consolidate business logic into modular, testable pipelines and SQL models using dbt for analyst-friendly development.
  • Design serving and consumption layers using Athena, Redshift Serverless, and controlled APIs or search exports; plan for ML readiness with SageMaker and feature stores.
  • Establish CI/CD pipelines for data infrastructure and workloads, implement automated testing and deployment gates, and maintain operational documentation.
  • Mentor engineers and analysts, providing technical guidance and presenting architectural recommendations to leadership.

Why Join the Client?

Be part of a company at the forefront of technological change. We foster a collaborative, purpose-driven environment where teamwork thrives. We're invested in your long-term growth and provide opportunities to develop your skills and career.

Perks & Benefits

At the client, we believe in the personal and professional growth of our team. That's why we offer a variety of benefits designed to support your development, well-being, and quality of life:

  • Certifications and training to boost your skills.
  • English classes, tailored to your level.
  • Free gym access, so you can take care of your physical and mental health.
  • Gifts on special occasions, because we love celebrating with you.
  • Wellness and professional growth programs focused on your balance and progress.
  • An inclusive and collaborative environment, where diversity and great ideas are truly valued.