Lead Data Integration Engineer

🇺🇸 United States | 🇨🇴 Colombia | 🇰🇿 Kazakhstan | 🇰🇬 Kyrgyzstan | 🇺🇿 Uzbekistan

Scala | Management | Redshift | Python | AWS | GCP | Azure | Git | Snowflake | Machine Learning | Design | Project Management | DevOps | SQL | Testing

We are seeking an experienced and visionary Lead Data Integration Engineer to join our growing data solutions team. In this pivotal technical leadership role, you will act as a Tech Lead / Team Lead responsible for architecting, designing, and maintaining complex, enterprise-grade data pipelines.

You will lead a high-performing engineering team (5+ engineers), shaping cloud-native and hybrid data solution architectures while establishing best practices. Leveraging your extensive expertise in cloud infrastructure, database modeling, orchestration, and data security, you will transform raw data environments into robust, high-performance, and scalable analytical assets.

Responsibilities

  • Mentor & Lead: Guide a team of data integration and ETL engineers, fostering a collaborative, innovation-driven culture. Manage sprint planning, resource allocation, and risk management
  • Code Review & Quality Assurance: Conduct regular code reviews and set up testing frameworks to ensure compliant, highly performant, and secure database code (SQL, Python, Scala)
  • Culture of Excellence: Act as a role model, championing modern software development practices (CI/CD, version control via Git, modular code design, automated alerting, and monitoring)
  • Modern ETL/ELT Design: Architect end-to-end data pipeline solutions using serverless frameworks or hybrid setups
  • Orchestration & Automation: Establish robust pipeline scheduling and workflow orchestration using tools like Apache Airflow (Astronomer) or cloud-native schedulers to achieve high pipeline reliability (>99.5%) and minimize troubleshooting times
  • Data Modeling & Warehousing: Design, model, and maintain scalable corporate data systems (OLAP, OLTP, Star/Snowflake schemas, Data Lakes, Lakehouses, and Data Meshes)
  • Change Data Capture (CDC): Implement modern ingestion patterns, CDC methods, micro-batching, delta extracts, and routine partition maintenance
  • Bridge Business & Tech: Collaborate with product owners, enterprise architects, and business stakeholders to translate complex requirements into detailed technical specifications and data-flow diagrams
  • Present Solutions: Confidently present advanced technical architectures to C-level client executives, showcasing ROI, architectural benefits, and performance gains

Requirements

  • Experience: 5+ years of proven experience in data management, database design, migration, storage, and advanced data modeling
  • Leadership: Demonstrated experience leading, mentoring, and developing teams of data engineers (experience managing teams of 4–7 engineers is highly preferred)
  • Data Integration & ETL Tools: Deep hands-on experience with major ETL tools such as Talend, AWS Glue, Azure Data Factory, GCP Dataflow, or Apache NiFi
  • Orchestration Mastery: Strong working experience with orchestration and scheduling engines, particularly Apache Airflow (Astronomer)
  • Programming & Scripting: Strong production coding skills in SQL, Python (PySpark, Pandas), Scala, SparkSQL, or Bash
  • Cloud Platforms: Extensive knowledge of at least one major cloud platform (AWS, Azure, GCP) with emphasis on big data warehousing engines (such as Redshift, Snowflake, Google BigQuery, or Azure Synapse)
  • Infrastructure & Security: Proven understanding of cloud security structures (identity and access management, data masking, and RLS) and compliance standards (GDPR, HIPAA, PI)
  • Documentation: Ability to write high-quality technical design specifications, mapping documentation, and data lineage reports
  • Communication & English: Excellent verbal and written English communication (B2+/C1 level), capable of directly managing international clients and conducting presentations

Nice to Have

  • Serverless Engineering: Experience architecting fully serverless cloud data pipelines (e.g., combining AWS S3, Athena, and AWS Lambda)
  • AI/ML Integration: Exposure to integrating machine learning models, computer vision, or generative AI workflows into data pipeline ingestion processes
  • Advanced Certifications: Professional certifications in Cloud Architecture (AWS, Azure, GCP), Talend, or Airflow

by @maxrusakovic