Role:
We are looking for a visionary and technical Lead Data Engineer to guide and scale our data engineering team within our regional data chapter. In this role, you will own the architecture, evolution, and reliability of our next-generation regional data platform. You will lead a talented team of engineers to optimize our Databricks-driven Data Lakehouse architecture, drive core DataOps practices, implement robust data governance, and collaborate across functional squads to empower advanced analytics, business intelligence, and AI initiatives.
The ideal candidate is an expert data architect and a proven technical leader who thrives on transforming messy, disconnected datasets into a unified, low-latency, and highly secure data ecosystem.
Responsibilities:
- Architectural Leadership: Design, build, and continuously optimize our scalable Data Lakehouse platform leveraging Databricks and AWS infrastructure to support global business expansion.
- Pipeline & Infrastructure Ownership: Lead the design and implementation of highly automated, performance-optimized real-time and batch extraction, transformation, and loading (ETL/ELT) frameworks. Oversee complex integrations with internal microservices, external insurance partners, and third-party APIs.
- DataOps & Automation: Champion engineering best practices by building framework controls, schema registries, automated testing, and CI/CD pipelines for data assets (utilizing tools like dbt and Airflow). Drive initiatives like Databricks serverless migrations and automated performance monitoring.
- Data Quality & Governance: Own the end-to-end framework for regional data quality, data observability (e.g., Elementary), data freshness, and data catalogs. Ensure robust data security, compliance (PDPA), and sensitivity tagging across multi-region boundaries.
- Cross-functional Collaboration: Partner with Executives, Product Owners, Software Developers, Data Analysts, and MLOps/Data Science squads to unblock complex technical dependencies, align infrastructure capabilities, and deliver actionable data products.
- Innovation & Emerging Tech: Actively research and spearhead proof-of-concepts incorporating advanced technologies like Generative AI/Agentic AI data pipelines (e.g., automated knowledge bases, smart web scraping solutions) into the data ecosystem.
- Mentorship & Chapter Management: Manage, mentor, and elevate the technical capabilities of junior and senior data engineers within regional squads, ensuring standardized practices and strong technical ownership.
Requirements:
- Experience: 5+ years of experience in Data Engineering, Data Architecture, or a related technical role, including at least 2 years leading engineering teams or core technical projects.
- Databricks Expertise: Deep hands-on experience designing and managing production workloads in Databricks (Delta Lake, Unity Catalog, and serverless compute paradigms).
- Advanced Tech Stack Skills:
  - Master-level proficiency in SQL (complex query authoring, optimization, and macro writing) and programmatic data engineering in Python or Scala.
  - Extensive experience with open-source Big Data frameworks, primarily Apache Spark.
  - Expertise with modern cloud data pipeline orchestration tools (e.g., Airflow, Dagster) and transformation tools like dbt.
  - Solid mastery of AWS cloud services (S3, EC2, RDS, VPC configurations, network connectivity) as integrated within data ecosystems.
- Data Modeling & Architecture: Expert knowledge of transactional databases, distributed storage, message queuing/streaming (e.g., Kafka), and structural patterns for Lakehouse data modeling (Medallion architecture: Bronze, Silver, Gold layers).
- Problem Solving & Systems Thinking: Proven experience performing root cause analysis on production infrastructure failures, handling complex code/infrastructure migrations, and managing data pipeline technical debt (e.g., resolving small-file storage problems).
- Education: Bachelor’s or Master’s degree in Computer Science, Computer Engineering, Information Technology, or a relevant, highly quantitative field.