Overview
Job Purpose

We're seeking a talented Senior Data Engineer to join our Enterprise Architecture team in a cross-cutting role that will help define and implement our next-generation data platform. In this pivotal position, you'll lead the design and implementation of scalable, self-service data pipelines with a strong emphasis on data quality and governance. This is an opportunity to shape our data engineering practice from the ground up, working directly with key stakeholders to build mission-critical ML and AI data workflows.

About Our Technology Stack

You'll be working with a modern, on-premises data stack that includes:
- Apache Airflow for workflow orchestration (self-hosted on Kubernetes)
- dbt for data transformation and testing
- Apache Flink for stream processing and real-time data workflows
- Kubernetes for containerized deployment and scaling
- Git-based version control and CI/CD for data pipelines
- Oracle Exadata for data warehousing
- Kafka for messaging and event streaming
We emphasize building systems that are maintainable, scalable, and focused on enabling self-service data access while maintaining high standards for data quality and governance.

Responsibilities
- Design, build, and maintain our on-premises data orchestration platform using Apache Airflow, dbt, and Apache Flink
- Create self-service capabilities that empower teams across the organization to build and deploy data pipelines without extensive engineering support
- Implement robust data quality testing frameworks that ensure data integrity throughout the data lifecycle
- Establish data engineering best practices, including version control, CI/CD for data pipelines, and automated testing
- Collaborate with ML/AI teams to build scalable feature engineering pipelines that support both batch and real-time data processing
- Develop reusable patterns for common data integration scenarios that can be leveraged across the organization
- Work closely with infrastructure teams to optimize our Kubernetes-based data platform for performance and reliability
- Mentor junior engineers and advocate for engineering excellence in data practices
Knowledge and Experience
- 5+ years of professional experience in data engineering, with at least 2 years working on enterprise-scale data platforms
- Deep expertise with Apache Airflow, including DAG design, performance optimization, and operational management
- Strong understanding of dbt for data transformation, including experience with testing frameworks and deployment strategies
- Experience with stream processing frameworks like Apache Flink or similar technologies
- Proficiency with SQL and Python for data transformation and pipeline development
- Familiarity with Kubernetes for containerized application deployment
- Experience implementing data quality frameworks and automated testing for data pipelines
- Knowledge of Git-based workflows and CI/CD pipelines for data applications
- Ability to work cross-functionally with data scientists, ML engineers, and business stakeholders
Preferred Knowledge and Experience
- Experience with self-hosted data orchestration platforms (rather than managed services)
- Background in implementing data contracts or schema governance
- Knowledge of ML/AI data pipeline requirements and feature engineering
- Experience with real-time data processing and streaming architectures
- Familiarity with data modeling and warehouse design principles
- Prior experience in a technical leadership role
#LI-HR1 #LI-ONSITE