# Senior Data Quality Engineer (4-Month Contract) Onsite in UAE - Octopus by RTG

> robusta · Abu Dhabi, United Arab Emirates · Posted 2026-06-14

**Workplace:** On-site

## Description

### About the Role

We are seeking an experienced Senior Databricks Data Quality Engineer to lead the design, implementation, and automation of enterprise-scale data quality frameworks within a Databricks environment. The successful candidate will play a key role in establishing data quality controls, profiling frameworks, remediation processes, and AI-assisted quality monitoring across a large-scale data platform consisting of 170+ datasets and over 1,300 Critical Data Elements (CDEs).

This role requires strong expertise in Databricks, PySpark, Delta Lake, MLflow, and modern data quality management practices.

### Key Responsibilities

### Data Platform & Databricks Configuration

-   Configure and manage Databricks workspaces, compute clusters, PySpark notebooks, Delta Lake architecture, and Unity Catalog integrations.
-   Design scalable data quality processing frameworks across 170+ datasets and 1,346 prioritized Critical Data Elements (CDEs).

### Data Profiling & Quality Assessment

-   Develop AI-assisted profiling notebooks using PySpark to establish baseline data quality scores.
-   Assess data quality across six key dimensions:
    -   Completeness
    -   Uniqueness
    -   Validity
    -   Consistency
    -   Accuracy
    -   Timeliness

-   Analyze null rates, duplicate records, invalid values, format violations, outliers, and schema drift.

### Data Quality Rule Framework

-   Design and build a scalable Data Quality Rule Factory using parameterized PySpark functions.
-   Enable automated deployment of over 6,700 data quality rules without manual rule-by-rule development.
-   Create reusable rule templates across datasets and data quality dimensions.
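
As an illustration of the rule-factory idea described above, here is a minimal sketch in plain Python rather than PySpark, so it runs without a cluster. The template names (`not_null`, `in_range`), the config shape, and the sample columns are all assumptions made for the example, not part of the actual platform.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


@dataclass(frozen=True)
class Rule:
    rule_id: str
    column: str
    dimension: str
    check: Callable[[Any], bool]


# Hypothetical rule templates: each one is a parameterized function
# that returns a concrete Rule for a given column.
def not_null(column: str) -> Rule:
    return Rule(f"not_null__{column}", column, "completeness",
                lambda v: v is not None)


def in_range(column: str, low: float, high: float) -> Rule:
    return Rule(f"in_range__{column}", column, "validity",
                lambda v: v is not None and low <= v <= high)


TEMPLATES = {"not_null": not_null, "in_range": in_range}


def build_rules(config: List[Dict[str, Any]]) -> List[Rule]:
    """Turn declarative config entries into Rule objects."""
    rules = []
    for entry in config:
        params = {k: v for k, v in entry.items() if k != "template"}
        rules.append(TEMPLATES[entry["template"]](**params))
    return rules


def pass_rate(rule: Rule, rows: List[Row]) -> float:
    """Fraction of rows that satisfy the rule (1.0 for an empty batch)."""
    if not rows:
        return 1.0
    passed = sum(1 for row in rows if rule.check(row.get(rule.column)))
    return passed / len(rows)


# Assumed sample config and data, for illustration only.
config = [
    {"template": "not_null", "column": "customer_id"},
    {"template": "in_range", "column": "age", "low": 0, "high": 120},
]
rows = [
    {"customer_id": 1, "age": 34},
    {"customer_id": None, "age": 150},
    {"customer_id": 3, "age": 58},
    {"customer_id": 4, "age": None},
]

rules = build_rules(config)
scores = {rule.rule_id: pass_rate(rule, rows) for rule in rules}
```

The point of the design is that new rules come from adding config entries, not from writing new functions, which is what would make thousands of rules manageable.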

### Pipeline Quality Enforcement

-   Integrate data quality controls within Bronze, Silver, and Gold Delta Lake layers.
-   Implement quality gates that prevent data progression unless predefined thresholds are met.
-   Develop reusable Databricks Jobs for automated validation and monitoring.
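
A quality gate of the kind described above could look roughly like the following sketch. It is plain Python, not a Databricks Job, and the threshold values, dimension names, and the `QualityGateError` class are hypothetical choices for the example.

```python
from typing import Dict, List


class QualityGateError(Exception):
    """Raised when a batch must not progress to the next layer."""


def failing_dimensions(scores: Dict[str, float],
                       thresholds: Dict[str, float]) -> List[str]:
    """Return the dimensions whose score is below the required minimum.

    A dimension with no measured score is treated as failing.
    """
    return sorted(
        dim for dim, minimum in thresholds.items()
        if scores.get(dim, float("-inf")) < minimum
    )


def enforce_gate(layer: str,
                 scores: Dict[str, float],
                 thresholds: Dict[str, float]) -> None:
    """Raise if any threshold is missed; return None if the batch may pass."""
    failures = failing_dimensions(scores, thresholds)
    if failures:
        raise QualityGateError(
            f"Promotion to {layer} blocked; below threshold: "
            + ", ".join(failures)
        )


# Assumed thresholds and scores, for illustration only.
silver_thresholds = {"completeness": 0.98, "uniqueness": 0.99}
batch_scores = {"completeness": 0.995, "uniqueness": 0.97}

try:
    enforce_gate("silver", batch_scores, silver_thresholds)
    promoted = True
    reason = ""
except QualityGateError as err:
    promoted = False
    reason = str(err)
```

Raising an exception rather than returning a flag means a scheduled job fails visibly when a threshold is missed, so downstream steps do not run on unvalidated data.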

### Data Cleansing & AI-Driven Remediation

-   Build automated data cleansing pipelines for:
    -   Standardization
    -   Deduplication
    -   Schema harmonization

-   Deploy MLflow-managed machine learning models for:
    -   Anomaly detection
    -   Fuzzy duplicate detection
    -   Exact duplicate identification

-   Ensure explainability of model outputs and support human-in-the-loop validation processes.

### Exception Management

-   Design failed-record handling frameworks and quarantine Delta tables.
-   Capture failure reasons, affected CDEs, rule references, and timestamps.
-   Develop automated reprocessing mechanisms for corrected records.
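
To make the failed-record handling above more concrete, the sketch below splits a batch into clean rows and quarantine records that carry the failure reason, affected columns, rule references, and a timestamp. It uses plain Python lists in place of Delta tables, and the rule IDs, column names, and record layout are assumptions for the example.

```python
from datetime import datetime, timezone
from typing import Any, Callable, Dict, List, Tuple

Row = Dict[str, Any]
# Hypothetical check shape: (rule_id, column, predicate on the column value).
Check = Tuple[str, str, Callable[[Any], bool]]


def split_batch(rows: List[Row],
                checks: List[Check]) -> Tuple[List[Row], List[Dict[str, Any]]]:
    """Separate rows that pass every check from rows to be quarantined."""
    clean: List[Row] = []
    quarantined: List[Dict[str, Any]] = []
    for row in rows:
        failures = [(rule_id, column)
                    for rule_id, column, ok in checks
                    if not ok(row.get(column))]
        if not failures:
            clean.append(row)
            continue
        quarantined.append({
            "record": row,
            "failed_rules": [rule_id for rule_id, _ in failures],
            "affected_columns": sorted({column for _, column in failures}),
            "failure_reason": "; ".join(
                f"{rule_id} failed on {column}" for rule_id, column in failures
            ),
            "quarantined_at": datetime.now(timezone.utc).isoformat(),
        })
    return clean, quarantined


# Assumed checks and data, for illustration only.
checks: List[Check] = [
    ("DQ-001", "email", lambda v: isinstance(v, str) and "@" in v),
    ("DQ-002", "country", lambda v: v in {"AE", "SA", "EG"}),
]
rows = [
    {"id": 1, "email": "a@example.com", "country": "AE"},
    {"id": 2, "email": "not-an-email", "country": "XX"},
]

clean, quarantined = split_batch(rows, checks)
```

Keeping the original record alongside its failure metadata is what allows a corrected record to be re-run through the same checks later.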

### Data Quality Monitoring & Reporting

-   Build Delta Lake aggregation tables for data quality metrics.
-   Deliver data quality KPIs to Power BI dashboards, including:
    -   Dimension-level scores
    -   Rule pass/fail rates
    -   SLA adherence metrics

-   Configure automated alerting using Databricks SQL Alerts and Azure Monitor.

### Predictive Data Quality Analytics

-   Develop predictive models to identify datasets at risk of quality degradation.
-   Support AI-assisted Root Cause Analysis (RCA) using profiling outputs and machine learning techniques.
-   Export and prepare remediation datasets for prioritization and governance reporting.

## Requirements

-   Bachelor's degree in Computer Science, Data Engineering, Information Systems, or a related field.
-   5+ years of experience in Data Engineering or Data Quality Engineering.
-   3+ years of hands-on experience with Databricks and PySpark.
-   Strong expertise in Delta Lake architecture and data pipeline development.
-   Experience with Unity Catalog implementation and governance.
-   Hands-on experience with MLflow and machine learning deployment.
-   Strong SQL skills and data modeling expertise.
-   Experience building enterprise-scale data quality frameworks.
-   Experience integrating Databricks with Power BI and Azure services.
-   Strong understanding of data governance, metadata management, and data quality dimensions.

### Preferred Qualifications

-   Microsoft Azure certifications.
-   Databricks Certified Data Engineer Associate or Professional.
-   Experience with enterprise data governance programs.
-   Experience implementing AI-assisted data quality and remediation solutions.
-   Knowledge of Master Data Management (MDM) principles.

## Apply

[Apply at robusta](https://apply.workable.com/robusta/j/116E0570DD/apply)

