Scalable Data Quality Index (DQI) for a Leading Travel Company

About Client:

A global travel company that operates a major international airline alongside a suite of online platforms for booking flights, hotels, vacation packages, and ground transportation. With millions of customers interacting daily across apps, websites, call centers, and partner channels, delivering seamless, personalized experiences is at the heart of their business.

Background:

The client’s data landscape spans operational airline systems, online travel agency (OTA) platforms, CRM tools, and partner APIs. To centralize insights, they had built a cloud-based data lake. However, the rapid growth of data sources, combined with legacy systems and inconsistent partner feeds, led to major data quality issues that eroded trust and slowed innovation.

Challenge:

  • Inconsistent Formats in Data Lake: Different systems and partners used conflicting data structures and naming conventions.
  • Missing & Incomplete Data: Critical booking and contact details were often absent or partially captured.
  • Duplicate Records: Customer and booking duplication skewed reports and led to fragmented profiles.
  • Delayed Issue Detection: Data quality problems emerged late in the pipeline, affecting real-time operations and personalization.
  • Manual Data Fixes: Data teams spent excessive time manually cleaning and patching issues.
  • Lack of Visibility: There was no centralized dashboard or framework to monitor data quality trends and health, or to identify the root cause of issues across the pipeline.
  • Rigid Rule Management: Hard-coded validations across pipelines made it hard to adapt to new sources and rules.

Solution:

We implemented a scalable, low-code Data Quality Index framework across the client’s hybrid data architecture, integrating with their existing tools and platforms:

  • Automated Data Discovery: Incoming data from airline systems, hotel APIs, and booking platforms was profiled for anomalies and inconsistencies.
  • Rule-Based Validation:
    • Completeness: dbt flags missing mandatory fields like passenger ID or flight number.
    • Validity: dbt validates formats and ranges (e.g., dates, airport codes).
    • Consistency: dbt standardizes values across sources (e.g., country codes).
    • Uniqueness: dbt and Monte Carlo detect duplicate records.
    • Accuracy: Monte Carlo validates data against trusted sources and monitors freshness.

Rules were applied at various stages of the data pipeline: at source ingestion, during transformation, and before loading into the data lake.
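
The rule dimensions above can be illustrated with a minimal sketch in plain Python. This is not the client's dbt or Monte Carlo configuration, which the case study does not show. The field names (`booking_id`, `origin`, `departure_date`), the choice of three dimensions, and the weighted-average roll-up into a single index are all assumptions made for illustration.

```python
import re
from datetime import date

# Hypothetical mandatory fields; the case study names passenger ID and flight number.
MANDATORY_FIELDS = ("passenger_id", "flight_number")
AIRPORT_CODE = re.compile(r"[A-Z]{3}")  # three uppercase letters (format check only)


def is_complete(record):
    """Completeness: every mandatory field is present and non-empty."""
    return all(record.get(f) not in (None, "") for f in MANDATORY_FIELDS)


def is_valid(record):
    """Validity: airport code has the expected format and the date parses as ISO."""
    code = record.get("origin")
    if not isinstance(code, str) or not AIRPORT_CODE.fullmatch(code):
        return False
    try:
        date.fromisoformat(record.get("departure_date") or "")
    except (TypeError, ValueError):
        return False
    return True


def unique_flags(records, key="booking_id"):
    """Uniqueness: the first record per key passes, later repeats fail."""
    seen = set()
    flags = []
    for record in records:
        value = record.get(key)
        flags.append(value not in seen)
        seen.add(value)
    return flags


def dimension_scores(records):
    """Share of records passing each dimension, from 0.0 to 1.0."""
    total = len(records)
    if total == 0:
        return {"completeness": 1.0, "validity": 1.0, "uniqueness": 1.0}
    return {
        "completeness": sum(is_complete(r) for r in records) / total,
        "validity": sum(is_valid(r) for r in records) / total,
        "uniqueness": sum(unique_flags(records)) / total,
    }


def quality_index(scores, weights):
    """Assumed roll-up: weighted average of the per-dimension scores."""
    total_weight = sum(weights.values())
    return sum(scores[dim] * w for dim, w in weights.items()) / total_weight
```

Calling `dimension_scores(batch)` on a list of record dictionaries and passing the result to `quality_index` with weights such as `{"completeness": 2, "validity": 2, "uniqueness": 1}` yields one number per batch that could be tracked over time. The weights here are invented; in practice they would reflect which dimensions matter most to the business.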

  • Real-Time Monitoring & Alerts: Integrated Monte Carlo as the primary data observability platform, with real-time dashboards and automated alerts for data quality issues. Connected seamlessly with existing monitoring tools to notify relevant teams of critical violations.
  • Automated & Semi-Automated Remediation: For common and well-defined issues, automated workflows triggered by Monte Carlo or dbt failures handled formatting, missing values, and quarantining. For complex issues, records were flagged and routed to data stewards with context from Monte Carlo’s lineage and incidents.
  • Governance-Ready Architecture: Integrated metadata, rule documentation (via dbt), and lineage to ensure compliance with GDPR and internal data policies.
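
The split between automated and steward-led remediation can be pictured as a simple routing table. The sketch below is a hypothetical illustration in plain Python, not the workflows actually built for the client. The issue kinds and queue names are invented, and anything not explicitly listed falls through to steward review.

```python
# Hypothetical mapping from issue kind to remediation path.
# Only well-defined issue kinds are listed; everything else goes to a steward.
ACTIONS = {
    "bad_format": "auto_fix",
    "missing_value": "auto_fix",
    "failed_mandatory_check": "quarantine",
}


def route(issue_kind):
    """Return the remediation path for an issue kind, defaulting to steward review."""
    return ACTIONS.get(issue_kind, "steward_review")


def triage(issues):
    """Group record IDs by remediation path.

    Each issue is a dict with hypothetical keys "record_id" and "kind".
    """
    queues = {"auto_fix": [], "quarantine": [], "steward_review": []}
    for issue in issues:
        queues[route(issue["kind"])].append(issue["record_id"])
    return queues
```

Defaulting unknown issue kinds to steward review keeps the automated paths conservative: a new kind of failure is seen by a person before any rule is written to handle it automatically.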

Outcome:

  • Business users gained trust in data, improving reporting, analytics, and decision-making through continuous validation via Monte Carlo.
  • Automation cut manual effort, reducing costs and freeing up data engineering capacity with low-code workflows and dbt-driven checks.
  • Early issue detection reduced latency, accelerating access to clean, reliable data.
  • Cleaner data enabled better personalization, loyalty management, and customer service.
  • An audit trail supported regulatory compliance and reporting transparency.
  • Low-code setup allowed quick updates and easy scaling to new data sources.
  • Real-time alerts made data quality proactive, preventing issues before impact.

 
