Data pipeline automation, orchestration and observability for global electrical engineering company

For one business area of our Fortune 500 customer, a global electrical engineering company, we delivered a data pipeline automation, orchestration, and observability project that efficiently loads data into the successive layers of the Snowflake data warehouse.

Technology stack

Microsoft Azure
Azure DevOps
Azure Key Vault
Azure Data Factory
Snowflake

Customer’s challenges

The project addressed several key challenges:

Slow integration of new data sources

difficulty in quickly integrating new data sources, leading to delays in making data available for analysis and decision-making.

Fragmented data from multiple SAP systems

lack of unified visibility across operations due to challenges in integrating data from multiple SAP systems, resulting in inefficiencies and limited insights.

Manual data processing

heavy reliance on manual intervention during data processing, which reduces efficiency, increases the risk of errors, and consumes valuable time and resources.

Limited auditability and transparency of data pipelines

insufficient tracking and monitoring of data pipeline execution success, performance metrics, and statistics, causing reliability concerns for business stakeholders and hindering trust in data processes.

Solution

To achieve the business objectives, our team implemented several integrated mechanisms that together made up the solution:

Azure Data Factory (ADF) Native Framework

a scalable, generic data integration solution implemented with the ADF service, enabling rapid, metadata-driven definition of new data pipelines without duplicating the code of existing pipelines.
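
To illustrate the idea, here is a minimal sketch of metadata-driven ingestion; all table, system, and field names are hypothetical, not the customer's actual schema. In ADF, the same pattern is typically realized with a Lookup activity that reads the metadata and a ForEach activity that fans out over it.

```python
# Hedged sketch: metadata-driven ingestion. All identifiers are hypothetical.
from dataclasses import dataclass
from typing import Optional

@dataclass
class SourceDefinition:
    source_system: str            # e.g. an SAP system identifier
    source_object: str            # source table or extractor name
    target_table: str             # landing table in Snowflake
    load_type: str                # "FULL" or "INCREMENTAL"
    watermark_column: Optional[str] = None  # used only for incremental loads

# Adding a new data source is a metadata change, not a new pipeline.
SOURCES = [
    SourceDefinition("SAP_EU", "MARA", "RAW.SAP_EU_MARA", "INCREMENTAL", "AEDAT"),
    SourceDefinition("SAP_US", "MARA", "RAW.SAP_US_MARA", "INCREMENTAL", "AEDAT"),
]

def run_ingestion(sources: list[SourceDefinition]) -> None:
    # One generic copy routine serves every source, so onboarding a new
    # source never requires duplicating pipeline code.
    for src in sources:
        print(f"Copying {src.source_system}.{src.source_object} "
              f"-> {src.target_table} ({src.load_type})")

if __name__ == "__main__":
    run_ingestion(SOURCES)
```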

Copy Data

the ADF Native Framework uses a Copy Data mechanism implemented in Snowflake (a metadata table plus a set of stored procedures) that enables performance- and cost-optimized data transfer between data warehouse layers while maintaining the data schema contract and supporting incremental loads, change tracking, soft deletions, and data versioning.
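
A minimal sketch of the kind of statement such a copy procedure might execute, assuming hypothetical STG/CORE tables with a CUSTOMER_ID key and a precomputed ROW_HASH; the real mechanism is metadata-driven and lives in Snowflake stored procedures. One MERGE covers inserts, updates, and soft deletions in a single pass:

```python
# Hedged sketch: incremental load with soft deletes via a Snowflake MERGE.
# Table, column, and hash names are illustrative; data versioning and the
# schema-contract checks mentioned above are omitted for brevity.
import snowflake.connector  # pip install snowflake-connector-python

MERGE_SQL = """
MERGE INTO CORE.CUSTOMER tgt
USING STG.CUSTOMER src
    ON tgt.CUSTOMER_ID = src.CUSTOMER_ID
WHEN MATCHED AND src.IS_DELETED THEN UPDATE SET
    IS_DELETED = TRUE,
    UPDATED_AT = CURRENT_TIMESTAMP()
WHEN MATCHED AND tgt.ROW_HASH <> src.ROW_HASH THEN UPDATE SET
    ROW_HASH   = src.ROW_HASH,
    IS_DELETED = FALSE,
    UPDATED_AT = CURRENT_TIMESTAMP()
WHEN NOT MATCHED THEN INSERT
    (CUSTOMER_ID, ROW_HASH, IS_DELETED, UPDATED_AT)
VALUES
    (src.CUSTOMER_ID, src.ROW_HASH, FALSE, CURRENT_TIMESTAMP())
"""

def copy_increment(conn) -> int:
    """Apply one staged increment to the core layer; returns rows affected."""
    cur = conn.cursor()
    try:
        cur.execute(MERGE_SQL)
        return cur.rowcount
    finally:
        cur.close()
```

Because only changed rows are touched, this same pattern underpins the move from full to incremental loads described under Benefits.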

Incremental SAP consolidation

a framework implemented in Snowflake for consolidating data from selected SAP sources (covering three main regions), offering incremental data loading, generation of a consolidated data model, and data versioning.
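
A sketch of how the regional deltas could be stitched together, assuming hypothetical region names, table names, and a LOAD_TS watermark column; in production this logic would sit in a Snowflake stored procedure with bound parameters rather than string interpolation:

```python
# Hedged sketch: incremental consolidation of SAP data from three regions.
# All identifiers are illustrative, not the customer's actual schema.
REGIONAL_TABLES = {
    "EMEA": "RAW.SAP_EMEA_MARA",
    "AMER": "RAW.SAP_AMER_MARA",
    "APAC": "RAW.SAP_APAC_MARA",
}

def build_consolidation_sql(watermark: str) -> str:
    """Union each region's delta since the last run into one consolidated model."""
    deltas = [
        f"SELECT '{region}' AS REGION, t.* FROM {table} t "
        f"WHERE t.LOAD_TS > '{watermark}'"
        for region, table in REGIONAL_TABLES.items()
    ]
    return ("INSERT INTO CONS.MATERIAL_CONSOLIDATED\n"
            + "\nUNION ALL\n".join(deltas))

print(build_consolidation_sql("2024-01-01 00:00:00"))
```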

Audit mechanism for ETL processes

a set of technical tables and procedures in Snowflake that provides end-to-end monitoring and observability of data pipelines, including data quality and completeness information for data consumers.
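
For illustration, the sketch below shows the kind of audit record each pipeline step might write; the actual audit schema is internal to the project. With run status, duration, and row counts in one place, delayed or anomalous runs become straightforward to query and alert on:

```python
# Hedged sketch: logging one ETL run to a hypothetical audit table, using
# the snowflake-connector-python default paramstyle (pyformat, %s bindings).
from datetime import datetime
from typing import Optional

AUDIT_INSERT = """
INSERT INTO AUDIT.PIPELINE_RUN
    (PIPELINE_NAME, RUN_ID, STATUS, ROWS_READ, ROWS_WRITTEN,
     STARTED_AT, FINISHED_AT, ERROR_MESSAGE)
VALUES (%s, %s, %s, %s, %s, %s, %s, %s)
"""

def log_run(conn, pipeline_name: str, run_id: str, status: str,
            rows_read: int, rows_written: int,
            started_at: datetime, finished_at: datetime,
            error_message: Optional[str] = None) -> None:
    """Persist one audit record so dashboards and alerts can consume it."""
    cur = conn.cursor()
    try:
        cur.execute(AUDIT_INSERT, (pipeline_name, run_id, status, rows_read,
                                   rows_written, started_at, finished_at,
                                   error_message))
    finally:
        cur.close()
```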

Benefits

Significantly reduced time needed to add a new data source

in the data warehouse, thanks to an easy and intuitive mechanism for configuring new sources and metadata-based automation in the ADF Native Framework.

Reduced the overall cost of data processing by 40%

by moving from full to incremental data loads in Snowflake and ADF.

Unified view of data from SAP systems

in the data warehouse, and fast loading of new SAP data from the central data hub thanks to the incremental loading mechanism.

Effective detection of problems with data pipelines

(delays, excessive processing times, irregularities in loaded data) and notification of business users about data problems, thanks to the information collected by the audit mechanism.

Increased business user confidence in data in the data warehouse

within 6 months, we achieved 25% growth in the number of users actively consuming data from the warehouse.

Reliable data pipeline deployment

thanks to mature CI/CD and DevOps processes established for the solution (DEV/TEST environments for both Azure services and Snowflake, with Azure DevOps for deployment management).

Discover the possibilities that data platforms offer for your business

Connect with our experts today