Kaluza

Transformational Data Governance & Linked Data Integration

Introduction

Kaluza is an intelligent energy platform that introduces new flexibility into the energy system by optimising individual devices.

Amarti provided strategic consultation and implemented transformative solutions for data modelling, observability, data contracts, and validation. We worked closely with the senior leadership team and the Director of Data to develop and implement a comprehensive data strategy. Additionally, amarti led initiatives on data observability, data quality, and data lineage, and introduced innovative concepts such as semantic integration using linked data and knowledge graphs.

Objective

Our primary objective was to enable Kaluza to establish a robust data governance framework, improve data integration processes, and enhance data quality. Our team aimed to centralise data efforts, decentralise mapping processes, and introduce new technologies and methodologies to optimise the data platform.

Methods

Strategic Consultation:

Amarti collaborated with the senior leadership team and the Director of Data to perform a comprehensive discovery phase. Our team assisted in formulating and implementing the organisation’s data strategy. We advocated for necessary cultural and organisational changes to support the data initiatives and ensure operational readiness.

Data Observability and Quality:

Amarti led the requirements gathering, education, and vendor selection processes for data observability, data quality tooling, and data catalogue solutions. We designed and implemented Kafka journey recording to ensure comprehensive tracking of data flow.

We established a data lineage framework to track data from individual Kafka services to the data warehouse and BI tools.
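As an illustration only, lineage of this kind can be modelled as a directed graph of data assets that is walked downstream or upstream. The sketch below is not the framework built for Kaluza; the asset names (`kafka:device-readings`, `warehouse:readings`, `bi:usage-dashboard`) and the `LINEAGE` structure are hypothetical.

```python
from collections import deque

# Hypothetical lineage edges: each asset maps to the assets it feeds.
LINEAGE = {
    "kafka:device-readings": ["kafka:readings-normalised"],
    "kafka:readings-normalised": ["warehouse:readings"],
    "warehouse:readings": ["bi:usage-dashboard"],
}


def downstream(asset, edges):
    """Return every asset reachable from `asset`, in breadth-first order."""
    seen = {asset}
    order = []
    queue = deque([asset])
    while queue:
        current = queue.popleft()
        for target in edges.get(current, []):
            if target not in seen:
                seen.add(target)
                order.append(target)
                queue.append(target)
    return order


def upstream(asset, edges):
    """Return every asset that `asset` depends on, nearest first."""
    inverted = {}
    for source, targets in edges.items():
        for target in targets:
            inverted.setdefault(target, []).append(source)
    return downstream(asset, inverted)


if __name__ == "__main__":
    print(downstream("kafka:device-readings", LINEAGE))
    print(upstream("bi:usage-dashboard", LINEAGE))
```

Walking upstream from a BI asset answers "where did this figure come from?", while walking downstream from a Kafka topic answers "what breaks if this topic changes?".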

Data Governance and Ownership:

We formed a data governance group based on the Data Mesh and Data as a Product organisational patterns, involving 16 domain champions drawn from the engineering teams.

We ran agile workshops to define the purpose, responsibilities, and boundaries of the data governance group, and facilitated collaborative ways of working within the group to ensure effective data management.

Semantic Integration and Linked Data:

Amarti introduced the concept of a semantic integration layer utilising linked data, knowledge graphs, and Elasticsearch in Neptune. We also developed a web-based application for discovering and browsing data and data models. We implemented complete provenance information for each data point, enabling tracking from the journey start to the data warehouse.

End-to-End Data Provenance Solution:

We integrated existing Kafka topics with the Linked Data solution using RML (RDF Mapping Language). We demonstrated the loading of RML mappings, data, metadata, and schema definitions for comprehensive data provenance tracking. We helped develop a web application and configured a production-ready system in AWS for seamless integration.
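To make the idea concrete, the sketch below shows in plain Python what a declarative mapping from a Kafka message to linked-data triples with provenance might look like. It is neither RML itself nor the solution delivered for Kaluza: the `MAPPING` structure, the `example.org` vocabulary, the field names, and the `kafka://` identifier scheme are all assumptions made for illustration. Only the `prov:wasDerivedFrom` predicate comes from a real vocabulary (W3C PROV-O).

```python
import json

# Hypothetical declarative mapping, loosely mirroring what a mapping rule
# states: how to build the subject, and which field feeds which predicate.
MAPPING = {
    "subject_template": "http://example.org/device/{deviceId}",
    "predicates": {
        "http://example.org/vocab#reading": "reading",
        "http://example.org/vocab#recordedAt": "timestamp",
    },
}

# W3C PROV-O predicate linking an entity to the source it was derived from.
PROV_DERIVED_FROM = "http://www.w3.org/ns/prov#wasDerivedFrom"


def to_triples(message, topic, offset, mapping):
    """Map one JSON-encoded message to (subject, predicate, object) triples."""
    record = json.loads(message)
    subject = mapping["subject_template"].format(**record)
    triples = [
        (subject, predicate, record[field])
        for predicate, field in mapping["predicates"].items()
        if field in record
    ]
    # Provenance: record which topic and offset the data point came from.
    # The kafka:// identifier format is an assumption for this sketch.
    triples.append((subject, PROV_DERIVED_FROM, f"kafka://{topic}/{offset}"))
    return triples


if __name__ == "__main__":
    raw = b'{"deviceId": "d1", "reading": 4.2, "timestamp": "2023-01-01T00:00:00Z"}'
    for triple in to_triples(raw, "device-readings", 7, MAPPING):
        print(triple)
```

Because the mapping is data and not code, it can be stored, versioned, and loaded alongside the schema definitions it refers to, which is the property that makes a mapping language attractive for provenance tracking.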

Data Normalisation and Mapping Efficiency:

We showcased the benefits of using RML to centralise data normalisation and mapping efforts. Amarti demonstrated how decentralising mapping efforts and domain model design among engineering teams could improve efficiency. We introduced the Kaluza platform as a means to reduce overall engineering efforts and enhance data mapping processes.

Architectural Guidance and Technological Advancements:

We provided architectural guidance for Kaluza’s Kafka infrastructure and proposed leveraging tools such as Apache Flink/Beam for stateful stream processing, as well as streaming databases. Additionally, we introduced new startups and technologies relevant to the data platform, methodology, and tooling.

Results and Impact

The work carried out by amarti resulted in significant improvements in Kaluza’s data governance, integration processes, and overall data management capabilities. The following outcomes and benefits were achieved:

Strengthened Data Governance:

The establishment of a data governance group based on the Data Mesh and Data as a Product organisational patterns proved to be a successful approach. With 16 domain champions drawn from the engineering teams, the governance group fostered ownership, collaboration, and responsibility for data management.

Agile workshops conducted by amarti facilitated the definition of purpose, responsibilities, and boundaries of the governance group. This led to a more streamlined and efficient data governance framework within the organisation.

Enhanced Data Integration and Provenance Tracking:

The introduction of a semantic integration layer utilising linked data, knowledge graphs, and Elasticsearch in Neptune significantly improved data integration processes. It enabled the organisation to search, discover, and browse data and data models effectively.

The implementation of a web-based application provided complete provenance information for each data point, offering end-to-end tracking from journey start to the data warehouse. This increased transparency and data reliability.

Streamlined Data Contracts and Validation:

The strategic consultation and collaboration with the senior leadership team and the Director of Data resulted in the development and implementation of effective data contracts and validation mechanisms. This ensured data quality and compliance with defined standards.

Amarti’s expertise in data contracts and validation empowered the organisation to make informed decisions and establish robust data validation processes.
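A data contract can be thought of as an agreed description of the fields a producer promises to publish, against which each record is checked. The minimal sketch below assumes a contract expressed as field names and Python types; the `CONTRACT` contents and the `validate` helper are hypothetical and do not represent the mechanism implemented for Kaluza.

```python
# Hypothetical contract: required field name -> expected Python type.
CONTRACT = {
    "deviceId": str,
    "reading": float,
    "timestamp": str,
}


def validate(record, contract):
    """Return a list of contract violations; an empty list means valid."""
    errors = []
    for field, expected in contract.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            actual = type(record[field]).__name__
            errors.append(f"{field}: expected {expected.__name__}, got {actual}")
    return errors


if __name__ == "__main__":
    good = {"deviceId": "d1", "reading": 4.2, "timestamp": "2023-01-01T00:00:00Z"}
    bad = {"deviceId": "d1", "reading": "4.2"}
    print(validate(good, CONTRACT))
    print(validate(bad, CONTRACT))
```

Returning all violations, not just the first, lets a producing team fix a broken record in one pass.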

Improved Efficiency in Data Normalisation and Mapping:

Amarti demonstrated the benefits of using RML (RDF Mapping Language) to centralise data normalisation and mapping efforts. This reduced duplication of efforts and improved overall engineering efficiency.

By decentralising mapping efforts and domain model design among engineering teams, amarti promoted a more collaborative and democratised approach to data mapping and modelling.

Architectural Guidance and Technological Advancements:

Amarti provided architectural guidance on the organisation’s Kafka infrastructure, enabling the organisation to optimise its data streaming processes.

Introducing tools such as Apache Flink/Beam for stateful stream processing, together with streaming databases, enabled the organisation to handle complex asynchronous patterns effectively.

Amarti’s guidance on new startups, technologies, and relevant developments in the data platform, methodology, and tooling helped the organisation stay up-to-date with industry trends and adopt innovative solutions.

In Summary

Overall, amarti’s work brought about transformative changes in data governance, integration processes, and data management for Kaluza. Implementing a semantic integration layer, robust data contracts and validation, and efficient data normalisation and mapping contributed to improved data quality, reliability, and collaboration within the organisation. The guidance on architectural improvements and technological advancements empowered Kaluza to leverage cutting-edge tools and techniques, enhancing its overall data platform capabilities.