Master Data Distribution in Engineering

A FAIR Comparison Between File-based RDF and Web-accessible Linked Data

Master Data is authoritative information for reuse. In the modern business landscape, it stands as a foundational element in enabling informed decisions and streamlined operations. The effective handling and distribution of Master Data have become non-negotiable, ensuring its relevance throughout the entire organization [1]. Managing Master Data has proven to be challenging in practice, with issues such as segregated data silos, integration complexity, and obstructive organizational layers all creating hurdles [2].

In this blog post, we discuss how to effectively distribute Master Data in engineering projects, where organizations are temporary and the application landscape is diverse. More specifically, we compare the effectiveness of file-based RDF and web-accessible Linked Data for Master Data distribution, using the FAIR principles for machine-actionable information sharing [3]. Linked Data is to Master Data what Solid is to personal information: a standardized way to structure and connect decentralized information seamlessly [4]. The data model underlying Linked Data is the Resource Description Framework (RDF), which can be written in several syntaxes, such as Turtle and JSON-LD.

Master Data in the Linked Data Stack
Figure 1. Linked Data Stack (Semantic Web Layer Cake) [5]

FAIR Principles for Master Data Distribution

Within the landscape of engineering, temporary consortia form among various organizations, such as asset owners, engineering firms, and contractors. The need for collaboration between organizations across the project life cycle necessitates the integration of distinct processes and software tools. This diverse information landscape presents a challenge: the imperative need to distribute Master Data across systems.

The FAIR principles were established so that data management can take advantage of increasingly capable computational tools. These principles require data to be:
1. Findable through unique identifiers, rich metadata, and searchable indexes.
2. Accessible with open protocols and accessible metadata.
3. Interoperable using shared languages, vocabularies, and references.
4. Reusable with detailed metadata, accessible licenses, and adherence to standards.

By embracing these FAIR principles, Linked Data standards such as RDF pave the way for efficient Master Data distribution, enabling interoperability between various stakeholders, tools, and data sources in engineering projects [6].

RDF and Web-accessible Linked Data for Master Data Distribution

In this section, we explore how RDF and Linked Data align with the FAIR principles. For each principle, the extent to which RDF is able to fulfill the criteria is presented, followed by the additional capabilities offered by utilizing the web-accessibility functionality of the Linked Data Stack.

Findability
In RDF, each element is identified by its Uniform Resource Identifier (URI); these globally unique identifiers are a cornerstone for findability. Sharing an RDF dataset on an indexed part of the internet ensures the information is findable for data consumers in engineering projects. Full-fledged Linked Data adds explicit links between datasets, enabling those consuming data to trace links from project information to specific Master Data entities and from Master Data to entities in other Master Data sets. This aids the findability of the Master Data by humans and machines, eliminating the need to manually search other sources for more information on related entities.
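As a minimal sketch of the link tracing described above, the Python snippet below models triples as plain tuples and follows links from a project entity into a Master Data set. All `example.org` URIs and the helper `follow_links` are hypothetical and chosen only for illustration; a real project would typically use an RDF library rather than tuples.

```python
# Hypothetical URIs; each triple is a (subject, predicate, object) tuple.
PROJECT = "https://example.org/project/"
MASTER = "https://example.org/masterdata/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"

triples = {
    (PROJECT + "pump-001", RDF_TYPE, MASTER + "CentrifugalPump"),
    (MASTER + "CentrifugalPump", RDFS_SUBCLASS_OF, MASTER + "Pump"),
    (MASTER + "Pump", RDFS_SUBCLASS_OF, MASTER + "Equipment"),
}


def follow_links(start, graph):
    """Return every URI reachable from `start` by following subject -> object links."""
    seen = []
    frontier = [start]
    while frontier:
        current = frontier.pop()
        for subject, _predicate, obj in sorted(graph):
            if subject == current and obj not in seen:
                seen.append(obj)
                frontier.append(obj)
    return seen


# Starting from one project entity, every related Master Data entity is found.
reachable = follow_links(PROJECT + "pump-001", triples)
```

Because every entity is named by a URI, the traversal needs no dataset-specific lookup logic: the same function works whether the next triple comes from the project data or from a Master Data set.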

Accessibility
While file-based RDF ensures basic findability of Master Data, the accessibility principle requires the data to be retrievable through a standardized and open communications protocol (e.g., HTTP, email). Openness is not a requirement for the data itself, nor does the data have to be free [7]: authentication and authorization should give dataset owners the ability to control access to the Master Data. In the case of web-accessible Linked Data distribution, access control lies with the Master Data distributor. RDF itself is a data model, so it relies on other technologies for data accessibility. The Linked Data Stack includes an open standard for this: SPARQL, for querying Linked Data through standardized HTTP endpoints. This standardized combination of technologies empowers data consumers to establish consistent solutions and query across sources. Consequently, machine accessibility and actionability improve significantly compared to file-based RDF combined with communications protocols that are not part of the Linked Data Stack.
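To illustrate what querying through a standardized HTTP endpoint can look like, the sketch below builds a SPARQL query request using only the Python standard library. The endpoint URL, the queried URI, and the helper `build_sparql_request` are assumptions made for this example; the request is constructed but deliberately not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical SPARQL endpoint of a Master Data distributor.
ENDPOINT = "https://example.org/sparql"

# Ask for the label of one (hypothetical) Master Data entity.
QUERY = """
SELECT ?label WHERE {
  <https://example.org/masterdata/Pump>
      <http://www.w3.org/2000/01/rdf-schema#label> ?label .
}
"""


def build_sparql_request(endpoint, query):
    """Build, without sending, an HTTP GET request carrying a SPARQL query."""
    url = endpoint + "?" + urlencode({"query": query})
    headers = {"Accept": "application/sparql-results+json"}
    return Request(url, headers=headers, method="GET")


request = build_sparql_request(ENDPOINT, QUERY)
```

Sending the request, for example with `urllib.request.urlopen(request)`, would require a reachable endpoint and, where the distributor enforces access control, suitable credentials.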

Interoperability
RDF, a formal, accessible, and broadly applicable framework, offers an improved understanding and interpretation of data across sources compared to non-RDF data exchange. This is a foundation for interoperability and consistent use of Master Data. To further improve interoperability, Linked Data incorporates explicit and traceable links to point at entities stored in other data sources. These links, in conjunction with accessibility capabilities, enable seamless interpretation of project data, Master Data, and any other datasets referenced by or referring to the aforementioned data.
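The following sketch shows, under simplifying assumptions, why shared URIs make datasets from different sources easy to combine. Triples are again plain tuples, literals are reduced to Python strings, and all `example.org` URIs as well as the helper `labels_of_instances` are hypothetical.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
PUMP = "https://example.org/masterdata/Pump"  # hypothetical Master Data entity

# Published by the Master Data distributor.
master_data = {(PUMP, RDFS_LABEL, "Pump")}

# Produced in a project; refers to Master Data by URI only.
project_data = {
    ("https://example.org/project/pump-001", RDF_TYPE, PUMP),
    ("https://example.org/project/pump-002", RDF_TYPE, PUMP),
}

# Graphs without blank nodes can be merged with a plain set union.
merged = master_data | project_data


def labels_of_instances(graph):
    """Map each typed instance to the label of its type, if the graph holds one."""
    labels = {s: o for s, p, o in graph if p == RDFS_LABEL}
    return {s: labels[o] for s, p, o in graph if p == RDF_TYPE and o in labels}


with_master_data = labels_of_instances(merged)
without_master_data = labels_of_instances(project_data)
```

On its own, the project data yields no labels; after the union, each project entity can be interpreted using the Master Data it points to, without any mapping step between the two sources.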

Reusability
The FAIR reusability principle partly concerns the quality of the information and the presence of clear and accessible data usage licenses. While these aspects are not directly related to the use of the Linked Data Stack, RDF does allow the expression of such information related to Master Data. Web-accessible Linked Data offers additional reusability features by connecting relevant community standards as linked and accessible contextual information. Furthermore, Master Data distributed in versioned web-accessible Linked Data sets remains in its original location, thus providing detailed provenance information to those who consume the Master Data.
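As a small illustration of expressing license and version information alongside Master Data, the sketch below describes a dataset with Dublin Core and OWL properties and checks which descriptive properties are missing. The dataset URI, the version string, the helper `missing_metadata`, and the choice of required properties are assumptions for this example, not a checklist prescribed by FAIR.

```python
DCT = "http://purl.org/dc/terms/"
OWL_VERSION_INFO = "http://www.w3.org/2002/07/owl#versionInfo"

# Assumed minimum set of descriptive properties for this example.
REQUIRED = (DCT + "license", DCT + "publisher", OWL_VERSION_INFO)

DATASET = "https://example.org/masterdata/"  # hypothetical dataset URI

metadata = {
    (DATASET, DCT + "license", "https://creativecommons.org/licenses/by/4.0/"),
    (DATASET, OWL_VERSION_INFO, "1.2.0"),
}


def missing_metadata(subject, graph, required=REQUIRED):
    """List the required properties that `subject` does not have in `graph`."""
    present = {p for s, p, _o in graph if s == subject}
    return [p for p in required if p not in present]


# Here the publisher is missing, so one property is reported.
missing = missing_metadata(DATASET, metadata)
```

A check like this could run before a new version of a Master Data set is published, so that consumers always find the license and version next to the data itself.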

Conclusion

In conclusion, while file-based RDF serves as a viable method for exchanging Master Data across diverse applications and organizations, there exists an opportunity to elevate the FAIRness of Master Data. This elevation can be achieved by adopting web-accessible Linked Data as a standardized solution. We encourage organizations to consider embracing the Linked Data Stack, a powerful approach that effectively addresses the complexities of Master Data Management in engineering.

Join the Linked Data movement today, and connect with Bram Bazuin to discuss practical implementation strategies. Your journey toward optimized Master Data awaits.

References:

1. McKinsey. (2023). The evolution of the data-driven enterprise. https://www.mckinsey.com/

2. DAMA. (2017). DAMA-DMBOK, 2nd Edition. Technics Publications.

3. GO FAIR. (n.d.). FAIR Principles. https://www.go-fair.org/fair-principles/

4. Solid. (n.d.). About Solid. https://solidproject.org/about

5. W3C. (n.d.). Semantic Web Activity Homepage. https://www.w3.org/2001/sw/

6. Garijo, D., & Poveda-Villalón, M. (2020). Best Practices for Implementing FAIR Vocabularies and Ontologies on the Web. IOS Press.

Contact
Bram Bazuin Head of Fields and Capabilities at Semmtech
