Fetch: Bridging Linked Data and Applications (Integrations Blog Part 2)

Fetch is an open-source solution designed to implement the referencing and importing functionalities of the plug-in framework. It enables linked data specialists to query RDF data sources via SPARQL and integrate selected data into target applications. Once configured, domain experts without technical knowledge of linked data can easily choose and apply information from ontologies within their everyday software tools.

Fetch simplifies data integration by acting as a bridge between linked data sources and traditional data structures while promoting broader adoption of semantic web technologies. This open-source approach encourages community contributions and collaborative innovation.

Fetch operates through three key ETL (Extract, Transform, Load) components: Query, Transform, and Provide. Each is designed to be modular, allowing customization for different applications. Users can configure these components via an intuitive web-based admin panel, while the actual referencing or importing of ontologies happens directly within the target application through an HTML iframe.

Image 1: HTML iframe

The next sections explore in depth how these components work together to streamline data integration.

Querying: Retrieving Data from Linked Data Sources

The Query component of Fetch is responsible for retrieving data from SPARQL endpoints, which are standardized access points in the linked data ecosystem. These endpoints enable users to query structured data via the SPARQL query language, promoting seamless data integration across various sources. Fetch allows administrators to configure multiple endpoints with unique URLs and authentication settings, ensuring flexible access to diverse datasets.
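
To make this concrete, here is a minimal sketch of running a SELECT query against an endpoint with Apache Jena (the library Fetch uses internally for result processing). The endpoint URL and query are illustrative placeholders, not part of a real Fetch configuration:

```java
// Minimal sketch: running a SELECT query against a configured SPARQL endpoint
// with Apache Jena. The endpoint URL and query are illustrative placeholders.
import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;
import org.apache.jena.query.QuerySolution;
import org.apache.jena.query.ResultSet;

public class EndpointQuery {
    public static void main(String[] args) {
        String endpoint = "https://example.org/sparql";  // hypothetical endpoint URL
        String query = """
                PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
                SELECT ?s ?label WHERE { ?s rdfs:label ?label } LIMIT 10
                """;
        // Execute the query over HTTP and print each solution.
        try (QueryExecution qexec = QueryExecutionFactory.sparqlService(endpoint, query)) {
            ResultSet results = qexec.execSelect();
            while (results.hasNext()) {
                QuerySolution row = results.next();
                System.out.println(row.get("s") + "  " + row.get("label"));
            }
        }
    }
}
```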

Authentication

Fetch supports basic authentication with a username and password, as well as token-based authentication, both via standard HTTP methods. While the current options are simple, the system is designed to accommodate additional protocols, such as OAuth, if required.
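
For illustration, both styles boil down to an Authorization header on the outgoing HTTP request. A minimal sketch with Java's standard HTTP client, where the endpoint URL, credentials, and token are all placeholders:

```java
// Sketch of the two supported authentication styles as standard HTTP headers.
// Endpoint URL, credentials, and token are illustrative placeholders.
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.Base64;

public class AuthHeaders {
    public static void main(String[] args) {
        String endpoint = "https://example.org/sparql";
        // Basic authentication: base64-encoded "username:password".
        String basic = Base64.getEncoder().encodeToString("user:password".getBytes());
        HttpRequest withBasic = HttpRequest.newBuilder(URI.create(endpoint))
                .header("Authorization", "Basic " + basic)
                .GET().build();
        // Token authentication: a bearer token issued by the endpoint.
        HttpRequest withToken = HttpRequest.newBuilder(URI.create(endpoint))
                .header("Authorization", "Bearer my-token")
                .GET().build();
        System.out.println(withBasic.headers());
        System.out.println(withToken.headers());
    }
}
```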

SPARQL Queries

Fetch enables users to create custom SPARQL queries for flexible data retrieval. These queries fall into two categories:

  • Visualization Queries: Used to display data within the target application’s interface (via an HTML iframe). Users can filter and navigate the retrieved data, often shown in a tree-table representation.
  • Import Queries: Designed for data import after user selection within the interface. These queries work sequentially, ensuring an organized flow of data retrieval and transformation.

To avoid complex queries and duplicate results, Fetch follows a step-by-step approach. For example, when importing definitions of physical objects and related requirements, multiple smaller queries are created, each linked to a specific import step. This method ensures accurate data retrieval and simplifies integration.

Typical query use cases include (a sample query follows the list):

  • Class Definitions: Fetching class names, codes, and descriptions.
  • Hierarchy Determination: Identifying parent-child relationships for tree structures.
  • Related Elements: Retrieving associated information like documents, properties, or requirements.
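
As an example, a visualization query covering the first two use cases might look like the following sketch. The property choices (skos:notation, skos:definition, rdfs:subClassOf) are illustrative; the actual properties depend entirely on the ontology being queried:

```sparql
# Hypothetical visualization query: fetch class labels, codes, definitions,
# and parent classes for building the tree structure. Property choices are
# illustrative and depend on the ontology in use.
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT ?class ?label ?code ?definition ?parent
WHERE {
  ?class a rdfs:Class ;
         rdfs:label ?label .
  OPTIONAL { ?class skos:notation ?code }
  OPTIONAL { ?class skos:definition ?definition }
  OPTIONAL { ?class rdfs:subClassOf ?parent }
}
ORDER BY ?label
```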

Fetch connects visualization and import queries, allowing users to reuse data across query stages. For instance, URIs retrieved in visualization queries can serve as input for subsequent imports. This is done through a find-and-replace mechanism, where Fetch substitutes placeholders with actual values from previous query results.
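
For instance, an import query for one step might contain a placeholder that Fetch fills in at runtime. In this hypothetical sketch, the token $uri$ stands in for whatever placeholder syntax a configuration actually uses:

```sparql
# Hypothetical import query for a single import step. Before execution,
# Fetch's find-and-replace mechanism substitutes the placeholder $uri$
# with a URI the user selected in the visualization step, e.g.
# <https://example.org/objects/Bridge-01>, yielding a valid query.
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?property ?value
WHERE {
  $uri$ ?property ?value .
}
```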

Ultimately, the Query component empowers users to selectively retrieve and import data from linked data repositories.

Transformation: Adapting Data for Target Applications

Once data is retrieved, the Transformation component of Fetch converts it into a structure compatible with the target application. Currently, Fetch supports output in JSON and XML formats. Internally, it relies on Apache Jena to process the results, ensuring flexibility regardless of the SPARQL query’s output format.

The current XML transformation was developed specifically for a Relatics implementation. The process follows a straightforward algorithm (a code sketch follows the list):

  1. Create an <Import> root element.
  2. For each entity in the query result, add a <Row> element under the root.
  3. For every binding in the query solution, add an XML attribute to the <Row> element, using the variable name as the attribute name and the binding value as its value.
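
A minimal Java sketch of this algorithm, assuming the results arrive as an Apache Jena ResultSet (names such as buildImportXml are ours for illustration, not Fetch's actual internals). Because Jena parses the endpoint's response into a ResultSet, the same code works whether the results came back as XML or JSON:

```java
import org.apache.jena.query.QuerySolution;
import org.apache.jena.query.ResultSet;

/**
 * Minimal sketch of the Relatics-style XML transformation described above.
 * Class and method names are illustrative, not Fetch's internals.
 */
public class RelaticsXml {

    static String buildImportXml(ResultSet results) {
        StringBuilder xml = new StringBuilder("<Import>");   // step 1: root element
        while (results.hasNext()) {
            QuerySolution row = results.next();              // step 2: one <Row> per entity
            xml.append("<Row");
            row.varNames().forEachRemaining(var -> {         // step 3: one attribute per binding
                String value = row.get(var).toString();
                xml.append(' ').append(var).append("=\"")
                   .append(escape(value)).append('"');
            });
            xml.append("/>");
        }
        return xml.append("</Import>").toString();
    }

    // Escape characters that are illegal inside XML attribute values.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace("\"", "&quot;");
    }
}
```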

While this XML structure works for Relatics, other applications may require different formats. Fetch’s modular design allows the algorithm to be adapted for custom XML transformations, ensuring compatibility across diverse platforms.

Providing Data: Delivering Information to Target Applications

After transformation, the Providing component delivers the data to the designated target application. To enable this, Fetch requires users to first register the target application within the tool, specifying:

  • Destination environment: The URL of the web service or API endpoint.
  • Workspace (optional): If the application has specific project environments.
  • Authentication credentials: The identifiers and access codes required by the target application.

Users also configure the API endpoints or web services responsible for triggering data imports. Fetch allows for bidirectional data exchange, meaning users can not only send but also receive existing data from the target application. This ensures alignment between new imports and existing datasets.

Fetch currently supports two integration technologies (a REST delivery sketch follows the list):

  • SOAP-based web services.
  • REST APIs, optimized for JSON data exchange but adaptable to other formats.
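
As a sketch of the REST delivery step, the following posts transformed data to a registered target application. The URL, credentials, and payload are placeholders; a real Fetch setup takes these from the admin panel configuration:

```java
// Minimal sketch of delivering transformed data to a target application's
// REST API. Target URL, credentials, and payload are illustrative placeholders.
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Base64;

public class ProvideStep {
    public static void main(String[] args) throws Exception {
        String target = "https://target.example.com/api/import";  // hypothetical API endpoint
        String payload = "{\"rows\": []}";                        // transformed JSON payload
        String auth = Base64.getEncoder().encodeToString("id:secret".getBytes());
        HttpRequest request = HttpRequest.newBuilder(URI.create(target))
                .header("Authorization", "Basic " + auth)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(payload))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println("Import status: " + response.statusCode());
    }
}
```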

Through this structured approach, Fetch streamlines data transfer, enabling users to enrich their software environments with ontology-based information while maintaining compatibility with existing workflows.

Configurations: Bringing It All Together

The Configurations component of Fetch integrates all reusable elements (SPARQL queries, endpoints, and target application setups) into a single workflow. Each configuration is divided into three key parts:

  1. Environment: Specifies the SPARQL endpoint used to retrieve data.
  2. Visualize: Defines the SPARQL queries responsible for displaying the RDF data within the target application’s interface.
  3. Import Steps: Allows users to define a sequence of SPARQL queries for importing data. Each step includes the target web service or API parameter that will receive the corresponding data.

Once the configuration is finalized, Fetch generates an HTML snippet for embedding the iframe within the target application, streamlining the integration of linked data into existing workflows.
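
The generated snippet is ordinary HTML along these lines (the src URL and its configuration parameter are placeholders, not Fetch's actual embed syntax):

```html
<!-- Hypothetical embed snippet; the src URL and its configuration
     parameter are placeholders, not Fetch's actual syntax. -->
<iframe src="https://fetch.example.org/embed?configuration=42"
        width="100%" height="600" title="Fetch">
</iframe>
```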


Image 2: Configurations page in Fetch

Target Application

The final outcome of the Fetch setup is an iframe that integrates directly into the target application. This iframe allows users to navigate the RDF data, apply filters, and select multiple objects within a hierarchy.

By embedding Fetch’s functionality within their familiar software environment, users can streamline data import workflows and maximize the value of linked data in day-to-day operations.


Image 3: Fetch iframe embedded in Relatics

Conclusion

Stay tuned for the final installment of this blog series! We’ll present a real-world case study showcasing our concept in action with an engineering client. Follow us on LinkedIn so you don’t miss the release; we’ll announce as soon as the case study is live! If you have any questions, please contact us here or reach out to the author of this blog, Utku Sivacilar.