The enRichMyData toolbox is a set of loosely coupled but interoperable tools and services that supply the functional capabilities needed to design data enrichment pipelines. It is meant to handle complex data enrichment scenarios, where tools and services can be combined and customized as needed. The core of the toolbox is TAO (Tool Augmentation by user enhancements and Orchestration), which orchestrates heterogeneous processing components and libraries to process enrichment data. A pipeline is built and run in the following steps:
· Preparation of resources such as execution nodes, processing components, data sources and sinks.
· Definition of a workflow pipeline as a processing chain.
· Execution of the workflow pipeline.
· Retrieval/visualization of the pipeline execution results.
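The four steps above can be illustrated with a minimal, self-contained Python sketch. It does not use TAO or its API: the `Pipeline` class, the two processing components, and the sample data are all hypothetical and only show the idea of a workflow as a processing chain that reads from a source and writes to a sink.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Rows = List[Dict[str, str]]


@dataclass
class Pipeline:
    """Hypothetical workflow pipeline: an ordered chain of processing steps."""
    name: str
    steps: List[Callable[[Rows], Rows]] = field(default_factory=list)

    def add_step(self, step: Callable[[Rows], Rows]) -> "Pipeline":
        self.steps.append(step)
        return self

    def run(self, rows: Rows) -> Rows:
        # Each step receives the output of the previous one.
        for step in self.steps:
            rows = step(rows)
        return rows


# Step 1: prepare resources (an in-memory data source, a sink, two components).
source: Rows = [{"city": "Oslo", "temp": "4"}, {"city": "Rome", "temp": "15"}]
sink: Rows = []


def add_label(rows: Rows) -> Rows:
    """Illustrative enrichment: derive a new column from an existing one."""
    return [{**r, "label": "warm" if float(r["temp"]) >= 10 else "cold"} for r in rows]


def keep_columns(columns: List[str]) -> Callable[[Rows], Rows]:
    """Build a step that keeps only the listed columns."""
    def step(rows: Rows) -> Rows:
        return [{c: r[c] for c in columns} for r in rows]
    return step


# Step 2: define the workflow pipeline as a processing chain.
pipeline = Pipeline("demo").add_step(add_label).add_step(keep_columns(["city", "label"]))

# Step 3: execute the pipeline and store the result in the sink.
sink.extend(pipeline.run(source))

# Step 4: retrieve and display the execution results.
for row in sink:
    print(row)
```

In the toolbox itself these roles are filled by execution nodes, integrated components, and the TAO user interface rather than by in-process Python functions.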
The toolbox offers the following functionalities:
· An integration framework, represented by TAO (see the TAO Project on GitHub).
· Several tools (lamAPI, OntoRefine, SemTUI) have already been integrated as Docker images, along with a set of additional helper tools used for different operations (e.g., open-meteo invocation, filtering CSV columns, merging CSV files, etc.). See https://enrichmydata.github.io/toolbox/ for the GitHub locations of the integrated tools.
· Demonstration pipelines for the Spend Network and JOT use cases are already included in the system database as a proof of concept. See the ReadMe.txt file linked below for how to install these pipelines in your own TAO instance.
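To give a sense of what helper operations such as filtering CSV columns and merging CSV files involve, here is a small sketch using only the Python standard library. It is not the code of the toolbox's helper tools: the function names are hypothetical, and "merging" is assumed here to mean appending the rows of two files that share the same header.

```python
import csv
import io
from typing import List


def filter_columns(csv_text: str, columns: List[str]) -> str:
    """Return a CSV that contains only the requested columns, in that order."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=columns, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)  # columns not listed in fieldnames are dropped
    return out.getvalue()


def merge_csv(first: str, second: str) -> str:
    """Append the rows of `second` to `first`; both must share one header."""
    rows_first = list(csv.reader(io.StringIO(first)))
    rows_second = list(csv.reader(io.StringIO(second)))
    if rows_first[0] != rows_second[0]:
        raise ValueError("CSV headers differ; cannot merge by appending rows")
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows_first + rows_second[1:])
    return out.getvalue()


# Made-up sample data for illustration only.
filtered = filter_columns("id,name,price\n1,pen,2\n", ["id", "price"])
merged = merge_csv(filtered, "id,price\n2,5\n")
print(merged)
```

In a pipeline, each such operation would run as its own component, with the output file of one step passed as input to the next.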
The toolbox is intended for data scientists and engineers who explore, design, and implement data enrichment pipelines. The only requirement is a virtual machine running Ubuntu 22 (as described in the ReadMe.txt document). Because the toolbox is open source, it can be extended with new tools.
Information about the toolbox can be found at:
· enRichMyData toolbox installation package
https://s3.waw3-2.cloudferro.com/swift/v1/EMD/20241015_EmdInstallPackage.zip
· enRichMyData toolbox user manual with information about installation and usage
https://s3.waw3-2.cloudferro.com/swift/v1/EMD/D3.2_User_Manual.docx
· enRichMyData installation ReadMe.txt file
https://s3.waw3-2.cloudferro.com/swift/v1/EMD/ReadMe.txt
· enRichMyData V2.0 Changelog – what’s new in the current version
https://s3.waw3-2.cloudferro.com/swift/v1/EMD/CHANGELOG.md
· enRichMyData toolbox on GitHub:
https://enrichmydata.github.io/toolbox/
· Deployed instance of the enRichMyData toolbox:
https://64.225.143.9:8443/ui/login.html
· A video demonstrating the usage of the toolbox is available here: