ABOUT THE PROJECT

enRichMyData delivers an open software toolbox – the enRichMyData toolbox – comprising practical, robust and scalable components to support organizations in enriching their data with reference data they may have limited knowledge of, as well as supporting data providers in making their data reusable and available in data enrichment processes.

The toolbox lowers the technological entry barriers by supporting the definition of highly scalable and replicable data enrichment pipelines through a set of tools and infrastructure services that cover the capabilities needed across the lifecycle of enrichment pipelines. It makes the data enrichment process accessible to a broader set of stakeholders by reducing the expertise required and improving tool support.

DISCOVERY OF POTENTIALLY VALUABLE DATA FOR DATA ENRICHMENT

Improve data discovery and profiling, featuring search over data, ontologies, and semantic data profiles, to identify potentially valuable data for data enrichment.

WRAPPING DATA SOURCES IN DIFFERENT FORMATS

Improve wrapping of data sources in different formats so they can be securely accessed as virtual semantic graphs and used more easily for data enrichment.

SIMPLIFIED CLEANING, LINKING AND EXTENSION OF DATA

Simplify cleaning, linking (to reference resources), and extension of structured and semi-structured data, featuring approaches that enable users to specify such operations visually.

SIMPLIFIED ANNOTATION AND CLASSIFICATION OF DATA

Simplify annotation and classification of textual data, featuring entity and concept extraction, feature extraction (via embeddings), and classification with predefined and custom classifiers.

SUPPORT THE MANAGEMENT OF DATA ENRICHMENT PIPELINES

Support the management of data enrichment pipelines, including the creation and operation of data linking and extension services, a framework for deployment and execution of pipelines at a large scale, and reuse and extension of existing pipelines to deliver a hub of data and services for data enrichment.

SUPPORT DATA STREAMING IN DATA ENRICHMENT PIPELINES

Support data streaming in data enrichment pipelines, featuring support for setting up appropriate endpoints and ensuring high throughput during pipeline execution.

ENERGY CONSUMPTION REDUCTION FOR DATA ENRICHMENT PIPELINES

Monitor and reduce the energy consumed when executing data enrichment pipelines, using models to estimate and track their carbon footprint.

THE CONSORTIUM

The consortium consists of 13 partners from 11 countries. It has three strong university partners specialised in Big Data, distributed computing, and high-productivity languages, led by a research institute. Additionally, one research institute and one international organisation are involved. enRichMyData gathers three SMEs and five large companies that keep the project focused on achieving high business impact.
