High-quality, rich, and meaningful data are crucial to successfully implementing Artificial Intelligence (AI) and Big Data Analytics (BDA) solutions. Delivering the data required to feed AI and BDA models is costly and difficult, and is often constrained by limited availability of both data and skills. It is commonly reported that up to 80% of the effort spent in AI and BDA projects is dedicated to ensuring that data is fit for purpose. Activities are required to discover, understand, select, clean, transform, and integrate data from a variety of sources so that the data can be fed into the modelling phase. Such activities result in enriched data, which in turn improves the quality of downstream BDA and AI applications. The data enrichment process is implemented by specifying, deploying, and executing data enrichment pipelines over data that can be structured, semi-structured, or unstructured, that can come in large amounts, and that can originate from static or streaming sources. While techniques exist to cover individual enrichment operations such as data cleaning, linking, feature extraction, classification, and semantic annotation, the lack of comprehensive approaches and established tools dedicated to data enrichment makes the definition, implementation, and operation of enrichment pipelines difficult for many organizations that want to improve their BDA and AI applications.
The overall vision of the enRichMyData project is to create a novel paradigm for building rich, high-quality, valuable, and FAIR-compliant datasets to feed downstream BDA and AI applications in the context of data-sharing ecosystems, such as data spaces. The paradigm facilitates the specification and execution of data enrichment pipelines, with a focus on supporting a variety of data enrichment operations. enRichMyData makes this paradigm easily accessible to a wide set of large and small organizations that have difficulty delivering suitable data to their BDA and AI solutions because they lack usable tools and expertise for the cost-effective management of data enrichment pipelines.
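As a rough illustration of what "specifying and executing a data enrichment pipeline" can mean, the sketch below chains a cleaning step and a linking step over a handful of records. It is a generic example, not the enRichMyData toolbox or its API: the function names (`clean`, `make_linker`, `run_pipeline`), the record fields, and the reference table are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical record and step types; not part of any enRichMyData API.
Record = Dict[str, str]
Step = Callable[[Record], Record]


def clean(record: Record) -> Record:
    """Cleaning step: trim whitespace and drop empty fields."""
    return {key: value.strip() for key, value in record.items() if value.strip()}


def make_linker(reference: Dict[str, str]) -> Step:
    """Linking step factory: attach an identifier from a reference table
    when the (lower-cased) company name is found in it."""
    def link(record: Record) -> Record:
        enriched = dict(record)
        key = record.get("company", "").lower()
        if key in reference:
            enriched["company_id"] = reference[key]
        return enriched
    return link


def run_pipeline(records: Iterable[Record], steps: List[Step]) -> List[Record]:
    """Apply every step, in order, to every record."""
    results = []
    for record in records:
        for step in steps:
            record = step(record)
        results.append(record)
    return results


# Made-up input data and reference table, for illustration only.
raw = [
    {"company": "  Acme Ltd ", "country": "NO", "note": ""},
    {"company": "Unknown Co", "country": " "},
]
reference = {"acme ltd": "ORG-001"}

enriched = run_pipeline(raw, [clean, make_linker(reference)])
```

Here the pipeline is just an ordered list of functions, so further operations (for example classification or semantic annotation) could be added as extra steps without changing `run_pipeline`.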
News & Events
enRichMyData Toolbox Version 2 Released
The enRichMyData Toolbox Version 2 is designed to handle even the most complex data enrichment scenarios. Useful for data scientists or engineers, this open-source toolbox provides everything needed to design, execute, and optimize data enrichment pipelines with ease. The toolbox brings together a collection of interoperable tools and services that can be seamlessly combined and
enRichMyData Joins a New Collaboration at BDVAF 2024
enRichMyData was part of the BDVAF’s session “Leveraging Technologies for Data Management to Implement Data Spaces”. This session set the foundation for a new collaboration with innovative projects like WATERVERSE EU, GREEN.DAT.AI, SEDIMARK, DataBri-X, and STELAR. Iroshani Jayawardene, a Research Scientist from SINTEF, delivered a talk on “Finding Data in Data Spaces – Entity Linking for
enRichMyData Participated in a BDVAF Session as Part of the EUDATA+ Cluster
enRichMyData participated in the BDVAF’s session “Advancing Data Lifecycle Management: Tools and Strategies for Enhanced Monetisation” as part of the EUDATA+ cluster. This was an excellent opportunity to present how the toolbox developed by the project applies to our use cases and to map possible synergies with the other projects in the cluster (FAME,
enRichMyData Hosted an Inspirational Session at the BDVAF 2024
During the 3-day BDVA forum in Budapest, enRichMyData organised a productive session on Generative AI for Data and Knowledge Engineering, sponsored by the Dynabic-eu, HUMAINE and INTEND projects. The speakers explored the complexity, transparency, fairness, and accountability of adopting generative AI in data and knowledge engineering. They also highlighted some engineering challenges that arise when moving from
Join EUDATA+: European Data Sharing and Monetization Cluster at EBDVF 2024!
We are excited to announce the launch of EUDATA+, a new European Data Sharing and Monetization Cluster that aims to transform the data landscape. This groundbreaking initiative will bring together leading specialists in data sharing and monetization, creating a collaborative space for innovation and growth. As part of the cluster, enRichMyData is proud to collaborate with DATAMITE, PISTIS,
enRichMyData, INTEND, and Dynabic Projects to Host “Generative AI for Data and Knowledge Engineering” Session at BDVAF 2024
The enRichMyData project, in collaboration with the INTEND and Dynabic projects, is set to host a session titled “Generative AI for Data and Knowledge Engineering” at the Big Data Value Association Forum (BDVAF) 2024 in Budapest, taking place from 2 to 4 October. This session will explore the transformative potential of generative AI, particularly large foundational
Consortium
The enRichMyData project is coordinated by SINTEF (Norway), one of Europe’s largest independent research organisations. The project partners include companies such as Philips (The Netherlands) and Bosch (Germany), dedicated to engineering and manufacturing; Speed Network (Estonia), a provider of procurement data; JOT Internet Media (Spain), a digital marketing company; CS Group (Romania), a software service company; Expert AI (Italy), a technology company specializing in natural language understanding; and Ontotext (Bulgaria), a semantic technology company. They will have the full support of the research partners, which, in addition to SINTEF, include the University of Milano Bicocca (Italy), Jozef Stefan Institute (Slovenia), University of Copenhagen (Denmark), GATE Institute (Bulgaria), and BGRIMM Technology Group (China).