enRichMyData shines at the 23rd International Semantic Web Conference

The International Semantic Web Conference (ISWC), now in its 23rd edition, stands as the leading venue for advancing research and practical applications in semantic web and knowledge graph technologies. These technologies are pivotal in promoting interoperability and simplifying data enrichment—a mission that resonates deeply with enRichMyData. Naturally, our teams were well-represented at ISWC, contributing to various aspects of this rapidly evolving field.

LLMs for ontology learning: research paper

First, the Bosch team co-authored a paper with the University of Mannheim that explores how effectively Large Language Models (LLMs) perform lexical semantic tasks, such as Knowledge Base Completion (KBC) and Ontology Learning (OL), particularly on domain-specific data. It addresses the open question of whether LLMs genuinely reason over unstructured or semi-structured data, or whether their success is primarily due to learning linguistic patterns and senses alone.

To explore this, the researchers design a controlled experiment using WordNet to create parallel corpora containing both English and gibberish terms. They then analyze the outputs of LLMs for two OL tasks: relation extraction and taxonomy discovery. The findings reveal that off-the-shelf LLMs do not consistently reason over semantic relationships between concepts when adapting to gibberish corpora; instead, they rely on senses and their frame. However, the study demonstrates that fine-tuning enhances the performance of LLMs on lexical semantic tasks, even when dealing with domain-specific terms that were not encountered during pre-training.

The paper suggests that pre-trained LLMs can be effectively applied to Ontology Learning, despite the presence of arbitrary and unseen domain-specific terms.
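To make the parallel-corpus idea more concrete, here is a minimal Python sketch. It is an illustration only, not the authors' pipeline: the toy hypernym pairs, the sentence template, and the helper names `gibberish` and `build_parallel_corpus` are all assumptions made for this post, whereas the paper derives its data from WordNet. The sketch replaces every term with a deterministic pseudo-word, so the English and gibberish corpora share exactly the same relational structure.

```python
import hashlib
import random

# Hypothetical toy (child, parent) hypernym pairs; the paper uses WordNet instead.
HYPERNYM_PAIRS = [
    ("dog", "canine"),
    ("canine", "carnivore"),
    ("cat", "feline"),
    ("feline", "carnivore"),
]

CONSONANTS = "bdfgklmnprstvz"
VOWELS = "aeiou"


def gibberish(term: str, syllables: int = 3) -> str:
    """Map a term to a deterministic pseudo-word made of consonant-vowel syllables."""
    # Seed a private RNG from a hash of the term so the mapping is reproducible.
    seed = int(hashlib.sha256(term.encode("utf-8")).hexdigest(), 16)
    rng = random.Random(seed)
    return "".join(
        rng.choice(CONSONANTS) + rng.choice(VOWELS) for _ in range(syllables)
    )


def build_parallel_corpus(pairs):
    """Return (term -> pseudo-word mapping, English sentences, gibberish sentences)."""
    terms = sorted({term for pair in pairs for term in pair})
    mapping, used = {}, set()
    for term in terms:
        candidate, salt = gibberish(term), 0
        # Re-draw with a salt in the unlikely case two terms collide.
        while candidate in used:
            salt += 1
            candidate = gibberish(f"{term}#{salt}")
        mapping[term] = candidate
        used.add(candidate)
    # Assumed sentence template; both corpora use it so only the terms differ.
    english = [f"{child} is a kind of {parent}" for child, parent in pairs]
    nonsense = [
        f"{mapping[child]} is a kind of {mapping[parent]}" for child, parent in pairs
    ]
    return mapping, english, nonsense


mapping, english, nonsense = build_parallel_corpus(HYPERNYM_PAIRS)
for en, gib in zip(english, nonsense):
    print(f"{en}  |  {gib}")
```

Because the mapping is one-to-one, a model that truly reasons over the relations should behave similarly on both corpora, while a model that leans on familiar word senses would be expected to degrade on the gibberish side.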

Link to the paper: https://link.springer.com/chapter/10.1007/978-3-031-77844-5_7

Semantic Table Interpretation: from Heuristic to LLM-based approaches – a Tutorial with Website, a Survey and a Hands-On Session

Second, the UNIMIB team, together with Ernesto Jimenez-Ruiz from the University of London, delivered a tutorial on Semantic Table Interpretation (STI), a critical set of tasks for mapping tabular data to knowledge graphs. STI includes automatically annotating tables with shared vocabulary terms and linking tabular values to entities in knowledge graphs, both essential steps in streamlining data enrichment processes.
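As a rough illustration of these two STI tasks, the Python sketch below links cell values to entities by string similarity and then types the column by majority vote. Everything in it is an assumption made for this post: the tiny in-memory knowledge graph, the `ex:` identifiers, the 0.8 threshold, and the function names. It represents the simple heuristic end of the spectrum, not the methods or tools presented in the tutorial.

```python
from collections import Counter
from difflib import SequenceMatcher

# Hypothetical miniature knowledge graph: entity id -> (label, type).
KG = {
    "ex:Milan": ("Milan", "City"),
    "ex:Mannheim": ("Mannheim", "City"),
    "ex:London": ("London", "City"),
    "ex:Bosch": ("Bosch", "Company"),
}


def link_cell(value, threshold=0.8):
    """Return the id of the most similar entity label, or None if below threshold."""
    best_id, best_score = None, 0.0
    for entity_id, (label, _type) in KG.items():
        score = SequenceMatcher(None, value.lower(), label.lower()).ratio()
        if score > best_score:
            best_id, best_score = entity_id, score
    return best_id if best_score >= threshold else None


def annotate_column(values):
    """Link every cell, then pick the most frequent entity type as the column type."""
    links = [link_cell(value) for value in values]
    type_counts = Counter(KG[e][1] for e in links if e is not None)
    column_type = type_counts.most_common(1)[0][0] if type_counts else None
    return links, column_type


links, column_type = annotate_column(["Milan", "Manheim", "Londn", "Atlantis"])
print(links, column_type)
```

On this toy input the misspelled "Manheim" and "Londn" still resolve to their entities, "Atlantis" stays unlinked, and the column is typed as City. Real systems replace the exhaustive similarity scan with candidate retrieval over a large knowledge graph and use context from neighbouring cells to disambiguate.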

The tutorial, attended by about 10 participants, provided an in-depth overview of STI challenges and methods, ending with a review of recent advancements using Large Language Models (LLMs). Notably, the session discussed fresh performance insights on models like GPT-4o for entity-linking tasks.

Participants also engaged in a hands-on session, where they learned to use and train an LLM-based entity linker. The tutorial's materials, including a survey paper enriched with interactive visualization widgets, are freely accessible on the tutorial website. Those interested can explore the slides and the notebook independently, though training an LLM comes with some computational cost! 😊

Link to the survey paper: https://arxiv.org/abs/2411.11891

Link to the tutorial website: https://unimib-datai.github.io/sti-website/tutorial/

SemT-UI Demo: Runner-Up for Best Demo Award

The Università degli Studi di Milano-Bicocca (UNIMIB) team also showcased their SemT-UI prototype in a live demo, which was named runner-up for the Best Demo Award at the conference. SemT-UI is a tool designed for interactive tabular data enrichment, providing access to diverse reconciliation and data extension services. It supports end-to-end STI annotation, enabling users to execute and configure services, explore algorithm results, refine annotations, and export enriched tables.

SemT-UI is part of the broader SemT framework, which offers notebook-based integration and flexibility for hosting additional services. As an open-source platform, SemT-UI emphasizes extensibility and collaboration, making it a valuable resource for data scientists and engineers.

For more details on SemT-UI, refer to our earlier blog posts and published papers.

Link to demo paper: https://ceur-ws.org/Vol-3828/paper36.pdf

Link to code repository: https://i2tunimib.github.io/I2T-docs/