LinkR: Enriching data by linking values from different sources

When users need to enrich a dataset with an external data source, linking values across the two sources becomes a critical hurdle. It is the key to unlocking the external data and blending it with the original dataset. In addition, Knowledge Graphs (KGs) provide a powerful abstraction to support AI applications and interoperability. As most data arrives in loosely structured formats such as tables and JSON files, one invaluable asset for data enrichment is a simpler transformation of tables into graphs. This transformation requires linking: specifically, connecting elements of the input dataset (columns, rows, and more) to the ontology classes and properties used in graph modeling.
The tools in the “LinkR” collection support a variety of linking tasks that address these problems.

Ontotext Reconciliation is a tool for exposing a Reconciliation API over RDF data indexed in GraphDB. It simplifies the creation of a flexible reconciliation service that follows the OpenRefine-compliant API. By leveraging the RDF graph content and ontology, users can build a robust reconciliation service without modifying code. The tool supports entity type specification for filtering, optional feature extraction, and efficient searches using GraphDB connectors. These connectors keep the index synchronized with the data, provide custom mapping for RDF types, and offer scored matches and custom processing options. Overall, the tool makes it easy to create flexible reconciliation services and so improves data matching capabilities.
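To make the request and response shapes more concrete, here is a minimal sketch of what a client of an OpenRefine-style reconciliation service might build and consume. It assumes the widely used convention of sending a batch of queries as a JSON object in a form-encoded `queries` parameter and receiving scored candidates per query key. The helper names, the `"City"` type, the `ex:` identifiers, and the sample response are all hypothetical and are not taken from the Ontotext tool; no network call is made.

```python
import json
from urllib.parse import urlencode, parse_qs


def build_reconciliation_batch(names, entity_type=None, limit=3):
    """Build a form-encoded 'queries' payload for a batch of names.

    Each name becomes one query keyed q0, q1, ...; the optional
    entity_type restricts candidates to one type (hypothetical value).
    """
    batch = {}
    for i, name in enumerate(names):
        query = {"query": name, "limit": limit}
        if entity_type is not None:
            query["type"] = entity_type
        batch[f"q{i}"] = query
    return urlencode({"queries": json.dumps(batch)})


def best_candidates(response, min_score=0.0):
    """Return the top-scoring candidate per query key, or None if no
    candidate reaches min_score."""
    best = {}
    for key, payload in response.items():
        candidates = [
            c for c in payload.get("result", [])
            if c.get("score", 0.0) >= min_score
        ]
        best[key] = max(
            candidates, key=lambda c: c.get("score", 0.0), default=None
        )
    return best


# Build the payload a client would POST to a (hypothetical) endpoint.
payload = build_reconciliation_batch(["Berlin", "Paris"], entity_type="City", limit=2)
decoded = json.loads(parse_qs(payload)["queries"][0])

# Hypothetical response, hand-written for illustration only.
sample_response = {
    "q0": {"result": [
        {"id": "ex:berlin", "name": "Berlin", "score": 0.98, "match": True},
        {"id": "ex:berlin-nh", "name": "Berlin, New Hampshire", "score": 0.61, "match": False},
    ]},
    "q1": {"result": []},
}
best = best_candidates(sample_response, min_score=0.5)
```

Keeping the payload builder separate from the response handling means the same two functions could be pointed at any service that follows this convention, with only the endpoint and type identifiers changing.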

Two more integrated tools are available to deliver end-to-end table-to-graph linking, KG generation, and tabular data enrichment features. s-elBat provides algorithms for all Semantic Table Interpretation (STI) tasks, which together deliver complete automatic annotations using reference KGs such as WikiData and DBpedia. After annotation, the data in the table can be transformed into a graph format: the structure will comply with the desired ontology, and the entities will be interlinked with the KG. The tool has taken part in several international STI competitions, frequently scoring among the top-performing systems.

SemTUI is a versatile framework designed for both experts and non-experts seeking to semantically enrich tabular data. It was described in a previous post about our data discovery tools because it also provides such features, based on a “link and extend” paradigm. Because links to external data sources are the key to supporting data extension and enrichment, SemTUI provides a user interface that controls several external linking and reconciliation services. Given an input table, a user can revise the semantic annotations computed by the s-elBat algorithms before exporting the data in graph format, link to geospatial references such as geocoordinates, or focus on reconciling values in the table against entities in an external data source (e.g., WikiData and Geonames). New linking services can be added at limited cost, and the interfaces to these services are based on agreed-upon protocols; for example, in a dedicated version of the tool, we integrated a reconciliation service for a proprietary KG about Italian companies. Upcoming features will strengthen support for human-in-the-loop intelligent table processing by generating Python code that developers are familiar with, helping users find the most uncertain links to revise, and learning from user feedback.
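The table-to-graph step described above can be illustrated with a small, self-contained sketch. It assumes that annotation has already produced two things: a link from each cell value to a KG entity, and a property for each pair of columns. The data structures, the function name, and the tiny table are hypothetical and do not reflect the actual output format of s-elBat or SemTUI; the entity and property identifiers are Wikidata ones chosen purely as an example, and the output uses N-Triples syntax.

```python
WD = "http://www.wikidata.org/entity/"
WDT = "http://www.wikidata.org/prop/direct/"

# A toy input table (illustrative only).
table = [
    {"city": "Berlin", "country": "Germany"},
    {"city": "Paris", "country": "France"},
    {"city": "Atlantis", "country": "France"},  # no entity link for this city
]

# Hypothetical annotation results: cell value -> entity IRI.
cell_links = {
    "Berlin": WD + "Q64",
    "Germany": WD + "Q183",
    "Paris": WD + "Q90",
    "France": WD + "Q142",
}

# Hypothetical annotation results: (subject column, object column) -> property IRI.
column_pair_property = {("city", "country"): WDT + "P17"}


def rows_to_ntriples(rows, links, properties):
    """Emit one N-Triples line per row and annotated column pair.

    Rows whose subject or object cell has no entity link are skipped,
    since no triple can be formed without both IRIs.
    """
    lines = []
    for row in rows:
        for (subj_col, obj_col), prop in properties.items():
            subj = links.get(row[subj_col])
            obj = links.get(row[obj_col])
            if subj is None or obj is None:
                continue
            lines.append(f"<{subj}> <{prop}> <{obj}> .")
    return lines


triples = rows_to_ntriples(table, cell_links, column_pair_property)
for line in triples:
    print(line)
```

Skipping rows with unlinked cells is only one possible policy; a human-in-the-loop workflow might instead collect those rows for manual revision.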
