In data enrichment, finding relevant connections and extending datasets with meaningful information is key to unlocking deeper insights. SemT-X, developed by UNIMIB as part of the enRichMyData toolbox, introduces a new approach, “link & extend”, that makes data enrichment more intuitive, scalable, and effective.
What is SemT-X?
SemT-X is a framework for data linking and extension, designed to help users seamlessly connect their data to external sources and enrich it with additional attributes. Whether working with Knowledge Graphs (KGs) like Wikidata, private datasets, or geospatial services, SemT-X provides an interactive and scalable way to enhance datasets.
How Does It Work?
SemT-X follows a three-step enrichment paradigm:
1️⃣ Link: Identify available linking services and connect data points to external sources.
- Example: Link city names in a dataset to their corresponding Wikidata IDs.
2️⃣ Extend: Fetch relevant information from the linked data source.
- Example: Retrieve the population of each city from Wikidata.
3️⃣ Enrich: Expand the dataset further with advanced services.
- Example: Use HERE geocoding to get coordinates, then compute the shortest route distance between two locations.
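To make the three steps concrete, here is a minimal sketch in plain Python. It does not use the SemT-Py API: the dictionaries stand in for Wikidata and a geocoding service, the function names are hypothetical, the population and coordinate values are approximate and for illustration only, and a great-circle (haversine) distance replaces the shortest route distance that a routing service would return.

```python
import math

# Hypothetical stand-ins for external services (illustrative values only).
WIKIDATA_IDS = {"Milan": "Q490", "Rome": "Q220"}
POPULATION = {"Q490": 1_370_000, "Q220": 2_750_000}  # approximate figures
COORDINATES = {"Milan": (45.4642, 9.1900), "Rome": (41.9028, 12.4964)}


def link(rows):
    """Step 1 (Link): attach an external identifier to each city name."""
    return [{**row, "wikidata_id": WIKIDATA_IDS.get(row["city"])} for row in rows]


def extend(rows):
    """Step 2 (Extend): fetch an attribute through the linked identifier."""
    return [{**row, "population": POPULATION.get(row["wikidata_id"])} for row in rows]


def enrich(rows):
    """Step 3 (Enrich): add coordinates, as a geocoding service would."""
    return [{**row, "coords": COORDINATES.get(row["city"])} for row in rows]


def great_circle_km(a, b):
    """Haversine distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


table = enrich(extend(link([{"city": "Milan"}, {"city": "Rome"}])))
distance_km = great_circle_km(table[0]["coords"], table[1]["coords"])
print(table)
print(f"Straight-line distance: {distance_km:.0f} km")
```

Each step only adds columns to the rows it receives, so the steps can be chained or swapped for calls to real services without changing the shape of the table.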
This web-based data exploration process is inspired by Linked Open Data principles but is not limited to open datasets. SemT-X also supports private company Knowledge Graphs and other proprietary data sources.
Two Interfaces for Maximum Flexibility
SemT-X is available in two powerful interfaces, making it accessible to both technical and non-technical users:
SemT-UI (Graphical Interface)
- Interactive tool for discovering enrichment options and applying them to sample data.
- Ideal for exploratory data analysis and selecting the best enrichment strategies.
SemT-Py (Python Library for Jupyter Notebooks)
- Designed for data scientists who prefer coding-based enrichment workflows.
- Allows users to execute large-scale enrichment pipelines programmatically.
What’s New?
The latest updates bring enhanced usability and scalability:
SemT-Py & SemT-UI Integration:
- Users can now convert pipelines designed in SemT-UI into Python scripts, making it easy to scale up enrichment operations.
Workflow Automation & Scalability:
- Enrichment steps can now be executed as Docker containers, allowing seamless integration into workflow management environments like TAO (ScaleR component).
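As a rough illustration of what a containerizable enrichment step might look like, the sketch below is a self-contained Python script that reads a CSV file, adds one column, and writes the result. Everything in it is an assumption made for illustration: the `STEP_INPUT` and `STEP_OUTPUT` environment variables, the `transform_csv` function, and the hard-coded lookup are hypothetical and do not reflect how SemT-X or TAO actually pass data between steps.

```python
import csv
import io
import os

# Hypothetical lookup standing in for a call to a linking service.
LOOKUP = {"Milan": "Q490", "Rome": "Q220"}


def transform_csv(text):
    """Add a 'wikidata_id' column to CSV text that has a 'city' column."""
    reader = csv.DictReader(io.StringIO(text))
    fieldnames = list(reader.fieldnames or []) + ["wikidata_id"]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames,
                            lineterminator="\n", extrasaction="ignore")
    writer.writeheader()
    for row in reader:
        # Unknown cities get an empty cell rather than failing the step.
        row["wikidata_id"] = LOOKUP.get(row.get("city", ""), "")
        writer.writerow(row)
    return out.getvalue()


def main():
    # Assumed convention: the workflow manager sets these two variables.
    in_path = os.environ.get("STEP_INPUT")
    out_path = os.environ.get("STEP_OUTPUT")
    if not in_path or not out_path:
        print("Set STEP_INPUT and STEP_OUTPUT to run this step.")
        return
    with open(in_path, newline="", encoding="utf-8") as f:
        text = f.read()
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        f.write(transform_csv(text))


if __name__ == "__main__":
    main()
```

Keeping the transformation in a pure function and the file handling in `main` means the same logic can be tested in a notebook and then run unchanged as a container entrypoint.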
Improved Candidate Filtering for Better Matching:
- Entities are now categorized into PERSON, LOCATION, ORGANIZATION, and OTHER, so candidates of the wrong type can be filtered out and matching accuracy improves.
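The idea behind type-based candidate filtering can be sketched in a few lines of Python. This is not SemT-X's implementation: the `Candidate` class, the `filter_candidates` function, the candidate IDs, and the scores are all hypothetical, and only the four category names come from the description above.

```python
from dataclasses import dataclass

COARSE_TYPES = {"PERSON", "LOCATION", "ORGANIZATION", "OTHER"}


@dataclass
class Candidate:
    entity_id: str    # hypothetical identifier, not a real KG ID
    label: str
    coarse_type: str  # one of COARSE_TYPES
    score: float      # made-up matching score


def filter_candidates(candidates, expected_type):
    """Keep candidates of the expected coarse type, best score first."""
    if expected_type not in COARSE_TYPES:
        raise ValueError(f"Unknown coarse type: {expected_type}")
    kept = [c for c in candidates if c.coarse_type == expected_type]
    return sorted(kept, key=lambda c: c.score, reverse=True)


# Ambiguous mention "Paris" in a column known to contain places.
candidates = [
    Candidate("cand-3", "Paris (city in Texas)", "LOCATION", 0.42),
    Candidate("cand-2", "Paris (person's given name)", "PERSON", 0.88),
    Candidate("cand-1", "Paris (city in France)", "LOCATION", 0.91),
]
locations = filter_candidates(candidates, "LOCATION")
for c in locations:
    print(c.entity_id, c.label, c.score)
```

In this example the PERSON candidate is dropped even though its score is high, which is the kind of mismatch that coarse typing is meant to prevent.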
Fully Open-Source & Ready to Use
SemT-X is open-source, meaning users can:
✅ Download, host, and run the tool independently.
✅ Customize enrichment workflows to fit their data needs.
✅ Benefit from a free online demo (coming soon!).
Why Use SemT-X?
🔹 Automates & simplifies data enrichment with an intuitive workflow.
🔹 Bridges datasets to external sources, making them more valuable.
🔹 Works across different industries, from marketing to geospatial analysis.
🔹 Scales from small projects to enterprise-level data workflows.
Get Started with SemT-X Today!
If you’re looking for a powerful, flexible, and scalable solution for data linking and enrichment, SemT-X is the tool for you!
Try it now and take your data enrichment to the next level!