By: Treesa Rose Joseph, Jesper Friis, Francesca Lønstad Bleken, Sigurd Wenner, Tor Strømsem Haugland, Elisabeth Thronsen
Traditional cement production relies heavily on clinker, a component responsible for significant CO₂ emissions. To reduce this impact, researchers are exploring ways to substitute clinker with more sustainable materials like limestone and calcined clays.
The challenge lies in identifying the maximum level of clinker substitution that retains, or even surpasses, the performance of traditional cement. To support this transition, the Horizon Europe project MatCHMaker has adopted this as one of its use cases and is building predictive models to help determine the optimal substitution level.
Building these predictive models requires large volumes of data, generated through experiments and simulations within the project. However, machine learning (ML) tools need data in specific formats, and preparing it can be time-consuming and complex. To make the data easier for ML tools to consume, SINTEF, as a partner in the MatCHMaker project, is expanding the existing semantic ecosystem to connect experimental and modelling data with AI tools. The goal is to make data not only accessible, but also meaningful and reusable across platforms and disciplines, following the FAIR principles: Findable, Accessible, Interoperable, and Reusable.
Achieving FAIR data requires semantic data documentation, which we approach through three key components:
Data models are structured templates that define what data is included, how it’s organised, and what it means, making complex datasets easier to share, validate, and analyse. Each model includes a unique identifier, a description, and a set of properties that specify the type, shape, and unit of each data element. Combined with ontologies, as described below, data models keep information traceable, reusable, and understandable across different systems and contexts, supporting better integration and communication.
Ontologies are formal vocabularies that define the meaning of terms and their relationships. In the context of materials characterisation and modelling, we use domain-specific ontologies to capture the semantics of a particular field, in this case cement production. These ontologies describe the key concepts, materials, and processes relevant to the domain. Complementing this, application ontologies focus on experimental and modelling workflows, including how data is generated and structured. Together, domain and application ontologies provide a semantic framework that unifies data from diverse sources and disciplines. By aligning with established standards like EMMO, these ontologies ensure compatibility and promote reuse beyond the scope of a single project such as MatCHMaker.
Semantic mappings connect data models to domain and application ontologies, thus creating Semantic Data Models. With the help of a semantic interoperability tool, these mappings let different systems understand and exchange data by translating it from one format to another.
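
To make the three components more concrete, here is a minimal sketch in plain Python. It is not the tooling used in MatCHMaker: the classes `Property` and `DataModel`, the `translate` helper, the `http://example.com/...` identifiers, the property names and the numeric values are all invented for illustration. It shows two data models that describe the same quantities under different names, a handful of ontology concepts represented as IRIs, and mappings that let a record be translated from one model to the other through the shared concepts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Property:
    """One data element of a model: its type, shape and unit (hypothetical class)."""
    name: str
    dtype: type
    shape: tuple = ()   # names of dimensions; () means a scalar
    unit: str = ""
    description: str = ""


@dataclass(frozen=True)
class DataModel:
    """A structured template: unique identifier, description and properties."""
    identifier: str
    description: str
    properties: tuple

    def validate(self, record: dict) -> None:
        """Raise if a property is missing or a value has the wrong type."""
        for prop in self.properties:
            if prop.name not in record:
                raise KeyError(f"missing property: {prop.name}")
            value = record[prop.name]
            values = value if prop.shape else [value]
            if not all(isinstance(v, prop.dtype) for v in values):
                raise TypeError(f"{prop.name} must be of type {prop.dtype.__name__}")


# Two hypothetical data models describing the same quantities under different names.
LAB_MODEL = DataModel(
    identifier="http://example.com/meta/0.1/LabCementMix",
    description="Cement mix as recorded in an (invented) lab format.",
    properties=(
        Property("clinker_frac", float, unit="1", description="Clinker mass fraction"),
        Property("strength_28d", float, unit="MPa", description="28-day compressive strength"),
    ),
)
ML_MODEL = DataModel(
    identifier="http://example.com/meta/0.1/MLFeatures",
    description="Feature row expected by an (invented) ML pipeline.",
    properties=(
        Property("x_clinker", float, unit="1"),
        Property("y_strength", float, unit="MPa"),
    ),
)

# Ontology concepts, reduced here to IRIs in a made-up namespace (not EMMO terms).
EX = "http://example.com/cement#"
CLINKER_FRACTION = EX + "ClinkerMassFraction"
COMPRESSIVE_STRENGTH = EX + "CompressiveStrength"

# Semantic mappings: (data model identifier, property name) -> ontology concept.
MAPPINGS = {
    (LAB_MODEL.identifier, "clinker_frac"): CLINKER_FRACTION,
    (LAB_MODEL.identifier, "strength_28d"): COMPRESSIVE_STRENGTH,
    (ML_MODEL.identifier, "x_clinker"): CLINKER_FRACTION,
    (ML_MODEL.identifier, "y_strength"): COMPRESSIVE_STRENGTH,
}


def translate(record: dict, source: DataModel, target: DataModel, mappings: dict) -> dict:
    """Translate a record between models via the concepts their properties map to."""
    source.validate(record)
    by_concept = {
        mappings[(source.identifier, p.name)]: record[p.name] for p in source.properties
    }
    result = {
        p.name: by_concept[mappings[(target.identifier, p.name)]] for p in target.properties
    }
    target.validate(result)
    return result


# Invented example values, for illustration only.
lab_record = {"clinker_frac": 0.65, "strength_28d": 48.2}
ml_row = translate(lab_record, LAB_MODEL, ML_MODEL, MAPPINGS)
print(ml_row)
```

Neither model needs to know about the other: each is mapped once to the shared concepts, and the translation follows from those mappings. A real interoperability tool would also handle unit conversion and richer ontology relations, which this sketch leaves out.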

In short, these three components form the semantic backbone of MatCHMaker, enabling FAIR data and better collaboration in the materials science domain.
