Artificial Intelligence and Machine Learning Group

STELAR (Spatio-TEmporal Linked data tools for the AgRi-food data space)

1. Project details

Project acronym: STELAR
Project full name: Spatio-TEmporal Linked data tools for the AgRi-food data space
Funding period: Sep 1, 2022 - Aug 31, 2025 (36 months)
Funding body: EU under the call HORIZON-CL4-2021-DATA-01-03 - Technologies for data management (IA)
Homepage: STELAR

2. Involved partners

ATHENA RESEARCH AND INNOVATION CENTER (ARC), Greece
University of Athens (UoA), Greece
Technische Universiteit Eindhoven (TUE), Netherlands
Universität der Bundeswehr München (UniBw), Germany
RapidMiner GmbH, Germany
AGROKNOW IKE, Greece
VISTA GEOWISSENSCHAFTLICHE FERNERKUNDUNG GMBH, Germany
ABACO SPA, Italy
FOODSCALE HUB ENTREPRENEURSHIP AND INNOVATION ASSOCIATION, Serbia

3. Team

Prof. Dr. Eirini Ntoutsi (PI)
Dr. Vivek Kumar Singh
MSc. Konstantinos Zacharis
MSc. Chethan Krishnamurthy Ramanaik

4. Project overview

STELAR will design, develop, evaluate, and showcase an innovative Knowledge Lake Management System (KLMS) to support and facilitate a holistic approach for FAIR (Findable, Accessible, Interoperable, Reusable) and AI-ready (high-quality, reliably labeled) data that will be pilot tested in diverse, real-world use cases in the agrifood data space.

5. Overview of our contributions

We lead WP4: “Tools for AI-ready data” and contribute to other WPs. Training ML models and supporting simulations require vast amounts of data to supplement the real data. The real data and the automatically produced data profiles (correlations, distributions, average values, outliers, etc.) will be used to generate large-scale synthetic data. We will use a combination of available open-source black-box Generative Adversarial Networks (GANs) and open-box data generators based on the data profiles to generate data at will.

We will also use our own state-of-the-art methods and algorithms for data generation and augmentation. These include work on semi-supervised learning for dealing with label scarcity, synthetic data augmentation for tackling class imbalance and unfairness, counterfactual generation for tabular and textual data, and generative models. The focus will be on semi- and self-supervised learning, as well as active learning, involving the domain expert only when necessary.

The goal is to generate high-utility data rather than just data artifacts. To this end, the generation process will account for existing biases and mitigate their effects. Moreover, the plausibility of the data will be evaluated to ensure that only instances relevant to the problem at hand are considered. XAI techniques will be used to debug and validate model decisions (i.e., annotation/labeling/generation decisions) to ensure that what has been learned is correct and unbiased. Such explanations will increase the trust and confidence of domain experts when putting AI models into production. Finally, the robustness of GANs against adversarial attacks will be considered.
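
To make the idea of an open-box, profile-based generator more concrete, the following is a minimal sketch and not the STELAR implementation. It assumes that the data profile consists only of column means and a covariance matrix, and that the data can be approximated by a multivariate Gaussian; both are simplifying assumptions made for illustration. The function names (build_profile, cholesky, generate) and the toy values are hypothetical.

```python
import math
import random
from statistics import fmean


def build_profile(rows):
    """Summarise numeric rows as (column means, sample covariance matrix)."""
    n, d = len(rows), len(rows[0])
    means = [fmean(r[j] for r in rows) for j in range(d)]
    cov = [
        [sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
         for j in range(d)]
        for i in range(d)
    ]
    return means, cov


def cholesky(a):
    """Return lower-triangular L with L * L^T == a (a must be positive definite)."""
    d = len(a)
    low = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def generate(profile, n_samples, seed=0):
    """Draw synthetic rows from a Gaussian that matches the profile.

    Assumption: a Gaussian is an adequate model of the real data.
    """
    means, cov = profile
    rng = random.Random(seed)
    low = cholesky(cov)
    d = len(means)
    samples = []
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        # Correlate the independent draws and shift them to the column means.
        samples.append(
            [means[i] + sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(d)]
        )
    return samples


# Toy "real" data with two correlated numeric columns (made-up values).
real_rows = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 8.1], [5.0, 9.8]]
profile = build_profile(real_rows)
synthetic_rows = generate(profile, n_samples=1000, seed=42)
```

A generator of this kind is transparent because every synthetic value can be traced back to the profile statistics. It cannot, however, capture non-Gaussian structure, which is one reason to combine it with black-box generators such as GANs.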

6. Publications

For further information, please visit the STELAR website.