STELAR will design, develop, evaluate, and showcase an innovative Knowledge Lake Management System (KLMS) to support and facilitate a holistic approach for FAIR (Findable, Accessible, Interoperable, Reusable) and AI-ready (high-quality, reliably labeled) data that will be pilot tested in diverse, real-world use cases in the agrifood data space.
We lead WP4, “Tools for AI-ready data”, and contribute to other WPs. Large volumes of data are needed to supplement the real data for ML model training and to support simulations. The real data and the automatically produced data profiles (correlations, distributions, average values, outliers, etc.) will be used to generate synthetic data at scale. We will combine available open-source black-box Generative Adversarial Networks (GANs) with open-box data generators based on the data profiles to generate data on demand.
We will also use our own state-of-the-art methods and algorithms for data generation and augmentation. These include work on semi-supervised learning for dealing with label scarcity, synthetic data augmentation for tackling class imbalance and unfairness, counterfactual generation for tabular and textual data, and generative models. The focus will be on semi- and self-supervised learning, as well as active learning, involving the domain expert only when necessary.
The goal is to generate high-utility data rather than just data artifacts. To this end, the generation process will account for existing biases and mitigate their effects. Moreover, the plausibility of the data will be evaluated to ensure that only instances relevant to the problem at hand are considered. XAI techniques will be used to debug and validate model decisions (i.e., annotation/labeling/generation decisions) to ensure that what has been learned is correct and unbiased. Such explanations will increase the trust and confidence of domain experts when putting AI models into production. Finally, the robustness of GANs against adversarial attacks will be considered.
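As a rough illustration of the open-box, profile-based generation mentioned above, the following minimal Python sketch builds a very simple data profile (column means and covariances) from a numeric table and samples synthetic rows from it. It is not the STELAR implementation: the function names `build_profile` and `generate_from_profile`, the toy two-column data set, and the Gaussian assumption are all hypothetical choices made for this example, and real profiles would also capture distributions, outliers, and other statistics.

```python
import numpy as np


def build_profile(real):
    """Summarise a numeric table by its column means and covariance matrix.

    Hypothetical helper: a deliberately tiny stand-in for a full data profile.
    """
    return {"mean": real.mean(axis=0), "cov": np.cov(real, rowvar=False)}


def generate_from_profile(profile, n_rows, seed=0):
    """Sample synthetic rows from a multivariate normal fitted to the profile.

    Assumption: the columns are roughly jointly Gaussian.
    """
    rng = np.random.default_rng(seed)
    return rng.multivariate_normal(profile["mean"], profile["cov"], size=n_rows)


# Toy "real" data: two correlated numeric columns (made up for illustration).
rng = np.random.default_rng(42)
x = rng.normal(20.0, 5.0, size=500)
y = 0.8 * x + rng.normal(0.0, 1.0, size=500)
real = np.column_stack([x, y])

profile = build_profile(real)
synthetic = generate_from_profile(profile, n_rows=5000)

# Compare the correlation between the two columns in real and synthetic data.
real_corr = np.corrcoef(real, rowvar=False)[0, 1]
synthetic_corr = np.corrcoef(synthetic, rowvar=False)[0, 1]
print("real correlation:", round(float(real_corr), 3))
print("synthetic correlation:", round(float(synthetic_corr), 3))
```

Because the generator is driven only by the profile, the synthetic table can be made arbitrarily larger than the real one while keeping the profiled statistics approximately intact. Anything the profile does not record, such as non-linear dependencies, is lost, which is one reason to pair such generators with learned models like GANs.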
For further information, please visit the STELAR website.