Data labelling: call for ethical regulation of AI in Europe


💡 Artificial intelligence (AI) is revolutionizing our world, but to ensure its ethical and responsible use, adequate regulation is essential.
In this article, we discuss the importance of the data labeling process in building AI products, covering data annotation, crowdsourcing and ethical labeling. We call on the European Union (EU) to adopt the EU AI Act, while highlighting the shortcomings of the current text with regard to the AI supply chain and data management.
Data labelling for AI
Data labelling is a key step in the development of AI. It consists of assigning tags or labels to data sets (or "datasets"), allowing machine learning algorithms to learn from and interpret the information. However, it is imperative that this labelling is carried out carefully, accurately and under ethical conditions, in order to avoid biases and prejudices.
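As a minimal illustration of what a labelled dataset can look like, the Python sketch below pairs each text sample with one label from a fixed set. The class name `LabeledExample`, the helper `make_example`, the label set and the sample sentences are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical label set for a sentiment task; a real project defines its own.
ALLOWED_LABELS = {"positive", "negative", "neutral"}


@dataclass(frozen=True)
class LabeledExample:
    """One data sample together with the label an annotator assigned to it."""
    text: str
    label: str


def make_example(text: str, label: str) -> LabeledExample:
    """Create a labelled example, rejecting labels outside the agreed set."""
    if label not in ALLOWED_LABELS:
        raise ValueError(f"unknown label: {label!r}")
    return LabeledExample(text=text, label=label)


# A tiny illustrative dataset (invented sentences).
dataset = [
    make_example("The delivery arrived on time.", "positive"),
    make_example("The package was damaged.", "negative"),
    make_example("The order contains two items.", "neutral"),
]
```

Restricting labels to an agreed set is one simple way to keep annotations consistent across annotators.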
Ethical data annotation
Data annotation requires human expertise. It involves adding information to the data, that is, a semantic layer associated with images, videos or text, such as metadata or detailed descriptions. In the context of AI, it is essential that data annotation is done in an ethical manner. This means that annotators (or data labelers) must follow strict guidelines to ensure the integrity and objectivity of annotated data, avoiding stereotypes, discrimination, and value judgments. They must also work in good conditions (decent working hours, stability, career prospects) and receive training and support in order to produce quality data.
The importance of crowdsourcing in labeling processes
Crowdsourcing is an effective method for labeling and annotating large amounts of data. By calling on a community of contributors, it is possible to obtain fast and accurate results. However, it is crucial to put rigorous quality control mechanisms in place in order to guarantee the reliability of the data produced through crowdsourcing. It is also worth remembering that this is not the only method for labeling large quantities of data: it is often more efficient to use a panel of functional specialists to annotate data, and to accept that their skills will grow gradually rather than requiring maximum quality from the outset (which is often the case in labeling processes that rely on crowdsourcing). Data labeling is an important job, and the people ready to invest in it, data labelers, should be treated with dignity and considered AI specialists in the same way as data scientists.
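One common quality control mechanism is to have several contributors label the same item, keep the majority answer, and flag items with low agreement for expert review. The Python sketch below illustrates this idea under simple assumptions; the function name `aggregate_labels` and the 0.6 agreement threshold are hypothetical choices, not a standard.

```python
from collections import Counter


def aggregate_labels(votes, min_agreement=0.6):
    """Aggregate several annotators' labels for one item by majority vote.

    Returns (label, agreement, needs_review), where agreement is the share
    of annotators who chose the winning label and needs_review is True when
    that share falls below min_agreement (an assumed, tunable threshold).
    """
    if not votes:
        raise ValueError("at least one vote is required")
    counts = Counter(votes)
    label, count = counts.most_common(1)[0]
    agreement = count / len(votes)
    return label, agreement, agreement < min_agreement


# Three annotators, two of whom agree: accepted without review.
label, agreement, needs_review = aggregate_labels(["cat", "cat", "dog"])

# Two annotators who disagree: flagged for review by a specialist.
_, tie_agreement, tie_needs_review = aggregate_labels(["cat", "dog"])
```

Items flagged for review can then be routed to a specialist rather than discarded, which fits the idea of letting annotators build up their competence over time.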
Ethical labelling: a fundamental requirement
Ethical labeling is a fundamental aspect of responsible AI. It aims to ensure that the data used to train AI models is collected, labeled, and annotated in an ethical and humane manner. Transparency and fairness are key principles of ethical labeling, helping to prevent prejudice and discrimination in automated decision-making.
Weaknesses of the EU AI Act: AI supply chain and data management
Despite the progress made on the draft EU AI Act, the text still has weaknesses with regard to the AI supply chain and data management. It is essential to have clear measures in place to ensure transparency and ethics throughout the life cycle of AI systems, from data collection to use. Accountability and control mechanisms should be put in place to ensure adequate data management and to prevent abuse.
Data labeling in the service of ethical AI: conclusion
It is imperative that the European Union adopt solid ethical regulations to frame the development and use of AI. Regulation is necessary and should not hinder innovation. Ethical data labelling and sourcing are essential to ensuring responsible AI and an AI data supply chain that respects human life and fundamental rights. It is equally important to address the current weaknesses of the EU AI Act with regard to the AI supply chain and data management, in order to strengthen the protection of these rights.