Article | 28 Jan 2025
Part II: The relationship between the GDPR and the AI Act
Introduction
In the first part of this article series, we introduced an “AI Act roadmap,” detailing how the various provisions of the AI Act will apply to different actors in the AI value chain. When assessing how to comply with these obligations, the AI Act cannot be viewed in isolation; it must be read in conjunction with other existing and forthcoming legislation, both general and sector-specific.
AI systems depend on data for their creation, development, and maintenance. In the EU, the collection and use of data is increasingly regulated. While many new data-related regulations will come into force or become applicable in the coming years, the General Data Protection Regulation[1] (GDPR), which regulates personal data, has applied since 2018, making it the most established data-related legislation. Under the GDPR, “personal data” is any data that can be linked to a natural person (referred to in the GDPR as a data subject). The definition is broad, which means that personal data is likely to be processed in many AI systems. In such cases, AI providers and deployers must comply with both the AI Act and the GDPR. The recitals of the AI Act explicitly state that the regulation on AI should not affect the protection of personal data or the application of the GDPR. However, the two regulations have, to some extent, different objectives: while the AI Act primarily serves as a product safety regulation, the GDPR focuses on the protection of individuals’ privacy.
This article explores key areas where the regulations are compatible and reinforce each other, as well as areas where they may conflict. Setterwalls has extensive experience in both AI and data protection and can assist you in navigating both regulations concurrently.
Scope and application
The GDPR applies to actors who process personal data. The term “personal data” is broadly defined and encompasses any information that can be connected to an identifiable individual, even indirectly (i.e. data that, on its own, cannot identify a person but can do so in combination with other data). Processing is defined as any action performed on personal data, such as collection, storage, structuring, use, or erasure. A company that processes personal data and determines the purposes and means of the processing activity is classified as a “controller” under the GDPR. A company that processes personal data on behalf of the controller is classified as a “processor.”
A majority of the provisions in the AI Act apply to providers and deployers of AI systems classified as high-risk. Which actors qualify as providers and deployers, and which systems are considered high-risk, is detailed in the first part of this article series, linked above. While a variety of setups can lead to dual compliance requirements under both the AI Act and the GDPR, the most regulated situation is where a provider of a high-risk AI system also qualifies as a controller under the GDPR. This is also a very likely scenario, since a provider of an AI system is generally also its developer and thus decides which data sets (potentially including personal data) are used to train the system, consequently determining the purposes and means of the data processing.
Compatibility of the regulations
Automated individual decision-making
The GDPR provides special protection for data subjects regarding automated individual decision-making. Article 22 stipulates a right for data subjects not to be subject to decisions based solely on automated processing, including profiling. This right is triggered when the decision or profiling has legal or similarly significant effects on the individual. Examples of such decisions include automatic refusals of online credit applications or e-recruitment decisions made without human involvement. Automated profiling can involve predictions about a data subject’s economic situation, personal preferences, or behaviour. Decision-making based solely on automated data processing is, as a general rule, prohibited. However, exceptions exist, such as when automated decision-making is necessary to fulfil contractual obligations or when explicit consent is obtained from the data subject. In these cases, the controller must ensure that the data subject is entitled to obtain human intervention in order to express his or her point of view or to contest the decision.
While the GDPR requires the possibility of human intervention upon request, the AI Act takes a more “proactive approach” to human oversight. According to Article 14 of the AI Act, providers of high-risk AI systems must implement tools and processes in the AI system that allow for human intervention and/or interaction, and must ensure that deployers are informed about the system’s capabilities, limitations, and the risk of over-reliance. Deployers must be able to interpret the system’s outputs and face additional human oversight obligations, which are fulfilled by adopting technical and organisational measures, following the provider’s instructions for use, and appointing competent persons to exercise oversight.
A provider of a high-risk AI system that complies with the AI Act’s human oversight requirements will significantly reduce the risk of violating the GDPR’s rules on automated decision-making. If the human oversight is designed so that a natural person always makes the final decision, the GDPR’s rules on fully automated decision-making will not even apply because the decision is not fully automated. Both regulations aim to mitigate the risks of fully automated decisions without human involvement and complement each other in achieving this objective.
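To make the difference in approach concrete, the minimal Python sketch below shows a human-in-the-loop pattern. All names (ModelOutput, decide, reviewer) are hypothetical, and the credit-scoring scenario is merely illustrative; the point is only that routing every model recommendation through a human reviewer means the final decision is not based solely on automated processing in the sense of Article 22 GDPR.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class ModelOutput:
    """Hypothetical output of a high-risk AI system (e.g. a credit-scoring model)."""
    applicant_id: str
    recommendation: str   # e.g. "approve" or "reject"
    confidence: float

def decide(output: ModelOutput, human_review: Callable[[ModelOutput], str]) -> str:
    """Route every model recommendation through a human reviewer.

    Because a natural person makes the final call, the decision is not
    'based solely on automated processing'.
    """
    return human_review(output)  # the reviewer may accept or override

# Example reviewer policy: escalate low-confidence rejections to manual handling.
def reviewer(output: ModelOutput) -> str:
    if output.recommendation == "reject" and output.confidence < 0.8:
        return "escalate"
    return output.recommendation

print(decide(ModelOutput("A-123", "reject", 0.65), reviewer))  # -> "escalate"
```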
Overlapping documentation requirements
Both the GDPR and the AI Act include various documentation requirements. The impact assessment is one document common to both. In the GDPR, it is referred to as a data protection impact assessment (DPIA), outlined in Article 35; in the AI Act, it is called a fundamental rights impact assessment (FRIA), detailed in Article 27.
A DPIA is required when the processing of personal data, in particular when using new technologies, is likely to result in a high risk to the fundamental rights and freedoms of individuals. Examples include the automated individual decision-making described above, or the large-scale processing of special categories of data. Such categories include, for example, genetic data, biometric data processed for the purpose of uniquely identifying a natural person, and data concerning health. Supervisory authorities publish lists with guidance on when a DPIA is required; however, each controller must make its own assessment of whether it is required to carry out a DPIA. The DPIA must be completed before the processing is initiated and must assess how the processing affects the protection of personal data. The assessment should include several elements, such as a description of the processing, its purpose, and the proportionality of the processing in relation to that purpose. It must also include an assessment of the risks to the rights and freedoms of the data subject and how the identified risks can be mitigated.
Similarly, the FRIA must be carried out before a high-risk AI system is used for the first time. Deployers may rely on a previously conducted FRIA, or on an existing impact assessment carried out by the provider, provided that it is still relevant. The obligation applies primarily to deployers that are public bodies or private entities providing public services, as well as to deployers of certain other high-risk AI systems. The European AI Office will publish a template detailing the required components of a FRIA. Until the template is published, deployers will need to prepare an assessment that includes, for example:
- an explanation of the system’s intended use and purpose;
- a description of the categories of individuals likely to be affected by the system; and
- an evaluation of the risk of harm to these individuals and measures for ensuring human oversight.
Businesses subject to both the GDPR and the AI Act will find overlapping requirements in the DPIA and the FRIA. The primary focus of both assessments is to identify potential risks posed by the processing and the AI system, respectively, and how such risks may be mitigated through safeguards, such as technical and organisational measures. When an AI system processes personal data, these assessments are in some cases linked: Article 27 of the AI Act states that if any of the FRIA obligations are already met through a DPIA, the FRIA shall complement that DPIA. This allows companies, in certain circumstances, to create a single impact assessment that addresses both AI-related and data protection requirements.
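As a loose illustration of how the overlapping elements might be captured in a single internal record, the Python sketch below collects fields drawn from the DPIA and FRIA requirements described above. The field names are our own labels for this sketch, not terms defined in either regulation.

```python
from dataclasses import dataclass

@dataclass
class ImpactAssessment:
    """Illustrative record combining DPIA (Art. 35 GDPR) and FRIA (Art. 27 AI Act) elements."""
    system_description: str         # intended use and purpose (FRIA) / description of processing (DPIA)
    affected_groups: list[str]      # categories of individuals likely to be affected (FRIA)
    processing_purposes: list[str]  # purposes and proportionality of the processing (DPIA)
    identified_risks: list[str]     # risks to the rights and freedoms of individuals (both)
    mitigations: list[str]          # technical and organisational safeguards (both)
    human_oversight: str            # measures ensuring human oversight (FRIA)
```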
Overlapping transparency requirements
Both the AI Act and the GDPR contain obligations relating to transparency and the individual’s right to an explanation of the system and the processing of data, respectively.
The AI Act provides that individuals affected by a decision based on the output of certain high-risk AI systems may, upon request, obtain from the deployer a clear explanation of the role the AI system played in the decision-making procedure and the main elements of the decision taken. This right applies when the decision produces legal or similarly significant effects for the individual. Deployers must therefore have sufficient knowledge of the AI system to provide such an explanation. Providing the necessary level of explanation and transparency can be challenging due to the “black box” nature of AI. Nevertheless, providers and deployers need to understand the mechanisms of the system and how decisions are made.
The GDPR, for its part, imposes extensive information obligations to uphold the principles of fair and transparent processing. Controllers are required to provide comprehensive information to data subjects. When personal data is collected, the controller must inform the data subject about, for example, the purpose of and legal basis for the processing, the duration of data storage, the rights of the data subject, and the contact details of the controller. Additionally, if an individual is subject to automated decision-making or profiling, this must be communicated, together with meaningful information about the logic involved.
As a result, both regulations require that companies subject to these rules have a thorough understanding of how their technology operates to meet the transparency requirements imposed by the GDPR and the AI Act, respectively. In practice, companies must also have processes to ensure that requested information is practically and easily accessible to the individual and can be communicated without delay.
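One practical building block for meeting both sets of transparency obligations is to record, at decision time, the main factors behind each AI-assisted decision so that an explanation can be produced on request. The Python sketch below illustrates the idea under stated assumptions: the JSON storage format and the notion of numeric “factor scores” are ours, and in practice such factors might come from an explainability tool.

```python
import json
from datetime import datetime, timezone

def record_decision(subject_id: str, decision: str, factors: dict[str, float]) -> str:
    """Store the principal factors behind an AI-assisted decision so that
    meaningful information about the logic involved can be produced later.

    'factors' might hold feature contributions from an explainability tool;
    the flat JSON line used here is purely illustrative.
    """
    entry = {
        "subject_id": subject_id,
        "decision": decision,
        "main_factors": factors,  # e.g. {"income": 0.4, "payment_history": -0.3}
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    return json.dumps(entry)

print(record_decision("A-123", "reject", {"income": 0.4, "payment_history": -0.3}))
```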
Potential conflicts between the regulations
The data protection principles
The GDPR is based on several data protection principles that must be applied when processing personal data, some of which are potentially at odds with the nature of AI systems. Most notable are the principles of data minimisation and storage limitation. Data minimisation entails that data collection should be limited to what is strictly necessary for a specific purpose. Storage limitation requires that data be kept only for as long as necessary to fulfil the purpose for which it was collected. In an AI context, large quantities of data are critical for functionality and improvement. AI systems also introduce new methods of data collection through individuals’ interactions with the system, which often occur automatically. Such technical setups may result in the collection of more data than is strictly necessary. Additionally, data may be retained for longer than required for its initial purpose. This extended retention is often motivated by the ongoing development of the AI system, a purpose that typically differs from the original data collection objective.
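As a simple illustration of what enforcing storage limitation can look like in code, the Python sketch below drops records older than an assumed retention period. The single fixed period and the "collected_at" field are simplifications of our own; real systems typically apply different retention periods per purpose, documented in the records of processing activities.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=365)  # hypothetical retention period for one purpose

def purge_expired(records: list[dict]) -> list[dict]:
    """Keep only records still within their retention period."""
    cutoff = datetime.now(timezone.utc) - RETENTION
    return [r for r in records if r["collected_at"] >= cutoff]

records = [
    {"id": 1, "collected_at": datetime.now(timezone.utc) - timedelta(days=400)},
    {"id": 2, "collected_at": datetime.now(timezone.utc) - timedelta(days=30)},
]
print([r["id"] for r in purge_expired(records)])  # -> [2]
```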
The principles of data minimisation and storage limitation are challenging not only because of the technical requirements of AI systems, but also in relation to the provisions of the AI Act. Providers of high-risk AI systems must meet strict data quality and governance standards. The data used must be appropriate, representative, and relevant to the individuals or groups the system is intended to serve. To mitigate the risks of bias and discrimination in AI systems, it is necessary to use data that accurately reflects the users. Access to high-quality data is highlighted in the AI Act as a key measure to address inherent bias in datasets, to be achieved, for example, by examining the specific geographic, behavioural, functional or contextual characteristics relevant to the intended purpose of the AI system. These requirements increase the likelihood that data meeting the AI Act’s quality standards is, or is intertwined with, personal data as defined by the GDPR.
The AI Act refers to the Ethics Guidelines for Trustworthy AI, created by a high-level expert group appointed by the European Commission. Although these guidelines are non-binding, the recitals encourage all stakeholders to take the principles into account when developing additional best practices and standards. The ethics guidelines address, inter alia, privacy and data governance, emphasising the importance of ensuring high-quality, unbiased data throughout the lifecycle of AI systems, while also stressing the need to limit access to personal data within organisations. The guidelines provide both technical and organisational measures to help implement privacy and data governance in practice. Measures are also being taken at EU level to create access to high-quality data for use in AI development. An example is the European Health Data Space, which will facilitate access to health data for the training of AI algorithms in a privacy-preserving manner. Notwithstanding this, providers must themselves ensure that they reduce the risk of relying on data that either is not permitted for use under the GDPR or is insufficient in quality under the AI Act.
The rights of the data subject
The GDPR establishes several rights for data subjects, some of which pose challenges in the context of AI. These include, for example, the right to be forgotten, the right to data portability and the right of access.
The right to be forgotten – Article 17 GDPR
The right to erasure, more commonly known as the right to be forgotten, allows a data subject to require that the controller erase all personal data concerning him or her without undue delay. This obligation applies in various situations and may arise even without an explicit request from the data subject, owing to the storage limitation principle explained above. Moreover, if the controller has made the data public, the obligation extends to taking reasonable steps to inform other controllers processing the data that the data subject has requested its erasure.
Developers of AI systems, regardless of the system’s risk classification, need to consider whether and how the right to erasure can be technically achieved when developing AI systems and selecting the data to use in their models. This includes ensuring that personal data can be removed from AI systems and any associated databases, which can be complex given that AI and machine learning models often rely on large datasets for training and operation. One solution could be to limit the amount of real personal data and rely to a greater extent on test data, such as the data provided through the European Health Data Space.
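The gap between deleting stored records and removing their influence from a trained model can be illustrated with a minimal Python sketch. The function and field names are hypothetical; note that the hard part, retraining the model on the cleaned dataset (or applying a machine-unlearning technique), is merely queued here, not solved.

```python
def handle_erasure_request(subject_id: str, datastore: dict[str, dict],
                           retraining_queue: list[str]) -> None:
    """Erase a data subject's records and flag affected models for retraining.

    Deleting rows from a database is straightforward; removing the influence
    of that data from an already trained model is not, which is why this
    sketch only queues the affected subject for a later retraining run.
    """
    datastore.pop(subject_id, None)      # erase from primary storage
    retraining_queue.append(subject_id)  # mark models trained on this data

store = {"A-123": {"name": "Sarah"}, "B-456": {"name": "Tom"}}
queue: list[str] = []
handle_erasure_request("A-123", store, queue)
print(store, queue)  # -> {'B-456': {'name': 'Tom'}} ['A-123']
```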
The right of access and the right to data portability – Articles 15 and 20 GDPR
Similarly, the right to data portability and the right of access require that the controller, upon request, can collect and isolate all personal data about a given data subject and provide it in a comprehensible manner. The right to data portability entails that the controller must provide to the data subject, or to another controller, all personal data that the data subject has provided, in a structured, commonly used, and machine-readable format. This right applies when the processing is based on consent or on the fulfilment of contractual obligations between the data subject and the controller, and the processing is carried out by automated means. A related right is the data subject’s right of access to the personal data concerning him or her. This right applies in all situations where personal data is processed, regardless of the legal basis for the processing. These two rights place additional requirements on AI system developers to consider how data can be accessed and extracted from the system and its associated databases, which, as noted above, can be complex given the nature of AI.
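A minimal Python sketch of a portability export might look as follows, returning a subject’s data as JSON, one example of a structured, commonly used, machine-readable format. The flat in-memory datastore is an assumption made for brevity; in practice, data must be gathered from every system where it resides.

```python
import json

def export_subject_data(subject_id: str, datastore: dict[str, dict]) -> str:
    """Collect all data held on one subject and return it as JSON,
    a structured, commonly used and machine-readable format (Art. 20 GDPR)."""
    subject_data = datastore.get(subject_id, {})
    return json.dumps({"subject_id": subject_id, "data": subject_data}, indent=2)

store = {"A-123": {"name": "Sarah", "purchases": ["coffee", "pastry"]}}
print(export_subject_data("A-123", store))
```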
Inferred data
In addition to personal data originating directly or indirectly from the data subject, AI systems often create additional data as a direct result of the processing they carry out. Such inferred data is not collected from the data subject but is created by, and constitutes output data of, an AI system. Since the scope of the GDPR is limited to personal data, the question arises as to when output data qualifies as personal data. If the data is attributed to individuals, meaning it is linked to a natural person, the inferred data can be assumed to qualify as personal data. However, if the data consists of generic models, such as group profiles (e.g., people who purchase coffee are also likely to buy pastries, or individuals who own red cars are more prone to develop heart disease), it may be difficult to determine at which point these characteristics are applied to individual data subjects and thus become personal data.
Typically, a profile like “60% of all adults enjoy reading” is not personal data, because no identified or identifiable person is mentioned. However, it becomes personal data when ascribed to a specific individual, such as Sarah, who is an adult (e.g., the probability that Sarah, who is an adult, enjoys reading is 60%). In other words, depending on which output data is generated, it may be subject to data subject rights, such as the right to be forgotten, and other provisions of the GDPR. Deployers of AI systems should therefore have clear guidelines in place instructing employees on how both input and output data are managed within the organisation. AI developers must also consider the relationship between input and output data and ensure that both are managed in accordance with the GDPR. In doing so, consideration should also be given to how such compliance measures affect the business asset that data in its various forms often constitutes. Depending on an actor’s position in the value chain, different solutions may be suitable to enable and maintain access to certain data, whether through ownership or licensing arrangements. For example, a provider of an AI system will most likely require a licence to the input and output data in order to train and develop the system, while a deployer will likely wish to retain ownership of the data.
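The tipping point described above, where an aggregate statistic becomes personal data, can be made concrete with a short Python sketch. The figures and names mirror the reading example in the text and are purely illustrative.

```python
# Aggregate group statistic: no identified or identifiable person, not personal data.
GROUP_PROFILE = {"adults_who_enjoy_reading": 0.60}

def apply_profile(subject_id: str, is_adult: bool) -> dict:
    """Ascribe the group statistic to a named individual.

    The moment the 60% figure is linked to an identifiable person, the
    resulting record is personal data and attracts data subject rights.
    """
    return {
        "subject_id": subject_id,  # identifier: the record is now personal data
        "p_enjoys_reading": GROUP_PROFILE["adults_who_enjoy_reading"] if is_adult else None,
    }

print(apply_profile("Sarah", is_adult=True))  # -> {'subject_id': 'Sarah', 'p_enjoys_reading': 0.6}
```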
Concluding remarks
Given the broad definition of personal data, the GDPR applies, in principle, to all businesses in one way or another. In contrast, the scope of the AI Act is more specific, primarily targeting providers and deployers of high-risk AI systems. When personal data is used in high-risk AI systems, the obligations of the GDPR and the AI Act overlap, which necessitates consideration of how they can be balanced and fulfilled. For example, it may be challenging to retrain an AI model that relies on data subsequently deemed unlawful to process under the GDPR. Consequently, this should be assessed and managed at an early stage of development.
The link between an AI system’s training data, its functionality, and the value of its output data means that even AI systems not classified as high-risk under the AI Act require regulatory consideration from a data protection perspective. These systems, although subject to lighter regulation under the AI Act, must still comply with the data protection rules, which will affect the technical features that can or should be implemented. It is also worth noting that several new EU regulations, such as the Data Act and the Cyber Resilience Act, together with national legislation implementing the NIS 2 Directive, will start applying in the relatively near future, complementing the AI Act and adding further dimensions to the management of data and AI.
Setterwalls has extensive expertise in both AI and product safety, as well as data protection and other IT-related legislation. We are well-equipped to assist your development efforts – feel free to reach out! Don’t miss the first part of this article series, where we outline an “AI Act roadmap” designed to help you determine your obligations under the AI Act. Also, keep an eye out for the third and final part of the series, which will discuss the impact of the AI Act on the life sciences sector.
[1] Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC.