Google is the latest LLM developer to face scrutiny from the Irish Data Protection Commission
The Data Protection Commission (DPC) recently announced it will investigate whether Google was required to perform a Data Protection Impact Assessment (DPIA) before processing data from European users to train and develop PaLM 2, the precursor to Google's Gemini multimodal AI models. Under Article 35 of the General Data Protection Regulation (GDPR), a DPIA is required whenever data processing is likely to pose a high risk to the rights and freedoms of individuals; in this case, the people whose data was processed to train PaLM 2.
DPIAs enable model developers to identify data protection risks and objectively evaluate whether the data processing is necessary; if it is, DPIAs also help developers plan and implement safeguards to mitigate the identified risks. In theory, GDPR violations could cost Alphabet (Google's parent company) fines of up to 4% of its global annual turnover. However, the DPC has yet to exercise its authority to impose such fines, as exemplified by how its disputes with Meta and X over their data processing practices were resolved.
The DPC's decision to investigate the data processing practices Google adopted during PaLM 2's development is especially notable given that Google is working to retire the model in favor of the newer Gemini 1.5 models. The Gemini models already power all the generative AI features across Google's services, and the company has announced it will retire the PaLM API in October 2024. The choice to begin by investigating PaLM 2 rather than Gemini could signal the DPC's intent to be thorough in examining developers' data processing practices, but of course, this is only speculation.