Sixteen data protection authorities recently confirmed that controllers must protect their online properties from web scraping, including web scraping for the purpose of training AI.

Here are some takeaways from the latest statement, which is a follow-up to a statement that 12 data protection authorities issued last year.

  • All companies, not just social media companies, must protect the publicly accessible personal information that they host against unlawful scraping.
  • Failure to implement adequate safeguards in compliance with applicable laws could result in regulatory intervention, including enforcement actions.
  • Those engaged in data scraping, as well as social media companies and other organizations that use data from their own platforms to train AI, should implement measures to ensure that their data practices comply with data protection and privacy laws.
  • Mass data scraping incidents that harvest personal information can constitute reportable data breaches in many jurisdictions.
  • To effectively protect against unlawful scraping, organizations should deploy a combination of safeguarding measures, and those measures should be regularly reviewed and updated to keep pace with advances in scraping techniques and technologies.
  • When an organization grants lawful permission for third parties to collect publicly accessible personal data from its platform, providing such access via an Application Programming Interface (API) can allow the organization greater control over the data and facilitate the detection and mitigation of unauthorized scraping.
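
To make the last two takeaways more concrete, here is a minimal sketch of one possible safeguard: counting requests per API key over a sliding time window, refusing requests over the limit, and flagging the key for review. This is an illustration only. The statement summarized above does not prescribe this technique, and the class name, method names, and limits below are hypothetical choices made for the example.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class KeyedRateLimiter:
    """Sliding-window request counter per API key (hypothetical helper)."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        # Both limits are assumptions for illustration; real values depend
        # on the platform's normal traffic and legal requirements.
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # api_key -> recent request times
        self.flagged = set()             # keys that exceeded the limit

    def allow(self, api_key: str, now: Optional[float] = None) -> bool:
        """Return True if the request may proceed, False if it is refused."""
        now = time.monotonic() if now is None else now
        hits = self._hits[api_key]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            # Over the limit: refuse and record the key for later review.
            self.flagged.add(api_key)
            return False
        hits.append(now)
        return True


if __name__ == "__main__":
    limiter = KeyedRateLimiter(max_requests=3, window_seconds=60.0)
    for i in range(5):
        print(i, limiter.allow("partner-key", now=float(i)))
    print("flagged:", limiter.flagged)
```

Because each request is tied to a key, unusual volumes can be traced to a specific third party, which is the kind of control and detection that API-based access is meant to enable. A limiter like this would be only one layer in the combination of measures the authorities call for.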