(Amnesty International) Companies are extracting vast troves of online data through unlawful web scraping to build their generative artificial intelligence (AI) products, enabling a mass invasion of privacy and making these systems unlawful by design, Amnesty International said in a new briefing.

‘Unlawful by Design: Exposing the Human Rights Costs of Generative AI’ documents serious risks in the large-scale data scraping and processing used to build and train these systems, including violations of the right to privacy by design and adverse consequences for the environment and historically marginalized communities.

“Companies across the world are supplying generative AI products under the veneer of efficiency and sophistication, but in reality, these systems perpetuate mass invasions of privacy through unlawful web scraping: an automated process for extracting data from websites, including personal data, such as images and social media activity, to train AI models,” said Likhita Banerji, Head of the Algorithmic Accountability Lab, Amnesty International.

“The extractive data pipeline, inherent design choices made by tech companies and exploitative supply chains, to build generative AI systems have enabled a paradigm of technology development that opens up a risk of mass abuse of human rights.”

Amnesty International researched the models powering some of the most popular publicly available standalone generative AI tools, including GPT-3 by OpenAI, Google’s Gemini, Meta’s Llama, DeepSeek and tools by Midjourney and Stable Diffusion. Such systems rely on extracting information from billions of public online posts and images, often without the explicit consent of the individuals appearing in or creating them. Not only does this infringe on privacy by design, but as the datasets powering AI models scale up, hateful and discriminatory content in their outputs is also amplified, along with negative stereotypes and prejudices, especially along racial and gendered lines.

– Global: Enormous data pipelines powering major generative AI systems are rooted in mass invasions of privacy by design – Amnesty International