Meta plans to replace humans with AI to assess privacy and societal risks

The AI Report
Daily AI, ML, LLM, and agents news

Meta is significantly increasing its reliance on artificial intelligence to assess risks across its platforms, including Facebook, Instagram, and WhatsApp. According to internal documents obtained by NPR, up to 90% of all product risk assessments will soon be automated.
This shift means that critical updates to algorithms, new safety features, and changes to content sharing rules will bypass human review and be primarily approved by an AI system. Previously, these privacy and integrity reviews were conducted almost entirely by human evaluators tasked with identifying potential harm, privacy violations, and the spread of toxic content.
Product developers inside Meta see this change as a way to speed up the release of new features and updates. However, current and former Meta employees are concerned that allowing AI to make complex determinations about real-world harm could have negative consequences. Critics argue that engineers, primarily evaluated on product launch speed, are not equipped to make these nuanced risk judgments, potentially leading to significant risks being missed.
Meta states that the changes aim to streamline decision-making and that human expertise will still be used for "novel and complex issues," with only "low-risk decisions" being automated. However, internal documents suggest automation is being considered for sensitive areas like AI safety, youth risk, and content integrity (e.g., violent content, falsehoods).
Under the new system, product teams will often receive an "instant decision" from the AI after completing a questionnaire, along with a list of requirements the team is responsible for verifying it has met before launch. Human review will no longer be the default; in most cases it will happen only if the product teams themselves choose to initiate it. A minimal sketch of this flow appears below.
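To make the reported workflow concrete, here is a minimal, hypothetical sketch of a questionnaire-driven triage flow: an instant decision with attached self-verification requirements, escalating to manual review only on request or for sensitive areas. None of the names, fields, or thresholds below come from Meta's actual system; they are illustrative assumptions only.

```python
# Hypothetical sketch of a questionnaire-driven risk triage flow.
# Names, risk areas, and escalation rules are assumptions made for
# illustration; they do not describe Meta's internal tooling.
from dataclasses import dataclass, field

# Areas the article reports are being considered for automation but that
# critics view as sensitive; here they trigger escalation instead.
SENSITIVE_AREAS = {"youth_risk", "ai_safety", "content_integrity"}

@dataclass
class Decision:
    outcome: str                               # "auto_approved" or "manual_review"
    requirements: list[str] = field(default_factory=list)

def assess(questionnaire: dict) -> Decision:
    """Return an instant decision plus the requirements the product team
    must self-verify before launch."""
    areas = set(questionnaire.get("risk_areas", []))
    requirements = [f"document mitigation for: {area}" for area in areas]

    # Escalate to human review only if the team asks for it or a
    # sensitive area is touched; otherwise auto-approve with the
    # self-verification requirements attached.
    if questionnaire.get("request_manual_review") or areas & SENSITIVE_AREAS:
        return Decision("manual_review", requirements)
    return Decision("auto_approved", requirements)

if __name__ == "__main__":
    print(assess({"risk_areas": ["data_sharing"]}))   # instant auto-approval
    print(assess({"risk_areas": ["youth_risk"]}))     # escalates to humans
```

The point of the sketch is the structural concern raised by critics: unless a submission explicitly matches an escalation rule, nothing in this flow puts a human expert in the loop.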
Experts and former employees warn that this automation, while potentially increasing speed, could decrease the rigor of risk assessment, especially as similar self-assessment processes in the past have failed to prevent significant problems. The move aligns with a broader company push to leverage AI and accelerate development, amid competition from rivals like TikTok and OpenAI.
Meta indicated in a recent report that it is using large language models for content moderation, performing for select policy areas at a level "beyond that of human performance" and freeing up human reviewers for more complex cases. The concern remains, however, that shifting the crucial task of identifying potential harm from human experts to automated systems, with product teams self-verifying their own compliance, could lead to unforeseen negative impacts on users and society.
Users in the European Union might be somewhat insulated from these changes due to existing regulations like the Digital Services Act, which requires stricter platform policing and protection of users.
