Meta (formerly Facebook) has deployed a new cross-problem AI system that tackles three distinct but related violation types: hate speech, bullying and harassment, and violence and incitement.
Until now, AI systems have typically been single-purpose, each designed for a specific content type, language, and problem, such as detecting misinformation or flagging hate speech. These systems require varying amounts of training data and their own infrastructure. One of the biggest challenges, then, is to build fewer, more powerful AI systems rather than ever more bespoke ones.
To that end, models that combine signals across multiple systems can draw new connections and improve content understanding. This also makes integrity systems more efficient by making better use of compute resources, which, crucially, allows Meta to respond more rapidly to new issues.
By generalising across the three violation types, the system has developed a broader understanding of each problem and has outperformed the previous individual classifiers. This consolidation has helped reduce the prevalence of hate speech over the past six months, as reported in Meta's Community Standards Enforcement Report.
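Meta has not published the architecture, but a cross-problem classifier of this kind is commonly built as one shared encoder feeding a lightweight head per violation type, so signals learned for one violation (say, threatening language) also inform the others. The PyTorch sketch below is a minimal illustration of that idea; every class name, dimension, and the simple EmbeddingBag encoder are assumptions, not Meta's actual design.

```python
import torch
import torch.nn as nn

class CrossProblemClassifier(nn.Module):
    """One shared text encoder feeding three per-violation heads.

    Illustrative sketch only; Meta's real system is not public.
    """

    VIOLATIONS = ("hate_speech", "bullying_harassment", "violence_incitement")

    def __init__(self, vocab_size: int = 50_000, dim: int = 256):
        super().__init__()
        # Shared representation: features learned for one violation
        # type are reused by the other heads.
        self.embed = nn.EmbeddingBag(vocab_size, dim)  # mean-pools token embeddings
        self.encoder = nn.Sequential(nn.Linear(dim, dim), nn.ReLU())
        # One small binary head per violation type.
        self.heads = nn.ModuleDict(
            {v: nn.Linear(dim, 1) for v in self.VIOLATIONS}
        )

    def forward(self, token_ids: torch.Tensor) -> dict[str, torch.Tensor]:
        shared = self.encoder(self.embed(token_ids))
        # Every head scores the same shared representation.
        return {v: torch.sigmoid(head(shared)).squeeze(-1)
                for v, head in self.heads.items()}

model = CrossProblemClassifier()
scores = model(torch.randint(0, 50_000, (4, 32)))  # batch of 4 token sequences
print({v: s.shape for v, s in scores.items()})
```

Because the encoder is shared, adding a fourth violation type in this setup would mean adding one more small head rather than training and serving an entirely new model, which is the efficiency argument the company is making.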
The team uses the technology to reduce the prevalence of hate speech in several ways (a simplified triage sketch follows the list):
- proactively detect it,
- route it to human reviewers, and
- remove it when it violates policies.
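One plausible way to wire those three steps together is score-based triage: automatically remove high-confidence violations, send uncertain cases to human reviewers, and leave the rest up. The thresholds and names below are hypothetical; Meta does not disclose its enforcement logic.

```python
from dataclasses import dataclass
from enum import Enum, auto

class Action(Enum):
    REMOVE = auto()        # clear violation: remove automatically
    HUMAN_REVIEW = auto()  # uncertain: route to a reviewer
    ALLOW = auto()         # low score: leave the content up

@dataclass
class Decision:
    violation: str
    score: float
    action: Action

# Hypothetical thresholds for illustration only.
REMOVE_THRESHOLD = 0.95
REVIEW_THRESHOLD = 0.60

def triage(scores: dict[str, float]) -> Decision:
    """Pick the highest-scoring violation type and decide what to do."""
    violation, score = max(scores.items(), key=lambda kv: kv[1])
    if score >= REMOVE_THRESHOLD:
        action = Action.REMOVE
    elif score >= REVIEW_THRESHOLD:
        action = Action.HUMAN_REVIEW
    else:
        action = Action.ALLOW
    return Decision(violation, score, action)

print(triage({"hate_speech": 0.97, "bullying_harassment": 0.40,
              "violence_incitement": 0.10}))
```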
“We also saw a direct impact in how quickly we bring classifiers to new languages. While previous systems typically take months to create separate classifiers for each market, we replaced existing classifiers with our cross-problem systems in many markets within weeks, without needing additional hardware to run the new advanced models,” the company said in a blog post.
This work builds on the platform’s earlier multimodal integrity system, which combines signals across languages, modalities (text, images, and video), and violation types to understand harmful content at a deeper level.
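A common pattern for that kind of multimodal combination is late fusion: embed each modality, project the embeddings into a shared space, and classify the concatenation. The sketch below shows that generic pattern only; the dimensions and module names are assumptions, and Meta's actual multimodal system is considerably more sophisticated.

```python
import torch
import torch.nn as nn

class MultimodalFusion(nn.Module):
    """Project text and image features into one space, then fuse.

    Generic late-fusion sketch, not Meta's published architecture.
    """

    def __init__(self, text_dim: int = 768, image_dim: int = 1024,
                 shared_dim: int = 256):
        super().__init__()
        self.text_proj = nn.Linear(text_dim, shared_dim)
        self.image_proj = nn.Linear(image_dim, shared_dim)
        self.classifier = nn.Linear(2 * shared_dim, 1)

    def forward(self, text_feat: torch.Tensor,
                image_feat: torch.Tensor) -> torch.Tensor:
        # Concatenate the projected modalities and score the result.
        fused = torch.cat(
            [self.text_proj(text_feat), self.image_proj(image_feat)], dim=-1
        )
        return torch.sigmoid(self.classifier(fused)).squeeze(-1)

model = MultimodalFusion()
score = model(torch.randn(2, 768), torch.randn(2, 1024))  # 2 example posts
print(score.shape)  # torch.Size([2])
```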