Twitter did not say, however, whether tweets that receive notes could face removal or other consequences; the intent behind the tool, it said, is to let users respond quickly to a fast-spreading claim.
Birdwatch’s approach flips Twitter’s existing model of moderation, in which Twitter itself labels content. It also places more power (or, potentially, more burden) in the hands of users, much as Wikipedia and Reddit are platforms whose communities govern themselves.
Twitter users have played a role before in flagging potentially problematic tweets to the company, but this move makes users even more involved in the content moderation process.
“We know this might be messy and have problems at times, but we believe this is a model worth trying,” Twitter VP of Product Keith Coleman said in the blog post.
To participate in the project, users must first register with Birdwatch, have a verified email address and phone number with a “trusted” US wireless carrier, and have a Twitter record free of policy violations. Data generated by the pilot project will be shared with researchers, the company said.
NBC News and Fox News were first to report Birdwatch’s launch.
A permanent shift to user-driven content moderation would be a significant pivot for Twitter. The company acknowledged Monday that it could raise thorny questions about how to ensure the new system is not abused by malicious actors.
Birdwatch plans to address those concerns, at least initially, with a ranking system for contextual notes. If a note has received enough ratings indicating it is helpful, it may be displayed at the top of the list, Twitter said.
Twitter added that its pilot project follows interviews with more than 100 users who expressed interest in a more nuanced way to evaluate tweets.
But while other platforms have developed thriving systems for community governance, there are also key differences. Reddit, for example, is organized into sub-communities that are each overseen by volunteer moderators who can establish group rules and enforce them by directly removing content.
Wikipedia, meanwhile, keeps track of changes to its encyclopedia articles; while rogue users can occasionally make unauthorized edits to a page, those changes are typically rolled back to a prior version within minutes.
It is less clear how Twitter plans to address instances of abuse, including potential efforts by some users to flag accurate tweets as misleading or to overload a notes thread with misinformation.
“We know there are a number of challenges toward building a community-driven system like this — from making it resistant to manipulation attempts to ensuring it isn’t dominated by a simple majority or biased based on its distribution of contributors,” Coleman wrote. “We’ll be focused on these things throughout the pilot.”