How do the spam filters work?

bix · June 25, 2019, 7:50pm

Could we get some details on how the automated spam filters function? Question prompted by an incident that got brought up on Mastodon today.

matt · June 25, 2019, 9:45pm

(Topic split because I think this deserves its own discussion.)

Sure. I don’t want to go into too much detail, to ensure the filters remain effective against actual spammers. But to sum it up:

We have automated filters in place to prevent the mass-creation of accounts and posts on Write.as. The filters are hand-adjusted by the team and put into code. (If we were raising venture capital, I’d say our system is powered by cutting-edge natural language processing AI and will bring us all closer to the singularity. But alas, there are no complex algorithms.) Spam posts follow very specific patterns, so we manually code against them.
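To make "manually code against them" a bit more concrete, here is a minimal sketch of what a hand-written, pattern-based filter could look like. The patterns and the `looks_like_spam` name are hypothetical and purely illustrative; the real rules aren't public and aren't reproduced here.

```python
import re

# Hypothetical, hand-maintained rules. These are illustrative only and are
# NOT the actual patterns used on Write.as.
SPAM_PATTERNS = [
    re.compile(r"(?i)\bbuy cheap\b"),
    # "call now" followed later on the same line by a phone-like number
    re.compile(r"(?i)\bcall now\b.*\d{3}[-.\s]?\d{3}[-.\s]?\d{4}"),
    # five or more links back to back
    re.compile(r"(https?://\S+\s*){5,}"),
]


def looks_like_spam(text: str) -> bool:
    """Return True if the text matches any hand-written spam pattern."""
    return any(pattern.search(text) for pattern in SPAM_PATTERNS)
```

The appeal of this kind of approach is that each rule is readable and can be adjusted by hand when a new spam pattern shows up or a rule turns out to be too broad.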

Unfortunately, this sometimes catches legitimate users in the net (by my estimation, less than 0.0005% of the time). When anyone encounters our filter, they don’t lose their work; instead, they see this page. It encourages people to contact us if they’re a legitimate user. Most people email us, since the page says that’s what we prefer.

When we receive that correspondence, we look up their post attempt – it’s logged in our system so we can adjust our pseudo-AI. If the person isn’t actually a spammer (which most aren’t), we simply disable the spam filter for their account, and they can then publish without any issue.
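As a rough sketch of that flow (blocked attempts get logged, and an account can be exempted after review), something like the following would do. The `SpamGate` class, its method names, and the stand-in rule are all assumptions for illustration, not our actual code.

```python
from dataclasses import dataclass, field


@dataclass
class SpamGate:
    """Hypothetical sketch: log blocked attempts, allow per-account exemptions."""

    exempt_accounts: set = field(default_factory=set)
    blocked_log: list = field(default_factory=list)

    def is_spam(self, text: str) -> bool:
        # Stand-in rule for illustration; real rules would be hand-tuned.
        return "buy cheap" in text.lower()

    def try_publish(self, account: str, text: str) -> bool:
        """Return True if the post may be published, False if it was blocked."""
        if account not in self.exempt_accounts and self.is_spam(text):
            # Keep the attempt so it can be reviewed and the rules adjusted.
            self.blocked_log.append((account, text))
            return False
        return True

    def exempt(self, account: str) -> None:
        """Disable the filter for an account confirmed to be legitimate."""
        self.exempt_accounts.add(account)
```

For example, a post by `"alice"` containing the stand-in phrase is blocked and logged on the first attempt; after `gate.exempt("alice")`, the same post goes through.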

In the incident mentioned on Mastodon, the user encountered our spam filter, but publicly alleged that we blocked their post under our “malicious speech” policy (allegedly because their “malicious speech” was aimed at Nazis). The user claimed this without sharing the full picture or seeking clarification from us first, which allowed the misinformation to spread quickly and unchecked.

Again, our automated system doesn’t enforce this guideline. As mentioned on the “content blocked” page, it enforces our “Spam” policy alone.

To be clear, our “Malicious Speech” community guideline doesn’t exist to give comfort to hate groups. It’s there to protect against them.

On this note, I personally feel the language on our “blocked post” page is pretty straightforward (despite the humor). But if you feel it can be improved and clarified in some way, please let me know.