Teaching SpamAssassin What Is Spam

Written by Yujin Boby

SpamAssassin is one of the most widely used open-source spam filtering systems for email servers. It is commonly deployed together with Postfix (mail transfer) and Dovecot (mail delivery / IMAP).

Instead of relying on a single rule, SpamAssassin assigns a spam score to every email based on hundreds of tests, including:

Header analysis
Body content checks
DNS blocklists (RBLs)
SPF, DKIM and DMARC results
Bayesian (statistical) analysis

When the total score reaches the configured threshold (5.0 by default), the message is tagged as spam and can be delivered to a spam folder.
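The scoring idea can be illustrated with a small shell sketch. It is not how SpamAssassin is implemented; the classify function is a hypothetical helper, the scores passed to it are made-up values rather than real rule scores, and it treats a total at or above the threshold as spam.

```shell
#!/bin/sh
# Illustrative only: add up per-test scores and compare with a threshold.
# The scores are invented values, not real SpamAssassin rule scores.
classify() {
    threshold=$1
    shift
    # awk does the floating-point addition and the comparison
    printf '%s\n' "$@" | awk -v t="$threshold" '
        { total += $1 }
        END { printf "%.1f %s\n", total, (total >= t ? "spam" : "ham") }'
}

# Three tests matched, contributing 1.2, 0.8 and 3.5 points
classify 5.0 1.2 0.8 3.5
```

No single test decides the outcome here; the message is flagged only because the combined total passes the threshold.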

Teaching SpamAssassin What Is Spam

SpamAssassin provides the sa-learn command to train its Bayesian database.

To teach SpamAssassin that emails in a Junk folder are spam, you run:

sa-learn --spam /home/USER/Maildir/.Junk\ E-mail/{cur,new}

SpamAssassin automatically ignores messages it has already learned, so running this command multiple times is safe.
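If the training is scripted, a small wrapper can skip folders that do not exist yet. The following is a minimal sketch, not an official tool: learn_folder and the SA_LEARN variable are hypothetical names introduced here, and the wrapper assumes the Maildir layout used above with cur and new subdirectories.

```shell
#!/bin/sh
# Sketch: train the Bayes database from one Maildir folder.
# SA_LEARN is a hypothetical override, so the commands can be previewed
# (for example with SA_LEARN=echo) without touching the database.
SA_LEARN=${SA_LEARN:-sa-learn}

learn_folder() {
    kind=$1      # "spam" or "ham"
    folder=$2    # Maildir folder that contains cur/ and new/
    for sub in cur new; do
        if [ -d "$folder/$sub" ]; then
            "$SA_LEARN" "--$kind" "$folder/$sub"
        else
            echo "skipping missing directory: $folder/$sub" >&2
        fi
    done
}
```

For example, learn_folder spam "/home/USER/Maildir/.Junk E-mail" would train on the Junk folder. Quoting the path avoids having to escape the space in the folder name.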

Teaching SpamAssassin What Is NOT Spam (Ham)

Training spam alone is not enough. For best accuracy, SpamAssassin should also learn from legitimate mail (ham), usually from the Inbox:

sa-learn --ham /home/USER/Maildir/{cur,new}

A healthy Bayesian database contains both spam and ham, ideally at least a few thousand messages of each.
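To see how many messages have been learned so far, sa-learn --dump magic prints the database counters. With default settings, SpamAssassin does not apply Bayes scores until at least 200 spam and 200 ham messages have been learned. The sketch below pulls the two counts out of that output; bayes_counts is a hypothetical helper, and it assumes the usual layout in which the count is the third column of the "non-token data: nspam" and "non-token data: nham" lines.

```shell
#!/bin/sh
# Sketch: read spam/ham counts from `sa-learn --dump magic` output on stdin.
# Assumes the count is the third column of the nspam and nham lines.
bayes_counts() {
    awk '
        /non-token data: nspam$/ { spam = $3 }
        /non-token data: nham$/  { ham  = $3 }
        END { printf "spam=%d ham=%d\n", spam, ham }'
}

# Usage on a real server (not run here):
#   sa-learn --dump magic | bayes_counts
```

If one count is far lower than the other, training more messages of that kind is a reasonable next step.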
