AntiSPAM with SpamAssassin

If you skip this step, you will probably receive tons of spam every day. Spam is one of the biggest problems with e-mail, and filtering it is one of the most CPU- and memory-hungry tasks on a mail server, but it is necessary. We can configure SpamAssassin in several ways, but I think it’s important to give users some control. After this guide, users won’t be able to change settings by themselves (we would need a web interface for that), but they will be able to train their own spam filter, and the administrator will be able to change user preferences easily.

I’ll be using Ubuntu Server 12.04 to write this guide, so if you see commands like apt-get and sudo that you don’t have, you probably know the equivalent in your distribution. The configuration file contents and basic usage should be much the same.

Ok, let’s describe what we are going to do:

  • Install SpamAssassin and configure it to check every e-mail that arrives at any account for spam. Spam will be flagged.
  • Make SpamAssassin read several configurations: user configuration, domain configuration and global configuration, in that order, so a single user can change some properties.
  • SpamAssassin can use Bayesian filters to detect spam, so it can learn. We are going to scan junk mailboxes periodically to teach the system what is spam and what is not, on a per-user basis.
  • Install sieve and configure sieve scripts so that spam goes directly to the spam IMAP folder.

We will store SpamAssassin data (configuration and Bayesian data) in our MySQL database. We could use PostgreSQL or SQLite instead, but in this guide we will take advantage of our existing MySQL.

Let’s first install the necessary software: spamassassin, mailutils and the online antispam engines razor and pyzor (we can skip the engines, but they will speed up the process).

$ sudo apt-get install spamassassin mailutils razor pyzor

We must have a look at the configuration files, starting with /etc/default/spamassassin (I won’t reproduce the entire file, just the lines we need to change):

# Change to one to enable spamd
ENABLED=1

OPTIONS="--create-prefs --max-children 5 --helper-home-dir -q -x -u mail"

NICE="--nicelevel 10"

We enable the daemon, set some options, and set the nice level to 10 so the process has lower priority on our system. If filtering becomes too slow, we can comment this line out.

And now /etc/spamassassin/local.cf:
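
Since /etc/default/spamassassin is a plain shell fragment, one way to double-check the values after editing is to source it in a subshell and print them. This is only a sketch: the name show_spamd_defaults is made up for illustration, and it assumes the file contains nothing but variable assignments.

```shell
# show_spamd_defaults FILE
# Hypothetical helper: sources FILE in a subshell (so nothing leaks into the
# current shell) and prints the three variables changed in this guide.
show_spamd_defaults() {
    (
        ENABLED= OPTIONS= NICE=
        . "$1" || exit 1
        printf 'ENABLED=%s\nOPTIONS=%s\nNICE=%s\n' "$ENABLED" "$OPTIONS" "$NICE"
    )
}
```

It would be called as show_spamd_defaults /etc/default/spamassassin, and a commented-out NICE line simply shows up as an empty value.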

rewrite_header Subject *****SPAM*****
lock_method flock
required_score 5.0
use_bayes 1
bayes_auto_learn 1
bayes_ignore_header X-Bogosity
bayes_ignore_header X-Spam-Flag
bayes_ignore_header X-Spam-Status
skip_rbl_checks 0
pyzor_options --homedir /var/spool/pyzor
use_razor2 1
razor_timeout 5
use_pyzor 1
pyzor_timeout 5
report_contact  administrator@email.ext

Most of this is just uncommenting existing lines; the last six lines are new. What each setting does:

  • rewrite_header Subject *****SPAM***** : Rewrites the subject of e-mails flagged as spam, writing *****SPAM***** at the beginning.
  • lock_method flock : Sets how files are locked, here with flock (if the files live on NFS, we must write nfssafe instead).
  • required_score : Minimum score for an e-mail to be flagged as spam.
  • use_bayes 1 : Use the Bayesian classifier, which learns to recognize new spam from statistical information taken from real mail.
  • bayes_auto_learn 1 : Lets the filter train itself automatically on the mail it scans.
  • bayes_ignore_header : Tells the Bayesian classifier to ignore these headers. Headers added by an upstream filter (e.g. a mailing list) would otherwise let it take a shortcut and judge the e-mail by them instead of by its content.
  • skip_rbl_checks 0 : Don’t skip RBL tests. These tests query Realtime Blackhole Lists (also called real-time blacklists) over the Internet: large databases of IPs and domains reported as spam sources. They save us time and CPU because a lot of spam comes from these addresses.
  • pyzor_options : Options passed to pyzor; here, the home directory it should use.
  • use_razor2 / use_pyzor : Use the Razor and Pyzor online databases.
  • razor_timeout / pyzor_timeout : Timeout for those two lookups.
  • report_contact : Contact e-mail address shown to users so they can send comments.
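
To review which settings are actually active in local.cf after uncommenting, a small filter that hides comments and blank lines can help. This is a sketch, and active_settings is a name invented here, not a SpamAssassin tool.

```shell
# active_settings FILE
# Hypothetical helper: prints only the non-blank, non-comment lines of a
# SpamAssassin-style configuration file.
active_settings() {
    grep -Ev '^[[:space:]]*(#|$)' "$1"
}
```

Running active_settings /etc/spamassassin/local.cf should list roughly the lines shown above. For a real syntax check, SpamAssassin ships its own: spamassassin --lint.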

Now, let’s configure pyzor:

$ sudo mkdir /var/spool/pyzor
$ sudo pyzor --homedir /var/spool/pyzor discover
$ sudo chown -R mail:mail /var/spool/pyzor
$ sudo chmod -R +x /var/spool/pyzor
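
If spamd later complains that it cannot read the pyzor directory, a quick check is to list anything the recursive chown missed. The sketch below uses only find; the function name not_owned_by is hypothetical.

```shell
# not_owned_by USER DIR
# Hypothetical helper: lists everything under DIR that is NOT owned by USER.
# Empty output means every file and directory belongs to USER.
not_owned_by() {
    find "$2" ! -user "$1" -print
}
```

For this guide the call would be not_owned_by mail /var/spool/pyzor, and it should print nothing.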

At this point we can test whether everything is working: first restart (or start) the spamassassin service, then “send” a fake message to spamc (the SpamAssassin client):

$ sudo service spamassassin restart

$ cat /usr/share/doc/spamc/sample-spam.txt | spamc

We should see headers like these in the result:

Subject: *****SPAM***** Test spam mail (GTUBE)
Date: Wed, 23 Jul 2003 23:30:00 +0200
Message-Id: <GTUBE1.1010101@example.net>
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on cloud
X-Spam-Flag: YES
X-Spam-Level: **************************************************
X-Spam-Status: Yes, score=1000.0 required=5.0 tests=GTUBE,NO_RECEIVED,
        NO_RELAYS autolearn=no version=3.3.2
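
When testing repeatedly, it can be handy to pull just the verdict out of that output instead of reading all the headers. The following sketch assumes the X-Spam-Status layout shown above (verdict, score=, required=); spam_verdict is a name made up for this example.

```shell
# spam_verdict
# Hypothetical helper: reads a message on stdin and prints
# "<score> <required> <Yes|No>" taken from its first X-Spam-Status header.
spam_verdict() {
    sed -n 's/^X-Spam-Status: \([A-Za-z]*\), score=\([-0-9.]*\) required=\([-0-9.]*\).*/\2 \3 \1/p' |
        head -n 1
}
```

For example, the output of spamc can be piped straight into it: cat /usr/share/doc/spamc/sample-spam.txt | spamc | spam_verdict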

Now let’s make it work with Postfix. We must edit /etc/postfix/master.cf so that incoming e-mail goes through a new filter:

smtp    inet    n       -       n       -       -       smtpd
   -o content_filter=spamfilter:dummy
spamfilter unix -       n       n       -       -       pipe
   flags=Rq user=mail argv=/usr/local/bin/spamfilter.sh -f ${sender} -- ${recipient}

#smtp      inet  n       -       n       -       -       smtpd

We comment out the existing smtp line and write a new one. The message is passed through spamfilter.sh (we will create this file next) and then delivered.

Now, create the file /usr/local/bin/spamfilter.sh with these contents:
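
A typo in master.cf is easy to make, so here is a rough sanity check that both pieces are in place: the content_filter override and the spamfilter pipe service. It is a sketch based on plain grep, it only looks at the two lines shown above, and check_master_cf is a hypothetical name.

```shell
# check_master_cf FILE
# Hypothetical helper: succeeds only if FILE contains an active
# "-o content_filter=spamfilter:" override and an active
# "spamfilter unix ... pipe" service line.
check_master_cf() {
    grep -Eq '^[[:space:]]+-o[[:space:]]+content_filter=spamfilter:' "$1" &&
    grep -Eq '^spamfilter[[:space:]]+unix[[:space:]].*[[:space:]]pipe[[:space:]]*$' "$1"
}
```

Used as check_master_cf /etc/postfix/master.cf && echo "filter configured". Postfix’s own postfix check command remains the place to look for general configuration problems.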

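#!/bin/bash
# Scan the message on stdin with spamc, then re-inject it with sendmail.
# "$@" carries the arguments set in master.cf: -f sender -- recipients.
/usr/bin/spamc | /usr/sbin/sendmail -i "$@"
# Exit with the status of the pipeline (that of sendmail, its last command).
exit $?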
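
The script above only reports the exit status of sendmail. If you would rather have Postfix retry later when the scanning step itself fails outright (for example because the spamc binary is missing), a stricter variant could look like the sketch below. It is not the script used in the rest of this guide: the function name filter_and_deliver and the overridable SPAMC and SENDMAIL variables are assumptions made for illustration, and the temporary-file approach is borrowed from the simple filter example in Postfix’s FILTER_README.

```shell
#!/bin/sh
# Hypothetical stricter variant of spamfilter.sh (a sketch only).
# SPAMC and SENDMAIL can be overridden, which is mainly useful for testing.
SPAMC="${SPAMC:-/usr/bin/spamc}"
SENDMAIL="${SENDMAIL:-/usr/sbin/sendmail}"
EX_TEMPFAIL=75   # sysexits code that makes Postfix keep the mail and retry

filter_and_deliver() {
    tmp=$(mktemp) || return "$EX_TEMPFAIL"
    # Scan the message from stdin into a temporary file first.
    if ! "$SPAMC" > "$tmp"; then
        rm -f "$tmp"
        return "$EX_TEMPFAIL"
    fi
    # Re-inject the scanned message; "$@" carries -f sender -- recipients.
    "$SENDMAIL" -i "$@" < "$tmp"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

As a real filter script it would end with filter_and_deliver "$@" followed by exit $?.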

Give spamfilter.sh execution permissions:

$ sudo chown mail:mail /usr/local/bin/spamfilter.sh
$ sudo chmod +x /usr/local/bin/spamfilter.sh

Now we have the same configuration for all users, and it works fine, but users can do nothing about false positives or false negatives. They also have to create filters in their mail clients to delete spam messages (or move them to another folder).

You just have to restart Postfix and try to send a spam message from your favourite mail client. If it is not detected as spam, view the message source and look at the X-Spam headers.

$ sudo service postfix restart
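
For a test message that is guaranteed to be flagged, the GTUBE string (the same one used in the sample-spam.txt test earlier) can go in the body of an ordinary e-mail. The sketch below just prints such a message; make_gtube_message is a made-up name and the addresses are placeholders.

```shell
# make_gtube_message FROM TO
# Hypothetical helper: prints a minimal message whose body is the GTUBE test
# string, which SpamAssassin scores as spam by design.
make_gtube_message() {
    printf 'From: %s\nTo: %s\nSubject: GTUBE filter test\n\n' "$1" "$2"
    printf '%s\n' 'XJS*C4JDBQADN1.NSBN3*2IDNEN*GTUBE-STANDARD-ANTI-UBE-TEST-EMAIL*C.34X'
}
```

Paste the last line into a message sent from your mail client over SMTP. Piping the output to the local sendmail command would not exercise the filter, because content_filter was only set on the smtp service in master.cf.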

 
