Filtering mail and spam with Perl

Submitted by shonyo
on December 17, 2003 - 6:38am

Thanks to the tag-team effort of the Mail::Audit and Mail::SpamAssassin Perl modules, I now have a mail-filtering Perl script that both replaces procmail's cryptic recipes and filters spam.

Simon Cozens, author of Mail::Audit and maintainer of the perl.com website, published an article on Mail::Audit and how it can be used as an alternative to procmail. The ease with which mail-filtering recipes could be written in legible Perl code was very appealing to me. However, it wasn't until I read his follow-up article on fighting spam that I was completely sold on using Mail::Audit.

I won't recount all the neat things Mail::Audit can do that the aforementioned article covers, but here's an easy-to-read recipe to show how pleasing it is to the eye.

# -------------------------------------------------------
# Anything auction related gets its own folder.

for (qw(ebay.com paypal.com auctionsniper.com)) {
    # \Q...\E quotes the dots so they match literally.
    if ($from =~ /\Q$_\E/i) {
        my $where = 'ebay';
        print LOG "$from: $subject: accepting to $where folder\n";
        $item->accept($folder . $where);
    }
}
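
If the list of senders grows, the same idea can be written as a lookup table. What follows is only a sketch under my own assumptions, not anything taken from Mail::Audit: the %folder_for_domain hash and the folder_for() helper are hypothetical names, and the stricter pattern (the domain must follow an "@" or a "." and must not be followed by more hostname characters) is my choice rather than part of the recipe above. It uses plain Perl only, so it can be tried without any mail modules installed.

```perl
use strict;
use warnings;

# Hypothetical routing table: sender domain => folder name.
# The domains are the ones from the recipe; the table itself is an assumption.
my %folder_for_domain = (
    'ebay.com'          => 'ebay',
    'paypal.com'        => 'ebay',
    'auctionsniper.com' => 'ebay',
);

# Return the folder name for a From: header value, or undef if nothing matches.
sub folder_for {
    my ($from) = @_;
    for my $domain (sort keys %folder_for_domain) {
        # \Q...\E quotes the dots in the domain so they match literally.
        # [@.] requires the domain to start right after "@" or a subdomain dot.
        # (?![\w.-]) rejects longer hostnames such as ebay.com.example.org.
        return $folder_for_domain{$domain}
            if $from =~ /[@.]\Q$domain\E(?![\w.-])/i;
    }
    return;
}

# Try it on a couple of made-up From: values.
for my $from ('Seller <member@ebay.com>', 'friend@example.org') {
    my $where = folder_for($from);
    print "$from -> ", (defined $where ? $where : '(no match)'), "\n";
}
```

In the filter script itself, the loop would then shrink to a single call to folder_for($from), followed by $item->accept($folder . $where) when a folder comes back. Keep in mind that a From: header is easy to forge, so this is a convenience for sorting mail and not a security check.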

And filtering spam never looked so good.

# Local tests only: skip the network-based checks.
my $spamtest = Mail::SpamAssassin->new({local_tests_only => 1});
my $status = $spamtest->check($item);
if ($status->is_spam()) {
    $status->rewrite_mail();
    print LOG "$from: $subject: flagging as spam\n";
    $item->accept($folder . "spam");
}

I use the local_tests_only flag when constructing the Mail::SpamAssassin object to skip all the network-based checks, like Vipul's Razor and the Realtime Blackhole Lists (RBLs). While this reduces the effectiveness of the spam filter, I don't take the performance hit with every email I receive. After I've run this for a while, I'll try configuring SpamAssassin's preferences to use only the most effective network checks, in an attempt to get the most spam hits with the best performance.

I recommend that anyone who hosts their mail on UNIX and is a Perl enthusiast give the above method a try. It's surprisingly easy to set up, and I think you'll be pleased with the results.