Maildrop Spam Filtering For Postfix MySQL

Courier-Maildrop can add spam filtering and Maildir folder administration to a Postfix MySQL setup. Though Postfix natively supports email storage in Maildir format, it lacks the ability to redirect emails marked as spam. Postfix either rejects spam or delivers it to the inboxes of the virtual mail folders along with everything else. It offers no option to move spam to a spam folder automatically. While SpamAssassin makes spam easier to recognize by writing “***Spam***” into subject lines, having to move or delete these messages manually is not ideal. This is the functional gap that Maildrop closes.

In this article, I first explain how Courier-Maildrop works, using sample configuration files for spam filtering. Afterwards, I list the configuration changes for integrating Maildrop into Postfix’ output pipeline. Finally, I show how to completely replace Maildrop with a simple Perl script for improved interoperability.

How Courier-Maildrop Works

Courier-Maildrop receives emails from standard input and stores them in Maildir format according to instructions from a maildroprc configuration file. Let’s look at a basic Maildrop configuration file:

# Sample maildroprc
SHELL="/bin/bash"
logfile "/var/log/maildrop/maildrop.log"


# default delivery
exception {
        to "$VHOME"
}

From this minimal configuration, Maildrop reads the location of its logfile and the Maildir output folder as specified in the “$VHOME” environment variable. It then stores mails from STDIN under “$VHOME”.

In more advanced setups, Maildrop performs pattern matches on email header lines as well as folder administration:

# Sample maildroprc with spam folder
SHELL="/bin/bash"
logfile "/var/log/maildrop/maildrop.log"
SPAMFOLDER="Junkmail"

# Create the spam folder if it does not exist yet
`test -d "$VHOME/.$SPAMFOLDER"`
if ( $RETURNCODE != 0 )
{
     `/usr/bin/maildirmake -f "$SPAMFOLDER" "$VHOME"`
     `echo "INBOX.$SPAMFOLDER" >> "$VHOME/courierimapsubscribed"`
}

# Check if message marked as spam
if (/^X-Spam-Flag: YES/) {
        log "Spam"
        exception {
                to "$VHOME/.$SPAMFOLDER"
        }
}
# default delivery
exception {
        to "$VHOME"
}

Here, Maildrop first checks if a spam folder already exists under the virtual user directory. If not, it creates one and adds a folder subscription line to a Courier-IMAP control file. Afterwards, it matches mail header lines against SpamAssassin’s X-Spam-Flag marker. If any header line starts with “X-Spam-Flag: YES”, Maildrop delivers the message to the spam folder.

How to Integrate Maildrop for Spam Filtering into Postfix MySQL

My description of integrating Maildrop into a Postfix setup with MySQL refers to this base installation. To enable delivery via Maildrop, add these settings to Postfix’ main.cf:

# in /etc/postfix/main.cf
# add to enable maildrop
maildrop_destination_recipient_limit = 1
virtual_transport = maildrop

Furthermore, we need to add a Maildrop section to master.cf:

# in /etc/postfix/master.cf
# add to enable maildrop
maildrop  unix  -       n       n       -       -       pipe
  flags=DRhu user=virtual argv=/usr/bin/maildrop -d ${recipient}

With these settings and a maildroprc as in the previous section, Maildrop diverts spam messages to a spam folder. But one problem remains: we still need to pass the $VHOME virtual mail directory path. Since this path is different for every mail user, and we want to support many of them, hard coding $VHOME in maildroprc is not an option. However, since Postfix invokes Maildrop with the recipient address on the command line, we can read the mail directory from the database. For example, we could write a script “maildirinfo.pl” that prints the target folder on STDOUT, and add these lines to the maildroprc:

POSTBOX="$2"
VHOME=`/home/virtual/bin/maildirinfo.pl $POSTBOX`

A sample “maildirinfo.pl” script reading the mail folder could look like this:

#!/usr/bin/perl
#
use strict;

use FindBin;
use DBI;

my $logdir = "$FindBin::Bin/log";
my $db = 'maildb';
my $dbu = 'mail';
my $dbpw = 'YourMaildbPw';

my ($target) = @ARGV;
# derive log file name from script name
my $logfile = $FindBin::Script;
$logfile =~ s/\.[^\.]*$//;
# reuse the variable instead of declaring it a second time with "my"
$logfile = "$logdir/$logfile.log";

if ( ! -d $logdir ) {
	mkdir $logdir;
}

if ( ! -d $logdir ) {
	logError("Failed to create logdir $logdir\n");
}

my $time = `date`;
$time =~ s/\s*$//;

# make sure to log any failures
eval { main() };
if ($@) {
	logError("$@");
	exit 1;
}

exit 0;

sub main {

	logLine("####\nRead mail folder for $target at $time");
	
	# open database
	my $dbh = DBI->connect("DBI:mysql:database=$db;host=localhost", $dbu, $dbpw, {'RaiseError' => 1});
	# lookup delivery for target email address
	my $maildir = getTarget($dbh, $target);
	# clean up
	$dbh->disconnect();
	
	logLine("Target from db: $maildir");
	print $maildir;
}

sub logLine {
	my ($line) = @_;
	my $lf;
	if (open($lf,'>>', $logfile)) {
		print $lf "$line\n";
		close $lf;
	} else {
		print STDERR "$line\n";
	}
}

sub logError {
	my ($line) = @_;
	$line = "*** Error: $line";
	logLine($line);
}

sub getTarget {
	my ($dbh, $email) = @_;
	my $sth = $dbh->prepare("select home, maildir from users where id = ?");
	$sth->bind_param(1, $email);
	$sth->execute();
	
	my ($home, $maildir) = (undef, undef);
	while(my $ref = $sth->fetchrow_hashref()) {
		$home = $ref->{'home'};
		$maildir = $ref->{'maildir'};
	}
	if (!$home) {
		die "No target for email $email in maildb";
	}
	$maildir =~ s/[\/\s]*$//;
	return "$home/$maildir";
}

In conclusion, a “maildirinfo” script that looks up the destination mail folder from the database completes our solution for delivering spam to spam folders.

A Maildrop Replacement for Spam Filtering in Perl

I recommend using a Maildrop replacement for spam filtering and mail folder administration written in Perl. This solution combines maildrop, maildirmake, maildroprc and the maildirinfo folder lookup all into one script:

#!/usr/bin/perl
#
use strict;

use FindBin;
use DBI;
use Time::HiRes qw(gettimeofday);
use File::Copy qw(move);

my $logdir = "$FindBin::Bin/log";
my $db = 'maildb';
my $dbu = 'mail';
my $dbpw = 'YourMaildbPw';
my $inbox = 'INBOX';
my $junkbox = 'Junkmail';
my $couriersubscribed = 'courierimapsubscribed';

my ($mod, $target) = @ARGV;
# derive log file name from script name
my $logfile = $FindBin::Script;
$logfile =~ s/\.[^\.]*$//;
# reuse the variable instead of declaring it a second time with "my"
$logfile = "$logdir/$logfile.log";

if ( ! -d $logdir ) {
	mkdir $logdir;
}

if ( ! -d $logdir ) {
	logError("Failed to create logdir $logdir\n");
}

my $time = `date`;
$time =~ s/\s*$//;

# make sure to log any failures
eval { main() };
if ($@) {
	logError("$@");
	exit 1;
}

exit 0;

sub main {

	logLine("####\nDeliver to $target at $time");
	
	# read email from stdin
	my ($head, $body) = getMail();
	
	# log mail headers if debug flag HEAD_DEBUG==1
	my $head_debug = $ENV{HEAD_DEBUG};
	if ($head_debug && $head_debug > 0) {
		for my $line (@$head) {
			logHeadLine("head: $line");
		}
	}
	# log body if debug flag HEAD_DEBUG==2
	if ($head_debug && $head_debug > 1) {
		for my $line (@$body) {
			# log only; do not print anything to STDOUT during delivery
			logHeadLine("body: $line");
		}
	}
	
	# open database
	my $dbh = DBI->connect("DBI:mysql:database=$db;host=localhost", $dbu, $dbpw, {'RaiseError' => 1});
	# lookup delivery for target email address
	my ($home, $maildir, $spamlevel) = getTarget($dbh, $target);
	# clean up
	$dbh->disconnect();
	
	logLine("Target from db: $home / $maildir -> $spamlevel");
	my @maildircomp = split ('\/', $maildir);
	if ($maildircomp[-1] =~ /^\s*$/) {
		pop @maildircomp;
	}
	logLine("join: " . join('/',@maildircomp));
	# do we deliver to a subfolder?
	# subfolders start with a '.'
	my $submailbox = undef;
	if ($maildircomp[-1] =~ /^\.[^\.]+/) {
		# remove subfolder from mail box folder path
		$submailbox = pop @maildircomp;
	}
	my $mailboxfolder = $home . '/' . join('/',@maildircomp);
	my $maildeliveryfolder = $mailboxfolder . '/';
	if ($submailbox) {
		$maildeliveryfolder .= $submailbox . '/';
	}
	if (! -d $maildeliveryfolder) {
		maildirmake($maildeliveryfolder);
		if ($submailbox) {
			appendLine("$mailboxfolder/$couriersubscribed","${inbox}${submailbox}");
		}
	}
	# if delivery to subfolder, spamfolder is still child of main 
	# mail box folder
	my $spamfolder = $mailboxfolder . '/.' . $junkbox . '/';
	if (! -d $spamfolder) {
		maildirmake($spamfolder);
		appendLine("$mailboxfolder/$couriersubscribed","$inbox.$junkbox");
	}
	
	# check for spam marker from spamassassin
	# X-Spam-Flag: YES
	my $isspam=0;
	for my $line (@$head) {
		if ($line =~ /^X-Spam-Flag:\s*YES/i) {
			$isspam = 1;
			last;
		}
	}
	my $outfolder = $maildeliveryfolder;
	if ($isspam) {
		$outfolder = $spamfolder;
	}
	outputMaildirFile($outfolder, $head, $body);
}

# replacement for courier maildrop file output
sub outputMaildirFile {
	my ($folder, $head, $body) = @_;
	# From https://cr.yp.to/proto/maildir.html
	# output file name has
	# {unix time in seconds}.M{microseconds}P{pid}V{deviceno}I{inode}_{count}.{hostname},S={filelength}
	# count is an output counter in case this routine is called multiple times
	# filelength is the length of the mail file in bytes
	# since we get deviceno and inode from the new mail file,
	# we first create it under the tmp folder and move it afterwards.
	my ($secs,$usecs) = gettimeofday;
	my $pid = $$;
	my $hostname = `hostname`;
	$hostname =~ s/\s*$//;
	if ($hostname =~ /^\s*$/) {
		$hostname = 'localhost';
	}
	my $rnd = int(rand(1000000));
	my $fh;
	my $tfn = "${folder}/tmp/${secs}.M${usecs}P${pid}R${rnd}.$hostname";
	open($fh,'>',$tfn) or die "Cannot open tmp output file $tfn";
	chmod 0600,$fh;
	printLineArray($fh, $head);
	printLineArray($fh, $body);
	close($fh);
	my ($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size,
		$atime,$mtime,$ctime,$blksize,$blocks) = stat($tfn);
	my $hino = uc sprintf '%016x',$ino;
	my $hdev = uc sprintf '%016x',$dev;
	my $count = 0; # would need to count up if function called multiple times
	my $ffn = "${folder}/new/${secs}.M${usecs}P${pid}V${hdev}I${hino}_${count}.${hostname},S=${size}";
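	# fail loudly so that the eval in the main program logs the error
	move($tfn, $ffn) or die "Cannot move $tfn to $ffn: $!";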
}

sub logLine {
	my ($line) = @_;
	my $lf;
	if (open($lf,'>>', $logfile)) {
		print $lf "$line\n";
		close $lf;
	} else {
		print STDERR "$line\n";
	}
}

sub logHeadLine {
	my ($line) = @_;
	$line =~ s/\s*$//;
	logLine($line);
}

sub logError {
	my ($line) = @_;
	$line = "*** Error: $line";
	logLine($line);
}

sub getMail {
	my @content = ([],[]);
	my $idx = 0;
	while (<STDIN>) {
		my $line = $_;
		if ($idx == 0 && $line =~ /^\s*$/) {
			$idx++;
		}
		push @{$content[$idx]}, $line;
	}
	return @content;
}

sub getTarget {
	my ($dbh, $email) = @_;
	my $sth = $dbh->prepare("select home, maildir, spamlevel from users where id = ?");
	$sth->bind_param(1, $email);
	$sth->execute();
	
	my ($home, $maildir, $spamlevel) = (undef, undef, undef);
	while(my $ref = $sth->fetchrow_hashref()) {
		$home = $ref->{'home'};
		$maildir = $ref->{'maildir'};
		$spamlevel = $ref->{'spamlevel'};
	}
	if (!$home) {
		die "No target for email $email in maildb";
	}
	return ($home, $maildir, $spamlevel);
}

sub appendLine {
	my ($filename, $line) = @_;
	my $fh;
	if (open($fh, '>>', $filename) ) {
		print $fh "$line\n";
		close $fh;
	}
}

sub runCommand {
	my ($cmd) = @_;
	logLine("exec: $cmd");
	if (system($cmd) != 0) {
		logLine("error running $cmd");
	}
}

sub printLineArray {
	my ($fh, $arr) = @_;
	for my $line (@$arr) {
		print $fh $line;
	}
}

# replacement for courier maildirmake
sub maildirmake {
	my ($dir) = @_;
	$dir =~ s/\/*$//;
	# create any in between folders
	my $mode = 0700;
	my @folders = split('\/',$dir);
	for (my $i = 1; $i < $#folders; $i++) {
		my @rootfolders = @folders[0..$i];
		my $rootfolder = join('/',@rootfolders);
		if (! -d $rootfolder) {
			if (!mkdir $rootfolder,$mode) {
				die "Cannot create root folder $rootfolder";
			}
		}
	}

	if (!mkdir $dir,$mode) {
		die "Cannot create dir $dir";
	}
	if (!mkdir "$dir/new",$mode) {
		die "Cannot create dir $dir/new";
	}
	if (!mkdir "$dir/cur",$mode) {
		die "Cannot create dir $dir/cur";
	}
	if (!mkdir "$dir/tmp",$mode) {
		die "Cannot create dir $dir/tmp";
	}
}

In Postfix’ master.cf, we can now simply pass this script for spam folder handling:

# /etc/postfix/master.cf
# add to enable maildrop style delivery
maildrop  unix  -       n       n       -       -       pipe
  flags=DRhu user=virtual argv=/home/virtual/delivery/vdelivery.pl -d ${recipient}

Postfix now runs the vdelivery.pl script with the “${recipient}” address, equivalent to the “id” field in the users table, as command line argument. The script first reads the email from STDIN and splits it into header and body sections at the first empty line. Afterwards, it looks up the destination folder for the target address in the users table. Note that the maildb setup allows delivery to subfolders of the Inbox for specific mail aliases. Since spam sent to such aliases should still end up in one single junk mail folder, the script adds some special subfolder handling. Finally, it checks if SpamAssassin flagged the mail as spam and invokes the output routine.
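The subfolder handling is the least obvious part of the script, so here is a minimal sketch of it in isolation. The helper name resolve_folders and the example paths are assumptions of mine for illustration. Only the rule itself comes from the script above: a trailing path component starting with a dot is a subfolder, and the junk folder always hangs off the main mailbox.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of vdelivery.pl): split a maildir value
# from the users table into main mailbox, delivery folder and junk folder.
sub resolve_folders {
    my ($home, $maildir, $junkbox) = @_;
    # drop empty components caused by leading, doubled or trailing slashes
    my @comp = grep { length } split m{/}, $maildir;
    # a trailing component starting with '.' is a Maildir subfolder
    my $sub;
    $sub = pop @comp if @comp && $comp[-1] =~ /^\.[^.]+/;
    my $mailbox  = join '/', $home, @comp;
    my $delivery = defined $sub ? "$mailbox/$sub" : $mailbox;
    # the junk folder is always a child of the main mailbox
    my $spam     = "$mailbox/.$junkbox";
    return ($mailbox, $delivery, $spam);
}

# example values are made up for this sketch
my @plain = resolve_folders('/home/virtual', 'example.com/alice/', 'Junkmail');
my @alias = resolve_folders('/home/virtual', 'example.com/alice/.lists/', 'Junkmail');
print join(' | ', @plain), "\n";
print join(' | ', @alias), "\n";
```

Both calls return the same junk folder, which is the point of the special handling: an alias that delivers to “.lists” still sends its spam to the one “.Junkmail” folder of the main mailbox.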

Avoiding Mail File Name Collisions on Server Farms

For the use of the delivery script on server farms, consider the file naming scheme Courier-Maildrop employs to avoid naming collisions:

# sample Courier-Maildrop mail file name:
1645571915.M354491P243337V0000000000010304I00000000021A1194_0.octopus,S=14602

Maildrop begins file names with Unix seconds since 1970, followed by a ‘.’ and several other fields, each introduced by a single-letter marker. The M field has microseconds, P the process ID, V the file system device number, and I the inode number. The ‘_’ introduces a counter for multiple deliveries of the same mail, followed by a ‘.’ and the delivering hostname “octopus”. After “,S=” the file name includes the file size in bytes. For a single host writing to the file system, the combination of timestamp with microsecond fraction and process ID would be safe. But server farms with multiple mailer hosts writing to the same output file system require additional differentiating factors. Therefore, Courier-Maildrop as well as the replacement script add in the file system device number, inode number, and hostname for safety.
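To make the field layout concrete, the following sketch takes the sample file name apart with a regular expression. The helper name parse_maildir_name is hypothetical, and the pattern is only derived from the one sample above, so treat it as an illustration rather than a complete parser for every name Maildrop may produce.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split a Maildrop-style file name into its fields.
# Returns a hash reference, or nothing if the name does not match.
sub parse_maildir_name {
    my ($name) = @_;
    my @m = $name =~ /^(\d+)              # Unix seconds
                       \.M(\d+)           # microseconds
                       P(\d+)             # process ID
                       V([0-9A-Fa-f]+)    # device number (hex)
                       I([0-9A-Fa-f]+)    # inode number (hex)
                       _(\d+)             # delivery counter
                       \.([^,]+)          # hostname
                       ,S=(\d+)           # file size in bytes
                     /x;
    return unless @m;
    my %f;
    @f{qw(secs usecs pid dev ino count host size)} = @m;
    return \%f;
}

my $sample = '1645571915.M354491P243337V0000000000010304I00000000021A1194_0.octopus,S=14602';
my $f = parse_maildir_name($sample) or die "name does not match";
print "$_=$f->{$_}\n" for qw(secs usecs pid dev ino count host size);
```

The device and inode fields stay hexadecimal strings here, since the sketch is only meant to show where each field begins and ends.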

Note, however, one possible limitation: the replacement script gets device and inode numbers only after it writes to a temporary file. For unique file names in the tmp folder, it concatenates timestamp, process ID, a random number and the hostname. By themselves, timestamp, PID and hostname should be unique. But on server farms with poor time synchronization or sloppy hostnames, the replacement script cannot guarantee freedom from collisions. I’ve never seen temp file names produced by Maildrop, so I can’t say if it’s any better with the original Courier tooling.
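One way to reduce this risk, which the replacement script does not implement, is to create the temporary file exclusively and retry under a new name if the name is already taken. The sketch below is only an assumption of how that could look: the helper name open_tmp_exclusive and the retry count are made up, and it swaps the plain open() of the script for sysopen() with O_EXCL. How reliable O_EXCL is on a shared network file system depends on that file system, so this narrows the window rather than closing it in every setup.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_CREAT O_EXCL);
use Time::HiRes qw(gettimeofday);
use Sys::Hostname qw(hostname);
use File::Temp qw(tempdir);

# Hypothetical helper: open a uniquely named file under $folder/tmp.
# O_EXCL makes the open fail if the name already exists, so a collision
# leads to a retry with a new name instead of overwriting another mail.
sub open_tmp_exclusive {
    my ($folder, $tries) = @_;
    $tries ||= 5;
    my $host = hostname() || 'localhost';
    my $pid  = $$;
    for my $attempt (1 .. $tries) {
        my ($secs, $usecs) = gettimeofday;
        my $rnd  = int(rand(1_000_000));
        my $name = "${folder}/tmp/${secs}.M${usecs}P${pid}R${rnd}.${host}";
        if (sysopen(my $fh, $name, O_WRONLY | O_CREAT | O_EXCL, 0600)) {
            return ($fh, $name);
        }
        # any error other than "file exists" is fatal
        die "Cannot create $name: $!" unless $!{EEXIST};
    }
    die "No unique tmp file name after $tries tries";
}

# demo in a throwaway Maildir so that no real mailbox is touched
my $maildir = tempdir(CLEANUP => 1);
for my $sub (qw(tmp new cur)) {
    mkdir "$maildir/$sub" or die "Cannot create $maildir/$sub: $!";
}
my ($fh, $tmpname) = open_tmp_exclusive($maildir);
print $fh "Subject: test\n\nbody\n";
close $fh or die "close failed: $!";
print "wrote $tmpname\n";
```

The file is created with mode 0600 right away, which also removes the need for a separate chmod call after opening.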

References

Courier-Maildrop: http://www.courier-mta.org/maildrop/

Maildir format as specified by Qmail / Cr.yp.to: https://cr.yp.to/proto/maildir.html

Postfix and Maildrop: http://www.postfix.org/MAILDROP_README.html
