Exim, as fans of Postfix, Qmail or, god help us, Sendmail, won't tell you, is a fantastic bit of kit.  The way it gives you absolute control over what happens to an email at every step of the way, the way it runs through a dynamic, living, breathing config file for each individual email just makes sense and is a joy to set up.

Well, some of this isn't strictly true: the learning curve is steep.  Today, I'm going to take you through my standard Exim config file and show how you have all the tools available for some seriously good, low-maintenance, low-CPU spam fighting.

 Download the whole config file: exim4.conf

Now, why would you want to do any specific spam fighting in your mailserver?  After all, you have a perfectly good spamchecker to pipe it out to.  Spamassassin, for instance, is brilliant software.  The answer is that if you do a few simple checks on the email first, you may not need to spend all those CPU cycles on properly checking it, and if you run a mailserver of any non-trivial size, this will save you many headaches.

Also, where an email is borderline spam, you may be able to tip it in the right direction, avoiding both false positives and false negatives.  I don't know about you, but I hate calls from my customers asking why a particular message has been classified as spam when it "clearly isn't"...

You may have noticed similarities in spam you receive: the computer it came from wasn't a mailserver, it certainly wasn't the right mailserver for the sender, it doesn't have reverse DNS, it's on a blacklist or two, it has no SPF record.  These are simple checks if you have the information at your fingertips.  Why go to all the expense of invoking another program to do all this for you?

Of course, I am not advocating not using a spamchecker.  But if all the simple tests you can do on information you have to hand fail, why carry on processing?  Also, if some of these tests fail, and they are different tests to your spam scanner, how can we make them count?

Let's step through the config and see how this works.  There will be a lot of other information on Exim along the way, to give background and context.

There are 7 sections in a typical config file:

  1. Globals
  2. ACLs
  3. Routers
  4. Transports
  5. Retry
  6. Rewrite
  7. Authenticators

We will look at the first four of these in order.

Globals

The first section of the file we will largely ignore; this is an overview of Exim with an in-depth look at spam fighting.  There are many great books on Exim and it would be pointless to attempt to reproduce this rich wealth of information here.  

But briefly, it starts with the SQL queries which make my Exim work as a virtual mailserver.  The users on my mailserver don't, in an operating system sense, exist, and they have a lot of attributes which the operating system wouldn't be able to tell us about even if they were real users.  Notice how, in the queries, there are non-SQL strings such as '${quote_mysql:$local_part}'. There are three clues here as to Exim's power. The first is the substitution itself. The second is the "$local_part" bit, which shows that Exim pulls out lots of useful things about the email itself and makes them available as variables; in this case, the first part of the delivery address. The third is a little more subtle - what is $local_part when you write the config? Answer: nothing. The config file is evaluated and expanded every time a new email is dealt with.

It is this constant re-evaluation and expansion that is the key to Exim's power.

Also worth noting is how easy it was to create these substitution macros - we now have, for instance, a macro MYSQL_Q_BOXNAME which expands from certain information we know about the email - the domain and user it is addressed to - to the location of that user's Mailbox (if the user exists!).

We'll see later how easy it is to use them, and how easy it is to create general variables to query and compare.

ACLs

The ACLs are where the magic happens. I'd love to meet Philip Hazel and shake his hand, as these ACLs are one of those "I could have thought of that" ideas that are the mark of genius. No one else had thought of it, but in hindsight it seems obvious: Exim treats each email as if it were a network packet passing through a firewall.

Let's recap an SMTP conversation. SMTP goes back many years and was implemented as a text protocol, I presume so that one could communicate with a mailserver using a telnet session. So if we telnet to a mailserver's port 25, this is what the conversation will look like:

  Mailserver: 220 mail.olicomber.co.uk ESMTP Exim 4.76 Sat, 22 Sep 2012 15:14:31 +0000
  Me: HELO mail.somedomain.com
  Mailserver: 250 mail.olicomber.co.uk Hello mail.somedomain.com [86.30.168.41]
  Me: MAIL FROM: someone@somedomain.com
  Mailserver: 250 OK
  Me: RCPT TO: oli@olicomber.co.uk
  Mailserver: 250 Accepted
  Me: DATA
  Mailserver: 354 Enter message, ending with "." on a line by itself
  Me: Hi Oli, how you doing?
  .
  Mailserver: 250 OK id=1TFRSP-0007Nr-LC
  Me: QUIT
  Mailserver: 221 mail.olicomber.co.uk closing connection
  Connection closed by foreign host.

Notice the commands being used: (connect), HELO, MAIL FROM, RCPT TO, DATA and QUIT. Each of these is a stage the message has to go through, and different amounts of information are available at each stage. (connect) is a kind of virtual stage - the server knows we have connected but we haven't asked for anything to happen yet.

At CONNECT stage, only our IP is known to the server. HELO doesn't give much more information to the server, though it is expected to be "well formed". MAIL FROM gives an email address - a "local part" and "domain name". And so on.

The ACLs correspond to these stages of the email. Each ACL acts as a firewall: a set of rules must be passed, one at a time. Each rule can ACCEPT or DENY the message. If it is ACCEPTED, we are done with this ACL and move on to the next. If it is DENIED, processing of this message will stop and an error will be shown to the person connected.
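To make the firewall analogy concrete, here is a rough Python model of how one ACL is walked. It is only a sketch and not Exim syntax: the Rule type and run_acl function are names invented for illustration. It also models the "warn" verb used later in this article, which never ends the ACL, and the fact that reaching the end of an ACL without an accept counts as a deny.

```python
from typing import Callable, NamedTuple


class Rule(NamedTuple):
    verb: str                         # "accept", "deny" or "warn"
    matches: Callable[[dict], bool]   # hypothetical stand-in for the rule's conditions


def run_acl(rules, message):
    """Evaluate rules in order and return "accept" or "deny" for this ACL."""
    for rule in rules:
        if not rule.matches(message):
            continue          # conditions not met: fall through to the next rule
        if rule.verb == "warn":
            continue          # warn rules may tweak things but never end the ACL
        return rule.verb      # first matching accept or deny decides
    return "deny"             # nothing accepted the message: implicit deny


rules = [
    Rule("warn", lambda m: True),
    Rule("deny", lambda m: m["ip"] == "10.0.0.1"),
    Rule("accept", lambda m: True),
]
print(run_acl(rules, {"ip": "192.0.2.1"}))
```

The order of the rules matters: swap the deny and accept rules above and the deny would never be reached.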

If all ACLs pass, the message is queued up for routing. This is covered later, but for now picture it as Exim happily sitting on the message; it may deliver it, forward it, send it back...or something even more exotic.

Time to look at the spam fighting specifics in the ACLs...

The basic gist is that we start a counter at 0, increment or decrement it according to various conditions as the mail goes through all ACLs, and finally deliver it to inbox or spamfolder depending on the value of the counter. If the counter is excessive at the end of the ACLs (some spammers are shockingly bad at their jobs, one wonders if their hearts are truly in it) then the message will be dropped and the spammer told to go away.

So, to start with, we initialise the counter in the CHECK_SMTP ACLs. There is nothing else we can usefully do in this ACL, so it is an ACCEPT rule:

  accept
    set acl_m69 = 0

What is interesting about the "acl_mXX" variables we set is that we can manipulate them in similar ways to variables in other languages, unlike the SQL expansion macros we saw above, and they will exist and hold their values until Exim has finished dealing with the entire message - including delivery.

Let's walk through the check_mail ACL. At this stage, we have an IP of the server wanting to send and an email address for the sender. We can do a surprising number of checks on this info.

Out there on the internet, where servers are large and sites are small, it is often the case that a single server will deal with all incoming and outgoing messages. The mailservers you will think of straight away are exceptions to this rule - gmail, hotmail, yahoo all use arrays of incoming and outgoing servers. But many, many other sites just use a single server.

A mailserver for a domain is identified by an MX Record. So, if we can establish that the computer talking to us is the mailserver for the domain the email is coming from, we can put a little more trust in it and not be so quick to mark its mail as spam. Remember, it is illegal to send spam, therefore spammers don't send from real addresses or mailservers.

So we test for this, using a rather complex couple of rules:

  warn
    set acl_m0 = ${if match { ${lookup dnsdb{A=\
                 ${sg {${lookup dnsdb{mx=$sender_address_domain}{$value}{}}}{[ \n]}{:} } }\
                 {$value}{}} }\
                 {$sender_host_address} \
                 {1}{0}}

  warn
    !condition = ${if eq {$acl_m0}{1}{1}{0}}
    # Increase the spam score if it's not the proper MX sending it
    set acl_m69 = ${eval:$acl_m69+20}
    add_header = MXCheck: Server is NOT MX for sender's domain, +20 Spam score

 

The first rule performs two nested lookups to establish the IP addresses of the MX servers for the sender's domain and compares them with the IP address of the connected computer. The second rule checks the result of the first: if the test fails, it adds 20 to the spam counter and adds a header to the message.

In this way, an email which has a "from" address of bob@examplesite.com but which is coming from the IP of someothersite.com's MX record will be hindered and is more likely to later be classified as spam.
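The logic of the MX test can be sketched in Python as follows. This is an illustration only: the function name mx_check is made up, the MX addresses are passed in as a list rather than looked up in DNS, and it uses exact membership where the Exim rule uses a regular expression match.

```python
MX_FAIL_PENALTY = 20  # the +20 from the ACL rule


def mx_check(sender_host_address, mx_addresses, score):
    """Return (is_mx, new_score, headers) for one connecting host.

    mx_addresses is assumed to be the already-resolved list of A records
    for the MX hosts of the sender's domain.
    """
    is_mx = sender_host_address in mx_addresses
    headers = []
    if not is_mx:
        score += MX_FAIL_PENALTY
        headers.append(
            "MXCheck: Server is NOT MX for sender's domain, +20 Spam score"
        )
    return is_mx, score, headers


# Documentation-only addresses, used here as stand-ins
print(mx_check("198.51.100.7", ["192.0.2.10", "192.0.2.11"], 0))
```

A host that is one of the listed MX addresses leaves the score untouched; anything else picks up the penalty and the explanatory header.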

Now we have a look at SPF. SPF is a special kind of DNS record: if we look up the sender's domain and it has an SPF record, it will tell us the addresses of all the servers which are allowed to send email for this domain. This is related to the first test, but is more forgiving: a domain may have several MX servers and several sending servers, which are all different. In this case, our first test above fails, but we are still talking to a valid server, so the mail must be kosher (or we can, in theory, prosecute the sender).

  warn
    set acl_m1 = ${run{/usr/bin/spfquery --ip $sender_host_address --id $sender_address}{1}{0}}

  warn
    condition = ${if eq {$acl_m1}{1}{1}{0}}
    set acl_m69 = ${eval:$acl_m69-30}
    add_header = SPFCheck: Server passes SPF test, -30 Spam score

If SPF passes, we subtract 30, thus undoing any negative weighting above due to it not being an MX server, and weighting the email in favour of not being spam. If you use this configuration, you may feel this weighting should be higher, and there is no reason it couldn't be. 30 is quite conservative.
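To see how the two sender checks combine, here is the arithmetic as a small Python sketch. The function name is invented for illustration; the +20 and -30 are the values from the rules above.

```python
MX_FAIL_PENALTY = 20   # added when the host is not an MX for the sender's domain
SPF_PASS_BONUS = 30    # subtracted when the SPF check passes


def score_after_sender_checks(is_mx, spf_pass):
    """Spam score after the MX and SPF rules, starting from 0."""
    score = 0
    if not is_mx:
        score += MX_FAIL_PENALTY
    if spf_pass:
        score -= SPF_PASS_BONUS
    return score


for is_mx in (True, False):
    for spf_pass in (True, False):
        print(is_mx, spf_pass, score_after_sender_checks(is_mx, spf_pass))
```

So a host that is not the MX but does pass SPF ends up at 0 + 20 - 30 = -10, slightly in credit rather than penalised.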

 

What I like to do at this point is have a little fun with the spammers. A spammer tries to send as fast as possible, often using a non-standard program to rattle through thousands or millions of email addresses. If we decide at this point the email might not be kosher, we throw in a couple of seconds delay. A lot of the worst spammers will disconnect if they get too delayed. 


  warn
    set acl_m2 = ${if or {{eq {$acl_m0}{1}} \
                          {eq {$acl_m1}{1}}}{1}{0}}

  accept
    condition = 1
    delay = ${if eq {$acl_m2}{1}{0s}{2s}}
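Read as a truth table, the delay rule says: no delay if either the MX test or the SPF test passed, two seconds otherwise. A minimal Python sketch of just that decision (the function name is hypothetical):

```python
def tarpit_delay_seconds(is_mx, spf_pass):
    """Delay applied before accepting: 0 if either check passed, else 2."""
    return 0 if (is_mx or spf_pass) else 2


for is_mx in (True, False):
    for spf_pass in (True, False):
        print(is_mx, spf_pass, tarpit_delay_seconds(is_mx, spf_pass))
```

Only the hosts that failed both checks are slowed down, so well-behaved senders never notice.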

Any well configured server will have reverse DNS.  A home PC will not.  There are, unfortunately, a lot of badly configured servers out there.  If no reverse DNS, add a bit more spam score: 

  warn 
    condition = ${if eq {$sender_host_name}{} {1}{0}}
    set acl_m69 = ${eval:$acl_m69+20}
    add_header = ReverseDNS: No reverse DNS for mailserver, +20 Spam score

We don't penalise too badly at this point due to the number of badly configured servers.  We are simply nudging the score up.  If the message eventually turns out to not be spammy at all, this won't matter.  But if it is spammy, it will tip it over the edge. 

And that is the end of the check_mail ACL. The ACCEPT above with the delay means that all mail reaching that point will move to the next stage of the SMTP protocol, RCPT TO, and its matching ACL.

The first thing to do is check some blacklists.  Spamhaus is a good one.

 

  warn 
    dnslists = zen.spamhaus.org
    set acl_m69 = ${eval:$acl_m69+30}
    add_header = BlacklistCheck: Blacklisted address, +30 Spam score

 

In actual fact, this check could possibly go in the check_mail ACL as it is based purely on the sender; I cannot remember why it is here.  A failure here does not block the mail.  I've been the victim of blacklists before and I think it is quite unfair to block based on this, when it is so easy to get on a blacklist.  So instead, we add some more spam score.

 

  warn
    set acl_m70 = ${lookup mysql {MYSQL_Q_AUTOWHITELIST}{$value}}
    set acl_m69 = ${eval:$acl_m69 + $acl_m70 + 20}
    add_header = Autowhitelister: awarded +${eval:$acl_m70+20} Spam score

 

Here's a fun one - autowhitelisting.  If you remember in the first section, there is a query denoted by the MYSQL_Q_AUTOWHITELIST macro which calls a function with the sender, recipient and IP address.  Every time this function is called, it decrements a row in a table which is keyed by sender, recipient and IP address.  This query returns the value and adds it to the spam score.

The net result is that the more times one person emails another, the more likely it is to get through spam scanning.
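The database function itself isn't shown here, so the following Python sketch is a guess at its behaviour, using an in-memory dictionary in place of the MySQL table. The class and function names are invented, and it assumes a new row starts at 0, each call decrements by 1, and the value returned is the one after decrementing. The + 20 comes from the ACL rule above.

```python
from collections import defaultdict


class AutoWhitelist:
    """In-memory stand-in for the table keyed by (sender, recipient, ip)."""

    def __init__(self):
        self._rows = defaultdict(int)   # assumption: unseen rows start at 0

    def record(self, sender, recipient, ip):
        """Decrement the row for this triple and return its new value."""
        key = (sender, recipient, ip)
        self._rows[key] -= 1            # assumption: step of 1 per message
        return self._rows[key]


def apply_autowhitelist(score, adjustment):
    """Mirror of the ACL arithmetic: acl_m69 + acl_m70 + 20."""
    return score + adjustment + 20


wl = AutoWhitelist()
score = 0
for _ in range(3):
    adjustment = wl.record("bob@examplesite.com", "oli@olicomber.co.uk", "192.0.2.10")
    print(adjustment, apply_autowhitelist(score, adjustment))
```

Under these assumptions, each further message from the same sender, to the same recipient, from the same IP earns a slightly smaller addition to the score than the one before.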


...and that is the end of the check_rcpt ACL.  Next, we move to the DATA part of the protocol, and run through the check_data ACL.


The DATA part is the body of the message.  This is looked at by very, very clever spamscanners - in my case, Spamassassin.  But only if it is worthy.


Spammers do not generally send large messages, so we do not spam scan if the message is larger than 200k.  This saves much expensive processing of a large message.  

Also, we look at our counter.  If our counter is already over 55, we know we will be classifying it as spam anyway, so simply do not bother scanning it. This is where the CPU cycles are greatly saved.

  warn 
    condition = ${if and { {< {$message_size}{200K}} { <{$acl_m69}{55}}}{1}{0}}
    spam = nobody:true/defer_ok
    set acl_m69 = ${eval:$acl_m69 + $spam_score_int}
    add_header = SpamScan: awarded +$spam_score_int Spam Score
    add_header = SpamScan: Report: $spam_report

  warn
    add_header = SpamTally: Final spam score: $acl_m69

The final WARN adds a header to the message so the recipient can easily see (if they "view source" on the message) why it was classed as spam.
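The gate in front of the scanner boils down to two comparisons, sketched here in Python with made-up function names. Exim reads 200K as 200 x 1024 bytes, and $spam_score_int is the SpamAssassin score multiplied by ten, so our threshold of 55 is on the same scale as 5.5 SpamAssassin points.

```python
SIZE_LIMIT = 200 * 1024   # Exim's 200K
SCORE_LIMIT = 55          # already spam at or beyond this, no need to scan


def should_scan(message_size, score):
    """True if the message is small enough and not already condemned."""
    return message_size < SIZE_LIMIT and score < SCORE_LIMIT


def add_scan_result(score, spam_score_int):
    """Add the scanner's integer score (SpamAssassin score x 10) to the tally."""
    return score + spam_score_int


score = 20
if should_scan(10_000, score):
    score = add_scan_result(score, 42)   # 42 stands for a SpamAssassin score of 4.2
print(score)
```

A message that arrives at this point with a counter of 55 or more skips the scanner entirely, which is where the savings come from.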

  drop
    condition = ${if >{$acl_m69}{300}{1}{0}}
    message = Your message was classified as SPAM.  Please add more content, cut down on HTML links, use fewer naughty words etc.  Also, ask your IT dept to make sure your mailserver has REVERSEDNS, SPF and is not on any black lists. Your score: $acl_m69	

Finally, if the message was stupidly spammy, simply drop it.  There's no point even filing it in the spam folder.

At this point, we know the mail is ok to deliver so we ACCEPT.

 

Routers

Routing in Exim is what happens if a mail successfully negotiates all the ACLs.  It determines where the mail is sent.

Each Router has various conditions attached to it.  If it matches the mail being processed, its action is taken.  This usually means invoking a transport to do the meat of the work.  Just like the ACLs, the Routers are run in order until one is matched.  If there is no match, the mail is rejected and a reject message will be sent back (unless configured not to).

A good example is the dnslookup router, which hands mail to the remote_smtp transport:

dnslookup:
  driver = dnslookup
  domains = ! +local_domains
  transport = remote_smtp
  ignore_target_hosts = 0.0.0.0 : 127.0.0.0/8
  no_more

This says that it is a dnslookup router.  If the domain is not in our list of local domains, and it is not targeted at localhost, then give it to the remote_smtp transport to deliver.  remote_smtp will do as it sounds and dial up the remote mailserver in question to deliver it.

We are going to look at the local delivery routers.

thisisspam:
  driver = accept
  domains = ${lookup mysql {MYSQL_Q_LOCAL}{$value}}
  local_parts = ! root
  condition = ${if >{$acl_m69}{55}{1}{0}}
  headers_add = Filed as SPAM - Final score ($acl_m69) greater than 55
  transport = spamfiler
  no_more

local_deliver:
  driver = accept
  domains = ${lookup mysql {MYSQL_Q_LOCAL}{$value}}
  local_parts = ! root
  transport = local_deliver
  no_more

The local_deliver router simply accepts the message if the user exists.  The MYSQL_Q_LOCAL macro will expand to a domain if the user exists.  The transport used will be local_deliver.

The thisisspam router is very similar, but has a condition attached.  The router matches iff the spamscore is greater than 55.  In this case, it hands the message to the spamfiler transport and adds a header to the message.
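Putting the drop rule from the ACLs together with these two routers, the fate of a message for an existing local user depends only on the final counter. As a Python sketch (the function name and the returned labels are invented for illustration):

```python
DROP_THRESHOLD = 300   # from the drop rule in the DATA ACL
SPAM_THRESHOLD = 55    # from the thisisspam router condition


def disposition(score):
    """Where a message for an existing local user ends up."""
    if score > DROP_THRESHOLD:
        return "drop"          # rejected during the SMTP conversation
    if score > SPAM_THRESHOLD:
        return "spam folder"   # thisisspam router, spamfiler transport
    return "inbox"             # local_deliver router and transport


for score in (-10, 55, 56, 300, 301):
    print(score, disposition(score))
```

Both comparisons are strict, so a score of exactly 55 still reaches the inbox and exactly 300 is still filed rather than dropped.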

 

Transports

 The transports are configurable modules which do some work with an email to deliver it locally or remotely or perhaps produce a vacation message.  There's all sorts that can be done here, and I recommend a good read around the Exim website.

Unlike ACLs and Routers, Transports do not run in order; they are called by Routers.

Here we will look at the two local delivery transports.

spamfiler:
  driver = appendfile
  directory = /home/email/${lookup mysql {MYSQL_Q_BOXNAME}{$value}{root}}/.Spam
  user = vmail
  group = vmail
  maildir_format
  mode = 0660
  directory_mode = 0770
  create_directory

local_deliver:
  driver = appendfile
  directory = /home/email/${lookup mysql {MYSQL_Q_BOXNAME}{$value}{root}}
  user = vmail
  group = vmail
  maildir_format
  mode = 660
  directory_mode = 0770
  create_directory

I won't dwell too much on what all this means; I just wanted to highlight the difference between them.  The local_deliver transport uses appendfile to place the email in the top of the Mailbox, which is the inbox, whereas the spamfiler places the email in the .Spam subfolder.

Well, that's it.  I hope that has given you some good ideas about the insanely useful role your mailserver can take in weeding the ham from the spam.  

If it has converted you to Exim, so much the better :-)

Enjoy.  

 
