The milter package is a flexible milter built using pymilter that emphasizes authentication. It helps prevent forgery of legitimate mail, and most spam is rejected as forged, or because of poor reputation when not forged.

These are the policies implemented by the bms.py application in the milter package. The milter and Milter modules in the pymilter package do not implement any policies themselves.

Classify connection

When the SMTP client connects, the connection IP address is saved for later verification, and the connection is classified as INTERNAL or EXTERNAL by matching the IP address against the internal_connect configuration. IP addresses with no PTR record, and those whose PTR names look like the kind assigned to dynamic IPs (as determined by a heuristic algorithm), are flagged as DYNAMIC (logged as DYN). IPs that match the trusted_relay configuration are flagged as TRUSTED.

Examples from the log file (not the SMTP error message returned):

2005Jul29 13:56:53 [71207] connect from p50863492.dip0.t-ipconnect.de at ('80.134.52.146', 1858) EXTERNAL DYN
2005Jul29 18:10:15 [74511] connect from foopub at ('1.2.3.4', 46513) EXTERNAL TRUSTED
2005Jul29 14:41:00 [71805] connect from foobar at ('192.168.0.1', 41205) INTERNAL
2005Jul29 14:41:15 [71806] connect from cncln.online.ln.cn at ('218.25.240.137', 35992) EXTERNAL

Certain obviously evil PTR names are blocked at this point: "localhost" (when IP is not 127.*) and ".".

2005Jul29 14:49:50 [71918] connect from localhost at ('221.132.0.6', 50507) EXTERNAL
2005Jul29 14:49:50 [71918] REJECT: PTR is localhost
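
The classification step can be sketched roughly as follows. This is only an illustration: the glob-style matching of internal_connect and trusted_relay, the function names, the treatment of a bracketed IP as "no PTR", and especially the toy looks_dynamic() test are assumptions made for the example, not the code in bms.py or the real heuristic.

```python
import fnmatch


def looks_dynamic(ptr):
    """Toy stand-in (an assumption) for the real dynamic-name heuristic."""
    if not ptr or ptr.startswith("["):
        return True  # assumed convention: empty or bracketed IP means no PTR
    first_label = ptr.split(".")[0]
    # Many digits in the host label suggest a pool-assigned name.
    return sum(ch.isdigit() for ch in first_label) >= 4


def matches(ip, patterns):
    """True if ip matches any glob pattern (pattern syntax is an assumption)."""
    return any(fnmatch.fnmatchcase(ip, pat) for pat in patterns)


def classify(ip, ptr, internal_connect=(), trusted_relay=()):
    """Return the list of flags for a connection, in log order."""
    flags = ["INTERNAL" if matches(ip, internal_connect) else "EXTERNAL"]
    if matches(ip, trusted_relay):
        flags.append("TRUSTED")
    if looks_dynamic(ptr):
        flags.append("DYN")
    return flags


def bad_ptr(ip, ptr):
    """True for the PTR names blocked at connect time."""
    if ptr == ".":
        return True
    return ptr == "localhost" and not ip.startswith("127.")
```

With internal_connect set to ['192.168.*'] and trusted_relay set to ['1.2.3.4'], this sketch assigns the same flags as the four connect lines in the log examples above.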

HELO Check

The HELO name provided by the client is saved for later verification (for example by SPF). We could validate the HELO at this point by verifying that an A record for the HELO name matches the connect IP. However, currently we only block certain obvious problems. HELO names that look like an IPv4 address and ones that match the hello_blacklist configuration are immediately rejected. The hello_blacklist typically contains the current MTA's own HELO name or email domains. Clients that attempt to skip HELO are immediately rejected.

2005Jul29 18:10:15 [74512] hello from example.com
2005Jul29 18:10:15 [74512] REJECT: spam from self: example.com
2005Jul29 18:17:09 [74581] hello from 80.191.244.69
2005Jul29 18:17:09 [74581] REJECT: numeric hello name: 80.191.244.69
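
A minimal sketch of these HELO checks, assuming a case-insensitive exact match against hello_blacklist. The function name, the regular expression, and the "missing hello" wording are made up for the example; the other two reasons copy the log lines above.

```python
import re

# Dotted quad only; this pattern is an assumption, not the one used by bms.py.
IP4_RE = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")


def check_helo(helo, hello_blacklist=()):
    """Return a rejection reason, or None if the HELO name is acceptable."""
    if not helo:
        return "missing hello"  # illustrative wording, client skipped HELO
    if IP4_RE.match(helo):
        return "numeric hello name: " + helo
    if helo.lower() in {name.lower() for name in hello_blacklist}:
        return "spam from self: " + helo
    return None
```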

MAIL FROM Check

Before calling our milter, sendmail checks a DNS blacklist to block banned sender domains. We never see a blocked domain.

The MAIL FROM address is saved for possible use by the smart-alias feature. First, if internal_domains is defined, it is used for a simple screening. If the MAIL FROM for an INTERNAL connection is NOT in internal_domains, the message is rejected (the PC is most likely infected and attempting to send out spam). If the MAIL FROM for an EXTERNAL connection IS in internal_domains, the message is immediately rejected. This is quick and effective for most small company MTAs. For more complex mail networks this screening is too simplistic, and internal_domains should be left undefined. SPF will handle the complex cases.
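
The internal_domains screening amounts to a small truth table, sketched here. The function name is hypothetical, the address is assumed to arrive without angle brackets, and letting the null sender through this particular screen is an assumption of the example.

```python
def screen_mail_from(mail_from, internal, internal_domains=None):
    """Return True if MAIL FROM passes the internal_domains screening.

    internal is True for an INTERNAL connection, False for EXTERNAL.
    """
    if not internal_domains or not mail_from:
        return True  # screening disabled, or null sender (assumed to pass)
    domain = mail_from.rpartition("@")[2].lower()
    is_local = domain in {d.lower() for d in internal_domains}
    # INTERNAL connections must use a local domain; EXTERNAL ones must not.
    return is_local if internal else not is_local
```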

wiretap

The wiretap feature can screen and/or monitor mail to/from certain users. If the MAIL FROM is being wiretapped, the recipients are altered accordingly.

SPF check

The MAIL FROM, connect IP, and HELO name are checked against any SPF records published via DNS for the alleged sender (MAIL FROM) to determine the official SPF policy result. The official SPF result is then logged in the Received-SPF header field, but certain results are subjected to further processing to create an effective result for policy purposes.

If the official result is 'none', we try to turn it into an effective result of 'pass' or 'fail'. First, we check for a local substitute SPF record under the domain defined in the [spf]delegate configuration. It is often useful to add local SPF records for correspondents that are too clueless to add their own. If there is no local substitute, we use a "best guess" SPF record of "v=spf1 a/24 mx/24 ptr" for MAIL FROM or "v=spf1 a/24 mx/24" for HELO. In addition, a HELO that is a subdomain of MAIL FROM and resolves to the connect IP results in an effective result of 'pass'.

If there is no local SPF record, and the effective result is still not 'pass', we check for either a valid HELO name or a valid PTR record for the connect IP. A valid HELO or PTR cannot look like a dynamic name as determined by the heuristic in Milter.dynip.

If HELO has an SPF record, and the result is anything but pass, we reject the connection:

2005Jul30 19:45:16 [93991] connect from [221.200.41.54] at ('221.200.41.54', 3581) EXTERNAL DYN
2005Jul30 19:45:18 [93991] hello from adelphia.net
2005Jul30 19:45:19 [93991] mail from <wendy.stubbsua@link-it.com> ()
2005Jul30 19:45:19 [93991] REJECT: hello SPF: fail 550 access denied

Note that HELO does not have any forwarding issues like MAIL FROM, and so any result other than 'pass' or 'none' should be treated like 'fail'.

Only if nothing about the SMTP envelope can be validated does the effective result remain 'none'. I call this the "3 strikes" rule.

If the official result is 'permerror' (a syntax error in the sender's policy), we use the 'lax' option in pyspf to try various heuristics to guess what they really meant. For instance, the invalid mechanism "ip:1.2.3.4" is treated as "ip4:1.2.3.4". The result of lax processing is then used as the effective result for policy purposes.

With an effective SPF result in hand, we consult the sendmail access database to find our receiver policy for the sender.

REJECT Reject the sender with a 550 5.7.1 SMTP code. The SMTP rejection includes a detailed description of the problem.

CBV Do a Call Back Validation by connecting to an MX of the sender and checking that using the sender as the RCPT TO is not rejected. We quit the CBV connection before actually sending a message. If the CBV is rejected, our SMTP connection is rejected with the same error code and message. CBV results are cached.

DSN Do a Call Back Validation by connecting to an MX of the sender and checking that using the sender as the RCPT TO is not rejected. Unlike a CBV, we continue on to data and send a detailed message explaining the problem. This can be useful for reporting PermError or SoftFail to the sender. Keep in mind that for any result other than 'pass', the sender could be forged, and your DSN could annoy the wrong person. However, a SoftFail result is requesting such feedback for debugging and a PermError result needs to be fixed by the sender ASAP whether forged or not. DSN results are cached so that senders are annoyed only weekly.

OK Accept the sender. The message may still be rejected via reputation or content filtering.

SPF policy syntax

First, the full sender is checked:

SPF-Fail:abeb@adelphia.net     DSN

This says to accept mail from that adelphia.net user despite the SPF fail, but only after annoying them with a DSN about their ISP's broken policy.

If there is no match on the full sender, the domain is checked:

SPF-Neutral:aol.com     REJECT

This says to reject mail from AOL with an SPF result of neutral. This means AOL users can't use their AOL address with another mail service to send us mail. This is good because the other mail service is likely a badly configured greeting card site or a virus.

Finally, a default policy for the result is checked. While there are program defaults, you should have defaults in the access database for SPF results:

SPF-Neutral:            CBV
SPF-Softfail:           DSN
SPF-PermError:          DSN
SPF-TempError:          REJECT
SPF-None:               REJECT
SPF-Fail:               REJECT
SPF-Pass:               OK
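
The lookup order (full sender, then domain, then the default for the result) can be sketched as below. A plain dict stands in for the sendmail access database, keys are compared case-insensitively as an assumption of this example, and lookup_policy and its fallback parameter are hypothetical names.

```python
# Entries copied from the examples above; the dict is only a stand-in
# for the real access database.
ACCESS = {
    "SPF-Fail:abeb@adelphia.net": "DSN",
    "SPF-Neutral:aol.com": "REJECT",
    "SPF-Neutral:": "CBV",
    "SPF-Softfail:": "DSN",
    "SPF-PermError:": "DSN",
    "SPF-TempError:": "REJECT",
    "SPF-None:": "REJECT",
    "SPF-Fail:": "REJECT",
    "SPF-Pass:": "OK",
}


def lookup_policy(access, result, sender, fallback=None):
    """Return the receiver policy for an effective SPF result and sender."""
    table = {key.lower(): value for key, value in access.items()}
    domain = sender.rpartition("@")[2]
    # Most specific key first: full sender, then domain, then bare result.
    for who in (sender, domain, ""):
        key = ("SPF-%s:%s" % (result, who)).lower()
        if key in table:
            return table[key]
    return fallback  # stands in for the program default
```

For example, lookup_policy(ACCESS, "neutral", "someone@aol.com") finds the aol.com entry and returns "REJECT", while any other neutral sender falls through to the "SPF-Neutral:" default of "CBV".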

Reputation

If the sender has not been rejected by this point, and if a GOSSiP server is configured, we consult GOSSiP for the reputation score of the sender and SPF result. The score is a number from -100 to 100 with a confidence percentage from 0 to 100. A really bad reputation (less than -50 with confidence greater than 3) is rejected. Note that the reputation is tracked independently for each SPF result and sender combination. So aol.com:neutral might have a really bad reputation, while aol.com:pass would be ok. Furthermore, when a sender finally publishes an SPF policy and starts getting SPF pass, their reputation is effectively reset.
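
The rejection threshold and the per-result keying can be illustrated with a small sketch. The dict of (score, confidence) pairs stands in for a GOSSiP query, and the function names and the "domain:result" key format as a Python string are assumptions of the example.

```python
def reputation_key(domain, spf_result):
    """Reputation is tracked per sender and SPF result combination."""
    return "%s:%s" % (domain.lower(), spf_result.lower())


def should_reject(scores, domain, spf_result):
    """Apply the 'really bad reputation' rule from the text above.

    scores maps a reputation key to (score, confidence); an unknown
    sender is treated as score 0 with confidence 0 (an assumption).
    """
    score, confidence = scores.get(reputation_key(domain, spf_result), (0, 0))
    return score < -50 and confidence > 3
```

Because the key includes the SPF result, a bad record under aol.com:neutral has no effect on a lookup for aol.com:pass.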

Whitelists and Blacklists

The administrator can whitelist or blacklist senders and sending domains by appending them to ${datadir}/auto_whitelist.log or ${datadir}/blacklist.log respectively. In addition, recipients of internal senders (except for automatic replies like vacation messages and return receipts) are automatically whitelisted for 60 days, and senders that fail CBV or DSN checks are automatically blacklisted for 30 days. Mail from whitelisted and blacklisted senders is used to automatically train the Bayesian content filter before being delivered or rejected, respectively.

Real Soon Now users will be able to maintain their own whitelist and blacklist that applies only when they are the recipient.

Recipient Check

When the pysrs package is installed and configured, outgoing mail is "signed" by adding a crypto-cookie to MAIL FROM. A legitimate DSN (null MAIL FROM) is only ever sent to an address that was used as a MAIL FROM, so a DSN without a validated cookie in RCPT is immediately rejected. Forwarded domains can have a list of valid recipients configured, and invalid recipients are rejected. The MTA rejects invalid local RCPTs. Four or more invalid RCPTs cause the IP to be blacklisted.

Content Filter

Most messages have been rejected or delivered by now, but spammers are always finding new places to send their junk from. For instance, we get around 10000 emails a day, of which around 500 are from first time spam senders. A Bayesian filter is trained by the whitelists and blacklists, and scores the message. What is likely spam is either rejected or quarantined. If the sender is an effective SPF pass, then they get a DSN notifying them that their message has been quarantined. (A DSN failure gets the sender auto blacklisted.) Else, if the reject_spam option is set, the message is rejected. Otherwise, a CBV is done (failure gets the sender auto blacklisted) and the message is silently quarantined.

Normally, you don't want email messages to silently disappear into a black hole, so you should set the reject_spam option. However, if you don't want your correspondent's email to get rejected, you can check your quarantine frequently instead.
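
The handling of a message that the filter scores as likely spam can be summarised in a few lines. The function name and the action labels are invented for the example; only the order of the tests follows the description above.

```python
def spam_disposition(is_spam, spf_pass, reject_spam):
    """Return the list of actions taken for a scored message."""
    if not is_spam:
        return ["deliver"]
    if spf_pass:
        # Sender is authenticated, so it is safe to notify them.
        return ["dsn", "quarantine"]
    if reject_spam:
        return ["reject"]
    # No notification: verify the sender, then quarantine silently.
    return ["cbv", "quarantine"]
```

In both the "dsn" and "cbv" cases, a failed callback gets the sender auto blacklisted, which this sketch does not model.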

Honeypot

You can also blacklist recipients by listing them as aliases of the 'honeypot' dspam user. These are collectively called the honeypot. Any email to these recipients is used to train the spam filter as spam and chalk up a reputation demerit for the sender, then discarded. It might be a good idea to blacklist the sender if it has SPF pass as well, but I'm afraid of accidents.

Reputation

Reputation is tracked by sending domain and effective SPF result. The GOSSiP server tracks the spam/ham status of the last 1024 messages for each domain:result combination. When the server is queried during the SMTP envelope phase (MAIL FROM), it also queries any configured peers, and the scores are combined. Domains with a history of spam for a given SPF result are rejected at MAIL FROM. The GOSSiP system has a command line utility to reset (delete) a reputation for cases where a sender that was infected with malware is repaired. In addition, the confidence score of a reputation decays with time, so a bad sender will eventually be able to try again without manual intervention.
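
The sliding window of recent messages per domain:result combination can be modelled with a bounded deque. This is a toy model only: the class and method names are hypothetical, and the scoring formula is invented for illustration, since the actual GOSSiP formula, peer combination, and confidence decay are not described here.

```python
from collections import defaultdict, deque

WINDOW = 1024  # number of recent messages remembered per combination


class ReputationStore:
    """Toy in-memory model of per domain:result spam/ham history."""

    def __init__(self, window=WINDOW):
        # Each deque silently drops its oldest entry once it is full.
        self._history = defaultdict(lambda: deque(maxlen=window))

    def feedback(self, domain, spf_result, is_spam):
        """Record the spam/ham status of one message."""
        self._history["%s:%s" % (domain, spf_result)].append(bool(is_spam))

    def score(self, domain, spf_result):
        """Return a score from -100 (all spam) to 100 (all ham).

        The formula is an assumption made for this sketch.
        """
        history = self._history.get("%s:%s" % (domain, spf_result))
        if not history:
            return 0.0
        spam = sum(history)
        ham = len(history) - spam
        return 100.0 * (ham - spam) / len(history)

    def reset(self, domain, spf_result):
        """Delete a reputation, as the command line utility does."""
        self._history.pop("%s:%s" % (domain, spf_result), None)
```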