DCC – Distributed Checksum Clearinghouse

This thing is the ultimate spamkiller. I love it. The official web page is here.

In just one week I received 74 spams, and DCC correctly filtered 73 of them. No non-spam emails were incorrectly filtered.

It creates a fingerprint of every email you receive, and then collaborates with thousands of other people using DCC to see if they got emails with the same fingerprint. Based on how many people matched, and how closely they matched, DCC attaches a header to your message reporting those counts, which tell you how likely the message is to be bulk mail. You can then add filter rules to your favorite email client that move messages above a certain count into a “junk” folder.
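To make the fingerprint idea concrete, here is a toy sketch in shell. It is not DCC's actual algorithm (DCC computes its own exact and fuzzy checksums). It only assumes that a fingerprint is a checksum over the message body with whitespace and letter case normalized, using the standard cksum utility. The function name toy_fingerprint is made up for illustration.

```shell
# Toy illustration only: NOT the checksum DCC really uses.
# Reads a message body on stdin, drops all whitespace, lowercases it,
# and prints a CRC of what is left.
toy_fingerprint() {
    tr -d '[:space:]' | tr '[:upper:]' '[:lower:]' | cksum | cut -d ' ' -f 1
}

# Two copies of the same spam that differ only in spacing and case
a=$(printf 'Buy  CHEAP pills\nnow!\n' | toy_fingerprint)
b=$(printf 'buy cheap pills now!' | toy_fingerprint)
[ "$a" = "$b" ] && echo "same fingerprint"
```

The point is that small cosmetic changes to a message should not change its fingerprint, which is what lets many recipients agree that they all got "the same" email.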

Requirements

There are two requirements:

  • Your mail server has to be running some flavor of UNIX

  • Either you have .procmailrc access on your mail server, or you can convince the administrator of your mail server to install DCC for you.

Step 1: Set up DCC

Download and unpack DCC.

wget http://www.rhyolite.com/anti-spam/dcc/source/dcc-dccproc.tar.Z
zcat dcc-dccproc.tar.Z | tar xvf -

Configure and compile it. You can replace ~/bin/dcc with something like /usr/local if you want DCC to be available to everybody on your system.

mkdir -p ~/bin/dcc
cd dcc-dccproc-1*
./configure --prefix=$HOME/bin/dcc/ --bindir=$HOME/bin/dcc/ --disable-sys-inst
make install

Finally, tell DCC which servers to use to swap fingerprints with other users.

mkdir ~/.dcc
~/bin/dcc/cdcc -h ~/.dcc "new map"
~/bin/dcc/cdcc -h ~/.dcc "add dcc.etherboy.com"
~/bin/dcc/cdcc -h ~/.dcc "add dcc.meer.net"
~/bin/dcc/cdcc -h ~/.dcc "add dcc.rhyolite.com"
~/bin/dcc/cdcc -h ~/.dcc "add dcc.rollanet.com"

Step 2: Have DCC fingerprint and mark your incoming mail

Add these lines to your ~/.procmailrc:

########### Add X-DCC header
:0 f
| ~/bin/dcc/dccproc -R -h ~/.dcc

Now send yourself an email and check to see if a new header appeared on it, starting with X-DCC. The header will look something like this:

X-DCC-sackHeads-Metrics: alonzo.megacz.com 1012; Body=1 Fuz1=4 Fuz2=1

The only important parts are “Fuz1=X” and “Fuz2=X”: these numbers indicate how many other people received the same message as you. In my experience, anything above 3 is spam.
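If you want to look at those two numbers from a script, something like the following should do. It is only a sketch: the function name dcc_fuz_counts is made up, and it assumes the header has the same layout as the sample above, with each value being either a number or the word "many".

```shell
# Print the Fuz1 and Fuz2 values from an X-DCC header line passed as $1.
# Assumes "Fuz1=" appears before "Fuz2=" as in the sample header.
dcc_fuz_counts() {
    printf '%s\n' "$1" |
        sed -n 's/.*Fuz1=\([0-9a-z]*\).*Fuz2=\([0-9a-z]*\).*/\1 \2/p'
}

dcc_fuz_counts 'X-DCC-sackHeads-Metrics: alonzo.megacz.com 1012; Body=1 Fuz1=4 Fuz2=1'
# prints: 4 1
```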

Step 3: Tell your mail client to filter the spam

My advice is to set your mail client to move anything with a Fuz1/Fuz2 value at or above 4 into a “spam” folder, and check that folder about once a week.

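I use the following rule with procmail, although you can do the same thing in a mail client such as Eudora or Outlook. The last line pipes the message to a delivery script on my own system; substitute whatever delivers mail to your junk folder.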

:0
*X-DCC-[^\:]*:.*Fuz[12]=(many|[4-9]|[1-9][0-9])
|/home/megacz/bin/fixdeliver.pl user.megacz.junk
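Before trusting a rule like this with real mail, it can help to try the pattern on a few sample headers. The sketch below does that with grep -E instead of procmail. It assumes the two treat this particular pattern the same way (one known difference: procmail ignores case by default, grep does not). The names DCC_JUNK_RE and would_junk are made up for illustration.

```shell
# Same pattern as the procmail condition, written for grep -E.
# Assumption: procmail and grep -E agree on this pattern.
DCC_JUNK_RE='X-DCC-[^:]*:.*Fuz[12]=(many|[4-9]|[1-9][0-9])'

# Exit status 0 if the header line in $1 would be junked, 1 otherwise.
would_junk() {
    printf '%s\n' "$1" | grep -Eq "$DCC_JUNK_RE"
}

would_junk 'X-DCC-sackHeads-Metrics: alonzo.megacz.com 1012; Body=1 Fuz1=4 Fuz2=1' && echo junk
would_junk 'X-DCC-sackHeads-Metrics: alonzo.megacz.com 1012; Body=1 Fuz1=2 Fuz2=1' || echo keep
```

The first call prints "junk" because Fuz1 is 4; the second prints "keep" because both counts are below 4.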

The only catch is that any filter rules for mailing lists you're on must come before the rule that junks your spam; otherwise your mailing list mail will end up in the spam folder. Technically, DCC only identifies bulk mail, and spam is unwanted bulk mail, so you have to tell your filters which kinds of bulk mail you actually want.
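Why the order matters can be sketched in a few lines of shell. This is not a replacement for your real filter rules, just an illustration: pick_folder is a made-up function, the folder names are placeholders, and it assumes the mailing lists you want can be recognized by a List-Id header, which may not be true for every list.

```shell
# Decide where a message goes, checking mailing lists BEFORE the DCC counts.
# Reads the message headers on stdin and prints a folder name.
# Assumptions: wanted list mail carries a "List-Id:" header, and the
# folder names (lists, junk, inbox) are placeholders.
pick_folder() {
    headers=$(cat)
    if printf '%s\n' "$headers" | grep -Eiq '^List-Id:'; then
        echo lists      # wanted bulk mail, never reaches the DCC test
    elif printf '%s\n' "$headers" |
            grep -Eq '^X-DCC-[^:]*:.*Fuz[12]=(many|[4-9]|[1-9][0-9])'; then
        echo junk       # bulk mail nobody asked for
    else
        echo inbox
    fi
}

printf 'List-Id: <dev.example.org>\nX-DCC-sackHeads-Metrics: alonzo.megacz.com 1012; Body=1 Fuz1=many Fuz2=many\n' | pick_folder
# prints: lists
```

If the two tests were swapped, the same message would land in junk, because a busy mailing list looks exactly like spam to DCC.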