November 17th, 2006


senji
2006/11/17 23:26:00 - Spam, and evil shell hackery.


This is a graph of spam incidence upon mail arriving at one of my mail addresses at ysolde.ucam.org. It doesn't include spam received by various kinds of automated facilities, mailing lists (except where I receive the mail), other users, or the few blacklisted/actively SAUCEd source-addresses. You may need to view the image on its own at full size — the shrunk version above is a link to the full image.

This is how I produced it.

; cd ~/mh/myspam/
I use nmh, which means that my mail is stored one-mail-per-file in directories.
; find . -type f | xargs grep -ah ^Delivery-Date: | sed 's/Delivery-Date: ... \(... ..\).* \(....\)/\2 \1/' | sed 's/Jan/01/;s/Feb/02/;s/Mar/03/;s/Apr/04/;s/May/05/;s/Jun/06/;s/Jul/07/;s/Aug/08/;s/Sep/09/;s/Oct/10/;s/Nov/11/;s/Dec/12/' > ~/tmp/intermediate
Pull out the Delivery-Dates (which are added by exim) and munge them into an entirely numerical format.
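To see what the two sed passes do, here is a small sketch run on a single header line. The sample line is an assumption: its layout (weekday, month, day, time, year) is inferred from the sed pattern, not copied from a real message, and munge_date is just a name made up for this illustration.

```shell
#!/bin/sh
# Hypothetical sample header; the layout is guessed from the sed pattern.
sample='Delivery-Date: Fri Nov 17 23:26:00 2006'

munge_date() {
    # First pass: drop the weekday and time, reorder to "YYYY Mon DD".
    # Second pass: swap the month name for its two-digit number.
    sed 's/Delivery-Date: ... \(... ..\).* \(....\)/\2 \1/' |
    sed 's/Jan/01/;s/Feb/02/;s/Mar/03/;s/Apr/04/;s/May/05/;s/Jun/06/;s/Jul/07/;s/Aug/08/;s/Sep/09/;s/Oct/10/;s/Nov/11/;s/Dec/12/'
}

printf '%s\n' "$sample" | munge_date   # prints: 2006 11 17
```

A line that doesn't fit the pattern passes through the first sed unchanged, so it is worth eyeballing the intermediate file before trusting the counts.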
; sort ~/tmp/intermediate | uniq -c > ~/tmp/uniqued
Sort the dates, and turn it into a list of how many times each date appears.
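For a feel of what that file looks like, here is a sketch on three made-up dates. The function name count_dates and the sample dates are assumptions for illustration. The amount of padding uniq puts before the count varies between systems, which is why a later step strips everything before the first digit.

```shell
#!/bin/sh
count_dates() {
    # uniq -c only merges adjacent duplicates, hence the sort first.
    sort | uniq -c
}

# Made-up input: two mails on the 17th, one on the 16th.
printf '%s\n' '2006 11 17' '2006 11 16' '2006 11 17' | count_dates

# Keeping only the leading count on each line prints 1 and then 2.
printf '%s\n' '2006 11 17' '2006 11 16' '2006 11 17' | count_dates |
    sed 's/[^0-9]*\([0-9]*\).*/\1/'
```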
; cd ~/tmp
Change into the temporary directory to work.
; for x in $(seq 2001 2006); do for y in $(seq -w 1 12); do case $y in 09|04|06|11) q=30;; 02) q=28; if [ $x -eq 2004 ]; then q=29; fi;; *) q=31;; esac; for z in $(seq -w 1 $q); do echo "$x $y $z"; done; done; done | (while read foo; do case $foo in *01) echo -n 'X ';; *) echo -n 'Y ';; esac; (grep "$foo" uniqued || echo 0) | sed 's/[^0-9]*\([0-9]*\).*/\1/'; done) > raw
Produce a list of all dates in the target range, then look for each of those dates in the previously generated file. Extract the spam count if we found it, output 0 if we didn't. Also, if it's the first day of the month output X, otherwise Y.
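The one-liner hard-codes 2004 as the only leap year, which is right for 2001 to 2006. A slightly more general sketch of the month-length lookup follows, assuming the usual Gregorian rule; days_in_month is a name invented here, not something from the original command.

```shell
#!/bin/sh
days_in_month() {
    # $1 = year, $2 = month (1-12, a leading zero is allowed)
    year=$1
    month=${2#0}   # strip one leading zero so 09 becomes 9
    case $month in
        4|6|9|11) echo 30 ;;
        2)
            # Leap year: divisible by 4, except centuries not divisible by 400.
            if [ $((year % 4)) -eq 0 ] && { [ $((year % 100)) -ne 0 ] || [ $((year % 400)) -eq 0 ]; }; then
                echo 29
            else
                echo 28
            fi ;;
        *) echo 31 ;;
    esac
}

# Sanity check: total days across the range used for the graph.
total=0
for y in 2001 2002 2003 2004 2005 2006; do
    for m in 01 02 03 04 05 06 07 08 09 10 11 12; do
        total=$((total + $(days_in_month "$y" "$m")))
    done
done
echo "$total"   # prints: 2191
```

The total matches the 2191 lines that wc reports for the finished grid, one row per day.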
; < raw (while read foo bar; do case $foo in X) echo -n '1 1 1 1 1 1 1 1 ';; Y) echo -n '0 0 0 0 0 0 0 1 ';; esac; case $bar in 0) ;; *) for y in $(seq 1 $bar); do echo -n '1 '; done; ;; esac; for y in $(seq 1 $(( 255 - $bar)) ); do echo -n '0 '; done; echo; done) > out
If we have an X then output 8 "1 "s, otherwise 7 "0 "s and a "1 " (these make up the beginnings of the axis markings). Then, for each spam incidence output a "1 ", and fill the line up with "0 "s to make 255 in total. Save all this in a file called out.
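The same row layout can be sketched as a small function, which makes the 8 + 255 column arithmetic easier to check. This is an illustration, not the command that made the graph: render_row is a hypothetical name, and unlike the one-liner it caps the bar at 255 pixels so that every row has the same width.

```shell
#!/bin/sh
render_row() {
    # $1 = marker (X for the first of the month, anything else otherwise)
    # $2 = spam count for that day
    marker=$1
    count=$2
    case $marker in
        X) row='1 1 1 1 1 1 1 1 ' ;;   # long axis tick
        *) row='0 0 0 0 0 0 0 1 ' ;;   # axis line only
    esac
    i=0
    while [ "$i" -lt 255 ]; do
        # One black pixel per spam, white padding for the rest of the row.
        if [ "$i" -lt "$count" ]; then row="${row}1 "; else row="${row}0 "; fi
        i=$((i + 1))
    done
    printf '%s\n' "$row"
}

# Count the pixels in one row: 8 axis columns plus 255 data columns.
set -- $(render_row Y 3)
echo "$#"   # prints: 263
```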
; wc -l out
Find the number of lines in the output file, so we know the size of the image (the width is 263: 8 axis columns plus 255 data columns). This turns out to be 2191 lines.
; cat > header <<EOF
P1
# This pbm file was produced with lots of evil shell hackery
263 2191
EOF
This is the bit of header that turns a grid of "0 "s and "1 "s into a PBM format image.
; cat header out > file.pbm
Piece the two bits together.
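If typing the dimensions by hand feels fragile, the header can be derived from the grid itself. The sketch below assumes every row has the same number of pixels and only looks at the first one. make_pbm and the tiny 3x2 grid are made up for the example.

```shell
#!/bin/sh
make_pbm() {
    # $1 = file holding rows of space-separated 0/1 pixels
    data=$1
    height=$(wc -l < "$data" | tr -d ' ')
    width=$(head -n 1 "$data" | wc -w | tr -d ' ')
    # Plain PBM: magic number P1, then width and height, then the pixels.
    printf 'P1\n%s %s\n' "$width" "$height"
    cat "$data"
}

# Hypothetical two-row grid standing in for the real "out" file.
tmp=$(mktemp)
printf '%s\n' '1 0 1 ' '0 1 0 ' > "$tmp"
make_pbm "$tmp"
rm -f "$tmp"
```

Run against the real grid, something like make_pbm out > file.pbm should give the same 263 2191 header, minus the comment line.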

The slightly longer notches for the 6 and 12 month breaks and the text were added later in the GIMP, and the image was rotated. I also removed December 2006, because it hasn't even started yet, and converted the file to PNG format which is more widely recognised.
Unfortunately it's already out of date...
Current Mood: geeky
Current Music: Elgar — Symphony Number 1
Entry Tags: geeky, random, shell hackery, spam


Spam, and evil shell hackery. - Squaring the circle...
