Training your retriever

Joe Gregorio

I have been using Mozilla 1.4 for the past two weeks and have been very happy with its spam filtering capabilities. I don't really get a lot of spam, maybe three to five pieces a day, but it came with good initial filters and has learned pretty quickly. When I first started it up and had it scan my Inbox, which was clean of spam, it marked an entire thread on [xml-dev] as junk. I almost corrected it, but then thought about the topic of the thread and decided to side with Mozilla: it was junk.

Which makes me wonder how specifically I could train the Mozilla mail filter. Could I get it to mark as junk all the emails I receive that refer to the Semantic Web? Yes, yes, I know I could create a filter to do that, but that's not a challenge. It's like training your dog to retrieve bikini tops: sure, you could just run up and take them yourself, but it's not the same as getting the dog to do it.

OK, a little update to that last line. No, I never trained my dog to do that, but I did know a guy who tried. He even had a mannequin and a bikini top that he would have the dog practice on. Did I mention that I lived in a very small town, way out in the boonies, with very little to do, and very little to entertain us? Just something to consider if you're thinking of moving out into the country to raise your kids in a more wholesome environment.

There is a Mozilla bug out there to change the Bayesian filter so that it sorts all your mail into different categories: http://bugzilla.mozilla.org/show_bug.cgi?id=168905

Posted by John Beimler on 2003-09-04

Well, I dunno about Mozilla, specifically, but I've been using the POPFile Bayesian-categorizing POP3 proxy server for a while, and it allows multiple categories, instead of just "Spam" and "Ham" ... so you can train it to recognize anything ... quite nice.

Posted by Joel Bennett on 2003-10-02
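
For anyone curious what "multiple categories instead of just Spam and Ham" might look like, here is a minimal naive Bayes sketch in Python. It is not how Mozilla or POPFile actually implement their filters; the class name, the tokenizer, the category names, and the training messages are all made up for illustration.

```python
import math
import re
from collections import Counter, defaultdict


class BucketClassifier:
    """Toy multinomial naive Bayes with any number of categories (hypothetical)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.message_counts = Counter()          # category -> messages trained
        self.vocabulary = set()                  # every word seen in training

    @staticmethod
    def tokenize(text):
        # Crude tokenizer: lowercase runs of letters, digits and apostrophes.
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, category, text):
        words = self.tokenize(text)
        self.word_counts[category].update(words)
        self.message_counts[category] += 1
        self.vocabulary.update(words)

    def classify(self, text):
        words = self.tokenize(text)
        total_messages = sum(self.message_counts.values())
        best_category, best_score = None, float("-inf")
        for category, seen in self.message_counts.items():
            counts = self.word_counts[category]
            total_words = sum(counts.values())
            # Log prior: how often this category showed up in training.
            score = math.log(seen / total_messages)
            for word in words:
                # Add-one smoothing so unseen words don't zero out the score.
                score += math.log(
                    (counts[word] + 1) / (total_words + len(self.vocabulary))
                )
            if score > best_score:
                best_category, best_score = category, score
        return best_category  # None if nothing has been trained yet


clf = BucketClassifier()
clf.train("spam", "cheap pills buy now limited offer")
clf.train("semantic-web", "RDF ontology triples and the semantic web")
clf.train("personal", "dinner on friday with the family")

print(clf.classify("a new ontology for the semantic web"))
```

The only difference from a two-way spam/ham filter is that the loop scores every category that has been trained and keeps the best one, so adding a "Semantic Web" bucket is just a matter of feeding it examples.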
