Post by JohnMoran » 30 Sep 2004, 17:43

I am using Spami 0.9.8 with the following filters: Attachment, Empty String, Substring, Learning, Spam Word, Alphabet Soup -- in the order given. Each filter terminates filtering if it decides the message is spam.
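To make the "terminate on spam" behaviour concrete, here is a minimal sketch of such a chain. It is not Spami's code: `run_chain`, the two toy filters, the word list and the `max_words` threshold are all hypothetical names and values chosen for illustration.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical filter signature: takes the message text, returns True for spam.
Filter = Callable[[str], bool]

SPAM_WORDS = {"lottery", "viagra"}  # illustrative list only


def empty_string(msg: str, max_words: int = 3) -> bool:
    # Assumption: a message with at most `max_words` words counts as "empty".
    return len(msg.split()) <= max_words


def spam_word(msg: str) -> bool:
    # Whole-word match against the illustrative word list.
    return any(word in SPAM_WORDS for word in msg.lower().split())


def run_chain(msg: str, filters: List[Tuple[str, Filter]]) -> Optional[str]:
    """Return the name of the first filter that flags msg, or None."""
    for name, is_spam in filters:
        if is_spam(msg):
            return name  # terminate filtering: message is spam
    return None  # no filter objected


chain = [("Empty String", empty_string), ("Spam Word", spam_word)]
```

With this arrangement, a later filter never sees a message that an earlier one has already flagged, so the order of the filters affects which filter gets the credit.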

I have been using Spami for about a month, but since I get relatively few spam messages (3-12 per day), the Bayes learning filter is learning very slowly.

My thought is that the Bayes filter, once trained, will recognize all the spam I receive and that the auxiliary filters merely provide some protection during the training period.
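For readers unfamiliar with how a Bayes filter learns, the following is a toy naive Bayes classifier with add-one smoothing. It is only a sketch of the general technique, not Spami's learning filter; the class name `TinyBayes` and its methods are made up for this example.

```python
import math
from collections import Counter


class TinyBayes:
    """Toy naive Bayes spam scorer (illustrative, not Spami's implementation)."""

    def __init__(self):
        self.words = {"spam": Counter(), "ham": Counter()}
        self.msgs = {"spam": 0, "ham": 0}

    def train(self, text: str, label: str) -> None:
        self.words[label].update(text.lower().split())
        self.msgs[label] += 1

    def spam_probability(self, text: str) -> float:
        vocab = set(self.words["spam"]) | set(self.words["ham"])
        total_msgs = self.msgs["spam"] + self.msgs["ham"]
        log_score = {}
        for label in ("spam", "ham"):
            # Smoothed class prior.
            score = math.log((self.msgs[label] + 1) / (total_msgs + 2))
            n = sum(self.words[label].values())
            for w in text.lower().split():
                # Add-one smoothing; the extra +1 reserves mass for unseen words.
                score += math.log((self.words[label][w] + 1) / (n + len(vocab) + 1))
            log_score[label] = score
        # Convert the two log scores into a probability.
        top = max(log_score.values())
        e = {k: math.exp(v - top) for k, v in log_score.items()}
        return e["spam"] / (e["spam"] + e["ham"])


bayes = TinyBayes()
untrained = bayes.spam_probability("cheap pills")
bayes.train("cheap pills online", "spam")
bayes.train("lunch meeting tomorrow", "ham")
trained = bayes.spam_probability("cheap pills")
```

Before any training the score sits at the neutral midpoint, and each classified message moves it only a little, which is why a low daily spam volume makes for slow training.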

Apparent Bugs:

The Empty Mail filter decision level is 3 words. It erroneously recognizes some messages which have many more than 3 words and classifies them as spam. So far, these messages have all been spam, but not because of their length.

So far, the spam word filter recognizes more spam messages than any other filter. In the window at the bottom of the Recycle page, some of the words listed as the reason this filter classified the message as spam are not found in the message.

Some suggestions:

Change the wording for filter priorities from:
"Finish filtering process (recommended)"
to:
"Terminate filtering, message is SPAM" (for the spam box)
or "Terminate filtering, message is non-SPAM" (for the non-spam box)

This is a subtle point of English, but the "finish" wording can be misunderstood to mean "continue filtering".

If possible, copy messages reclassified by the user (as spam OR non-spam) into the Recycle bin (and copy non-spam to the mailbox, of course). These misclassified messages are the most interesting ones because they demonstrate problems in classification.

In addition, it would be most useful if a command were added to the Recycle page to run the stored messages through the filters again (with learning temporarily disabled), with additional columns showing the new results. This would help in evaluating changes to the filter plugins and their data. Over time, the learning filter's results should also change for these stored messages, so an additional column showing the learning filter's results would also be informative. A command to open the Settings page from the Recycle page would then be useful too.
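As a rough illustration of the re-filtering command I have in mind, here is a sketch. All names (`StoredMessage`, `refilter`, the sample filters) are hypothetical and not part of Spami. The function only reads the filters and never trains anything, which stands in for "learning temporarily disabled".

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class StoredMessage:
    text: str
    original_verdict: Optional[str]  # filter that flagged it originally, or None


def refilter(
    stored: List[StoredMessage],
    filters: List[Tuple[str, Callable[[str], bool]]],
) -> List[Tuple[str, Optional[str], Optional[str], bool]]:
    """Re-run stored messages; return (text, old verdict, new verdict, changed)."""
    rows = []
    for msg in stored:
        # First filter that flags the message wins, as in the normal chain.
        new = next((name for name, is_spam in filters if is_spam(msg.text)), None)
        rows.append((msg.text, msg.original_verdict, new, new != msg.original_verdict))
    return rows


# Illustrative data and filters only.
stored = [
    StoredMessage("hi", "Empty String"),
    StoredMessage("win the lottery now please", None),
]
filters = [
    ("Empty String", lambda t: len(t.split()) <= 3),
    ("Spam Word", lambda t: "lottery" in t.lower().split()),
]
rows = refilter(stored, filters)
```

The "changed" column is the interesting one: it points directly at the messages whose classification was affected by a change to a plugin or its data.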

