2.64 Upgrade Notes ------------------ Note: "make test" will produce some warnings if the account you are using has SpamAssassin 3.0 Bayes database files under ~/.spamassassin/ so in that case, you should move them out of the way temporarily. Upgrade Notes ------------- The SpamAssassin 2.6x release series will be the last set of releases to officially support perl versions earlier than perl 5.6.0. If you are using an earlier release, you will need to upgrade when SpamAssassin 3.0 is eventually released. Users of SpamAssassin versions earlier than 2.50 should note that the default tagging behavior has changed. If an incoming message is tagged as spam, instead of modifying the original message, SpamAssassin will create a new report message and attach the original message as a message/rfc822 MIME part (ensuring the original message is completely preserved and easier to recover). If you do not want to modify the body of incoming spam, use the "report_safe" option. The "report_header" and "defang_mime" options are also deprecated as a result. Read the docs for "report_safe" for details. Please note that the use of the following commandline parameters for spamassassin and spamd have been deprecated! They will still work in the 2.6x series (a warning will be displayed), but they will be removed in the next major release. If you currently use these flags, please remove them. The flags are: --add-from, --pipe, -F, -P SpamAssassin now runs in "taint mode" by default for improved security. If you have Razor2 turned on and are using Vipul's Razor 2.22 or higher (through at least 2.36), you will need to apply a small patch. The instructions are located in the Razor2.patch file. The 2.6x series has a new Bayes backend and database format (since 2.5x). Your old database(s) will automatically be upgraded the first time SpamAssassin tries to write to the DB, and any journal, if it exists, will be wiped out without being synced. Please see the INSTALL document for more details. Finally, the "check_bayes_db" script has been deprecated and will no longer function properly with 2.6x series databases. The functionality from that script was added to sa-learn via the "--dump" parameter. Please see the sa-learn man/pod documentation for more info. Welcome to SpamAssassin! ------------------------ SpamAssassin is a mail filter which attempts to identify spam using text analysis and several internet-based realtime blacklists. Using its rule base, it uses a wide range of heuristic tests on mail headers and body text to identify "spam", also known as unsolicited commercial email. Once identified, the mail can then be optionally tagged as spam for later filtering using the user's own mail user-agent application. SpamAssassin typically differentiates successfully between spam and non-spam in between 95% and 99% of cases, depending on what kind of mail you get. SpamAssassin also includes support for reporting spam messages automatically, and/or manually, to collaborative filtering databases such as Vipul's Razor [1]. [1]: http://razor.sourceforge.net/ The distribution provides "spamassassin", a command line tool to perform filtering, along with "Mail::SpamAssassin", a set of perl modules which implement a Mail::Audit plugin, allowing SpamAssassin to be used in a Mail::Audit filter, spam-protection proxy SMTP or POP/IMAP server, or a variety of different spam-blocking scenarios. In addition, Craig Hughes has contributed "spamd", a daemonized version of SpamAssassin, which runs persistently. Using "spamc", a lightweight C client, this allows an MTA to process large volumes of mail through SpamAssassin without having to fork/exec a perl interpreter for each one. For those sites running qmail as your MTA, the "qmail" directory contains two ways to integrate spamc with your system. Kobe Lenjou has contributed a patch to qmail-scanner, and John Peacock has contributed a QMAILQUEUE enabled qmail-spamc. See the README*'s for more details. Ian R. Justman has contributed "spamproxy", a spam-filtering SMTP proxy server. This lives in the "released" directory on the web site. SpamAssassin lives at http://spamassassin.org/ or in CPAN, and is distributed under the same license as Perl itself. Use of the SpamAssassin name is restricted as documented in the file named "Trademark". This module owes a lot of inspiration to Mark Jeftovic's filter.plx, which I used for a long time, and contributed some code to. However, SpamAssassin is a ground-up rewrite with a new, greatly improved ruleset, a different code model and installation system, and hopefully will be easy to adapt for a multitude of applications. [2]: http://AntiSpam.shmOOze.net/filter/ Questions regarding SpamAssassin should be sent to the mailing list: . Installing SpamAssassin ----------------------- See the INSTALL file. Customising SpamAssassin ------------------------ These are the configuration files installed by SpamAssassin. The commands that can be used therein are listed in the POD documentation for the Mail::SpamAssassin::Conf class (run the following command to read it: "perldoc Mail::SpamAssassin::Conf"). Note: The following directories are the standard defaults that people use. There is an explanation of all the default locations that SpamAssassin will look at the end. - /usr/share/spamassassin/*.cf: Distributed configuration files, with all defaults. Do not modify these, as they are overwritten when you upgrade. - /etc/mail/spamassassin/*.cf: Site config files, for system admins to create, modify, and add local rules and scores to. Modifications here will be appended to the config loaded from the above directory. - /usr/share/spamassassin/user_prefs.template: Distributed default user preferences. Do not modify this, as it is overwritten when you upgrade. - /etc/mail/spamassassin/user_prefs.template: Default user preferences, for system admins to create, modify, and set defaults for users' preferences files. Takes precedence over the above prefs file, if it exists. Do not put system-wide settings in here; put them in the /etc/mail/spamassassin directory. This file is just a template, which will be copied to a user's home directory for them to change. - $USER_HOME/.spamassassin: User state directory. Used to hold spamassassin state, such as a per-user automatic whitelist, and the user's preferences file. - $USER_HOME/.spamassassin/user_prefs: User preferences file. If it does not exist, one of the default prefs file from above will be copied here for the user to edit later, if they wish. Unless you're using spamd, there is no difference in interpretation between the rules file and the preferences file, so users can add new rules for their own use in the "~/.spamassassin/user_prefs" file, if they like. (spamd disables this for security and increased speed.) - $USER_HOME/.spamassassin/bayes* Statistics databases used for Bayesian filtering. If they do not exist, they will be created by SpamAssassin. Spamd users may wish to create a shared set of bayes databases; the "bayes_path" and "bayes_file_mode" configuration settings can be used to do this. See "perldoc sa-learn" for more documentation on how to train this. File Locations: SpamAssassin will look in a number of areas to find the default configuration files that are used. The "__*__" text are variables which get set during installation. You can see their values by viewing the first several lines of the "spamassassin" or "spamd" scripts. - Distributed Configuration Files '__def_rules_dir__' '__prefix__/share/spamassassin' '/usr/local/share/spamassassin' '/usr/share/spamassassin' - Site Configuration Files '__local_rules_dir__' '__prefix__/etc/mail/spamassassin' '__prefix__/etc/spamassassin' '/usr/local/etc/spamassassin' '/usr/pkg/etc/spamassassin' '/usr/etc/spamassassin' '/etc/mail/spamassassin' '/etc/spamassassin' - Default User Preferences File '__local_rules_dir__/user_prefs.template' '__prefix__/etc/mail/spamassassin/user_prefs.template' '__prefix__/share/spamassassin/user_prefs.template' '/etc/spamassassin/user_prefs.template' '/etc/mail/spamassassin/user_prefs.template' '/usr/local/share/spamassassin/user_prefs.template' '/usr/share/spamassassin/user_prefs.template' After installation, try "perldoc Mail::SpamAssassin::Conf" to see what can be set. Common first-time tweaks include: - required_hits Set this higher to make SpamAssassin less sensitive. If you are installing SpamAssassin system-wide, this is **strongly** recommended! Statistics on how many false positives to expect at various different thresholds are available in the "STATISTICS.txt" file in the "rules" directory. - subject_tag When rewrite_subject is on, the subject stamp is *****SPAM*****. This can be used to change it. - ok_locales If you expect to receive mail in non-ISO-8859 character sets (ie. Chinese, Cyrillic, Japanese, Korean, or Thai) then set this. Learning -------- Since SpamAssassin now includes a Bayesian learning filter (in version 2.50 on), it is worthwhile training SpamAssassin with your collection of non-spam and spam, if possible. This will make it more accurate for your incoming mail. Do this using the "sa-learn" tools, like so: sa-learn --spam ~/Mail/saved-spam-folder sa-learn --nonspam ~/Mail/inbox sa-learn --nonspam ~/Mail/other-nonspam-folder Use as many mailboxes as you like. Note that SpamAssassin will remember what mails it's learnt from, so you can re-run this as often as you like. Locali[sz]ation --------------- All text displayed to users is taken from the configuration files. This means that you can translate messages, test descriptions, and templates into other languages. If you do so, I would *really* appreciate if you could send a copy back of the updated messages; mail them to . Hopefully if it takes off, I can add them to the distribution as "official" translations and build in support for this. You will, of course, get credited for this work ;) Help With SpamAssassin ---------------------- There's a mailing list for support or discussion of SpamAssassin. It lives at . See http://spamassassin.org/lists.html for the sign-up address and a link to the archive of past messages. Commercial Tests ---------------- There are several tests in the spamassassin configuration file which are turned off by default, namely the mail-abuse.org tests. The mail-abuse.org tests are RCVD_IN_MAPS_RBL, RCVD_IN_MAPS_DUL, RCVD_IN_MAPS_RSS, RCVD_IN_MAPS_NML, and RCVD_IN_MAPS_OPS. These are commercial services, so you need to pay money to use them. The mail-abuse.org tests are free, with caveats. You must subscribe, asking for free access for personal use -- if you're using SpamAssassin as a personal mail filter you may turn them on. More information on the mail-abuse.org services can be found here: http://mail-abuse.org/rbl+/ and http://www.mail-abuse.org/feestructure.html . To turn on the tests, simply assign them a non-zero score, e.g. by adding these lines to your ~/.spamassassin/user_prefs file: score RCVD_IN_MAPS_RBL 3 score RCVD_IN_MAPS_DUL 1 score RCVD_IN_MAPS_RSS 2 score RCVD_IN_MAPS_NML 2 score RCVD_IN_MAPS_OPS 2 To see how RBL checks work, use `spamassassin -D rbl=-3'. This will display all debug messages related to rbl checking. Automatic Whitelist System -------------------------- SpamAssassin includes automatic whitelisting; The current iteration is considerably more complex than the original version. The way it works is by tracking for each sender address the average score of messages so far seen from there. Then, it combines this long-term average score for the sender with the score for the particular message being evaluated, after all other rules have been applied. This functionality is off by default, and is enabled with the "-a" flag to either spamassassin or spamd. A system-wide auto-whitelist can be used, by setting the auto_whitelist_path and auto_whitelist_file_mode configuration commands appropriately, e.g. auto_whitelist_path /var/spool/spamassassin/auto-whitelist auto_whitelist_file_mode 0666 The spamassassin -W and -R command line flags provide an API to add and remove entries 'manually', if you so desire. They operate based on an input mail message, to allow them to be set up as aliases which users can simply forward their mails to. See the spamassassin manual page for more details. The default address-list implementation, Mail::SpamAssassin::DBBasedAddrList, uses Berkeley DB files to store the addresses. There may be synchronization issues with this implementation in an NFS environment. Reasonable attempts have been made to ensure proper locking of the DB file, but it may yet be somewhat flakey. (end of README) // vim:tw=74: