Automating Spam Training with rspamd-learn-helper

Fighting spam is a constant battle. Even with powerful tools like Rspamd, it’s easy to forget to regularly train your filters, especially in multi-user or low-maintenance environments. That’s where rspamd-learn-helper comes in: a simple yet effective script that brings automation, clarity, and ease of use to the spam learning process.

Motivation

Rspamd offers a powerful Bayesian filter that improves significantly with regular learning. However, manually invoking rspamc learn_spam or rspamc learn_ham can become tedious, and scripting it correctly for production use (e.g., on a mail server) is not always straightforward. When mailboxes are spread across several folders or spam is collected into shared IMAP locations, it’s especially easy to lose track or to miss the opportunity to retrain the filter efficiently.

This project aims to:

Automate and standardize the learning process
Make spam training repeatable and consistent
Log clearly what was learned and why
Support integration into systemd timers or cron jobs

How It Works

The rspamd-learn-helper script scans defined mail folders (e.g., Junk, Spam, INBOX, or Ham) and invokes rspamc to train Rspamd accordingly. It supports:

Multiple mailbox sources
Custom folder mappings for spam and ham
Dry-run mode for safe testing
Verbose logging with color-coded CLI output
HTML reporting for monitoring and automation logs

You can integrate the script into a daily or hourly cron job, or run it manually as needed. For production environments, it’s ideal to connect it to a scheduled systemd timer, ensuring that your spam filters evolve over time with minimal effort.

Example Use Case

Imagine a mail server where:

Users occasionally move misclassified spam into a personal folder named .TrainingSpam
Legitimate emails (ham) that were falsely marked as spam are moved into .TrainingHam
Each user has their own Maildir folder structure, e.g. /home/username/Maildir/

With the default configuration:

home_directory="/home"
spam_folder=".TrainingSpam"
ham_folder=".TrainingHam"

The script will recursively loop through all subdirectories in /home, and look for the paths:

/home/*/Maildir/.TrainingSpam/cur
/home/*/Maildir/.TrainingHam/cur

Each message file in these cur/ folders will be passed to Rspamd for learning as spam or ham, respectively.
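
The loop described here can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the variable names mirror the configuration shown earlier, while dry_run, learn_folder, and learn_all are hypothetical names introduced for this example. It also assumes one Maildir per user directly below the base path. The only external commands it relies on are rspamc learn_spam and rspamc learn_ham.

```shell
#!/bin/sh
# Sketch only. dry_run, learn_folder and learn_all are hypothetical names
# used for illustration; they are not taken from rspamd-learn-helper itself.
home_directory="${home_directory:-/home}"
spam_folder=".TrainingSpam"
ham_folder=".TrainingHam"
dry_run="${dry_run:-1}"   # 1 = only print what would be learned

# learn_folder ACTION FOLDER
# ACTION is the rspamc command to use: learn_spam or learn_ham.
learn_folder() {
    action="$1"
    folder="$2"
    for msg in "$home_directory"/*/Maildir/"$folder"/cur/*; do
        # An unmatched glob stays literal, so skip anything that is not a file.
        [ -f "$msg" ] || continue
        if [ "$dry_run" = "1" ]; then
            printf 'would run: rspamc %s %s\n' "$action" "$msg"
        else
            rspamc "$action" "$msg"
        fi
    done
}

# Train on every user's spam folder first, then on every ham folder.
learn_all() {
    learn_folder learn_spam "$spam_folder"
    learn_folder learn_ham "$ham_folder"
}

# Example call (dry-run by default in this sketch):
# learn_all
```

Defaulting to dry-run in the sketch means a first call to learn_all only lists the messages it would submit. Setting dry_run=0 would hand each file to rspamc for real.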

This setup ensures:

Learning is done only on explicitly marked samples
Each user contributes to Rspamd’s Bayesian model without risk of misclassification
Automation is safe and transparent, with optional logging

You can adjust the folder names or base path via variables, or use custom regex-based logic if needed.

Designed for Admins

rspamd-learn-helper was created to:

Reduce false positives and false negatives
Automate repetitive mail processing tasks
Gain insight into the effectiveness of the Bayesian filter
Offer a transparent, low-maintenance solution for small and medium-sized mail servers

If you value automation and want your mail server to get smarter over time, this helper script can make a noticeable difference in your spam detection quality.

👉 Get it here:

Utility script for rspamd spam and ham training.
https://github.com/filipnet/rspamd-learn-helper

Pull requests, suggestions, and issues are welcome.
