Introduction
If you’re operating a modern mail system and rely on technologies like SPF, DKIM, and DMARC, you expect spoofed and unauthenticated messages to be stopped. But what if spam arrives perfectly authenticated – via Google’s own infrastructure? That’s what happened in our environment, and it took far too long to identify the true cause: Google Groups.
Because they looked legitimate on paper, many spam messages bypassed filtering: they originated from *.googlegroups.com with valid SPF results, valid DKIM signatures, and strong reputation scores. These properties rendered Rspamd’s default filters ineffective. The spam arrived cleanly, often with a List-ID header indicating a Google Group, yet Rspamd let it pass with minimal scoring.
Rejecting them based solely on source domain would be irresponsible. Google Groups are used by legitimate projects, teams, and mailing lists. A blunt solution like domain-based blocking would break too much.
So instead, I decided to analyze first, then block selectively.
Objective
Before taking action, I wanted visibility. Which Google Groups were sending messages to our infrastructure? Which ones were legitimate? Which ones were spam? The goal: create a curated allowlist and denylist based on the List-ID header, which uniquely identifies a mailing list – even across Google’s infrastructure.
Implementation
To extract actionable insights, I wrote a shell script: analyze_google_groups.sh
It parses mail logs or .eml files, extracts the List-ID headers for messages received from Google Groups, and aggregates them by frequency.
Key characteristics:
- Fast awk-based parsing
- Groups by domain and count
- Works with .eml archives or live logs
- Exportable to whitelist/blacklist map formats for Rspamd
This gave me a clear picture of which List-IDs were:
- Frequently seen (likely internal or legitimate)
- Rare or suspicious (spam candidates)
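The counting step described above can be sketched in a few lines. This is not the awk-based script from the repository, only a minimal Python illustration of the same idea for a directory of .eml files. The function names, the `.googlegroups.com` suffix check, and the `min_seen` threshold are assumptions made for this example; groups hosted on a custom domain would need a different test.

```python
import email
import re
from collections import Counter
from email import policy
from pathlib import Path

# Captures the identifier between the angle brackets of a List-ID value,
# e.g. '"My list" <mylist.googlegroups.com>' -> 'mylist.googlegroups.com'.
LIST_ID_RE = re.compile(r"<([^<>\s]+)>")


def extract_list_id(raw_bytes):
    """Return the lower-cased List-ID identifier of a raw message, or None."""
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    header = msg.get("List-ID")  # header lookup is case-insensitive
    if header is None:
        return None
    match = LIST_ID_RE.search(str(header))
    return match.group(1).lower() if match else None


def count_list_ids(eml_dir, suffix=".googlegroups.com"):
    """Count List-IDs ending in `suffix` across all .eml files under eml_dir."""
    counts = Counter()
    for path in Path(eml_dir).rglob("*.eml"):
        list_id = extract_list_id(path.read_bytes())
        if list_id and list_id.endswith(suffix):
            counts[list_id] += 1
    return counts


def split_by_frequency(counts, min_seen=5):
    """Split List-IDs into (frequent, rare) using a hypothetical threshold."""
    frequent = sorted(i for i, n in counts.items() if n >= min_seen)
    rare = sorted(i for i, n in counts.items() if n < min_seen)
    return frequent, rare
```

Frequency alone does not prove legitimacy, so the two lists are best treated as candidates for manual review rather than as finished maps.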
Rspamd Integration
Once the data was available, I integrated it with Rspamd via multimap:
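# Known-good lists: lower the score.
LIST_ID_WHITELIST {
  type = "header";
  header = "List-ID";
  regexp = true;  # treat map entries as regular expressions
  map = "${LOCAL_CONFDIR}/local.d/listid_whitelist.map";
  score = -10.0;
}

# Known-bad lists: raise the score and force a reject.
LIST_ID_BLOCK {
  type = "header";
  header = "List-ID";
  regexp = true;  # treat map entries as regular expressions
  map = "${LOCAL_CONFDIR}/local.d/listid_blacklist.map";
  score = 20.0;
  action = "reject";
}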
Whitelisted List-IDs are explicitly scored down. Blacklisted ones are scored heavily and rejected immediately. This bypasses the issue of high reputation and perfect SPF/DKIM alignment.
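Operational Tip: Keep the two maps disjoint instead of relying on the order of the rules in the configuration. As I understand the multimap documentation, a rule with an action set runs as a pre-filter, so a reject from LIST_ID_BLOCK can take effect before the whitelist score is ever applied. A List-ID that appears in both maps would therefore be rejected.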
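The map files referenced by the rules need one pattern per List-ID. Below is a minimal sketch of how reviewed List-IDs could be turned into such entries. It assumes the map is read in regexp mode with one /pattern/flags entry per line and that the pattern is searched within the header value; both are worth checking against the multimap documentation for your Rspamd version. The helper names and the sample List-ID are made up for illustration.

```python
import re


def escape_literal(text):
    """Backslash-escape everything except letters, digits, '-' and '_'."""
    return "".join(c if (c.isalnum() or c in "-_") else "\\" + c for c in text)


def to_map_entry(list_id):
    """Build one case-insensitive entry that matches '<list_id>' literally."""
    return "/<" + escape_literal(list_id.strip().lower()) + ">/i"


def build_map(list_ids):
    """Return map file content: unique entries, sorted, one per line."""
    entries = sorted({to_map_entry(i) for i in list_ids if i.strip()})
    return "\n".join(entries) + "\n"


def entry_matches(entry, header_value):
    """Local sanity check: apply an entry to a List-ID header value."""
    pattern = entry[1:entry.rindex("/")]  # strip the leading '/' and '/flags'
    return re.search(pattern, header_value, re.IGNORECASE) is not None


if __name__ == "__main__":
    # Hypothetical List-ID taken from the review step.
    print(build_map(["spam-offers.googlegroups.com"]), end="")
```

Including the angle brackets in the pattern prevents one group name from matching as a suffix of another, and escaping the dots avoids accidental wildcard matches.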
Outcome
After activating the rules:
- Spam from obscure or disposable Google Groups was stopped instantly.
- Legitimate dev lists and team communication remained unaffected.
- No false positives were reported.
Most importantly, we moved from guesswork to control.
Repository:
https://github.com/filipnet/google-groups-scan
Summary
Google Groups is not the enemy, but blind trust is.
When spam is relayed by a high-reputation domain, classic authentication checks like SPF and DKIM won’t help, because the messages genuinely pass them.
My approach:
- Use real data to identify groups in use
- Only whitelist what’s proven legitimate
- Reject or penalize everything else with confidence
This gives me full control over a common spam vector without disrupting valid traffic.