Google Groups as a Spam Vector

Introduction

If you’re operating a modern mail system and rely on technologies like SPF, DKIM, and DMARC, you expect spoofed and unauthenticated messages to be stopped. But what if spam arrives perfectly authenticated – via Google’s own infrastructure? That’s what happened in our environment, and it took far too long to identify the true cause: Google Groups.

On paper these messages were legitimate: they originated from *.googlegroups.com with valid SPF and DKIM signatures and a strong sender reputation. As a result, Rspamd’s default filters were ineffective. The spam often carried a List-ID header identifying a Google Group, yet Rspamd let it pass with minimal scoring.

Rejecting these messages based solely on the source domain would be irresponsible. Google Groups is used by legitimate projects, teams, and mailing lists. A blunt solution like domain-based blocking would break too much.

So instead, I decided to analyze first, then block selectively.

Objective

Before taking action, I wanted visibility. Which Google Groups were sending messages to our infrastructure? Which ones were legitimate? Which ones were spam? The goal: create a curated allowlist and denylist based on the List-ID header, which uniquely identifies a mailing list – even across Google’s infrastructure.

Implementation

To extract actionable insights, I wrote a shell script:
analyze_google_groups.sh

It parses mail logs or .eml files, extracts the List-ID headers for messages received from Google Groups, and aggregates them by frequency.

Key characteristics:

Fast awk-based parsing
Groups by domain and count
Works with .eml archives or live logs
Exportable to whitelist/blacklist map formats for Rspamd

This gave me a clear picture of which List-IDs were:

Frequently seen (likely internal or legitimate)
Rare or suspicious (spam candidates)
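
The aggregation step can be sketched roughly as follows. This is a minimal illustration, not the code of analyze_google_groups.sh: the function name count_google_group_list_ids is hypothetical, it reads only .eml files (not logs), and it assumes each List-ID header fits on a single unfolded line.

```shell
#!/bin/sh
# Hypothetical helper, NOT the author's analyze_google_groups.sh.
# Prints "count list-id" for every Google Groups List-ID found in the
# header block of the given .eml files, most frequent first.
# Assumption: the List-ID header is not folded across several lines.
count_google_group_list_ids() {
  awk '
    { sub(/\r$/, "") }                      # tolerate CRLF line endings
    FNR == 1 { in_headers = 1 }             # each file starts with headers
    in_headers && /^$/ { in_headers = 0 }   # first blank line ends headers
    in_headers && tolower($0) ~ /^list-id:/ {
      # keep only the identifier between the angle brackets
      if (match($0, /<[^>]+>/)) {
        id = tolower(substr($0, RSTART + 1, RLENGTH - 2))
        if (id ~ /\.googlegroups\.com$/) count[id]++
      }
    }
    END { for (id in count) print count[id], id }
  ' "$@" | sort -k1,1nr -k2,2
}
```

Called as count_google_group_list_ids /path/to/archive/*.eml (the path is a placeholder), it prints one line per list, so rarely seen identifiers end up at the bottom for manual review.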

Rspamd Integration

Once the data was available, I integrated it with Rspamd via multimap:

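# Negative score for mailing lists that were reviewed and approved
LIST_ID_WHITELIST {
  type = "header";
  header = "List-ID";
  regexp = true;  # treat map entries as regular expressions
  map = "${LOCAL_CONFDIR}/local.d/listid_whitelist.map";
  score = -10.0;
}

# High score plus a forced reject for lists identified as spam sources
LIST_ID_BLOCK {
  type = "header";
  header = "List-ID";
  regexp = true;  # treat map entries as regular expressions
  map = "${LOCAL_CONFDIR}/local.d/listid_blacklist.map";
  score = 20.0;
  action = "reject";
}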

Whitelisted List-IDs receive a strongly negative score. Blacklisted ones receive a high score and are rejected immediately. Because the decision is keyed on the List-ID rather than on sender reputation, perfect SPF/DKIM alignment no longer helps the spammer.

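Operational Tip: Make sure a List-ID never matches both maps. A rule with action = "reject" forces the rejection when it matches, so the negative score from the whitelist rule cannot rescue the message afterwards. I also keep the whitelist rule before the block rule in the configuration, but do not rely on file order alone: test both maps with known-good and known-bad messages.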
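
Map entries for the two files referenced above could be produced from reviewed list identifiers along these lines. This is a sketch under assumptions: the function name list_ids_to_regexp_map is hypothetical, and it assumes the maps are read as regular expressions in the /pattern/flags form, one entry per line, which is worth checking against the multimap documentation for your Rspamd version.

```shell
#!/bin/sh
# Hypothetical helper: reads plain list identifiers (one per line) on stdin
# and prints one case-insensitive regexp map entry per identifier.
list_ids_to_regexp_map() {
  # 1) drop empty lines
  # 2) backslash-escape everything except letters, digits, "_" and "-"
  # 3) wrap the result so it must sit between the header's angle brackets
  sed -e '/^$/d' \
      -e 's|[^A-Za-z0-9_-]|\\&|g' \
      -e 's|.*|/<&>/i|'
}
```

For example, piping the identifier dev-team.googlegroups.com through the function yields /<dev-team\.googlegroups\.com>/i. Escaping the dots keeps a pattern from accidentally matching a different, similarly named list.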

Outcome

After activating the rules:

Spam from obscure or disposable Google Groups was stopped instantly.
Legitimate dev lists and team communication remained unaffected.
No false positives were reported.

Most importantly, we moved from guesswork to control.

Repository:

https://github.com/filipnet/google-groups-scan

Summary

Google Groups is not the enemy, but blind trust is.
When spam is relayed through a high-reputation domain, authentication checks like SPF and DKIM won’t help: they only confirm that the message really came from that domain.

My approach:

Use real data to identify groups in use
Only whitelist what’s proven legitimate
Reject or penalize everything else with confidence

This gives me full control over a common spam vector without disrupting valid traffic.
