Google Groups as a Spam Vector

Introduction

If you’re operating a modern mail system and rely on technologies like SPF, DKIM, and DMARC, you expect spoofed and unauthenticated messages to be stopped. But what if spam arrives perfectly authenticated – via Google’s own infrastructure? That’s what happened in our environment, and it took far too long to identify the true cause: Google Groups.

Many spam messages bypassed filtering precisely because they looked legitimate on paper: they originated from *.googlegroups.com and carried valid SPF results, valid DKIM signatures, and a strong sender reputation. These properties rendered Rspamd’s default filters ineffective. The messages were delivered without a hitch, often with a List-ID header identifying a Google Group, and Rspamd scored them too low to trigger any action.

Rejecting them based solely on source domain would be irresponsible. Google Groups are used by legitimate projects, teams, and mailing lists. A blunt solution like domain-based blocking would break too much.

So instead, I decided to analyze first, then block selectively.

Objective

Before taking action, I wanted visibility. Which Google Groups were sending messages to our infrastructure? Which ones were legitimate? Which ones were spam? The goal: create a curated allowlist and denylist based on the List-ID header, which uniquely identifies a mailing list – even across Google’s infrastructure.

Implementation

To extract actionable insights, I wrote a shell script:
analyze_google_groups.sh

It parses mail logs or .eml files, extracts the List-ID headers for messages received from Google Groups, and aggregates them by frequency.

Key characteristics:

  • Fast awk-based parsing
  • Groups by domain and count
  • Works with .eml archives or live logs
  • Exportable to whitelist/blacklist map formats for Rspamd

This gave me a clear picture of which List-IDs were:

  • Frequently seen (likely internal or legitimate)
  • Rare or suspicious (spam candidates)
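
The aggregation step can be sketched roughly as follows. This is a minimal illustration, not the actual analyze_google_groups.sh: the function name count_list_ids is hypothetical, it only handles .eml files (not logs), and it assumes the List-ID header sits on a single, unfolded line with the identifier in angle brackets.

```shell
#!/bin/sh
# Sketch only (assumption: one .eml file per message, unfolded List-ID header).
# count_list_ids DIR prints "<count> <list-id>" for every Google Groups
# List-ID found in the headers of the .eml files under DIR, most frequent first.
count_list_ids() {
  find "$1" -type f -name '*.eml' -exec awk '
    FNR == 1 { in_headers = 1 }                # new file: back to header mode
    in_headers && /^\r?$/ { in_headers = 0 }   # first blank line ends the headers
    in_headers && tolower($0) ~ /^list-id:/ {
      line = tolower($0)
      # keep only the identifier between the angle brackets
      if (match(line, /<[^>]+>/)) {
        id = substr(line, RSTART + 1, RLENGTH - 2)
        if (id ~ /\.googlegroups\.com$/) print id
      }
    }
  ' {} + | sort | uniq -c | sort -rn | awk '{ print $1, $2 }'
}

# Example call (hypothetical archive path):
# count_list_ids /srv/mail-archive
```

List-IDs at the top of the output are candidates for the allowlist review; the ones that appear only once or twice are the ones worth inspecting as possible spam sources.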

Blocking Google Groups in Rspamd with Multimap Whitelist

To block all Google Groups messages by default and allow only specific ones, use a multimap whitelist. This avoids relying on scores and makes the policy predictable.

/etc/rspamd/local.d/multimap.conf

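ALLOW_GOOGLEGROUPS {
  type = "header";
  header = "From";
  # Match on the bare address only, not on the display name
  filter = "email:addr";
  # The map entries are anchored regular expressions
  regexp = true;
  map = "file:///etc/rspamd/local.d/googlegroups_allow.map";
  # Setting an action turns the rule into a prefilter that accepts on match
  action = "accept";
}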

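BLOCK_GOOGLEGROUPS {
  type = "header";
  header = "X-Google-Group-Id";
  # A multimap rule needs a map to match against. Assumption: a catch-all
  # regexp map (hypothetical file holding the single pattern .+) so that any
  # message carrying this header matches.
  regexp = true;
  map = "file:///etc/rspamd/local.d/googlegroups_block.map";
  action = "reject";
  message = "Google Groups mailing lists are not permitted";
}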

/etc/rspamd/local.d/googlegroups_allow.map

# Example group for internal alerts
^internal-alerts@googlegroups\.com$

# Example group for operations reports
^ops-reports@googlegroups\.com$

# Example group for neighbourhood parents
^neighbourhood-parents@googlegroups\.com$
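
Entries in this format can be generated from reviewed List-IDs instead of being typed by hand. The sketch below is an illustration under assumptions: the function name listid_to_map_entries is hypothetical, and it assumes that a List-ID of the form name.googlegroups.com corresponds to the group address name@googlegroups.com. Lines that do not end in .googlegroups.com are skipped.

```shell
#!/bin/sh
# Sketch only: turn approved List-IDs (one per line on stdin) into anchored
# regexp lines in the same shape as the map entries above.
# Assumption: List-ID "name.googlegroups.com" maps to "name@googlegroups.com".
listid_to_map_entries() {
  # 1) keep only Google Groups IDs and strip the ".googlegroups.com" suffix
  # 2) escape regexp metacharacters that may occur in a group name
  # 3) add the anchors and the escaped domain
  sed -n 's/^\(.*\)\.googlegroups\.com$/\1/p' |
    sed -e 's/[.+]/\\&/g' \
        -e 's/^/^/' \
        -e 's/$/@googlegroups\\.com$/'
}

# Example (hypothetical input file with one reviewed List-ID per line):
# listid_to_map_entries < approved_list_ids.txt >> googlegroups_allow.map
```

Generating the entries this way keeps the escaping consistent, which matters because an unescaped dot in a regexp map matches any character.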

Reload

rspamadm configtest && systemctl reload rspamd

Outcome

After activating the rules:

  • Spam from obscure or disposable Google Groups was stopped instantly.
  • Legitimate dev lists and team communication remained unaffected.
  • No false positives were reported.

Most importantly, we moved from guesswork to control.

Repository:

https://github.com/filipnet/google-groups-scan

Summary

Google Groups is not the enemy, but blind trust is.
When spam is relayed through a high-reputation domain, authentication checks like SPF and DKIM won’t help, because the messages pass them.

My approach:

  • Use real data to identify groups in use
  • Only whitelist what’s proven legitimate
  • Reject or penalize everything else with confidence

This gives me full control over a common spam vector without disrupting valid traffic.