Understanding DMARC Aggregate Reports (rua): XML Decoded

The Reports Nobody Reads

You set up DMARC six months ago. You added rua=mailto:dmarc@example.com to your record. Every day since, you have received a handful of ZIP files containing XML documents from Gmail, Yahoo, Microsoft, and a dozen other providers. The mailbox has thousands of unread messages. You have never opened a single one.

You are not alone. DMARC aggregate reports are the most useful and least-used artifact in email authentication. They tell you exactly which IPs are sending mail as your domain, how much of it passes authentication, and whether someone is spoofing you. But the XML format is dense, the volume is high, and without a parser the data is opaque.

This article decodes the XML structure and shows you how to extract the information that matters.

What an Aggregate Report Contains

Each report is a daily summary from one reporting organization. It contains one or more <record> elements, each describing mail from a specific source IP. A record includes:

source_ip — The IP address that connected to the reporter's MTA.
count — How many messages from that IP.
disposition — What the reporter did with the messages: none (delivered), quarantine (spam folder), or reject (bounced).
dkim — DKIM result: pass or fail, plus the signing domain.
spf — SPF result: pass, fail, softfail, neutral, or none, plus the envelope domain.
header_from — The domain in the From: header. This is what DMARC evaluates against.

Here is a minimal example record, in the shape you will see in real reports (the IP address is from a documentation range):

<record>
  <row>
    <source_ip>198.51.100.25</source_ip>
    <count>47</count>
    <policy_evaluated>
      <disposition>none</disposition>
      <dkim>pass</dkim>
      <spf>pass</spf>
    </policy_evaluated>
  </row>
  <identifiers>
    <header_from>example.com</header_from>
  </identifiers>
  <auth_results>
    <dkim>
      <domain>example.com</domain>
      <result>pass</result>
    </dkim>
    <spf>
      <domain>example.com</domain>
      <result>pass</result>
    </spf>
  </auth_results>
</record>

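Nothing surprising: 47 messages, all passed, all delivered. Note the two layers: policy_evaluated holds the DMARC verdict after alignment with header_from, while auth_results holds the raw DKIM and SPF results and the domains they were checked against. This is what a healthy report looks like. The interesting records are the failures.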
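To make the structure concrete, here is a minimal Python sketch that flattens the record above into a dictionary using only the standard library. The function name and the dictionary keys are this sketch's own choices, not part of any DMARC tooling. It also assumes the report declares no XML namespace; some reporters do declare one, in which case the paths would need namespace handling.

```python
import xml.etree.ElementTree as ET

# The example record from above, condensed slightly.
SAMPLE_RECORD = """
<record>
  <row>
    <source_ip>198.51.100.25</source_ip>
    <count>47</count>
    <policy_evaluated>
      <disposition>none</disposition>
      <dkim>pass</dkim>
      <spf>pass</spf>
    </policy_evaluated>
  </row>
  <identifiers>
    <header_from>example.com</header_from>
  </identifiers>
  <auth_results>
    <dkim><domain>example.com</domain><result>pass</result></dkim>
    <spf><domain>example.com</domain><result>pass</result></spf>
  </auth_results>
</record>
"""


def parse_record(record):
    """Flatten one <record> element into a dict (key names are illustrative)."""
    return {
        "source_ip": record.findtext("row/source_ip"),
        "count": int(record.findtext("row/count", default="0")),
        "disposition": record.findtext("row/policy_evaluated/disposition"),
        # DMARC verdicts after alignment with header_from
        "dmarc_dkim": record.findtext("row/policy_evaluated/dkim"),
        "dmarc_spf": record.findtext("row/policy_evaluated/spf"),
        "header_from": record.findtext("identifiers/header_from"),
        # Raw results; a record may carry several DKIM signatures
        "dkim_checks": [
            (a.findtext("domain"), a.findtext("result"))
            for a in record.findall("auth_results/dkim")
        ],
        "spf_checks": [
            (a.findtext("domain"), a.findtext("result"))
            for a in record.findall("auth_results/spf")
        ],
    }


parsed = parse_record(ET.fromstring(SAMPLE_RECORD))
print(parsed["source_ip"], parsed["count"], parsed["dmarc_dkim"], parsed["dmarc_spf"])
```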

Finding Spoofing Attempts

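A record with dkim=fail, spf=fail, and header_from=example.com means mail claiming to be from your domain arrived from a source that neither DKIM nor SPF vouches for. Unless it is a legitimate sender you have not authenticated yet, someone is sending as your domain from an IP you do not control. The source_ip tells you where it came from. The count tells you the volume.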

If your DMARC policy is p=reject, those messages should have been rejected, and a disposition of reject in the record confirms the policy worked. If your policy is p=none, DMARC did nothing to stop them: the report is telling you that you have a problem you are not yet blocking.

A handful of failing messages from a residential IP is more likely a misconfiguration than an attack: a small newsletter tool, or a user who set up an address in your domain as a send-as identity in their personal Gmail without authorization. Repeated failures from a hosting provider's IP range suggest someone is actively abusing your domain.
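The triage described above can be sketched in a few lines of Python. This is an illustration under assumptions: records are already parsed into dictionaries with the keys shown (hypothetical names), and the dkim and spf values are the policy_evaluated results. It sums the message count per source IP for records that fail both checks while using your domain or one of its subdomains in header_from.

```python
from collections import Counter


def is_own_domain(header_from, domain):
    """True if header_from is the domain itself or one of its subdomains."""
    header_from = header_from.lower()
    return header_from == domain or header_from.endswith("." + domain)


def unauthenticated_sources(records, domain):
    """Sum message counts per source IP for records failing both DMARC checks."""
    totals = Counter()
    for rec in records:
        if not is_own_domain(rec["header_from"], domain):
            continue
        if rec["dkim"] != "pass" and rec["spf"] != "pass":
            totals[rec["source_ip"]] += rec["count"]
    # Highest volume first
    return totals.most_common()


# Made-up sample data using documentation IP ranges.
records = [
    {"source_ip": "198.51.100.25", "count": 47, "dkim": "pass", "spf": "pass",
     "header_from": "example.com"},
    {"source_ip": "203.0.113.9", "count": 1200, "dkim": "fail", "spf": "fail",
     "header_from": "example.com"},
    {"source_ip": "203.0.113.9", "count": 300, "dkim": "fail", "spf": "fail",
     "header_from": "mail.example.com"},
    {"source_ip": "192.0.2.14", "count": 2, "dkim": "fail", "spf": "pass",
     "header_from": "example.com"},
]

suspects = unauthenticated_sources(records, "example.com")
print(suspects)
```

A source that appears here with a high total across several days is worth investigating first; one with a count of one or two is usually noise.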

Finding Shadow Senders

More common than outright spoofing: legitimate services sending as your domain that you forgot about, or that a department signed up for without telling IT. The report shows a source IP you do not recognize, with dkim=pass or spf=pass on your domain. The mail is authentic — someone configured SPF or DKIM for a service you did not know existed.

This is shadow IT in your mail stream. It is not malicious, but it is a governance problem. That service might have weak security practices. It might send poorly formatted mail that damages your sender reputation. It might get compromised and used for phishing.

Cross-reference the source IP with a WHOIS lookup:

whois 198.51.100.25 | grep -i org

The organization name usually identifies the service. SendGrid, Mailgun, and Amazon SES IPs are easy to spot. Unknown ASNs require more digging.
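Once you have identified a service, it helps to record its address ranges so the next report does not require another lookup. The sketch below checks a source IP against a hand-maintained inventory using Python's ipaddress module. The sender names and CIDR ranges are hypothetical placeholders from documentation ranges; replace them with the ranges your providers actually publish.

```python
import ipaddress

# Hypothetical inventory of approved senders and their address ranges.
KNOWN_SENDERS = {
    "office mail gateway": ["198.51.100.0/24"],
    "marketing platform": ["203.0.113.0/25", "2001:db8:10::/48"],
}


def identify_sender(ip, known=KNOWN_SENDERS):
    """Return the inventory name whose range contains ip, or None if unknown."""
    addr = ipaddress.ip_address(ip)
    for name, cidrs in known.items():
        for cidr in cidrs:
            net = ipaddress.ip_network(cidr)
            # Only compare addresses and networks of the same IP version
            if addr.version == net.version and addr in net:
                return name
    return None


for ip in ["198.51.100.25", "203.0.113.200", "2001:db8:10::1"]:
    print(ip, "->", identify_sender(ip) or "UNKNOWN, run whois")
```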

Parsing the XML

You can parse DMARC reports with standard XML tools. Reports arrive as compressed XML attachments, either gzip (.xml.gz) or ZIP (.zip) depending on the reporter. To extract and examine a single gzip-compressed report:

gunzip -c report.xml.gz | xmllint --format - | less
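If you script the extraction, you can handle both compression formats in one place. The following Python sketch inspects the first bytes of an attachment to decide whether it is gzip, ZIP, or plain XML, and returns the parsed root element. The function name is this sketch's own, and it assumes a ZIP attachment holds a single XML file. Reports come from third parties, so for production use consider a hardened XML parser rather than the standard library one.

```python
import gzip
import io
import zipfile
import xml.etree.ElementTree as ET


def load_report(data: bytes):
    """Decompress a report attachment if needed and return the XML root element."""
    if data[:2] == b"\x1f\x8b":            # gzip magic number
        xml_bytes = gzip.decompress(data)
    elif data[:4] == b"PK\x03\x04":        # ZIP local file header
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            # Assumption: the archive contains exactly one XML report
            xml_bytes = zf.read(zf.namelist()[0])
    else:
        xml_bytes = data                    # already plain XML
    return ET.fromstring(xml_bytes)


# Tiny stand-in report to demonstrate all three paths.
sample = b"<feedback><record><row><count>3</count></row></record></feedback>"
zipped = io.BytesIO()
with zipfile.ZipFile(zipped, "w") as zf:
    zf.writestr("report.xml", sample)

for blob in (sample, gzip.compress(sample), zipped.getvalue()):
    root = load_report(blob)
    print(root.tag, root.findtext("record/row/count"))
```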

To extract the source IP, message count, and DMARC results for every record across all decompressed reports in a directory, use a short xmlstarlet loop and redirect its output to a file:

for f in *.xml; do
  xmlstarlet sel -t -m "//record" \
    -v "row/source_ip" -o "," \
    -v "row/count" -o "," \
    -v "row/policy_evaluated/dkim" -o "," \
    -v "row/policy_evaluated/spf" -o "," \
    -v "identifiers/header_from" -n "$f"
done > dmarc.csv

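This produces a CSV of every record across all reports. Each row carries its own message count, so sum the second column per source IP rather than counting rows, then list the busiest sources:

awk -F, '{ sum[$1] += $2 } END { for (ip in sum) print sum[ip], ip }' dmarc.csv | sort -rn | head -20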
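If you would rather keep passing and failing traffic apart, a short Python sketch can do the tally. It assumes the five-column layout produced by the xmlstarlet loop (source_ip, count, dkim, spf, header_from) and treats a row as passing when either DKIM or SPF passed, which mirrors how DMARC decides. The function name is illustrative.

```python
import csv
import io
from collections import Counter


def totals_by_ip(csv_text):
    """Return (passed, failed) Counters of message totals keyed by source IP."""
    passed, failed = Counter(), Counter()
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) != 5:
            continue  # skip blank or malformed lines
        ip, count, dkim, spf, _header_from = row
        bucket = passed if "pass" in (dkim, spf) else failed
        bucket[ip] += int(count)
    return passed, failed


# Made-up rows in the same shape as dmarc.csv.
sample_csv = """198.51.100.25,47,pass,pass,example.com
198.51.100.25,12,pass,fail,example.com
203.0.113.9,1200,fail,fail,example.com
203.0.113.9,300,fail,fail,example.com
"""

passed, failed = totals_by_ip(sample_csv)
print("passing:", passed.most_common(20))
print("failing:", failed.most_common(20))
```

To run it against real data, read the file with open("dmarc.csv").read() and pass the text to totals_by_ip.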

For ongoing monitoring, use a dedicated DMARC report processor. Open-source options include parsedmarc (Python), which can ingest reports into Elasticsearch or Splunk, and dmarcts-report-viewer (PHP/MySQL), which provides a web dashboard. Commercial services such as Valimail, Dmarcian, and Postmark DMARC handle parsing and alerting for you.

Extracting Actionable Data

Set up a pipeline before you need it. The day you move from p=none to p=quarantine or p=reject, you will want a baseline of what normal looks like. If reports show 10,000 messages per day from your legitimate sources and suddenly 50,000 messages from a Bulgarian VPS provider appear, you have a problem that needs attention immediately.
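A baseline comparison does not need to be sophisticated to be useful. The sketch below compares today's per-IP totals against a baseline and flags sources that are new or have grown sharply. The thresholds (a threefold increase, a floor of 100 messages) are arbitrary assumptions for illustration; tune them to your own volume.

```python
def find_anomalies(baseline, today, spike_factor=3.0, min_volume=100):
    """Flag source IPs that are new or sending far more than their baseline.

    baseline and today map source IP -> message count for a comparable period.
    """
    alerts = []
    for ip, count in today.items():
        if count < min_volume:
            continue  # ignore low-volume noise
        usual = baseline.get(ip, 0)
        if usual == 0:
            alerts.append((ip, count, "new source"))
        elif count >= usual * spike_factor:
            alerts.append((ip, count, "volume spike"))
    # Largest volume first
    return sorted(alerts, key=lambda alert: alert[1], reverse=True)


# Made-up numbers echoing the scenario above.
baseline = {"198.51.100.25": 10000}
today = {"198.51.100.25": 10400, "203.0.113.77": 50000, "192.0.2.9": 5}

alerts = find_anomalies(baseline, today)
for ip, count, reason in alerts:
    print(f"{ip}: {count} messages ({reason})")
```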

The Email Authentication Checker on this site validates your SPF, DKIM, and DMARC configuration, but for ongoing DMARC monitoring you need to process the reports. At minimum, configure a rule in your mail client to filter DMARC reports into a dedicated folder and open one per week to scan for unknown IPs.

Forensic Reports (ruf)

DMARC also supports forensic reports via the ruf= tag. These contain redacted copies of individual messages that failed authentication. In theory, forensic reports show you exactly what the spoofed message looked like — subject line, body, links.

In practice, almost no major receiver sends forensic reports. Gmail does not. Microsoft does not. Yahoo does not. The privacy and data-handling concerns are too high. If you add a ruf= address, expect near-total silence. Put your effort into aggregate reports instead.

Working with the Email Header Analyzer

When a specific message lands in spam or gets bounced, aggregate reports are too high-level to help. You need to examine the headers of that specific message. The Email Header Analyzer parses raw headers and extracts every hop, authentication result, and timing detail.

Paste the headers from a delivered message into the analyzer. It shows you the SPF, DKIM, and DMARC results as evaluated by each receiving MTA in the delivery path. If Gmail saw dkim=fail but your DKIM key is valid, check whether an intermediary modified a signed header. If spf=softfail appears, check whether the connecting IP is in your SPF record. The analyzer traces the full path and highlights where authentication broke.
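To see what that kind of analysis works from, here is a rough Python sketch that pulls the SPF, DKIM, and DMARC verdicts out of the Authentication-Results headers in a raw message. It is not how the Email Header Analyzer is implemented; it is a simple regular-expression approach that keeps only the last verdict per method in each header, so a message with several DKIM signatures would need more careful parsing. The sample headers are made up.

```python
import re
from email import message_from_string

# Matches method=result pairs such as "dkim=fail" or "spf=softfail".
METHOD_RE = re.compile(r"\b(spf|dkim|dmarc)=(\w+)", re.IGNORECASE)


def auth_results(raw_headers):
    """Return one dict of method -> result per Authentication-Results header."""
    msg = message_from_string(raw_headers)
    results = []
    for value in msg.get_all("Authentication-Results", []):
        results.append(
            {method.lower(): result.lower()
             for method, result in METHOD_RE.findall(value)}
        )
    return results


RAW = (
    "Authentication-Results: mx.example.net;\n"
    "  dkim=fail header.i=@example.com;\n"
    "  spf=softfail smtp.mailfrom=example.com;\n"
    "  dmarc=fail header.from=example.com\n"
    "From: sender@example.com\n"
    "Subject: test\n"
    "\n"
)

results = auth_results(RAW)
print(results)
```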

DMARC reports tell you that something is wrong with your mail stream. The header analyzer tells you exactly what is wrong with a specific message. Together they cover both aggregate monitoring and individual troubleshooting.
