Why Every Phishing Simulator Has False Positives — Except One

Every other SAT platform uses pixel tracking to detect opens. Pixel tracking fires whenever scanners, link rewriters, or preview panes touch an email. Here's the architectural decision that eliminates the resulting false positives.



TL;DR

Every phishing simulation platform except Kinds Security uses pixel tracking to detect when users open emails. Pixel tracking produces false positives because any system that touches the email — secure email gateways, link rewriters, preview panes — can trigger the pixel without the user ever seeing the message. Kinds uses Microsoft Graph API and Gmail API polling instead, which eliminates false positives by design. The trade-off: this requires elevated OAuth permissions that about 11% of customers won't grant. Kinds doesn't serve those customers.

The phishing simulation dashboard says Jane in accounting opened the simulated DocuSign email three times in four minutes. Jane swears she never saw it. Her Outlook shows it in the inbox, unread, preview disabled.

Both are telling the truth.

Every other phishing simulation platform on the market uses pixel tracking to detect opens. A 1×1 transparent image is embedded in the email body. When the image loads, the platform records an "open." It's the same mechanism marketing emails have used for twenty years, and it has the same problem: the image fires whenever any automated system touches the email. In modern mail infrastructure, that happens constantly.
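
The over-counting mechanism is easy to sketch. This is an illustrative toy, not any vendor's real pipeline — the function name, campaign ID, and user-agent strings are all made up — but it shows the core problem: the pixel endpoint has no way to tell a human mail client from an automated scanner.

```python
# Toy model of a tracking-pixel endpoint. Every GET of the 1x1 image
# is recorded as an "open," regardless of who requested it.
opens = {}

def record_pixel_hit(campaign_id: str, user_agent: str) -> None:
    """Called for every request to the pixel URL, human or not."""
    opens.setdefault(campaign_id, []).append(user_agent)

# A secure email gateway inspects the message during delivery...
record_pixel_hit("docusign-sim-42", "Barracuda-Scanner/1.0")
# ...a link rewriter prefetches embedded content...
record_pixel_hit("docusign-sim-42", "SafeLinks-Prefetch/1.0")
# ...and only then does the user actually open the email.
record_pixel_hit("docusign-sim-42", "Outlook/16.0")

# The platform now shows three "opens" for one real one.
print(len(opens["docusign-sim-42"]))  # 3
```

One human open, three recorded opens — and nothing in the request tells the platform which is which.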

This post is about why that happens, what it costs you, and the architectural decision Kinds made to eliminate it.

Definition — False positive (in phishing simulation): A detection event recorded by the platform without corresponding user exposure to the email. Scanner activity, prefetch requests, and automated inspection can all trigger pixel fires without a human ever opening the message.

What actually causes false positives

A 1×1 tracking pixel doesn't care who loads it. It fires whenever any client, server, or automated system requests the image. In a typical Microsoft 365 or Google Workspace environment, between three and eight different systems can request that image before the user ever sees the email.

Secure Email Gateways — Mimecast, Proofpoint, Barracuda, and similar products open and inspect messages during delivery. Image requests fire during inspection.

Link rewriters and URL sandboxes — Microsoft Defender Safe Links, Mimecast URL Protection, and similar tools prefetch embedded content to check for malicious payloads. Pixels fire during prefetch.

Email clients with image auto-load — Outlook on the web and some mobile clients load images as soon as the mailbox syncs, whether the user opens the message or not.

Preview panes with image download enabled — display images when the user scrolls past the message in list view, even without opening it.

Forwarded messages, alternate devices, and mail-to-Slack integrations each generate additional pixel fires unrelated to the original recipient actually reading the message.

One real phishing simulation, delivered to one real user, can produce anywhere from zero to a dozen pixel fires before the user ever consciously engages with the email. The platform records every one of those as an "open."

The false-positive problem is well-known inside the industry. Major SAT vendors reference "preventing false positives" directly in their own configuration documentation.

The three architectural patterns in SAT today

There are really only three ways a phishing simulation platform can deliver an email to a user and detect when they open it. Each combination of delivery and detection choices produces a different false-positive profile.


Pattern A — Delivery: inbound SMTP with allowlisting or Advanced Delivery Policy. Detection: pixel tracking. Triggered by scanners: yes. Requires elevated permissions: no. False positives possible: yes.

Pattern B — Delivery: API injection (Microsoft Graph / Gmail API). Detection: pixel tracking. Triggered by scanners: partial. Requires elevated permissions: yes. False positives possible: yes.

Pattern C — Delivery: API injection (Microsoft Graph / Gmail API). Detection: API polling (isRead / UNREAD). Triggered by scanners: no. Requires elevated permissions: yes. False positives possible: no.

Pattern A is the industry default. The email travels the normal inbound mail path (SMTP → MX → SEG → inbox), relying on the customer to allowlist the vendor's sending IPs to prevent filtering. Detection uses a tracking pixel. Every scanner and link rewriter in the path can trigger the pixel.

Pattern A has two common sub-variants: basic IP allowlisting in the customer's mail filter, and Microsoft Advanced Delivery Policy, which provides a Defender exception for permitted third-party phishing simulators. Both route mail through the inbound path and both rely on pixel tracking for detection.

Pattern B improves the delivery side — the email is placed directly into the user's mailbox via API, bypassing the mail gateway — but still uses pixels for detection, which means false positives persist. Several vendors offer this as an optional configuration. Some are actively moving away from it due to the February 2025 Microsoft Application Impersonation retirement, which affected the legacy EWS-based injection approach. (Modern Microsoft Graph API injection is unaffected and fully supported by Microsoft — the retirement only broke the legacy pattern.)

Pattern C combines API injection on delivery with API polling on detection. Because the platform injected the email via API, it retains the exact emailMessageId, which means it can poll isRead (Microsoft) or the UNREAD label (Google) directly against the user's mailbox. Scanners, link rewriters, and preview fetches don't flip isRead — only the user's mail client does. No pixel, no false positive.
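
A minimal sketch of the Microsoft 365 side of Pattern C, for the curious. The endpoint shape and $select syntax are Microsoft Graph's own; the bearer token and IDs are placeholders that a platform would hold from the tenant's OAuth grant and the injection step — this is an illustration of the technique, not Kinds' production code.

```python
# Check whether the user's mail client has flipped isRead on an
# injected message, via a read-only Graph GET.
import json
import urllib.request

GRAPH = "https://graph.microsoft.com/v1.0"

def read_state_url(user_id: str, message_id: str) -> str:
    # Request only the isRead property; a GET cannot change it.
    return f"{GRAPH}/users/{user_id}/messages/{message_id}?$select=isRead"

def user_opened(token: str, user_id: str, message_id: str) -> bool:
    req = urllib.request.Request(
        read_state_url(user_id, message_id),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)["isRead"]
```

Note what this call cannot do: it reads state, it doesn't set it. Only the user's mail client marks the message read.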

Kinds Security uses Pattern C by default. No other SAT vendor currently does.

What false positives cost you

Your training data is wrong. If a user is recorded as opening three simulations they never saw, the platform thinks that user is engaging with phishing content at a higher rate than they actually are. Risk scores get inflated. "High-risk learner" tags get assigned to people who did nothing wrong. Remediation training gets pushed to the people who least need it, while the people who actually need it blend into the average.

Your compliance reporting is wrong in the same direction. Auditors and cyber insurance carriers look at engagement rates. When those rates are artificially elevated by scanner activity, you're reporting performance that doesn't reflect what your employees actually did. The report passes the audit. The numbers underneath it don't mean anything.

Your MSP helpdesk fields the complaints. The user who swears they didn't open the email opens a ticket. The MSP investigates. The MSP tries to explain to the client that the platform uses pixel tracking and sometimes counts pre-fetches. The user is unsatisfied. The client calls the MSP's operations lead. Twenty minutes disappear on a ticket that should never have existed. Multiply by 40 clients and the labor cost of false positives becomes a real line item.

The common industry framing is that a few false positives are an acceptable cost of coverage. They're not. They're the reason MSP operators stop trusting their own dashboards — and a platform you can't trust isn't really running your security awareness program.

Why other vendors settled for Patterns A and B

The Pattern C architecture requires elevated OAuth permissions. On Microsoft 365, that's the Mail.ReadWrite application permission, granted by an admin at the tenant level. On Google Workspace, it's the equivalent scope for Gmail API access.

Roughly 89% of customers are willing to grant those permissions. The other 11% aren't — usually because enterprise security policies prohibit broad third-party mailbox access, or because the security team hasn't approved that scope class for any vendor.

Every other SAT vendor has made the same commercial decision: build for 100% coverage, accept the detection inaccuracy, default to pixels. The honest version of the industry position is "we'd lose 11% of deals if we required these permissions, so we chose an architecture that works for everyone at the cost of accuracy." That's a defensible business decision. It's also the reason Jane shows up as having opened three emails she never saw.

The architectural decision Kinds made

Kinds requires Graph API or Gmail API permissions at onboarding. Customers who won't grant them aren't served.

Here's what actually happens when a Kinds phishing simulation reaches a user:

Delivery. The simulation is placed directly into the user's mailbox via the customer's tenant OAuth grant. The email isn't routed through SMTP, doesn't pass through the customer's secure email gateway, and doesn't get scanned, rewritten, or banner-injected. It arrives intact — exactly as a real attacker's email would appear if the attacker had somehow bypassed the gateway.

Read detection. Because the email was injected via API, Kinds retains the exact emailMessageId. A cron job polls the read state every five minutes over a fifteen-day window — isRead on Microsoft tenants (batched via Graph $batch requests) and the UNREAD label on Google Workspace tenants.

What counts as an "open." The user's own mail client flipped isRead to true. No scanner can do this. No link rewriter can do this. No preview-fetch from Defender can do this. A human, or a mail client configured to mark-as-read on the user's behalf, caused the state change.
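
The polling cycle above can be sketched as a batch-builder. The $batch endpoint and its 20-requests-per-call cap are Graph's; the five-minute cron and fifteen-day window are the platform's scheduling around it. This is a hedged sketch, not the actual implementation.

```python
# Build one Graph $batch body that checks isRead for up to 20
# (user_id, message_id) pairs — Graph's JSON batching limit.
def build_isread_batch(pending):
    if len(pending) > 20:
        raise ValueError("Graph $batch accepts at most 20 requests")
    return {
        "requests": [
            {
                "id": str(i),
                "method": "GET",
                "url": f"/users/{uid}/messages/{mid}?$select=isRead",
            }
            for i, (uid, mid) in enumerate(pending, start=1)
        ]
    }

# Each cycle, POST a body like this to
# https://graph.microsoft.com/v1.0/$batch and mark as opened every
# sub-response whose body reports isRead == true.
```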

By the definition of false positive stated at the top of this post — a detection event not caused by the user — the polling approach produces zero false positives. Not dramatically fewer. Not mostly accurate. Zero.

What this means for your training data

When the detection is accurate, everything downstream of it becomes useful.

Risk scores reflect actual user behavior. "High-risk learner" classifications are earned. Remediation training goes to the people who need it. Compliance reports show real engagement rates that an auditor or insurance carrier can rely on. The MSP dashboard says what happened, not what scanners did.

The platform's "recovered" signal — users who fell for a phish and then later recognized and reported it — becomes meaningful rather than noisy. That signal is the behavior worth training toward, because real attackers will eventually land something convincing. The question isn't whether your team catches every phish on the first look. It's whether they catch the one that got through on the second look. You can only answer that question if your detection data is real.

The 11% trade-off

Kinds won't serve customers who won't grant the required permissions. This is a deliberate commercial trade-off.

For 89% of MSPs and IT teams, Graph API or Gmail API permissions are already standard — the same permission scope other security platforms in the stack already require. For the remaining 11%, usually governed by policies that prohibit broad third-party mailbox access, Kinds isn't the right fit. We'd rather say that out loud than ship a fallback architecture that produces wrong numbers.

Most SaaS vendors wouldn't make that choice. We think the 89% of customers who care about accurate detection deserve an architecture built for them, not a lowest-common-denominator compromise that protects the 11% at everyone else's expense.

Frequently Asked Questions

Why do phishing simulation platforms report opens by users who never opened the email?

Most platforms use a 1×1 tracking pixel embedded in the email to detect opens. The pixel fires whenever any automated system loads it — secure email gateways during inspection, link rewriters during URL prefetching, preview panes when a user scrolls past, mobile clients on mailbox sync. None of these require the user to actually open the message. The platform records them all as opens.

What is mailbox injection in phishing simulation?

Mailbox injection is a delivery method in which the phishing simulation is placed directly into the user's mailbox through the customer's authorized API access, rather than being sent through the normal SMTP inbound path. Because the email bypasses the mail gateway, it arrives intact — no link rewriting, no banner injection, no scanner inspection. The platform also retains the exact message ID, which enables direct polling of read state rather than pixel-based detection.
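
A hedged sketch of what injection looks like on Microsoft 365. The Graph endpoint shape is Microsoft's; authentication, error handling, and the draft-flag details a production injector has to deal with are deliberately omitted here.

```python
# Build the Graph request that places a message directly into a
# user's Inbox folder, bypassing the SMTP inbound path entirely.
import json
import urllib.request

GRAPH = "https://graph.microsoft.com/v1.0"

def build_injection_request(user_id: str, subject: str, html_body: str):
    payload = {
        "subject": subject,
        "body": {"contentType": "HTML", "content": html_body},
    }
    return urllib.request.Request(
        url=f"{GRAPH}/users/{user_id}/mailFolders/inbox/messages",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# The response to this POST carries the new message's "id" — the
# emailMessageId described above — which is stored for read polling.
```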

What Microsoft Graph API permissions does phishing simulation require?

Mailbox injection and API-based read polling on Microsoft 365 require the Mail.ReadWrite application permission, granted by a tenant admin. The equivalent permission on Google Workspace grants API access to the Gmail service. These are standard OAuth scopes, but they're broader than some enterprises allow for third-party applications.

Can phishing simulation work without pixel tracking?

Yes. By using Microsoft Graph API or Gmail API polling to check the isRead status of injected messages directly, a phishing simulator can detect opens without relying on tracking pixels. This approach is architecturally more accurate because read state is only modified by the user's mail client — not by scanners, link rewriters, or automated inspection systems.
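
The Google Workspace side works slightly differently and can be sketched as follows. Gmail has no isRead field; read state is the absence of the UNREAD system label, which the Gmail API's users.messages.get returns even at format=minimal. The user ID and message ID below are placeholders.

```python
# Check read state on Gmail: a message is "read" once the UNREAD
# system label is gone from its labelIds.
GMAIL = "https://gmail.googleapis.com/gmail/v1"

def message_url(user_id: str, message_id: str) -> str:
    # format=minimal still includes labelIds, which is all we need.
    return f"{GMAIL}/users/{user_id}/messages/{message_id}?format=minimal"

def gmail_message_read(message_resource: dict) -> bool:
    return "UNREAD" not in message_resource.get("labelIds", [])
```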

Why don't most SAT vendors use API polling for read detection?

API polling requires the customer to grant elevated OAuth permissions to the vendor. Roughly 11% of customers won't grant these permissions for security policy reasons. Rather than build an architecture that excludes 11% of the market, most vendors default to pixel tracking, which works for everyone but produces false positives.

What is the February 2025 Application Impersonation retirement?

Microsoft retired the Application Impersonation admin role in February 2025, which affected the legacy EWS-based approach some phishing simulation vendors used for mail injection. Modern Microsoft Graph API-based injection with the Mail.ReadWrite application permission is unaffected by this retirement and remains fully supported and documented by Microsoft.

The bottom line

Every other vendor chose coverage over accuracy. Kinds chose the other way.

If your current phishing simulation platform reports opens by users who swear they never saw the email, now you know why. The platform isn't lying. It's just measuring something other than what you thought it was measuring — the behavior of scanners and mail clients, not the behavior of your employees.

You don't have a training problem. You have a detection problem.

Always automated.
Nothing to manage.

Leave Training & Simulated Phishing to us.


© 2026 Kinds Security Inc. All rights reserved.
