What Is a Homograph Attack?

A homograph attack, also called a homoglyph attack or IDN homograph attack, is a deceptive cyberattack in which threat actors register domain names using characters from different writing systems (Cyrillic, Greek, Latin) that visually resemble one another. Attackers replace Latin characters with nearly identical-looking non-Latin characters to create domains that look legitimate to human eyes but are technically different to computer systems. The goal is to trick users into visiting malicious sites they believe are legitimate, as described by Palo Alto Networks Unit 42, CyberWire, and Wikipedia in material published in 2024-2025.

Unlike typosquatting, which relies on spelling errors, or combosquatting, which adds keywords, homograph attacks exploit Unicode character sets to create imitations of legitimate domains that are visually indistinguishable on casual inspection.

How does a homograph attack work?

Homograph attacks exploit the visual similarity between characters from different Unicode writing systems to create domains that appear identical to legitimate ones.

Character substitution begins when attackers identify Unicode characters from non-Latin scripts that visually resemble Latin characters. Common substitutions include Cyrillic 'а' (U+0430), 'о' (U+043E), 'р' (U+0440), 'с' (U+0441), 'х' (U+0445), and 'у' (U+0443), which stand in for Latin 'a', 'o', 'p', 'c', 'x', and 'y', and Greek 'α' (alpha, U+03B1) and 'ο' (omicron, U+03BF). These characters look identical or nearly identical to their Latin equivalents but have different Unicode code points.
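The code points listed above can be checked with a few lines of Python. This is a minimal sketch using only the standard `unicodedata` module; the `CONFUSABLE_PAIRS` list and the `describe` helper are illustrative names, and the list covers only the characters named in this article, not the full set of Unicode confusables.

```python
import unicodedata

# (lookalike character, Latin letter it imitates), written as escapes so the
# difference is visible in source code. Illustrative subset only.
CONFUSABLE_PAIRS = [
    ("\u0430", "a"),  # Cyrillic
    ("\u043e", "o"),  # Cyrillic
    ("\u0440", "p"),  # Cyrillic
    ("\u0441", "c"),  # Cyrillic
    ("\u0445", "x"),  # Cyrillic
    ("\u0443", "y"),  # Cyrillic
    ("\u03b1", "a"),  # Greek alpha
    ("\u03bf", "o"),  # Greek omicron
]


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for lookalike, latin in CONFUSABLE_PAIRS:
    print(f"{lookalike!r}: {describe(lookalike)}  <->  {latin!r}: {describe(latin)}")
```

Each printed row shows two characters that may render alike but carry different code points and different official Unicode names.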

Domain creation proceeds when these lookalike characters are substituted into legitimate domain names. For example, replacing some of the Latin letters in "google.com" with Cyrillic lookalikes produces a domain that appears visually identical but is technically different.

Visual deception succeeds because, to human observers, the forged domain appears identical to the legitimate one. Computer systems treat the two domains as distinct because the substituted characters have different Unicode code points. This creates a disconnect between human perception and technical reality.
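The gap between what a person sees and what software compares can be shown directly. The sketch below builds a hypothetical forgery of "google.com" by swapping both Latin 'o' letters for Cyrillic 'о' (U+043E), then compares the two strings and converts each to its ASCII (Punycode) form. It assumes Python's built-in `idna` codec, which implements the older IDNA 2003 rules; the helper names are illustrative.

```python
LEGITIMATE = "google.com"
# Hypothetical forgery: both Latin 'o' letters replaced by Cyrillic U+043E.
FORGED = "g\u043e\u043egle.com"


def to_punycode(hostname: str) -> str:
    """Encode a hostname to its ASCII form with Python's built-in IDNA codec."""
    return hostname.encode("idna").decode("ascii")


def differing_positions(a: str, b: str) -> list[int]:
    """Indexes at which two equal-length strings hold different code points."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


print(FORGED == LEGITIMATE)                     # False
print(differing_positions(LEGITIMATE, FORGED))  # [1, 2]
print(to_punycode(LEGITIMATE))                  # google.com
print(to_punycode(FORGED).startswith("xn--"))   # True
```

The two strings have the same length and may render the same, yet the comparison fails and only the forged name needs an "xn--" encoding.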

Distribution occurs when attackers embed these URLs in phishing emails claiming to be from legitimate organizations, advertisements that appear in search results or on legitimate sites, and social engineering attacks via messaging platforms. Users cannot distinguish the homograph domain from the legitimate one by visual inspection alone.

Real-world attack cases demonstrate the technique's effectiveness. In a Google Drive impersonation campaign, attackers created fake financial service accounts with homograph-laden display names to share malicious documents requesting user verification, according to Unit 42 research published in 2024. In a DocuSign credential-harvesting campaign, threat actors impersonated DocuSign with homograph characters in sender addresses and subject lines, directing victims through elaborate redirect chains to credential-harvesting pages. A Spotify phishing campaign used homograph characters in the sender address and claimed payment issues to trick users into updating account details.

How does a homograph attack differ from other domain attacks?

Homograph attacks employ distinct mechanisms compared to related domain-based attack techniques.

Typosquatting relies on user typing errors, while homograph attacks use character substitution within correctly spelled domains. Combosquatting combines brand names with keywords, while homograph attacks use visually identical Unicode replacements. Domain spoofing and lookalike domains are broader categories; homograph attacks are a specific, sophisticated subtype that relies on Unicode character substitution. Email spoofing targets email users in general, whereas homograph attacks focus specifically on Unicode character deception.

The key distinction is visual perfection versus detectable differences. Typosquatted domains like "goggle.com" are visually distinct from "google.com" if examined carefully. Homograph domains using Cyrillic characters appear pixel-perfect identical to "google.com" in many fonts and display contexts.
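One way to express this distinction in code is to reduce each domain to a "skeleton" in which known lookalikes are mapped back to the Latin letters they imitate. The sketch below is an illustration under stated assumptions: the `CONFUSABLES` table is a hand-picked subset taken from this article, and `skeleton` and `classify` are hypothetical helper names. A production tool would rely on the full Unicode confusables data instead.

```python
# Hand-picked subset of lookalikes (assumption: not the full Unicode data).
CONFUSABLES = {
    "\u0430": "a", "\u043e": "o", "\u0440": "p", "\u0441": "c",
    "\u0445": "x", "\u0443": "y", "\u03b1": "a", "\u03bf": "o",
}


def skeleton(domain: str) -> str:
    """Map each known lookalike character to the Latin letter it imitates."""
    return "".join(CONFUSABLES.get(ch, ch) for ch in domain.lower())


def classify(candidate: str, legitimate: str) -> str:
    """Label a candidate domain relative to a legitimate one."""
    if candidate == legitimate:
        return "same domain"
    if skeleton(candidate) == skeleton(legitimate):
        return "homograph"          # differs only by lookalike characters
    return "different spelling"     # e.g. a typosquat, visible on inspection


print(classify("goggle.com", "google.com"))              # different spelling
print(classify("g\u043e\u043egle.com", "google.com"))    # homograph
```

The typosquat changes the visible spelling, so its skeleton differs; the homograph collapses to exactly the legitimate name once lookalikes are normalized.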

This makes homograph attacks particularly dangerous because users cannot detect them through careful URL inspection alone. Even security-conscious users who check URLs before clicking may be deceived by homograph domains.

Why do homograph attacks matter?

The sophistication and detection difficulty of homograph attacks demonstrate significant risk despite relatively limited deployment.

Researchers identified 8,284 IDN homographs among 4.4 million registered internationalized domain names across 570 TLDs, according to Chiba et al. research via DomainScouter published in 2019. While this is a small share of the IDNs examined (roughly 0.19%), the targeted nature of these registrations indicates deliberate attack infrastructure.

An average of 67 newly detected homograph IDNs per day was identified during a recent 32-day analysis, or roughly 2,100 over the period. This consistent discovery rate indicates ongoing attacker investment in the technique.

Every identified homograph IDN was queried in DNS traffic by at least one device during the monitoring period, suggesting active use rather than defensive registration.

80% of homoglyph pairs occur within high-traffic domains worldwide, with regional concentrations in Pakistan and Europe. This targeting of high-value domains indicates strategic selection by attackers.

90% of organizations faced domain-based attacks with average losses of $1.1 million per incident, according to IDC research published in 2023. While not all domain-based attacks use homographs, the technique represents one of the most sophisticated approaches.

Homograph attacks are primarily used by Advanced Persistent Threats (APTs) and other high-level adversaries, because they require technical sophistication to set up and maintain. Use by such sophisticated actors indicates deployment against high-value targets including government, finance, and critical infrastructure.

What are the limitations of homograph attacks?

Despite their visual perfection, homograph attacks face operational and technical constraints that limit widespread deployment.

Technical sophistication requirements limit the attacker pool. The attack requires technical knowledge to identify and register appropriate homograph domains, attackers must manage multiple domain registrations and keep them active, and the cost of registering many homograph domains for comprehensive coverage is substantial.

Browser and system defenses have evolved. Modern browsers increasingly highlight IDN domains that contain non-ASCII characters, Unicode-aware security tools detect character substitution through metadata analysis, and in many contexts inspecting the browser URL bar reveals non-Latin characters to observant users.

Cultural and linguistic awareness provides detection opportunities. International users may recognize non-Latin characters used incorrectly, and Cyrillic characters in domains for English-language services can raise suspicion. Native speakers of languages using these scripts can often identify misuse.

Email client rendering varies. Some email clients do not render non-Latin characters properly, which can either hide attack indicators or reveal the technical difference between homograph and legitimate domains, depending on client configuration.

Mobile browser limitations can either hide or reveal homographs. Mobile browsers often display abbreviated domain names, which obscures the homograph, but some mobile systems render Unicode differently, which makes non-Latin characters more visible.

At the same time, several factors still work in attackers' favor. Many users do not carefully inspect URLs before clicking links. Visual similarity makes user awareness training less effective than for other attack types. Internationalized Domain Name (IDN) adoption creates a legitimate need for non-Latin domains, making detection harder. Even so, cost and complexity limit the number of homograph variations attackers can feasibly maintain.

How can organizations defend against homograph attacks?

Defense against homograph attacks requires implementing Unicode-aware detection systems, monitoring for malicious registrations, and establishing strong email security controls.

Deploy advanced email filtering through email security solutions analyzing metadata, content, and behavioral patterns to detect homograph domains, Unicode character analysis to flag non-ASCII substitutions in sender addresses and domains, sender reputation systems that identify newly registered IDN domains, and integration with threat intelligence feeds containing known homograph domains.
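The Unicode character analysis mentioned above can be approximated with a simple check on the sender's domain. The sketch below uses `email.utils.parseaddr` from the standard library; `sender_domain`, `is_idn_domain`, and `flag_sender` are hypothetical helper names, and the sample addresses are invented. Treating every IDN sender as worth a closer look is an assumption that suits organizations whose correspondents normally use ASCII domains; it is not a verdict that the sender is malicious.

```python
from email.utils import parseaddr


def sender_domain(from_header: str) -> str:
    """Extract the lowercased domain of the address in a From header value."""
    _display_name, address = parseaddr(from_header)
    return address.rpartition("@")[2].lower()


def is_idn_domain(domain: str) -> bool:
    """True if any label is Punycode-encoded (xn--) or contains non-ASCII."""
    return any(
        label.startswith("xn--") or not label.isascii()
        for label in domain.split(".")
    )


def flag_sender(from_header: str) -> dict:
    """Summarize whether a sender's domain deserves extra scrutiny."""
    domain = sender_domain(from_header)
    non_ascii = [f"U+{ord(ch):04X}" for ch in domain if not ch.isascii()]
    return {"domain": domain, "non_ascii": non_ascii, "flag": is_idn_domain(domain)}


# Invented examples: the second address uses Cyrillic U+0430 in place of 'a'.
print(flag_sender("Billing <billing@example.com>"))
print(flag_sender("Billing <billing@ex\u0430mple.com>"))
```

Checking for the "xn--" prefix as well as raw non-ASCII characters matters because a header may carry the domain in either form.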

Implement domain monitoring by monitoring for homograph registrations targeting brand names across multiple language scripts including Cyrillic, Greek, Arabic, and Chinese character sets, subscribing to domain registration feeds that include IDN registrations, analyzing Certificate Transparency logs for SSL certificates issued for homograph variants, and establishing alerts for newly registered domains using brand names in non-Latin scripts.
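A monitoring watchlist can be seeded by enumerating the lookalike spellings of a brand label and converting each to the Punycode form that appears in registration feeds and certificates. The following sketch assumes a small Cyrillic-only `LOOKALIKES` table drawn from this article and Python's built-in `idna` codec (IDNA 2003); `homograph_variants` and `watchlist` are illustrative names.

```python
from itertools import product

# Cyrillic lookalikes for Latin letters (illustrative subset, an assumption).
LOOKALIKES = {
    "a": ["\u0430"], "o": ["\u043e"], "p": ["\u0440"],
    "c": ["\u0441"], "x": ["\u0445"], "y": ["\u0443"],
}


def homograph_variants(label: str) -> list[str]:
    """All spellings of `label` with at least one letter swapped for a lookalike."""
    options = [[ch] + LOOKALIKES.get(ch, []) for ch in label]
    variants = ("".join(combo) for combo in product(*options))
    return [v for v in variants if v != label]


def watchlist(label: str, tld: str = "com") -> list[str]:
    """Punycode hostnames to watch for in registration or certificate feeds."""
    return sorted(
        f"{variant}.{tld}".encode("idna").decode("ascii")
        for variant in homograph_variants(label)
    )


for hostname in watchlist("apple"):
    print(hostname)
```

The number of variants doubles with every substitutable letter, so longer brand names may need a cap or a prioritized subset rather than full enumeration.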

Deploy sender verification protocols through DMARC, SPF, and DKIM to authenticate email senders and prevent spoofing, strict DMARC policies (p=reject) that block emails failing authentication, monitoring DMARC reports to identify spoofing attempts using homograph domains, and email header analysis to detect discrepancies between display names and actual sender domains.
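Whether a domain publishes the strict policy described above can be checked by parsing its DMARC TXT record. The sketch below works on the record text only and performs no DNS lookup; `parse_dmarc` and `enforces_reject` are hypothetical helper names, and the sample record and report address are invented for illustration.

```python
def parse_dmarc(txt_record: str) -> dict[str, str]:
    """Parse a DMARC TXT record value into a tag -> value mapping."""
    tags: dict[str, str] = {}
    for part in txt_record.split(";"):
        if "=" in part:
            key, _, value = part.partition("=")
            tags[key.strip().lower()] = value.strip()
    return tags


def enforces_reject(txt_record: str) -> bool:
    """True if the record is DMARC version 1 with policy p=reject."""
    tags = parse_dmarc(txt_record)
    return tags.get("v") == "DMARC1" and tags.get("p", "").lower() == "reject"


# Invented sample record, as it might be published at _dmarc.example.com.
SAMPLE = "v=DMARC1; p=reject; rua=mailto:dmarc-reports@example.com"
print(parse_dmarc(SAMPLE))
print(enforces_reject(SAMPLE))               # True
print(enforces_reject("v=DMARC1; p=none"))   # False
```

A strict policy protects the organization's own domain. Mail sent from a separately registered lookalike domain is evaluated against that domain's records, so display-name and Unicode checks remain necessary alongside DMARC.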

Conduct user training by educating users on examining Unicode characters and full domain names, teaching users not to rely solely on visual appearance but to verify through alternative channels, training users to recognize when domains contain mixed scripts (Latin plus Cyrillic), and encouraging users to reach sites through bookmarks or password managers, which match on the exact domain and will not autofill on a lookalike, rather than following links in messages.

Monitor IDN registrations through continuous monitoring of internationalized domain name registrations for suspicious patterns, tracking domains registered with character sets not matching expected business geography, analyzing registration patterns for bulk IDN registrations of brand-related terms, and coordinating with registrars to identify and take down malicious homograph registrations.

Watch Certificate Transparency logs by monitoring for SSL certificates issued for homograph variants of organizational domains, setting alerts for certificates containing brand names in non-Latin character sets, analyzing certificate issuance patterns to identify homograph attack infrastructure, and reporting fraudulent certificates to Certificate Authorities for revocation.

Register common homograph variants preemptively across high-risk language scripts including Cyrillic, Greek, and other scripts with visually similar characters, prioritizing registration of high-value brand terms in homograph form, maintaining defensive registrations even for domains not actively used, and working with domain registrars that support IDN registrations.

Deploy technical controls using Unicode/Punycode analyzers that reveal the technical representation of IDN domains, displaying punycode domain conversion to reveal hidden characters (xn-- prefix indicates punycode), Content Security Policy (CSP) headers to restrict which domains can be loaded, email link rewriting to route links through security scanning systems, and DNS monitoring for suspicious homograph domain queries.
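A Unicode/Punycode analyzer of the kind described above can be sketched in a few lines. The example accepts a hostname in either form, produces both representations, and lists every non-ASCII character with its code point and Unicode name. It assumes Python's built-in `idna` codec (IDNA 2003); `analyze_hostname` is an illustrative name, not an existing tool.

```python
import unicodedata


def analyze_hostname(hostname: str) -> dict:
    """Return the Unicode form, the Punycode form, and any non-ASCII characters."""
    hostname = hostname.lower()
    if hostname.isascii():
        # Possibly Punycode: decode "xn--" labels back to Unicode.
        unicode_form = hostname.encode("ascii").decode("idna")
    else:
        unicode_form = hostname
    punycode_form = unicode_form.encode("idna").decode("ascii")
    hidden = [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in unicode_form
        if not ch.isascii()
    ]
    return {"unicode": unicode_form, "punycode": punycode_form, "non_ascii": hidden}


report = analyze_hostname("xn--pple-43d.com")
print(report["punycode"])    # xn--pple-43d.com
print(report["non_ascii"])   # one entry: U+0430 CYRILLIC SMALL LETTER A
```

Showing the character names next to the rendered hostname gives an analyst the evidence that the visible text alone hides.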

FAQs

How is a homograph attack different from other lookalike domain attacks?

Unlike typosquatting, which exploits typing errors, or combosquatting, which adds keywords to brand names, homograph attacks use Unicode character substitution, according to Palo Alto Networks Unit 42 research published in 2024. To human eyes, the domain appears identical to the legitimate one, but computer systems recognize it as different because it uses characters from different writing systems.

The visual perfection distinguishes homograph attacks from other techniques. A typosquatted domain like "goggle.com" is visually distinct if examined. A homograph domain using Cyrillic 'о' instead of Latin 'o' appears pixel-perfect identical in most fonts.

Can I see a homograph attack in my browser's URL bar?

Modern browsers increasingly show warnings for IDN domains with mixed scripts, and some display the technical punycode representation, according to Palo Alto Networks and Akamai research. However, older browsers or certain email clients may render homograph domains indistinguishably from legitimate ones, which is why this attack is particularly dangerous.

Punycode representation begins with "xn--" and reveals the encoded Unicode characters. For example, a homograph domain might display as "xn--pple-43d.com" in punycode form, which decodes to "аpple.com" with a Cyrillic 'а' (U+0430) as its first letter, revealing it is not the actual "apple.com" domain.

Users should enable browser security features that warn about IDN domains and display punycode when available.

What characters are most commonly used in homograph attacks?

Cyrillic characters are most commonly used because they visually resemble Latin characters, according to Palo Alto Networks Unit 42 research. Specific high-value substitutions include Cyrillic 'а' for Latin 'a', 'о' for 'o', 'р' for 'p', and 'с' for 'c'. Greek characters including alpha and omicron are also used.

The effectiveness of specific character substitutions depends on the font and rendering system. Some character pairs are indistinguishable in certain fonts while others may show subtle differences. Attackers test various font environments to identify the most convincing substitutions.

Are homograph attacks widespread?

Homograph attacks are not mainstream attacks used by typical cybercriminals, according to Akamai and Unit 42 research published in 2023-2024. They require technical sophistication and are primarily used by Advanced Persistent Threats (APTs) and sophisticated threat actors for targeted campaigns. Daily detection of approximately 67 new homograph domains indicates ongoing activity by these advanced groups.

The technique remains relatively rare compared to typosquatting and combosquatting, which are deployed at much larger scale. However, homograph attacks represent the most sophisticated form of domain impersonation and are deployed selectively against high-value targets.

How do legitimate international domain names differ from homograph attacks?

Legitimate IDNs use non-Latin characters intentionally and transparently for websites serving non-English speaking populations. Homograph attacks disguise non-Latin characters to impersonate Latin-character domains. The malicious intent and deceptive substitution distinguish attacks from legitimate internationalization.

A legitimate IDN might be "موقع.com" for an Arabic-language site, clearly using Arabic script. A homograph attack uses Cyrillic 'а' substituted for Latin 'a' in "paypal.com" to create a visually identical but technically different domain intended to deceive users.
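The difference between the two examples can be approximated by checking whether a single label mixes scripts. The sketch below is a heuristic under stated assumptions: Python's standard library has no script property, so the first word of each letter's Unicode name (LATIN, CYRILLIC, ARABIC, and so on) stands in for its script. `scripts_in` and `mixed_script_labels` are hypothetical helper names.

```python
import unicodedata


def scripts_in(label: str) -> set[str]:
    """Approximate the scripts in a label from Unicode character names.

    Heuristic: the first word of a letter's name is treated as its script.
    Digits and hyphens are ignored.
    """
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in label
        if ch.isalpha()
    }


def mixed_script_labels(hostname: str) -> list[str]:
    """Labels of a hostname whose letters come from more than one script."""
    return [label for label in hostname.split(".") if len(scripts_in(label)) > 1]


ARABIC_IDN = "\u0645\u0648\u0642\u0639.com"   # Arabic label with a Latin TLD
FORGED = "p\u0430ypal.com"                    # Cyrillic U+0430 inside a Latin label

print(mixed_script_labels(ARABIC_IDN))   # []
print(mixed_script_labels(FORGED))       # one mixed label
```

The check is applied per label because a legitimate IDN commonly pairs a non-Latin label with a Latin TLD. A label written entirely in one non-Latin script passes this check, so it complements confusable matching rather than replacing it.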