Adversaries may use homoglyphs, Unicode characters that look identical to ASCII letters, in filenames to disguise malicious payloads as legitimate files and evade detection. SOC teams should proactively hunt for this behavior in Microsoft Sentinel (formerly Azure Sentinel) to identify obfuscation and prevent unauthorized file execution.
Detection Rule
title: Potential Homoglyph Attack Using Lookalike Characters in Filename
id: 4f1707b1-b50b-45b4-b5a2-3978b5a5d0d6
status: test
description: |
    Detects the presence of Unicode characters which are homoglyphs, or identical in appearance, to ASCII letter characters.
    This is used as an obfuscation and masquerading technique. Only "perfect" homoglyphs are included; these are characters that
    are indistinguishable from ASCII characters and thus may make excellent candidates for homoglyph attack characters.
references:
    - https://redcanary.com/threat-detection-report/threats/socgholish/#threat-socgholish
    - http://www.irongeek.com/homoglyph-attack-generator.php
author: Micah Babinski, @micahbabinski
date: 2023-05-08
tags:
    - attack.defense-evasion
    - attack.t1036
    - attack.t1036.003
    # - attack.t1036.008
logsource:
    category: file_event
    product: windows
detection:
    selection_upper:
        TargetFilename|contains:
            - "\u0410" # А/A
            - "\u0412" # В/B
            - "\u0415" # Е/E
            - "\u041a" # К/K
            - "\u041c" # М/M
            - "\u041d" # Н/H
            - "\u041e" # О/O
            - "\u0420" # Р/P
            - "\u0421" # С/C
            - "\u0422" # Т/T
            - "\u0425" # Х/X
            - "\u0405" # Ѕ/S
            - "\u0406" # І/I
            - "\u0408" # Ј/J
            - "\u04ae" # Ү/Y
            - "\u04c0" # Ӏ/I
            - "\u050C" # Ԍ/G
            - "\u051a" # Ԛ/Q
            - "\u051c" # Ԝ/W
            - "\u0391" # Α/A
            - "\u0392" # Β/B
            - "\u0395" # Ε/E
            - "\u0396" # Ζ/Z
            - "\u0397" # Η/H
            - "\u0399" # Ι/I
            - "\u039a" # Κ/K
            - "\u039c" # Μ/M
            - "\u039d" # Ν/N
            - "\u039f" # Ο/O
            - "\u03a1" # Ρ/P
            - "\u03a4" # Τ/T
            - "\u03a5" # Υ/Y
            - "\u03a7" # Χ/X
    selection_lower:
        TargetFilename|contains:
            - "\u0430" # а/a
            - "\u0435" # е/e
            - "\u043e" # о/o
            - "\u0440" # р/p
            - "\u0441" # с/c
            - "\u0445" # х/x
            - "\u0455" # ѕ/s
            - "\u0456" # і/i
            - "\u04cf" # ӏ/l
            - "\u0458" # ј/j
            - "\u04bb" # һ/h
            - "\u0501" # ԁ/d
            - "\u051b" # ԛ/q
            - "\u051d" # ԝ/w
            - "\u03bf" # ο/o
    condition: 1 of selection_*
falsepositives:
    - File names with legitimate Cyrillic text. Will likely require tuning (or not be usable) in countries where these alphabets are in use.
level: medium
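To make the rule's matching logic concrete, here is a minimal Python sketch that checks a filename against the same 48 code points. It assumes exact, case-sensitive code point matching, which may differ from how a given Sigma backend evaluates `contains`. The names `HOMOGLYPHS`, `rule_matches`, `find_homoglyphs`, and `ascii_skeleton` are hypothetical helpers and not part of the rule.

```python
# Code points copied from the rule's selection_upper and selection_lower lists,
# mapped to the ASCII letter each one imitates (per the rule's comments).
HOMOGLYPHS = {
    # Cyrillic upper case
    "\u0410": "A", "\u0412": "B", "\u0415": "E", "\u041a": "K", "\u041c": "M",
    "\u041d": "H", "\u041e": "O", "\u0420": "P", "\u0421": "C", "\u0422": "T",
    "\u0425": "X", "\u0405": "S", "\u0406": "I", "\u0408": "J", "\u04ae": "Y",
    "\u04c0": "I", "\u050c": "G", "\u051a": "Q", "\u051c": "W",
    # Greek upper case
    "\u0391": "A", "\u0392": "B", "\u0395": "E", "\u0396": "Z", "\u0397": "H",
    "\u0399": "I", "\u039a": "K", "\u039c": "M", "\u039d": "N", "\u039f": "O",
    "\u03a1": "P", "\u03a4": "T", "\u03a5": "Y", "\u03a7": "X",
    # Cyrillic lower case
    "\u0430": "a", "\u0435": "e", "\u043e": "o", "\u0440": "p", "\u0441": "c",
    "\u0445": "x", "\u0455": "s", "\u0456": "i", "\u04cf": "l", "\u0458": "j",
    "\u04bb": "h", "\u0501": "d", "\u051b": "q", "\u051d": "w",
    # Greek lower case
    "\u03bf": "o",
}


def rule_matches(filename: str) -> bool:
    """True if the filename contains at least one listed homoglyph."""
    return any(ch in HOMOGLYPHS for ch in filename)


def find_homoglyphs(filename: str) -> list[tuple[int, str, str]]:
    """Return (index, character, ASCII lookalike) for each listed homoglyph."""
    return [(i, ch, HOMOGLYPHS[ch]) for i, ch in enumerate(filename) if ch in HOMOGLYPHS]


def ascii_skeleton(filename: str) -> str:
    """Replace listed homoglyphs with their ASCII lookalikes to reveal the imitated name."""
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in filename)


if __name__ == "__main__":
    suspicious = "svch\u043est.exe"  # Cyrillic "о" in place of Latin "o"
    print(rule_matches(suspicious), find_homoglyphs(suspicious), ascii_skeleton(suspicious))
```

During triage, the skeleton helper shows which legitimate name a flagged file imitates. Accented Latin names such as `résumé.docx` pass through unchanged, because the rule lists no Latin code points.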
KQL Query
imFileEvent
| where (
    // Cyrillic upper-case lookalikes
    TargetFileName contains "А" or TargetFileName contains "В" or TargetFileName contains "Е" or TargetFileName contains "К"
    or TargetFileName contains "М" or TargetFileName contains "Н" or TargetFileName contains "О" or TargetFileName contains "Р"
    or TargetFileName contains "С" or TargetFileName contains "Т" or TargetFileName contains "Х" or TargetFileName contains "Ѕ"
    or TargetFileName contains "І" or TargetFileName contains "Ј" or TargetFileName contains "Ү" or TargetFileName contains "Ӏ"
    or TargetFileName contains "Ԍ" or TargetFileName contains "Ԛ" or TargetFileName contains "Ԝ"
    // Greek upper-case lookalikes
    or TargetFileName contains "Α" or TargetFileName contains "Β" or TargetFileName contains "Ε" or TargetFileName contains "Ζ"
    or TargetFileName contains "Η" or TargetFileName contains "Ι" or TargetFileName contains "Κ" or TargetFileName contains "Μ"
    or TargetFileName contains "Ν" or TargetFileName contains "Ο" or TargetFileName contains "Ρ" or TargetFileName contains "Τ"
    or TargetFileName contains "Υ" or TargetFileName contains "Χ"
  ) or (
    // Cyrillic lower-case lookalikes
    TargetFileName contains "а" or TargetFileName contains "е" or TargetFileName contains "о" or TargetFileName contains "р"
    or TargetFileName contains "с" or TargetFileName contains "х" or TargetFileName contains "ѕ" or TargetFileName contains "і"
    or TargetFileName contains "ӏ" or TargetFileName contains "ј" or TargetFileName contains "һ" or TargetFileName contains "ԁ"
    or TargetFileName contains "ԛ" or TargetFileName contains "ԝ"
    // Greek lower-case lookalike
    or TargetFileName contains "ο"
  )
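Because the characters in this query cannot be told apart from ASCII by eye, editing it by hand is error-prone. One possible approach, sketched below, is to generate the `where` clause from the rule's code points instead of typing the characters. The names `UPPER`, `LOWER`, `contains_clause`, and `build_query` are hypothetical; the table and column names are taken from the query above.

```python
# Code points from the Sigma rule, in rule order (hex, without the \u prefix).
UPPER = (
    "0410 0412 0415 041a 041c 041d 041e 0420 0421 0422 0425 0405 0406 0408 "
    "04ae 04c0 050c 051a 051c 0391 0392 0395 0396 0397 0399 039a 039c 039d "
    "039f 03a1 03a4 03a5 03a7"
).split()
LOWER = (
    "0430 0435 043e 0440 0441 0445 0455 0456 04cf 0458 04bb 0501 051b 051d 03bf"
).split()


def contains_clause(code_points: list[str], column: str = "TargetFileName") -> str:
    """Join one `contains` test per code point with `or`."""
    return " or ".join(
        f'{column} contains "{chr(int(cp, 16))}"' for cp in code_points
    )


def build_query(table: str = "imFileEvent") -> str:
    """Assemble the hunting query with separate upper-case and lower-case groups."""
    return (
        f"{table}\n"
        f"| where ({contains_clause(UPPER)}) or ({contains_clause(LOWER)})"
    )


if __name__ == "__main__":
    print(build_query())
```

Keeping the code points as hex strings means a reviewer can diff the list against the Sigma rule directly, without relying on how the characters render.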
Scenario: Legitimate File Naming with Unicode Characters
Description: A system administrator or developer may use non-ASCII characters in filenames for localization or multilingual support. Only names containing the Cyrillic or Greek code points listed in the rule will match; accented Latin letters (e.g., ü, ç, ñ) do not trigger it.
Filter/Exclusion: Exclude files that follow a known multilingual naming convention (e.g., localized *.txt or *.log files) or files created by specific localization tools such as gettext or Babel.
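One heuristic that may help separate legitimately localized names from spoofed ones is to flag only words that mix Latin letters with Cyrillic or Greek letters. The Python sketch below illustrates the idea and is an assumption, not part of the original rule. It would miss a name spoofed entirely in one non-Latin script, and `script_of` and `mixed_script_tokens` are hypothetical helper names.

```python
import re
import unicodedata
from pathlib import PureWindowsPath


def script_of(ch: str) -> str:
    """Coarse script label: the first word of the character's Unicode name."""
    if not ch.isalpha():
        return "OTHER"
    return unicodedata.name(ch, "UNKNOWN").split(" ")[0]


def mixed_script_tokens(path: str) -> list[str]:
    """Return words in the file stem that mix Latin with Cyrillic or Greek letters."""
    # Ignore the extension, which is normally Latin even for localized names.
    stem = PureWindowsPath(path).stem
    # Split the stem into runs of letters (digits, underscores, spaces separate words).
    tokens = re.findall(r"[^\W\d_]+", stem)
    flagged = []
    for token in tokens:
        scripts = {script_of(ch) for ch in token}
        if "LATIN" in scripts and scripts & {"CYRILLIC", "GREEK"}:
            flagged.append(token)
    return flagged


if __name__ == "__main__":
    # Cyrillic "о" hidden inside an otherwise Latin word.
    print(mixed_script_tokens("C:\\Users\\a\\svch\u043est.exe"))
    # A Russian word followed by a separate Latin token: nothing flagged.
    print(mixed_script_tokens("C:\\Docs\\\u041e\u0442\u0447\u0435\u0442 Q3.docx"))
```

Checking per word rather than per filename keeps names such as a Cyrillic title followed by "Q3" out of the results, since each word stays within one script.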
Scenario: Scheduled Job Generating Temporary Files
Description: A scheduled job (e.g., a cron job or Task Scheduler task) on a localized system may generate temporary or report files whose names contain Cyrillic or Greek text.
Filter/Exclusion: Exclude files created by known scheduled jobs (e.g., backup_script.sh, daily_cleanup.bat) or files that match a job's known output naming pattern (e.g., a timestamped name such as report_20250415.txt).
Scenario: User-Generated Content with Unicode Characters
Description: Users may upload files whose names contain Cyrillic or Greek characters for personal or project-specific reasons. Accented Latin names such as résumé.docx or café.jpg are not affected, since é is not among the rule's code points.
Filter/Exclusion: Exclude files uploaded by known users or groups (e.g., project-team) or, where the log source records it, files with specific MIME types (e.g., image/jpeg, application/pdf).
Scenario: Log File with Unicode Characters
Description: System or application log files may receive names containing Cyrillic or Greek characters due to locale settings or internationalized logging. Latin letters such as ł, ø, and ß do not match the rule.
Filter/Exclusion: Exclude files with known log formats (e.g., syslog, auth.log, application.log) or files generated by specific logging tools (e.g., rsyslog, logrotate,