search menu icon-carat-right cmu-wordmark

CERT Coordination Center

Compilers permit Unicode control and homoglyph characters

Vulnerability Note VU#999008

Original Release Date: 2021-11-09 | Last Revised: 2024-12-10

Overview

Attacks that allow for unintended control of Unicode and homoglyphic characters, described by the researchers in this report leverage text encoding that may cause source code to be interpreted differently by a compiler than it appears visually to a human reviewer. Source code compilers, interpreters, and other development tools may permit Unicode control and homoglyph characters, changing the visually apparent meaning of source code.

Description

Internationalized text encodings require support for both left-to-right languages and also right-to-left languages. Unicode has built-in functions to allow for encoding of characters to account for bi-directional, or Bidi ordering. Included in these functions are characters that represent non-visual functions. These characters, as well as characters from other human language sets (i.e., English vs. Cyrillic) can also introduce ambiguities into the code base if improperly used.

This type of attack could potentially be used to compromise a code base by capitalizing on a gap in visually rendered source code as a human reviewer would see and the raw code that the compiler would evaluate.

Impact

The use of attacks that incorporate maliciously encoded source code may go undetected by human developers and by many automated coding tools. These attacks also work against many of the compilers currently in use. An attacker with the ability to influence source code could introduce undetected ambiguity into source code using this type of attack.

Solution

The simplest defense is to ban the use of text directionality control characters both in language specifications and in compilers implementing these languages.

Two CVEs were assigned to address the two types of attacks described in this report.

CVE-2021-42574 was created for tracking the Bidi attack. CVE-2021-42694 was created for tracking the homoglyph attack.

Acknowledgements

Thanks to the reporters, Nicholas Boucher and Ross Anderson of The University of Cambridge (UK).

This document was written by Chuck Yarbrough.

Vendor Information

999008
 

View all 16 vendors View less vendors


Other Information

CVE IDs: CVE-2021-42574 CVE-2021-42694
API URL: VINCE JSON | CSAF
Date Public: 2021-11-09
Date First Published: 2021-11-09
Date Last Updated: 2024-12-10 02:09 UTC
Document Revision: 3

Sponsored by CISA.