CERT Coordination Center

Compilers permit Unicode control and homoglyph characters

Vulnerability Note VU#999008

Original Release Date: 2021-11-09 | Last Revised: 2024-12-10

Overview

The attacks described by the researchers in this report exploit Unicode control characters and homoglyphs so that a compiler interprets source code differently than it appears to a human reviewer. Source code compilers, interpreters, and other development tools may permit Unicode control and homoglyph characters, which can change the visually apparent meaning of source code.

Description

Internationalized text encodings must support both left-to-right and right-to-left languages. Unicode includes control characters that govern bidirectional (Bidi) ordering of text, and these characters have no visible representation of their own. These control characters, as well as visually similar characters from different scripts (e.g., Latin vs. Cyrillic), can introduce ambiguities into a code base if used improperly.

This type of attack could be used to compromise a code base by exploiting the gap between the source code as rendered for a human reviewer and the raw code that the compiler evaluates.
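
One way to see the gap between rendered and raw source is to make the invisible characters visible. The following is a minimal Python sketch, not part of the original report: the function name `reveal_format_chars` is hypothetical, and it assumes that flagging every character in Unicode general category Cf (Format) is an acceptable over-approximation, since that category also contains legitimate characters such as zero-width joiners.

```python
import unicodedata


def reveal_format_chars(text: str) -> str:
    """Replace invisible format characters with visible <U+XXXX NAME> markers."""
    out = []
    for ch in text:
        # General category "Cf" (Format) covers the Bidi controls as well as
        # other invisible characters such as zero-width joiners.
        if unicodedata.category(ch) == "Cf":
            name = unicodedata.name(ch, "UNNAMED")
            out.append(f"<U+{ord(ch):04X} {name}>")
        else:
            out.append(ch)
    return "".join(out)
```

Passing a suspicious line through such a function before review shows the reviewer what the compiler will actually receive, rather than what the editor chooses to render.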

Impact

Maliciously encoded source code may go undetected by human developers and by many automated coding tools. These attacks also work against many of the compilers currently in use. An attacker with the ability to influence source code could use this type of attack to introduce undetected ambiguity into it.

Solution

The simplest defense is to ban the use of text directionality control characters both in language specifications and in compilers implementing these languages.
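
Where a compiler does not yet enforce such a ban, a check outside the compiler can approximate it. The sketch below is an illustration under stated assumptions rather than a vetted tool: the names `BIDI_CONTROLS` and `find_bidi_controls` are hypothetical, and the character set is assumed to be Unicode's nine explicit directional formatting characters (embeddings, overrides, isolates, and their terminators). It deliberately ignores the implicit directional marks.

```python
# Explicit directional formatting characters and their common abbreviations.
# This set is an assumption of the sketch; the advisory does not enumerate it.
BIDI_CONTROLS = {
    "\u202A": "LRE",  # LEFT-TO-RIGHT EMBEDDING
    "\u202B": "RLE",  # RIGHT-TO-LEFT EMBEDDING
    "\u202C": "PDF",  # POP DIRECTIONAL FORMATTING
    "\u202D": "LRO",  # LEFT-TO-RIGHT OVERRIDE
    "\u202E": "RLO",  # RIGHT-TO-LEFT OVERRIDE
    "\u2066": "LRI",  # LEFT-TO-RIGHT ISOLATE
    "\u2067": "RLI",  # RIGHT-TO-LEFT ISOLATE
    "\u2068": "FSI",  # FIRST STRONG ISOLATE
    "\u2069": "PDI",  # POP DIRECTIONAL ISOLATE
}


def find_bidi_controls(source: str) -> list[tuple[int, int, str]]:
    """Return (line, column, abbreviation) for each Bidi control, 1-indexed."""
    findings = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in BIDI_CONTROLS:
                findings.append((line_no, col, BIDI_CONTROLS[ch]))
    return findings
```

An empty result means none of the listed characters are present. A project with legitimate right-to-left text in strings or comments would need an allow-list rather than a blanket rejection.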

Two CVEs were assigned to address the two types of attacks described in this report.

CVE-2021-42574 was created for tracking the Bidi attack. CVE-2021-42694 was created for tracking the homoglyph attack.

Acknowledgements

Thanks to the reporters, Nicholas Boucher and Ross Anderson of The University of Cambridge (UK).

This document was written by Chuck Yarbrough.

Vendor Information


Atlassian Affected

Notified:  2021-09-27 Updated: 2021-11-09

Statement Date:   November 03, 2021

CVE-2021-42574 Affected
CVE-2021-42694 Affected
VU#999008.1 Affected

Vendor Statement

We have not received a statement from the vendor.

Rust Security Response WG Affected

Notified:  2021-10-26 Updated: 2021-11-09

Statement Date:   November 04, 2021

CVE-2021-42574 Affected
CVE-2021-42694 Not Affected
VU#999008.1 Affected

Vendor Statement

Regarding CVE-2021-42574, the Rust project released Rust 1.56.1, featuring new lints to alert developers about the presence of bidirectional-override codepoints in their source code. No builtin mitigation is present in Rust 1.0.0 to Rust 1.56.0: we recommend users of those compiler versions to either upgrade to a newer compiler, or to perform out-of-band checks for the presence of those codepoints in their codebase.

Regarding CVE-2021-42694, Rust already includes protection from homoglyphs in identifiers. Rust 1.0.0 to Rust 1.52.1 doesn't support non-ASCII identifiers, which prevents the issue completely. Rust 1.53.0 and later versions do support non-ASCII identifiers, but include lints to alert developers about the presence of homoglyphs or similar issues.
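
For toolchains without such lints, a rough out-of-band check for the homoglyph case is to flag identifiers whose letters come from more than one script. The Python sketch below is a heuristic illustration and is not how the Rust lints are implemented: the function names are hypothetical, and because Python's standard library does not expose the Unicode Script property, the script is approximated by the first word of each character's Unicode name.

```python
import re
import unicodedata

# Unicode-aware identifier pattern: a letter or underscore, then word characters.
IDENTIFIER = re.compile(r"[^\W\d]\w*")


def scripts_of(identifier: str) -> set[str]:
    """Approximate the set of scripts used by the letters of an identifier."""
    scripts = set()
    for ch in identifier:
        if ch.isalpha():
            # Heuristic: character names begin with the script, e.g.
            # "LATIN SMALL LETTER A" or "CYRILLIC SMALL LETTER A".
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def mixed_script_identifiers(source: str) -> list[str]:
    """Return identifiers whose letters come from more than one script."""
    return [
        ident
        for ident in IDENTIFIER.findall(source)
        if len(scripts_of(ident)) > 1
    ]
```

This catches a Latin identifier with a single Cyrillic letter swapped in, but it does not catch an identifier written entirely in a look-alike script, so it complements rather than replaces compiler-level confusable detection.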

The LLVM Security Group Affected

Notified:  2021-09-27 Updated: 2021-11-09

Statement Date:   October 30, 2021

CVE-2021-42574 Unknown
CVE-2021-42694 Unknown
VU#999008.1 Affected

Vendor Statement

In a future release the LLVM project will include new checkers as part of clang-tidy to detect occurrences of both CVE-2021-42574 and CVE-2021-42694. In the meantime we recommend that clang users perform out-of-band checks for the presence of these security issues in their codebases.

Meta Not Affected

Notified:  2021-09-27 Updated: 2021-11-09

Statement Date:   October 18, 2021

CVE-2021-42574 Unknown
CVE-2021-42694 Unknown
VU#999008.1 Not Affected

Vendor Statement

We have not received a statement from the vendor.

Veracode Not Affected

Notified:  2021-10-26 Updated: 2021-11-09

Statement Date:   November 02, 2021

CVE-2021-42574 Unknown
CVE-2021-42694 Unknown
VU#999008.1 Not Affected

Vendor Statement

We have not received a statement from the vendor.

Node.js Unknown

Notified:  2021-10-19 Updated: 2024-12-10

CVE-2021-42574 Unknown
CVE-2021-42694 Unknown
VU#999008.1 Unknown

Vendor Statement

We have not received a statement from the vendor.

CERT Addendum

Per the Node.js statement published Nov 1, 2021, 1:29:33 PM:

You may have read the announcement today about the potential for supply chain attacks using characters within source files that are not visible to human code reviewers: https://www.trojansource.codes/.

The ECMAScript specification requires support for these characters (see section 12.1 at https://tc39.es/ecma262/#sec-unicode-format-control-characters). Node.js or any ECMAScript-compliant engine must allow these characters, which have valid uses in source code.

Due diligence including code scans (for example for licenses) should already be part of your processes both for the code you write and dependencies that you use within your application. The script provided by Red Hat [at] https://access.redhat.com/sites/default/files/find_unicode_control2--2021-11-01-1136.zip is a good way to scan and identify files that you may want to review with respect to usage of the special characters identified.

For some statically compiled languages, it may make sense to incorporate a check into the compiler instead of using an external script. However, for dynamic languages such as JavaScript, there are potential issues with that approach. These include:

* Finding out too late that there is usage of these characters. Dynamic languages may load a source file in the middle of their execution. At this point the application is already deployed and you don't necessarily want to block it from running and non-blocking warnings may not be noticed. It is more effective to scan all files that make up the application before it is run.

* The runtime overhead of the scan will be incurred unnecessarily every time the application is run. It is better to scan as part of your development/build/release processes as it will not add any additional runtime overhead once the application is deployed.

At this time, we do not plan to provide an option to scan at runtime. We recommend that external scripts/processes be used instead.
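
A build-time scan of the kind recommended above could look roughly like the following Python sketch. It is not the Red Hat script referenced in the statement. The names `scan_tree` and `main`, the default file suffixes, and the choice to match only the explicit Bidi controls (U+202A to U+202E and U+2066 to U+2069) are all assumptions made for illustration.

```python
import re
from pathlib import Path

# Explicit Bidi embedding, override, and isolate controls (assumed set).
BIDI_PATTERN = re.compile("[\u202A-\u202E\u2066-\u2069]")


def scan_tree(root: str, suffixes: tuple[str, ...] = (".js", ".mjs", ".cjs")) -> list[Path]:
    """Return source files under root that contain a Bidi control character."""
    flagged = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            # Undecodable bytes are replaced so one bad file cannot stop the scan.
            text = path.read_text(encoding="utf-8", errors="replace")
            if BIDI_PATTERN.search(text):
                flagged.append(path)
    return flagged


def main(argv: list[str]) -> int:
    """Report flagged files and return a non-zero status if any were found."""
    root = argv[1] if len(argv) > 1 else "."
    hits = scan_tree(root)
    for hit in hits:
        print(f"Bidi control character found: {hit}")
    return 1 if hits else 0
```

Wiring `main` into a build or release step (for example, by passing its return value to `sys.exit`) makes the pipeline fail before deployment, which avoids both the late-discovery and runtime-overhead concerns listed in the statement.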

Red Hat Unknown

Notified:  2021-09-27 Updated: 2024-12-10

Statement Date:   December 17, 2021

CVE-2021-42574 Unknown
CVE-2021-42694 Unknown
VU#999008.1 Unknown

Vendor Statement

Red Hat's guidance for this issue can be found at Security Bulletin RHSB-2021-007.

Amazon Unknown

Notified:  2021-09-27 Updated: 2021-11-09

CVE-2021-42574 Unknown
CVE-2021-42694 Unknown
VU#999008.1 Unknown

Vendor Statement

We have not received a statement from the vendor.

Apple Unknown

Notified:  2021-09-27 Updated: 2021-11-09

CVE-2021-42574 Unknown
CVE-2021-42694 Unknown
VU#999008.1 Unknown

Vendor Statement

We have not received a statement from the vendor.

GitLab Inc. Unknown

Notified:  2021-09-27 Updated: 2021-11-09

CVE-2021-42574 Unknown
CVE-2021-42694 Unknown
VU#999008.1 Unknown

Vendor Statement

We have not received a statement from the vendor.

GNU Compiler Collection Unknown

Notified:  2021-10-19 Updated: 2021-11-09

CVE-2021-42574 Unknown
CVE-2021-42694 Unknown
VU#999008.1 Unknown

Vendor Statement

We have not received a statement from the vendor.

Google Unknown

Notified:  2021-09-27 Updated: 2021-11-09

CVE-2021-42574 Unknown
CVE-2021-42694 Unknown
VU#999008.1 Unknown

Vendor Statement

We have not received a statement from the vendor.

Micro Focus Unknown

Notified:  2021-10-26 Updated: 2021-11-09

CVE-2021-42574 Unknown
CVE-2021-42694 Unknown
VU#999008.1 Unknown

Vendor Statement

We have not received a statement from the vendor.

Microsoft Unknown

Notified:  2021-10-19 Updated: 2021-11-09

CVE-2021-42574 Unknown
CVE-2021-42694 Unknown
VU#999008.1 Unknown

Vendor Statement

We have not received a statement from the vendor.

Oracle Corporation Unknown

Notified:  2021-09-27 Updated: 2021-11-09

CVE-2021-42574 Unknown
CVE-2021-42694 Unknown
VU#999008.1 Unknown

Vendor Statement

We have not received a statement from the vendor.

Snyk Unknown

Notified:  2021-11-02 Updated: 2021-11-09

CVE-2021-42574 Unknown
CVE-2021-42694 Unknown
VU#999008.1 Unknown

Vendor Statement

We have not received a statement from the vendor.

Other Information

CVE IDs: CVE-2021-42574 CVE-2021-42694
API URL: VINCE JSON | CSAF
Date Public: 2021-11-09
Date First Published: 2021-11-09
Date Last Updated: 2024-12-10 02:09 UTC
Document Revision: 3

Sponsored by CISA.