A heart can be black-and-white or red. A snowman can be a black silhouette in a typeset paragraph or a colour emoji on a phone. A Japanese surname can pick between two correct glyphs for the same codepoint, depending on whose family is being written about. Variation selectors are the Unicode mechanism that resolves these choices — invisible codepoints that follow a base character and tell the renderer which variant to draw. They have no glyph of their own, they add nothing to the meaning of the text, and they are the difference between a red ❤️ and a black ❤ in the same font.

What a variation selector is

A variation selector is a codepoint whose only effect is to modify the preceding character's glyph choice. The codepoint itself is invisible. Like a combining mark, it never stands alone — its position in the stream is meaningful only as a suffix to a base. Unlike a combining mark, it does not add or remove anything semantic; it picks between alternative drawings of the same character.

The standard provides 256 variation selectors, numbered VS1 through VS256, in two contiguous ranges:

U+FE00–U+FE0F
Variation Selectors block. 16 codepoints, VS1 through VS16. Located in the BMP. Most general use of variation selectors lives here, including the emoji-versus-text toggle.
U+E0100–U+E01EF
Variation Selectors Supplement. 240 codepoints, VS17 through VS256. Located in the Supplementary Special-purpose Plane (Plane 14). Mostly used for CJK ideographic variant selection.
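
The two ranges map onto selector numbers with simple arithmetic. The following is a minimal sketch; the function name `variationSelectorNumber` and the constant names are illustrative, not part of any standard API.

```javascript
// Ranges taken from the two blocks listed above.
const VS_BMP_START = 0xFE00;   // VS1
const VS_BMP_END = 0xFE0F;     // VS16
const VS_SUP_START = 0xE0100;  // VS17
const VS_SUP_END = 0xE01EF;    // VS256

// Returns the selector number (1..256), or 0 if cp is not a variation selector.
function variationSelectorNumber(cp) {
  if (cp >= VS_BMP_START && cp <= VS_BMP_END) return cp - VS_BMP_START + 1;
  if (cp >= VS_SUP_START && cp <= VS_SUP_END) return cp - VS_SUP_START + 17;
  return 0;
}

console.log(variationSelectorNumber(0xFE0F));   // 16
console.log(variationSelectorNumber(0xE0100));  // 17
console.log(variationSelectorNumber(0xE01EF));  // 256
console.log(variationSelectorNumber(0x2764));   // 0
```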

A variation selector by itself does nothing. The combination of a specific base codepoint with a specific selector is meaningful only if that combination is explicitly registered. The registered combinations are published as data files: StandardizedVariants.txt in the Unicode Character Database for general standardized variation sequences, emoji-variation-sequences.txt for the text-versus-emoji pairs, and IVD_Sequences.txt for the Ideographic Variation Database.
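
Registered sequences are listed one per line, as semicolon-separated fields with a trailing `#` comment. The sketch below parses a single line of that shape. The helper name `parseVariationLine` is hypothetical, and the sample line is included only to illustrate the format; treat the real data files as the authority.

```javascript
// Parses one data line in the "BASE SELECTOR; description; # comment" shape.
// Returns null for blank lines and comment-only lines.
function parseVariationLine(line) {
  const data = line.split('#')[0].trim();
  if (data === '') return null;
  const [sequence, description = ''] = data.split(';').map((f) => f.trim());
  const [base, selector] = sequence.split(/\s+/).map((hex) => parseInt(hex, 16));
  return { base, selector, description };
}

// Illustrative sample line (assumed shape, see the lead-in).
const sample = '0030 FE00; short diagonal stroke form; # DIGIT ZERO';
const entry = parseVariationLine(sample);
console.log(entry.base.toString(16));      // 30
console.log(entry.selector.toString(16));  // fe00
console.log(entry.description);            // short diagonal stroke form
```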

The crucial pair: VS15 and VS16

The two most important variation selectors are VS15 (U+FE0E) and VS16 (U+FE0F). They control presentation style for codepoints that exist in both worlds: the older black-and-white symbol world and the newer colour emoji world.

VS15 — U+FE0E "text style"
Forces a text presentation. The preceding character is drawn as a monochrome glyph, sized to match the surrounding text.
VS16 — U+FE0F "emoji style"
Forces an emoji presentation. The preceding character is drawn as the colour emoji glyph, often larger than the surrounding text and rendered by the OS emoji font.

Many codepoints exist in the no-selector limbo where their default presentation depends on the codepoint's Emoji_Presentation property. A character with Emoji_Presentation=Yes (most of the emoji blocks from U+1F300 onward) defaults to emoji. A character with Emoji_Presentation=No defaults to text. The selector overrides the default in either direction.

The hearts and the snowmen

A short tour of the most common dual-presentation codepoints:

Base  Codepoint                                Default  + VS15 (text)  + VS16 (emoji)
❤     U+2764 HEAVY BLACK HEART                 Text     ❤︎              ❤️ (red)
☃     U+2603 SNOWMAN                           Text     ☃︎              ☃️
☺     U+263A WHITE SMILING FACE                Text     ☺︎              ☺️
☀     U+2600 BLACK SUN WITH RAYS               Text     ☀︎              ☀️
✈     U+2708 AIRPLANE                          Text     ✈︎              ✈️
✂     U+2702 BLACK SCISSORS                    Text     ✂︎              ✂️
☎     U+260E BLACK TELEPHONE                   Text     ☎︎              ☎️
♻     U+267B BLACK UNIVERSAL RECYCLING SYMBOL  Text     ♻︎              ♻️

All of these are codepoints whose first life was in a 1990s symbol block (Dingbats or Miscellaneous Symbols) and whose second life began when emoji vendors started rendering them in colour. The codepoint has not changed. The default presentation is text. To request the colour version, append VS16. To force the monochrome version on a platform that draws the symbol as emoji anyway, append VS15.

Most BMP emoji-eligible codepoints in the U+2600–U+27BF range have a text default and need VS16 to render in colour. Most Plane 1 emoji (U+1F300 and up) have an emoji default and need VS15 only if you want the text version — most fonts do not even provide one.
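
Requesting a presentation amounts to appending the right selector after the base. Here is a minimal sketch, assuming the input is a single base symbol optionally followed by a selector; `withPresentation` is a hypothetical helper, and it does not attempt to handle multi-character emoji sequences.

```javascript
const VS15 = '\uFE0E'; // text style
const VS16 = '\uFE0F'; // emoji style

// Drops any VS15/VS16 already present, then appends the requested selector.
// style === 'emoji' appends VS16; any other value falls back to VS15.
function withPresentation(symbol, style) {
  const base = symbol.replace(/[\uFE0E\uFE0F]/g, '');
  return base + (style === 'emoji' ? VS16 : VS15);
}

const colourHeart = withPresentation('\u2764', 'emoji');   // U+2764 U+FE0F
const plainHeart = withPresentation(colourHeart, 'text');  // U+2764 U+FE0E
console.log(colourHeart.length, plainHeart.length);        // 2 2
```

Whether the renderer honours the request still depends on the font and on the sequence being registered for that base.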

What the bytes look like

A heart without and with VS16 is the same base codepoint followed by zero or one selector codepoint, encoded in UTF-8 as:

❤      U+2764         E2 9D A4         (3 bytes)
❤️    U+2764 U+FE0F  E2 9D A4 EF B8 8F (6 bytes)

The VS16 codepoint U+FE0F encodes as the three UTF-8 bytes EF B8 8F. Add it to any base and the byte length grows by three. UTF-16 stores VS16 as a single 16-bit unit FE0F; UTF-32 as 00 00 FE 0F. Selectors in the supplementary range (U+E0100+) require a surrogate pair in UTF-16 and four bytes in UTF-8.
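
These byte counts are easy to confirm in any runtime that exposes the standard `TextEncoder`. The sketch below uses a hypothetical helper, `utf8Hex`, to print the UTF-8 bytes of a string.

```javascript
// Formats the UTF-8 encoding of a string as space-separated uppercase hex bytes.
function utf8Hex(s) {
  const bytes = new TextEncoder().encode(s);
  return Array.from(bytes, (b) => b.toString(16).toUpperCase().padStart(2, '0')).join(' ');
}

console.log(utf8Hex('\u2764'));        // E2 9D A4
console.log(utf8Hex('\u2764\uFE0F'));  // E2 9D A4 EF B8 8F
console.log(utf8Hex('\u{E0100}'));     // F3 A0 84 80
console.log('\u{E0100}'.length);       // 2 (a surrogate pair in UTF-16)
```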

JavaScript and the extra codepoint

Because the selector is a codepoint, anything that counts codepoints sees one more codepoint when the selector is present:

const text  = '\u2764';        // ❤
const emoji = '\u2764\uFE0F';  // ❤️

text.length;                     // 1
emoji.length;                    // 2
[...emoji].length;               // 2
emoji.codePointAt(0);            // 10084 (0x2764)
emoji.codePointAt(1);            // 65039 (0xFE0F)

emoji === text;                  // false — different strings
emoji.normalize() === text;      // false — VS16 is preserved by NFC

The selector survives NFC, NFD, NFKC, and NFKD. Normalization does not strip it. Two strings that render identically — say, a heart with VS16 and a heart that the renderer chooses to draw in colour anyway — compare unequal as byte sequences. Code that compares user input against a stored emoji has to decide whether to strip selectors, normalize them in, or use a higher-level grapheme-cluster comparison. See the guide on codepoints versus characters for the indexing implications.
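
For the grapheme-cluster option, one possible approach is `Intl.Segmenter`, which groups a base and its trailing selector into one user-perceived character. This is a sketch that assumes a runtime where `Intl.Segmenter` is available; `graphemeCount` is an illustrative name.

```javascript
// Counts user-perceived characters (extended grapheme clusters).
// Assumes the runtime provides Intl.Segmenter.
function graphemeCount(s) {
  const segmenter = new Intl.Segmenter(undefined, { granularity: 'grapheme' });
  return [...segmenter.segment(s)].length;
}

const redHeart = '\u2764\uFE0F';
console.log(redHeart.length);          // 2 UTF-16 units
console.log([...redHeart].length);     // 2 codepoints
console.log(graphemeCount(redHeart));  // 1 grapheme cluster
```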

Ideographic Variation Sequences

The 240 selectors in the Variation Selectors Supplement do an entirely different job. They are used to register glyph variants of CJK ideographs in the Ideographic Variation Database (IVD), defined by UTS #37. The IVD is a registry where collections — fonts, type foundries, government registries — submit named variants of single ideographs and reserve specific base-plus-selector pairs for them.

The most consequential example: U+845B (葛) is a single ideograph, but the lower-right portion of the character is drawn two distinct ways depending on the surname. The Adobe-Japan1 IVD collection registers:

葛 U+845B U+E0100   — Adobe-Japan1 CID 1481, "Katsushika" form
葛 U+845B U+E0101   — Adobe-Japan1 CID 13046, "Kuzu" form

The Tokyo neighbourhood Katsushika is conventionally written with the first variant; the surname Kuzu and several place names use the second. Family registry systems in Japan, where the legally correct glyph for a name matters, rely on the IVD to disambiguate. The bare codepoint U+845B is ambiguous; the codepoint plus the IVS is precise.

Several IVD collections are registered, among them Adobe-Japan1, Hanyo-Denshi, Moji_Joho (a Japanese collection that covers characters needed for family registry use), and MSARG (the Macao Supplementary Character Set). Each collection names its variants independently. Selectors are assigned per base character as sequences are registered, so the same selector means different things after different ideographs.

Outside of CJK, the supplementary selectors are not used. Inside CJK, they are essential for any application — government, publishing, archival — that needs glyph fidelity rather than codepoint identity.
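
Code that has to preserve glyph fidelity needs to keep each ideograph together with its selector. A minimal sketch of that pairing follows; `splitIvs` is a hypothetical helper, and it treats a selector with no preceding base as an ordinary character.

```javascript
// Splits a string into { base, selector } records, attaching a
// supplementary-range selector (U+E0100..U+E01EF) to the codepoint before it.
// `selector` is null when no selector follows the base.
function splitIvs(s) {
  const out = [];
  for (const ch of s) {
    const cp = ch.codePointAt(0);
    const isSupplementaryVs = cp >= 0xE0100 && cp <= 0xE01EF;
    const last = out[out.length - 1];
    if (isSupplementaryVs && last !== undefined && last.selector === null) {
      last.selector = cp;
    } else {
      out.push({ base: ch, selector: null });
    }
  }
  return out;
}

const text = '\u845B\u{E0100}\u845B'; // U+845B with VS17, then a bare U+845B
const parts = splitIvs(text);
console.log(text.length);                    // 4 UTF-16 units
console.log(parts.length);                   // 2
console.log(parts[0].selector.toString(16)); // e0100
console.log(parts[1].selector);              // null
```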

Mongolian free variation selectors

The Mongolian script in Unicode uses a different mechanism for choosing among the four positional forms (initial, medial, final, isolated) of each letter. The shaping is largely automatic from context, the same way Arabic shaping works. But Mongolian has additional contextual variants that cannot be determined from neighbouring letters alone — and for those it provides three dedicated free variation selectors:

U+180B FVS1
First Mongolian free variation selector.
U+180C FVS2
Second.
U+180D FVS3
Third.

These are functionally similar to VS1–VS3 but live in the Mongolian block and are documented separately in the standard. A fourth selector, U+180F MONGOLIAN FREE VARIATION SELECTOR FOUR (FVS4), was added in Unicode 14 to address shaping cases that the original three could not cover. Mongolian's mechanism is older and predates the general variation-selector framework; it was kept rather than retrofitted.
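
Because the four Mongolian selectors are not contiguous, a range check alone is not enough to recognise them. A small sketch, with `mongolianFvsNumber` as an illustrative name:

```javascript
// FVS1..FVS3 are U+180B..U+180D; FVS4 is U+180F.
// U+180E in between is a different character and is not a selector.
function mongolianFvsNumber(cp) {
  if (cp >= 0x180B && cp <= 0x180D) return cp - 0x180B + 1;
  if (cp === 0x180F) return 4;
  return 0; // not a Mongolian free variation selector
}

console.log(mongolianFvsNumber(0x180B)); // 1
console.log(mongolianFvsNumber(0x180E)); // 0
console.log(mongolianFvsNumber(0x180F)); // 4
```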

Why selectors survive normalization

It would be tempting to treat ❤ and ❤️ as the same string for comparison purposes. Unicode declines to do this. The two are distinct codepoint sequences, and the selector has canonical combining class 0 and no decomposition mapping, so normalization neither reorders nor removes it. The reason is that a variation selector encodes intent. A user who typed VS16 wanted the emoji presentation; one who did not, did not. Throwing the selector away during normalization would lose authored information.

The practical consequence for applications: if you need to match a heart regardless of presentation, you have to strip selectors yourself. The transform is small:

function stripVariationSelectors(s) {
  // VS1–VS16 (U+FE00–U+FE0F), then VS17–VS256 (U+E0100–U+E01EF)
  return s.replace(/[\uFE00-\uFE0F]/g, '')
          .replace(/[\u{E0100}-\u{E01EF}]/gu, '');
}

Some security-sensitive code does this as part of identifier folding. Variation selectors carry the Default_Ignorable_Code_Point property; the NFKC_Casefold mapping that Unicode provides for caseless identifier matching removes default-ignorable codepoints, and UTS #39 keeps them out of its recommended identifier profile. Apply that strip only to keys, never to display text.

A variation selector carries no information about a character's identity, only about which of two correct drawings of that character the author preferred.

Practical cases to know

The places variation selectors most often matter in code:

Emoji input
iOS, Android, and Windows insert VS16 after symbols whose default is text — pressing the heart key produces U+2764 U+FE0F, not bare U+2764. Code that processes emoji output should expect the selector and either preserve or strip it explicitly.
Emoji search
A search index built from emoji strings should strip VS15 and VS16 before indexing. Otherwise a user who types on a desktop keyboard (no selector) will not find a row stored with ❤️ from a phone.
Filenames on macOS
The filesystem preserves variation selectors. Two file names that differ only by a VS16 after a heart refer to two distinct files. Renaming and de-duplication tools need to normalize on intent.
Length in messaging
SMS gateways that bill by UTF-16 unit count charge an extra unit per selector. A red heart in an SMS is one more 16-bit unit than a plain heart.
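
The search case can be sketched with a lookup key that ignores presentation while the stored row keeps its original string. The `Map`-based index and the names `presentationKey`, `addRow`, and `find` are hypothetical, chosen only for illustration.

```javascript
// Builds a lookup key that ignores text/emoji presentation (VS15 and VS16).
function presentationKey(s) {
  return s.replace(/[\uFE0E\uFE0F]/g, '');
}

// Hypothetical in-memory index: key -> list of stored rows.
const index = new Map();

function addRow(row) {
  const key = presentationKey(row);
  if (!index.has(key)) index.set(key, []);
  index.get(key).push(row); // keep the original string for display
}

function find(query) {
  return index.get(presentationKey(query)) ?? [];
}

addRow('\u2764\uFE0F');                   // stored from a phone, with VS16
console.log(find('\u2764').length);       // 1: a bare heart still matches
console.log(find('\u2764')[0].length);    // 2: the stored row keeps its selector
```

Stripping happens only on the key, on both the indexing side and the query side, so display text is never altered.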

The General Category of every variation selector is Mn (Nonspacing Mark), and all of them carry the Default_Ignorable_Code_Point property. Selectors are invisible by construction: a renderer that does not support a given sequence is expected to ignore the selector and draw the base normally, so showing a dotted circle or a missing-glyph box in its place does not follow the spec.

What to remember

A variation selector picks a glyph variant for the preceding character. The pair VS15/VS16 toggles between text and emoji presentation for the symbols where both are possible. The supplementary range (U+E0100+) carries CJK ideographic variants registered in the IVD, important for Japanese name fidelity. Selectors are invisible, they survive normalization, and they add codepoints to a string's length without changing its visible content. When two strings render the same but compare unequal, a variation selector is one of the usual suspects.

Further reading