
HTML Character Sets


To display an HTML page correctly, the browser must know what character set (encoding) to use:

Example

<meta charset="UTF-8">
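To see why the browser needs this information, here is a minimal Python sketch (the sample word is an assumption chosen for illustration). It decodes the same bytes with two different character sets and gets two different results:

```python
# Encode a short string as UTF-8, then decode the same bytes two ways.
text = "café"
raw = text.encode("utf-8")          # 5 bytes: the "é" takes two of them

as_utf8 = raw.decode("utf-8")       # decoded with the encoding that was used
as_cp1252 = raw.decode("cp1252")    # decoded with the wrong encoding

print(as_utf8)    # café
print(as_cp1252)  # cafÃ©
```

When the declared character set does not match the one the page was saved in, the visitor sees garbled text like the second line.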

HTML Character Sets

Most modern programming languages and web standards use UTF-8 as their default character set.

The encoding for the early web was ASCII. ASCII uses 7 bits per character, so it can represent only 128 different characters: the English letters, digits, punctuation marks, and a set of control codes.
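The 7-bit limit can be checked with a short Python sketch (the sample strings are assumptions chosen for illustration). Every ASCII byte is below 128, and a letter such as "é" cannot be encoded at all:

```python
# ASCII covers code points 0 to 127, so every encoded byte fits in 7 bits.
encoded = "Hello".encode("ascii")
all_seven_bit = all(byte < 128 for byte in encoded)

# A character outside that range cannot be encoded as ASCII.
try:
    "é".encode("ascii")
    e_acute_fits = True
except UnicodeEncodeError:
    e_acute_fits = False

print(list(encoded))    # [72, 101, 108, 108, 111]
print(all_seven_bit)    # True
print(e_acute_fits)     # False
```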

For a closer look, study our Complete ASCII Reference.

Windows-1252 was the first character set used in Windows. It extends ASCII by using 8 bits per character, which allows up to 256 different characters (adding international letters and symbols). Windows-1252 is supported by all browsers.
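As a rough illustration in Python (where this character set is available under the codec name `cp1252`; the sample words are assumptions), each character still takes exactly one byte, and plain English text encodes to the same bytes as in ASCII:

```python
word = "façade"
encoded = word.encode("cp1252")

# One byte per character, including the accented letter.
print(len(word), len(encoded))   # 6 6
print(hex(encoded[2]))           # 0xe7 (the byte for "ç", above the 7-bit range)

# The first 128 positions are identical to ASCII.
ascii_part_same = "face".encode("cp1252") == "face".encode("ascii")
print(ascii_part_same)           # True
```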

For a closer look, study our Complete Windows-1252 Reference.


HTML 4: ISO-8859-1

The default character set from HTML 2.0 to HTML 4.01 was ISO-8859-1.

ISO-8859-1 is an extension to ASCII, with added international characters.

Example

<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1">

In HTML 4, a character set different from ISO-8859-1 can be specified in the <meta> tag:

Example

<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-8">

All HTML 4 processors also support UTF-8:

Example

<meta http-equiv="Content-Type" content="text/html;charset=UTF-8">

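When a browser detects ISO-8859-1, it normally treats the page as Windows-1252. The two sets are otherwise identical, but Windows-1252 assigns printable characters (such as the euro sign and curly quotation marks) to most of the 32 positions from 0x80 to 0x9F, which ISO-8859-1 reserves for control characters.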
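The difference between the two character sets can be sketched in Python by decoding a few bytes from the 0x80 to 0x9F range both ways (the three byte values are assumptions chosen for illustration):

```python
raw = bytes([0x80, 0x93, 0x94])

as_latin1 = raw.decode("iso-8859-1")   # maps to invisible control characters
as_cp1252 = raw.decode("cp1252")       # maps to printable characters

print([hex(ord(c)) for c in as_latin1])  # ['0x80', '0x93', '0x94']
print(as_cp1252)                         # €“”
```

Outside that range the two decodings agree, which is why treating ISO-8859-1 pages as Windows-1252 is generally harmless for ordinary text.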

For a closer look, please study: The Complete ISO-8859-1 Reference



HTML5: Unicode UTF-8

The HTML5 specification encourages web developers to use the UTF-8 character set.

Example

<meta charset="UTF-8">

A character set other than UTF-8 can be specified in the <meta> tag:

Example

<meta charset="ISO-8859-1">
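Both declaration styles, the short HTML5 form and the older `http-equiv` form, carry the same information. The following Python sketch reads either one using the standard library's `html.parser` module. The names `CharsetFinder` and `declared_charset` are hypothetical helpers made up for this example, and real browsers follow a more involved detection procedure than this:

```python
from html.parser import HTMLParser


class CharsetFinder(HTMLParser):
    """Hypothetical helper: records the first charset declared in a <meta> tag."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        attributes = dict(attrs)
        if "charset" in attributes:
            # HTML5 form: <meta charset="...">
            self.charset = attributes["charset"]
        elif (attributes.get("http-equiv") or "").lower() == "content-type":
            # HTML 4 form: content="text/html;charset=..."
            content = attributes.get("content") or ""
            for part in content.split(";"):
                name, _, value = part.strip().partition("=")
                if name.lower() == "charset":
                    self.charset = value.strip()


def declared_charset(html):
    finder = CharsetFinder()
    finder.feed(html)
    return finder.charset


print(declared_charset('<meta charset="UTF-8">'))
print(declared_charset(
    '<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-8">'
))
```

The first call prints UTF-8 and the second prints ISO-8859-8.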

The Unicode Standard, maintained by the Unicode Consortium, defines the UTF-8 and UTF-16 encodings. It was developed because the ISO-8859 character sets are limited in size and not suited to a multilingual environment.

The Unicode Standard covers (almost) all the characters, punctuation marks, and symbols in the world.
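To cover that many characters, UTF-8 and UTF-16 use a variable number of bytes per character. The Python sketch below prints the encoded size of four sample characters (the samples are assumptions chosen for illustration):

```python
samples = ["A", "é", "€", "😀"]

sizes = {}
for ch in samples:
    utf8_len = len(ch.encode("utf-8"))
    utf16_len = len(ch.encode("utf-16-le"))   # little-endian, no byte order mark
    sizes[ch] = (utf8_len, utf16_len)
    print(f"U+{ord(ch):04X} utf-8={utf8_len} utf-16={utf16_len}")

# U+0041 utf-8=1 utf-16=2
# U+00E9 utf-8=2 utf-16=2
# U+20AC utf-8=3 utf-16=2
# U+1F600 utf-8=4 utf-16=4
```

Plain English text stays at one byte per character in UTF-8, so ASCII files are already valid UTF-8.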

All HTML5 browsers support UTF-8, UTF-16, Windows-1252, and the ISO-8859 character sets. XML processors are required to support UTF-8 and UTF-16.

For a closer look, please study: The Complete Unicode Reference.


Copyright 1999-2024 by Refsnes Data. All Rights Reserved. W3Schools is Powered by W3.CSS.