Sometimes a web page displays correctly on the author's computer, but when visitors view it on the Internet, the text is illegible. This usually happens with languages other than English: instead of accented letters, visitors see small squares, question marks, or completely different letters. The cause is that the author did not specify the encoding of the page, or specified it incorrectly. The mistake is rather common, because authors usually do not notice it themselves; someone else has to report it.
What is an encoding? Simply put, from a technical point of view all data in a computer is stored as numbers (and all numbers are stored as ones and zeroes, but that is not important here). So the letters and other characters you type in a text editor are also remembered by the computer as numbers; for example, „A“ is 65, „B“ is 66, and so on. A text file is saved to disk as a sequence of numbers, loaded from disk as a sequence of numbers, and also sent across the Internet as a sequence of numbers.
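To make the idea of "letters as numbers" concrete, here is a minimal Python sketch. The variable names are only illustrative; `ord` and `str.encode` are standard Python built-ins.

```python
# Each character has a number; ord() returns it.
text = "AB"
codes = [ord(ch) for ch in text]
print(codes)  # [65, 66]

# Saving text means turning it into a sequence of such numbers (bytes).
raw = text.encode("ascii")
print(list(raw))  # [65, 66]
```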
The problem is: which character is which number? For historical reasons, there are several standards. Each of them takes some set of characters and assigns numbers to them. The 8-bit standards use only the numbers from 0 to 255; of course this cannot cover all possible letters, so each standard covers only a few languages. English MS Windows by default saves text files in the encoding „windows-1252“. (If you save a TXT file written in a different language, you may lose some letters.) Linux traditionally used the ISO standard „ISO-8859-1“, though current systems usually default to UTF-8.
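The following small Python sketch illustrates why the choice of standard matters. It assumes Python's built-in codec names `cp1252` (windows-1252) and `cp1250` (windows-1250, a Central European standard not mentioned above, used here only for contrast).

```python
# The same stored number means different letters in different 8-bit standards.
byte = bytes([0xE8])
as_western = byte.decode("cp1252")  # Western European reading: 'è'
as_central = byte.decode("cp1250")  # Central European reading: 'č'
print(as_western, as_central)

# A letter missing from a standard cannot be saved in it at all.
try:
    "č".encode("cp1252")
    lost = False
except UnicodeEncodeError:
    lost = True  # windows-1252 has no number for 'č'
print(lost)
```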
There is also the Unicode standard, which tries to include all characters from all alphabets; one of its encodings is „UTF-8“. A text file saved in UTF-8 can be written in any language. I therefore strongly recommend this encoding if you use languages other than English.
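As a rough illustration, the Python sketch below stores text in several languages as UTF-8 and reads it back unchanged. It then shows what happens when UTF-8 bytes are read with the wrong encoding, which is roughly what a browser does when it guesses incorrectly. The sample strings are made up for the example.

```python
# One UTF-8 file can hold text in many languages at once.
text = "English, čeština, русский, 日本語"
data = text.encode("utf-8")
restored = data.decode("utf-8")
print(restored == text)  # True

# Reading UTF-8 bytes as windows-1252 yields "completely different letters".
garbled = "é".encode("utf-8").decode("cp1252")
print(garbled)  # Ã©
```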
The important part is that the visitor's web browser must know which encoding the page uses. Today's web browsers usually understand many encodings, and the visitor can select the correct one in the menu. But if you specify the encoding in the page, the visitor does not have to select anything, because it is selected automatically. So if you use the encoding „windows-1252“, write in the header of the page:
<meta http-equiv="Content-Type" content="text/html; charset=windows-1252"/>
If you want to save your page in the UTF-8 encoding, in the program Notepad select „File | Save as...“ in the menu, and in the bottom row select „Encoding: UTF-8“. Do this when you first save the file; the program will remember it afterwards. Then write in the header of the page:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
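If you generate pages with a script instead of Notepad, the same two steps apply: save the bytes as UTF-8 and declare UTF-8 in the header. Below is a minimal Python sketch of this. The page content, the file name `index.html`, and the helper `save_page` are hypothetical and serve only as an illustration.

```python
import tempfile
from pathlib import Path

# Hypothetical sample page; the declared charset must match how it is saved.
PAGE = """<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
<title>Příklad</title>
</head>
<body><p>Žluťoučký kůň</p></body>
</html>
"""

def save_page(path: Path, html: str) -> None:
    """Write the page as UTF-8 bytes, matching the declaration in its header."""
    path.write_bytes(html.encode("utf-8"))

with tempfile.TemporaryDirectory() as tmp:
    page_path = Path(tmp) / "index.html"
    save_page(page_path, PAGE)
    raw = page_path.read_bytes()
    # Decoding with the declared encoding gives back the original text.
    restored = raw.decode("utf-8")

print(restored == PAGE)  # True
```

The declaration and the actual encoding of the file are two separate things; the page displays correctly only when they agree.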