
The problem with Â – an encoding story

Friday, November 27th, 2009

Often, you might see erroneous characters appear in your text / code. In particular, the following has always bugged me:

“This jacket costs Â£25”

This is an issue with how your text is being encoded and decoded. The error shown above is a telltale sign that you are inadvertently viewing UTF-8 encoded text as ISO-8859-1 (Latin-1). You must then find out where this misinterpretation is happening.

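The reason for the above error is that UTF-8 encodes the £ symbol in two bytes, whereas ISO-8859-1 uses only one. When the two UTF-8 bytes are read as ISO-8859-1, each byte is displayed as a character of its own: 0xC2 becomes Â and 0xA3 becomes £. The hex values are shown below:

£ in UTF-8: 0xC2,0xA3
Â in ISO-8859-1: 0xC2
£ in ISO-8859-1: 0xA3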
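
The byte values above can be reproduced in a few lines of Python. This is only a minimal sketch: the variable names are made up for illustration, and the £ is written as the escape `\u00a3` so that the example does not depend on how the example file itself is saved.

```python
# The pound sign, written as an escape so the source file's own
# encoding cannot interfere with the demonstration.
POUND = "\u00a3"
text = "This jacket costs " + POUND + "25"

# UTF-8 needs two bytes for the pound sign, ISO-8859-1 only one.
pound_utf8 = POUND.encode("utf-8")          # bytes 0xC2 0xA3
pound_latin1 = POUND.encode("iso-8859-1")   # byte 0xA3

# Reading the UTF-8 bytes as ISO-8859-1 turns each byte into its own
# character, which is where the stray capital A with circumflex comes from.
garbled = text.encode("utf-8").decode("iso-8859-1")

# Because ISO-8859-1 maps every byte to exactly one character, the
# mistake can be reversed by undoing the wrong decode.
repaired = garbled.encode("iso-8859-1").decode("utf-8")

print(pound_utf8.hex(), pound_latin1.hex())
print(garbled)
print(repaired)
```

The repair step only works when the text has been misread exactly once. If it has been saved and misread again, the round trip has to be repeated or the original bytes recovered.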

If you are seeing this error in HTML, check that the file is saved in the same encoding as the charset declared for the document (in the meta tag or the Content-Type header).
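
A rough way to automate that check is sketched below in Python. The function names (`declared_charset`, `is_valid_utf8`, `charset_mismatch`) are hypothetical helpers, not part of any library, and the check is only a heuristic. It assumes the charset is declared in a meta tag inside the file itself, and it assumes that non-ASCII bytes which happen to form valid UTF-8 really are UTF-8.

```python
import re


def declared_charset(html_bytes):
    """Return the charset named in the page's meta tag, or None if absent."""
    match = re.search(rb'charset\s*=\s*["\']?([\w-]+)', html_bytes, re.IGNORECASE)
    return match.group(1).decode("ascii").lower() if match else None


def is_valid_utf8(data):
    """True when the bytes can be decoded as UTF-8 without error."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def charset_mismatch(html_bytes):
    """Heuristic: True when the declared charset probably differs from
    the encoding the file was actually saved in."""
    charset = declared_charset(html_bytes)
    has_non_ascii = any(b >= 0x80 for b in html_bytes)
    if charset is None or not has_non_ascii:
        # Nothing declared to compare against, or pure ASCII, which reads
        # the same in both encodings.
        return False
    if charset in ("utf-8", "utf8"):
        return not is_valid_utf8(html_bytes)
    # Declared as something else, yet the bytes form valid multi-byte UTF-8.
    return is_valid_utf8(html_bytes)


# A page saved as UTF-8 but declared as ISO-8859-1: the case from this post.
page = b'<meta charset="iso-8859-1"><p>' + "\u00a325".encode("utf-8") + b"</p>"
print(charset_mismatch(page))
```

Decoding alone cannot catch this case, because ISO-8859-1 accepts every possible byte. That is why the sketch falls back on asking whether the bytes look like UTF-8.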
