Appendix C. Unicode String Encode Support

QR Code 2005 does not support Unicode natively. The default character set is ISO8859-1. In theory, Unicode support could be provided through ECI; however, no ECI has been published for any Unicode encoding (UTF-16, UTF-8, etc.). Most 2D barcode readers also lack ECI support.

Because there is demand for encoding characters outside ISO8859-1, several methods have been developed. The common approach is to encode the characters in a native character set and configure the reader to decode based on its default locale. This approach produces the smallest possible barcode, with one major caveat: the same QR code is decoded into different text by readers configured with different locales. In many use cases this is not an issue, as a QR code that encodes Chinese text is typically intended for use in China only.
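The locale caveat can be illustrated with a short Python sketch. It is not part of the product; GBK is assumed here as the native character set for Chinese text, and the variable names are illustrative only. The same four bytes yield the original text under one character set and unrelated Western characters under another.

```python
# Illustrative sketch only: GBK is an assumed native character set.
text = "\u4e2d\u6587"  # the two Chinese characters for "Chinese language"

# Native-character-set approach: 2 bytes per character, the most compact form.
payload = text.encode("gbk")

# A reader whose locale matches the encoder recovers the original text.
as_gbk = payload.decode("gbk")

# A reader that assumes ISO8859-1 decodes the very same bytes differently.
as_latin1 = payload.decode("latin-1")

# For comparison: the same text as UTF-8 with BOM (3-byte BOM + 3 bytes per character).
utf8_bom_payload = text.encode("utf-8-sig")
```

In this sketch the native payload is 4 bytes while the UTF-8 with BOM payload is 9 bytes, which shows the trade-off between barcode size and unambiguous decoding.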

Smartphone readers, such as those on Android and iOS phones, "guess" the character set and then convert the decoded result to Unicode. These character set detectors usually work well, but they do not work in all cases. If the goal is an unambiguous reading result on a smartphone, it is best to encode the text in UTF-8 with BOM. This is the approach taken in the 5.1 release.
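The following Python sketch shows why a BOM removes the guesswork on the reading side. The function decode_payload and its fallback parameter are hypothetical; actual smartphone readers use their own detection logic, which is not documented here.

```python
import codecs


def decode_payload(payload: bytes, fallback: str = "latin-1") -> str:
    """Hypothetical reader-side helper, not an actual reader implementation.

    If the payload starts with the UTF-8 BOM (EF BB BF), the character set
    is known and no guessing is needed. Otherwise the reader has to fall
    back to a default or detected character set.
    """
    if payload.startswith(codecs.BOM_UTF8):
        # Strip the 3-byte BOM and decode the remainder as UTF-8.
        return payload[len(codecs.BOM_UTF8):].decode("utf-8")
    # No BOM: the result depends on the fallback character set.
    return payload.decode(fallback)
```

A payload without a BOM is decoded with whatever character set the reader assumes, so two readers may disagree on it; a payload with a BOM is decoded the same way everywhere.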

A new API, QRCodeEncode2W, has been added to accept a UTF-16 string. Internally, the encoder examines the contents of the UTF-16 string. If all characters fall within ISO8859-1, it converts them to ISO8859-1 and encodes them as is. Otherwise, it converts the UTF-16 string to UTF-8 with BOM and encodes the result. You can still use the QRCodeEncode2 API and handle the character set conversion yourself in cases where you are required to use a local character set.
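The selection rule described above can be sketched in Python as follows. This is only an illustration of the decision logic, not the encoder's actual implementation; select_payload is a hypothetical name, and a Python string stands in for the UTF-16 input.

```python
import codecs


def select_payload(text: str) -> bytes:
    """Hypothetical sketch of the byte selection described for QRCodeEncode2W.

    Returns the bytes that would be placed in the barcode:
    - ISO8859-1 bytes if every character fits in ISO8859-1,
    - otherwise UTF-8 bytes prefixed with the UTF-8 BOM.
    """
    try:
        # Succeeds only if all code points are in the range U+0000..U+00FF.
        return text.encode("latin-1")
    except UnicodeEncodeError:
        # At least one character is outside ISO8859-1: switch the whole
        # string to UTF-8 and prepend the BOM (EF BB BF).
        return codecs.BOM_UTF8 + text.encode("utf-8")
```

Note that the switch applies to the whole string: a single character outside ISO8859-1 causes every character, including plain ASCII, to be emitted as UTF-8 after the BOM.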

Several components that accept Unicode parameters are updated in the 5.1 release. If you work exclusively with ASCII or ISO8859-1, you won't see any change in the results. Previously, characters outside ISO8859-1 were converted to their ANSI counterparts using the default locale. As of the 5.1 release, the whole string is converted to UTF-8 with BOM, which makes the QR code portable across countries. The affected components are the QRCode ActiveX control, the Word AddIn and the Crystal Reports UFL.