Double-Byte Character Sets

Western languages are based on a set of sounds. Words are built from letters, each of which makes a particular sound. (Well, in English, it's a little more complicated because some letters make different sounds in different situations—or even the same situation. Take "read", for example. But the key point holds. Words are built from sounds.) As a result, Western languages can be represented with relatively few characters. English has 26 characters; other Western languages have similar quantities.In the Far East, words are built from ideas, not from sounds. Languages like Chinese and Japanese use a system of pictographs, where a particular symbol represents an idea and symbols are combined to make words or phrases. These languages have far more characters than Western languages.Representing the Eastern languages on a computer has always been a challenge. The standard character set on Western computers uses eight bits (one byte) to represent a character. That provides 256 choices, far more than are needed for the alphabetic characters, the digits and common punctuation marks. Several sets of these characters are available, in the form of code pages, to support the various diacritical marks (accents and so forth) of the different European languages. But 256 doesn't even come close to enough for Eastern languages.The solution is to use two bytes (16 bits) per character. That provides more than 65,000 variations. Such character sets are called "double-byte character sets."Beginning with version 3.0b, Visual FoxPro provides support for double-byte character sets (DBCS). You can write an application using the English version of Visual FoxPro (but a Japanese or Chinese version of Windows) and it can be used in the Far East. Obviously, the interface elements—menu names, text box captions and message box prompts—need to be "localized" for the target language, but the code should run as is.There are a host of functions to support DBCS. Many of them duplicate longstanding FoxPro functions (like AT() and SUBSTR()), while a few add necessary new functionality.Because we don't use any of the Far East versions of Windows, we haven't been able to bang on these new functions as much as we'd like. They all appear to work in our single-byte versions of Windows.

See Also

At_C(), AtCC(), ChrTranC(), IsLeadByte(), IMEStatus(), LeftC(), LenC(), LikeC(), RatC(), RightC(), StrConv(), StuffC(), SubStrC()


Back to Table of Contents

Copyright © 2002-2018 by Tamar E. Granor, Ted Roche, Doug Hennig, and Della Martin. Click for license .