Handling and Rendering of Text
As discussed before, the handling and rendering of text in MiniGUI has some features of its own. This chapter elaborates the concepts related to text handling and introduces the relevant APIs.
Charset and Encoding
A charset (character set) is a collection of characters defined to represent a certain language; an encoding is the set of coding rules used to represent the characters of a certain charset. An encoding usually arranges the characters in a fixed order and assigns each one a code, which is then used for recording, storing, transferring, and exchanging text. Anyone who has worked with computers knows the ASCII code defined by the American National Standards Institute. ASCII can be understood as an encoding of the American English charset; it uses seven bits to represent each character, in the range 0x00 to 0x7F.
[Note] Type man ascii on the Linux command line to get the definition of ASCII.
ASCII cannot meet the requirements of non-English speakers now that the use of computers has spread to the entire world. Therefore, almost all countries have defined charset and encoding standards based on their own official languages. The well-known standard GB2312-80 is the simplified Chinese charset standard defined by the Chinese government. GB2312-80 includes 682 symbols and 6763 Chinese characters. It is divided into 87 areas (rows), each of which holds 94 positions. There are other standards, such as ISO8859 for single-byte charsets, the JISX0201 and JISX0208 charsets defined by Japan, the BIG5 traditional Chinese charset, and so on.
One charset can have several encodings. The GB2312 charset is usually encoded with EUC (Extended UNIX Code). EUC encodes each GB2312 character as two bytes, each in the range 0xA1 to 0xFE. The high byte represents the GB2312 area code, while the low byte represents the GB2312 position code. Another popular GB2312 encoding is HZ, which clears the highest bit of each EUC byte and thus uses ASCII characters to represent Chinese characters. For example, the Chinese character "啊" is encoded as 0xB0A1 in EUC, while in HZ it is written as ~{0!~}.
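The relationship between the two encodings can be sketched in a few lines of C. This is a minimal illustration, not MiniGUI code: the helper name euc_to_hz is hypothetical, and it handles a single two-byte character only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a MiniGUI API): convert one two-byte GB2312 EUC
 * character to its HZ form "~{xy~}". `out` must hold at least 7 bytes.
 * Returns 0 on success, -1 if a byte is outside the EUC range 0xA1..0xFE. */
static int euc_to_hz(const unsigned char euc[2], char out[7])
{
    assert(euc != NULL && out != NULL);
    if (euc[0] < 0xA1 || euc[0] > 0xFE || euc[1] < 0xA1 || euc[1] > 0xFE)
        return -1;
    out[0] = '~';
    out[1] = '{';
    out[2] = (char)(euc[0] & 0x7F);   /* clear the highest bit of each byte */
    out[3] = (char)(euc[1] & 0x7F);
    out[4] = '~';
    out[5] = '}';
    out[6] = '\0';
    return 0;
}

/* Area and position codes are the EUC bytes minus 0xA0. */
static int euc_area(const unsigned char euc[2])     { return euc[0] - 0xA0; }
static int euc_position(const unsigned char euc[2]) { return euc[1] - 0xA0; }
```

For the bytes 0xB0 0xA1 ("啊"), euc_area gives 16, euc_position gives 1, and euc_to_hz produces the string ~{0!~}.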
As each country and region published and updated its own charset, compatibility problems arose. For example, a text file encoded in GB2312 EUC cannot be displayed properly on a BIG5 system. Therefore, some international organizations began to develop a globally universal charset standard, namely the well-known UNICODE charset.
The International Organization for Standardization established the ISO/IEC JTC1/SC2/WG2 work group in April 1984. This group is responsible for integrating the characters and symbols of different countries. In 1991, some American companies established the Unicode Consortium and reached an agreement with WG2 in October 1991 to use the same code set. At present, UNICODE version 2.0 includes 6811 characters, 20902 Chinese characters, 11172 Korean characters, 6400 private-use code points, and 20249 reserved code points, 65534 in total. The UNICODE charset has multiple encodings. The most popular one uses 16 bits (two bytes) to express one character and is called UCS-2. Another is UTF-8, which is byte-compatible with ASCII and uses a variable number of bytes to represent a character.
[Hint] Type man unicode and man utf-8 on the Linux command line to get information about the UNICODE charset and the UTF-8 encoding.
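The variable byte count of UTF-8 can be read from the first byte of each sequence. The following sketch shows the idea; the function names are hypothetical, and the code deliberately skips the stricter checks (overlong forms, surrogates) that a complete decoder would perform.

```c
#include <assert.h>
#include <stddef.h>

/* Number of bytes in the UTF-8 sequence that starts with `lead`, or 0 if
 * `lead` cannot start a sequence (for example, a continuation byte). */
static int utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80)           return 1;   /* 0xxxxxxx: plain ASCII */
    if ((lead & 0xE0) == 0xC0) return 2;   /* 110xxxxx */
    if ((lead & 0xF0) == 0xE0) return 3;   /* 1110xxxx */
    if ((lead & 0xF8) == 0xF0) return 4;   /* 11110xxx */
    return 0;
}

/* Count the characters in a NUL-terminated UTF-8 string.
 * Returns -1 if a lead byte is invalid or a sequence is truncated. */
static int utf8_count_chars(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;
    int count = 0;

    assert(s != NULL);
    while (*p != '\0') {
        int i, n = utf8_seq_len(*p);
        if (n == 0)
            return -1;
        for (i = 1; i < n; i++)
            if ((p[i] & 0xC0) != 0x80)     /* continuation must be 10xxxxxx */
                return -1;
        p += n;
        count++;
    }
    return count;
}
```

An ASCII letter therefore occupies one byte, while a Chinese character such as "啊" (bytes 0xE5 0x95 0x8A) occupies three.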
UNICODE can solve the compatibility problem among charsets. However, not every country and region has adopted it as its standard. For example, the Chinese government requires all operating system software products to support the GB18030 charset rather than the UNICODE charset. The reason is that GB18030 is compatible with the GB2312 and GBK charsets widely used in mainland China, but its encoding is not compatible with the UNICODE encodings.
UNICODE gives general-purpose operating systems a way to solve the charset compatibility problem. However, it is not the best choice for embedded systems. Instead, MiniGUI represents the text of each charset internally in that charset's own default encoding, and a set of abstract interfaces provides a universal way to analyze text in any charset. The same interface serves both the font module and the analysis of multi-byte character strings. So far MiniGUI supports ISO8859-x (single-byte charsets) and GB2312, GBK, GB18030, BIG5, EUCKR, Shift-JIS, and EUCJP (multi-byte charsets). MiniGUI also supports the UTF-8 encoding of the UNICODE charset through the abstract charset interface.
[Hint] MiniGUI's support for a charset can also be understood as support for a certain encoding of that charset.
MiniGUI supports multi-byte charsets through the logical font interface. To display text, an application usually creates a logical font and specifies the encoding name of the chosen charset. After creating the logical font, the application can use it to display text or to analyze text strings.
Device Font
To display text correctly, the system needs the shape data corresponding to each character. This shape data is called a glyph and is saved in a file of a certain type, usually called a font file. The most popular type of font file is the dot-matrix (bitmap) font, which stores the glyph of each character as a bitmap. Another popular type is the vector font, which stores the outline of each character and can be scaled by certain algorithms. The popular vector font types are TrueType and Adobe Type1.
As with charsets, MiniGUI defines a series of abstract interfaces for fonts. MiniGUI currently supports RBF and VBF (two MiniGUI-defined dot-matrix font formats), TrueType, and Adobe Type1 fonts.
When MiniGUI initializes, it reads MiniGUI.cfg and loads certain font files. A loaded font is internally called a device font. A device font defines the format name, type, size, and its supported charset. Based on the loaded device fonts and the font type, name, size, and charset specified by the application, MiniGUI searches for the proper device font to display text.
[Hint] The definition, name, and format of device fonts are described in Chapter 4 of MiniGUI User Manual.
MiniGUI-Processes does not load vector device fonts (TrueType and Type1 fonts) during initialization. If a MiniGUI-Processes application wants to use vector fonts, it should call InitVectorialFonts, and call TermVectorialFonts when done.
Logical Font
A logical font in MiniGUI is a powerful object that carries rich information such as charset, font type, and style. It can be used not only to render text but also to analyze text strings, which is very useful in most text editing applications. Before using a logical font, you must first create it and select it into the device context that will use it to output text. The default logical font of each device context is the system default font defined in MiniGUI.cfg. You can create a logical font by calling CreateLogFont, CreateLogFontByName, or CreateLogFontIndirect, and select it into a device context with SelectFont. DestroyLogFont destroys a logical font; however, you cannot destroy a logical font that is currently selected. The prototypes of these functions are as follows (minigui/gdi.h):
PLOGFONT GUIAPI CreateLogFont (const char* type, const char* family,
        const char* charset, char weight, char slant, char set_width,
        char spacing, char underline, char struckout,
        int size, int rotation);
PLOGFONT GUIAPI CreateLogFontByName (const char* font_name);
PLOGFONT GUIAPI CreateLogFontIndirect (LOGFONT* logfont);
void GUIAPI DestroyLogFont (PLOGFONT log_font);
void GUIAPI GetLogFontInfo (HDC hdc, LOGFONT* log_font);
PLOGFONT GUIAPI GetSystemFont (int font_id);
PLOGFONT GUIAPI GetCurFont (HDC hdc);
PLOGFONT GUIAPI SelectFont (HDC hdc, PLOGFONT log_font);
The following code fragment creates multiple logical fonts:
static LOGFONT *logfont, *logfontgb12, *logfontbig24;
logfont = CreateLogFont (NULL, "SansSerif", "ISO8859-1",
        FONT_WEIGHT_REGULAR, FONT_SLANT_ITALIC, FONT_SETWIDTH_NORMAL,
        FONT_SPACING_CHARCELL, FONT_UNDERLINE_NONE, FONT_STRUCKOUT_LINE,
        16, 0);
logfontgb12 = CreateLogFont (NULL, "song", "GB2312",
        FONT_WEIGHT_REGULAR, FONT_SLANT_ROMAN, FONT_SETWIDTH_NORMAL,
        FONT_SPACING_CHARCELL, FONT_UNDERLINE_LINE, FONT_STRUCKOUT_LINE,
        12, 0);
logfontbig24 = CreateLogFont (NULL, "ming", "BIG5",
        FONT_WEIGHT_REGULAR, FONT_SLANT_ROMAN, FONT_SETWIDTH_NORMAL,
        FONT_SPACING_CHARCELL, FONT_UNDERLINE_LINE, FONT_STRUCKOUT_NONE,
        24, 0);
The first font, logfont, belongs to the ISO8859-1 charset and uses the SansSerif family with a height of 16 pixels; logfontgb12 belongs to the GB2312 charset and uses the Song family with a height of 12 pixels; logfontbig24 belongs to the BIG5 charset and uses the Ming family with a height of 24 pixels.
You can also call GetSystemFont to get a system logical font. The font_id argument can be one of the following values:
- SYSLOGFONT_DEFAULT: System default font. It must be a single-byte charset logical font formed by an RBF device font.
- SYSLOGFONT_WCHAR_DEF: System default multi-byte charset font. It is usually formed by an RBF device font, and its width is twice that of the SYSLOGFONT_DEFAULT logical font.
- SYSLOGFONT_FIXED: System font with fixed width.
- SYSLOGFONT_CAPTION: The logical font used to display text on caption bar.
- SYSLOGFONT_MENU: The logical font used to display menu text.
- SYSLOGFONT_CONTROL: The default logical font used by controls.
The system logical fonts above are created according to the definitions in MiniGUI.cfg when MiniGUI is initialized.
[Hint] The definition, name, and format of system logical fonts are described in Chapter 4 of MiniGUI User Manual.
GetCurFont returns the current logical font of a device context. You cannot call DestroyLogFont to destroy a system logical font.
Text Analysis
After creating a logical font, an application can use it to analyze multi-language-mixed text. Here, multi-language-mixed text means a character string composed of text from two non-intersecting charsets, such as GB2312 and ISO8859-1, or BIG5 and ISO8859-2. You can use the following functions to analyze the composition of such text (minigui/gdi.h):
// Text parse support
int GUIAPI GetTextMCharInfo (PLOGFONT log_font, const char* mstr, int len,
        int* pos_chars);
int GUIAPI GetTextWordInfo (PLOGFONT log_font, const char* mstr, int len,
        int* pos_words, WORDINFO* info_words);
int GUIAPI GetFirstMCharLen (PLOGFONT log_font, const char* mstr, int len);
int GUIAPI GetFirstWord (PLOGFONT log_font, const char* mstr, int len,
        WORDINFO* word_info);
GetTextMCharInfo returns the byte offset of each character in the multi-language-mixed text. For example, for the string "ABC汉语", this function returns the five values {0, 1, 2, 3, 5} in pos_chars. GetTextWordInfo finds the position of each word in the text. In single-byte charset text, words are delimited by blank and TAB characters; in multi-byte charset text, words are delimited by single-byte characters. GetFirstMCharLen returns the byte length of the first character. GetFirstWord returns the word information of the first word.
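To make the offsets concrete, here is a standalone sketch of the kind of scan involved for text that mixes ISO8859-1 and GB2312 EUC. It is not MiniGUI's implementation: mchar_positions is a hypothetical helper, and it assumes that any byte pair in the range 0xA1 to 0xFE forms one two-byte character.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a MiniGUI API): store the byte offset of each
 * character of `mstr` in `pos_chars` and return the number of characters
 * found. At most `max_chars` offsets are written. */
static int mchar_positions(const unsigned char *mstr, int len,
                           int *pos_chars, int max_chars)
{
    int i = 0, n = 0;

    assert(mstr != NULL && pos_chars != NULL);
    while (i < len && n < max_chars) {
        pos_chars[n++] = i;
        /* A lead byte and a trail byte in 0xA1..0xFE form one character. */
        if (mstr[i] >= 0xA1 && mstr[i] <= 0xFE && i + 1 < len
                && mstr[i + 1] >= 0xA1 && mstr[i + 1] <= 0xFE)
            i += 2;
        else
            i += 1;
    }
    return n;
}
```

For the seven bytes of "ABC汉语" in GB2312 EUC (0x41 0x42 0x43 0xBA 0xBA 0xD3 0xEF), the helper finds five characters at offsets 0, 1, 2, 3, and 5.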
Text Output
The following functions can be used to calculate the output extent (width and height) of text (minigui/gdi.h):
int GUIAPI GetTextExtentPoint (HDC hdc, const char* text, int len, int max_extent,
        int* fit_chars, int* pos_chars, int* dx_chars, SIZE* size);
// Text output support
int GUIAPI GetFontHeight (HDC hdc);
int GUIAPI GetMaxFontWidth (HDC hdc);
void GUIAPI GetTextExtent (HDC hdc, const char* spText, int len, SIZE* pSize);
void GUIAPI GetTabbedTextExtent (HDC hdc, const char* spText, int len, SIZE* pSize);
Given a maximum output width, GetTextExtentPoint calculates the maximum number of characters of multi-byte text that can be output within that width, the byte offset of each character, the output position of each character, and the actual output width and height. GetTextExtentPoint is an all-in-one function that is very useful for editor-type applications. For example, the single-line and multi-line edit box controls of MiniGUI use this function to calculate the position of the caret.
GetFontHeight and GetMaxFontWidth return the height and maximum width of a font. GetTextExtent calculates the output width and height of text. GetTabbedTextExtent returns the output width and height of a formatted text string.
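The calculation performed by a function such as GetTextExtentPoint can be sketched as follows. This is only an illustration under strong assumptions: an imaginary fixed-width font in which single-byte characters are 8 pixels wide, double-byte (GB2312 EUC) characters are 16 pixels wide, and the height is 16 pixels. The names fit_text and EXTENT are hypothetical, and a real font would supply per-glyph metrics.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for MiniGUI's SIZE structure. */
typedef struct { int cx, cy; } EXTENT;

/* Assumed metrics of an imaginary fixed-width font. */
enum { SB_WIDTH = 8, MB_WIDTH = 16, FONT_HEIGHT = 16 };

/* Return how many characters of `text` fit within `max_extent` pixels.
 * For each fitting character, pos_chars[] receives its byte offset and
 * dx_chars[] its horizontal output offset; `size` receives the extent
 * actually used. Both arrays must have room for `len` entries. */
static int fit_text(const unsigned char *text, int len, int max_extent,
                    int *pos_chars, int *dx_chars, EXTENT *size)
{
    int i = 0, x = 0, fit = 0;

    assert(text != NULL && pos_chars != NULL);
    assert(dx_chars != NULL && size != NULL);
    while (i < len) {
        int two_bytes = text[i] >= 0xA1 && text[i] <= 0xFE && i + 1 < len;
        int width = two_bytes ? MB_WIDTH : SB_WIDTH;

        if (x + width > max_extent)
            break;                      /* the next character would not fit */
        pos_chars[fit] = i;
        dx_chars[fit] = x;
        fit++;
        x += width;
        i += two_bytes ? 2 : 1;
    }
    size->cx = x;
    size->cy = FONT_HEIGHT;
    return fit;
}
```

An edit control can use the dx_chars values as candidate caret positions: with "ABC汉语" and a limit of 50 pixels, four characters fit and the fourth one starts 24 pixels from the left edge.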
The following functions are used to output text (minigui/gdi.h):
int GUIAPI TextOutLen (HDC hdc, int x, int y, const char* spText, int len);
int GUIAPI TabbedTextOutLen (HDC hdc, int x, int y, const char* spText, int len);
int GUIAPI TabbedTextOutEx (HDC hdc, int x, int y, const char* spText, int nCount,
        int nTabPositions, int *pTabPositions, int nTabOrigin);
void GUIAPI GetLastTextOutPos (HDC hdc, POINT* pt);
// Compatibility definitions
#define TextOut(hdc, x, y, text) TextOutLen (hdc, x, y, text, -1)
#define TabbedTextOut(hdc, x, y, text) TabbedTextOutLen (hdc, x, y, text, -1)
...
int GUIAPI DrawTextEx (HDC hdc, const char* pText, int nCount,
        RECT* pRect, int nIndent, UINT nFormat);
TextOutLen outputs text of the given length at the given position. If the length is -1, the character string must be terminated with '\0'. TabbedTextOutLen outputs a formatted text string. TabbedTextOutEx also outputs a formatted text string, and additionally lets you specify the position of each TAB stop.
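The idea of explicit TAB stops can be sketched in plain C. The helper below is hypothetical and only mirrors the role of the pTabPositions and nTabOrigin arguments; the exact rules MiniGUI applies, in particular the fallback to evenly spaced stops after the last explicit one, are an assumption here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a MiniGUI API): return the x coordinate at which
 * output continues after a TAB character encountered at position `x`.
 * `tab_positions` holds ascending stop offsets relative to `tab_origin`.
 * Past the last explicit stop, stops are assumed to repeat every
 * `default_width` pixels. */
static int next_tab_x(int x, int tab_origin, const int *tab_positions,
                      int n_tabs, int default_width)
{
    int i;

    assert(default_width > 0 && x >= tab_origin);
    assert(n_tabs == 0 || tab_positions != NULL);
    for (i = 0; i < n_tabs; i++) {
        int stop = tab_origin + tab_positions[i];
        if (stop > x)
            return stop;                /* first explicit stop beyond x */
    }
    return tab_origin + ((x - tab_origin) / default_width + 1) * default_width;
}
```

With stops at offsets 40 and 100 from an origin of 10, a TAB at x = 10 moves output to x = 50, and a TAB at x = 50 moves it to x = 110.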
Fig. 14.1 is the output of TextOut, TabbedTextOut, and TabbedTextOutEx functions.
Fig. 14.1 Output of TextOut, TabbedTextOut, and TabbedTextOutEx functions
DrawText is the most complicated text output function; it can lay out text in a given rectangle in different ways. Table 14.1 lists the formats supported by DrawText.
Table 14.1 Output formats of DrawText function
| Format identifier | Meaning | Note |
| DT_TOP | Justifies the text to the top of the rectangle. | Single line only (DT_SINGLELINE) |
| DT_VCENTER | Centers text vertically. | Single line only (DT_SINGLELINE) |
| DT_BOTTOM | Justifies the text to the bottom of the rectangle. | Single line only (DT_SINGLELINE) |
| DT_LEFT | Aligns text to the left. | - |
| DT_CENTER | Aligns text in the center. | - |
| DT_RIGHT | Aligns text to the right. | - |
| DT_WORDBREAK | Lines are automatically broken between words if a word would extend past the edge of the rectangle specified by the pRect parameter. | - |
| DT_SINGLELINE | Displays text on a single line only. Carriage returns and linefeeds do not break the line. | Without this flag, the vertical alignment flags are ignored |
| DT_EXPANDTABS | Expands TAB characters. | - |
| DT_TABSTOP | Sets tab stops. Bits 15-8 (high-order byte of the low-order word) of the nFormat parameter specify the number of characters for each TAB. | - |
| DT_NOCLIP | Draws without clipping. Output is clipped to the specified rectangle by default. | - |
| DT_CALCRECT | Does not actually output text; only calculates the size of the output rectangle. | - |
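The arithmetic behind the alignment flags and the tab-stop bits in Table 14.1 can be sketched as follows. The EX_ constants and the EX_RECT type are hypothetical stand-ins with made-up values; they are not the values of the real DT_ flags, and the code covers the single-line case only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical rectangle and flag values, for illustration only. */
typedef struct { int left, top, right, bottom; } EX_RECT;

enum {
    EX_LEFT = 0x00, EX_CENTER  = 0x01, EX_RIGHT  = 0x02,
    EX_TOP  = 0x00, EX_VCENTER = 0x04, EX_BOTTOM = 0x08
};

/* Compute the top-left corner at which a single line of text measuring
 * text_w x text_h pixels is drawn inside `rc` for the given format. */
static void single_line_origin(const EX_RECT *rc, int text_w, int text_h,
                               unsigned int format, int *x, int *y)
{
    assert(rc != NULL && x != NULL && y != NULL);

    if (format & EX_CENTER)
        *x = rc->left + (rc->right - rc->left - text_w) / 2;
    else if (format & EX_RIGHT)
        *x = rc->right - text_w;
    else
        *x = rc->left;

    if (format & EX_VCENTER)
        *y = rc->top + (rc->bottom - rc->top - text_h) / 2;
    else if (format & EX_BOTTOM)
        *y = rc->bottom - text_h;
    else
        *y = rc->top;
}

/* Extract the characters-per-TAB count stored in bits 15-8 of a format word. */
static int tab_width_from_format(unsigned int format)
{
    return (int)((format >> 8) & 0xFF);
}
```

For a 400 x 90 rectangle and a 200 x 24 line of text, centering in both directions places the text at (100, 33).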
The code in List 14.1 calls the DrawText function to output aligned text; each string describes the alignment with which it is output. Please refer to the fontdemo.c program in MDE for the complete code. Fig. 14.2 shows the output of the program.
List 14.1 Using DrawText function
void OnModeDrawText (HDC hdc)
{
    RECT rc1, rc2, rc3, rc4;
    const char* szBuff1 = "This is a good day. \n"
            "这是利用 DrawText 绘制的文本, 使用字体 GB2312 Song 12. "
            "文本垂直靠上, 水平居中";
    const char* szBuff2 = "This is a good day. \n"
            "这是利用 DrawText 绘制的文本, 使用字体 GB2312 Song 16. "
            "文本垂直靠上, 水平靠右";
    const char* szBuff3 = "单行文本垂直居中, 水平居中";
    const char* szBuff4 =
            "这是利用 DrawTextEx 绘制的文本, 使用字体 GB2312 Song 16. "
            "首行缩进值为 32. 文本垂直靠上, 水平靠左";

    /* Four output rectangles stacked from top to bottom */
    rc1.left = 1; rc1.top = 1; rc1.right = 401; rc1.bottom = 101;
    rc2.left = 0; rc2.top = 110; rc2.right = 401; rc2.bottom = 351;
    rc3.left = 0; rc3.top = 361; rc3.right = 401; rc3.bottom = 451;
    rc4.left = 0; rc4.top = 461; rc4.right = 401; rc4.bottom = 551;

    /* Draw the frame of each rectangle */
    SetBkColor (hdc, COLOR_lightwhite);
    Rectangle (hdc, rc1.left, rc1.top, rc1.right, rc1.bottom);
    Rectangle (hdc, rc2.left, rc2.top, rc2.right, rc2.bottom);
    Rectangle (hdc, rc3.left, rc3.top, rc3.right, rc3.bottom);
    Rectangle (hdc, rc4.left, rc4.top, rc4.right, rc4.bottom);

    /* Shrink each rectangle by one pixel so the text stays inside the frame */
    InflateRect (&rc1, -1, -1);
    InflateRect (&rc2, -1, -1);
    InflateRect (&rc3, -1, -1);
    InflateRect (&rc4, -1, -1);

    /* logfontgb12, logfontgb16 and logfontgb24 are created elsewhere in fontdemo.c */
    SelectFont (hdc, logfontgb12);
    DrawText (hdc, szBuff1, -1, &rc1, DT_NOCLIP | DT_CENTER | DT_WORDBREAK);
    SelectFont (hdc, logfontgb16);
    DrawText (hdc, szBuff2, -1, &rc2, DT_NOCLIP | DT_RIGHT | DT_WORDBREAK);
    SelectFont (hdc, logfontgb24);
    DrawText (hdc, szBuff3, -1, &rc3, DT_NOCLIP | DT_SINGLELINE | DT_CENTER | DT_VCENTER);
    SelectFont (hdc, logfontgb16);
    /* DrawTextEx takes the first-line indent (32) as an extra argument */
    DrawTextEx (hdc, szBuff4, -1, &rc4, 32, DT_NOCLIP | DT_LEFT | DT_WORDBREAK);
}
Fig. 14.2 The output of DrawText function
In addition to the output functions above, MiniGUI provides the functions listed in Table 14.2, which can be used to set or get the extra space between characters and lines.
Table 14.2 Functions to set/get extra space between characters and lines
| Function | Meaning |
| GetTextCharacterExtra | Get the extra space between characters |
| SetTextCharacterExtra | Set the extra space between characters |
| GetTextAboveLineExtra | Get the extra space above line |
| SetTextAboveLineExtra | Set the extra space above line |
| GetTextBellowLineExtra | Get the extra space below line |
| SetTextBellowLineExtra | Set the extra space below line |
More usage of logical fonts and text output functions is illustrated in the fontdemo.c file of MDE.
Anti-Aliasing and Automatic Zooming of Logical Fonts
A font anti-aliasing feature has been added to the new GDI. It uses a low-pass filter algorithm to give fonts anti-aliasing and automatic zoom-in capabilities. Refer to the MiniGUI manual for the use of this new feature.