rt.compiler.util.utf

Encode and decode UTF-8, UTF-16 and UTF-32 strings.

On Win32 systems, the C wchar_t type is UTF-16 and corresponds to the D wchar type. On Linux systems, the C wchar_t type is UTF-32 and corresponds to the D dchar type.

UTF character support is restricted to (\u0000 <= character <= \U0010FFFF).

Members

Aliases

wptr
alias wptr = wchar*
Undocumented in source.

Functions

decode
dchar decode(char[] s, size_t idx)
dchar decode(wchar[] s, size_t idx)
dchar decode(dchar[] s, size_t idx)

Decodes and returns the character starting at s[idx]. idx is advanced past the decoded character. If the character is not well formed, a UtfException is thrown and idx remains unchanged.
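
The decoding step for the UTF-8 overload can be sketched as follows. This is an illustrative Python sketch, not this module's source: the name decode_utf8 is hypothetical, ValueError stands in for UtfException, and the new index is returned rather than written back through idx.

```python
def decode_utf8(s, idx):
    """Return (code point, index just past it); raise ValueError if malformed."""
    b0 = s[idx]
    if b0 < 0x80:                      # single-byte (ASCII) sequence
        return b0, idx + 1
    if 0xC2 <= b0 <= 0xDF:             # lead byte of a 2-byte sequence
        n, cp, lowest = 2, b0 & 0x1F, 0x80
    elif 0xE0 <= b0 <= 0xEF:           # lead byte of a 3-byte sequence
        n, cp, lowest = 3, b0 & 0x0F, 0x800
    elif 0xF0 <= b0 <= 0xF4:           # lead byte of a 4-byte sequence
        n, cp, lowest = 4, b0 & 0x07, 0x10000
    else:                              # continuation byte or invalid lead byte
        raise ValueError("invalid UTF-8 lead byte at index %d" % idx)
    if idx + n > len(s):
        raise ValueError("truncated UTF-8 sequence at index %d" % idx)
    for b in s[idx + 1:idx + n]:
        if b & 0xC0 != 0x80:           # every trailing byte must be 10xxxxxx
            raise ValueError("invalid continuation byte after index %d" % idx)
        cp = (cp << 6) | (b & 0x3F)
    # Reject overlong forms, surrogates, and values above U+10FFFF.
    if cp < lowest or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("ill-formed UTF-8 sequence at index %d" % idx)
    return cp, idx + n
```

Because the sketch only returns a new index on success, the caller's index is left untouched when an error is raised, which mirrors the documented behaviour of idx.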

encode
void encode(char[] s, dchar c)
void encode(wchar[] s, dchar c)
void encode(dchar[] s, dchar c)

Encodes character c and appends it to array s[].
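
As an illustration of the char[] overload, the sketch below appends the UTF-8 encoding of one code point to a growable buffer. It is written in Python under stated assumptions: encode_utf8 is a hypothetical name, a bytearray stands in for char[], and rejecting surrogates is an assumption about error handling that the text above does not specify.

```python
def encode_utf8(s, c):
    """Append the UTF-8 encoding of code point c to bytearray s, in place."""
    if c < 0 or c > 0x10FFFF or 0xD800 <= c <= 0xDFFF:
        raise ValueError("not an encodable code point: %r" % (c,))
    if c < 0x80:                       # 1 byte: 0xxxxxxx
        s.append(c)
    elif c < 0x800:                    # 2 bytes: 110xxxxx 10xxxxxx
        s.extend([0xC0 | (c >> 6), 0x80 | (c & 0x3F)])
    elif c < 0x10000:                  # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        s.extend([0xE0 | (c >> 12),
                  0x80 | ((c >> 6) & 0x3F),
                  0x80 | (c & 0x3F)])
    else:                              # 4 bytes: 11110xxx followed by three 10xxxxxx
        s.extend([0xF0 | (c >> 18),
                  0x80 | ((c >> 12) & 0x3F),
                  0x80 | ((c >> 6) & 0x3F),
                  0x80 | (c & 0x3F)])
```

For example, appending U+20AC to a buffer holding b"x" leaves it holding b"x\xe2\x82\xac".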

isValidDchar
bool isValidDchar(dchar c)

Test if c is a valid UTF-32 character.
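
A minimal Python sketch of such a test is shown below. The name is_valid_dchar is hypothetical, and the exclusion of the surrogate range U+D800 to U+DFFF is an assumption based on the Unicode definition of a scalar value; the text above only states the upper bound U+10FFFF.

```python
def is_valid_dchar(c):
    """True if c is a Unicode scalar value (assumed meaning of 'valid')."""
    # Valid: 0 .. 0xD7FF and 0xE000 .. 0x10FFFF; surrogates are excluded.
    return 0 <= c < 0xD800 or 0xDFFF < c <= 0x10FFFF
```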

onUnicodeError
void onUnicodeError(char[] msg, size_t idx)
Undocumented in source, but is a binding to C.

stride
uint stride(char[] s, size_t i)

stride() returns the length, in char code units (bytes), of the UTF-8 sequence starting at index i in string s.
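
The length of a UTF-8 sequence is determined by its first byte alone. The Python sketch below illustrates that rule; stride_utf8 is a hypothetical name, and raising ValueError for a byte that cannot start a sequence is an assumption, since the text above does not say what the function returns in that case.

```python
def stride_utf8(s, i):
    """Length in bytes of the UTF-8 sequence whose first byte is s[i]."""
    b = s[i]
    if b < 0x80:             # 0xxxxxxx: one byte
        return 1
    if b >> 5 == 0b110:      # 110xxxxx: two bytes
        return 2
    if b >> 4 == 0b1110:     # 1110xxxx: three bytes
        return 3
    if b >> 3 == 0b11110:    # 11110xxx: four bytes
        return 4
    raise ValueError("s[%d] is not the start of a UTF-8 sequence" % i)
```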

stride
uint stride(wchar[] s, size_t i)

stride() returns the length, in wchar code units, of the UTF-16 sequence starting at index i in string s.
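
In UTF-16 a character occupies two code units when it begins with a high surrogate and one code unit otherwise. A short Python sketch of that rule follows; stride_utf16 is a hypothetical name, and a list of integers stands in for wchar[].

```python
def stride_utf16(s, i):
    """Length in 16-bit code units of the UTF-16 sequence starting at s[i]."""
    # A high (leading) surrogate 0xD800..0xDBFF announces a surrogate pair.
    return 2 if 0xD800 <= s[i] <= 0xDBFF else 1

# 'a', U+1F600 as the surrogate pair D83D DE00, then 'b'.
units = [0x0061, 0xD83D, 0xDE00, 0x0062]
```

With this input, the stride is 1 at index 0, 2 at index 1, and 1 at index 3.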

stride
uint stride(dchar[] s, size_t i)

stride() returns the length, in dchar code units, of the UTF-32 sequence starting at index i in string s. Every UTF-32 character occupies exactly one dchar.

toUCSindex
size_t toUCSindex(char[] s, size_t i)
size_t toUCSindex(wchar[] s, size_t i)
size_t toUCSindex(dchar[] s, size_t i)

Given an index i into an array of characters s[], and assuming that index i is at the start of a UTF character, determine the number of UCS characters up to that index i.
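
For the UTF-8 overload, the conversion amounts to counting how many characters start before index i. The Python sketch below illustrates this under the same precondition as above (i is at the start of a character) and assumes well-formed input; to_ucs_index is a hypothetical name.

```python
def to_ucs_index(s, i):
    """Number of characters in the UTF-8 byte string s that start before i."""
    # Every byte that is not a continuation byte (10xxxxxx) starts a character.
    return sum(1 for b in s[:i] if b & 0xC0 != 0x80)
```

For the bytes of "a", U+00E9, U+20AC, U+1F600 (lengths 1, 2, 3, 4), byte indices 0, 1, 3, 6 and 10 map to character indices 0, 1, 2, 3 and 4.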

toUTF16
wchar[] toUTF16(wchar[2] buf, dchar c)
Undocumented in source.

toUTF16
wchar[] toUTF16(char[] s)
wchar[] toUTF16(wchar[] s)
wchar[] toUTF16(dchar[] s)
toUTF16z
wptr toUTF16z(char[] s)

Encodes string s into UTF-16 and returns the encoded string. toUTF16z() is suitable for calling the 'W' functions in the Win32 API that take an LPWSTR or LPCWSTR argument.
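
The transcoding to UTF-16 can be sketched in Python as below. The function names are hypothetical, a list of integers stands in for wchar[], and Python's built-in UTF-8 decoder is used for the input side. The trailing 0 in to_utf16z models the terminating null that a Win32 'W' function expects; that the module appends it is inferred from the z suffix and the LPWSTR use case.

```python
def to_utf16(s):
    """Transcode UTF-8 bytes into a list of UTF-16 code units."""
    units = []
    for ch in s.decode("utf-8"):
        c = ord(ch)
        if c < 0x10000:                        # BMP: one code unit
            units.append(c)
        else:                                  # supplementary: surrogate pair
            c -= 0x10000
            units.append(0xD800 | (c >> 10))   # high surrogate, top 10 bits
            units.append(0xDC00 | (c & 0x3FF)) # low surrogate, bottom 10 bits
    return units

def to_utf16z(s):
    """Like to_utf16, with a terminating 0 code unit appended."""
    return to_utf16(s) + [0]
```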

toUTF32
dchar[] toUTF32(char[] s)
dchar[] toUTF32(wchar[] s)
dchar[] toUTF32(dchar[] s)

Encodes string s into UTF-32 and returns the encoded string.
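
As an illustration of the wchar[] overload, the Python sketch below combines surrogate pairs into single code points. utf16_to_utf32 is a hypothetical name, lists of integers stand in for wchar[] and dchar[], and raising ValueError on an unpaired surrogate is an assumption about error handling.

```python
def utf16_to_utf32(units):
    """Transcode a list of UTF-16 code units into a list of code points."""
    out = []
    i = 0
    while i < len(units):
        u = units[i]
        if 0xD800 <= u <= 0xDBFF:              # high surrogate: needs a partner
            if i + 1 >= len(units) or not 0xDC00 <= units[i + 1] <= 0xDFFF:
                raise ValueError("unpaired high surrogate at index %d" % i)
            out.append(0x10000 + ((u - 0xD800) << 10) + (units[i + 1] - 0xDC00))
            i += 2
        elif 0xDC00 <= u <= 0xDFFF:            # low surrogate with no leader
            raise ValueError("unpaired low surrogate at index %d" % i)
        else:                                  # BMP character: copy as is
            out.append(u)
            i += 1
    return out
```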

toUTF8
char[] toUTF8(char[4] buf, dchar c)
Undocumented in source.

toUTF8
char[] toUTF8(char[] s)
char[] toUTF8(wchar[] s)
char[] toUTF8(dchar[] s)

Encodes string s into UTF-8 and returns the encoded string.

toUTFindex
size_t toUTFindex(char[] s, size_t n)
size_t toUTFindex(wchar[] s, size_t n)
size_t toUTFindex(dchar[] s, size_t n)

Given a UCS index n into an array of characters s[], return the UTF index.
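
For the UTF-8 overload, this is the inverse of counting characters: step over n characters and report the byte index reached. The Python sketch below assumes well-formed input and that n does not exceed the number of characters; to_utf_index is a hypothetical name.

```python
def to_utf_index(s, n):
    """Byte index in UTF-8 string s at which character number n starts."""
    i = 0
    for _ in range(n):
        b = s[i]
        # Advance by the sequence length implied by the lead byte.
        i += 1 if b < 0x80 else 2 if b < 0xE0 else 3 if b < 0xF0 else 4
    return i
```

For the bytes of "a", U+00E9, U+20AC, U+1F600, character indices 0 through 4 map to byte indices 0, 1, 3, 6 and 10.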

validate
void validate(char[] s)
void validate(wchar[] s)
void validate(dchar[] s)

Undocumented in source.

Variables

UTF8stride
ubyte[256] UTF8stride;
Undocumented in source.
