ocean.text.utf.UtfString

Struct template for iterating over strings in a variable encoding format (utf8, utf16, utf32), extracting one unicode character at a time. Each unicode character may be represented by one or more characters in the input string, depending on the encoding format.
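
To make the width of a character concrete, the following is a minimal sketch that reports how many characters of the input string each unicode character occupies. It is not ocean's implementation: it assumes D2 and uses decode from Phobos' std.utf, and the helper name codeUnitWidths is hypothetical.

```d
import std.utf : decode;

// Hypothetical helper (not part of ocean): returns, for each unicode
// character in str, the number of input characters it occupies.
size_t[] codeUnitWidths(Char)(const(Char)[] str)
{
    size_t[] widths;
    size_t pos = 0;
    while (pos < str.length)
    {
        immutable start = pos;
        decode(str, pos); // decodes one unicode character, advancing pos past it
        widths ~= pos - start;
    }
    return widths;
}

unittest
{
    // 'a', U+00AE, U+20AC and U+1D11E in each of the three encodings
    assert(codeUnitWidths("a\u00AE\u20AC\U0001D11E")  == [size_t(1), 2, 3, 4]); // utf8
    assert(codeUnitWidths("a\u00AE\u20AC\U0001D11E"w) == [size_t(1), 1, 1, 2]); // utf16
    assert(codeUnitWidths("a\u00AE\u20AC\U0001D11E"d) == [size_t(1), 1, 1, 1]); // utf32
}
```

The same text therefore has a different length in each encoding, which is why iteration has to report a width alongside each character.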

The struct takes a template parameter (pull_dchars) which determines whether its methods return unicode characters (utf32 - dchars) or characters in the same format as the source string.

The template also provides an index operator, which extracts the nth unicode character in the string, as well as methods and static methods for extracting single characters from a string of variable encoding.
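
Extracting the nth unicode character from a string of variable encoding generally means decoding from the start of the string. The sketch below illustrates that idea only; it assumes D2 with Phobos (std.utf and std.exception), the function name nthCharacter is hypothetical, and it says nothing about how UtfString's index operator is actually implemented or how it handles out-of-range indices.

```d
import std.exception : enforce;
import std.utf : decode;

// Hypothetical helper (not part of ocean): returns the n-th unicode
// character (0-based) of str, walking the string from its start.
dchar nthCharacter(Char)(const(Char)[] str, size_t n)
{
    size_t pos = 0;
    dchar c;
    for (size_t seen = 0; seen <= n; ++seen)
    {
        // Assumption: an out-of-range index is reported by throwing.
        enforce(pos < str.length, "character index out of range");
        c = decode(str, pos); // advances pos past the decoded character
    }
    return c;
}

unittest
{
    assert(nthCharacter("a\u00AE\u20ACz", 2)  == '\u20AC'); // utf8 input
    assert(nthCharacter("a\u00AE\u20ACz"w, 3) == 'z');      // utf16 input
}
```

In this sketch the lookup cost grows with n, since every preceding character has to be decoded first.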

Example usage:

import ocean.io.Stdout;
import ocean.text.utf.UtfString;

char[] test = "test string";
UtfString!(char) utfstr = { test };

foreach ( width, i, c; utfstr )
{
    Stdout.formatln("Character {} is {} and it's {} wide", i, c, width);
}

There is also a utf_match function in the module, which compares two strings for equivalence, regardless of whether they use the same encoding.

Members

Functions

utf_match
bool utf_match(Char1[] str1, Char2[] str2)

Encoding-agnostic string comparison function.
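
One way to compare strings held in different encodings is to decode both to unicode characters and compare those, one pair at a time. The following is a minimal sketch of that approach, assuming D2 with Phobos (byDchar from std.utf and equal from std.algorithm.comparison). The name sameCharacters is hypothetical, and this is not a description of how utf_match itself is implemented.

```d
import std.algorithm.comparison : equal;
import std.utf : byDchar;

// Hypothetical helper (not part of ocean): true if both strings decode
// to the same sequence of unicode characters, whatever their encodings.
bool sameCharacters(Char1, Char2)(const(Char1)[] str1, const(Char2)[] str2)
{
    // byDchar lazily decodes each string to dchars; equal walks both
    // sequences in lockstep and also checks that they end together.
    return equal(str1.byDchar, str2.byDchar);
}

unittest
{
    assert(sameCharacters("hello world \u00AE", "hello world \u00AE"d)); // utf8 vs utf32
    assert(!sameCharacters("hello", "hellp"w));                          // utf8 vs utf16
}
```

The sketch compares characters exactly as encoded and does not attempt unicode normalization.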

Static variables

InvalidUnicode
dchar InvalidUnicode;

Invalid unicode.

Structs

UtfString
struct UtfString(Char = char, bool pull_dchars = false)

UtfString template struct

Examples

import ocean.text.utf.UtfString;

char[] str1 = "hello world ®"; // utf8 encoding
dchar[] str2 = "hello world ®"; // utf32 encoding

assert(utf_match(str1, str2));

Meta

License

Boost Software License Version 1.0. See LICENSE_BOOST.txt for details. Alternatively, this file may be distributed under the terms of the Tango 3-Clause BSD License (see LICENSE_BSD.txt for details).