BomSniffer

Handle byte-order-mark prefixes

Members

Functions

encoded
bool encoded()

Returns true if an encoding was located in the text (as configured via setup).

encoding
Encoding encoding()

Return the current encoding. This is either the originally specified encoding or a derived one obtained by inspecting the content for a BOM. The inspection is performed as part of the decode() method.

setup
void setup(Encoding encoding, bool found)

Configure this instance with Unicode converters.

signature
const(void)[] signature()

Return the signature (BOM) of the current encoding.

Static functions

test
const(Info)* test(void[] content)

Scan the BOM signatures for one that matches the content. The signatures are scanned in reverse order so that the longest match is found first.
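
The reverse-order scan matters because some signatures are prefixes of others: the UTF-32 little-endian BOM (FF FE 00 00) begins with the UTF-16 little-endian BOM (FF FE). The following is a minimal, self-contained sketch of that idea, not the library's implementation. The names Enc, BomInfo, table and sniff are hypothetical stand-ins introduced for illustration, and the table ordering is an assumption.

```d
// Hypothetical stand-ins for the library's Encoding and Info types.
enum Enc { UTF_8, UTF_16LE, UTF_16BE, UTF_32LE, UTF_32BE }

struct BomInfo
{
    Enc enc;                    // encoding identified by this signature
    immutable(ubyte)[] bom;     // the signature bytes themselves
}

// Assumed ordering: shorter signatures first, longer ones last,
// so that a reverse scan tries the longest signatures first.
immutable BomInfo[] table = [
    BomInfo(Enc.UTF_16BE, [0xFE, 0xFF]),
    BomInfo(Enc.UTF_16LE, [0xFF, 0xFE]),
    BomInfo(Enc.UTF_8,    [0xEF, 0xBB, 0xBF]),
    BomInfo(Enc.UTF_32BE, [0x00, 0x00, 0xFE, 0xFF]),
    BomInfo(Enc.UTF_32LE, [0xFF, 0xFE, 0x00, 0x00]),
];

// Return the table entry whose signature prefixes the content,
// or null when no signature matches.
const(BomInfo)* sniff(const(void)[] content)
{
    auto bytes = cast(const(ubyte)[]) content;

    // Walk the table backwards so the 4-byte UTF-32 signatures are
    // tested before the 2-byte UTF-16 ones they overlap with.
    foreach_reverse (ref info; table)
    {
        if (bytes.length >= info.bom.length
            && bytes[0 .. info.bom.length] == info.bom)
            return &info;
    }
    return null;
}
```

With this sketch, content starting with FF FE 00 00 is reported as UTF-32LE, whereas a forward scan over the same table would stop at the UTF-16LE entry.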

Examples

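// INPUT2 is UTF-8 text without a BOM; INPUT is the same text with a UTF-8 BOM prepended
void[] INPUT2 = "abc\xE3\x81\x82\xE3\x81\x84\xE3\x81\x86".dup;
void[] INPUT = "\xEF\xBB\xBF" ~ INPUT2;
auto bom = new UnicodeBom!(char)(Encoding.Unknown);
size_t ate;
char[256] buf;

// With a BOM: all input is consumed and the encoding is reported as UTF-8
auto temp = bom.decode (INPUT, buf, &ate);
test (ate == INPUT.length);
test (bom.encoding == Encoding.UTF_8);

// Without a BOM: all input is consumed and the encoding is still reported as UTF-8
temp = bom.decode (INPUT2, buf, &ate);
test (ate == INPUT2.length);
test (bom.encoding == Encoding.UTF_8);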

Meta