BomSniffer

Handle byte-order-mark prefixes

class BomSniffer {

Encoding encoding();

bool encoded();

const(void)[] signature();

void setup(Encoding encoding, bool found);

static const(Info)* test(void[] content);

}

Members

Functions

encoded bool encoded(): Returns whether an encoding was located in the text (as configured via setup).
encoding Encoding encoding(): Returns the current encoding. This is either the originally specified encoding or one derived by inspecting the content for a BOM. The latter is performed as part of the decode() method.
setup void setup(Encoding encoding, bool found): Configures this instance with Unicode converters.
signature const(void)[] signature(): Returns the signature (BOM) of the current encoding.

Static functions

test const(Info)* test(void[] content): Scans the BOM signatures looking for a match. The scan runs in reverse order so that the longest match is found first.
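
The order of the scan matters because the UTF-32LE signature (FF FE 00 00) begins with the UTF-16LE signature (FF FE). The following is a minimal sketch of a longest-match-first scan, not the library's implementation: the names Enc, BomInfo, table and sniff are hypothetical stand-ins, and the table layout is an assumption chosen so that a reverse walk reaches the four-byte signatures before the two-byte ones.

```d
// Hypothetical stand-in for the library's Encoding type.
enum Enc { Unknown, UTF_8, UTF_16LE, UTF_16BE, UTF_32LE, UTF_32BE }

// Hypothetical stand-in for the library's Info record.
struct BomInfo
{
    Enc enc;                 // encoding the signature identifies
    immutable(ubyte)[] bom;  // raw signature bytes
}

// Assumed layout: shorter signatures first, so that a reverse scan
// tries the 4-byte UTF-32 signatures before the 2-byte UTF-16 ones.
immutable BomInfo[] table = [
    BomInfo(Enc.UTF_8,    [0xEF, 0xBB, 0xBF]),
    BomInfo(Enc.UTF_16LE, [0xFF, 0xFE]),
    BomInfo(Enc.UTF_16BE, [0xFE, 0xFF]),
    BomInfo(Enc.UTF_32LE, [0xFF, 0xFE, 0x00, 0x00]),
    BomInfo(Enc.UTF_32BE, [0x00, 0x00, 0xFE, 0xFF]),
];

// Returns the matching table entry, or null when no signature matches.
const(BomInfo)* sniff(const(void)[] content)
{
    auto bytes = cast(const(ubyte)[]) content;

    // Walk the table from the last entry to the first.
    foreach_reverse (ref info; table)
    {
        if (bytes.length >= info.bom.length
            && bytes[0 .. info.bom.length] == info.bom)
            return &info;
    }
    return null;
}

unittest
{
    // Starts with FF FE, but the longer UTF-32LE signature is tried first.
    ubyte[] utf32le = [0xFF, 0xFE, 0x00, 0x00, 0x41, 0x00, 0x00, 0x00];
    assert(sniff(utf32le).enc == Enc.UTF_32LE);

    // No signature at all yields null.
    ubyte[] plain = [0x61, 0x62, 0x63];
    assert(sniff(plain) is null);
}
```

A forward scan over the same table would report UTF-16LE for the first input, since FF FE matches before the four-byte entry is reached.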

Examples

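// UTF-8 text without a BOM: "abc" followed by three 3-byte sequences
void[] INPUT2 = "abc\xE3\x81\x82\xE3\x81\x84\xE3\x81\x86".dup;
// the same text prefixed with the UTF-8 BOM (EF BB BF)
void[] INPUT = "\xEF\xBB\xBF" ~ INPUT2;
// start with an unknown encoding so that it is derived from the content
auto bom = new UnicodeBom!(char)(Encoding.Unknown);
size_t ate;
char[256] buf;

// decoding the BOM-prefixed input consumes all of it and reports UTF-8
auto temp = bom.decode (INPUT, buf, &ate);
test (ate == INPUT.length);
test (bom.encoding == Encoding.UTF_8);

// decoding the input without a BOM leaves the encoding at UTF-8
temp = bom.decode (INPUT2, buf, &ate);
test (ate == INPUT2.length);
test (bom.encoding == Encoding.UTF_8);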
