ocean.text.Util

Placeholder for a variety of wee functions.

Several of these functions return an index value representing where some criterion was matched. When the criterion is not matched, the functions return the length of the array provided to them. That is, where C functions might typically return -1, these functions return length instead. This works nicely with D slices:

auto text = "happy:faces";

assert (text[0 .. locate (text, ':')] == "happy");

assert (text[0 .. locate (text, '!')] == "happy:faces");
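
The same "length means not found" convention also gives a natural loop condition when visiting every match. The following is a minimal sketch only: locateSketch and positionsSketch are hypothetical stand-ins written for illustration, not the ocean implementations, and the clamping of an oversized start value is an assumption.

```d
// Hypothetical stand-in for locate(); not the ocean implementation.
// Returns the index of the next `match` at or after `start`,
// or source.length when there is none.
size_t locateSketch(const(char)[] source, char match, size_t start = 0)
{
    // Assumption: an out-of-range start is treated as "nothing left to scan".
    if (start > source.length)
        start = source.length;
    foreach (i, c; source[start .. $])
        if (c == match)
            return start + i;
    return source.length;
}

// Collect the position of every `match`. The loop stops as soon as the
// "not found" value (source.length) comes back.
size_t[] positionsSketch(const(char)[] source, char match)
{
    size_t[] found;
    for (auto i = locateSketch(source, match); i < source.length;
         i = locateSketch(source, match, i + 1))
        found ~= i;
    return found;
}

unittest
{
    auto text = "a:b:c";
    // A miss yields text.length, so the slice is simply the whole text.
    assert(text[0 .. locateSketch(text, ':')] == "a");
    assert(text[0 .. locateSketch(text, '!')] == "a:b:c");
}
```

Because a miss is a valid slice bound, no separate error branch is needed before slicing.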

The contains() function is more convenient for trivial lookup cases:

if (contains ("fubar", '!'))
    ...

Note that where some functions expect a size_t as an argument, the D template-matching algorithm will fail where an int is provided instead. This is typically the cause of "template not found" errors. Also note that name overloading is not supported cleanly by IFTI at this time, so it is not applied here.


Members

Aliases

splitLines
alias splitLines = toLines
Undocumented in source.

Functions

chopl
inout(char)[] chopl(inout(char)[] source, cstring match)

Chop the given source by stripping the provided match from the left hand side. Returns a slice of the original content

chopr
inout(char)[] chopr(inout(char)[] source, cstring match)

Chop the given source by stripping the provided match from the right hand side. Returns a slice of the original content

combine
mstring combine(mstring dst, cstring prefix, cstring postfix, const(char[])[] src)

Combine a series of text segments together, each prefixed and/or postfixed with optional strings. An optional output buffer can be provided to avoid heap activity - which should be large enough to contain the entire output, otherwise the heap will be used instead.

contains
bool contains(cstring source, char match)

Returns whether or not the provided array contains an instance of the given match

containsPattern
bool containsPattern(cstring source, cstring match)

Returns whether or not the provided array contains an instance of the given pattern

count
size_t count(cstring source, cstring match)

Count all instances of match within source

delimit
inout(char)[][] delimit(inout(char)[] src, cstring set)

Split the provided array wherever a delimiter-set instance is found, and return the resultant segments. The delimiters are excluded from each of the segments. Note that delimiters are matched as a set of alternates rather than as a pattern.
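
The difference between a delimiter set and a pattern is easiest to see side by side. The sketch below uses two hypothetical stand-ins, delimitSketch and splitSketch, written only for illustration. They are not the ocean implementations, and their handling of empty segments (kept, including between adjacent delimiters) is an assumption.

```d
// Hypothetical stand-ins; not the ocean implementations.

// Split wherever ANY single character from `set` occurs.
const(char)[][] delimitSketch(const(char)[] src, const(char)[] set)
{
    const(char)[][] result;
    size_t mark = 0;
    foreach (i, c; src)
    {
        foreach (d; set)
        {
            if (c == d)
            {
                result ~= src[mark .. i];   // delimiter itself is excluded
                mark = i + 1;
                break;
            }
        }
    }
    result ~= src[mark .. $];
    return result;
}

// Split wherever the WHOLE `pattern` occurs as a contiguous run.
const(char)[][] splitSketch(const(char)[] src, const(char)[] pattern)
{
    const(char)[][] result;
    size_t mark = 0;
    if (pattern.length > 0)
    {
        size_t i = 0;
        while (i + pattern.length <= src.length)
        {
            if (src[i .. i + pattern.length] == pattern)
            {
                result ~= src[mark .. i];   // pattern itself is excluded
                i += pattern.length;
                mark = i;
            }
            else
                ++i;
        }
    }
    result ~= src[mark .. $];
    return result;
}

unittest
{
    // ",;" as a set: either character splits.
    assert(delimitSketch("a,b;c", ",;").length == 3);
    // ",;" as a pattern: only the two-character run splits, and none occurs.
    assert(splitSketch("a,b;c", ",;").length == 1);
}
```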

delimiters
DelimFruct!T delimiters(T[] src, cstring set)

Iterator to isolate text elements.

head
inout(char)[] head(inout(char)[] src, cstring pattern, inout(char)[] tail)

Split the provided array on the first pattern instance, and return the resultant head and tail. The pattern is excluded from the two segments.
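
A head/tail split is handy for "key=value" style input. The following sketch assumes the tail is delivered through an out parameter and that, when the pattern does not occur, the whole input becomes the head and the tail is left empty. headSketch is a hypothetical stand-in, not the ocean implementation.

```d
// Hypothetical stand-in for head(); not the ocean implementation.
const(char)[] headSketch(const(char)[] src, const(char)[] pattern,
                         out const(char)[] tail)
{
    if (pattern.length > 0)
    {
        for (size_t i = 0; i + pattern.length <= src.length; ++i)
        {
            if (src[i .. i + pattern.length] == pattern)
            {
                // Split on the FIRST occurrence; the pattern is dropped.
                tail = src[i + pattern.length .. $];
                return src[0 .. i];
            }
        }
    }
    // Assumed behaviour on a miss: everything is head, tail stays empty.
    return src;
}

unittest
{
    const(char)[] value;
    auto key = headSketch("key=value=x", "=", value);
    assert(key == "key");
    assert(value == "value=x");   // later '=' characters stay in the tail
}
```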

index
size_t index(cstring source, cstring match, size_t start)

Return the index of the next instance of 'match' starting at position 'start', or source.length where there is no match.

indexOf
size_t indexOf(const(char)* str, char match, size_t length)

Returns the index of the first match in str, failing once length is reached. Note that we return 'length' for failure and a 0-based index on success

isSpace
bool isSpace(char c)

Is the argument a whitespace character?

jhash
size_t jhash(ubyte* k, size_t len, size_t c)
size_t jhash(void[] x, size_t c)

jhash() -- hash a variable-length key into a 32-bit value

join
mstring join(const(char[])[] src, cstring postfix, mstring dst)

Combine a series of text segments together, each appended with a postfix pattern. An optional output buffer can be provided to avoid heap activity - it should be large enough to contain the entire output, otherwise the heap will be used instead.

lineOf
inout(char)[] lineOf(inout(char)[] src, size_t index)

Return the indexed line, where each line is identified by a \n or \r\n combination. The line terminator is stripped from the resultant line

lines
LineFruct!(T) lines(T[] src)

Iterator to isolate lines.

locate
size_t locate(cstring source, char match, size_t start)

Return the index of the next instance of 'match' starting at position 'start', or source.length where there is no match.

locatePattern
size_t locatePattern(cstring source, cstring match, size_t start)

Return the index of the next instance of 'match' starting at position 'start', or source.length where there is no match.

locatePatternPrior
size_t locatePatternPrior(cstring source, cstring match, size_t start)

Return the index of the prior instance of 'match' starting just before 'start', or source.length where there is no match.

locatePrior
size_t locatePrior(cstring source, char match, size_t start)

Return the index of the prior instance of 'match' starting just before 'start', or source.length where there is no match.
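
The backward search can be pictured as follows. This is a sketch under assumptions: locatePriorSketch is a hypothetical stand-in, and both the size_t.max default for start and the clamping of start to the array length are choices made here for illustration.

```d
// Hypothetical stand-in for locatePrior(); not the ocean implementation.
// Scans backwards from just before `start` and returns the index of
// `match`, or source.length when there is none.
size_t locatePriorSketch(const(char)[] source, char match,
                         size_t start = size_t.max)
{
    // Assumption: an oversized start means "begin at the end".
    if (start > source.length)
        start = source.length;
    while (start > 0)
        if (source[--start] == match)
            return start;
    return source.length;
}

unittest
{
    auto name = "archive.tar.gz";
    // Strip the last extension; with no '.', the whole name is kept.
    assert(name[0 .. locatePriorSketch(name, '.')] == "archive.tar");
    assert("README"[0 .. locatePriorSketch("README", '.')] == "README");
}
```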

main
void main()
Undocumented in source. Be warned that the author may not have intended to support it.

matching
bool matching(const(char)* s1, const(char)* s2, size_t length)

Return whether or not the two arrays have matching content

mismatch
size_t mismatch(const(char)* s1, const(char)* s2, size_t length)

Returns the index of a mismatch between s1 & s2, failing when length is reached. Note that we return 'length' upon failure (array content matches) and a 0-based index upon success.
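
Note the inverted sense of "failure" here: reaching length means the two buffers agree. A minimal sketch of how the two low-level comparisons could relate, using hypothetical stand-ins (mismatchSketch, matchingSketch) rather than the ocean implementations:

```d
// Hypothetical stand-ins; not the ocean implementations.

// Index of the first differing element, or `length` if none differ.
size_t mismatchSketch(const(char)* s1, const(char)* s2, size_t length)
{
    foreach (i; 0 .. length)
        if (s1[i] != s2[i])
            return i;
    return length;
}

// Two buffers match over `length` elements when no mismatch is found.
bool matchingSketch(const(char)* s1, const(char)* s2, size_t length)
{
    return mismatchSketch(s1, s2, length) == length;
}

unittest
{
    auto a = "abcdef";
    auto b = "abcxef";
    assert(mismatchSketch(a.ptr, b.ptr, a.length) == 3);
    assert(matchingSketch(a.ptr, b.ptr, 3));   // the first three agree
}
```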

patterns
PatternFruct!T patterns(T[] src, cstring pattern, T[] sub)

Iterator to isolate text elements.

postfix
mstring postfix(mstring dst, cstring postfix, cstring[] src)

Combine a series of text segments together, each appended with an optional postfix pattern. An optional output buffer can be provided to avoid heap activity - it should be large enough to contain the entire output, otherwise the heap will be used instead.

prefix
mstring prefix(mstring dst, cstring prefix, const(char[])[] src)

Combine a series of text segments together, each prepended with a prefix pattern. An optional output buffer can be provided to avoid heap activity - it should be large enough to contain the entire output, otherwise the heap will be used instead.

quotes
QuoteFruct!T quotes(T[] src, cstring set)

Iterator to isolate optionally quoted text elements.

repeat
mstring repeat(cstring src, size_t count, mstring dst)

Repeat an array for a specific number of times. An optional output buffer can be provided to avoid heap activity - it should be large enough to contain the entire output, otherwise the heap will be used instead.
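
The optional-buffer convention shared by repeat, join, prefix, postfix and combine can be sketched as below. repeatSketch is a hypothetical stand-in, not the ocean implementation; the null default for dst is an assumption.

```d
// Hypothetical stand-in for repeat(); not the ocean implementation.
char[] repeatSketch(const(char)[] src, size_t count, char[] dst = null)
{
    immutable len = src.length * count;

    // Use the caller's buffer when it is big enough, else allocate.
    if (dst.length < len)
        dst = new char[len];

    for (size_t pos = 0; pos < len; pos += src.length)
        dst[pos .. pos + src.length] = src[];

    // Only the filled part of the buffer is returned.
    return dst[0 .. len];
}

unittest
{
    char[16] buf;
    auto r = repeatSketch("ab", 3, buf[]);
    assert(r == "ababab");
    assert(r.ptr is buf.ptr);   // no heap activity: result lives in buf
}
```

When the buffer is too small, the result is a fresh allocation and the caller's buffer is left untouched, so callers should always use the returned slice rather than the buffer they passed in.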

replace
mstring replace(mstring source, char match, char replacement)

Replace all instances of one element with another (in place)

rindex
size_t rindex(cstring source, cstring match, size_t start)

Return the index of the prior instance of 'match' starting just before 'start', or source.length where there is no match.

split
inout(char)[][] split(inout(char)[] src, cstring pattern)

Split the provided array wherever a pattern instance is found, and return the resultant segments. The pattern is excluded from each of the segments.

strip
inout(char)[] strip(inout(char)[] source, char match)

Trim the given array by stripping the provided match from both ends. Returns a slice of the original content

stripl
inout(char)[] stripl(inout(char)[] source, char match)

Trim the given array by stripping the provided match from the left hand side. Returns a slice of the original content

stripr
inout(char)[] stripr(inout(char)[] source, char match)

Trim the given array by stripping the provided match from the right hand side. Returns a slice of the original content

substitute
mstring substitute(cstring source, cstring match, cstring replacement)

Substitute all instances of match from source. Set replacement to null in order to remove instead of replace
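
Replacement and removal are the same operation with a different replacement argument, as this sketch illustrates. substituteSketch is a hypothetical stand-in, not the ocean implementation, and returning a plain copy for an empty match is an assumption.

```d
// Hypothetical stand-in for substitute(); not the ocean implementation.
char[] substituteSketch(const(char)[] source, const(char)[] match,
                        const(char)[] replacement)
{
    // Assumption: an empty match changes nothing.
    if (match.length == 0)
        return source.dup;

    char[] output;
    size_t i = 0;
    while (i < source.length)
    {
        if (i + match.length <= source.length
            && source[i .. i + match.length] == match)
        {
            output ~= replacement;   // appending null adds nothing
            i += match.length;
        }
        else
            output ~= source[i++];
    }
    return output;
}

unittest
{
    assert(substituteSketch("a-b-c", "-", "+") == "a+b+c");
    assert(substituteSketch("a-b-c", "-", null) == "abc");   // removal
}
```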

tail
inout(char)[] tail(inout(char)[] src, cstring pattern, inout(char)[] head)

Split the provided array on the last pattern instance, and return the resultant head and tail. The pattern is excluded from the two segments.

toLines
inout(char)[][] toLines(inout(char)[] src)

Convert text into a set of lines, where each line is identified by a \n or \r\n combination. The line terminator is stripped from each resultant array
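
A minimal sketch of line splitting that accepts both terminators follows. toLinesSketch is a hypothetical stand-in, not the ocean implementation. It assumes that text ending in a terminator yields a final empty line; the real function may differ.

```d
// Hypothetical stand-in for toLines(); not the ocean implementation.
const(char)[][] toLinesSketch(const(char)[] src)
{
    const(char)[][] result;
    size_t mark = 0;
    foreach (i, c; src)
    {
        if (c == '\n')
        {
            auto end = i;
            // Treat "\r\n" as one terminator by dropping the '\r' too.
            if (end > mark && src[end - 1] == '\r')
                --end;
            result ~= src[mark .. end];
            mark = i + 1;
        }
    }
    // Whatever follows the last terminator is the final line.
    result ~= src[mark .. $];
    return result;
}

unittest
{
    auto found = toLinesSketch("one\r\ntwo\nthree");
    assert(found.length == 3);
    assert(found[0] == "one");   // neither '\r' nor '\n' remains
}
```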

trim
inout(char)[] trim(inout(char)[] source)

Trim the provided array by stripping whitespace from both ends. Returns a slice of the original content

triml
inout(char)[] triml(inout(char)[] source)

Trim the provided array by stripping whitespace from the left. Returns a slice of the original content

trimr
inout(char)[] trimr(inout(char)[] source)

Trim the provided array by stripping whitespace from the right. Returns a slice of the original content
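
"Returns a slice of the original content" means no copy is made: the result points into the caller's array. A sketch of that idea, using hypothetical stand-ins (isSpaceSketch, trimSketch) and assuming whitespace means space, tab, carriage return or newline:

```d
// Hypothetical stand-ins; not the ocean implementations.
bool isSpaceSketch(char c)
{
    // Assumption: only these four characters count as whitespace.
    return c == ' ' || c == '\t' || c == '\r' || c == '\n';
}

const(char)[] trimSketch(const(char)[] source)
{
    size_t lo = 0;
    size_t hi = source.length;
    while (lo < hi && isSpaceSketch(source[lo]))
        ++lo;
    while (hi > lo && isSpaceSketch(source[hi - 1]))
        --hi;
    // A sub-slice of the input: same memory, narrower bounds.
    return source[lo .. hi];
}

unittest
{
    auto s = "  foo ";
    auto t = trimSketch(s);
    assert(t == "foo");
    assert(t.ptr == s.ptr + 2);   // points into the original array
}
```

Since the result aliases the input, later in-place edits to the original array (for example via replace) are visible through the trimmed slice as well.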

unescape
cstring unescape(cstring src, mstring dst)

Convert 'escaped' chars to normal ones: \t => ^t for example. Supports \" \' \\ \a \b \f \n \r \t \v
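
The conversion can be pictured as a single pass that rewrites each backslash pair. The sketch below is illustrative only: unescapeSketch is a hypothetical stand-in that omits the dst buffer parameter, and its treatment of unknown escapes and of a trailing lone backslash (both copied through unchanged) is an assumption.

```d
// Hypothetical stand-in for unescape(); not the ocean implementation.
char[] unescapeSketch(const(char)[] src)
{
    char[] output;
    for (size_t i = 0; i < src.length; ++i)
    {
        char c = src[i];
        if (c == '\\' && i + 1 < src.length)
        {
            switch (src[++i])
            {
                case 'a': c = '\a'; break;
                case 'b': c = '\b'; break;
                case 'f': c = '\f'; break;
                case 'n': c = '\n'; break;
                case 'r': c = '\r'; break;
                case 't': c = '\t'; break;
                case 'v': c = '\v'; break;
                case '"', '\'', '\\':
                    c = src[i];       // the escaped char stands for itself
                    break;
                default:
                    // Assumption: unknown escapes are copied through as-is.
                    output ~= '\\';
                    c = src[i];
                    break;
            }
        }
        output ~= c;
    }
    return output;
}

unittest
{
    // r"..." is a raw string, so the backslash reaches the function intact.
    assert(unescapeSketch(r"a\tb") == "a\tb");
}
```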

Structs

PatternFruct
struct PatternFruct(T)

Helper fruct for the patterns() iterator. A fruct is a low-impact mechanism for capturing context relating to an opApply (the name is a blend of "struct" and "foreach").
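
To make the fruct idea concrete, here is a minimal sketch of a struct that captures its source text and exposes it to foreach through opApply. LineFructSketch and linesSketch are hypothetical names invented for this illustration. They split on '\n' only and are not the ocean LineFruct or lines().

```d
// Hypothetical fruct; not the ocean implementation.
struct LineFructSketch
{
    const(char)[] src;   // the captured context

    // foreach calls this with the loop body wrapped in `dg`.
    int opApply(scope int delegate(ref const(char)[] line) dg)
    {
        size_t mark = 0;
        foreach (i, c; src)
        {
            if (c == '\n')
            {
                auto line = src[mark .. i];
                mark = i + 1;
                // A non-zero result means the body executed `break`
                // (or similar) and iteration must stop.
                if (auto result = dg(line))
                    return result;
            }
        }
        auto last = src[mark .. $];
        return dg(last);
    }
}

// Factory function, so callers can write foreach (line; linesSketch(text)).
LineFructSketch linesSketch(const(char)[] src)
{
    return LineFructSketch(src);
}

unittest
{
    size_t count;
    foreach (line; linesSketch("a\nb\nc"))
        ++count;
    assert(count == 3);
}
```

The struct is small enough to be returned by value, so iterating this way involves no heap allocation of its own.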

Variables

x
auto x;
Undocumented in source.

Detailed Description

Applying the D "import alias" mechanism to this module is highly recommended, in order to limit namespace pollution:

import Util = ocean.text.Util;

auto s = Util.trim ("  foo ");

Function templates:

trim (source)                               // trim whitespace
triml (source)                              // trim whitespace
trimr (source)                              // trim whitespace
strip (source, match)                       // trim elements
stripl (source, match)                      // trim elements
stripr (source, match)                      // trim elements
chopl (source, match)                       // trim pattern match
chopr (source, match)                       // trim pattern match
delimit (src, set)                          // split on delims
split (source, pattern)                     // split on pattern
splitLines (source)                         // split on lines
head (source, pattern, tail)                // split to head & tail
join (source, postfix, output)              // join text segments
prefix (dst, prefix, content...)            // prefix text segments
postfix (dst, postfix, content...)          // postfix text segments
combine (dst, prefix, postfix, content...)  // combine lotsa stuff
repeat (source, count, output)              // repeat source
replace (source, match, replacement)        // replace chars
substitute (source, match, replacement)     // replace/remove matches
count (source, match)                       // count instances
contains (source, match)                    // has char?
containsPattern (source, match)             // has pattern?
index (source, match, start)                // find match index
locate (source, match, start)               // find char
locatePrior (source, match, start)          // find prior char
locatePattern (source, match, start)        // find pattern
locatePatternPrior (source, match, start)   // find prior pattern
indexOf (s*, match, length)                 // low-level lookup
mismatch (s1*, s2*, length)                 // low-level compare
matching (s1*, s2*, length)                 // low-level compare
isSpace (match)                             // is whitespace?
unescape (source, output)                   // convert '\' prefixes
lines (str)                                 // foreach lines
quotes (str, set)                           // foreach quotes
delimiters (str, set)                       // foreach delimiters
patterns (str, pattern)                     // foreach patterns

Please note that any 'pattern' referred to within this module refers to a pattern of characters, and not some kind of regex descriptor. Use the Regex module for regex operations.

Meta

License

Tango Dual License: 3-Clause BSD License / Academic Free License v3.0. See LICENSE_TANGO.txt for details.

Version

Apr 2004: Initial release
Dec 2006: South Seas version