class StringUtils

StringUtils provides a set of static methods for operating on strings, used to serialize and normalize person names, titles, years and so on.

Constants

WHITE_SPACE

AT_LEAST_TWO_WHITE_SPACES

SINGLE_NUMBER

FOUR_NUMBERS

NON_NUMBERS

NON_NUMBERS_OR_LETTERS

NON_NUMBERS_OR_LETTERS_OR_DOTS_OR_SPACE

NON_NUMBERS_OR_LETTERS_OR_DOTS_OR_COMMA_OR_SPACE

NON_LETTERS_OR_DOTS_OR_COMMA_OR_SEMICOLON_OR_SPACE

TITLE_SOURCE_SPLIT_PATTERN

DEFAULT_CHARSET

PARSE_MODE_KEY

PARSE_MODE_VALUE

Methods

static string
removeNonNumbersOrLetters(string $string)

Removes everything which is neither a number nor a letter.

static string
removeNonNumbers(string $string)

Removes everything except numbers.

static string
getStringFromList(array|ArrayList $array)

All strings in the array are concatenated and returned as one single string, e.g. [item1,item2,item3,...].

static 
removeNonNumbersOrLettersOrDotsOrSpace($string)

No description

static 
normalizeWhitespace($string)

No description

static string
removeNonNumbersOrLettersOrDotsOrCommaOrSpace(string $string)

Removes everything which is neither a number nor a letter nor a dot (.) nor a comma nor a space.

static string
removeNonLettersOrDotsOrCommaOrSemicolonOrSpace(string $string)

Removes everything which is neither a letter nor a dot (.) nor a comma nor a semicolon nor white space.

static string
cleanTitle(string $title)

Two or more spaces in a row will be replaced by a single space character.

static string
cleanTitle2(string $title)

Decodes HTML entities, removes tags, converts to UTF-8 and collapses runs of two or more white spaces.

static array|boolean
split(string $string, string $pattern)

No description

static string
extractDateYearFromTitleSource(string $titleSource)

Returns the year from a string containing a substring like: JAN 19, 2013

static string
extractYear(string $string)

Extracts four digits (year) from a string.

static array
splitTitleSource(string $titleSource)

No description

static string
extractJournalTitle(string $titleSource)

No description

static string
extractYearFromTitleSource(string $titleSource)

No description

static string
extractVolume(string $titleSource)

No description

static string
extractIssue(string $titleSource)

No description

static string
extractPage(string $titleSource)

No description

static array
toStringArray(array $array)

Converts an array of objects to an array of strings.

static 
toASCII($str)

No description

static 
md5utf8($string)

No description

static 
utf8_encode($str)

No description

static 
parseBracketedKeyValuePairs($input, $assignmentOperator, $pairDelimiter, $bracketOpen, $bracketClosed)

No description

Details

at line 73
static string removeNonNumbersOrLetters(string $string)

Removes everything which is neither a number nor a letter.

Parameters

string $string

Return Value

string result
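
The filtering described above can be illustrated with a standalone sketch. The function name below is hypothetical and is not part of StringUtils. It also assumes that "letter" and "number" mean any Unicode letter or digit, which the documentation does not confirm; the real method may well be ASCII-only.

```php
<?php
// Hypothetical stand-in for StringUtils::removeNonNumbersOrLetters().
// Assumption: Unicode letters (\p{L}) and numbers (\p{N}) are kept.
function sketchRemoveNonNumbersOrLetters(string $string): string
{
    // The u flag switches PCRE to UTF-8 mode; preg_replace() returns null
    // on invalid UTF-8, which is mapped to an empty string here.
    return preg_replace('/[^\p{L}\p{N}]+/u', '', $string) ?? '';
}

echo sketchRemoveNonNumbersOrLetters('Müller, J. (2013)'), PHP_EOL; // MüllerJ2013
```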

at line 84
static string removeNonNumbers(string $string)

Removes everything except numbers.

Parameters

string $string

Return Value

string result

at line 98
static string getStringFromList(array|ArrayList $array)

All strings in the array are concatenated and returned as one single string, e.g. [item1,item2,item3,...].

Parameters

array|ArrayList $array a collection of strings to be concatenated

Return Value

string the concatenated string, e.g. [item1,item2,item3,...]
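
A minimal sketch of the concatenation, assuming the output format is exactly what the description shows: square brackets around comma-separated items, with no spaces. The function name is hypothetical, and PHP's built-in ArrayObject stands in for the project's ArrayList class, whose interface is not documented here.

```php
<?php
// Hypothetical stand-in for StringUtils::getStringFromList().
// Assumption: any iterable of string-convertible items is accepted.
function sketchGetStringFromList(iterable $items): string
{
    $parts = [];
    foreach ($items as $item) {
        $parts[] = (string) $item;
    }
    // Join with commas and wrap in brackets, as in the documented example.
    return '[' . implode(',', $parts) . ']';
}

echo sketchGetStringFromList(['item1', 'item2', 'item3']), PHP_EOL;    // [item1,item2,item3]
echo sketchGetStringFromList(new ArrayObject(['a', 'b'])), PHP_EOL;    // [a,b]
```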

at line 107
static removeNonNumbersOrLettersOrDotsOrSpace($string)

Parameters

$string

at line 113
static normalizeWhitespace($string)

Parameters

$string

at line 127
static string removeNonNumbersOrLettersOrDotsOrCommaOrSpace(string $string)

Removes everything which is neither a number nor a letter nor a dot (.) nor a comma nor a space.

Note: does not remove whitespace around the numbers!

Parameters

string $string source string

Return Value

string result
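
The note about whitespace is easiest to see on a concrete input. The sketch below uses a hypothetical function name and assumes Unicode-aware matching. It is not the library's implementation, only an illustration of the documented behaviour.

```php
<?php
// Hypothetical stand-in for
// StringUtils::removeNonNumbersOrLettersOrDotsOrCommaOrSpace().
function sketchKeepAlnumDotsCommaSpace(string $string): string
{
    // Keep letters, numbers, dots, commas and plain spaces; drop the rest.
    return preg_replace('/[^\p{L}\p{N}., ]+/u', '', $string) ?? '';
}

// Brackets, hyphen and exclamation mark are removed, but the spaces
// around the numbers remain untouched.
echo sketchKeepAlnumDotsCommaSpace('Vol. 12 (3), pp. 45-67!'), PHP_EOL; // Vol. 12 3, pp. 4567
```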

at line 138
static string removeNonLettersOrDotsOrCommaOrSemicolonOrSpace(string $string)

Removes everything which is neither a letter nor a dot (.) nor a comma nor a semicolon nor white space.

Parameters

string $string

Return Value

string

at line 149
static string cleanTitle(string $title)

Two or more spaces in a row will be replaced by a single space character.

Parameters

string $title

Return Value

string cleaned title
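
As a rough illustration, the collapsing step could look like the sketch below. The function name is hypothetical, and the sketch treats only the plain space character as a "space". The real method may rely on the AT_LEAST_TWO_WHITE_SPACES constant and cover tabs or line breaks as well.

```php
<?php
// Hypothetical stand-in for StringUtils::cleanTitle().
// Assumption: only runs of the plain space character are collapsed.
function sketchCollapseSpaces(string $title): string
{
    return preg_replace('/ {2,}/', ' ', $title) ?? '';
}

echo sketchCollapseSpaces('A  study   of    names'), PHP_EOL; // A study of names
```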

at line 161
static string cleanTitle2(string $title)

Decodes HTML entities, removes tags, converts to UTF-8 and collapses runs of two or more white spaces.

Parameters

string $title

Return Value

string
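
The pipeline described above might be sketched as follows, using only PHP built-ins. The function name is hypothetical and the order of the steps is taken from the description. The charset conversion step is left out on purpose, because the source encoding is not documented here; the sketch assumes the input is already UTF-8.

```php
<?php
// Hypothetical stand-in for StringUtils::cleanTitle2().
// Assumption: input is already UTF-8, so no charset conversion is done.
function sketchCleanTitle2(string $title): string
{
    // 1. Decode HTML entities such as &eacute; or &amp;.
    $decoded = html_entity_decode($title, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    // 2. Remove HTML/XML tags.
    $plain = strip_tags($decoded);
    // 3. Collapse runs of two or more whitespace characters into one space.
    return preg_replace('/\s{2,}/', ' ', $plain) ?? '';
}

echo sketchCleanTitle2('<i>Caf&eacute;</i>   &amp;  Bar'), PHP_EOL; // Café & Bar
```

Because entities are decoded before tags are stripped, an escaped tag such as `&lt;b&gt;` is also removed in this sketch. Whether the real method behaves the same way is not stated in the documentation.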

at line 173
static array|boolean split(string $string, string $pattern)

Parameters

string $string
string $pattern

Return Value

array|boolean

at line 183
static string extractDateYearFromTitleSource(string $titleSource)

Returns the year from a string containing a substring like: JAN 19, 2013

Parameters

string $titleSource

Return Value

string
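
One way to picture this extraction is a pattern that matches a three-letter month, a day, a comma and a four-digit year. The function name, the exact pattern, the empty-string fallback and the sample input are all assumptions made for illustration.

```php
<?php
// Hypothetical stand-in for StringUtils::extractDateYearFromTitleSource().
// Assumption: an empty string is returned when no date is found.
function sketchExtractDateYear(string $titleSource): string
{
    // Upper-case month abbreviation, day, comma, then the captured year.
    $pattern = '/\b[A-Z]{3}\s+\d{1,2},\s*(\d{4})\b/';
    if (preg_match($pattern, $titleSource, $matches) === 1) {
        return $matches[1];
    }
    return '';
}

// Made-up title source, used only to exercise the pattern.
echo sketchExtractDateYear('SOME JOURNAL 12 (3): 45-67 JAN 19, 2013'), PHP_EOL; // 2013
```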

at line 199
static string extractYear(string $string)

Extracts four digits (year) from a string.

If a year pattern is found, it is returned; otherwise an empty string is returned.

Parameters

string $string

Return Value

string
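
The documented contract (four digits if present, otherwise an empty string) can be sketched in a few lines. The function name is hypothetical, and the sketch simply takes the first run of four digits, so it would also pick "1234" out of "12345". The real method may use the FOUR_NUMBERS constant with stricter boundaries.

```php
<?php
// Hypothetical stand-in for StringUtils::extractYear().
function sketchExtractYear(string $string): string
{
    // Return the first four consecutive digits, or '' when there are none.
    return preg_match('/\d{4}/', $string, $matches) === 1 ? $matches[0] : '';
}

echo sketchExtractYear('Published 2013, reprinted 2015'), PHP_EOL; // 2013
var_dump(sketchExtractYear('no year here'));                        // string(0) ""
```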

at line 215
static array splitTitleSource(string $titleSource)

Parameters

string $titleSource

Return Value

array

Exceptions

Exception

at line 228
static string extractJournalTitle(string $titleSource)

Parameters

string $titleSource

Return Value

string

at line 238
static string extractYearFromTitleSource(string $titleSource)

Parameters

string $titleSource

Return Value

string

at line 249
static string extractVolume(string $titleSource)

Parameters

string $titleSource

Return Value

string

at line 259
static string extractIssue(string $titleSource)

Parameters

string $titleSource

Return Value

string

at line 269
static string extractPage(string $titleSource)

Parameters

string $titleSource

Return Value

string

at line 282
static array toStringArray(array $array)

Converts an array of objects to an array of strings.

ATTENTION: Every object contained in the array must implement the __toString() method.

Parameters

array $array

Return Value

array
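
The __toString() requirement can be demonstrated with a small sketch. Both the SketchPerson class and the function name are hypothetical. The conversion is done with a plain string cast, which is an assumption about how the real method works.

```php
<?php
// Hypothetical value object; any class with __toString() would do.
final class SketchPerson
{
    private $name;

    public function __construct(string $name)
    {
        $this->name = $name;
    }

    public function __toString(): string
    {
        return $this->name;
    }
}

// Hypothetical stand-in for StringUtils::toStringArray().
function sketchToStringArray(array $array): array
{
    // The (string) cast invokes __toString() on each object; an object
    // without that method would raise an Error here.
    return array_map(static function ($item): string {
        return (string) $item;
    }, $array);
}

print_r(sketchToStringArray([new SketchPerson('Ada'), new SketchPerson('Grace')]));
```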

at line 291
static toASCII($str)

Parameters

$str

at line 303
static md5utf8($string)

Parameters

$string

at line 308
static utf8_encode($str)

Parameters

$str

at line 314
static parseBracketedKeyValuePairs($input, $assignmentOperator, $pairDelimiter, $bracketOpen, $bracketClosed)

Parameters

$input
$assignmentOperator
$pairDelimiter
$bracketOpen
$bracketClosed