ILIAS
release_5-3 Revision v5.3.23-19-g915713cf615
Static Public Member Functions
static | cleanUp ($string) |
The ultimate convenience function! Clean up invalid UTF-8 sequences, and convert to normal form C, canonical composition.
static | toNFC ($string) |
Convert a UTF-8 string to normal form C, canonical composition.
static | toNFD ($string) |
Convert a UTF-8 string to normal form D, canonical decomposition.
static | toNFKC ($string) |
Convert a UTF-8 string to normal form KC, compatibility composition.
static | toNFKD ($string) |
Convert a UTF-8 string to normal form KD, compatibility decomposition.
static | loadData () |
Load the basic composition data if necessary.
static | quickIsNFC ($string) |
Returns true if the string is definitely in NFC.
static | quickIsNFCVerify (&$string) |
Returns true if the string is definitely in NFC.
static | NFC ($string) |
static | NFD ($string) |
static | NFKC ($string) |
static | NFKD ($string) |
static | fastDecompose ($string, $map) |
Perform decomposition of a UTF-8 string into either D or KD form (depending on which decomposition map is passed to us).
static | fastCombiningSort ($string) |
Sorts combining characters into canonical order.
static | fastCompose ($string) |
Produces canonically composed sequences, i.e. normal form C or KC.
static | placebo ($string) |
This is just used for the benchmark, comparing how long it takes to iterate through a string without really doing anything of substance.
Definition at line 112 of file UtfNormal.php.
static cleanUp ($string)
The ultimate convenience function! Clean up invalid UTF-8 sequences, and convert to normal form C, canonical composition.
Fast return for pure ASCII strings; some lesser optimizations for strings containing only known-good characters. Not as fast as toNFC().
string | $string | a UTF-8 string |
Definition at line 125 of file UtfNormal.php.
References NFC(), NORMALIZE_ICU, quickIsNFCVerify(), UNORM_NFC, UTF8_FFFE, UTF8_FFFF, and UTF8_REPLACEMENT.
Referenced by CleanUpTest\doTestBytes(), CleanUpTest\doTestDoubleBytes(), CleanUpTest\doTestTripleBytes(), CleanUpTest\testAscii(), CleanUpTest\testBomRegression(), CleanUpTest\testChunkRegression(), CleanUpTest\testForbiddenRegression(), CleanUpTest\testHangulRegression(), CleanUpTest\testInterposeRegression(), CleanUpTest\testLatin(), CleanUpTest\testLatinNormal(), CleanUpTest\testNull(), CleanUpTest\testOverlongRegression(), CleanUpTest\testSurrogateRegression(), and CleanUpTest\XtestAllChars().
static fastCombiningSort ($string)
Sorts combining characters into canonical order.
This is the final step in creating decomposed normal forms D and KD.
string | $string | a valid, decomposed UTF-8 string. Input is not validated. |
Definition at line 638 of file UtfNormal.php.
References $i, $n, $out, $utfCombiningClass, array, and loadData().
Referenced by NFD(), and NFKD().
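The canonical ordering step described above can be illustrated with a minimal sketch. It is not the code from UtfNormal.php: the function name demoCombiningSort() and the four-entry DEMO_COMBINING_CLASS table are hypothetical stand-ins for the real combining-class data, and the sketch works on whole characters rather than on raw bytes. Marks with a non-zero combining class are moved left past marks of a strictly higher class, never past a starter (class 0), so marks of equal class keep their relative order.

```php
<?php
// Hypothetical, tiny combining-class table for illustration only.
// The values follow the Unicode Character Database for these four marks.
const DEMO_COMBINING_CLASS = [
    "\u{0327}" => 202, // COMBINING CEDILLA
    "\u{0323}" => 220, // COMBINING DOT BELOW
    "\u{0301}" => 230, // COMBINING ACUTE ACCENT
    "\u{0308}" => 230, // COMBINING DIAERESIS
];

function demoCombiningSort(string $string): string
{
    // Split into characters; assumes valid UTF-8, as the documented function does.
    $chars = preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);
    $n = count($chars);

    for ($i = 1; $i < $n; $i++) {
        $class = DEMO_COMBINING_CLASS[$chars[$i]] ?? 0;
        if ($class === 0) {
            continue; // starters never move
        }
        // Stable insertion: swap left only past marks of a strictly higher class.
        for ($j = $i; $j > 0; $j--) {
            $prev = DEMO_COMBINING_CLASS[$chars[$j - 1]] ?? 0;
            if ($prev <= $class) {
                break; // stops at starters (class 0) and at equal or lower classes
            }
            [$chars[$j - 1], $chars[$j]] = [$chars[$j], $chars[$j - 1]];
        }
    }

    return implode('', $chars);
}
```

For example, demoCombiningSort("a\u{0301}\u{0323}") moves the dot below (class 220) in front of the acute accent (class 230), while "a\u{0301}\u{0308}" is returned unchanged because both marks share class 230.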
static fastCompose ($string)
Produces canonically composed sequences, i.e. normal form C or KC.
string | $string | a valid UTF-8 string in sorted normal form D or KD. Input is not validated. |
Definition at line 693 of file UtfNormal.php.
References $i, $n, $out, $utfCanonicalComp, $utfCombiningClass, loadData(), UNICODE_HANGUL_FIRST, UNICODE_HANGUL_TCOUNT, UNICODE_HANGUL_VCOUNT, UTF8_HANGUL_FIRST, UTF8_HANGUL_LAST, UTF8_HANGUL_LBASE, UTF8_HANGUL_LEND, UTF8_HANGUL_TBASE, UTF8_HANGUL_TEND, UTF8_HANGUL_VBASE, and UTF8_HANGUL_VEND.
Referenced by NFC(), and NFKC().
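The Hangul constants in the reference list suggest that precomposed syllables are computed arithmetically rather than looked up. A minimal sketch of that arithmetic follows, assuming the standard Unicode Hangul composition constants. The function name demoComposeHangul() is hypothetical, and it operates on integer code points, whereas the documented function works directly on UTF-8 byte sequences.

```php
<?php
// Sketch of algorithmic Hangul composition on code points.
// Returns the precomposed syllable, or null if the inputs are not
// a composable leading consonant / vowel / optional trailing consonant.
function demoComposeHangul(int $lead, int $vowel, ?int $trail = null): ?int
{
    $sBase = 0xAC00; // first precomposed syllable
    $lBase = 0x1100; // first leading consonant
    $vBase = 0x1161; // first vowel
    $tBase = 0x11A7; // one before the first trailing consonant
    $lCount = 19;
    $vCount = 21;
    $tCount = 28;

    $lIndex = $lead - $lBase;
    $vIndex = $vowel - $vBase;
    if ($lIndex < 0 || $lIndex >= $lCount || $vIndex < 0 || $vIndex >= $vCount) {
        return null;
    }

    $syllable = $sBase + ($lIndex * $vCount + $vIndex) * $tCount;

    if ($trail !== null) {
        $tIndex = $trail - $tBase;
        if ($tIndex <= 0 || $tIndex >= $tCount) {
            return null;
        }
        $syllable += $tIndex;
    }

    return $syllable;
}
```

With these constants, the jamo sequence U+1112, U+1161, U+11AB composes to U+D55C, and U+1100 followed by U+1161 composes to U+AC00.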
static fastDecompose ($string, $map)
Perform decomposition of a UTF-8 string into either D or KD form (depending on which decomposition map is passed to us).
Input is assumed to be valid UTF-8. Invalid input will break the function.
string | $string | Valid UTF-8 string |
array | $map | hash of expanded decomposition map |
Definition at line 576 of file UtfNormal.php.
References $i, $index, $l, $n, $out, $t, loadData(), UNICODE_HANGUL_FIRST, UNICODE_HANGUL_NCOUNT, UNICODE_HANGUL_TCOUNT, UTF8_HANGUL_FIRST, and UTF8_HANGUL_LAST.
Referenced by NFD(), and NFKD().
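To illustrate how the choice of map selects the output form, here is a minimal sketch of map-driven decomposition. The function demoDecompose() and the two-entry map are hypothetical. The sketch assumes, as the parameter description suggests, that the map is already fully expanded, so a single lookup per character is enough. It omits the arithmetic Hangul handling that the reference list implies.

```php
<?php
// Sketch: replace each character by its entry in an expanded decomposition map.
// Characters without an entry are copied through unchanged.
function demoDecompose(string $string, array $map): string
{
    $out = '';
    // Assumes valid UTF-8, matching the documented precondition.
    foreach (preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY) as $char) {
        $out .= $map[$char] ?? $char;
    }
    return $out;
}

// Hypothetical two-entry canonical map; the real tables are far larger.
$demoCanonicalMap = [
    "\u{00E9}" => "e\u{0301}", // é -> e + COMBINING ACUTE ACCENT
    "\u{00C5}" => "A\u{030A}", // Å -> A + COMBINING RING ABOVE
];
```

Calling demoDecompose("caf\u{00E9}", $demoCanonicalMap) yields "cafe" followed by U+0301. Passing a compatibility map instead would yield KD-style output from the same loop.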
static loadData ()
Load the basic composition data if necessary.
Definition at line 232 of file UtfNormal.php.
References $utfCombiningClass.
Referenced by fastCombiningSort(), fastCompose(), fastDecompose(), NFD(), quickIsNFC(), and quickIsNFCVerify().
static NFC ($string)
string | $string |
Definition at line 517 of file UtfNormal.php.
References fastCompose(), and NFD().
Referenced by cleanUp(), CleanUpTest\doTestDoubleBytes(), CleanUpTest\doTestTripleBytes(), toNFC(), and CleanUpTest\XtestAllChars().
static NFD ($string)
string | $string |
Definition at line 528 of file UtfNormal.php.
References $utfCanonicalDecomp, fastCombiningSort(), fastDecompose(), and loadData().
Referenced by NFC(), and toNFD().
static NFKC ($string)
string | $string |
Definition at line 543 of file UtfNormal.php.
References fastCompose(), and NFKD().
Referenced by toNFKC().
static NFKD ($string)
string | $string |
Definition at line 554 of file UtfNormal.php.
References $utfCompatibilityDecomp, fastCombiningSort(), and fastDecompose().
Referenced by NFKC(), and toNFKD().
static placebo ($string)
This is just used for the benchmark, comparing how long it takes to iterate through a string without really doing anything of substance.
string | $string |
Definition at line 829 of file UtfNormal.php.
static quickIsNFC ($string)
Returns true if the string is definitely in NFC.
Returns false if not or uncertain.
string | $string | a valid UTF-8 string. Input is not validated. |
Definition at line 247 of file UtfNormal.php.
References $i, $n, $utfCombiningClass, and loadData().
Referenced by toNFC().
static quickIsNFCVerify (&$string)
Returns true if the string is definitely in NFC.
Returns false if not or uncertain.
string | $string | a UTF-8 string, altered on output to be valid UTF-8 safe for XML. |
Definition at line 291 of file UtfNormal.php.
References $base, $i, $n, $out, $remaining, $utfCombiningClass, array, down(), is, loadData(), security, to, UTF8_FFFE, UTF8_FFFF, UTF8_MAX, UTF8_OVERLONG_A, UTF8_OVERLONG_B, UTF8_OVERLONG_C, UTF8_REPLACEMENT, and UTF8_SURROGATE_FIRST.
Referenced by cleanUp().
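The validation side of this function (overlong forms, surrogates, and out-of-range values replaced by the replacement character) can be sketched as follows. This is an illustration under stated assumptions, not the code from UtfNormal.php: demoScrubUtf8() is a hypothetical name, it uses a regular expression for well-formed UTF-8 sequences instead of the byte-level state tracking the reference list implies, it returns a new string rather than altering its argument, and it does not replace U+FFFE or U+FFFF. The number of replacement characters emitted for a given invalid sequence may differ from the real function.

```php
<?php
// Sketch: copy well-formed UTF-8 runs, replace every other byte with U+FFFD.
function demoScrubUtf8(string $string): string
{
    // One or more well-formed sequences, anchored at the current offset.
    // Overlong forms, surrogates (ED A0..BF) and values above U+10FFFF do not match.
    $wellFormed = '/\G(?:[\x00-\x7F]'
        . '|[\xC2-\xDF][\x80-\xBF]'
        . '|\xE0[\xA0-\xBF][\x80-\xBF]'
        . '|[\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}'
        . '|\xED[\x80-\x9F][\x80-\xBF]'
        . '|\xF0[\x90-\xBF][\x80-\xBF]{2}'
        . '|[\xF1-\xF3][\x80-\xBF]{3}'
        . '|\xF4[\x80-\x8F][\x80-\xBF]{2})+/';

    $out = '';
    $pos = 0;
    $len = strlen($string);

    while ($pos < $len) {
        if (preg_match($wellFormed, $string, $m, 0, $pos)) {
            $out .= $m[0];           // keep the valid run as-is
            $pos += strlen($m[0]);
        } else {
            $out .= "\u{FFFD}";      // replace one offending byte
            $pos++;
        }
    }

    return $out;
}
```

For instance, the overlong encoding "\xC0\x80" between "a" and "b" comes back as "a", two replacement characters, and "b", while a valid string such as "caf\u{00E9}" is returned unchanged.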
static toNFC ($string)
Convert a UTF-8 string to normal form C, canonical composition.
Fast return for pure ASCII strings; some lesser optimizations for strings containing only known-good characters.
string | $string | a valid UTF-8 string. Input is not validated. |
Definition at line 157 of file UtfNormal.php.
References NFC(), NORMALIZE_ICU, quickIsNFC(), and UNORM_NFC.
Referenced by ilDAVServer\davDeslashify(), ilDAVServer\davUrlEncode(), ilTree\getNodePathForTitlePath(), and ilStr\normalizeUtf8String().
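The fast return for pure ASCII strings can be sketched in a few lines. The wrapper demoToNfc() and its $fullNormalize callback are hypothetical, and the callback stands in for whatever full normalization routine is available. The sketch shows only the ASCII short-circuit, not the further quick check for known-good characters.

```php
<?php
// Sketch: skip normalization entirely when the string contains no byte >= 0x80.
function demoToNfc(string $string, callable $fullNormalize): string
{
    // Pure ASCII is already in every Unicode normal form.
    if (!preg_match('/[\x80-\xff]/', $string)) {
        return $string;
    }
    return $fullNormalize($string);
}

// Usage: count how often the expensive path is actually taken.
$demoCalls = 0;
$demoNormalizer = function (string $s) use (&$demoCalls): string {
    $demoCalls++;
    return $s; // stand-in: a real callback would return the NFC form
};

demoToNfc('plain ascii', $demoNormalizer); // fast return, callback not invoked
demoToNfc("caf\u{00E9}", $demoNormalizer); // non-ASCII, callback invoked
```

After these two calls, $demoCalls is 1: only the string containing a non-ASCII character reached the callback.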
static toNFD ($string)
Convert a UTF-8 string to normal form D, canonical decomposition.
Fast return for pure ASCII strings.
string | $string | a valid UTF-8 string. Input is not validated. |
Definition at line 176 of file UtfNormal.php.
References NFD(), NORMALIZE_ICU, and UNORM_NFD.
static toNFKC ($string)
Convert a UTF-8 string to normal form KC, compatibility composition.
This may cause irreversible information loss; use judiciously. Fast return for pure ASCII strings.
string | $string | a valid UTF-8 string. Input is not validated. |
Definition at line 196 of file UtfNormal.php.
References NFKC(), NORMALIZE_ICU, and UNORM_NFKC.
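The information loss mentioned above is easiest to see with a concrete case. The sketch below does not use this class at all. As an assumption, it relies on the Normalizer class from PHP's intl extension, and the wrapper demoNfkc() is a hypothetical helper that returns null when that extension is not loaded.

```php
<?php
// Sketch: compatibility composition via PHP's intl extension, if present.
function demoNfkc(string $string): ?string
{
    if (!class_exists('Normalizer')) {
        return null; // intl extension not loaded
    }
    $result = Normalizer::normalize($string, Normalizer::FORM_KC);
    return $result === false ? null : $result;
}

// U+FB01 (LATIN SMALL LIGATURE FI) and U+00B2 (SUPERSCRIPT TWO)
// have compatibility mappings to plain "fi" and "2".
$demoInput = "\u{FB01}x\u{00B2}";
$demoOutput = demoNfkc($demoInput);
```

When intl is available, $demoOutput is the plain ASCII string "fix2". The ligature and the superscript cannot be recovered from it, which is why the compatibility forms are best reserved for uses such as search keys rather than stored text.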
static toNFKD ($string)
Convert a UTF-8 string to normal form KD, compatibility decomposition.
This may cause irreversible information loss; use judiciously. Fast return for pure ASCII strings.
string | $string | a valid UTF-8 string. Input is not validated. |
Definition at line 216 of file UtfNormal.php.
References NFKD(), NORMALIZE_ICU, and UNORM_NFKD.