ILIAS
release_4-3 Revision
|
Public Member Functions | |
cleanUp ($string) | |
The ultimate convenience function! Clean up invalid UTF-8 sequences, and convert to normal form C, canonical composition. | |
toNFC ($string) | |
Convert a UTF-8 string to normal form C, canonical composition. | |
toNFD ($string) | |
Convert a UTF-8 string to normal form D, canonical decomposition. | |
toNFKC ($string) | |
Convert a UTF-8 string to normal form KC, compatibility composition. | |
toNFKD ($string) | |
Convert a UTF-8 string to normal form KD, compatibility decomposition. | |
loadData () | |
Load the basic composition data if necessary private. | |
quickIsNFC ($string) | |
Returns true if the string is definitely in NFC. | |
quickIsNFCVerify (&$string) | |
Returns true if the string is definitely in NFC. | |
NFC ($string) | |
NFD ($string) | |
NFKC ($string) | |
NFKD ($string) | |
fastDecompose ($string, &$map) | |
Perform decomposition of a UTF-8 string into either D or KD form (depending on which decomposition map is passed to us). | |
fastCombiningSort ($string) | |
Sorts combining characters into canonical order. | |
fastCompose ($string) | |
Produces canonically composed sequences, i.e. | |
placebo ($string) | |
This is just used for the benchmark, comparing how long it takes to interate through a string without really doing anything of substance. |
Static Public Member Functions | |
static | cleanUp ($string) |
The ultimate convenience function! Clean up invalid UTF-8 sequences, and convert to normal form C, canonical composition. | |
static | toNFC ($string) |
Convert a UTF-8 string to normal form C, canonical composition. | |
static | toNFD ($string) |
Convert a UTF-8 string to normal form D, canonical decomposition. | |
static | toNFKC ($string) |
Convert a UTF-8 string to normal form KC, compatibility composition. | |
static | toNFKD ($string) |
Convert a UTF-8 string to normal form KD, compatibility decomposition. | |
static | quickIsNFC ($string) |
Returns true if the string is definitely in NFC. | |
static | quickIsNFCVerify (&$string) |
Returns true if the string is definitely in NFC. | |
static | placebo ($string) |
This is just used for the benchmark, comparing how long it takes to interate through a string without really doing anything of substance. |
Static Private Member Functions | |
static | loadData () |
Load the basic composition data if necessary. | |
static | NFC ($string) |
static | NFD ($string) |
static | NFKC ($string) |
static | NFKD ($string) |
static | fastDecompose ($string, $map) |
Perform decomposition of a UTF-8 string into either D or KD form (depending on which decomposition map is passed to us). | |
static | fastCombiningSort ($string) |
Sorts combining characters into canonical order. | |
static | fastCompose ($string) |
Produces canonically composed sequences, i.e. |
Definition at line 117 of file UtfNormal.php.
|
static |
The ultimate convenience function! Clean up invalid UTF-8 sequences, and convert to normal form C, canonical composition.
Fast return for pure ASCII strings; some lesser optimizations for strings containing only known-good characters. Not as fast as toNFC().
string | $string | a UTF-8 string |
Definition at line 124 of file UtfNormal.php.
References NFC(), NORMALIZE_ICU, quickIsNFCVerify(), UNORM_NFC, UTF8_FFFE, UTF8_FFFF, and UTF8_REPLACEMENT.
UtfNormal::cleanUp | ( | $string | ) |
The ultimate convenience function! Clean up invalid UTF-8 sequences, and convert to normal form C, canonical composition.
Fast return for pure ASCII strings; some lesser optimizations for strings containing only known-good characters. Not as fast as toNFC().
string | $string | a UTF-8 string |
Definition at line 128 of file UtfNormal.php.
References NFC(), NORMALIZE_ICU, quickIsNFCVerify(), UNORM_NFC, UTF8_FFFE, UTF8_FFFF, and UTF8_REPLACEMENT.
Referenced by CleanUpTest\doTestBytes(), CleanUpTest\doTestDoubleBytes(), CleanUpTest\doTestTripleBytes(), CleanUpTest\testAscii(), CleanUpTest\testBomRegression(), CleanUpTest\testChunkRegression(), CleanUpTest\testForbiddenRegression(), CleanUpTest\testHangulRegression(), CleanUpTest\testInterposeRegression(), CleanUpTest\testLatin(), CleanUpTest\testLatinNormal(), CleanUpTest\testNull(), CleanUpTest\testOverlongRegression(), CleanUpTest\testSurrogateRegression(), and CleanUpTest\XtestAllChars().
UtfNormal::fastCombiningSort | ( | $string | ) |
Sorts combining characters into canonical order.
This is the final step in creating decomposed normal forms D and KD. private
string | $string | a valid, decomposed UTF-8 string. Input is not validated. |
Definition at line 600 of file UtfNormal.php.
References $n, $out, $utfCombiningClass, and loadData().
Referenced by NFD(), and NFKD().
|
staticprivate |
Sorts combining characters into canonical order.
This is the final step in creating decomposed normal forms D and KD.
string | $string | a valid, decomposed UTF-8 string. Input is not validated. |
Definition at line 610 of file UtfNormal.php.
References $n, $out, $utfCombiningClass, and loadData().
UtfNormal::fastCompose | ( | $string | ) |
Produces canonically composed sequences, i.e.
normal form C or KC.
private
string | $string | a valid UTF-8 string in sorted normal form D or KD. Input is not validated. |
Definition at line 649 of file UtfNormal.php.
References $n, $out, $utfCanonicalComp, $utfCombiningClass, loadData(), UNICODE_HANGUL_FIRST, UNICODE_HANGUL_TCOUNT, UNICODE_HANGUL_VCOUNT, UTF8_HANGUL_FIRST, UTF8_HANGUL_LAST, UTF8_HANGUL_LBASE, UTF8_HANGUL_LEND, UTF8_HANGUL_TBASE, UTF8_HANGUL_TEND, UTF8_HANGUL_VBASE, and UTF8_HANGUL_VEND.
Referenced by NFC(), and NFKC().
|
staticprivate |
Produces canonically composed sequences, i.e.
normal form C or KC.
string | $string | a valid UTF-8 string in sorted normal form D or KD. Input is not validated. |
Definition at line 664 of file UtfNormal.php.
References $n, $out, $utfCanonicalComp, $utfCombiningClass, loadData(), UNICODE_HANGUL_FIRST, UNICODE_HANGUL_TCOUNT, UNICODE_HANGUL_VCOUNT, UTF8_HANGUL_FIRST, UTF8_HANGUL_LAST, UTF8_HANGUL_LBASE, UTF8_HANGUL_LEND, UTF8_HANGUL_TBASE, UTF8_HANGUL_TEND, UTF8_HANGUL_VBASE, and UTF8_HANGUL_VEND.
UtfNormal::fastDecompose | ( | $string, | |
& | $map | ||
) |
Perform decomposition of a UTF-8 string into either D or KD form (depending on which decomposition map is passed to us).
Input is assumed to be valid UTF-8. Invalid code will break. private
string | $string | Valid UTF-8 string |
array | $map | hash of expanded decomposition map |
Definition at line 540 of file UtfNormal.php.
References $n, $out, $t, loadData(), UNICODE_HANGUL_FIRST, UNICODE_HANGUL_NCOUNT, UNICODE_HANGUL_TCOUNT, UTF8_HANGUL_FIRST, and UTF8_HANGUL_LAST.
Referenced by NFD(), and NFKD().
|
staticprivate |
Perform decomposition of a UTF-8 string into either D or KD form (depending on which decomposition map is passed to us).
Input is assumed to be valid UTF-8. Invalid code will break.
string | $string | Valid UTF-8 string |
array | $map | hash of expanded decomposition map |
Definition at line 549 of file UtfNormal.php.
References $n, $out, $t, loadData(), UNICODE_HANGUL_FIRST, UNICODE_HANGUL_NCOUNT, UNICODE_HANGUL_TCOUNT, UTF8_HANGUL_FIRST, and UTF8_HANGUL_LAST.
UtfNormal::loadData | ( | ) |
Load the basic composition data if necessary private.
Definition at line 220 of file UtfNormal.php.
References $utfCanonicalComp, $utfCanonicalDecomp, and $utfCombiningClass.
Referenced by fastCombiningSort(), fastCompose(), fastDecompose(), NFD(), quickIsNFC(), and quickIsNFCVerify().
|
staticprivate |
Load the basic composition data if necessary.
Definition at line 221 of file UtfNormal.php.
References $utfCombiningClass.
UtfNormal::NFC | ( | $string | ) |
string | $string |
Definition at line 491 of file UtfNormal.php.
References fastCompose(), and NFD().
Referenced by cleanUp(), CleanUpTest\doTestDoubleBytes(), CleanUpTest\doTestTripleBytes(), toNFC(), and CleanUpTest\XtestAllChars().
|
staticprivate |
string | $string |
Definition at line 496 of file UtfNormal.php.
References fastCompose(), and NFD().
UtfNormal::NFD | ( | $string | ) |
string | $string |
Definition at line 500 of file UtfNormal.php.
References $utfCanonicalDecomp, fastCombiningSort(), fastDecompose(), and loadData().
Referenced by NFC(), and toNFD().
|
staticprivate |
string | $string |
Definition at line 506 of file UtfNormal.php.
References $utfCanonicalDecomp, fastCombiningSort(), fastDecompose(), and loadData().
UtfNormal::NFKC | ( | $string | ) |
string | $string |
Definition at line 512 of file UtfNormal.php.
References fastCompose(), and NFKD().
Referenced by toNFKC().
|
staticprivate |
string | $string |
Definition at line 519 of file UtfNormal.php.
References fastCompose(), and NFKD().
UtfNormal::NFKD | ( | $string | ) |
string | $string |
Definition at line 521 of file UtfNormal.php.
References $utfCompatibilityDecomp, fastCombiningSort(), and fastDecompose().
Referenced by NFKC(), and toNFKD().
|
staticprivate |
string | $string |
Definition at line 529 of file UtfNormal.php.
References $utfCompatibilityDecomp, fastCombiningSort(), and fastDecompose().
UtfNormal::placebo | ( | $string | ) |
This is just used for the benchmark, comparing how long it takes to interate through a string without really doing anything of substance.
string | $string |
Definition at line 781 of file UtfNormal.php.
References $out.
|
static |
This is just used for the benchmark, comparing how long it takes to interate through a string without really doing anything of substance.
string | $string |
Definition at line 797 of file UtfNormal.php.
References $out.
UtfNormal::quickIsNFC | ( | $string | ) |
Returns true if the string is definitely in NFC.
Returns false if not or uncertain.
string | $string | a valid UTF-8 string. Input is not validated. |
Definition at line 233 of file UtfNormal.php.
References $n, $utfCombiningClass, and loadData().
Referenced by toNFC().
|
static |
Returns true if the string is definitely in NFC.
Returns false if not or uncertain.
string | $string | a valid UTF-8 string. Input is not validated. |
Definition at line 235 of file UtfNormal.php.
References $n, $utfCombiningClass, and loadData().
UtfNormal::quickIsNFCVerify | ( | & | $string | ) |
Returns true if the string is definitely in NFC.
Returns false if not or uncertain.
string | $string | a UTF-8 string, altered on output to be valid UTF-8 safe for XML. |
Definition at line 273 of file UtfNormal.php.
References $n, $utfCombiningClass, loadData(), UTF8_FFFE, UTF8_FFFF, UTF8_MAX, UTF8_OVERLONG_A, UTF8_OVERLONG_B, UTF8_OVERLONG_C, UTF8_REPLACEMENT, and UTF8_SURROGATE_FIRST.
Referenced by cleanUp().
|
static |
Returns true if the string is definitely in NFC.
Returns false if not or uncertain.
string | $string | a UTF-8 string, altered on output to be valid UTF-8 safe for XML. |
Definition at line 276 of file UtfNormal.php.
References $n, $utfCombiningClass, loadData(), UTF8_FFFE, UTF8_FFFF, UTF8_MAX, UTF8_OVERLONG_A, UTF8_OVERLONG_B, UTF8_OVERLONG_C, UTF8_REPLACEMENT, and UTF8_SURROGATE_FIRST.
|
static |
Convert a UTF-8 string to normal form C, canonical composition.
Fast return for pure ASCII strings; some lesser optimizations for strings containing only known-good characters.
string | $string | a valid UTF-8 string. Input is not validated. |
Definition at line 154 of file UtfNormal.php.
References NFC(), NORMALIZE_ICU, quickIsNFC(), and UNORM_NFC.
UtfNormal::toNFC | ( | $string | ) |
Convert a UTF-8 string to normal form C, canonical composition.
Fast return for pure ASCII strings; some lesser optimizations for strings containing only known-good characters.
string | $string | a valid UTF-8 string. Input is not validated. |
Definition at line 157 of file UtfNormal.php.
References NFC(), NORMALIZE_ICU, quickIsNFC(), and UNORM_NFC.
Referenced by ilDAVServer\davDeslashify(), ilDAVServer\davUrlEncode(), and ilTree\getNodePathForTitlePath().
|
static |
Convert a UTF-8 string to normal form D, canonical decomposition.
Fast return for pure ASCII strings.
string | $string | a valid UTF-8 string. Input is not validated. |
Definition at line 171 of file UtfNormal.php.
References NFD(), NORMALIZE_ICU, and UNORM_NFD.
UtfNormal::toNFD | ( | $string | ) |
Convert a UTF-8 string to normal form D, canonical decomposition.
Fast return for pure ASCII strings.
string | $string | a valid UTF-8 string. Input is not validated. |
Definition at line 173 of file UtfNormal.php.
References NFD(), NORMALIZE_ICU, and UNORM_NFD.
|
static |
Convert a UTF-8 string to normal form KC, compatibility composition.
This may cause irreversible information loss, use judiciously. Fast return for pure ASCII strings.
string | $string | a valid UTF-8 string. Input is not validated. |
Definition at line 189 of file UtfNormal.php.
References NFKC(), NORMALIZE_ICU, and UNORM_NFKC.
UtfNormal::toNFKC | ( | $string | ) |
Convert a UTF-8 string to normal form KC, compatibility composition.
This may cause irreversible information loss, use judiciously. Fast return for pure ASCII strings.
string | $string | a valid UTF-8 string. Input is not validated. |
Definition at line 190 of file UtfNormal.php.
References NFKC(), NORMALIZE_ICU, and UNORM_NFKC.
|
static |
Convert a UTF-8 string to normal form KD, compatibility decomposition.
This may cause irreversible information loss, use judiciously. Fast return for pure ASCII strings.
string | $string | a valid UTF-8 string. Input is not validated. |
Definition at line 207 of file UtfNormal.php.
References NFKD(), NORMALIZE_ICU, and UNORM_NFKD.
UtfNormal::toNFKD | ( | $string | ) |
Convert a UTF-8 string to normal form KD, compatibility decomposition.
This may cause irreversible information loss, use judiciously. Fast return for pure ASCII strings.
string | $string | a valid UTF-8 string. Input is not validated. |
Definition at line 207 of file UtfNormal.php.
References NFKD(), NORMALIZE_ICU, and UNORM_NFKD.
Referenced by ShibAuth\toAscii().