ILIAS  release_9 Revision v9.13-25-g2c18ec4c24f
Sanitizer Class Reference

Static Public Member Functions

static decodeCharReferences ($text)
 Decode any character references, numeric or named entities, in the text and return a UTF-8 string.
 
static decodeCharReferencesCallback ($matches)
 
static decodeChar ($codepoint)
 Return UTF-8 string for a codepoint if that is a valid character reference, otherwise U+FFFD REPLACEMENT CHARACTER.
 
static decodeEntity ($name)
 If the named entity is defined in the HTML 4.0/XHTML 1.0 DTD, return the UTF-8 encoding of that character.
 

Static Private Member Functions

static validateCodepoint ($codepoint)
 Returns true if a given Unicode codepoint is a valid character in XML.
 

Detailed Description

Definition at line 355 of file Sanitizer.php.

Member Function Documentation

◆ decodeChar()

static Sanitizer::decodeChar (   $codepoint)
static

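Return UTF-8 string for a codepoint if that is a valid character reference, otherwise U+FFFD REPLACEMENT CHARACTER. Note that the code listed below returns an empty string for a valid codepoint; the codepointToUtf8() call is commented out.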

Parameters
int $codepoint
Returns
string

Definition at line 416 of file Sanitizer.php.

    {
        if (Sanitizer::validateCodepoint($codepoint)) {
            return "";
            //return codepointToUtf8($codepoint);

References Sanitizer::validateCodepoint() (Sanitizer.php:362).
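
The commented-out codepointToUtf8() call in the listing above marks where the encoding step would sit. The following is a minimal sketch of that step, not the ILIAS implementation: codepointToUtf8Sketch() and decodeCharSketch() are hypothetical names, the encoder is hand-rolled, and the validity check is passed in as a callable instead of calling the private validateCodepoint().

```php
<?php
// Hypothetical stand-in for codepointToUtf8(); not part of Sanitizer.php.
// Encodes one Unicode codepoint (assumed <= 0x10FFFF) as a UTF-8 byte string.
function codepointToUtf8Sketch(int $cp): string
{
    if ($cp < 0x80) {
        return chr($cp);                                  // 1 byte
    }
    if ($cp < 0x800) {
        return chr(0xC0 | ($cp >> 6))                     // 2 bytes
             . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp < 0x10000) {
        return chr(0xE0 | ($cp >> 12))                    // 3 bytes
             . chr(0x80 | (($cp >> 6) & 0x3F))
             . chr(0x80 | ($cp & 0x3F));
    }
    return chr(0xF0 | ($cp >> 18))                        // 4 bytes
         . chr(0x80 | (($cp >> 12) & 0x3F))
         . chr(0x80 | (($cp >> 6) & 0x3F))
         . chr(0x80 | ($cp & 0x3F));
}

// Hypothetical counterpart to decodeChar(): $isValid plays the role of
// validateCodepoint(); invalid codepoints become U+FFFD.
function decodeCharSketch(int $cp, callable $isValid): string
{
    return $isValid($cp) ? codepointToUtf8Sketch($cp) : "\u{FFFD}";
}
```

Passing the check as a callable keeps the sketch independent of any particular validity rule; a caller would normally supply the XML character test described under validateCodepoint().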

◆ decodeCharReferences()

static Sanitizer::decodeCharReferences (   $text)
static

Decode any character references, numeric or named entities, in the text and return a UTF-8 string.

Parameters
string $text
Returns
string

Definition at line 381 of file Sanitizer.php.

Referenced by Title\newFromText().

    {
        return preg_replace_callback(

References MW_CHAR_REFS_REGEX (Sanitizer.php:30): Regular expression to match various types of character references in Sanitizer::normalizeCharReferenc...

◆ decodeCharReferencesCallback()

static Sanitizer::decodeCharReferencesCallback (   $matches)
static
Parameters
array $matches
Returns
string

Definition at line 394 of file Sanitizer.php.

    {
        if ($matches[1] != '') {
            return Sanitizer::decodeEntity($matches[1]);
        } elseif ($matches[2] != '') {
            return Sanitizer::decodeChar(intval($matches[2]));
        } elseif ($matches[3] != '') {
            return Sanitizer::decodeChar(hexdec($matches[3]));
        } elseif ($matches[4] != '') {
            return Sanitizer::decodeChar(hexdec($matches[4]));

References Sanitizer::decodeChar() (Sanitizer.php:416) and Sanitizer::decodeEntity() (Sanitizer.php:434).
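
The dispatch in the listing above relies on the pattern having one capture group per reference type: group 1 for named entities, group 2 for decimal references, groups 3 and 4 for hexadecimal ones. The following sketch shows that layout end to end. CHAR_REFS_SKETCH is a simplified, hypothetical pattern (the real MW_CHAR_REFS_REGEX is not reproduced here), and describeReferenceSketch() only labels each match instead of decoding it.

```php
<?php
// Hypothetical, simplified pattern with the same group order that the
// callback listing implies. It is NOT the real MW_CHAR_REFS_REGEX.
const CHAR_REFS_SKETCH =
    '/&([A-Za-z0-9]+);|&#([0-9]+);|&#x([0-9A-Fa-f]+);|&#X([0-9A-Fa-f]+);/';

// Hypothetical callback: reports what was matched rather than decoding it.
function describeReferenceSketch(array $m): string
{
    if ($m[1] != '') {
        return '[entity:' . $m[1] . ']';               // named entity
    } elseif ($m[2] != '') {
        return sprintf('[U+%04X]', intval($m[2]));     // decimal reference
    } elseif ($m[3] != '') {
        return sprintf('[U+%04X]', hexdec($m[3]));     // hex reference, &#x...;
    } elseif ($m[4] != '') {
        return sprintf('[U+%04X]', hexdec($m[4]));     // hex reference, &#X...;
    }
    return $m[0];                                      // leave anything else alone
}

$out = preg_replace_callback(
    CHAR_REFS_SKETCH,
    'describeReferenceSketch',
    'a&amp;b &#233; &#xe9; &#XE9;'
);
```

The groups have to be tested in ascending order: when a later alternative matches, PCRE fills the earlier groups with empty strings but omits trailing unmatched groups from the array altogether.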

◆ decodeEntity()

static Sanitizer::decodeEntity (   $name)
static

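If the named entity is defined in the HTML 4.0/XHTML 1.0 DTD, return the UTF-8 encoding of that character. Note that the code listed below returns an empty string for a known entity; the codepointToUtf8() call is commented out.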

Otherwise, returns the pseudo-entity source (e.g. &foo;).

Parameters
string $name
Returns
string

Definition at line 434 of file Sanitizer.php.

    {

        if (isset($wgHtmlEntityAliases[$name])) {
            $name = $wgHtmlEntityAliases[$name];
        }
        if (isset($wgHtmlEntities[$name])) {
            return "";
            //return codepointToUtf8($wgHtmlEntities[$name]);

References $wgHtmlEntities (Sanitizer.php:63), the list of all named character entities defined in HTML 4.01 (http://www.w3.org/TR/html4/sgml/entities.html), and $wgHtmlEntityAliases (Sanitizer.php:321), the character entity aliases accepted by MediaWiki.
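
The lookup described above has two stages: resolve an alias to its canonical name, then look the name up in the entity table. The following sketch illustrates that order together with the documented fallback to the pseudo-entity source. Everything in it is an assumption for illustration: decodeEntitySketch() is a hypothetical name, the two tables are tiny made-up stand-ins for $wgHtmlEntities and $wgHtmlEntityAliases, and html_entity_decode() on a numeric reference is swapped in for codepointToUtf8().

```php
<?php
// Illustrative tables only: the real globals are far larger, and the alias
// below is made up for this example.
$entitiesSketch = ['amp' => 38, 'eacute' => 233, 'rlm' => 8207];
$aliasesSketch  = ['right-to-left' => 'rlm'];

// Hypothetical counterpart to decodeEntity(), with the tables passed in
// explicitly instead of being read from globals.
function decodeEntitySketch(string $name, array $entities, array $aliases): string
{
    // Stage 1: map an alias to its canonical entity name.
    if (isset($aliases[$name])) {
        $name = $aliases[$name];
    }
    // Stage 2: known entity, so return the character as UTF-8.
    if (isset($entities[$name])) {
        // Stand-in for codepointToUtf8(): let PHP decode a numeric reference.
        return html_entity_decode(
            '&#' . $entities[$name] . ';',
            ENT_QUOTES | ENT_HTML5,
            'UTF-8'
        );
    }
    // Unknown name: hand back the pseudo-entity source unchanged.
    return '&' . $name . ';';
}
```

Returning the pseudo-entity source for unknown names means that text such as `&nosuch;` passes through the decoder unchanged rather than being dropped.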

◆ validateCodepoint()

static Sanitizer::validateCodepoint (   $codepoint)
static private

Returns true if a given Unicode codepoint is a valid character in XML.

Parameters
int $codepoint
Returns
bool

Definition at line 362 of file Sanitizer.php.

    {
        return ($codepoint == 0x09)
            || ($codepoint == 0x0a)
            || ($codepoint == 0x0d)
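
The listing above shows only the first three conditions. For reference, the following sketch writes out the full character production from the XML 1.0 specification (section 2.2). It assumes that validateCodepoint() follows that production. The ranges after 0x0d come from the specification, not from the listing, and isXmlCharSketch() is a hypothetical name.

```php
<?php
// Hypothetical helper implementing the XML 1.0 "Char" production:
//   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
function isXmlCharSketch(int $cp): bool
{
    return $cp === 0x09                             // tab
        || $cp === 0x0A                             // line feed
        || $cp === 0x0D                             // carriage return
        || ($cp >= 0x20 && $cp <= 0xD7FF)           // up to the surrogate block
        || ($cp >= 0xE000 && $cp <= 0xFFFD)         // excludes U+FFFE and U+FFFF
        || ($cp >= 0x10000 && $cp <= 0x10FFFF);     // supplementary planes
}
```

Under this rule, most C0 control characters, the surrogate range and the two noncharacters U+FFFE and U+FFFF are rejected, so a reference such as `&#0;` would decode to U+FFFD.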

The documentation for this class was generated from the following file: