ILIAS  trunk Revision v11.0_alpha-1723-g8e69f309bab
Sanitizer.php File Reference


Data Structures

class  Sanitizer
 

Functions

 codepointToUtf8 ($codepoint)
 

Variables

const MW_CHAR_REFS_REGEX '/&([A-Za-z0-9\x80-\xff]+); |&\#([0-9]+); |&\#x([0-9A-Za-z]+); |&\#X([0-9A-Za-z]+); |(&)/x'
 Regular expression to match various types of character references in Sanitizer::normalizeCharReferences and Sanitizer::decodeCharReferences.
 
 $attrib = '[A-Za-z0-9]'
 Character class for one attribute-name character; a building block of MW_ATTRIBS_REGEX, which matches HTML/XML attribute pairs within a tag.
 
 $space = '[\x09\x0a\x0d\x20]'
 
const MW_ATTRIBS_REGEX "/(?:^|$space)($attrib+)
    ($space*=$space*
        (?:
            # The attribute value: quoted or alone
            \"([^<\"]*)\"
            | '([^<']*)'
            | ([a-zA-Z0-9!#$%&()*,\\-.\\/:;<>?@[\\]^_`{|}~]+)
            | (\#[0-9a-fA-F]+) # Technically wrong, but lots of
                               # colors are specified like this.
                               # We'll be normalizing it.
        )
    )?(?=$space|\$)/sx"
 
global $wgHtmlEntities
 List of all named character entities defined in HTML 4.01 http://www.w3.org/TR/html4/sgml/entities.html.
 
global $wgHtmlEntityAliases
 Character entity aliases accepted by MediaWiki.
 

Function Documentation

◆ codepointToUtf8()

codepointToUtf8 (   $codepoint)

Definition at line 319 of file Sanitizer.php.

{
    if ($codepoint < 0x80) {
        // 1 byte: plain ASCII
        return chr($codepoint);
    }
    if ($codepoint < 0x800) {
        // 2 bytes: 110xxxxx 10xxxxxx
        return chr($codepoint >> 6 & 0x3f | 0xc0) .
            chr($codepoint & 0x3f | 0x80);
    }
    if ($codepoint < 0x10000) {
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return chr($codepoint >> 12 & 0x0f | 0xe0) .
            chr($codepoint >> 6 & 0x3f | 0x80) .
            chr($codepoint & 0x3f | 0x80);
    }
    if ($codepoint < 0x110000) {
        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        return chr($codepoint >> 18 & 0x07 | 0xf0) .
            chr($codepoint >> 12 & 0x3f | 0x80) .
            chr($codepoint >> 6 & 0x3f | 0x80) .
            chr($codepoint & 0x3f | 0x80);
    }
    // The listing ends here; the handling of code points >= 0x110000 is not shown.
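The branches above pack a code point into one to four UTF-8 bytes. A minimal, self-contained sketch of the same bit arithmetic is given here so the byte layout can be checked against known encodings. The function name `sketchCodepointToUtf8` and the empty-string result for out-of-range input are assumptions made for illustration; the listing does not show what the ILIAS function returns in that case.

```php
<?php
// Illustrative sketch only; not the ILIAS implementation.
// In PHP, >> binds tighter than &, which binds tighter than |,
// so "$cp >> 6 & 0x3f | 0x80" means "((($cp >> 6) & 0x3f) | 0x80)".
function sketchCodepointToUtf8(int $codepoint): string
{
    if ($codepoint < 0 || $codepoint >= 0x110000) {
        // Assumption: the real fallback is not visible in the listing.
        return '';
    }
    if ($codepoint < 0x80) {
        return chr($codepoint);
    }
    if ($codepoint < 0x800) {
        return chr($codepoint >> 6 & 0x3f | 0xc0) .
            chr($codepoint & 0x3f | 0x80);
    }
    if ($codepoint < 0x10000) {
        return chr($codepoint >> 12 & 0x0f | 0xe0) .
            chr($codepoint >> 6 & 0x3f | 0x80) .
            chr($codepoint & 0x3f | 0x80);
    }
    return chr($codepoint >> 18 & 0x07 | 0xf0) .
        chr($codepoint >> 12 & 0x3f | 0x80) .
        chr($codepoint >> 6 & 0x3f | 0x80) .
        chr($codepoint & 0x3f | 0x80);
}

// Print the encoded bytes as hex for a few sample code points.
foreach ([0x41, 0xE9, 0x20AC, 0x1F600] as $cp) {
    printf("U+%04X => %s\n", $cp, bin2hex(sketchCodepointToUtf8($cp)));
}
```

For these inputs the sketch prints `41`, `c3a9`, `e282ac` and `f09f9880`, which are the standard UTF-8 encodings of "A", "é", "€" and U+1F600.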

Variable Documentation

◆ $attrib

$attrib = '[A-Za-z0-9]'

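Character class matching a single attribute-name character. It is interpolated into MW_ATTRIBS_REGEX, the regular expression that matches HTML/XML attribute pairs within a tag.

Allows some... latitude. MW_ATTRIBS_REGEX is used in Sanitizer::fixTagAttributes and Sanitizer::decodeTagAttributes.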

Definition at line 34 of file Sanitizer.php.

Referenced by SurveyImportParser\handlerBeginTag().

◆ $space

$space = '[\x09\x0a\x0d\x20]'

◆ $wgHtmlEntities

$wgHtmlEntities
private

List of all named character entities defined in HTML 4.01 http://www.w3.org/TR/html4/sgml/entities.html.

Definition at line 55 of file Sanitizer.php.

◆ $wgHtmlEntityAliases

$wgHtmlEntityAliases
Initial value:
= array(
'רלמ' => 'rlm',
'رلم' => 'rlm',
)

Character entity aliases accepted by MediaWiki.

Definition at line 313 of file Sanitizer.php.
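As a rough illustration of how an alias table like this can be used, the sketch below maps an entity name to its canonical name and then to a code point. The function `sketchResolveEntity` and the two-entry entity table are hypothetical; the real `$wgHtmlEntities` list covers all HTML 4.01 names, and how the Sanitizer class actually consults both arrays is not shown on this page.

```php
<?php
// Illustrative sketch only; names and the reduced entity table are assumptions.
$sketchAliases = [
    'רלמ' => 'rlm',
    'رلم' => 'rlm',
];

// Tiny subset of named entities: rlm is U+200F (8207), amp is U+0026 (38).
$sketchEntities = [
    'rlm' => 8207,
    'amp' => 38,
];

// Resolve an entity name (or alias) to a code point, or null if unknown.
function sketchResolveEntity(string $name, array $aliases, array $entities): ?int
{
    $canonical = $aliases[$name] ?? $name;
    return $entities[$canonical] ?? null;
}

var_dump(sketchResolveEntity('רלמ', $sketchAliases, $sketchEntities));
var_dump(sketchResolveEntity('bogus', $sketchAliases, $sketchEntities));
```

The non-ASCII alias names are also the likely reason the named-entity branch of MW_CHAR_REFS_REGEX accepts bytes in the range `\x80-\xff`.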

◆ MW_ATTRIBS_REGEX

const MW_ATTRIBS_REGEX "/(?:^|$space)($attrib+)
    ($space*=$space*
        (?:
            # The attribute value: quoted or alone
            \"([^<\"]*)\"
            | '([^<']*)'
            | ([a-zA-Z0-9!#$%&()*,\\-.\\/:;<>?@[\\]^_`{|}~]+)
            | (\#[0-9a-fA-F]+) # Technically wrong, but lots of
                               # colors are specified like this.
                               # We'll be normalizing it.
        )
    )?(?=$space|\$)/sx"

Definition at line 36 of file Sanitizer.php.
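To make the capture groups easier to follow, here is a small sketch that rebuilds the pattern from `$attrib` and `$space` and runs it over a sample attribute string. The function names `sketchAttribsRegex` and `sketchParseAttribs` are hypothetical, and the way the matches are folded into an array is an assumption for illustration, not the logic of Sanitizer::decodeTagAttributes.

```php
<?php
// Illustrative sketch only; function names are hypothetical.
function sketchAttribsRegex(): string
{
    $attrib = '[A-Za-z0-9]';
    $space = '[\x09\x0a\x0d\x20]';
    // Groups: 1 = name, 2 = "=value" part, 3 = double-quoted value,
    // 4 = single-quoted value, 5 = unquoted value, 6 = "#rrggbb"-style value.
    return "/(?:^|$space)($attrib+)
        ($space*=$space*
            (?:
                # The attribute value: quoted or alone
                \"([^<\"]*)\"
                | '([^<']*)'
                | ([a-zA-Z0-9!#$%&()*,\\-.\\/:;<>?@[\\]^_`{|}~]+)
                | (\#[0-9a-fA-F]+)
            )
        )?(?=$space|\$)/sx";
}

// Return name => value pairs; attributes without a value map to null.
function sketchParseAttribs(string $text): array
{
    preg_match_all(sketchAttribsRegex(), $text, $sets, PREG_SET_ORDER);
    $result = [];
    foreach ($sets as $m) {
        if (($m[2] ?? '') === '') {
            $result[$m[1]] = null;
            continue;
        }
        // At most one of the value groups is non-empty for a given match.
        $result[$m[1]] = ($m[3] ?? '') . ($m[4] ?? '') . ($m[5] ?? '') . ($m[6] ?? '');
    }
    return $result;
}

var_export(sketchParseAttribs('class="foo" id=\'bar\' width=100 checked'));
```

Because `$attrib` only admits letters and digits, names containing hyphens or colons do not match as a whole with this pattern.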

◆ MW_CHAR_REFS_REGEX

const MW_CHAR_REFS_REGEX '/&([A-Za-z0-9\x80-\xff]+);
    |&\#([0-9]+);
    |&\#x([0-9A-Za-z]+);
    |&\#X([0-9A-Za-z]+);
    |(&)/x'

Regular expression to match various types of character references in Sanitizer::normalizeCharReferences and Sanitizer::decodeCharReferences.

This file is part of ILIAS, a powerful learning management system published by ILIAS open source e-Learning e.V.

ILIAS is licensed under the GPL-3.0, see https://www.gnu.org/licenses/gpl-3.0.en.html. You should have received a copy of said license along with the source code, too.

If this is not the case or you just want to try ILIAS, you'll find us at: https://www.ilias.de and https://github.com/ILIAS-eLearning

Definition at line 22 of file Sanitizer.php.
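The pattern has five alternatives, each with its own capture group: a named reference, a decimal reference, a hexadecimal reference with lowercase `x`, one with uppercase `X`, and a bare ampersand. The sketch below shows which group fires for each kind of input. The constant `SKETCH_CHAR_REFS_REGEX` and the function `sketchClassifyCharRefs` are hypothetical names; this is not how Sanitizer::normalizeCharReferences processes its matches.

```php
<?php
// Illustrative sketch only; constant and function names are hypothetical.
const SKETCH_CHAR_REFS_REGEX = '/&([A-Za-z0-9\x80-\xff]+);
    |&\#([0-9]+);
    |&\#x([0-9A-Za-z]+);
    |&\#X([0-9A-Za-z]+);
    |(&)/x';

// Return a list of [kind, payload] pairs, one per ampersand construct found.
function sketchClassifyCharRefs(string $text): array
{
    preg_match_all(SKETCH_CHAR_REFS_REGEX, $text, $sets, PREG_SET_ORDER);
    $kinds = [];
    foreach ($sets as $m) {
        if (($m[1] ?? '') !== '') {
            $kinds[] = ['named', $m[1]];
        } elseif (($m[2] ?? '') !== '') {
            $kinds[] = ['decimal', $m[2]];
        } elseif (($m[3] ?? '') !== '') {
            $kinds[] = ['hex', $m[3]];
        } elseif (($m[4] ?? '') !== '') {
            $kinds[] = ['hex', $m[4]];
        } else {
            $kinds[] = ['bare', '&'];
        }
    }
    return $kinds;
}

var_export(sketchClassifyCharRefs('&amp; &#65; &#x41; &#X41; & x'));
```

The final `(&)` alternative catches ampersands that do not start a well-formed reference, which lets a caller escape them separately.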