ILIAS  trunk Revision v11.0_alpha-3011-gc6b235a2e85
Sanitizer.php File Reference

Data Structures

class  Sanitizer
 

Functions

 codepointToUtf8 ($codepoint)
 

Variables

const MW_CHAR_REFS_REGEX '/&([A-Za-z0-9\x80-\xff]+); |&\#([0-9]+); |&\#x([0-9A-Za-z]+); |&\#X([0-9A-Za-z]+); |(&)/x'
 Regular expression to match various types of character references in Sanitizer::normalizeCharReferences and Sanitizer::decodeCharReferences.
 
 $attrib = '[A-Za-z0-9]'
 Regular expression to match HTML/XML attribute pairs within a tag.
 
 $space = '[\x09\x0a\x0d\x20]'
 
const MW_ATTRIBS_REGEX "/(?:^|$space)($attrib+) ($space*=$space* (?: \"([^<\"]*)\" | '([^<']*)' | ([a-zA-Z0-9!#$%&()*,\\-.\\/:;<>?@[\\]^_`{|}~]+) | (\#[0-9a-fA-F]+) # Technically wrong, but lots of ) )?(?=$space|\$)/sx"
 
global $wgHtmlEntities
 List of all named character entities defined in HTML 4.01 http://www.w3.org/TR/html4/sgml/entities.html.
 
global $wgHtmlEntityAliases
 Character entity aliases accepted by MediaWiki.
 

Function Documentation

◆ codepointToUtf8()

codepointToUtf8 (   $codepoint)

Definition at line 319 of file Sanitizer.php.

{
    // 1 byte: U+0000..U+007F
    if ($codepoint < 0x80) {
        return chr($codepoint);
    }
    // 2 bytes: U+0080..U+07FF
    if ($codepoint < 0x800) {
        return chr($codepoint >> 6 & 0x3f | 0xc0) .
            chr($codepoint & 0x3f | 0x80);
    }
    // 3 bytes: U+0800..U+FFFF
    if ($codepoint < 0x10000) {
        return chr($codepoint >> 12 & 0x0f | 0xe0) .
            chr($codepoint >> 6 & 0x3f | 0x80) .
            chr($codepoint & 0x3f | 0x80);
    }
    // 4 bytes: U+10000..U+10FFFF
    if ($codepoint < 0x110000) {
        return chr($codepoint >> 18 & 0x07 | 0xf0) .
            chr($codepoint >> 12 & 0x3f | 0x80) .
            chr($codepoint >> 6 & 0x3f | 0x80) .
            chr($codepoint & 0x3f | 0x80);
    }
    // The listing ends here; handling of code points >= 0x110000 is not shown.
}
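
To make the bit arithmetic above concrete, here is a minimal stand-alone sketch that re-implements the same encoding under a hypothetical name, sketchCodepointToUtf8, and prints the bytes for one code point from each length class. Returning null for out-of-range input is an assumption made for this sketch; the listing does not show what the original function does in that case.

```php
<?php
// Hypothetical re-implementation for illustration; not the ILIAS function itself.
function sketchCodepointToUtf8(int $codepoint): ?string
{
    if ($codepoint < 0 || $codepoint >= 0x110000) {
        return null; // assumption: the original's out-of-range handling is not shown
    }
    if ($codepoint < 0x80) {
        return chr($codepoint);                      // 0xxxxxxx
    }
    if ($codepoint < 0x800) {
        return chr($codepoint >> 6 & 0x3f | 0xc0)    // 110xxxxx
            . chr($codepoint & 0x3f | 0x80);         // 10xxxxxx
    }
    if ($codepoint < 0x10000) {
        return chr($codepoint >> 12 & 0x0f | 0xe0)   // 1110xxxx
            . chr($codepoint >> 6 & 0x3f | 0x80)
            . chr($codepoint & 0x3f | 0x80);
    }
    return chr($codepoint >> 18 & 0x07 | 0xf0)       // 11110xxx
        . chr($codepoint >> 12 & 0x3f | 0x80)
        . chr($codepoint >> 6 & 0x3f | 0x80)
        . chr($codepoint & 0x3f | 0x80);
}

// One sample per length class, printed as hex bytes.
foreach ([0x41, 0xE9, 0x20AC, 0x1F600] as $cp) {
    printf("U+%04X -> %s\n", $cp, bin2hex(sketchCodepointToUtf8($cp)));
}
```

The four samples cover the 1-, 2-, 3- and 4-byte branches in turn, so each return statement is exercised once.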

Variable Documentation

◆ $attrib

$attrib = '[A-Za-z0-9]'

Regular expression to match HTML/XML attribute pairs within a tag.

Allows some... latitude. Used in Sanitizer::fixTagAttributes and Sanitizer::decodeTagAttributes

Definition at line 34 of file Sanitizer.php.

Referenced by SurveyImportParser\handlerBeginTag().

◆ $space

$space = '[\x09\x0a\x0d\x20]'

◆ $wgHtmlEntities

$wgHtmlEntities
private

List of all named character entities defined in HTML 4.01 http://www.w3.org/TR/html4/sgml/entities.html.

Definition at line 55 of file Sanitizer.php.

◆ $wgHtmlEntityAliases

$wgHtmlEntityAliases
Initial value:
= array(
'רלמ' => 'rlm',
'رلم' => 'rlm',
)

Character entity aliases accepted by MediaWiki.

Definition at line 313 of file Sanitizer.php.
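
As an illustration of how such an alias table might be consulted, the sketch below first maps an alias to its canonical entity name and then looks that name up in an entity table. The function name sketchResolveEntity, the two-entry entity table, and the assumption that the table maps names to code points are all hypothetical; only the two aliases are taken from this page.

```php
<?php
// Aliases as shown on this page.
$aliases = [
    'רלמ' => 'rlm',
    'رلم' => 'rlm',
];

// Hypothetical excerpt of an entity table, assumed to map names to code points.
$entities = [
    'rlm' => 8207, // U+200F RIGHT-TO-LEFT MARK
    'amp' => 38,   // U+0026 AMPERSAND
];

// Resolve an entity name (or alias) to a code point, or null if unknown.
function sketchResolveEntity(string $name, array $aliases, array $entities): ?int
{
    $canonical = $aliases[$name] ?? $name; // fall back to the name itself
    return $entities[$canonical] ?? null;
}

var_dump(sketchResolveEntity('רלמ', $aliases, $entities));
var_dump(sketchResolveEntity('bogus', $aliases, $entities));
```

Keeping aliases in a separate table means the entity table itself can stay limited to the standard names.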

◆ MW_ATTRIBS_REGEX

const MW_ATTRIBS_REGEX "/(?:^|$space)($attrib+)
    ($space*=$space*
      (?:
        \"([^<\"]*)\"
      | '([^<']*)'
      | ([a-zA-Z0-9!#$%&()*,\\-.\\/:;<>?@[\\]^_`{|}~]+)
      | (\#[0-9a-fA-F]+) # Technically wrong, but lots of
      )
    )?(?=$space|\$)/sx"

Definition at line 48 of file Sanitizer.php.
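
The following sketch shows how a pattern of this shape can be used to pull attribute pairs out of a tag's attribute string. It uses a simplified pattern with a reduced character set for unquoted values, not the constant itself, and the function name sketchParseAttributes is hypothetical. The $attrib and $space fragments are the ones documented on this page.

```php
<?php
// Hypothetical helper: parse "name=value" pairs from a tag's attribute string.
function sketchParseAttributes(string $text): array
{
    $attrib = '[A-Za-z0-9]';          // as documented on this page
    $space  = '[\x09\x0a\x0d\x20]';   // tab, LF, CR, space

    // Simplified pattern in the spirit of MW_ATTRIBS_REGEX (assumption:
    // bare values are limited to a small character set here).
    $pattern = "/(?:^|$space)($attrib+)
        ($space*=$space*
          (?:
            \"([^<\"]*)\"         # group 3: double-quoted value
          | '([^<']*)'            # group 4: single-quoted value
          | ([A-Za-z0-9_.:-]+)    # group 5: bare value
          )
        )?(?=$space|\$)/sx";

    preg_match_all($pattern, $text, $sets, PREG_SET_ORDER);

    $attribs = [];
    foreach ($sets as $m) {
        $value = ''; // valueless attributes keep an empty string
        foreach ([3, 4, 5] as $i) {
            // Trailing unmatched groups are absent, earlier ones are ''.
            if (($m[$i] ?? '') !== '') {
                $value = $m[$i];
                break;
            }
        }
        $attribs[strtolower($m[1])] = $value;
    }
    return $attribs;
}

print_r(sketchParseAttributes('class="note" id=main checked'));
```

The lookahead at the end of the pattern requires whitespace or the end of the string after each pair, which is what keeps a match from stopping in the middle of a bare value.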

◆ MW_CHAR_REFS_REGEX

const MW_CHAR_REFS_REGEX '/&([A-Za-z0-9\x80-\xff]+); |&\#([0-9]+); |&\#x([0-9A-Za-z]+); |&\#X([0-9A-Za-z]+); |(&)/x'

Regular expression to match various types of character references in Sanitizer::normalizeCharReferences and Sanitizer::decodeCharReferences.

The file header that precedes this constant reads: This file is part of ILIAS, a powerful learning management system published by ILIAS open source e-Learning e.V.

ILIAS is licensed with the GPL-3.0, see https://www.gnu.org/licenses/gpl-3.0.en.html. You should have received a copy of said license along with the source code, too.

If this is not the case or you just want to try ILIAS, you'll find us at: https://www.ilias.de https://github.com/ILIAS-eLearning

Definition at line 27 of file Sanitizer.php.
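
To show what each alternative in this pattern captures, the sketch below runs a copy of the expression over a string and labels every match by the capture group that fired. The constant name SKETCH_CHAR_REFS_REGEX and the function classifyCharRefs are hypothetical; the pattern text is copied from this page.

```php
<?php
// Pattern copied from this page under a hypothetical name, to avoid
// clashing with the real constant.
const SKETCH_CHAR_REFS_REGEX = '/&([A-Za-z0-9\x80-\xff]+); |&\#([0-9]+); |&\#x([0-9A-Za-z]+); |&\#X([0-9A-Za-z]+); |(&)/x';

// Label each character reference (or bare ampersand) found in $text.
function classifyCharRefs(string $text): array
{
    preg_match_all(SKETCH_CHAR_REFS_REGEX, $text, $sets, PREG_SET_ORDER);

    $found = [];
    foreach ($sets as $m) {
        if (($m[1] ?? '') !== '') {
            $found[] = 'named:' . $m[1];             // e.g. &amp;
        } elseif (($m[2] ?? '') !== '') {
            $found[] = 'decimal:' . (int) $m[2];     // e.g. &#65;
        } elseif (($m[3] ?? '') !== '' || ($m[4] ?? '') !== '') {
            $hex = ($m[3] ?? '') !== '' ? $m[3] : $m[4];
            $found[] = 'hex:' . hexdec($hex);        // e.g. &#x41; or &#X41;
        } else {
            $found[] = 'bare-ampersand';             // "&" that starts no reference
        }
    }
    return $found;
}

print_r(classifyCharRefs('a &amp; b &#65; &#x41; &#X41; R&D'));
```

The final alternative, a lone ampersand, is what lets a caller find and escape ampersands that do not begin a well-formed reference.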