ILIAS  release_5-2 Revision v5.2.25-18-g3f80b82851
enshrined\svgSanitize\Sanitizer Class Reference

Public Member Functions

 __construct ()
 
 getAllowedTags ()
 Get the array of allowed tags.
 
 setAllowedTags (TagInterface $allowedTags)
 Set custom allowed tags.
 
 getAllowedAttrs ()
 Get the array of allowed attributes.
 
 setAllowedAttrs (AttributeInterface $allowedAttrs)
 Set custom allowed attributes.
 
 removeRemoteReferences ($removeRemoteRefs=false)
 Should we remove references to remote files?
 
 sanitize ($dirty)
 Sanitize the passed string.
 
 minify ($shouldMinify=false)
 Should we minify the output?
 

Data Fields

const SCRIPT_REGEX = '/(?:\w+script|data):/xi'
 Regex to catch script and data values in attributes.
 
const REMOTE_REFERENCE_REGEX = '/url\(([\'"]?(?:http|https):)[\'"]?([^\'"\)]*)[\'"]?\)/xi'
 Regex to test for remote URLs in linked assets.
 

Protected Member Functions

 resetInternal ()
 Set up the DOMDocument.
 
 setUpBefore ()
 Set up libXML before we start.
 
 resetAfter ()
 Reset the class after use.
 
 removeDoctype ()
 Remove the XML Doctype. It may be caught later on output, but that seems to be buggy, so we need to make sure it's gone.
 
 startClean (\DOMNodeList $elements)
 Start the cleaning with tags, then we move onto attributes and hrefs later.
 
 cleanAttributesOnWhitelist (\DOMElement $element)
 Only allow attributes that are on the whitelist.
 
 cleanXlinkHrefs (\DOMElement &$element)
 Clean the xlink:hrefs of script and data embeds.
 
 cleanHrefs (\DOMElement &$element)
 Clean the hrefs of script and data embeds.
 
 hasRemoteReference ($value)
 Does this attribute value have a remote reference?
 

Protected Attributes

 $xmlDocument
 
 $allowedTags
 
 $allowedAttrs
 
 $xmlLoaderValue
 
 $minifyXML = false
 
 $removeRemoteReferences = false
 

Detailed Description

Definition at line 18 of file Sanitizer.php.

Constructor & Destructor Documentation

◆ __construct()

enshrined\svgSanitize\Sanitizer::__construct ( )

Definition at line 64 of file Sanitizer.php.

References enshrined\svgSanitize\data\AllowedAttributes\getAttributes(), enshrined\svgSanitize\data\AllowedTags\getTags(), and enshrined\svgSanitize\Sanitizer\resetInternal().

{
    $this->resetInternal();

    // Load default tags/attributes
    $this->allowedAttrs = AllowedAttributes::getAttributes();
    $this->allowedTags = AllowedTags::getTags();
}

Member Function Documentation

◆ cleanAttributesOnWhitelist()

enshrined\svgSanitize\Sanitizer::cleanAttributesOnWhitelist ( \DOMElement  $element)
protected

Only allow attributes that are on the whitelist.

Parameters
\DOMElement $element

Definition at line 256 of file Sanitizer.php.

References $x, enshrined\svgSanitize\Sanitizer\hasRemoteReference(), and enshrined\svgSanitize\Sanitizer\removeRemoteReferences().

Referenced by enshrined\svgSanitize\Sanitizer\startClean().

{
    for ($x = $element->attributes->length - 1; $x >= 0; $x--) {
        // get attribute name
        $attrName = $element->attributes->item($x)->name;

        // Remove attribute if not in whitelist
        if (!in_array(strtolower($attrName), $this->allowedAttrs)) {
            $element->removeAttribute($attrName);
        }

        // Do we want to strip remote references?
        if ($this->removeRemoteReferences) {
            // Remove attribute if it has a remote reference
            if (isset($element->attributes->item($x)->value) && $this->hasRemoteReference($element->attributes->item($x)->value)) {
                $element->removeAttribute($attrName);
            }
        }
    }
}
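
The loop above runs from the last attribute to the first because `$element->attributes` is a live map: removing an entry shifts the index of every entry after it. The following is a minimal sketch of the same whitelist pass using only PHP's DOM extension. The sample markup and the `$allowedAttrs` list are made up for illustration and are not the library's defaults.

```php
<?php
// Hypothetical whitelist; the real defaults come from AllowedAttributes::getAttributes().
$allowedAttrs = ['width', 'height', 'fill'];

$doc = new DOMDocument();
$doc->loadXML('<rect width="10" onclick="evil()" height="5" onload="evil()" FILL="red"/>');
$element = $doc->documentElement;

// Walk backwards so that removing an attribute never shifts an index we still need.
for ($x = $element->attributes->length - 1; $x >= 0; $x--) {
    $attrName = $element->attributes->item($x)->name;

    // Names are lower-cased before the lookup, so "FILL" is treated like "fill".
    if (!in_array(strtolower($attrName), $allowedAttrs, true)) {
        $element->removeAttribute($attrName);
    }
}

// Collect the surviving attribute names for inspection.
$kept = [];
foreach ($element->attributes as $attribute) {
    $kept[] = $attribute->name;
}
echo implode(', ', $kept), PHP_EOL;
```

Only the event-handler attributes are dropped here; the comparison is case-insensitive, but the attribute keeps its original spelling in the document.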

◆ cleanHrefs()

enshrined\svgSanitize\Sanitizer::cleanHrefs ( \DOMElement &  $element)
protected

Clean the hrefs of script and data embeds.

Parameters
\DOMElement $element

Definition at line 295 of file Sanitizer.php.

Referenced by enshrined\svgSanitize\Sanitizer\startClean().

{
    $href = $element->getAttribute('href');
    if (preg_match(self::SCRIPT_REGEX, $href) === 1) {
        $element->removeAttribute('href');
    }
}
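
To see which values the check above would flag, the pattern from `SCRIPT_REGEX` can be tried in isolation. This is only an illustrative sketch: the sample values are invented, and the pattern is copied verbatim from the constant documented on this page.

```php
<?php
// Pattern copied from the SCRIPT_REGEX constant documented on this page.
$scriptRegex = '/(?:\w+script|data):/xi';

// Invented sample attribute values.
$samples = [
    'javascript:alert(1)',
    'VBScript:msgbox(1)',
    'data:image/svg+xml;base64,AAAA',
    '#gradient',
    'https://example.com/icon.svg',
];

// Map each value to whether the pattern matches it.
$flagged = [];
foreach ($samples as $value) {
    $flagged[$value] = preg_match($scriptRegex, $value) === 1;
}

foreach ($flagged as $value => $isFlagged) {
    echo ($isFlagged ? 'remove ' : 'keep   '), $value, PHP_EOL;
}
```

The pattern is not anchored, so it matches a `...script:` or `data:` sequence anywhere in the value, not only at the start.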

◆ cleanXlinkHrefs()

enshrined\svgSanitize\Sanitizer::cleanXlinkHrefs ( \DOMElement &  $element)
protected

Clean the xlink:hrefs of script and data embeds.

Parameters
\DOMElement $element

Definition at line 282 of file Sanitizer.php.

Referenced by enshrined\svgSanitize\Sanitizer\startClean().

{
    $xlinks = $element->getAttributeNS('http://www.w3.org/1999/xlink', 'href');
    if (preg_match(self::SCRIPT_REGEX, $xlinks) === 1) {
        $element->removeAttributeNS('http://www.w3.org/1999/xlink', 'href');
    }
}
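
The namespace-aware lookup above can be reproduced with plain DOM calls. The sketch below assumes a small hand-written SVG document and applies the same test to every element; the variable names are illustrative only.

```php
<?php
$xlinkNs = 'http://www.w3.org/1999/xlink';
// Pattern copied from the SCRIPT_REGEX constant documented on this page.
$scriptRegex = '/(?:\w+script|data):/xi';

$doc = new DOMDocument();
$doc->loadXML(
    '<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">'
    . '<a xlink:href="javascript:alert(1)"><text>x</text></a>'
    . '<use xlink:href="#shape"/>'
    . '</svg>'
);

foreach ($doc->getElementsByTagName('*') as $element) {
    // getAttributeNS() returns an empty string when the attribute is absent.
    $href = $element->getAttributeNS($xlinkNs, 'href');
    if (preg_match($scriptRegex, $href) === 1) {
        $element->removeAttributeNS($xlinkNs, 'href');
    }
}

$a = $doc->getElementsByTagName('a')->item(0);
$use = $doc->getElementsByTagName('use')->item(0);
var_dump($a->hasAttributeNS($xlinkNs, 'href'));   // the script link is gone
var_dump($use->getAttributeNS($xlinkNs, 'href')); // the local reference survives
```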

◆ getAllowedAttrs()

enshrined\svgSanitize\Sanitizer::getAllowedAttrs ( )

Get the array of allowed attributes.

Returns
array

Definition at line 114 of file Sanitizer.php.

References enshrined\svgSanitize\Sanitizer\$allowedAttrs.

{
    return $this->allowedAttrs;
}

◆ getAllowedTags()

enshrined\svgSanitize\Sanitizer::getAllowedTags ( )

Get the array of allowed tags.

Returns
array

Definition at line 94 of file Sanitizer.php.

References enshrined\svgSanitize\Sanitizer\$allowedTags.

{
    return $this->allowedTags;
}

◆ hasRemoteReference()

enshrined\svgSanitize\Sanitizer::hasRemoteReference (   $value)
protected

Does this attribute value have a remote reference?

Parameters
$value
Returns
bool

Definition at line 309 of file Sanitizer.php.

Referenced by enshrined\svgSanitize\Sanitizer\cleanAttributesOnWhitelist().

{
    if (preg_match(self::REMOTE_REFERENCE_REGEX, $value) === 1) {
        return true;
    }

    return false;
}
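
A quick way to understand what counts as a "remote reference" here is to run the pattern against a few values. The sketch below copies `REMOTE_REFERENCE_REGEX` from this page; the helper name `hasRemoteReferenceSketch` and the sample values are hypothetical.

```php
<?php
// Pattern copied from the REMOTE_REFERENCE_REGEX constant documented on this page.
$remoteRegex = '/url\(([\'"]?(?:http|https):)[\'"]?([^\'"\)]*)[\'"]?\)/xi';

// Hypothetical free-function version of the protected method, for demonstration only.
function hasRemoteReferenceSketch($value, $regex)
{
    return preg_match($regex, $value) === 1;
}

$results = [];
foreach ([
    'url(https://example.com/pattern.png)',
    'url("http://example.com/font.woff")',
    'url(#localGradient)',
    'https://example.com/pattern.png',
    'url(//example.com/pattern.png)',
] as $value) {
    $results[$value] = hasRemoteReferenceSketch($value, $remoteRegex);
}

foreach ($results as $value => $isRemote) {
    echo ($isRemote ? 'remote ' : 'local  '), $value, PHP_EOL;
}
```

As written, the pattern only fires on `url(...)` values with an explicit `http:` or `https:` scheme, so a bare URL or a protocol-relative `url(//...)` is not reported by this check.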

◆ minify()

enshrined\svgSanitize\Sanitizer::minify (   $shouldMinify = false)

Should we minify the output?

Parameters
bool $shouldMinify

Definition at line 323 of file Sanitizer.php.

{
    $this->minifyXML = (bool) $shouldMinify;
}

◆ removeDoctype()

enshrined\svgSanitize\Sanitizer::removeDoctype ( )
protected

Remove the XML Doctype. It may be caught later on output, but that seems to be buggy, so we need to make sure it's gone.

Definition at line 215 of file Sanitizer.php.

Referenced by enshrined\svgSanitize\Sanitizer\sanitize().

{
    foreach ($this->xmlDocument->childNodes as $child) {
        if ($child->nodeType === XML_DOCUMENT_TYPE_NODE) {
            $child->parentNode->removeChild($child);
        }
    }
}
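
The effect of this step can be checked on a standalone document. The sketch below is a variation rather than the library's code: it first copies the live `childNodes` list into an array with `iterator_to_array()`, so that removing a node cannot disturb the iteration. The sample markup is invented.

```php
<?php
$doc = new DOMDocument();
$doc->loadXML('<!DOCTYPE svg><svg><rect/></svg>');

$hadDoctype = $doc->doctype !== null;

// Snapshot the live node list before removing anything from it.
foreach (iterator_to_array($doc->childNodes) as $child) {
    if ($child->nodeType === XML_DOCUMENT_TYPE_NODE) {
        $child->parentNode->removeChild($child);
    }
}

$hasDoctype = $doc->doctype !== null;
$output = $doc->saveXML($doc->documentElement);

var_dump($hadDoctype, $hasDoctype);
echo $output, PHP_EOL;
```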

◆ removeRemoteReferences()

enshrined\svgSanitize\Sanitizer::removeRemoteReferences (   $removeRemoteRefs = false)

Should we remove references to remote files?

Parameters
bool $removeRemoteRefs

Definition at line 134 of file Sanitizer.php.

Referenced by enshrined\svgSanitize\Sanitizer\cleanAttributesOnWhitelist().

{
    $this->removeRemoteReferences = $removeRemoteRefs;
}

◆ resetAfter()

enshrined\svgSanitize\Sanitizer::resetAfter ( )
protected

Reset the class after use.

Definition at line 202 of file Sanitizer.php.

References enshrined\svgSanitize\Sanitizer\resetInternal().

Referenced by enshrined\svgSanitize\Sanitizer\sanitize().

{
    // Reset DOMDocument to a clean state in case we use it again
    $this->resetInternal();

    // Reset the entity loader
    libxml_disable_entity_loader($this->xmlLoaderValue);
}

◆ resetInternal()

enshrined\svgSanitize\Sanitizer::resetInternal ( )
protected

Set up the DOMDocument.

Definition at line 76 of file Sanitizer.php.

Referenced by enshrined\svgSanitize\Sanitizer\__construct(), and enshrined\svgSanitize\Sanitizer\resetAfter().

{
    $this->xmlDocument = new DOMDocument();
    $this->xmlDocument->preserveWhiteSpace = false;
    $this->xmlDocument->strictErrorChecking = false;
    $this->xmlDocument->formatOutput = true;

    // Maybe don't format the output
    if ($this->minifyXML) {
        $this->xmlDocument->formatOutput = false;
    }
}

◆ sanitize()

enshrined\svgSanitize\Sanitizer::sanitize (   $dirty)

Sanitize the passed string.

Parameters
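string $dirty
Returns
string|false The sanitized XML. An empty string is returned for empty input, and false if the input cannot be parsed as XML.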

Definition at line 145 of file Sanitizer.php.

References enshrined\svgSanitize\Sanitizer\removeDoctype(), enshrined\svgSanitize\Sanitizer\resetAfter(), enshrined\svgSanitize\Sanitizer\setUpBefore(), and enshrined\svgSanitize\Sanitizer\startClean().

{
    // Don't run on an empty string
    if (empty($dirty)) {
        return '';
    }

    // Strip php tags
    $dirty = preg_replace('/<\?(=|php)(.+?)\?>/i', '', $dirty);

    $this->setUpBefore();

    $loaded = $this->xmlDocument->loadXML($dirty);

    // If we couldn't parse the XML then we go no further. Reset and return false
    if (!$loaded) {
        $this->resetAfter();
        return false;
    }

    $this->removeDoctype();

    // Grab all the elements
    $allElements = $this->xmlDocument->getElementsByTagName("*");

    // Start the cleaning process
    $this->startClean($allElements);

    // Save cleaned XML to a variable
    $clean = $this->xmlDocument->saveXML($this->xmlDocument->documentElement, LIBXML_NOEMPTYTAG);

    $this->resetAfter();

    // Remove any extra whitespace when minifying
    if ($this->minifyXML) {
        $clean = preg_replace('/\s+/', ' ', $clean);
    }

    // Return result
    return $clean;
}
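
Two of the steps above are plain regular-expression rewrites and can be tried without the class. The sketch below copies both patterns from the listing; the input strings are invented for illustration.

```php
<?php
// Pattern copied from the "Strip php tags" step in sanitize().
$stripPhp = '/<\?(=|php)(.+?)\?>/i';

// Single-line PHP blocks and short echo tags are removed; the XML declaration is kept.
$dirty = '<?xml version="1.0"?><svg><?php echo 1; ?><rect/><?= $x ?></svg>';
$stripped = preg_replace($stripPhp, '', $dirty);

// Without the "s" modifier, "." does not match a newline, so this block is left in place.
$multiline = "<svg><?php\necho 1;\n?></svg>";
$multilineResult = preg_replace($stripPhp, '', $multiline);

// The minify step collapses every run of whitespace into a single space.
$minified = preg_replace('/\s+/', ' ', "<svg>\n  <rect/>\n</svg>");

echo $stripped, PHP_EOL;
echo $minified, PHP_EOL;
```

The multi-line case is worth noting when reading the listing: with this pattern alone, a PHP block that spans several lines is not stripped before parsing.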

◆ setAllowedAttrs()

enshrined\svgSanitize\Sanitizer::setAllowedAttrs ( AttributeInterface  $allowedAttrs)

Set custom allowed attributes.

Parameters
AttributeInterface $allowedAttrs

Definition at line 124 of file Sanitizer.php.

{
    $this->allowedAttrs = $allowedAttrs::getAttributes();
}

◆ setAllowedTags()

enshrined\svgSanitize\Sanitizer::setAllowedTags ( TagInterface  $allowedTags)

Set custom allowed tags.

Parameters
TagInterface $allowedTags

Definition at line 104 of file Sanitizer.php.

{
    $this->allowedTags = $allowedTags::getTags();
}
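
Because the setter calls `getTags()` statically on the object it receives, a custom whitelist only needs a class that exposes that static method. The sketch below is an illustration under assumptions: `TagInterfaceSketch` is a local stand-in for the library's `TagInterface` (whose declaration is not shown on this page), and `MinimalTags` with its tag list is hypothetical.

```php
<?php
// Local stand-in for the library's TagInterface; assumed to declare a static getTags().
interface TagInterfaceSketch
{
    public static function getTags();
}

// Hypothetical custom whitelist that allows only a handful of shape elements.
class MinimalTags implements TagInterfaceSketch
{
    public static function getTags()
    {
        return ['svg', 'g', 'path', 'rect', 'circle'];
    }
}

// Mirrors what the setter does: a static call made through an instance.
$allowedTags = new MinimalTags();
$tags = $allowedTags::getTags();

echo implode(', ', $tags), PHP_EOL;
```

With the real library, an instance of such a class would presumably be handed to `setAllowedTags()` before calling `sanitize()`; the same shape applies to `setAllowedAttrs()` and `getAttributes()`.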

◆ setUpBefore()

enshrined\svgSanitize\Sanitizer::setUpBefore ( )
protected

Set up libXML before we start.

Definition at line 190 of file Sanitizer.php.

Referenced by enshrined\svgSanitize\Sanitizer\sanitize().

{
    // Turn off the entity loader
    $this->xmlLoaderValue = libxml_disable_entity_loader(true);

    // Suppress the errors because we don't really have to worry about formation before cleansing
    libxml_use_internal_errors(true);
}
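
The error-suppression half of this setup can be observed on its own. The sketch below uses only `libxml_use_internal_errors()` and related built-ins on an intentionally malformed, invented input. It leaves out `libxml_disable_entity_loader()`, which is deprecated as of PHP 8.0.

```php
<?php
// Collect libxml errors internally instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument();
// Malformed on purpose: <rect> is never closed.
$loaded = $doc->loadXML('<svg><rect></svg>');

$errors = libxml_get_errors();
libxml_clear_errors();

// Restore whatever error handling mode was active before.
libxml_use_internal_errors($previous);

var_dump($loaded);
echo count($errors), ' libxml error(s) collected', PHP_EOL;
```

This is why `sanitize()` can test the return value of `loadXML()` quietly and return false for unparseable input.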

◆ startClean()

enshrined\svgSanitize\Sanitizer::startClean ( \DOMNodeList  $elements)
protected

Start the cleaning with tags, then we move onto attributes and hrefs later.

Parameters
\DOMNodeList $elements

Definition at line 229 of file Sanitizer.php.

References enshrined\svgSanitize\Sanitizer\cleanAttributesOnWhitelist(), enshrined\svgSanitize\Sanitizer\cleanHrefs(), and enshrined\svgSanitize\Sanitizer\cleanXlinkHrefs().

Referenced by enshrined\svgSanitize\Sanitizer\sanitize().

{
    // loop through all elements
    // we do this backwards so we don't skip anything if we delete a node
    // see comments at: http://php.net/manual/en/class.domnamednodemap.php
    for ($i = $elements->length - 1; $i >= 0; $i--) {
        $currentElement = $elements->item($i);

        // If the tag isn't in the whitelist, remove it and continue with next iteration
        if (!in_array(strtolower($currentElement->tagName), $this->allowedTags)) {
            $currentElement->parentNode->removeChild($currentElement);
            continue;
        }

        $this->cleanAttributesOnWhitelist($currentElement);

        $this->cleanXlinkHrefs($currentElement);

        $this->cleanHrefs($currentElement);
    }
}
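
The backwards traversal described in the comments can be demonstrated on a small document. This sketch covers only the tag-removal branch, with a made-up whitelist and markup; it does not reproduce the attribute and href cleaning.

```php
<?php
// Hypothetical whitelist for the demo; the library's defaults come from AllowedTags::getTags().
$allowedTags = ['svg', 'g', 'rect', 'circle'];

$doc = new DOMDocument();
$doc->loadXML('<svg><script>alert(1)</script><g><script>x()</script><rect/></g><circle/></svg>');

// getElementsByTagName() returns a live list: removing a node shrinks it immediately.
$elements = $doc->getElementsByTagName('*');
for ($i = $elements->length - 1; $i >= 0; $i--) {
    $current = $elements->item($i);
    if (!in_array(strtolower($current->tagName), $allowedTags, true)) {
        $current->parentNode->removeChild($current);
    }
}

// List the element names that survived, in document order.
$remaining = [];
foreach ($doc->getElementsByTagName('*') as $element) {
    $remaining[] = $element->tagName;
}
echo implode(',', $remaining), PHP_EOL;
```

Iterating from the end means a removal only affects indexes that have already been visited, which is why no element is skipped.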

Field Documentation

◆ $allowedAttrs

enshrined\svgSanitize\Sanitizer::$allowedAttrs
protected

Definition at line 44 of file Sanitizer.php.

Referenced by enshrined\svgSanitize\Sanitizer\getAllowedAttrs().

◆ $allowedTags

enshrined\svgSanitize\Sanitizer::$allowedTags
protected

Definition at line 39 of file Sanitizer.php.

Referenced by enshrined\svgSanitize\Sanitizer\getAllowedTags().

◆ $minifyXML

enshrined\svgSanitize\Sanitizer::$minifyXML = false
protected

Definition at line 54 of file Sanitizer.php.

◆ $removeRemoteReferences

enshrined\svgSanitize\Sanitizer::$removeRemoteReferences = false
protected

Definition at line 59 of file Sanitizer.php.

◆ $xmlDocument

enshrined\svgSanitize\Sanitizer::$xmlDocument
protected

Definition at line 34 of file Sanitizer.php.

◆ $xmlLoaderValue

enshrined\svgSanitize\Sanitizer::$xmlLoaderValue
protected

Definition at line 49 of file Sanitizer.php.

◆ REMOTE_REFERENCE_REGEX

const enshrined\svgSanitize\Sanitizer::REMOTE_REFERENCE_REGEX = '/url\(([\'"]?(?:http|https):)[\'"]?([^\'"\)]*)[\'"]?\)/xi'

Regex to test for remote URLs in linked assets.

Definition at line 29 of file Sanitizer.php.

◆ SCRIPT_REGEX

const enshrined\svgSanitize\Sanitizer::SCRIPT_REGEX = '/(?:\w+script|data):/xi'

Regex to catch script and data values in attributes.

Definition at line 24 of file Sanitizer.php.


The documentation for this class was generated from the following file: