ILIAS  release_5-1 Revision 5.0.0-5477-g43f3e3fab5f
enshrined\svgSanitize\Sanitizer Class Reference

Public Member Functions

 __construct ()

 getAllowedTags ()
 Get the array of allowed tags.

 setAllowedTags (TagInterface $allowedTags)
 Set custom allowed tags.

 getAllowedAttrs ()
 Get the array of allowed attributes.

 setAllowedAttrs (AttributeInterface $allowedAttrs)
 Set custom allowed attributes.

 removeRemoteReferences ($removeRemoteRefs=false)
 Should we remove references to remote files?

 sanitize ($dirty)
 Sanitize the passed string.

 minify ($shouldMinify=false)
 Should we minify the output?


Data Fields

const SCRIPT_REGEX = '/(?:\w+script|data):/xi'
 Regex to catch script and data values in attributes.

const REMOTE_REFERENCE_REGEX = '/url\(([\'"]?(?:http|https):)[\'"]?([^\'"\)]*)[\'"]?\)/xi'
 Regex to test for remote URLs in linked assets.


Protected Member Functions

 resetInternal ()
 Set up the DOMDocument.

 setUpBefore ()
 Set up libXML before we start.

 resetAfter ()
 Reset the class after use.

 removeDoctype ()
 Remove the XML Doctype. It may be caught later on output, but that seems to be buggy, so we need to make sure it's gone.

 startClean (\DOMNodeList $elements)
 Start the cleaning with tags, then we move onto attributes and hrefs later.

 cleanAttributesOnWhitelist (\DOMElement $element)
 Only allow attributes that are on the whitelist.

 cleanXlinkHrefs (\DOMElement &$element)
 Clean the xlink:hrefs of script and data embeds.

 cleanHrefs (\DOMElement &$element)
 Clean the hrefs of script and data embeds.

 hasRemoteReference ($value)
 Does this attribute value have a remote reference?


Protected Attributes

 $xmlDocument
 
 $allowedTags
 
 $allowedAttrs
 
 $xmlLoaderValue
 
 $minifyXML = false
 
 $removeRemoteReferences = false
 

Detailed Description

Definition at line 18 of file Sanitizer.php.

Constructor & Destructor Documentation

◆ __construct()

enshrined\svgSanitize\Sanitizer::__construct ( )

Definition at line 64 of file Sanitizer.php.

{
    $this->resetInternal();

    // Load default tags/attributes
    $this->allowedAttrs = AllowedAttributes::getAttributes();
    $this->allowedTags = AllowedTags::getTags();
}

References enshrined\svgSanitize\data\AllowedAttributes\getAttributes(), enshrined\svgSanitize\data\AllowedTags\getTags(), and enshrined\svgSanitize\Sanitizer\resetInternal().


Member Function Documentation

◆ cleanAttributesOnWhitelist()

enshrined\svgSanitize\Sanitizer::cleanAttributesOnWhitelist ( \DOMElement  $element)
protected

Only allow attributes that are on the whitelist.

Parameters
\DOMElement $element

Definition at line 256 of file Sanitizer.php.

{
    for ($x = $element->attributes->length - 1; $x >= 0; $x--) {
        // get attribute name
        $attrName = $element->attributes->item($x)->name;

        // Remove attribute if not in whitelist
        if (!in_array(strtolower($attrName), $this->allowedAttrs)) {
            $element->removeAttribute($attrName);
        }

        // Do we want to strip remote references?
        if ($this->removeRemoteReferences) {
            // Remove attribute if it has a remote reference
            if (isset($element->attributes->item($x)->value) && $this->hasRemoteReference($element->attributes->item($x)->value)) {
                $element->removeAttribute($attrName);
            }
        }
    }
}

References enshrined\svgSanitize\Sanitizer\removeRemoteReferences().

Referenced by enshrined\svgSanitize\Sanitizer\startClean().
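
The attribute loop above runs from the last index down to zero because the attribute map is live: removing an entry shifts the ones after it. The following is a minimal, self-contained sketch of that pattern using only PHP's DOM extension. The `$allowed` list and the sample markup are invented for illustration and are not the library's defaults.

```php
<?php
// Hypothetical whitelist and input, for illustration only.
$allowed = ['width', 'fill'];

$doc = new DOMDocument();
$doc->loadXML('<rect width="10" onclick="alert(1)" fill="red"/>');
$el = $doc->documentElement;

// Walk the live attribute map backwards so that a removal never
// changes the index of an attribute we have not visited yet.
for ($x = $el->attributes->length - 1; $x >= 0; $x--) {
    $name = $el->attributes->item($x)->name;

    if (!in_array(strtolower($name), $allowed, true)) {
        $el->removeAttribute($name);
    }
}

echo $doc->saveXML($el), "\n";
```

Because the comparison lowercases the attribute name, a custom whitelist would presumably need to hold lowercase names to match.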


◆ cleanHrefs()

enshrined\svgSanitize\Sanitizer::cleanHrefs ( \DOMElement &  $element)
protected

Clean the hrefs of script and data embeds.

Parameters
\DOMElement $element

Definition at line 295 of file Sanitizer.php.

{
    $href = $element->getAttribute('href');
    if (preg_match(self::SCRIPT_REGEX, $href) === 1) {
        $element->removeAttribute('href');
    }
}

Referenced by enshrined\svgSanitize\Sanitizer\startClean().


◆ cleanXlinkHrefs()

enshrined\svgSanitize\Sanitizer::cleanXlinkHrefs ( \DOMElement &  $element)
protected

Clean the xlink:hrefs of script and data embeds.

Parameters
\DOMElement $element

Definition at line 282 of file Sanitizer.php.

{
    $xlinks = $element->getAttributeNS('http://www.w3.org/1999/xlink', 'href');
    if (preg_match(self::SCRIPT_REGEX, $xlinks) === 1) {
        $element->removeAttributeNS('http://www.w3.org/1999/xlink', 'href');
    }
}

Referenced by enshrined\svgSanitize\Sanitizer\startClean().
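
As a rough illustration of the namespaced lookup used by cleanXlinkHrefs(), the sketch below applies the same check outside the class. It relies only on PHP's DOM extension; the constant names `XLINK_NS` and `SCRIPT_PATTERN` and the sample SVG are assumptions made for this example.

```php
<?php
// Names chosen for this sketch; the pattern is copied from SCRIPT_REGEX.
const XLINK_NS = 'http://www.w3.org/1999/xlink';
const SCRIPT_PATTERN = '/(?:\w+script|data):/xi';

$svg = '<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="' . XLINK_NS . '">'
     . '<a xlink:href="javascript:alert(1)"/>'
     . '<use xlink:href="#shape"/>'
     . '</svg>';

$doc = new DOMDocument();
$doc->loadXML($svg);

// Only attributes are removed here, never elements, so a forward
// foreach over the node list is safe.
foreach ($doc->getElementsByTagName('*') as $el) {
    // getAttributeNS() returns an empty string when the attribute is absent.
    $value = $el->getAttributeNS(XLINK_NS, 'href');

    if (preg_match(SCRIPT_PATTERN, $value) === 1) {
        $el->removeAttributeNS(XLINK_NS, 'href');
    }
}

$a = $doc->getElementsByTagName('a')->item(0);
$use = $doc->getElementsByTagName('use')->item(0);
```

The local fragment reference on the `use` element survives, while the `javascript:` link is dropped.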


◆ getAllowedAttrs()

enshrined\svgSanitize\Sanitizer::getAllowedAttrs ( )

Get the array of allowed attributes.

Returns
array

Definition at line 114 of file Sanitizer.php.

References enshrined\svgSanitize\Sanitizer\$allowedAttrs.

◆ getAllowedTags()

enshrined\svgSanitize\Sanitizer::getAllowedTags ( )

Get the array of allowed tags.

Returns
array

Definition at line 94 of file Sanitizer.php.

References enshrined\svgSanitize\Sanitizer\$allowedTags.

◆ hasRemoteReference()

enshrined\svgSanitize\Sanitizer::hasRemoteReference (   $value)
protected

Does this attribute value have a remote reference?

Parameters
$value
Returns
bool

Definition at line 309 of file Sanitizer.php.

{
    if (preg_match(self::REMOTE_REFERENCE_REGEX, $value) === 1) {
        return true;
    }

    return false;
}

◆ minify()

enshrined\svgSanitize\Sanitizer::minify (   $shouldMinify = false)

Should we minify the output?

Parameters
bool $shouldMinify

Definition at line 323 of file Sanitizer.php.

{
    $this->minifyXML = (bool) $shouldMinify;
}
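
When the flag is set, sanitize() collapses every run of whitespace in the output into a single space. A small sketch of that replacement on its own, with a made-up input string:

```php
<?php
// Made-up, pretty-printed input for illustration.
$pretty = "<svg>\n  <text>a   b</text>\n</svg>";

// Same replacement that sanitize() applies when minifying.
$minified = preg_replace('/\s+/', ' ', $pretty);

echo $minified, "\n";
```

Note that the pattern does not distinguish between indentation and whitespace inside text content, so runs of spaces within a text node are collapsed as well.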

◆ removeDoctype()

enshrined\svgSanitize\Sanitizer::removeDoctype ( )
protected

Remove the XML Doctype. It may be caught later on output, but that seems to be buggy, so we need to make sure it's gone.

Definition at line 215 of file Sanitizer.php.

{
    foreach ($this->xmlDocument->childNodes as $child) {
        if ($child->nodeType === XML_DOCUMENT_TYPE_NODE) {
            $child->parentNode->removeChild($child);
        }
    }
}

Referenced by enshrined\svgSanitize\Sanitizer\sanitize().
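
The effect of removing the doctype node can be tried in isolation. The sketch below uses only PHP's DOM extension and a made-up input; unlike the method above, it iterates over a snapshot of the child list, which is a choice made for this example rather than something the library does.

```php
<?php
$doc = new DOMDocument();
$doc->loadXML('<!DOCTYPE svg><svg><rect/></svg>');

// Snapshot the child list first (an assumption of this sketch) so that
// removing a node cannot disturb iteration over the live DOMNodeList.
foreach (iterator_to_array($doc->childNodes) as $child) {
    if ($child->nodeType === XML_DOCUMENT_TYPE_NODE) {
        $child->parentNode->removeChild($child);
    }
}

$output = $doc->saveXML();
echo $output;
```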


◆ removeRemoteReferences()

enshrined\svgSanitize\Sanitizer::removeRemoteReferences (   $removeRemoteRefs = false)

Should we remove references to remote files?

Parameters
bool $removeRemoteRefs

Definition at line 134 of file Sanitizer.php.

{
    $this->removeRemoteReferences = $removeRemoteRefs;
}

References enshrined\svgSanitize\Sanitizer\removeRemoteReferences().

Referenced by enshrined\svgSanitize\Sanitizer\cleanAttributesOnWhitelist(), and enshrined\svgSanitize\Sanitizer\removeRemoteReferences().


◆ resetAfter()

enshrined\svgSanitize\Sanitizer::resetAfter ( )
protected

Reset the class after use.

Definition at line 202 of file Sanitizer.php.

{
    // Reset DOMDocument to a clean state in case we use it again
    $this->resetInternal();

    // Reset the entity loader
    libxml_disable_entity_loader($this->xmlLoaderValue);
}

References enshrined\svgSanitize\Sanitizer\resetInternal().

Referenced by enshrined\svgSanitize\Sanitizer\sanitize().


◆ resetInternal()

enshrined\svgSanitize\Sanitizer::resetInternal ( )
protected

Set up the DOMDocument.

Definition at line 76 of file Sanitizer.php.

{
    $this->xmlDocument = new DOMDocument();
    $this->xmlDocument->preserveWhiteSpace = false;
    $this->xmlDocument->strictErrorChecking = false;
    $this->xmlDocument->formatOutput = true;

    // Maybe don't format the output
    if ($this->minifyXML) {
        $this->xmlDocument->formatOutput = false;
    }
}

Referenced by enshrined\svgSanitize\Sanitizer\__construct(), and enshrined\svgSanitize\Sanitizer\resetAfter().


◆ sanitize()

enshrined\svgSanitize\Sanitizer::sanitize (   $dirty)

Sanitize the passed string.

Parameters
string $dirty
Returns
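string|false The sanitized markup; an empty string for empty input, or false if the input could not be parsed as XML.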

Definition at line 145 of file Sanitizer.php.

{
    // Don't run on an empty string
    if (empty($dirty)) {
        return '';
    }

    // Strip php tags
    $dirty = preg_replace('/<\?(=|php)(.+?)\?>/i', '', $dirty);

    $this->setUpBefore();

    $loaded = $this->xmlDocument->loadXML($dirty);

    // If we couldn't parse the XML then we go no further. Reset and return false
    if (!$loaded) {
        $this->resetAfter();
        return false;
    }

    $this->removeDoctype();

    // Grab all the elements
    $allElements = $this->xmlDocument->getElementsByTagName("*");

    // Start the cleaning process
    $this->startClean($allElements);

    // Save cleaned XML to a variable
    $clean = $this->xmlDocument->saveXML($this->xmlDocument->documentElement, LIBXML_NOEMPTYTAG);

    $this->resetAfter();

    // Remove any extra whitespaces when minifying
    if ($this->minifyXML) {
        $clean = preg_replace('/\s+/', ' ', $clean);
    }

    // Return result
    return $clean;
}

References enshrined\svgSanitize\Sanitizer\removeDoctype(), enshrined\svgSanitize\Sanitizer\resetAfter(), enshrined\svgSanitize\Sanitizer\setUpBefore(), and enshrined\svgSanitize\Sanitizer\startClean().
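
The first transformation in sanitize() is the regex that strips embedded PHP tags. The sketch below runs that pattern by itself on a few made-up inputs to show what it does and does not catch; the constant name `PHP_TAG_PATTERN` is an assumption of this example.

```php
<?php
// Pattern copied from sanitize(); the constant name is made up.
const PHP_TAG_PATTERN = '/<\?(=|php)(.+?)\?>/i';

// A single-line PHP block is removed.
$single = preg_replace(PHP_TAG_PATTERN, '', '<svg><?php echo 1; ?><rect/></svg>');

// The short echo form is removed too.
$echo = preg_replace(PHP_TAG_PATTERN, '', '<svg><?= $x ?><rect/></svg>');

// Without the "s" modifier the dot does not match a newline, so a block
// that starts with a line break right after the opening tag is left alone.
$multiInput = "<svg><?php\necho 1;\n?></svg>";
$multi = preg_replace(PHP_TAG_PATTERN, '', $multiInput);
```

As printed here, the pattern appears to target single-line blocks only; whether anything left over survives the later XML parsing step is a separate question that this sketch does not answer.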


◆ setAllowedAttrs()

enshrined\svgSanitize\Sanitizer::setAllowedAttrs ( AttributeInterface  $allowedAttrs)

Set custom allowed attributes.

Parameters
AttributeInterface $allowedAttrs

Definition at line 124 of file Sanitizer.php.

{
    $this->allowedAttrs = $allowedAttrs::getAttributes();
}

◆ setAllowedTags()

enshrined\svgSanitize\Sanitizer::setAllowedTags ( TagInterface  $allowedTags)

Set custom allowed tags.

Parameters
TagInterface $allowedTags

Definition at line 104 of file Sanitizer.php.

{
    $this->allowedTags = $allowedTags::getTags();
}
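
The setter takes an object and calls the static getTags() method on it. A custom tag list might therefore look roughly like the sketch below. To keep the example self-contained it declares its own `TagInterface` as a stand-in for the library's interface, whose exact definition is not shown on this page; `MinimalTags` and its contents are hypothetical.

```php
<?php
// Stand-in for the library's TagInterface (assumed shape, not copied).
interface TagInterface
{
    public static function getTags();
}

// Hypothetical custom whitelist. Names are lowercase because the
// sanitizer lowercases each tag name before comparing.
class MinimalTags implements TagInterface
{
    public static function getTags()
    {
        return ['svg', 'rect', 'circle'];
    }
}

// Same call shape as in setAllowedTags(): a static call through an instance.
$allowedTags = new MinimalTags();
$list = $allowedTags::getTags();

print_r($list);
```

With the real library, an instance of such a class would presumably be passed to setAllowedTags() before calling sanitize().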

◆ setUpBefore()

enshrined\svgSanitize\Sanitizer::setUpBefore ( )
protected

Set up libXML before we start.

Definition at line 190 of file Sanitizer.php.

{
    // Turn off the entity loader
    $this->xmlLoaderValue = libxml_disable_entity_loader(true);

    // Suppress the errors because we don't really have to worry about formation before cleansing
    libxml_use_internal_errors(true);
}

Referenced by enshrined\svgSanitize\Sanitizer\sanitize().
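
The save-and-restore pattern around libxml can be sketched on its own, as below. This is not the library's code: the version guard is an addition of this example, included because libxml_disable_entity_loader() is deprecated as of PHP 8.0, and the malformed input is made up.

```php
<?php
// Remember the previous settings so they can be restored afterwards.
$previousErrors = libxml_use_internal_errors(true);

$previousLoader = null;
if (PHP_VERSION_ID < 80000) {
    // Deprecated from PHP 8.0 on, hence the guard (an assumption of this sketch).
    $previousLoader = libxml_disable_entity_loader(true);
}

$doc = new DOMDocument();
// Mismatched tags: loadXML() reports failure by returning false.
$loaded = $doc->loadXML('<svg><unclosed></svg>');

// With internal errors enabled, parse problems are collected, not emitted.
$errors = libxml_get_errors();
libxml_clear_errors();

// Restore the previous state.
if ($previousLoader !== null) {
    libxml_disable_entity_loader($previousLoader);
}
libxml_use_internal_errors($previousErrors);
```

A failed load like this one is the case in which sanitize() returns false.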


◆ startClean()

enshrined\svgSanitize\Sanitizer::startClean ( \DOMNodeList  $elements)
protected

Start the cleaning with tags, then we move onto attributes and hrefs later.

Parameters
\DOMNodeList $elements

Definition at line 229 of file Sanitizer.php.

{
    // loop through all elements
    // we do this backwards so we don't skip anything if we delete a node
    // see comments at: http://php.net/manual/en/class.domnamednodemap.php
    for ($i = $elements->length - 1; $i >= 0; $i--) {
        $currentElement = $elements->item($i);

        // If the tag isn't in the whitelist, remove it and continue with next iteration
        if (!in_array(strtolower($currentElement->tagName), $this->allowedTags)) {
            $currentElement->parentNode->removeChild($currentElement);
            continue;
        }

        $this->cleanAttributesOnWhitelist($currentElement);

        $this->cleanXlinkHrefs($currentElement);

        $this->cleanHrefs($currentElement);
    }
}

References enshrined\svgSanitize\Sanitizer\cleanAttributesOnWhitelist(), enshrined\svgSanitize\Sanitizer\cleanHrefs(), and enshrined\svgSanitize\Sanitizer\cleanXlinkHrefs().

Referenced by enshrined\svgSanitize\Sanitizer\sanitize().
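
The backwards traversal described in the code comments can be reproduced with a few lines of plain DOM code. The sketch below is a simplified stand-in for the tag-filtering step only; the `$allowedTags` list and the input are invented for the example.

```php
<?php
// Hypothetical whitelist and input, for illustration only.
$allowedTags = ['svg', 'rect'];

$doc = new DOMDocument();
$doc->loadXML('<svg><script>alert(1)</script><rect width="1"/><foo/></svg>');

// getElementsByTagName() returns a live list: removing a node shrinks it.
$elements = $doc->getElementsByTagName('*');

// Iterating from the end means a removal only affects indexes that
// have already been visited.
for ($i = $elements->length - 1; $i >= 0; $i--) {
    $current = $elements->item($i);

    if (!in_array(strtolower($current->tagName), $allowedTags, true)) {
        $current->parentNode->removeChild($current);
    }
}

$result = $doc->saveXML($doc->documentElement);
echo $result, "\n";
```

Removing an element also removes its descendants; since descendants sit at higher indexes in document order, they have already been visited by the time their parent is dropped.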


Field Documentation

◆ $allowedAttrs

enshrined\svgSanitize\Sanitizer::$allowedAttrs
protected

Definition at line 44 of file Sanitizer.php.

Referenced by enshrined\svgSanitize\Sanitizer\getAllowedAttrs().

◆ $allowedTags

enshrined\svgSanitize\Sanitizer::$allowedTags
protected

Definition at line 39 of file Sanitizer.php.

Referenced by enshrined\svgSanitize\Sanitizer\getAllowedTags().

◆ $minifyXML

enshrined\svgSanitize\Sanitizer::$minifyXML = false
protected

Definition at line 54 of file Sanitizer.php.

◆ $removeRemoteReferences

enshrined\svgSanitize\Sanitizer::$removeRemoteReferences = false
protected

Definition at line 59 of file Sanitizer.php.

◆ $xmlDocument

enshrined\svgSanitize\Sanitizer::$xmlDocument
protected

Definition at line 34 of file Sanitizer.php.

◆ $xmlLoaderValue

enshrined\svgSanitize\Sanitizer::$xmlLoaderValue
protected

Definition at line 49 of file Sanitizer.php.

◆ REMOTE_REFERENCE_REGEX

const enshrined\svgSanitize\Sanitizer::REMOTE_REFERENCE_REGEX = '/url\(([\'"]?(?:http|https):)[\'"]?([^\'"\)]*)[\'"]?\)/xi'

Regex to test for remote URLs in linked assets.

Definition at line 29 of file Sanitizer.php.
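
To get a feel for what this pattern treats as a remote reference, the sketch below runs it against a handful of made-up attribute values. The constant name `REMOTE_PATTERN` is an assumption of this example; the pattern itself is copied from the definition above.

```php
<?php
// Pattern copied from REMOTE_REFERENCE_REGEX; the constant name is made up.
const REMOTE_PATTERN = '/url\(([\'"]?(?:http|https):)[\'"]?([^\'"\)]*)[\'"]?\)/xi';

// Unquoted and quoted absolute URLs are both detected.
$plain  = preg_match(REMOTE_PATTERN, 'url(http://example.com/a.png)');
$quoted = preg_match(REMOTE_PATTERN, "url('https://example.com/x.svg#f')");

// A local fragment reference is not treated as remote.
$local = preg_match(REMOTE_PATTERN, 'url(#localGradient)');

// A protocol-relative URL has no http: or https: prefix to match on.
$relative = preg_match(REMOTE_PATTERN, 'url(//example.com/a.png)');
```

As written, the pattern keys on an explicit `http:` or `https:` scheme directly after `url(`, so other spellings of a remote reference may pass through unnoticed.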

◆ SCRIPT_REGEX

const enshrined\svgSanitize\Sanitizer::SCRIPT_REGEX = '/(?:\w+script|data):/xi'

Regex to catch script and data values in attributes.

Definition at line 24 of file Sanitizer.php.
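
The following sketch applies the pattern to a few made-up attribute values to show what it flags. The constant name `SCRIPT_PATTERN` is an assumption of this example; the pattern is copied from the definition above.

```php
<?php
// Pattern copied from SCRIPT_REGEX; the constant name is made up.
const SCRIPT_PATTERN = '/(?:\w+script|data):/xi';

// Script and data URIs are flagged, regardless of letter case.
$js   = preg_match(SCRIPT_PATTERN, 'JavaScript:alert(1)');
$data = preg_match(SCRIPT_PATTERN, 'data:text/html;base64,AAAA');

// A plain fragment reference is not flagged.
$fragment = preg_match(SCRIPT_PATTERN, '#clipPath');

// The pattern is unanchored, so "data:" anywhere in the value matches.
$embedded = preg_match(SCRIPT_PATTERN, 'https://example.com/metadata:1');
```

Because there is no start anchor, an otherwise harmless URL that happens to contain `data:` is also matched, which errs on the side of removing the attribute.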


The documentation for this class was generated from the following file: