ILIAS  release_5-0 Revision 5.0.0-1144-gc4397b1f87
HTMLPurifier_Generator Class Reference

Generates HTML from tokens.

Public Member Functions

__construct ($config, $context)

generateFromTokens ($tokens)
Generates HTML from an array of tokens.

generateFromToken ($token)
Generates HTML from a single token.

generateScriptFromToken ($token)
Special case processor for the contents of script tags.

generateAttributes ($assoc_array_of_attributes, $element='')
Generates attribute declarations from attribute array.

escape ($string, $quote=null)
Escapes raw text data.
 

Protected Attributes

$config
Configuration for the generator (HTMLPurifier_Config).
 

Private Attributes

$_xhtml = true
Whether or not generator should produce XML output.

$_scriptFix = false
:HACK: Whether or not generator should comment the insides of <script> tags.

$_def
Cache of HTMLDefinition during HTML output to determine whether or not attributes should be minimized.

$_sortAttr
Cache of Output.SortAttr.

$_flashCompat
Cache of Output.FlashCompat.

$_innerHTMLFix
Cache of Output.FixInnerHTML.

$_flashStack = array()
Stack for keeping track of object information when outputting IE compatibility code.
 

Detailed Description

Generates HTML from tokens.

Todo:

Refactor interface so that configuration/context is determined upon instantiation, no need for messy generateFromTokens() calls

Make some of the more internal functions protected, and have unit tests work around that

Definition at line 10 of file Generator.php.

Constructor & Destructor Documentation

◆ __construct()

HTMLPurifier_Generator::__construct($config, $context)

Parameters
HTMLPurifier_Config $config
HTMLPurifier_Context $context

Definition at line 67 of file Generator.php.

References $config.

{
    $this->config = $config;
    $this->_scriptFix = $config->get('Output.CommentScriptContents');
    $this->_innerHTMLFix = $config->get('Output.FixInnerHTML');
    $this->_sortAttr = $config->get('Output.SortAttr');
    $this->_flashCompat = $config->get('Output.FlashCompat');
    $this->_def = $config->getHTMLDefinition();
    $this->_xhtml = $this->_def->doctype->xml;
}

Member Function Documentation

◆ escape()

HTMLPurifier_Generator::escape($string, $quote = null)

Escapes raw text data.

Todo:
This really ought to be protected, but until we have a facility for properly generating HTML here w/o using tokens, it stays public.
Parameters
string $string: String data to escape for HTML.
int $quote: Quoting style, like htmlspecialchars(); null (the default) is treated as ENT_COMPAT. ENT_NOQUOTES is permissible for non-attribute output.
Returns
string Escaped data.

Definition at line 275 of file Generator.php.

Referenced by generateAttributes(), and generateFromToken().

{
    // Workaround for APC bug on Mac Leopard reported by sidepodcast
    // http://htmlpurifier.org/phorum/read.php?3,4823,4846
    if ($quote === null) {
        $quote = ENT_COMPAT;
    }
    return htmlspecialchars($string, $quote, 'UTF-8');
}
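
The effect of the two quoting styles can be seen with plain htmlspecialchars(). The following is a minimal sketch that does not load HTML Purifier; the function name escape_sketch is hypothetical and only mirrors the method body shown above.

```php
<?php
// Hypothetical stand-alone mirror of HTMLPurifier_Generator::escape().
function escape_sketch($string, $quote = null)
{
    if ($quote === null) {
        // Default: escape double quotes, leave single quotes untouched.
        $quote = ENT_COMPAT;
    }
    return htmlspecialchars($string, $quote, 'UTF-8');
}

$input = 'say "hi" & <wave>';

// Attribute context: double quotes are escaped so the value can sit inside "...".
$attr = escape_sketch($input);

// Text context: quotes may stay literal, only &, < and > are escaped.
$text = escape_sketch($input, ENT_NOQUOTES);

echo $attr, "\n", $text, "\n";
```

This matches how the class uses the method: generateAttributes() relies on the default, while the text-token branch of generateFromToken() passes ENT_NOQUOTES.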

◆ generateAttributes()

HTMLPurifier_Generator::generateAttributes($assoc_array_of_attributes, $element = '')

Generates attribute declarations from attribute array.

Note
This does not include the leading or trailing space.
Parameters
array $assoc_array_of_attributes: Attribute array.
string $element: Name of the element the attributes are for, used to check attribute minimization.
Returns
string Generated HTML fragment for insertion.

Definition at line 211 of file Generator.php.

References escape().

Referenced by generateFromToken().

{
    $html = '';
    if ($this->_sortAttr) {
        ksort($assoc_array_of_attributes);
    }
    foreach ($assoc_array_of_attributes as $key => $value) {
        if (!$this->_xhtml) {
            // Remove namespaced attributes
            if (strpos($key, ':') !== false) {
                continue;
            }
            // Check if we should minimize the attribute: val="val" -> val
            if ($element && !empty($this->_def->info[$element]->attr[$key]->minimized)) {
                $html .= $key . ' ';
                continue;
            }
        }
        // Workaround for Internet Explorer innerHTML bug.
        // Essentially, Internet Explorer, when calculating
        // innerHTML, omits quotes if there are no instances of
        // angled brackets, quotes or spaces. However, when parsing
        // HTML (for example, when you assign to innerHTML), it
        // treats backticks as quotes. Thus,
        //      <img alt="``" />
        // becomes
        //      <img alt=`` />
        // becomes
        //      <img alt='' />
        // Fortunately, all we need to do is trigger an appropriate
        // quoting style, which we do by adding an extra space.
        // This also is consistent with the W3C spec, which states
        // that user agents may ignore leading or trailing
        // whitespace (in fact, most don't, at least for attributes
        // like alt, but an extra space at the end is barely
        // noticeable). Still, we have a configuration knob for
        // this, since this transformation is not necessary if you
        // don't process user input with innerHTML or you don't plan
        // on supporting Internet Explorer.
        if ($this->_innerHTMLFix) {
            if (strpos($value, '`') !== false) {
                // check if correct quoting style would not already be
                // triggered
                if (strcspn($value, '"\' <>') === strlen($value)) {
                    // protect!
                    $value .= ' ';
                }
            }
        }
        $html .= $key.'="'.$this->escape($value).'" ';
    }
    return rtrim($html);
}
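
The backtick workaround and the sorting step can be tried in isolation. The sketch below is an assumption-laden simplification, not the class itself: pad_backtick_value and attributes_sketch are hypothetical names, the two boolean parameters stand in for Output.SortAttr and Output.FixInnerHTML, and the non-XHTML branch (namespaced attributes, minimization) is left out.

```php
<?php
// Hypothetical helper isolating the Output.FixInnerHTML check.
function pad_backtick_value($value)
{
    // Only values containing a backtick are affected.
    if (strpos($value, '`') === false) {
        return $value;
    }
    // No quote, space or angle bracket present: Internet Explorer would emit
    // the value unquoted, so append a space to force quoting.
    if (strcspn($value, '"\' <>') === strlen($value)) {
        return $value . ' ';
    }
    return $value;
}

// Hypothetical simplification of generateAttributes() for XHTML output only.
function attributes_sketch(array $attr, $sort = false, $fixInnerHtml = false)
{
    if ($sort) {
        ksort($attr);
    }
    $html = '';
    foreach ($attr as $key => $value) {
        if ($fixInnerHtml) {
            $value = pad_backtick_value($value);
        }
        $html .= $key . '="' . htmlspecialchars($value, ENT_COMPAT, 'UTF-8') . '" ';
    }
    // As in the original, no leading or trailing space is returned.
    return rtrim($html);
}

$padded    = pad_backtick_value('``');      // gets a trailing space
$unchanged = pad_backtick_value('a `b` c'); // already contains spaces
$plain     = pad_backtick_value('logo');    // no backtick at all
$html      = attributes_sketch(array('src' => 'a.png', 'alt' => '``'), true, true);

echo $html, "\n";
```

Note that rtrim() only removes the separator after the last attribute; the space added inside the quoted alt value is preserved.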

◆ generateFromToken()

HTMLPurifier_Generator::generateFromToken($token)

Generates HTML from a single token.

Parameters
HTMLPurifier_Token $token: HTMLPurifier_Token object.
Returns
string Generated HTML

Definition at line 139 of file Generator.php.

References escape(), and generateAttributes().

Referenced by generateFromTokens(), and generateScriptFromToken().

{
    if (!$token instanceof HTMLPurifier_Token) {
        trigger_error('Cannot generate HTML from non-HTMLPurifier_Token object', E_USER_WARNING);
        return '';

    } elseif ($token instanceof HTMLPurifier_Token_Start) {
        $attr = $this->generateAttributes($token->attr, $token->name);
        if ($this->_flashCompat) {
            if ($token->name == "object") {
                $flash = new stdclass();
                $flash->attr = $token->attr;
                $flash->param = array();
                $this->_flashStack[] = $flash;
            }
        }
        return '<' . $token->name . ($attr ? ' ' : '') . $attr . '>';

    } elseif ($token instanceof HTMLPurifier_Token_End) {
        $_extra = '';
        if ($this->_flashCompat) {
            if ($token->name == "object" && !empty($this->_flashStack)) {
                // doesn't do anything for now
            }
        }
        return $_extra . '</' . $token->name . '>';

    } elseif ($token instanceof HTMLPurifier_Token_Empty) {
        if ($this->_flashCompat && $token->name == "param" && !empty($this->_flashStack)) {
            $this->_flashStack[count($this->_flashStack)-1]->param[$token->attr['name']] = $token->attr['value'];
        }
        $attr = $this->generateAttributes($token->attr, $token->name);
        return '<' . $token->name . ($attr ? ' ' : '') . $attr .
            ( $this->_xhtml ? ' /': '' ) // <br /> v. <br>
            . '>';

    } elseif ($token instanceof HTMLPurifier_Token_Text) {
        return $this->escape($token->data, ENT_NOQUOTES);

    } elseif ($token instanceof HTMLPurifier_Token_Comment) {
        return '<!--' . $token->data . '-->';
    } else {
        return '';

    }
}
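
The only branch whose output depends on the doctype is the one for empty tokens. As a rough illustration, the sketch below reproduces that string assembly on its own; empty_tag_sketch is a hypothetical helper, and its $xhtml parameter stands in for the private $_xhtml flag.

```php
<?php
// Hypothetical helper: serialize an empty element the way the
// HTMLPurifier_Token_Empty branch does.
function empty_tag_sketch($name, $attrHtml, $xhtml)
{
    return '<' . $name . ($attrHtml ? ' ' : '') . $attrHtml
        . ($xhtml ? ' /' : '') // <br /> v. <br>
        . '>';
}

$xhtmlBr = empty_tag_sketch('br', '', true);
$htmlImg = empty_tag_sketch('img', 'src="a.png"', false);

echo $xhtmlBr, "\n", $htmlImg, "\n";
```

The separating space before the attribute string is emitted only when there are attributes, which is why generateAttributes() returns its result without leading or trailing whitespace.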

◆ generateFromTokens()

HTMLPurifier_Generator::generateFromTokens($tokens)

Generates HTML from an array of tokens.

Parameters
HTMLPurifier_Token[] $tokens: Array of HTMLPurifier_Token
Returns
string Generated HTML

Definition at line 83 of file Generator.php.

References generateFromToken(), and generateScriptFromToken().

{
    if (!$tokens) {
        return '';
    }

    // Basic algorithm
    $html = '';
    for ($i = 0, $size = count($tokens); $i < $size; $i++) {
        if ($this->_scriptFix && $tokens[$i]->name === 'script'
            && $i + 2 < $size && $tokens[$i+2] instanceof HTMLPurifier_Token_End) {
            // script special case
            // the contents of the script block must be ONE token
            // for this to work.
            $html .= $this->generateFromToken($tokens[$i++]);
            $html .= $this->generateScriptFromToken($tokens[$i++]);
        }
        $html .= $this->generateFromToken($tokens[$i]);
    }

    // Tidy cleanup
    if (extension_loaded('tidy') && $this->config->get('Output.TidyFormat')) {
        $tidy = new Tidy;
        $tidy->parseString(
            $html,
            array(
                'indent'=> true,
                'output-xhtml' => $this->_xhtml,
                'show-body-only' => true,
                'indent-spaces' => 2,
                'wrap' => 68,
            ),
            'utf8'
        );
        $tidy->cleanRepair();
        $html = (string) $tidy; // explicit cast necessary
    }

    // Normalize newlines to system defined value
    if ($this->config->get('Core.NormalizeNewlines')) {
        $nl = $this->config->get('Output.Newline');
        if ($nl === null) {
            $nl = PHP_EOL;
        }
        if ($nl !== "\n") {
            $html = str_replace("\n", $nl, $html);
        }
    }
    return $html;
}
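
The final newline step is simple enough to reproduce on its own. The sketch below uses a hypothetical function, normalize_newlines_sketch, whose $nl parameter stands in for the Output.Newline setting; it assumes the generated HTML contains only "\n" line breaks at this point, which the listing above does not itself guarantee.

```php
<?php
// Hypothetical stand-alone version of the newline normalization step.
function normalize_newlines_sketch($html, $nl = null)
{
    if ($nl === null) {
        // No explicit setting: fall back to the platform default.
        $nl = PHP_EOL;
    }
    if ($nl !== "\n") {
        // Assumes $html only contains "\n"; an existing "\r\n" would
        // become "\r\r\n" here.
        $html = str_replace("\n", $nl, $html);
    }
    return $html;
}

$source = "<p>a</p>\n<p>b</p>";
$crlf   = normalize_newlines_sketch($source, "\r\n");
$same   = normalize_newlines_sketch($source, "\n");

echo strlen($source), ' ', strlen($crlf), "\n";
```

When the target newline is already "\n", the string is returned untouched and no replacement pass is made.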

◆ generateScriptFromToken()

HTMLPurifier_Generator::generateScriptFromToken($token)

Special case processor for the contents of script tags.

Parameters
HTMLPurifier_Token $token: HTMLPurifier_Token object.
Returns
string
Warning
This runs into problems if there's already a literal --> somewhere inside the script contents.

Definition at line 193 of file Generator.php.

References generateFromToken().

Referenced by generateFromTokens().

{
    if (!$token instanceof HTMLPurifier_Token_Text) {
        return $this->generateFromToken($token);
    }
    // Thanks <http://lachy.id.au/log/2005/05/script-comments>
    $data = preg_replace('#//\s*$#', '', $token->data);
    return '<!--//--><![CDATA[//><!--' . "\n" . trim($data) . "\n" . '//--><!]]>';
}
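
To see what the wrapper produces, the text-token path can be run on a plain string. This is only a sketch: wrap_script_sketch is a hypothetical function that takes the script text directly instead of an HTMLPurifier_Token_Text object.

```php
<?php
// Hypothetical stand-alone version of the text-token path.
function wrap_script_sketch($data)
{
    // Drop a dangling "//" (plus trailing whitespace) at the very end of the script.
    $data = preg_replace('#//\s*$#', '', $data);
    // Wrap the trimmed script in the comment/CDATA guard.
    return '<!--//--><![CDATA[//><!--' . "\n" . trim($data) . "\n" . '//--><!]]>';
}

$wrapped = wrap_script_sketch("  alert(1); //  \n");

// The warning above applies when the script text itself contains "-->".
$risky = strpos('if (a --> b) {}', '-->') !== false;

echo $wrapped, "\n";
```

A check like the one stored in $risky is not part of the class; it is shown only to make the documented limitation concrete.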

Field Documentation

◆ $_def

HTMLPurifier_Generator::$_def
private

Cache of HTMLDefinition during HTML output to determine whether or not attributes should be minimized.

HTMLPurifier_HTMLDefinition

Definition at line 30 of file Generator.php.

◆ $_flashCompat

HTMLPurifier_Generator::$_flashCompat
private

Cache of Output.FlashCompat.

bool

Definition at line 42 of file Generator.php.

◆ $_flashStack

HTMLPurifier_Generator::$_flashStack = array()
private

Stack for keeping track of object information when outputting IE compatibility code.

array

Definition at line 55 of file Generator.php.

◆ $_innerHTMLFix

HTMLPurifier_Generator::$_innerHTMLFix
private

Cache of Output.FixInnerHTML.

bool

Definition at line 48 of file Generator.php.

◆ $_scriptFix

HTMLPurifier_Generator::$_scriptFix = false
private

:HACK: Whether or not generator should comment the insides of <script> tags.

bool

Definition at line 23 of file Generator.php.

◆ $_sortAttr

HTMLPurifier_Generator::$_sortAttr
private

Cache of Output.SortAttr.

bool

Definition at line 36 of file Generator.php.

◆ $_xhtml

HTMLPurifier_Generator::$_xhtml = true
private

Whether or not generator should produce XML output.

bool

Definition at line 17 of file Generator.php.

◆ $config

HTMLPurifier_Generator::$config
protected

Configuration for the generator.

HTMLPurifier_Config

Definition at line 61 of file Generator.php.

Referenced by __construct().


The documentation for this class was generated from the following file: