ILIAS  release_4-3 Revision
HTMLPurifier_Generator Class Reference

Generates HTML from tokens.


Public Member Functions

 __construct ($config, $context)
 generateFromTokens ($tokens)
 Generates HTML from an array of tokens.
 generateFromToken ($token)
 Generates HTML from a single token.
 generateScriptFromToken ($token)
 Special case processor for the contents of script tags.
 generateAttributes ($assoc_array_of_attributes, $element=false)
 Generates attribute declarations from attribute array.
 escape ($string, $quote=null)
 Escapes raw text data.

Protected Attributes

 $config
 Configuration for the generator.

Private Attributes

 $_xhtml = true
 Whether or not generator should produce XML output.
 $_scriptFix = false
 :HACK: Whether or not generator should comment the insides of <script> tags
 $_def
 Cache of HTMLDefinition during HTML output to determine whether or not attributes should be minimized.
 $_sortAttr
 Cache of Output.SortAttr.
 $_flashCompat
 Cache of Output.FlashCompat.
 $_innerHTMLFix
 Cache of Output.FixInnerHTML.
 $_flashStack = array()
 Stack for keeping track of object information when outputting IE compatibility code.

Detailed Description

Generates HTML from tokens.

Todo:

Refactor interface so that configuration/context is determined upon instantiation, no need for messy generateFromTokens() calls

Make some of the more internal functions protected, and have unit tests work around that

Definition at line 10 of file Generator.php.

Constructor & Destructor Documentation

HTMLPurifier_Generator::__construct (   $config,
  $context 
)
Parameters
$config    Instance of HTMLPurifier_Config
$context   Instance of HTMLPurifier_Context

Definition at line 59 of file Generator.php.

References $config.

{
    $this->config = $config;
    $this->_scriptFix = $config->get('Output.CommentScriptContents');
    $this->_innerHTMLFix = $config->get('Output.FixInnerHTML');
    $this->_sortAttr = $config->get('Output.SortAttr');
    $this->_flashCompat = $config->get('Output.FlashCompat');
    $this->_def = $config->getHTMLDefinition();
    $this->_xhtml = $this->_def->doctype->xml;
}

Member Function Documentation

HTMLPurifier_Generator::escape (   $string,
  $quote = null 
)

Escapes raw text data.

Todo:
This really ought to be protected, but until we have a facility for properly generating HTML here w/o using tokens, it stays public.
Parameters
$string   String data to escape for HTML.
$quote    Quoting style, as for htmlspecialchars(). ENT_NOQUOTES is permissible for non-attribute output.
Returns
Escaped string data.

Definition at line 245 of file Generator.php.

Referenced by generateAttributes(), and generateFromToken().

{
    // Workaround for APC bug on Mac Leopard reported by sidepodcast
    // http://htmlpurifier.org/phorum/read.php?3,4823,4846
    if ($quote === null) $quote = ENT_COMPAT;
    return htmlspecialchars($string, $quote, 'UTF-8');
}
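
The effect of the $quote argument can be seen in isolation. The following is a minimal standalone sketch: sketch_escape() is a hypothetical stand-in that copies the body shown above, not a function provided by HTML Purifier.

```php
<?php
// Hypothetical stand-in for HTMLPurifier_Generator::escape(), so the
// example runs without the library.
function sketch_escape($string, $quote = null)
{
    if ($quote === null) {
        $quote = ENT_COMPAT; // same default as the method above
    }
    return htmlspecialchars($string, $quote, 'UTF-8');
}

$raw = 'a "b" <c> & \'d\'';

// Attribute context (ENT_COMPAT): double quotes are escaped,
// single quotes are left alone.
$attr = sketch_escape($raw);

// Text context (ENT_NOQUOTES): neither quote style is escaped.
$text = sketch_escape($raw, ENT_NOQUOTES);
```

In both cases <, > and & are converted to entities; only the handling of quotes differs, which is why generateFromToken() can pass ENT_NOQUOTES for text nodes.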


HTMLPurifier_Generator::generateAttributes (   $assoc_array_of_attributes,
  $element = false 
)

Generates attribute declarations from attribute array.

Note
This does not include the leading or trailing space.
Parameters
$assoc_array_of_attributes   Attribute array
$element                     Name of the element the attributes are for, used to check attribute minimization.
Returns
Generated HTML fragment for insertion.

Definition at line 186 of file Generator.php.

References escape().

Referenced by generateFromToken().

{
    $html = '';
    if ($this->_sortAttr) ksort($assoc_array_of_attributes);
    foreach ($assoc_array_of_attributes as $key => $value) {
        if (!$this->_xhtml) {
            // Remove namespaced attributes
            if (strpos($key, ':') !== false) continue;
            // Check if we should minimize the attribute: val="val" -> val
            if ($element && !empty($this->_def->info[$element]->attr[$key]->minimized)) {
                $html .= $key . ' ';
                continue;
            }
        }
        // Workaround for Internet Explorer innerHTML bug.
        // Essentially, Internet Explorer, when calculating
        // innerHTML, omits quotes if there are no instances of
        // angled brackets, quotes or spaces. However, when parsing
        // HTML (for example, when you assign to innerHTML), it
        // treats backticks as quotes. Thus,
        //      <img alt="``" />
        // becomes
        //      <img alt=`` />
        // becomes
        //      <img alt='' />
        // Fortunately, all we need to do is trigger an appropriate
        // quoting style, which we do by adding an extra space.
        // This also is consistent with the W3C spec, which states
        // that user agents may ignore leading or trailing
        // whitespace (in fact, most don't, at least for attributes
        // like alt, but an extra space at the end is barely
        // noticeable). Still, we have a configuration knob for
        // this, since this transformation is not necessary if you
        // don't process user input with innerHTML or you don't plan
        // on supporting Internet Explorer.
        if ($this->_innerHTMLFix) {
            if (strpos($value, '`') !== false) {
                // check if correct quoting style would not already be
                // triggered
                if (strcspn($value, '"\' <>') === strlen($value)) {
                    // protect!
                    $value .= ' ';
                }
            }
        }
        $html .= $key.'="'.$this->escape($value).'" ';
    }
    return rtrim($html);
}
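
The sorting and backtick handling can be tried outside the class. This is a rough sketch under stated assumptions: sketch_attributes() and its $sort and $innerHTMLFix parameters are hypothetical and stand in for the cached Output.SortAttr and Output.FixInnerHTML settings. The attribute minimization branch is left out because it needs an HTMLDefinition.

```php
<?php
// Hypothetical standalone re-creation of the attribute loop above.
// It omits the non-XHTML branch (namespaced attributes, minimization).
function sketch_attributes(array $attr, $sort = false, $innerHTMLFix = false)
{
    $html = '';
    if ($sort) {
        ksort($attr); // alphabetical attribute order
    }
    foreach ($attr as $key => $value) {
        if ($innerHTMLFix && strpos($value, '`') !== false) {
            // Pad with a space only when the value contains no quote,
            // space or angle bracket that would already force quoting.
            if (strcspn($value, '"\' <>') === strlen($value)) {
                $value .= ' ';
            }
        }
        $html .= $key . '="' . htmlspecialchars($value, ENT_COMPAT, 'UTF-8') . '" ';
    }
    return rtrim($html); // no leading or trailing space, as noted above
}

// Sorted, no innerHTML fix: keys come out as alt, src.
$plain = sketch_attributes(array('src' => 'a.png', 'alt' => '``'), true);

// innerHTML fix enabled: the backtick-only value gains a trailing space.
$padded = sketch_attributes(array('alt' => '``'), false, true);
```

A value such as "a `b`" already contains a space, so the strcspn() check fails and the value is left unpadded.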


HTMLPurifier_Generator::generateFromToken (   $token)

Generates HTML from a single token.

Parameters
$token   HTMLPurifier_Token object.
Returns
Generated HTML

Definition at line 120 of file Generator.php.

References escape(), and generateAttributes().

Referenced by generateFromTokens(), and generateScriptFromToken().

{
    if (!$token instanceof HTMLPurifier_Token) {
        trigger_error('Cannot generate HTML from non-HTMLPurifier_Token object', E_USER_WARNING);
        return '';
    } elseif ($token instanceof HTMLPurifier_Token_Start) {
        $attr = $this->generateAttributes($token->attr, $token->name);
        if ($this->_flashCompat) {
            if ($token->name == "object") {
                $flash = new stdclass();
                $flash->attr = $token->attr;
                $flash->param = array();
                $this->_flashStack[] = $flash;
            }
        }
        return '<' . $token->name . ($attr ? ' ' : '') . $attr . '>';
    } elseif ($token instanceof HTMLPurifier_Token_End) {
        $_extra = '';
        if ($this->_flashCompat) {
            if ($token->name == "object" && !empty($this->_flashStack)) {
                // doesn't do anything for now
            }
        }
        return $_extra . '</' . $token->name . '>';
    } elseif ($token instanceof HTMLPurifier_Token_Empty) {
        if ($this->_flashCompat && $token->name == "param" && !empty($this->_flashStack)) {
            $this->_flashStack[count($this->_flashStack)-1]->param[$token->attr['name']] = $token->attr['value'];
        }
        $attr = $this->generateAttributes($token->attr, $token->name);
        return '<' . $token->name . ($attr ? ' ' : '') . $attr .
            ( $this->_xhtml ? ' /': '' ) // <br /> v. <br>
            . '>';
    } elseif ($token instanceof HTMLPurifier_Token_Text) {
        return $this->escape($token->data, ENT_NOQUOTES);
    } elseif ($token instanceof HTMLPurifier_Token_Comment) {
        return '<!--' . $token->data . '-->';
    } else {
        return '';
    }
}
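
The empty-element branch is the only one whose output depends on the doctype. As a small illustration, the sketch below copies that expression into a hypothetical helper, sketch_empty_tag(), whose $xhtml parameter stands in for the private $_xhtml flag. It is not part of HTML Purifier.

```php
<?php
// Hypothetical helper mirroring the HTMLPurifier_Token_Empty branch above.
// $attr is an already generated attribute string (possibly empty).
function sketch_empty_tag($name, $attr, $xhtml)
{
    return '<' . $name . ($attr ? ' ' : '') . $attr
        . ($xhtml ? ' /' : '') // self-closing slash only for XML doctypes
        . '>';
}

$xhtmlBr = sketch_empty_tag('br', '', true);
$htmlImg = sketch_empty_tag('img', 'src="a.png"', false);
```

With an XML doctype the first call yields a self-closed tag; with an HTML doctype the second call yields a plain open tag with no slash.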


HTMLPurifier_Generator::generateFromTokens (   $tokens)

Generates HTML from an array of tokens.

Parameters
$tokens   Array of HTMLPurifier_Token
Returns
Generated HTML

Definition at line 75 of file Generator.php.

References $size, generateFromToken(), and generateScriptFromToken().

{
    if (!$tokens) return '';
    // Basic algorithm
    $html = '';
    for ($i = 0, $size = count($tokens); $i < $size; $i++) {
        if ($this->_scriptFix && $tokens[$i]->name === 'script'
            && $i + 2 < $size && $tokens[$i+2] instanceof HTMLPurifier_Token_End) {
            // script special case
            // the contents of the script block must be ONE token
            // for this to work.
            $html .= $this->generateFromToken($tokens[$i++]);
            $html .= $this->generateScriptFromToken($tokens[$i++]);
        }
        $html .= $this->generateFromToken($tokens[$i]);
    }
    // Tidy cleanup
    if (extension_loaded('tidy') && $this->config->get('Output.TidyFormat')) {
        $tidy = new Tidy;
        $tidy->parseString($html, array(
            'indent'=> true,
            'output-xhtml' => $this->_xhtml,
            'show-body-only' => true,
            'indent-spaces' => 2,
            'wrap' => 68,
        ), 'utf8');
        $tidy->cleanRepair();
        $html = (string) $tidy; // explicit cast necessary
    }
    // Normalize newlines to system defined value
    if ($this->config->get('Core.NormalizeNewlines')) {
        $nl = $this->config->get('Output.Newline');
        if ($nl === null) $nl = PHP_EOL;
        if ($nl !== "\n") $html = str_replace("\n", $nl, $html);
    }
    return $html;
}
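
The final newline step is independent of the token loop and can be sketched on its own. In this hypothetical helper, sketch_normalize_newlines(), the $nl parameter plays the role of the Output.Newline setting, with null meaning "use the platform default".

```php
<?php
// Hypothetical standalone version of the newline normalization step above.
function sketch_normalize_newlines($html, $nl = null)
{
    if ($nl === null) {
        $nl = PHP_EOL; // fall back to the platform newline
    }
    if ($nl !== "\n") {
        // The generated HTML uses "\n"; only rewrite when a different
        // newline sequence is requested.
        $html = str_replace("\n", $nl, $html);
    }
    return $html;
}

$crlf = sketch_normalize_newlines("<p>a</p>\n<p>b</p>", "\r\n");
$same = sketch_normalize_newlines("<p>a</p>\n<p>b</p>", "\n");
```

Passing "\n" explicitly leaves the string untouched, which matches the short-circuit in the method.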


HTMLPurifier_Generator::generateScriptFromToken (   $token)

Special case processor for the contents of script tags.

Warning
This runs into problems if there's already a literal --> somewhere inside the script contents.

Definition at line 171 of file Generator.php.

References $data, and generateFromToken().

Referenced by generateFromTokens().

{
    if (!$token instanceof HTMLPurifier_Token_Text) return $this->generateFromToken($token);
    // Thanks <http://lachy.id.au/log/2005/05/script-comments>
    $data = preg_replace('#//\s*$#', '', $token->data);
    return '<!--//--><![CDATA[//><!--' . "\n" . trim($data) . "\n" . '//--><!]]>';
}
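
To see what the wrapper produces, the two statements can be lifted into a standalone function. sketch_wrap_script() below is a hypothetical name used only for this illustration. It takes the raw script text directly instead of an HTMLPurifier_Token_Text object.

```php
<?php
// Hypothetical standalone copy of the wrapping logic above.
function sketch_wrap_script($data)
{
    // Drop a trailing "//" (plus trailing whitespace) so it does not
    // double up with the closing "//-->" line.
    $data = preg_replace('#//\s*$#', '', $data);
    return '<!--//--><![CDATA[//><!--' . "\n" . trim($data) . "\n" . '//--><!]]>';
}

$wrapped = sketch_wrap_script("alert(1); //\n");
```

The script body ends up on its own line between the opening and closing markers. Nothing in this logic inspects the body for an existing -->, which is the limitation the warning describes.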


Field Documentation

HTMLPurifier_Generator::$_def
private

Cache of HTMLDefinition during HTML output to determine whether or not attributes should be minimized.

Definition at line 27 of file Generator.php.

HTMLPurifier_Generator::$_flashCompat
private

Cache of Output.FlashCompat.

Definition at line 37 of file Generator.php.

HTMLPurifier_Generator::$_flashStack = array()
private

Stack for keeping track of object information when outputting IE compatibility code.

Definition at line 48 of file Generator.php.

HTMLPurifier_Generator::$_innerHTMLFix
private

Cache of Output.FixInnerHTML.

Definition at line 42 of file Generator.php.

HTMLPurifier_Generator::$_scriptFix = false
private

:HACK: Whether or not generator should comment the insides of <script> tags

Definition at line 21 of file Generator.php.

HTMLPurifier_Generator::$_sortAttr
private

Cache of Output.SortAttr.

Definition at line 32 of file Generator.php.

HTMLPurifier_Generator::$_xhtml = true
private

Whether or not generator should produce XML output.

Definition at line 16 of file Generator.php.

HTMLPurifier_Generator::$config
protected

Configuration for the generator.

Definition at line 53 of file Generator.php.

Referenced by __construct().


The documentation for this class was generated from the following file: