ILIAS  release_5-4 Revision v5.4.26-12-gabc799a52e6
PhpOffice\PhpSpreadsheet\Reader\Xml Class Reference

Reader for SpreadsheetML, the XML schema for Microsoft Office Excel 2003.


Public Member Functions

 __construct ()
 Create a new Excel2003XML Reader instance.

 canRead ($pFilename)
 Can the current IReader read the file?

 trySimpleXMLLoadString ($pFilename)
 Check if the file is a valid SimpleXML.

 listWorksheetNames ($filename)
 Reads names of the worksheets from a file, without parsing the whole file to a Spreadsheet object.

 listWorksheetInfo ($filename)
 Return worksheet info (Name, Last Column Letter, Last Column Index, Total Rows, Total Columns).

 load ($filename)
 Loads Spreadsheet from file.
 
- Public Member Functions inherited from PhpOffice\PhpSpreadsheet\Reader\BaseReader
 __construct ()
 IReader constructor.

 getReadDataOnly ()
 Read data only? If this is true, the Reader reads only the data values of cells and ignores formatting information.

 setReadDataOnly ($pValue)
 Set read data only. Set to true to advise the Reader to read only the data values of cells and to ignore any formatting information.

 getReadEmptyCells ()
 Read empty cells? If this is true (the default), the Reader reads data values for all cells, irrespective of value.

 setReadEmptyCells ($pValue)
 Set read empty cells. Set to true (the default) to advise the Reader to read data values for all cells, irrespective of value.

 getIncludeCharts ()
 Read charts in workbook? If this is true, the Reader includes any charts that exist in the workbook.

 setIncludeCharts ($pValue)
 Set read charts in workbook. Set to true to advise the Reader to include any charts that exist in the workbook.

 getLoadSheetsOnly ()
 Get which sheets to load. Returns either an array of worksheet names (the worksheets that should be loaded) or null, indicating that all worksheets in the workbook should be loaded.

 setLoadSheetsOnly ($value)
 Set which sheets to load.

 setLoadAllSheets ()
 Set all sheets to load. Tells the Reader to load all worksheets from the workbook.

 getReadFilter ()
 Read filter.

 setReadFilter (IReadFilter $pValue)
 Set read filter.
 
 getSecurityScanner ()
 

Static Public Member Functions

static xmlMappings ()
 

Protected Member Functions

 parseCellComment (SimpleXMLElement $comment, array $namespaces, Spreadsheet $spreadsheet, string $columnID, int $rowID)
 
 parseRichText (string $annotation)
 
- Protected Member Functions inherited from PhpOffice\PhpSpreadsheet\Reader\BaseReader
 openFile ($pFilename)
 Open file for reading.
 

Protected Attributes

 $styles = []
 
- Protected Attributes inherited from PhpOffice\PhpSpreadsheet\Reader\BaseReader
 $readDataOnly = false
 
 $readEmptyCells = true
 
 $includeCharts = false
 
 $loadSheetsOnly
 
 $readFilter
 
 $fileHandle
 
 $securityScanner
 

Static Private Member Functions

static getAttributes (?SimpleXMLElement $simple, string $node)
 

Private Attributes

 $fileContents = ''
 

Detailed Description

Reader for SpreadsheetML, the XML schema for Microsoft Office Excel 2003.

Definition at line 26 of file Xml.php.

Constructor & Destructor Documentation

◆ __construct()

PhpOffice\PhpSpreadsheet\Reader\Xml::__construct ( )

Create a new Excel2003XML Reader instance.

Implements PhpOffice\PhpSpreadsheet\Reader\IReader.

Definition at line 38 of file Xml.php.

References PhpOffice\PhpSpreadsheet\Reader\Security\XmlScanner\getInstance().

    {
        parent::__construct();
        $this->securityScanner = XmlScanner::getInstance($this);
    }

Member Function Documentation

◆ canRead()

PhpOffice\PhpSpreadsheet\Reader\Xml::canRead (   $pFilename)

Can the current IReader read the file?

Parameters
string $pFilename
Returns
bool

Implements PhpOffice\PhpSpreadsheet\Reader\IReader.

Definition at line 61 of file Xml.php.

References $data, $valid, and PhpOffice\PhpSpreadsheet\Shared\StringHelper\convertEncoding().

Referenced by PhpOffice\PhpSpreadsheet\Reader\Xml\listWorksheetInfo(), PhpOffice\PhpSpreadsheet\Reader\Xml\listWorksheetNames(), and PhpOffice\PhpSpreadsheet\Reader\Xml\load().

    {
        // Office xmlns:o="urn:schemas-microsoft-com:office:office"
        // Excel xmlns:x="urn:schemas-microsoft-com:office:excel"
        // XML Spreadsheet xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
        // Spreadsheet component xmlns:c="urn:schemas-microsoft-com:office:component:spreadsheet"
        // XML schema xmlns:s="uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882"
        // XML data type xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882"
        // MS-persist recordset xmlns:rs="urn:schemas-microsoft-com:rowset"
        // Rowset xmlns:z="#RowsetSchema"
        //

        $signature = [
            '<?xml version="1.0"',
            'xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet',
        ];

        // Open file
        $data = file_get_contents($pFilename);

        // Why?
        //$data = str_replace("'", '"', $data); // fix headers with single quote

        $valid = true;
        foreach ($signature as $match) {
            // every part of the signature must be present
            if (strpos($data, $match) === false) {
                $valid = false;

                break;
            }
        }

        // Retrieve charset encoding
        if (preg_match('/<?xml.*encoding=[\'"](.*?)[\'"].*?>/m', $data, $matches)) {
            $charSet = strtoupper($matches[1]);
            if (1 == preg_match('/^ISO-8859-\d[\dL]?$/i', $charSet)) {
                $data = StringHelper::convertEncoding($data, 'UTF-8', $charSet);
                $data = preg_replace('/(<?xml.*encoding=[\'"]).*?([\'"].*?>)/um', '$1' . 'UTF-8' . '$2', $data, 1);
            }
        }
        $this->fileContents = $data;

        return $valid;
    }
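
The signature test in canRead() is a plain substring search: the file is accepted only if every fragment occurs literally in its contents. The following is a minimal standalone sketch of that check, not the library's code; the function name looksLikeSpreadsheetMl and the sample strings are hypothetical and exist only for illustration.

```php
<?php

/**
 * Return true when every SpreadsheetML signature fragment occurs in $data.
 * Hypothetical helper for illustration; not part of PhpSpreadsheet.
 */
function looksLikeSpreadsheetMl(string $data): bool
{
    $signature = [
        '<?xml version="1.0"',
        'xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet',
    ];

    foreach ($signature as $fragment) {
        // Every fragment must be present, mirroring the loop in canRead().
        if (strpos($data, $fragment) === false) {
            return false;
        }
    }

    return true;
}

$sample = '<?xml version="1.0"?>'
    . '<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"'
    . ' xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"></Workbook>';

// Same document, but with a single-quoted XML declaration.
$singleQuoted = str_replace('"1.0"', "'1.0'", $sample);

var_dump(looksLikeSpreadsheetMl($sample));         // bool(true)
var_dump(looksLikeSpreadsheetMl('<html></html>')); // bool(false)
var_dump(looksLikeSpreadsheetMl($singleQuoted));   // bool(false)
```

Because the match is literal, a declaration written with single quotes does not satisfy the first fragment. The commented-out str_replace() line in the listing appears to have targeted that case.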

◆ getAttributes()

static PhpOffice\PhpSpreadsheet\Reader\Xml::getAttributes ( ?SimpleXMLElement  $simple,
string  $node 
)
static private

Definition at line 534 of file Xml.php.

    {
        return ($simple === null)
            ? new SimpleXMLElement('<xml></xml>')
            : ($simple->attributes($node) ?? new SimpleXMLElement('<xml></xml>'));
    }

◆ listWorksheetInfo()

PhpOffice\PhpSpreadsheet\Reader\Xml::listWorksheetInfo (   $filename)

Return worksheet info (Name, Last Column Letter, Last Column Index, Total Rows, Total Columns).

Parameters
string $filename
Returns
array

Definition at line 169 of file Xml.php.

References $filename, $xml, PhpOffice\PhpSpreadsheet\Shared\File\assertFile(), PhpOffice\PhpSpreadsheet\Reader\Xml\canRead(), PhpOffice\PhpSpreadsheet\Cell\Coordinate\stringFromColumnIndex(), and PhpOffice\PhpSpreadsheet\Reader\Xml\trySimpleXMLLoadString().

    {
        File::assertFile($filename);
        if (!$this->canRead($filename)) {
            throw new Exception($filename . ' is an Invalid Spreadsheet file.');
        }

        $worksheetInfo = [];

        $xml = $this->trySimpleXMLLoadString($filename);
        if ($xml === false) {
            throw new Exception("Problem reading {$filename}");
        }

        $namespaces = $xml->getNamespaces(true);

        $worksheetID = 1;
        $xml_ss = $xml->children($namespaces['ss']);
        foreach ($xml_ss->Worksheet as $worksheet) {
            $worksheet_ss = self::getAttributes($worksheet, $namespaces['ss']);

            $tmpInfo = [];
            $tmpInfo['worksheetName'] = '';
            $tmpInfo['lastColumnLetter'] = 'A';
            $tmpInfo['lastColumnIndex'] = 0;
            $tmpInfo['totalRows'] = 0;
            $tmpInfo['totalColumns'] = 0;

            $tmpInfo['worksheetName'] = "Worksheet_{$worksheetID}";
            if (isset($worksheet_ss['Name'])) {
                $tmpInfo['worksheetName'] = (string) $worksheet_ss['Name'];
            }

            if (isset($worksheet->Table->Row)) {
                $rowIndex = 0;

                foreach ($worksheet->Table->Row as $rowData) {
                    $columnIndex = 0;
                    $rowHasData = false;

                    foreach ($rowData->Cell as $cell) {
                        if (isset($cell->Data)) {
                            $tmpInfo['lastColumnIndex'] = max($tmpInfo['lastColumnIndex'], $columnIndex);
                            $rowHasData = true;
                        }

                        ++$columnIndex;
                    }

                    ++$rowIndex;

                    if ($rowHasData) {
                        $tmpInfo['totalRows'] = max($tmpInfo['totalRows'], $rowIndex);
                    }
                }
            }

            $tmpInfo['lastColumnLetter'] = Coordinate::stringFromColumnIndex($tmpInfo['lastColumnIndex'] + 1);
            $tmpInfo['totalColumns'] = $tmpInfo['lastColumnIndex'] + 1;

            $worksheetInfo[] = $tmpInfo;
            ++$worksheetID;
        }

        return $worksheetInfo;
    }
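
In the listing, lastColumnIndex is zero-based, so the method adds 1 before converting it to a column letter and before reporting totalColumns. The sketch below illustrates how a 1-based column index maps to spreadsheet letters. It is an independent reimplementation written for this page, not the body of Coordinate::stringFromColumnIndex(), and the function name columnLetters is hypothetical.

```php
<?php

/**
 * Convert a 1-based column index to spreadsheet letters (1 => A, 27 => AA).
 * Hypothetical helper for illustration; not the library's implementation.
 */
function columnLetters(int $index): string
{
    if ($index < 1) {
        throw new InvalidArgumentException('Column index must be 1 or greater.');
    }

    $letters = '';
    while ($index > 0) {
        // Work in base 26 with digits A..Z and no zero digit.
        $remainder = ($index - 1) % 26;
        $letters = chr(65 + $remainder) . $letters;
        $index = intdiv($index - 1, 26);
    }

    return $letters;
}

// A sheet whose last populated cell sits at zero-based column index 26:
$lastColumnIndex = 26;
echo columnLetters($lastColumnIndex + 1), PHP_EOL; // AA
```

With lastColumnIndex = 26, both the reported letter and totalColumns refer to the 27th column.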

◆ listWorksheetNames()

PhpOffice\PhpSpreadsheet\Reader\Xml::listWorksheetNames (   $filename)

Reads names of the worksheets from a file, without parsing the whole file to a Spreadsheet object.

Parameters
string $filename
Returns
array

Definition at line 137 of file Xml.php.

References $filename, $xml, PhpOffice\PhpSpreadsheet\Shared\File\assertFile(), PhpOffice\PhpSpreadsheet\Reader\Xml\canRead(), and PhpOffice\PhpSpreadsheet\Reader\Xml\trySimpleXMLLoadString().

    {
        File::assertFile($filename);
        if (!$this->canRead($filename)) {
            throw new Exception($filename . ' is an Invalid Spreadsheet file.');
        }

        $worksheetNames = [];

        $xml = $this->trySimpleXMLLoadString($filename);
        if ($xml === false) {
            throw new Exception("Problem reading {$filename}");
        }

        $namespaces = $xml->getNamespaces(true);

        $xml_ss = $xml->children($namespaces['ss']);
        foreach ($xml_ss->Worksheet as $worksheet) {
            $worksheet_ss = self::getAttributes($worksheet, $namespaces['ss']);
            $worksheetNames[] = (string) $worksheet_ss['Name'];
        }

        return $worksheetNames;
    }
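
The namespace handling above can be tried without the library. The sketch below applies the same SimpleXML calls (getNamespaces(), children(), attributes()) to an inline SpreadsheetML fragment. The sample workbook and its sheet names are made up for illustration, and the security scan and file checks of the real method are left out.

```php
<?php

// Made-up sample workbook; real files carry many more elements.
$source = <<<'XML'
<?xml version="1.0"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
          xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
  <Worksheet ss:Name="Budget"><Table/></Worksheet>
  <Worksheet ss:Name="Notes"><Table/></Worksheet>
</Workbook>
XML;

$xml = simplexml_load_string($source);

// Map of prefix => namespace URI for the whole document.
$namespaces = $xml->getNamespaces(true);

$names = [];
foreach ($xml->children($namespaces['ss'])->Worksheet as $worksheet) {
    // ss:Name is a namespaced attribute, so it must be requested by namespace.
    $attributes = $worksheet->attributes($namespaces['ss']);
    $names[] = (string) $attributes['Name'];
}

print_r($names); // Budget, Notes
```

Only the Worksheet elements and their attributes are visited, which is why the method can list names without building a Spreadsheet object.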

◆ load()

PhpOffice\PhpSpreadsheet\Reader\Xml::load (   $filename)

Loads Spreadsheet from file.

Parameters
string $filename
Returns
Spreadsheet

Implements PhpOffice\PhpSpreadsheet\Reader\IReader.

Definition at line 243 of file Xml.php.

References $filename, $name, $style, $type, $xml, PhpOffice\PhpSpreadsheet\Spreadsheet\addDefinedName(), PhpOffice\PhpSpreadsheet\Shared\File\assertFile(), PhpOffice\PhpSpreadsheet\Reader\Xml\canRead(), PhpOffice\PhpSpreadsheet\Cell\Coordinate\columnIndexFromString(), PhpOffice\PhpSpreadsheet\Cell\AddressHelper\convertFormulaToA1(), PhpOffice\PhpSpreadsheet\DefinedName\createInstance(), PhpOffice\PhpSpreadsheet\Spreadsheet\createSheet(), PhpOffice\PhpSpreadsheet\Spreadsheet\getActiveSheet(), PhpOffice\PhpSpreadsheet\Reader\BaseReader\getReadFilter(), PhpOffice\PhpSpreadsheet\Reader\Xml\parseCellComment(), PhpOffice\PhpSpreadsheet\Shared\Date\PHPToExcel(), PhpOffice\PhpSpreadsheet\Spreadsheet\setActiveSheetIndex(), PhpOffice\PhpSpreadsheet\Cell\Coordinate\stringFromColumnIndex(), PhpOffice\PhpSpreadsheet\Reader\Xml\trySimpleXMLLoadString(), PhpOffice\PhpSpreadsheet\Cell\DataType\TYPE_BOOL, PhpOffice\PhpSpreadsheet\Cell\DataType\TYPE_ERROR, PhpOffice\PhpSpreadsheet\Cell\DataType\TYPE_FORMULA, PhpOffice\PhpSpreadsheet\Cell\DataType\TYPE_NULL, PhpOffice\PhpSpreadsheet\Cell\DataType\TYPE_NUMERIC, and PhpOffice\PhpSpreadsheet\Cell\DataType\TYPE_STRING.

    {
        // Create new Spreadsheet
        $spreadsheet = new Spreadsheet();
        $spreadsheet->removeSheetByIndex(0);

        // Load into this instance
        return $this->loadIntoExisting($filename, $spreadsheet);
    }
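
A caller-side sketch of load() follows. It assumes PhpSpreadsheet is installed and autoloaded (for example through Composer); the function name readSheetValues, the file path and the sheet name are hypothetical. It uses only methods listed on this page plus Worksheet::toArray(), and it assumes the requested sheet exists in the file.

```php
<?php

use PhpOffice\PhpSpreadsheet\Reader\Xml;

/**
 * Load the cell values of one worksheet from a SpreadsheetML file.
 * Hypothetical helper; requires PhpSpreadsheet to be autoloadable.
 */
function readSheetValues(string $path, string $sheetName): array
{
    $reader = new Xml();

    // Restrict loading to the named worksheet (inherited from BaseReader).
    $reader->setLoadSheetsOnly([$sheetName]);

    // load() throws if the file fails the canRead() check.
    $spreadsheet = $reader->load($path);

    // Rows of cell values from the active sheet of the loaded workbook.
    return $spreadsheet->getActiveSheet()->toArray();
}
```

A call might look like readSheetValues('report.xml', 'Budget'), where both arguments are placeholders.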

◆ parseCellComment()

PhpOffice\PhpSpreadsheet\Reader\Xml::parseCellComment ( SimpleXMLElement  $comment,
array  $namespaces,
Spreadsheet  $spreadsheet,
string  $columnID,
int  $rowID 
)
protected

Definition at line 505 of file Xml.php.

References PhpOffice\PhpSpreadsheet\Spreadsheet\getActiveSheet(), and PhpOffice\PhpSpreadsheet\Reader\Xml\parseRichText().

Referenced by PhpOffice\PhpSpreadsheet\Reader\Xml\load().

    protected function parseCellComment(
        SimpleXMLElement $comment,
        array $namespaces,
        Spreadsheet $spreadsheet,
        string $columnID,
        int $rowID
    ): void {
        $commentAttributes = $comment->attributes($namespaces['ss']);
        $author = 'unknown';
        if (isset($commentAttributes->Author)) {
            $author = (string) $commentAttributes->Author;
        }

        $node = $comment->Data->asXML();
        $annotation = strip_tags((string) $node);
        $spreadsheet->getActiveSheet()->getComment($columnID . $rowID)
            ->setAuthor($author)
            ->setText($this->parseRichText($annotation));
    }
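
The extraction step in parseCellComment() can be illustrated on its own: the author comes from the namespaced Author attribute (defaulting to 'unknown'), and the text is the Data node serialized with asXML() and passed through strip_tags(). The sketch below is a simplified standalone version; the function name readComment and the sample comment markup are hypothetical.

```php
<?php

/**
 * Extract author and plain text from a SpreadsheetML Comment fragment.
 * Hypothetical helper for illustration; not part of PhpSpreadsheet.
 *
 * @return array{author: string, text: string}
 */
function readComment(string $fragment): array
{
    $ns = 'urn:schemas-microsoft-com:office:spreadsheet';
    $comment = simplexml_load_string($fragment);

    // The author is optional, so fall back to 'unknown' as the reader does.
    $attributes = $comment->attributes($ns);
    $author = isset($attributes->Author) ? (string) $attributes->Author : 'unknown';

    // Serialize the Data node, then drop all markup to keep only the text.
    $node = $comment->children($ns)->Data->asXML();
    $text = strip_tags((string) $node);

    return ['author' => $author, 'text' => $text];
}

$withAuthor = '<Comment xmlns="urn:schemas-microsoft-com:office:spreadsheet"'
    . ' xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet" ss:Author="Alice">'
    . '<Data><B>Alice:</B> check this total</Data></Comment>';

$anonymous = '<Comment xmlns="urn:schemas-microsoft-com:office:spreadsheet">'
    . '<Data>No author here</Data></Comment>';

print_r(readComment($withAuthor));
print_r(readComment($anonymous));
```

Since strip_tags() discards all markup, inline formatting such as bold runs is lost, which matches parseRichText() creating a single plain text run.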

◆ parseRichText()

PhpOffice\PhpSpreadsheet\Reader\Xml::parseRichText ( string  $annotation)
protected

Definition at line 525 of file Xml.php.

Referenced by PhpOffice\PhpSpreadsheet\Reader\Xml\parseCellComment().

    protected function parseRichText(string $annotation): RichText
    {
        $value = new RichText();

        $value->createText($annotation);

        return $value;
    }

◆ trySimpleXMLLoadString()

PhpOffice\PhpSpreadsheet\Reader\Xml::trySimpleXMLLoadString (   $pFilename)

Check if the file is a valid SimpleXML.

Parameters
string $pFilename
Returns
false|SimpleXMLElement

Definition at line 114 of file Xml.php.

References $xml, and PhpOffice\PhpSpreadsheet\Settings\getLibXmlLoaderOptions().

Referenced by PhpOffice\PhpSpreadsheet\Reader\Xml\listWorksheetInfo(), PhpOffice\PhpSpreadsheet\Reader\Xml\listWorksheetNames(), and PhpOffice\PhpSpreadsheet\Reader\Xml\load().

    {
        try {
            $xml = simplexml_load_string(
                $this->securityScanner->scan($this->fileContents ?: file_get_contents($pFilename)),
                'SimpleXMLElement',
                Settings::getLibXmlLoaderOptions()
            );
        } catch (\Exception $e) {
            throw new Exception('Cannot load invalid XML file: ' . $pFilename, 0, $e);
        }
        $this->fileContents = '';

        return $xml;
    }
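
The false|SimpleXMLElement return type is why callers test `$xml === false` before using the result. The sketch below shows that behaviour of simplexml_load_string() in isolation. The function name parseXmlOrFalse is hypothetical, and the security scan and the loader options of the real method are omitted.

```php
<?php

/**
 * Parse $contents, returning a SimpleXMLElement or false for malformed XML.
 * Hypothetical helper; the real method also runs the security scanner.
 *
 * @return false|SimpleXMLElement
 */
function parseXmlOrFalse(string $contents)
{
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $xml = simplexml_load_string($contents, 'SimpleXMLElement');

    // Restore the previous error handling mode and discard collected errors.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $xml;
}

$ok = parseXmlOrFalse('<Workbook><Worksheet/></Workbook>');
$bad = parseXmlOrFalse('<Workbook><Worksheet></Workbook>');

var_dump($ok instanceof SimpleXMLElement); // bool(true)
var_dump($bad);                            // bool(false)
```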

◆ xmlMappings()

static PhpOffice\PhpSpreadsheet\Reader\Xml::xmlMappings ( )
static

Definition at line 46 of file Xml.php.

    public static function xmlMappings(): array
    {
        return array_merge(
            Style\Fill::FILL_MAPPINGS,
            Style\Border::BORDER_MAPPINGS
        );
    }

Field Documentation

◆ $fileContents

PhpOffice\PhpSpreadsheet\Reader\Xml::$fileContents = ''
private

Definition at line 44 of file Xml.php.

◆ $styles

PhpOffice\PhpSpreadsheet\Reader\Xml::$styles = []
protected

Definition at line 33 of file Xml.php.


The documentation for this class was generated from the following file: