ILIAS  release_5-4 Revision v5.4.26-12-gabc799a52e6
PhpOffice\PhpSpreadsheet\Reader\Xml Class Reference

Reader for SpreadsheetML, the XML schema for Microsoft Office Excel 2003.


Public Member Functions

 __construct ()
 Create a new Excel2003XML Reader instance.
 
 canRead ($pFilename)
 Can the current IReader read the file?
 
 trySimpleXMLLoadString ($pFilename)
 Check if the file is a valid SimpleXML.
 
 listWorksheetNames ($filename)
 Reads names of the worksheets from a file, without parsing the whole file to a Spreadsheet object.
 
 listWorksheetInfo ($filename)
 Return worksheet info (Name, Last Column Letter, Last Column Index, Total Rows, Total Columns).
 
 load ($filename)
 Loads Spreadsheet from file.
 
- Public Member Functions inherited from PhpOffice\PhpSpreadsheet\Reader\BaseReader
 __construct ()
 IReader constructor.
 
 getReadDataOnly ()
 Read data only? If this is true, the Reader reads only data values for cells and does not read any formatting information.
 
 setReadDataOnly ($pValue)
 Set read data only. Set to true to advise the Reader to read only data values for cells and to ignore any formatting information.
 
 getReadEmptyCells ()
 Read empty cells? If this is true (the default), the Reader reads data values for all cells, irrespective of value.
 
 setReadEmptyCells ($pValue)
 Set read empty cells. Set to true (the default) to advise the Reader to read data values for all cells, irrespective of value.
 
 getIncludeCharts ()
 Read charts in workbook? If this is true, the Reader includes any charts that exist in the workbook.
 
 setIncludeCharts ($pValue)
 Set read charts in workbook. Set to true to advise the Reader to include any charts that exist in the workbook.
 
 getLoadSheetsOnly ()
 Get which sheets to load. Returns either an array of worksheet names (the list of worksheets that should be loaded) or null, indicating that all worksheets in the workbook should be loaded.
 
 setLoadSheetsOnly ($value)
 Set which sheets to load.
 
 setLoadAllSheets ()
 Set all sheets to load. Tells the Reader to load all worksheets from the workbook.
 
 getReadFilter ()
 Read filter.
 
 setReadFilter (IReadFilter $pValue)
 Set read filter.
 
 getSecurityScanner ()
 
- Public Member Functions inherited from PhpOffice\PhpSpreadsheet\Reader\IReader
 __construct ()
 IReader constructor.
 
 canRead ($pFilename)
 Can the current IReader read the file?
 
 getReadDataOnly ()
 Read data only? If this is true, the Reader reads only data values for cells and does not read any formatting information.
 
 setReadDataOnly ($pValue)
 Set read data only. Set to true to advise the Reader to read only data values for cells and to ignore any formatting information.
 
 getReadEmptyCells ()
 Read empty cells? If this is true (the default), the Reader reads data values for all cells, irrespective of value.
 
 setReadEmptyCells ($pValue)
 Set read empty cells. Set to true (the default) to advise the Reader to read data values for all cells, irrespective of value.
 
 getIncludeCharts ()
 Read charts in workbook? If this is true, the Reader includes any charts that exist in the workbook.
 
 setIncludeCharts ($pValue)
 Set read charts in workbook. Set to true to advise the Reader to include any charts that exist in the workbook.
 
 getLoadSheetsOnly ()
 Get which sheets to load. Returns either an array of worksheet names (the list of worksheets that should be loaded) or null, indicating that all worksheets in the workbook should be loaded.
 
 setLoadSheetsOnly ($value)
 Set which sheets to load.
 
 setLoadAllSheets ()
 Set all sheets to load. Tells the Reader to load all worksheets from the workbook.
 
 getReadFilter ()
 Read filter.
 
 setReadFilter (IReadFilter $pValue)
 Set read filter.
 
 load ($pFilename)
 Loads PhpSpreadsheet from file.
 

Static Public Member Functions

static xmlMappings ()
 

Protected Member Functions

 parseCellComment (SimpleXMLElement $comment, array $namespaces, Spreadsheet $spreadsheet, string $columnID, int $rowID)
 
 parseRichText (string $annotation)
 
- Protected Member Functions inherited from PhpOffice\PhpSpreadsheet\Reader\BaseReader
 openFile ($pFilename)
 Open file for reading.
 

Protected Attributes

 $styles = []
 
- Protected Attributes inherited from PhpOffice\PhpSpreadsheet\Reader\BaseReader
 $readDataOnly = false
 
 $readEmptyCells = true
 
 $includeCharts = false
 
 $loadSheetsOnly
 
 $readFilter
 
 $fileHandle
 
 $securityScanner
 

Static Private Member Functions

static getAttributes (?SimpleXMLElement $simple, string $node)
 

Private Attributes

 $fileContents = ''
 

Detailed Description

Reader for SpreadsheetML, the XML schema for Microsoft Office Excel 2003.

Definition at line 26 of file Xml.php.

Constructor & Destructor Documentation

◆ __construct()

PhpOffice\PhpSpreadsheet\Reader\Xml::__construct ( )

Create a new Excel2003XML Reader instance.

Reimplemented from PhpOffice\PhpSpreadsheet\Reader\BaseReader.

Definition at line 38 of file Xml.php.

    public function __construct()
    {
        parent::__construct();
        $this->securityScanner = XmlScanner::getInstance($this);
    }

References PhpOffice\PhpSpreadsheet\Reader\Security\XmlScanner\getInstance().


Member Function Documentation

◆ canRead()

PhpOffice\PhpSpreadsheet\Reader\Xml::canRead (   $pFilename)

Can the current IReader read the file?

Parameters
string $pFilename
Returns
bool

Implements PhpOffice\PhpSpreadsheet\Reader\IReader.

Definition at line 61 of file Xml.php.

    public function canRead($pFilename)
    {
        // Office xmlns:o="urn:schemas-microsoft-com:office:office"
        // Excel xmlns:x="urn:schemas-microsoft-com:office:excel"
        // XML Spreadsheet xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
        // Spreadsheet component xmlns:c="urn:schemas-microsoft-com:office:component:spreadsheet"
        // XML schema xmlns:s="uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882"
        // XML data type xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882"
        // MS-persist recordset xmlns:rs="urn:schemas-microsoft-com:rowset"
        // Rowset xmlns:z="#RowsetSchema"
        //

        $signature = [
            '<?xml version="1.0"',
            'xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet',
        ];

        // Open file
        $data = file_get_contents($pFilename);

        // Why?
        //$data = str_replace("'", '"', $data); // fix headers with single quote

        $valid = true;
        foreach ($signature as $match) {
            // every part of the signature must be present
            if (strpos($data, $match) === false) {
                $valid = false;

                break;
            }
        }

        // Retrieve charset encoding
        if (preg_match('/<?xml.*encoding=[\'"](.*?)[\'"].*?>/m', $data, $matches)) {
            $charSet = strtoupper($matches[1]);
            if (1 == preg_match('/^ISO-8859-\d[\dL]?$/i', $charSet)) {
                $data = StringHelper::convertEncoding($data, 'UTF-8', $charSet);
                $data = preg_replace('/(<?xml.*encoding=[\'"]).*?([\'"].*?>)/um', '$1' . 'UTF-8' . '$2', $data, 1);
            }
        }
        $this->fileContents = $data;

        return $valid;
    }

References $data, $valid, and PhpOffice\PhpSpreadsheet\Shared\StringHelper\convertEncoding().

Referenced by PhpOffice\PhpSpreadsheet\Reader\Xml\listWorksheetInfo(), and PhpOffice\PhpSpreadsheet\Reader\Xml\listWorksheetNames().
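
The signature test in canRead() can be tried in isolation. The following is a minimal sketch, assuming the data is already in memory rather than read from a file; the function name hasSpreadsheetMlSignature and both sample documents are hypothetical and not part of PhpSpreadsheet.

```php
<?php
// Hypothetical helper mirroring the signature loop in canRead():
// every signature fragment must occur somewhere in the data.
function hasSpreadsheetMlSignature(string $data): bool
{
    $signature = [
        '<?xml version="1.0"',
        'xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet',
    ];

    foreach ($signature as $match) {
        if (strpos($data, $match) === false) {
            return false;
        }
    }

    return true;
}

// Illustrative inputs (assumptions, not taken from a real workbook).
$spreadsheetMl = '<?xml version="1.0"?>'
    . '<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"'
    . ' xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"></Workbook>';
$plainXml = '<?xml version="1.0"?><root/>';

var_dump(hasSpreadsheetMlSignature($spreadsheetMl)); // bool(true)
var_dump(hasSpreadsheetMlSignature($plainXml));      // bool(false)
```

Because the match is a plain substring test, a declaration written with single quotes would not match the first fragment, which is presumably what the commented-out str_replace() line in the listing was meant to address.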


◆ getAttributes()

static PhpOffice\PhpSpreadsheet\Reader\Xml::getAttributes ( ?SimpleXMLElement $simple, string $node )
static private

Definition at line 534 of file Xml.php.

    private static function getAttributes(?SimpleXMLElement $simple, string $node): SimpleXMLElement
    {
        return ($simple === null)
            ? new SimpleXMLElement('<xml></xml>')
            : ($simple->attributes($node) ?? new SimpleXMLElement('<xml></xml>'));
    }

Referenced by PhpOffice\PhpSpreadsheet\Reader\Xml\listWorksheetInfo(), and PhpOffice\PhpSpreadsheet\Reader\Xml\listWorksheetNames().
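
Since getAttributes() is private, its fallback behaviour can only be illustrated with a stand-in. The sketch below copies the expression from the listing into a hypothetical free function, attributesOrEmpty, and applies it to a made-up Worksheet element; it assumes only the SimpleXML extension.

```php
<?php
// Hypothetical stand-in for the private getAttributes() helper:
// always returns a SimpleXMLElement, even when there is nothing to read.
function attributesOrEmpty(?SimpleXMLElement $simple, string $node): SimpleXMLElement
{
    return ($simple === null)
        ? new SimpleXMLElement('<xml></xml>')
        : ($simple->attributes($node) ?? new SimpleXMLElement('<xml></xml>'));
}

$ns = 'urn:schemas-microsoft-com:office:spreadsheet';

// Made-up element carrying one attribute in the ss namespace.
$worksheet = new SimpleXMLElement(
    '<Worksheet xmlns="' . $ns . '" xmlns:ss="' . $ns . '" ss:Name="Budget"/>'
);

$attrs = attributesOrEmpty($worksheet, $ns);
$name = (string) $attrs['Name'];      // namespaced attribute read by local name

$empty = attributesOrEmpty(null, $ns);
$missing = isset($empty['Name']);     // the placeholder element has no attributes

var_dump($name, $missing);
```

The point of the fallback is that callers can index the result without a null check, as listWorksheetNames() and listWorksheetInfo() do.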


◆ listWorksheetInfo()

PhpOffice\PhpSpreadsheet\Reader\Xml::listWorksheetInfo (   $filename)

Return worksheet info (Name, Last Column Letter, Last Column Index, Total Rows, Total Columns).

Parameters
string $filename
Returns
array

Definition at line 169 of file Xml.php.

    public function listWorksheetInfo($filename)
    {
        File::assertFile($filename);
        if (!$this->canRead($filename)) {
            throw new Exception($filename . ' is an Invalid Spreadsheet file.');
        }

        $worksheetInfo = [];

        $xml = $this->trySimpleXMLLoadString($filename);
        if ($xml === false) {
            throw new Exception("Problem reading {$filename}");
        }

        $namespaces = $xml->getNamespaces(true);

        $worksheetID = 1;
        $xml_ss = $xml->children($namespaces['ss']);
        foreach ($xml_ss->Worksheet as $worksheet) {
            $worksheet_ss = self::getAttributes($worksheet, $namespaces['ss']);

            $tmpInfo = [];
            $tmpInfo['worksheetName'] = '';
            $tmpInfo['lastColumnLetter'] = 'A';
            $tmpInfo['lastColumnIndex'] = 0;
            $tmpInfo['totalRows'] = 0;
            $tmpInfo['totalColumns'] = 0;

            $tmpInfo['worksheetName'] = "Worksheet_{$worksheetID}";
            if (isset($worksheet_ss['Name'])) {
                $tmpInfo['worksheetName'] = (string) $worksheet_ss['Name'];
            }

            if (isset($worksheet->Table->Row)) {
                $rowIndex = 0;

                foreach ($worksheet->Table->Row as $rowData) {
                    $columnIndex = 0;
                    $rowHasData = false;

                    foreach ($rowData->Cell as $cell) {
                        if (isset($cell->Data)) {
                            $tmpInfo['lastColumnIndex'] = max($tmpInfo['lastColumnIndex'], $columnIndex);
                            $rowHasData = true;
                        }

                        ++$columnIndex;
                    }

                    ++$rowIndex;

                    if ($rowHasData) {
                        $tmpInfo['totalRows'] = max($tmpInfo['totalRows'], $rowIndex);
                    }
                }
            }

            $tmpInfo['lastColumnLetter'] = Coordinate::stringFromColumnIndex($tmpInfo['lastColumnIndex'] + 1);
            $tmpInfo['totalColumns'] = $tmpInfo['lastColumnIndex'] + 1;

            $worksheetInfo[] = $tmpInfo;
            ++$worksheetID;
        }

        return $worksheetInfo;
    }

References $filename, $xml, PhpOffice\PhpSpreadsheet\Shared\File\assertFile(), PhpOffice\PhpSpreadsheet\Reader\Xml\canRead(), PhpOffice\PhpSpreadsheet\Reader\Xml\getAttributes(), PhpOffice\PhpSpreadsheet\Cell\Coordinate\stringFromColumnIndex(), and PhpOffice\PhpSpreadsheet\Reader\Xml\trySimpleXMLLoadString().
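
To make the shape of the returned array concrete, the following sketch replays the counting loop from the listing on a small in-memory workbook using only SimpleXML. The sample XML is made up, and columnLetter is a hypothetical stand-in for Coordinate::stringFromColumnIndex(); neither comes from PhpSpreadsheet.

```php
<?php
// Hypothetical stand-in for Coordinate::stringFromColumnIndex():
// converts a 1-based column index to letters (1 => A, 27 => AA).
function columnLetter(int $index): string
{
    $letters = '';
    while ($index > 0) {
        $remainder = ($index - 1) % 26;
        $letters = chr(65 + $remainder) . $letters;
        $index = intdiv($index - 1, 26);
    }

    return $letters;
}

$ns = 'urn:schemas-microsoft-com:office:spreadsheet';

// Made-up workbook: three rows, the last of which holds an empty cell.
$xml = new SimpleXMLElement(
    '<Workbook xmlns="' . $ns . '" xmlns:ss="' . $ns . '">'
    . '<Worksheet ss:Name="Sales"><Table>'
    . '<Row><Cell><Data ss:Type="String">a</Data></Cell>'
    . '<Cell><Data ss:Type="String">b</Data></Cell>'
    . '<Cell><Data ss:Type="String">c</Data></Cell></Row>'
    . '<Row><Cell><Data ss:Type="String">d</Data></Cell></Row>'
    . '<Row><Cell/></Row>'
    . '</Table></Worksheet></Workbook>'
);

$info = [];
foreach ($xml->children($ns)->Worksheet as $worksheet) {
    $lastColumnIndex = 0;
    $totalRows = 0;
    $rowIndex = 0;
    foreach ($worksheet->Table->Row as $row) {
        $columnIndex = 0;
        $rowHasData = false;
        foreach ($row->Cell as $cell) {
            // Only cells that contain a Data element count as populated.
            if (isset($cell->Data)) {
                $lastColumnIndex = max($lastColumnIndex, $columnIndex);
                $rowHasData = true;
            }
            ++$columnIndex;
        }
        ++$rowIndex;
        if ($rowHasData) {
            $totalRows = max($totalRows, $rowIndex);
        }
    }
    $info[] = [
        'worksheetName' => (string) $worksheet->attributes($ns)['Name'],
        'lastColumnLetter' => columnLetter($lastColumnIndex + 1),
        'lastColumnIndex' => $lastColumnIndex,
        'totalRows' => $totalRows,
        'totalColumns' => $lastColumnIndex + 1,
    ];
}

print_r($info);
```

For this input the single entry reports worksheetName "Sales", lastColumnLetter "C", lastColumnIndex 2, totalRows 2 and totalColumns 3: the trailing row is ignored because its only cell has no Data element. Note that lastColumnIndex is zero-based while totalColumns is a count.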


◆ listWorksheetNames()

PhpOffice\PhpSpreadsheet\Reader\Xml::listWorksheetNames (   $filename)

Reads names of the worksheets from a file, without parsing the whole file to a Spreadsheet object.

Parameters
string $filename
Returns
array

Definition at line 137 of file Xml.php.

    public function listWorksheetNames($filename)
    {
        File::assertFile($filename);
        if (!$this->canRead($filename)) {
            throw new Exception($filename . ' is an Invalid Spreadsheet file.');
        }

        $worksheetNames = [];

        $xml = $this->trySimpleXMLLoadString($filename);
        if ($xml === false) {
            throw new Exception("Problem reading {$filename}");
        }

        $namespaces = $xml->getNamespaces(true);

        $xml_ss = $xml->children($namespaces['ss']);
        foreach ($xml_ss->Worksheet as $worksheet) {
            $worksheet_ss = self::getAttributes($worksheet, $namespaces['ss']);
            $worksheetNames[] = (string) $worksheet_ss['Name'];
        }

        return $worksheetNames;
    }

References $filename, $xml, PhpOffice\PhpSpreadsheet\Shared\File\assertFile(), PhpOffice\PhpSpreadsheet\Reader\Xml\canRead(), PhpOffice\PhpSpreadsheet\Reader\Xml\getAttributes(), and PhpOffice\PhpSpreadsheet\Reader\Xml\trySimpleXMLLoadString().
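
The namespace lookup that drives this method can be sketched without the library. The example below assumes a made-up two-sheet workbook held in a string and uses only SimpleXML; it skips the file assertion, the canRead() check and the security scan that the real method performs.

```php
<?php
// Made-up SpreadsheetML fragment with two named worksheets.
$source = '<?xml version="1.0"?>'
    . '<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"'
    . ' xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">'
    . '<Worksheet ss:Name="Sales"/><Worksheet ss:Name="Costs"/>'
    . '</Workbook>';

$xml = simplexml_load_string($source);

// Recursive lookup: maps each prefix used in the document to its URI.
$namespaces = $xml->getNamespaces(true);

$worksheetNames = [];
foreach ($xml->children($namespaces['ss'])->Worksheet as $worksheet) {
    $worksheetNames[] = (string) $worksheet->attributes($namespaces['ss'])['Name'];
}

print_r($worksheetNames);
```

Here $worksheetNames ends up as ['Sales', 'Costs']. Unlike listWorksheetInfo(), the listing above has no "Worksheet_N" fallback, so a worksheet without an ss:Name attribute would yield an empty string.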


◆ load()

PhpOffice\PhpSpreadsheet\Reader\Xml::load (   $filename)

Loads Spreadsheet from file.

Parameters
string $filename
Returns
Spreadsheet

Implements PhpOffice\PhpSpreadsheet\Reader\IReader.

Definition at line 243 of file Xml.php.

    public function load($filename)
    {
        // Create new Spreadsheet
        $spreadsheet = new Spreadsheet();
        $spreadsheet->removeSheetByIndex(0);

        // Load into this instance
        return $this->loadIntoExisting($filename, $spreadsheet);
    }

References $filename.

◆ parseCellComment()

PhpOffice\PhpSpreadsheet\Reader\Xml::parseCellComment ( SimpleXMLElement $comment, array $namespaces, Spreadsheet $spreadsheet, string $columnID, int $rowID )
protected

Definition at line 505 of file Xml.php.

    protected function parseCellComment(
        SimpleXMLElement $comment,
        array $namespaces,
        Spreadsheet $spreadsheet,
        string $columnID,
        int $rowID
    ): void {
        $commentAttributes = $comment->attributes($namespaces['ss']);
        $author = 'unknown';
        if (isset($commentAttributes->Author)) {
            $author = (string) $commentAttributes->Author;
        }

        $node = $comment->Data->asXML();
        $annotation = strip_tags((string) $node);
        $spreadsheet->getActiveSheet()->getComment($columnID . $rowID)
            ->setAuthor($author)
            ->setText($this->parseRichText($annotation));
    }
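
The extraction step in parseCellComment() (author attribute with an 'unknown' default, then markup stripped from the Data node) can be sketched on its own. The Comment element below is made up, and the sketch stops before the Spreadsheet and RichText calls, which need the library.

```php
<?php
$ns = 'urn:schemas-microsoft-com:office:spreadsheet';

// Made-up comment node with an author and inline HTML-style markup.
$comment = new SimpleXMLElement(
    '<Comment xmlns="' . $ns . '" xmlns:ss="' . $ns . '"'
    . ' xmlns:html="http://www.w3.org/TR/REC-html40" ss:Author="Alice">'
    . '<Data><html:B>Alice:</html:B> check this total</Data>'
    . '</Comment>'
);

// Same default as in the listing: fall back to 'unknown'.
$attributes = $comment->attributes($ns);
$author = 'unknown';
if (isset($attributes->Author)) {
    $author = (string) $attributes->Author;
}

// Serialise the Data node, then drop every tag to keep the plain text.
$annotation = strip_tags((string) $comment->Data->asXML());

var_dump($author, $annotation);
```

With this input $author is "Alice" and $annotation is "Alice: check this total"; any bold or font markup inside the comment is discarded, which matches parseRichText() creating a single plain text run.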

◆ parseRichText()

PhpOffice\PhpSpreadsheet\Reader\Xml::parseRichText ( string  $annotation)
protected

Definition at line 525 of file Xml.php.

    protected function parseRichText(string $annotation): RichText
    {
        $value = new RichText();

        $value->createText($annotation);

        return $value;
    }

◆ trySimpleXMLLoadString()

PhpOffice\PhpSpreadsheet\Reader\Xml::trySimpleXMLLoadString (   $pFilename)

Check if the file is a valid SimpleXML.

Parameters
string $pFilename
Returns
false|SimpleXMLElement

Definition at line 114 of file Xml.php.

    public function trySimpleXMLLoadString($pFilename)
    {
        try {
            $xml = simplexml_load_string(
                $this->securityScanner->scan($this->fileContents ?: file_get_contents($pFilename)),
                'SimpleXMLElement',
                Settings::getLibXmlLoaderOptions()
            );
        } catch (\Exception $e) {
            throw new Exception('Cannot load invalid XML file: ' . $pFilename, 0, $e);
        }
        $this->fileContents = '';

        return $xml;
    }

References $xml, and PhpOffice\PhpSpreadsheet\Settings\getLibXmlLoaderOptions().

Referenced by PhpOffice\PhpSpreadsheet\Reader\Xml\listWorksheetInfo(), and PhpOffice\PhpSpreadsheet\Reader\Xml\listWorksheetNames().
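
The false|SimpleXMLElement return type is why both callers test `$xml === false` before using the result. The sketch below shows that contract with plain simplexml_load_string(); tryLoadXmlString is a hypothetical helper, and it deliberately omits the security scan and the loader options used by the real method.

```php
<?php
// Hypothetical helper: parse a string and return false, without emitting
// warnings, when the markup is not well formed.
function tryLoadXmlString(string $contents)
{
    $previous = libxml_use_internal_errors(true);
    $xml = simplexml_load_string($contents, 'SimpleXMLElement');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $xml;
}

// Illustrative inputs: the second one never closes its Worksheet element.
$good = tryLoadXmlString('<Workbook><Worksheet/></Workbook>');
$bad = tryLoadXmlString('<Workbook><Worksheet></Workbook>');

var_dump($good instanceof SimpleXMLElement); // bool(true)
var_dump($bad);                              // bool(false)
```

Returning false rather than throwing leaves the error message to the caller, which is how listWorksheetNames() and listWorksheetInfo() produce their "Problem reading" exceptions.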


◆ xmlMappings()

static PhpOffice\PhpSpreadsheet\Reader\Xml::xmlMappings ( )
static

Field Documentation

◆ $fileContents

PhpOffice\PhpSpreadsheet\Reader\Xml::$fileContents = ''
private

Definition at line 44 of file Xml.php.

◆ $styles

PhpOffice\PhpSpreadsheet\Reader\Xml::$styles = []
protected

Definition at line 33 of file Xml.php.


The documentation for this class was generated from the following file: