Filename | /home/micha/Projekt/spreadsheet-parsexlsx/lib/Spreadsheet/ParseXLSX.pm |
Statements | Executed 9186778 statements in 4.67s |
Calls | P | F | Exclusive Time | Inclusive Time | Subroutine |
---|---|---|---|---|---|
15608 | 1 | 1 | 5.52s | 12.9s | __ANON__[lib/Spreadsheet/ParseXLSX.pm:443] | Spreadsheet::ParseXLSX::
239270 | 6 | 1 | 861ms | 1.04s | _cell_to_row_col | Spreadsheet::ParseXLSX::
18180 | 1 | 1 | 218ms | 960ms | __ANON__[lib/Spreadsheet/ParseXLSX.pm:302] | Spreadsheet::ParseXLSX::
1 | 1 | 1 | 198ms | 70.6s | _parse_sheet | Spreadsheet::ParseXLSX::
1 | 1 | 1 | 44.8ms | 50.0ms | BEGIN@15 | Spreadsheet::ParseXLSX::
1 | 1 | 1 | 13.5ms | 50.2ms | BEGIN@11 | Spreadsheet::ParseXLSX::
15730 | 7 | 1 | 13.4ms | 13.4ms | _xml_boolean | Spreadsheet::ParseXLSX::
1 | 1 | 1 | 6.82ms | 22.0ms | BEGIN@14 | Spreadsheet::ParseXLSX::
1 | 1 | 1 | 3.00ms | 3.51ms | BEGIN@12 | Spreadsheet::ParseXLSX::
1 | 1 | 1 | 688µs | 13.8ms | BEGIN@17 | Spreadsheet::ParseXLSX::
1 | 1 | 1 | 671µs | 70.7s | _parse_workbook | Spreadsheet::ParseXLSX::
1 | 1 | 1 | 648µs | 10.1ms | _parse_styles | Spreadsheet::ParseXLSX::
1 | 1 | 1 | 174µs | 241µs | BEGIN@18 | Spreadsheet::ParseXLSX::
28 | 3 | 1 | 114µs | 147µs | _color | Spreadsheet::ParseXLSX::
15 | 1 | 1 | 105µs | 280µs | _get_text_and_rich_font_by_cell | Spreadsheet::ParseXLSX::
1 | 1 | 1 | 90µs | 72.2ms | _extract_files | Spreadsheet::ParseXLSX::
15 | 1 | 1 | 68µs | 715µs | __ANON__[lib/Spreadsheet/ParseXLSX.pm:655] | Spreadsheet::ParseXLSX::
7 | 3 | 1 | 53µs | 19.6ms | _zip_file_member | Spreadsheet::ParseXLSX::
1 | 1 | 1 | 51µs | 70.7s | parse | Spreadsheet::ParseXLSX::
7 | 3 | 1 | 31µs | 12.0ms | _new_twig | Spreadsheet::ParseXLSX::
1 | 1 | 1 | 28µs | 1.10ms | _parse_themes | Spreadsheet::ParseXLSX::
5 | 5 | 1 | 26µs | 51.8ms | _parse_xml | Spreadsheet::ParseXLSX::
1 | 1 | 1 | 22µs | 50µs | __ANON__[lib/Spreadsheet/ParseXLSX.pm:268] | Spreadsheet::ParseXLSX::
1 | 1 | 1 | 14µs | 98µs | _check_signature | Spreadsheet::ParseXLSX::
5 | 5 | 1 | 13µs | 16µs | __ANON__[lib/Spreadsheet/ParseXLSX.pm:979] | Spreadsheet::ParseXLSX::
1 | 1 | 1 | 12µs | 72µs | __ANON__[lib/Spreadsheet/ParseXLSX.pm:246] | Spreadsheet::ParseXLSX::
1 | 1 | 1 | 12µs | 14µs | BEGIN@3 | Spreadsheet::ParseXLSX::
1 | 1 | 1 | 10µs | 11.2ms | _parse_shared_strings | Spreadsheet::ParseXLSX::
1 | 1 | 1 | 10µs | 24µs | _dimensions | Spreadsheet::ParseXLSX::
1 | 1 | 1 | 9µs | 49µs | __ANON__[lib/Spreadsheet/ParseXLSX.pm:338] | Spreadsheet::ParseXLSX::
2 | 2 | 1 | 7µs | 7µs | _rels_for | Spreadsheet::ParseXLSX::
1 | 1 | 1 | 7µs | 7µs | BEGIN@5 | Spreadsheet::ParseXLSX::
1 | 1 | 1 | 5µs | 16µs | BEGIN@13 | Spreadsheet::ParseXLSX::
1 | 1 | 1 | 5µs | 5µs | _base_path_for | Spreadsheet::ParseXLSX::
1 | 1 | 1 | 4µs | 27µs | __ANON__[lib/Spreadsheet/ParseXLSX.pm:313] | Spreadsheet::ParseXLSX::
1 | 1 | 1 | 3µs | 19µs | BEGIN@4 | Spreadsheet::ParseXLSX::
1 | 1 | 1 | 2µs | 2µs | new | Spreadsheet::ParseXLSX::
0 | 0 | 0 | 0s | 0s | __ANON__[lib/Spreadsheet/ParseXLSX.pm:258] | Spreadsheet::ParseXLSX::
0 | 0 | 0 | 0s | 0s | __ANON__[lib/Spreadsheet/ParseXLSX.pm:287] | Spreadsheet::ParseXLSX::
0 | 0 | 0 | 0s | 0s | __ANON__[lib/Spreadsheet/ParseXLSX.pm:325] | Spreadsheet::ParseXLSX::
0 | 0 | 0 | 0s | 0s | __ANON__[lib/Spreadsheet/ParseXLSX.pm:346] | Spreadsheet::ParseXLSX::
0 | 0 | 0 | 0s | 0s | __ANON__[lib/Spreadsheet/ParseXLSX.pm:495] | Spreadsheet::ParseXLSX::
0 | 0 | 0 | 0s | 0s | __ANON__[lib/Spreadsheet/ParseXLSX.pm:555] | Spreadsheet::ParseXLSX::
0 | 0 | 0 | 0s | 0s | _apply_tint | Spreadsheet::ParseXLSX::
0 | 0 | 0 | 0s | 0s | _is_merged | Spreadsheet::ParseXLSX::
0 | 0 | 0 | 0s | 0s | _parse_sheet_links | Spreadsheet::ParseXLSX::
Line | Statements | Time on line | Calls | Time in subs | Code |
---|---|---|---|---|---|
1 | package Spreadsheet::ParseXLSX; | ||||
2 | |||||
3 | 2 | 24µs | 2 | 15µs | # spent 14µs (12+2) within Spreadsheet::ParseXLSX::BEGIN@3 which was called:
# once (12µs+2µs) by main::BEGIN@7 at line 3 # spent 14µs making 1 call to Spreadsheet::ParseXLSX::BEGIN@3
# spent 2µs making 1 call to strict::import |
4 | 2 | 14µs | 2 | 35µs | # spent 19µs (3+16) within Spreadsheet::ParseXLSX::BEGIN@4 which was called:
# once (3µs+16µs) by main::BEGIN@7 at line 4 # spent 19µs making 1 call to Spreadsheet::ParseXLSX::BEGIN@4
# spent 16µs making 1 call to warnings::import |
5 | 2 | 28µs | 1 | 7µs | # spent 7µs within Spreadsheet::ParseXLSX::BEGIN@5 which was called:
# once (7µs+0s) by main::BEGIN@7 at line 5 # spent 7µs making 1 call to Spreadsheet::ParseXLSX::BEGIN@5 |
6 | |||||
7 | # VERSION | ||||
8 | |||||
9 | # ABSTRACT: parse XLSX files | ||||
10 | |||||
11 | 3 | 98µs | 3 | 50.2ms | # spent 50.2ms (13.5+36.6) within Spreadsheet::ParseXLSX::BEGIN@11 which was called:
# once (13.5ms+36.6ms) by main::BEGIN@7 at line 11 # spent 50.2ms making 1 call to Spreadsheet::ParseXLSX::BEGIN@11
# spent 10µs making 1 call to Exporter::import
# spent 7µs making 1 call to UNIVERSAL::VERSION |
12 | 2 | 73µs | 2 | 3.55ms | # spent 3.51ms (3.00+512µs) within Spreadsheet::ParseXLSX::BEGIN@12 which was called:
# once (3.00ms+512µs) by main::BEGIN@7 at line 12 # spent 3.51ms making 1 call to Spreadsheet::ParseXLSX::BEGIN@12
# spent 40µs making 1 call to Exporter::import |
13 | 2 | 15µs | 2 | 28µs | # spent 16µs (5+12) within Spreadsheet::ParseXLSX::BEGIN@13 which was called:
# once (5µs+12µs) by main::BEGIN@7 at line 13 # spent 16µs making 1 call to Spreadsheet::ParseXLSX::BEGIN@13
# spent 12µs making 1 call to Exporter::import |
14 | 2 | 81µs | 2 | 22.0ms | # spent 22.0ms (6.82+15.2) within Spreadsheet::ParseXLSX::BEGIN@14 which was called:
# once (6.82ms+15.2ms) by main::BEGIN@7 at line 14 # spent 22.0ms making 1 call to Spreadsheet::ParseXLSX::BEGIN@14
# spent 600ns making 1 call to Spreadsheet::ParseXLSX::__ANON__ |
15 | 2 | 79µs | 2 | 50.0ms | # spent 50.0ms (44.8+5.23) within Spreadsheet::ParseXLSX::BEGIN@15 which was called:
# once (44.8ms+5.23ms) by main::BEGIN@7 at line 15 # spent 50.0ms making 1 call to Spreadsheet::ParseXLSX::BEGIN@15
# spent 2µs making 1 call to UNIVERSAL::import |
16 | |||||
17 | 2 | 78µs | 2 | 13.8ms | # spent 13.8ms (688µs+13.1) within Spreadsheet::ParseXLSX::BEGIN@17 which was called:
# once (688µs+13.1ms) by main::BEGIN@7 at line 17 # spent 13.8ms making 1 call to Spreadsheet::ParseXLSX::BEGIN@17
# spent 1µs making 1 call to UNIVERSAL::import |
18 | 2 | 3.27ms | 2 | 242µs | # spent 241µs (174+68) within Spreadsheet::ParseXLSX::BEGIN@18 which was called:
# once (174µs+68µs) by main::BEGIN@7 at line 18 # spent 241µs making 1 call to Spreadsheet::ParseXLSX::BEGIN@18
# spent 1µs making 1 call to UNIVERSAL::import |
19 | |||||
20 | =head1 SYNOPSIS | ||||
21 | |||||
22 | use Spreadsheet::ParseXLSX; | ||||
23 | |||||
24 | my $parser = Spreadsheet::ParseXLSX->new; | ||||
25 | my $workbook = $parser->parse("file.xlsx"); | ||||
26 | # see Spreadsheet::ParseExcel for further documentation | ||||
27 | |||||
28 | =head1 DESCRIPTION | ||||
29 | |||||
30 | This module is an adaptor for L<Spreadsheet::ParseExcel> that reads XLSX files. | ||||
31 | For documentation about the various data that you can retrieve from these | ||||
32 | classes, please see L<Spreadsheet::ParseExcel>, | ||||
33 | L<Spreadsheet::ParseExcel::Workbook>, L<Spreadsheet::ParseExcel::Worksheet>, | ||||
34 | and L<Spreadsheet::ParseExcel::Cell>. | ||||
35 | |||||
36 | =cut | ||||
37 | |||||
38 | =method new(%opts) | ||||
39 | |||||
40 | Returns a new parser instance. Takes a hash of parameters: | ||||
41 | |||||
42 | =over 4 | ||||
43 | |||||
44 | =item Password | ||||
45 | |||||
46 | Password to use for decrypting encrypted files. | ||||
47 | |||||
48 | =back | ||||
49 | |||||
50 | =cut | ||||
51 | |||||
52 | sub new { # spent 2µs within Spreadsheet::ParseXLSX::new which was called:
# once (2µs+0s) by main::RUNTIME at line 11 of /home/micha/Projekt/spreadsheet-parsexlsx/t/bug-md-11.t | ||||
53 | 1 | 200ns | my $class = shift; | ||
54 | 1 | 400ns | my (%args) = @_; | ||
55 | |||||
56 | 1 | 500ns | my $self = bless {}, $class; | ||
57 | 1 | 200ns | $self->{Password} = $args{Password} if defined $args{Password}; | ||
58 | |||||
59 | 1 | 2µs | return $self; | ||
60 | } | ||||
61 | |||||
62 | =method parse($file, $formatter) | ||||
63 | |||||
64 | Parses an XLSX file. Parsing errors throw an exception. C<$file> can be either | ||||
65 | a filename or an open filehandle. Returns a | ||||
66 | L<Spreadsheet::ParseExcel::Workbook> instance containing the parsed data. | ||||
67 | The C<$formatter> argument is an optional formatter class as described in L<Spreadsheet::ParseExcel>. | ||||
68 | |||||
69 | =cut | ||||
70 | |||||
71 | sub parse { # spent 70.7s (51µs+70.7) within Spreadsheet::ParseXLSX::parse which was called:
# once (51µs+70.7s) by main::RUNTIME at line 11 of /home/micha/Projekt/spreadsheet-parsexlsx/t/bug-md-11.t | ||||
72 | 1 | 100ns | my $self = shift; | ||
73 | 1 | 400ns | my ($file, $formatter) = @_; | ||
74 | |||||
75 | 1 | 2µs | 1 | 16µs | my $zip = Archive::Zip->new; # spent 16µs making 1 call to Archive::Zip::new |
76 | 1 | 2µs | 1 | 2µs | my $workbook = Spreadsheet::ParseExcel::Workbook->new; # spent 2µs making 1 call to Spreadsheet::ParseExcel::Workbook::new |
77 | |||||
78 | 1 | 700ns | 1 | 98µs | if ($self->_check_signature($file)) { # spent 98µs making 1 call to Spreadsheet::ParseXLSX::_check_signature |
79 | my $decrypted_file = Spreadsheet::ParseXLSX::Decryptor->open($file, $self->{Password}); | ||||
80 | $file = $decrypted_file if $decrypted_file; | ||||
81 | } | ||||
82 | |||||
83 | 1 | 2µs | 1 | 400ns | if (openhandle($file)) { # spent 400ns making 1 call to Scalar::Util::openhandle |
84 | bless $file, 'IO::File' if ref($file) eq 'GLOB'; # sigh | ||||
85 | my $fh = | ||||
86 | ref($file) eq 'File::Temp' | ||||
87 | ? IO::File->new("<&=" . fileno($file)) | ||||
88 | : $file; | ||||
89 | $zip->readFromFileHandle($fh) == Archive::Zip::AZ_OK | ||||
90 | or die "Can't open filehandle as a zip file"; | ||||
91 | $workbook->{File} = undef; | ||||
92 | $workbook->{__tempfile} = $file; | ||||
93 | } elsif (ref($file) eq 'SCALAR') { | ||||
94 | open my $fh, '+<', $file | ||||
95 | or die "Can't create filehandle from memory data"; | ||||
96 | $zip->readFromFileHandle($fh) == Archive::Zip::AZ_OK | ||||
97 | or die "Can't open scalar ref as a zip file"; | ||||
98 | $workbook->{File} = undef; | ||||
99 | } elsif (!ref($file)) { | ||||
100 | 1 | 2µs | 1 | 851µs | $zip->read($file) == Archive::Zip::AZ_OK # spent 851µs making 1 call to Archive::Zip::Archive::read |
101 | or die "Can't open file '$file' as a zip file"; | ||||
102 | 1 | 2µs | $workbook->{File} = $file; | ||
103 | } else { | ||||
104 | die "First argument to 'parse' must be a filename, open filehandle, or scalar ref"; | ||||
105 | } | ||||
106 | |||||
107 | 1 | 48µs | 1 | 70.7s | return $self->_parse_workbook($zip, $workbook, $formatter); # spent 70.7s making 1 call to Spreadsheet::ParseXLSX::_parse_workbook |
108 | } | ||||
109 | |||||
110 | sub _check_signature { # spent 98µs (14+84) within Spreadsheet::ParseXLSX::_check_signature which was called:
# once (14µs+84µs) by Spreadsheet::ParseXLSX::parse at line 78 | ||||
111 | 1 | 100ns | my $self = shift; | ||
112 | 1 | 100ns | my ($file) = @_; | ||
113 | |||||
114 | 1 | 200ns | my $signature = ''; | ||
115 | 1 | 4µs | 1 | 1µs | if (openhandle($file)) { # spent 1µs making 1 call to Scalar::Util::openhandle |
116 | bless $file, 'IO::File' if ref($file) eq 'GLOB'; # sigh | ||||
117 | $file->read($signature, 2); | ||||
118 | $file->seek(-2, IO::File::SEEK_CUR); | ||||
119 | } elsif (ref($file) eq 'SCALAR') { | ||||
120 | $signature = substr($$file, 0, 2); | ||||
121 | } elsif (!ref($file)) { | ||||
122 | 1 | 2µs | 1 | 63µs | my $fh = IO::File->new($file, 'r'); # spent 63µs making 1 call to IO::File::new |
123 | 1 | 3µs | 1 | 13µs | $fh->read($signature, 2); # spent 13µs making 1 call to IO::Handle::read |
124 | 1 | 4µs | 1 | 7µs | $fh->close; # spent 7µs making 1 call to IO::Handle::close |
125 | } | ||||
126 | |||||
127 | 1 | 2µs | return $signature eq "\xd0\xcf"; | ||
128 | } | ||||
129 | |||||
130 | sub _parse_workbook { # spent 70.7s (671µs+70.7) within Spreadsheet::ParseXLSX::_parse_workbook which was called:
# once (671µs+70.7s) by Spreadsheet::ParseXLSX::parse at line 107 | ||||
131 | 1 | 100ns | my $self = shift; | ||
132 | 1 | 300ns | my ($zip, $workbook, $formatter) = @_; | ||
133 | |||||
134 | 1 | 1µs | 1 | 72.2ms | my $files = $self->_extract_files($zip); # spent 72.2ms making 1 call to Spreadsheet::ParseXLSX::_extract_files |
135 | |||||
136 | 1 | 2µs | 1 | 292µs | my ($version) = $files->{workbook}->find_nodes('//s:fileVersion'); # spent 292µs making 1 call to XML::Twig::get_xpath |
137 | 1 | 2µs | 1 | 245µs | my ($properties) = $files->{workbook}->find_nodes('//s:workbookPr'); # spent 245µs making 1 call to XML::Twig::get_xpath |
138 | |||||
139 | 1 | 4µs | 3 | 2µs | if ($version) { # spent 2µs making 3 calls to XML::Twig::Elt::att, avg 700ns/call |
140 | $workbook->{Version} = $version->att('appName') | ||||
141 | . ( | ||||
142 | $version->att('lowestEdited') | ||||
143 | ? ('-' . $version->att('lowestEdited')) | ||||
144 | : ("") | ||||
145 | ); | ||||
146 | } | ||||
147 | |||||
148 | 1 | 3µs | 2 | 2µs | $workbook->{Flg1904} = $self->_xml_boolean($properties->att('date1904')) # spent 2µs making 1 call to Spreadsheet::ParseXLSX::_xml_boolean
# spent 400ns making 1 call to XML::Twig::Elt::att |
149 | if $properties; | ||||
150 | |||||
151 | 1 | 6µs | 1 | 3µs | $workbook->{FmtClass} = $formatter || Spreadsheet::ParseExcel::FmtDefault->new; # spent 3µs making 1 call to Spreadsheet::ParseExcel::FmtDefault::new |
152 | |||||
153 | 1 | 3µs | 1 | 1.10ms | my $themes = $self->_parse_themes((values %{$files->{themes}})[0]); # XXX # spent 1.10ms making 1 call to Spreadsheet::ParseXLSX::_parse_themes |
154 | |||||
155 | 1 | 700ns | $workbook->{Color} = $themes->{Color}; | ||
156 | |||||
157 | 1 | 2µs | 1 | 10.1ms | my $styles = $self->_parse_styles($workbook, $files->{styles}); # spent 10.1ms making 1 call to Spreadsheet::ParseXLSX::_parse_styles |
158 | |||||
159 | 1 | 600ns | $workbook->{Format} = $styles->{Format}; | ||
160 | 1 | 1µs | $workbook->{FormatStr} = $styles->{FormatStr}; | ||
161 | 1 | 500ns | $workbook->{Font} = $styles->{Font}; | ||
162 | |||||
163 | 1 | 400ns | if ($files->{strings}) { | ||
164 | 1 | 3µs | 1 | 11.2ms | my %string_parse_data = $self->_parse_shared_strings($files->{strings}, $themes->{Color}); # spent 11.2ms making 1 call to Spreadsheet::ParseXLSX::_parse_shared_strings |
165 | 1 | 600ns | $workbook->{PkgStr} = $string_parse_data{PkgStr}; | ||
166 | 1 | 600ns | $workbook->{Rich} = $string_parse_data{Rich}; | ||
167 | } | ||||
168 | |||||
169 | # $workbook->{StandardWidth} = ...; | ||||
170 | |||||
171 | # $workbook->{Author} = ...; | ||||
172 | |||||
173 | # $workbook->{PrintArea} = ...; | ||||
174 | # $workbook->{PrintTitle} = ...; | ||||
175 | |||||
176 | my @sheets = map { | ||||
177 | 1 | 1µs | 1 | 1µs | my $idx = $_->att('rels:id'); # spent 1µs making 1 call to XML::Twig::Elt::att |
178 | 1 | 1µs | if ($files->{sheets}{$idx}) { | ||
179 | 1 | 5µs | 2 | 8µs | my $sheet = Spreadsheet::ParseExcel::Worksheet->new( # spent 8µs making 1 call to Spreadsheet::ParseExcel::Worksheet::new
# spent 500ns making 1 call to XML::Twig::Elt::att |
180 | Name => $_->att('name'), | ||||
181 | _Book => $workbook, | ||||
182 | _SheetNo => $idx, | ||||
183 | ); | ||||
184 | 1 | 900ns | 1 | 600ns | $sheet->{SheetHidden} = 1 if defined $_->att('state') and $_->att('state') eq 'hidden'; # spent 600ns making 1 call to XML::Twig::Elt::att |
185 | 1 | 2µs | 1 | 70.6s | $self->_parse_sheet($sheet, $files->{sheets}{$idx}); # spent 70.6s making 1 call to Spreadsheet::ParseXLSX::_parse_sheet |
186 | |||||
187 | # Do we have a rels for this sheet? | ||||
188 | 1 | 1µs | if ( $files->{sheets_rels} | ||
189 | && $files->{sheets_rels}{$idx}) | ||||
190 | { | ||||
191 | # Yes - now parse the rels to extract the hyperlinks | ||||
192 | $self->_parse_sheet_links($sheet, $files->{sheets}{$idx}, $files->{sheets_rels}{$idx}); | ||||
193 | } | ||||
194 | |||||
195 | 1 | 1µs | ($sheet); | ||
196 | } else { | ||||
197 | () | ||||
198 | } | ||||
199 | 1 | 3µs | 1 | 376µs | } $files->{workbook}->find_nodes('//s:sheets/s:sheet'); # spent 376µs making 1 call to XML::Twig::get_xpath |
200 | |||||
201 | 1 | 1µs | $workbook->{Worksheet} = \@sheets; | ||
202 | 1 | 3µs | $workbook->{SheetCount} = scalar(@sheets); | ||
203 | |||||
204 | 1 | 6µs | 1 | 463µs | my ($node) = $files->{workbook}->find_nodes('//s:workbookView'); # spent 463µs making 1 call to XML::Twig::get_xpath |
205 | 1 | 3µs | 1 | 2µs | my $selected = $node ? $node->att('activeTab') : undef; # spent 2µs making 1 call to XML::Twig::Elt::att |
206 | 1 | 800ns | $workbook->{SelectedSheet} = defined($selected) ? 0 + $selected : 0; | ||
207 | |||||
208 | 1 | 590µs | 3 | 354µs | return $workbook; # spent 354µs making 3 calls to XML::Twig::DESTROY, avg 118µs/call |
209 | } | ||||
210 | |||||
211 | sub _parse_sheet { # spent 70.6s (198ms+70.4) within Spreadsheet::ParseXLSX::_parse_sheet which was called:
# once (198ms+70.4s) by Spreadsheet::ParseXLSX::_parse_workbook at line 185 | ||||
212 | 1 | 200ns | my $self = shift; | ||
213 | 1 | 400ns | my ($sheet, $sheet_file) = @_; | ||
214 | |||||
215 | 1 | 2µs | $sheet->{MinRow} = 0; | ||
216 | 1 | 500ns | $sheet->{MinCol} = 0; | ||
217 | 1 | 500ns | $sheet->{MaxRow} = -1; | ||
218 | 1 | 700ns | $sheet->{MaxCol} = -1; | ||
219 | 1 | 900ns | $sheet->{Selection} = [0, 0]; | ||
220 | |||||
221 | 1 | 400ns | my @column_formats; | ||
222 | my @column_widths; | ||||
223 | my @columns_hidden; | ||||
224 | my @row_heights; | ||||
225 | my @rows_hidden; | ||||
226 | |||||
227 | 1 | 200ns | my $default_row_height = 15; | ||
228 | 1 | 200ns | my $default_column_width = 10; | ||
229 | |||||
230 | 1 | 300ns | my $row_idx = 0; | ||
231 | |||||
232 | my $sheet_xml = $self->_new_twig( | ||||
233 | twig_roots => { | ||||
234 | #XXX need a fallback here, the dimension tag is optional | ||||
235 | # spent 72µs (12+60) within Spreadsheet::ParseXLSX::__ANON__[lib/Spreadsheet/ParseXLSX.pm:246] which was called:
# once (12µs+60µs) by XML::Twig::_twig_end at line 2350 of XML/Twig.pm | ||||
236 | 1 | 700ns | my ($twig, $dimension) = @_; | ||
237 | |||||
238 | 1 | 4µs | 2 | 26µs | my ($rmin, $cmin, $rmax, $cmax) = $self->_dimensions($dimension->att('ref')); # spent 24µs making 1 call to Spreadsheet::ParseXLSX::_dimensions
# spent 2µs making 1 call to XML::Twig::Elt::att |
239 | |||||
240 | 1 | 700ns | $sheet->{MinRow} = $rmin; | ||
241 | 1 | 800ns | $sheet->{MinCol} = $cmin; | ||
242 | 1 | 500ns | $sheet->{MaxRow} = $rmax ? $rmax : -1; | ||
243 | 1 | 600ns | $sheet->{MaxCol} = $cmax ? $cmax : -1; | ||
244 | |||||
245 | 1 | 3µs | 1 | 35µs | $twig->purge; # spent 35µs making 1 call to XML::Twig::purge |
246 | }, | ||||
247 | |||||
248 | 's:headerFooter' => sub { | ||||
249 | my ($twig, $hf) = @_; | ||||
250 | |||||
251 | my ($helem, $felem) = map { $hf->first_child("s:$_") } qw(oddHeader oddFooter); | ||||
252 | $sheet->{Header} = $helem->text | ||||
253 | if $helem; | ||||
254 | $sheet->{Footer} = $felem->text | ||||
255 | if $felem; | ||||
256 | |||||
257 | $twig->purge; | ||||
258 | }, | ||||
259 | |||||
260 | # spent 50µs (22+28) within Spreadsheet::ParseXLSX::__ANON__[lib/Spreadsheet/ParseXLSX.pm:268] which was called:
# once (22µs+28µs) by XML::Twig::_twig_end at line 2350 of XML/Twig.pm | ||||
261 | 1 | 500ns | my ($twig, $margin) = @_; | ||
262 | map { | ||||
263 | 7 | 5µs | my $key = "\u${_}Margin"; | ||
264 | 6 | 8µs | 12 | 4µs | $sheet->{$key} = defined $margin->att($_) ? $margin->att($_) : 0 # spent 4µs making 12 calls to XML::Twig::Elt::att, avg 367ns/call |
265 | } qw(left right top bottom header footer); | ||||
266 | |||||
267 | 1 | 3µs | 1 | 24µs | $twig->purge; # spent 24µs making 1 call to XML::Twig::purge |
268 | }, | ||||
269 | |||||
270 | 's:pageSetup' => sub { | ||||
271 | my ($twig, $setup) = @_; | ||||
272 | $sheet->{Scale} = | ||||
273 | defined $setup->att('scale') | ||||
274 | ? $setup->att('scale') | ||||
275 | : 100; | ||||
276 | $sheet->{Landscape} = ($setup->att('orientation') || '') ne 'landscape'; | ||||
277 | $sheet->{PaperSize} = | ||||
278 | defined $setup->att('paperSize') | ||||
279 | ? $setup->att('paperSize') | ||||
280 | : 1; | ||||
281 | $sheet->{PageStart} = $setup->att('firstPageNumber'); | ||||
282 | $sheet->{UsePage} = $self->_xml_boolean($setup->att('useFirstPageNumber')); | ||||
283 | $sheet->{HorizontalDPI} = $setup->att('horizontalDpi'); | ||||
284 | $sheet->{VerticalDPI} = $setup->att('verticalDpi'); | ||||
285 | |||||
286 | $twig->purge; | ||||
287 | }, | ||||
288 | |||||
289 | # spent 960ms (218+742) within Spreadsheet::ParseXLSX::__ANON__[lib/Spreadsheet/ParseXLSX.pm:302] which was called 18180 times, avg 53µs/call:
# 18180 times (218ms+742ms) by XML::Twig::_twig_end at line 2350 of XML/Twig.pm, avg 53µs/call | ||||
290 | 18180 | 3.06ms | my ($twig, $merge_area) = @_; | ||
291 | |||||
292 | 18180 | 15.4ms | 18180 | 12.4ms | if (my $ref = $merge_area->att('ref')) { # spent 12.4ms making 18180 calls to XML::Twig::Elt::att, avg 684ns/call |
293 | 18180 | 50.7ms | 18180 | 27.8ms | my ($topleft, $bottomright) = $ref =~ /([^:]+):([^:]+)/; # spent 27.8ms making 18180 calls to CORE::match, avg 2µs/call |
294 | |||||
295 | 18180 | 13.9ms | 18180 | 80.2ms | my ($toprow, $leftcol) = $self->_cell_to_row_col($topleft); # spent 80.2ms making 18180 calls to Spreadsheet::ParseXLSX::_cell_to_row_col, avg 4µs/call |
296 | 18180 | 7.95ms | 18180 | 53.1ms | my ($bottomrow, $rightcol) = $self->_cell_to_row_col($bottomright); # spent 53.1ms making 18180 calls to Spreadsheet::ParseXLSX::_cell_to_row_col, avg 3µs/call |
297 | |||||
298 | 18180 | 15.5ms | push @{$sheet->{MergedArea}}, [$toprow, $leftcol, $bottomrow, $rightcol,]; | ||
299 | } | ||||
300 | |||||
301 | 18180 | 68.9ms | 18180 | 568ms | $twig->purge; # spent 568ms making 18180 calls to XML::Twig::purge, avg 31µs/call |
302 | }, | ||||
303 | |||||
304 | # spent 27µs (4+23) within Spreadsheet::ParseXLSX::__ANON__[lib/Spreadsheet/ParseXLSX.pm:313] which was called:
# once (4µs+23µs) by XML::Twig::_twig_end at line 2350 of XML/Twig.pm | ||||
305 | 1 | 300ns | my ($twig, $format) = @_; | ||
306 | |||||
307 | 1 | 400ns | $default_row_height = $format->att('defaultRowHeight') | ||
308 | unless defined $default_row_height; | ||||
309 | 1 | 300ns | $default_column_width = $format->att('baseColWidth') | ||
310 | unless defined $default_column_width; | ||||
311 | |||||
312 | 1 | 3µs | 1 | 23µs | $twig->purge; # spent 23µs making 1 call to XML::Twig::purge |
313 | }, | ||||
314 | |||||
315 | 's:col' => sub { | ||||
316 | my ($twig, $col) = @_; | ||||
317 | |||||
318 | for my $colnum ($col->att('min') .. $col->att('max')) { | ||||
319 | $column_widths[$colnum - 1] = $col->att('width'); | ||||
320 | $column_formats[$colnum - 1] = $col->att('style'); | ||||
321 | $columns_hidden[$colnum - 1] = $self->_xml_boolean($col->att('hidden')); | ||||
322 | } | ||||
323 | |||||
324 | $twig->purge; | ||||
325 | }, | ||||
326 | |||||
327 | # spent 49µs (9+39) within Spreadsheet::ParseXLSX::__ANON__[lib/Spreadsheet/ParseXLSX.pm:338] which was called:
# once (9µs+39µs) by XML::Twig::_twig_end at line 2350 of XML/Twig.pm | ||||
328 | 1 | 400ns | my ($twig, $selection) = @_; | ||
329 | |||||
330 | 1 | 4µs | 2 | 14µs | if (my $cell = $selection->att('activeCell')) { # spent 13µs making 1 call to Spreadsheet::ParseXLSX::_cell_to_row_col
# spent 800ns making 1 call to XML::Twig::Elt::att |
331 | $sheet->{Selection} = [$self->_cell_to_row_col($cell)]; | ||||
332 | } elsif (my $range = $selection->att('sqref')) { | ||||
333 | my ($topleft, $bottomright) = $range =~ /([^:]+):([^:]+)/; | ||||
334 | $sheet->{Selection} = [$self->_cell_to_row_col($topleft), $self->_cell_to_row_col($bottomright),]; | ||||
335 | } | ||||
336 | |||||
337 | 1 | 2µs | 1 | 25µs | $twig->purge; # spent 25µs making 1 call to XML::Twig::purge |
338 | }, | ||||
339 | |||||
340 | 's:sheetPr/s:tabColor' => sub { | ||||
341 | my ($twig, $tab_color) = @_; | ||||
342 | |||||
343 | $sheet->{TabColor} = $self->_color($sheet->{_Book}{Color}, $tab_color); | ||||
344 | |||||
345 | $twig->purge; | ||||
346 | }, | ||||
347 | |||||
348 | # spent 12.9s (5.52+7.35) within Spreadsheet::ParseXLSX::__ANON__[lib/Spreadsheet/ParseXLSX.pm:443] which was called 15608 times, avg 825µs/call:
# 15608 times (5.52s+7.35s) by XML::Twig::_twig_end at line 2350 of XML/Twig.pm, avg 825µs/call | ||||
349 | 15608 | 3.21ms | my ($twig, $row_elt) = @_; | ||
350 | 15608 | 13.4ms | 15608 | 12.8ms | my $explicit_row_idx = $row_elt->att('r'); # spent 12.8ms making 15608 calls to XML::Twig::Elt::att, avg 820ns/call |
351 | 15608 | 8.87ms | $row_idx = $explicit_row_idx - 1 if defined $explicit_row_idx; | ||
352 | |||||
353 | 15608 | 9.81ms | 15608 | 5.77ms | $row_heights[$row_idx] = $row_elt->att('ht'); # spent 5.77ms making 15608 calls to XML::Twig::Elt::att, avg 369ns/call |
354 | 15608 | 22.4ms | 31216 | 18.8ms | $rows_hidden[$row_idx] = $self->_xml_boolean($row_elt->att('hidden')); # spent 13.3ms making 15608 calls to Spreadsheet::ParseXLSX::_xml_boolean, avg 855ns/call
# spent 5.46ms making 15608 calls to XML::Twig::Elt::att, avg 350ns/call |
355 | |||||
356 | 15608 | 2.34ms | my $col_idx = 0; | ||
357 | 15608 | 18.9ms | 15608 | 1.57s | for my $cell ($row_elt->children('s:c')) { # spent 1.57s making 15608 calls to XML::Twig::Elt::children, avg 101µs/call |
358 | 202907 | 104ms | 202907 | 100ms | my $loc = $cell->att('r'); # spent 100ms making 202907 calls to XML::Twig::Elt::att, avg 493ns/call |
359 | 202907 | 13.9ms | my ($row, $col); | ||
360 | 202907 | 39.1ms | if ($loc) { | ||
361 | 202907 | 127ms | 202907 | 903ms | ($row, $col) = $self->_cell_to_row_col($loc); # spent 903ms making 202907 calls to Spreadsheet::ParseXLSX::_cell_to_row_col, avg 4µs/call |
362 | 202907 | 23.6ms | if ($row != $row_idx) { | ||
363 | warn "mismatched coords: got $loc for cell in row $row_idx"; | ||||
364 | } | ||||
365 | 202907 | 25.8ms | $col_idx = $col - 1; | ||
366 | } else { | ||||
367 | ($row, $col) = ($row_idx, $col_idx); | ||||
368 | } | ||||
369 | $sheet->{MaxRow} = $row | ||||
370 | 202907 | 35.1ms | if $sheet->{MaxRow} < $row; | ||
371 | $sheet->{MaxCol} = $col | ||||
372 | 202907 | 24.5ms | if $sheet->{MaxCol} < $col; | ||
373 | 202907 | 99.2ms | 202907 | 68.3ms | my $type = $cell->att('t') || 'n'; # spent 68.3ms making 202907 calls to XML::Twig::Elt::att, avg 336ns/call |
374 | 202907 | 13.7ms | my $val_xml; | ||
375 | 202907 | 102ms | 202907 | 825ms | if ($type ne 'inlineStr') { # spent 825ms making 202907 calls to XML::Twig::Elt::first_child, avg 4µs/call |
376 | $val_xml = $cell->first_child('s:v'); | ||||
377 | } elsif (defined $cell->first_child('s:is')) { | ||||
378 | $val_xml = ($cell->find_nodes('.//s:t'))[0]; | ||||
379 | } | ||||
380 | 202907 | 87.2ms | 127276 | 678ms | my $val = $val_xml ? $val_xml->text : undef; # spent 678ms making 127276 calls to XML::Twig::Elt::text, avg 5µs/call |
381 | |||||
382 | 202907 | 15.6ms | my $long_type; | ||
383 | my $Rich; | ||||
384 | 202907 | 51.4ms | if (!defined($val)) { | ||
385 | 75631 | 7.31ms | $long_type = 'Text'; | ||
386 | 75631 | 7.09ms | $val = ''; | ||
387 | } elsif ($type eq 's') { | ||||
388 | 127276 | 11.8ms | $long_type = 'Text'; | ||
389 | 127276 | 60.3ms | $Rich = $sheet->{_Book}{Rich}->{$val}; | ||
390 | 127276 | 42.5ms | $val = $sheet->{_Book}{PkgStr}[$val]; | ||
391 | } elsif ($type eq 'n') { | ||||
392 | $long_type = 'Numeric'; | ||||
393 | $val = defined($val) && $val ne '' ? 0 + $val : undef; | ||||
394 | } elsif ($type eq 'd') { | ||||
395 | $long_type = 'Date'; | ||||
396 | } elsif ($type eq 'b') { | ||||
397 | $long_type = 'Text'; | ||||
398 | $val = $val ? "TRUE" : "FALSE"; | ||||
399 | } elsif ($type eq 'e') { | ||||
400 | $long_type = 'Text'; | ||||
401 | } elsif ($type eq 'str' || $type eq 'inlineStr') { | ||||
402 | $long_type = 'Text'; | ||||
403 | } else { | ||||
404 | die "unimplemented type $type"; # XXX | ||||
405 | } | ||||
406 | |||||
407 | 202907 | 91.8ms | 202907 | 73.0ms | my $format_idx = $cell->att('s') || 0; # spent 73.0ms making 202907 calls to XML::Twig::Elt::att, avg 360ns/call |
408 | 202907 | 52.8ms | my $format = $sheet->{_Book}{Format}[$format_idx]; | ||
409 | 202907 | 71.1ms | my $formatstr = $sheet->{_Book}{FormatStr}{$format->{FmtIdx}}; | ||
410 | 202907 | 16.0ms | die "unknown format $format_idx" unless $format; | ||
411 | |||||
412 | # see the list of built-in formats below in _parse_styles | ||||
413 | # XXX probably should figure this out from the actual format string, | ||||
414 | # but that's not entirely trivial | ||||
415 | 202907 | 314ms | if (grep { $format->{FmtIdx} == $_ } 14 .. 22, 45 .. 47) { | ||
416 | $long_type = 'Date'; | ||||
417 | } | ||||
418 | |||||
419 | 202907 | 496ms | 202907 | 180ms | if ($formatstr =~ /\b(mmm|m|d|yy|h|hh|mm|ss)\b/i) { # spent 180ms making 202907 calls to CORE::match, avg 886ns/call |
420 | $long_type = 'Date'; | ||||
421 | } | ||||
422 | |||||
423 | 202907 | 86.3ms | 202907 | 1.14s | my $formula = $cell->first_child('s:f'); # spent 1.14s making 202907 calls to XML::Twig::Elt::first_child, avg 6µs/call |
424 | 202907 | 147ms | 202907 | 362ms | my $cell = Spreadsheet::ParseXLSX::Cell->new( # spent 362ms making 202907 calls to Spreadsheet::ParseExcel::Cell::new, avg 2µs/call |
425 | Val => $val, | ||||
426 | Type => $long_type, | ||||
427 | Format => $format, | ||||
428 | FormatNo => $format_idx, | ||||
429 | ( | ||||
430 | $formula | ||||
431 | ? (Formula => $formula->text) | ||||
432 | : () | ||||
433 | ), | ||||
434 | Rich => $Rich, | ||||
435 | ); | ||||
436 | 202907 | 164ms | 202907 | 810ms | $cell->{_Value} = $sheet->{_Book}{FmtClass}->ValFmt($cell, $sheet->{_Book}); # spent 810ms making 202907 calls to Spreadsheet::ParseExcel::FmtDefault::ValFmt, avg 4µs/call |
437 | 202907 | 78.1ms | $sheet->{Cells}[$row][$col] = $cell; | ||
438 | 202907 | 134ms | $col_idx++; | ||
439 | } | ||||
440 | |||||
441 | 15608 | 11.1ms | 15608 | 604ms | $twig->purge; # spent 604ms making 15608 calls to XML::Twig::purge, avg 39µs/call |
442 | 15608 | 56.2ms | $row_idx++; | ||
443 | }, | ||||
444 | } | ||||
445 | 1 | 18µs | 1 | 4.90ms | ); # spent 4.90ms making 1 call to Spreadsheet::ParseXLSX::_new_twig |
446 | |||||
447 | 1 | 900ns | 1 | 70.4s | $sheet_xml->parse($sheet_file); # spent 70.4s making 1 call to XML::Twig::parse |
448 | |||||
449 | 1 | 1µs | if ($sheet->{Cells}) { | ||
450 | # SMELL: we have to connect cells to their sheet as well as their position | ||||
451 | 1 | 2µs | for my $r (0 .. $#{$sheet->{Cells}}) { | ||
452 | 15608 | 2.69ms | my $row = $sheet->{Cells}[$r] or next; | ||
453 | 15608 | 7.39ms | for my $c (0 .. $#$row) { | ||
454 | 232453 | 35.0ms | my $cell = $row->[$c] or next; | ||
455 | 202907 | 77.6ms | $cell->{Sheet} = $sheet; | ||
456 | 202907 | 26.6ms | $cell->{Row} = $r; | ||
457 | 202907 | 45.5ms | $cell->{Col} = $c; | ||
458 | } | ||||
459 | } | ||||
460 | } else { | ||||
461 | $sheet->{MaxRow} = $sheet->{MaxCol} = -1; | ||||
462 | } | ||||
463 | |||||
464 | 1 | 2µs | $sheet->{DefRowHeight} = 0 + $default_row_height; | ||
465 | 1 | 1µs | $sheet->{DefColWidth} = 0 + $default_column_width; | ||
466 | 1 | 3.08ms | $sheet->{RowHeight} = [map { defined $_ ? 0 + $_ : 0 + $default_row_height } @row_heights]; | ||
467 | 1 | 2µs | $sheet->{RowHidden} = \@rows_hidden; | ||
468 | 1 | 2µs | $sheet->{ColWidth} = [map { defined $_ ? 0 + $_ : 0 + $default_column_width } @column_widths]; | ||
469 | 1 | 3µs | $sheet->{ColFmtNo} = \@column_formats; | ||
470 | 1 | 12µs | $sheet->{ColHidden} = \@columns_hidden; | ||
471 | |||||
472 | } | ||||
473 | |||||
474 | sub _parse_sheet_links { | ||||
475 | my $self = shift; | ||||
476 | my ($sheet, $sheet_file, $rels_file) = @_; | ||||
477 | |||||
478 | # First we need to parse the hyperlinks out of the rels XML | ||||
479 | my $rels; | ||||
480 | |||||
481 | my $rels_xml = XML::Twig->new( | ||||
482 | twig_roots => { | ||||
483 | 'Relationships/Relationship' => sub { | ||||
484 | my $twig = shift; | ||||
485 | my $relationship = shift; | ||||
486 | |||||
487 | if ( $relationship->att('Type') eq 'http://schemas.openxmlformats.org/officeDocument/2006/relationships/hyperlink' | ||||
488 | && $relationship->att('TargetMode') eq 'External') | ||||
489 | { | ||||
490 | # Store the target URL in a hash by relationship id | ||||
491 | $rels->{$relationship->att('Id')} = $relationship->att('Target'); | ||||
492 | } | ||||
493 | |||||
494 | $twig->purge; | ||||
495 | }, | ||||
496 | }, | ||||
497 | ); | ||||
498 | |||||
499 | # Run the parser | ||||
500 | $rels_xml->parse($rels_file); | ||||
501 | |||||
502 | # Now iterate over the sheet XML again, this time processing hyperlink entries | ||||
503 | my $sheet_xml = XML::Twig->new( | ||||
504 | twig_roots => { | ||||
505 | 'hyperlinks/hyperlink' => sub { | ||||
506 | my $twig = shift; | ||||
507 | my $hyperlink = shift; | ||||
508 | |||||
509 | # Work out our row and column | ||||
510 | my ($row, $col) = $self->_cell_to_row_col($hyperlink->att('ref')); | ||||
511 | |||||
512 | # Get the cell | ||||
513 | my $cell = $sheet->{Cells}[$row][$col]; | ||||
514 | |||||
515 | # Do I have a cell? | ||||
516 | unless ($cell) { | ||||
517 | # No - just create an empty value for now | ||||
518 | $cell = $sheet->{Cells}[$row][$col] = Spreadsheet::ParseXLSX::Cell->new(); | ||||
519 | } | ||||
520 | |||||
521 | # Is this an external hyperlink I've parsed from the rels? | ||||
522 | if ( $hyperlink->att('r:id') | ||||
523 | && $rels | ||||
524 | && $rels->{$hyperlink->att('r:id')}) | ||||
525 | { | ||||
526 | # Yes - Check if we need to frig our destination a bit | ||||
527 | my $destination_url = sprintf('%s%s%s', $rels->{$hyperlink->att('r:id')}, $hyperlink->att('location') ? '#' : '', $hyperlink->att('location') || '',); | ||||
528 | |||||
529 | # Add the hyperlink | ||||
530 | $cell->{Hyperlink} = [ | ||||
531 | $hyperlink->att('display') || $cell->{_Value} || undef, # Description | ||||
532 | $destination_url, # Target | ||||
533 | undef, # Target Frame | ||||
534 | $row, # Start Row | ||||
535 | $row, # End Row | ||||
536 | $col, # Start Column | ||||
537 | $col, # End Column | ||||
538 | ]; | ||||
539 | } else { | ||||
540 | # This is an internal hyperlink | ||||
541 | |||||
542 | # Add the hyperlink | ||||
543 | $cell->{Hyperlink} = [ | ||||
544 | $hyperlink->att('display') || $cell->{_Value} || undef, # Description | ||||
545 | $hyperlink->att('location'), # Target | ||||
546 | undef, # Target Frame | ||||
547 | $row, # Start Row | ||||
548 | $row, # End Row | ||||
549 | $col, # Start Column | ||||
550 | $col, # End Column | ||||
551 | ]; | ||||
552 | } | ||||
553 | |||||
554 | $twig->purge; | ||||
555 | }, | ||||
556 | }, | ||||
557 | ); | ||||
558 | |||||
559 | # Now parse the XML | ||||
560 | $sheet_xml->parse($sheet_file); | ||||
561 | } | ||||
562 | |||||
563 | # spent 280µs (105+175) within Spreadsheet::ParseXLSX::_get_text_and_rich_font_by_cell which was called 15 times, avg 19µs/call:
# 15 times (105µs+175µs) by Spreadsheet::ParseXLSX::__ANON__[lib/Spreadsheet/ParseXLSX.pm:655] at line 651, avg 19µs/call
sub _get_text_and_rich_font_by_cell { | ||||
564 | 15 | 2µs | my $self = shift; | ||
565 | 15 | 2µs | my ($si, $theme_colors) = @_; | ||
566 | |||||
567 | # XXX | ||||
568 | 15 | 22µs | my %default_font_opts = ( | ||
569 | Height => 12, | ||||
570 | Color => '#000000', | ||||
571 | Name => '', | ||||
572 | Bold => 0, | ||||
573 | Italic => 0, | ||||
574 | Underline => 0, | ||||
575 | UnderlineStyle => 0, | ||||
576 | Strikeout => 0, | ||||
577 | Super => 0, | ||||
578 | ); | ||||
579 | |||||
580 | 15 | 2µs | my $string_text = ''; | ||
581 | 15 | 1µs | my @rich_font_by_cell; | ||
582 | 15 | 12µs | 15 | 65µs | for my $subnode ($si->children) { # spent 65µs making 15 calls to XML::Twig::Elt::children, avg 4µs/call |
583 | 15 | 25µs | 30 | 110µs | if ($subnode->name eq 's:t') { # spent 100µs making 15 calls to XML::Twig::Elt::text, avg 7µs/call
# spent 10µs making 15 calls to XML::Twig::Elt::gi, avg 660ns/call |
584 | $string_text .= $subnode->text; | ||||
585 | } elsif ($subnode->name eq 's:r') { | ||||
586 | for my $chunk ($subnode->children) { | ||||
587 | my $string_length = length($string_text); | ||||
588 | if ($chunk->name eq 's:t') { | ||||
589 | if (!@rich_font_by_cell) { | ||||
590 | push @rich_font_by_cell, [$string_length, Spreadsheet::ParseExcel::Font->new(%default_font_opts)]; | ||||
591 | } | ||||
592 | $string_text .= $chunk->text; | ||||
593 | } elsif ($chunk->name eq 's:rPr') { | ||||
594 | my %format_text = %default_font_opts; | ||||
595 | for my $node_format ($chunk->children) { | ||||
596 | if ($node_format->name eq 's:sz') { | ||||
597 | $format_text{Height} = $node_format->att('val'); | ||||
598 | } elsif ($node_format->name eq 's:color') { | ||||
599 | $format_text{Color} = $self->_color($theme_colors, $node_format); | ||||
600 | } elsif ($node_format->name eq 's:rFont') { | ||||
601 | $format_text{Name} = $node_format->att('val'); | ||||
602 | } elsif ($node_format->name eq 's:b') { | ||||
603 | $format_text{Bold} = 1; | ||||
604 | } elsif ($node_format->name eq 's:i') { | ||||
605 | $format_text{Italic} = 1; | ||||
606 | } elsif ($node_format->name eq 's:u') { | ||||
607 | $format_text{Underline} = 1; | ||||
608 | if (defined $node_format->att('val')) { | ||||
609 | $format_text{UnderlineStyle} = 2; | ||||
610 | } else { | ||||
611 | $format_text{UnderlineStyle} = 1; | ||||
612 | } | ||||
613 | } elsif ($node_format->name eq 's:strike') { | ||||
614 | $format_text{Strikeout} = 1; | ||||
615 | } elsif ($node_format->name eq 's:vertAlign') { | ||||
616 | if ($node_format->att('val') eq 'superscript') { | ||||
617 | $format_text{Super} = 1; | ||||
618 | } elsif ($node_format->att('val') eq 'subscript') { | ||||
619 | $format_text{Super} = 2; | ||||
620 | } | ||||
621 | } | ||||
622 | } | ||||
623 | push @rich_font_by_cell, [$string_length, Spreadsheet::ParseExcel::Font->new(%format_text)]; | ||||
624 | } | ||||
625 | } | ||||
626 | } else { | ||||
627 | # $subnode->name is either 's:rPh' or 's:phoneticPr' | ||||
628 | # We ignore phonetic information and do nothing. | ||||
629 | } | ||||
630 | } | ||||
631 | |||||
632 | return ( | ||||
633 | 15 | 24µs | String => $string_text, | ||
634 | Rich => \@rich_font_by_cell, | ||||
635 | ); | ||||
636 | } | ||||
637 | |||||
638 | # spent 11.2ms (10µs+11.1) within Spreadsheet::ParseXLSX::_parse_shared_strings which was called:
# once (10µs+11.1ms) by Spreadsheet::ParseXLSX::_parse_workbook at line 164
sub _parse_shared_strings { | ||||
639 | 1 | 300ns | my $self = shift; | ||
640 | 1 | 400ns | my ($strings, $theme_colors) = @_; | ||
641 | |||||
642 | 1 | 400ns | my $PkgStr = []; | ||
643 | |||||
644 | 1 | 200ns | my %richfonts; | ||
645 | 1 | 400ns | if ($strings) { | ||
646 | my $xml = $self->_new_twig( | ||||
647 | twig_handlers => { | ||||
648 | # spent 715µs (68+647) within Spreadsheet::ParseXLSX::__ANON__[lib/Spreadsheet/ParseXLSX.pm:655] which was called 15 times, avg 48µs/call:
# 15 times (68µs+647µs) by XML::Twig::_twig_end at line 2350 of XML/Twig.pm, avg 48µs/call | ||||
649 | 15 | 2µs | my ($twig, $si) = @_; | ||
650 | |||||
651 | 15 | 16µs | 15 | 280µs | my %text_rich = $self->_get_text_and_rich_font_by_cell($si, $theme_colors); # spent 280µs making 15 calls to Spreadsheet::ParseXLSX::_get_text_and_rich_font_by_cell, avg 19µs/call |
652 | 15 | 10µs | $richfonts{scalar @$PkgStr} = $text_rich{Rich}; | ||
653 | 15 | 6µs | push @$PkgStr, $text_rich{String}; | ||
654 | 15 | 24µs | 15 | 367µs | $twig->purge; # spent 367µs making 15 calls to XML::Twig::purge, avg 24µs/call |
655 | }, | ||||
656 | } | ||||
657 | 1 | 3µs | 1 | 3.96ms | ); # spent 3.96ms making 1 call to Spreadsheet::ParseXLSX::_new_twig |
658 | 1 | 1µs | 1 | 7.19ms | $xml->parse($strings); # spent 7.19ms making 1 call to XML::Twig::parse |
659 | } | ||||
660 | return ( | ||||
661 | 1 | 2µs | Rich => \%richfonts, | ||
662 | PkgStr => $PkgStr, | ||||
663 | ); | ||||
664 | } | ||||
665 | |||||
666 | # spent 1.10ms (28µs+1.08) within Spreadsheet::ParseXLSX::_parse_themes which was called:
# once (28µs+1.08ms) by Spreadsheet::ParseXLSX::_parse_workbook at line 153
sub _parse_themes { | ||||
667 | 1 | 300ns | my $self = shift; | ||
668 | 1 | 400ns | my ($themes) = @_; | ||
669 | |||||
670 | 1 | 400ns | return {} unless $themes; | ||
671 | |||||
672 | 13 | 12µs | 25 | 1.08ms | my @color = map { $_->name eq 'drawmain:sysClr' ? $_->att('lastClr') : $_->att('val') } $themes->find_nodes('//drawmain:clrScheme/*/*'); # spent 1.06ms making 1 call to XML::Twig::get_xpath
# spent 7µs making 12 calls to XML::Twig::Elt::att, avg 600ns/call
# spent 4µs making 12 calls to XML::Twig::Elt::gi, avg 292ns/call |
673 | |||||
674 | # this shouldn't be necessary, but the documentation is wrong here | ||||
675 | # see http://stackoverflow.com/questions/2760976/theme-confusion-in-spreadsheetml | ||||
676 | 1 | 900ns | ($color[0], $color[1]) = ($color[1], $color[0]); | ||
677 | 1 | 800ns | ($color[2], $color[3]) = ($color[3], $color[2]); | ||
678 | |||||
679 | 1 | 3µs | return {Color => \@color,}; | ||
680 | } | ||||
681 | |||||
682 | # spent 10.1ms (648µs+9.48) within Spreadsheet::ParseXLSX::_parse_styles which was called:
# once (648µs+9.48ms) by Spreadsheet::ParseXLSX::_parse_workbook at line 157
sub _parse_styles { | ||||
683 | 1 | 300ns | my $self = shift; | ||
684 | 1 | 400ns | my ($workbook, $styles) = @_; | ||
685 | |||||
686 | # these defaults are from | ||||
687 | # http://social.msdn.microsoft.com/Forums/en-US/oxmlsdk/thread/e27aaf16-b900-4654-8210-83c5774a179c | ||||
688 | 1 | 16µs | my %default_format_str = ( | ||
689 | 0 => 'GENERAL', | ||||
690 | 1 => '0', | ||||
691 | 2 => '0.00', | ||||
692 | 3 => '#,##0', | ||||
693 | 4 => '#,##0.00', | ||||
694 | 5 => '$#,##0_);($#,##0)', | ||||
695 | 6 => '$#,##0_);[Red]($#,##0)', | ||||
696 | 7 => '$#,##0.00_);($#,##0.00)', | ||||
697 | 8 => '$#,##0.00_);[Red]($#,##0.00)', | ||||
698 | 9 => '0%', | ||||
699 | 10 => '0.00%', | ||||
700 | 11 => '0.00E+00', | ||||
701 | 12 => '# ?/?', | ||||
702 | 13 => '# ??/??', | ||||
703 | 14 => 'm/d/yyyy', | ||||
704 | 15 => 'd-mmm-yy', | ||||
705 | 16 => 'd-mmm', | ||||
706 | 17 => 'mmm-yy', | ||||
707 | 18 => 'h:mm AM/PM', | ||||
708 | 19 => 'h:mm:ss AM/PM', | ||||
709 | 20 => 'h:mm', | ||||
710 | 21 => 'h:mm:ss', | ||||
711 | 22 => 'm/d/yyyy h:mm', | ||||
712 | 37 => '#,##0_);(#,##0)', | ||||
713 | 38 => '#,##0_);[Red](#,##0)', | ||||
714 | 39 => '#,##0.00_);(#,##0.00)', | ||||
715 | 40 => '#,##0.00_);[Red](#,##0.00)', | ||||
716 | 45 => 'mm:ss', | ||||
717 | 46 => '[h]:mm:ss', | ||||
718 | 47 => 'mm:ss.0', | ||||
719 | 48 => '##0.0E+0', | ||||
720 | 49 => '@', | ||||
721 | ); | ||||
722 | |||||
723 | 1 | 8µs | my %default_format_opts = ( | ||
724 | IgnoreFont => 1, | ||||
725 | IgnoreFill => 1, | ||||
726 | IgnoreBorder => 1, | ||||
727 | IgnoreAlignment => 1, | ||||
728 | IgnoreNumberFormat => 1, | ||||
729 | IgnoreProtection => 1, | ||||
730 | FontNo => 0, | ||||
731 | FmtIdx => 0, | ||||
732 | Lock => 1, | ||||
733 | Hidden => 0, | ||||
734 | AlignH => 0, | ||||
735 | Wrap => 0, | ||||
736 | AlignV => 2, | ||||
737 | Rotate => 0, | ||||
738 | Indent => 0, | ||||
739 | Shrink => 0, | ||||
740 | BdrStyle => [0, 0, 0, 0], | ||||
741 | BdrColor => [undef, undef, undef, undef], | ||||
742 | BdrDiag => [0, 0, undef], | ||||
743 | Fill => [0, undef, undef], | ||||
744 | ); | ||||
745 | |||||
746 | 1 | 300ns | if (!$styles) { | ||
747 | # XXX i guess? | ||||
748 | my $font = Spreadsheet::ParseExcel::Font->new( | ||||
749 | Height => 12, | ||||
750 | Color => '#000000', | ||||
751 | Name => '', | ||||
752 | ); | ||||
753 | my $format = Spreadsheet::ParseExcel::Format->new(%default_format_opts, Font => $font,); | ||||
754 | |||||
755 | return { | ||||
756 | FormatStr => \%default_format_str, | ||||
757 | Font => [$font], | ||||
758 | Format => [$format], | ||||
759 | }; | ||||
760 | } | ||||
761 | |||||
762 | 1 | 3µs | my %halign = ( | ||
763 | center => 2, | ||||
764 | centerContinuous => 6, | ||||
765 | distributed => 7, | ||||
766 | fill => 4, | ||||
767 | general => 0, | ||||
768 | justify => 5, | ||||
769 | left => 1, | ||||
770 | right => 3, | ||||
771 | ); | ||||
772 | |||||
773 | 1 | 2µs | my %valign = ( | ||
774 | bottom => 2, | ||||
775 | center => 1, | ||||
776 | distributed => 4, | ||||
777 | justify => 3, | ||||
778 | top => 0, | ||||
779 | ); | ||||
780 | |||||
781 | 1 | 4µs | my %border = ( | ||
782 | dashDot => 9, | ||||
783 | dashDotDot => 11, | ||||
784 | dashed => 3, | ||||
785 | dotted => 4, | ||||
786 | double => 6, | ||||
787 | hair => 7, | ||||
788 | medium => 2, | ||||
789 | mediumDashDot => 10, | ||||
790 | mediumDashDotDot => 12, | ||||
791 | mediumDashed => 8, | ||||
792 | none => 0, | ||||
793 | slantDashDot => 13, | ||||
794 | thick => 5, | ||||
795 | thin => 1, | ||||
796 | ); | ||||
797 | |||||
798 | 1 | 7µs | my %fill = ( | ||
799 | darkDown => 7, | ||||
800 | darkGray => 3, | ||||
801 | darkGrid => 9, | ||||
802 | darkHorizontal => 5, | ||||
803 | darkTrellis => 10, | ||||
804 | darkUp => 8, | ||||
805 | darkVertical => 6, | ||||
806 | gray0625 => 18, | ||||
807 | gray125 => 17, | ||||
808 | lightDown => 13, | ||||
809 | lightGray => 4, | ||||
810 | lightGrid => 15, | ||||
811 | lightHorizontal => 11, | ||||
812 | lightTrellis => 16, | ||||
813 | lightUp => 14, | ||||
814 | lightVertical => 12, | ||||
815 | mediumGray => 2, | ||||
816 | none => 0, | ||||
817 | solid => 1, | ||||
818 | ); | ||||
819 | |||||
820 | my @fills = map { | ||||
821 | 11 | 9µs | 11 | 4.73ms | my $pattern_type = $_->att('patternType'); # spent 4.72ms making 1 call to XML::Twig::get_xpath
# spent 7µs making 10 calls to XML::Twig::Elt::att, avg 720ns/call |
822 | 10 | 24µs | 40 | 344µs | [($pattern_type ? $fill{$pattern_type} : 0), $self->_color($workbook->{Color}, $_->first_child('s:fgColor'), 1), $self->_color($workbook->{Color}, $_->first_child('s:bgColor'), 1),] # spent 234µs making 20 calls to XML::Twig::Elt::first_child, avg 12µs/call
# spent 110µs making 20 calls to Spreadsheet::ParseXLSX::_color, avg 6µs/call |
823 | } $styles->find_nodes('//s:fills/s:fill/s:patternFill'); | ||||
824 | |||||
825 | my @borders = map { | ||||
826 | 2 | 2µs | 1 | 454µs | my $border = $_; # spent 454µs making 1 call to XML::Twig::get_xpath |
827 | 3 | 4µs | 4 | 2µs | my ($ddiag, $udiag) = map { $self->_xml_boolean($border->att($_)) } qw(diagonalDown diagonalUp); # spent 1µs making 2 calls to Spreadsheet::ParseXLSX::_xml_boolean, avg 650ns/call
# spent 1µs making 2 calls to XML::Twig::Elt::att, avg 600ns/call |
828 | my %borderstyles = map { | ||||
829 | 6 | 8µs | 5 | 444µs | my $e = $border->first_child("s:$_"); # spent 444µs making 5 calls to XML::Twig::Elt::first_child, avg 89µs/call |
830 | 5 | 6µs | 5 | 3µs | $_ => ($e ? $e->att('style') || 'none' : 'none') # spent 3µs making 5 calls to XML::Twig::Elt::att, avg 600ns/call |
831 | } qw(left right top bottom diagonal); | ||||
832 | my %bordercolors = map { | ||||
833 | 6 | 7µs | 5 | 24µs | my $e = $border->first_child("s:$_"); # spent 24µs making 5 calls to XML::Twig::Elt::first_child, avg 5µs/call |
834 | 5 | 3µs | 5 | 95µs | $_ => ($e ? $e->first_child('s:color') : undef) # spent 95µs making 5 calls to XML::Twig::Elt::first_child, avg 19µs/call |
835 | } qw(left right top bottom diagonal); | ||||
836 | # XXX specs say "begin" and "end" rather than "left" and "right", | ||||
837 | # but... that's not what seems to be in the file itself (sigh) | ||||
838 | { | ||||
839 | colors => [map { $self->_color($workbook->{Color}, $bordercolors{$_}) } qw(left right top bottom)], | ||||
840 | styles => [map { $border{$borderstyles{$_}} } qw(left right top bottom)], | ||||
841 | diagonal => [( | ||||
842 | $ddiag && $udiag ? 3 | ||||
843 | : $ddiag && !$udiag ? 2 | ||||
844 | : !$ddiag && $udiag ? 1 | ||||
845 | : 0 | ||||
846 | ), | ||||
847 | $border{$borderstyles{diagonal}}, | ||||
848 | 1 | 10µs | 5 | 4µs | $self->_color($workbook->{Color}, $bordercolors{diagonal}), # spent 4µs making 5 calls to Spreadsheet::ParseXLSX::_color, avg 740ns/call |
849 | ], | ||||
850 | } | ||||
851 | } $styles->find_nodes('//s:borders/s:border'); | ||||
852 | |||||
853 | 1 | 15µs | 1 | 535µs | my %format_str = (%default_format_str, (map { $_->att('numFmtId') => $_->att('formatCode') } $styles->find_nodes('//s:numFmts/s:numFmt')),); # spent 535µs making 1 call to XML::Twig::get_xpath |
854 | |||||
855 | my @font = map { | ||||
856 | 4 | 5µs | 4 | 652µs | my $vert = $_->first_child('s:vertAlign'); # spent 515µs making 1 call to XML::Twig::get_xpath
# spent 138µs making 3 calls to XML::Twig::Elt::first_child, avg 46µs/call |
857 | 3 | 2µs | 3 | 128µs | my $under = $_->first_child('s:u'); # spent 128µs making 3 calls to XML::Twig::Elt::first_child, avg 43µs/call |
858 | 3 | 2µs | 3 | 95µs | my $heightelem = $_->first_child('s:sz'); # spent 95µs making 3 calls to XML::Twig::Elt::first_child, avg 32µs/call |
859 | # XXX i guess 12 is okay? | ||||
860 | 3 | 4µs | 3 | 2µs | my $height = 0 + ($heightelem ? $heightelem->att('val') : 12); # spent 2µs making 3 calls to XML::Twig::Elt::att, avg 767ns/call |
861 | 3 | 2µs | 3 | 95µs | my $nameelem = $_->first_child('s:name'); # spent 95µs making 3 calls to XML::Twig::Elt::first_child, avg 32µs/call |
862 | 3 | 2µs | 3 | 2µs | my $name = $nameelem ? $nameelem->att('val') : ''; # spent 2µs making 3 calls to XML::Twig::Elt::att, avg 533ns/call |
863 | Spreadsheet::ParseExcel::Font->new( | ||||
864 | Height => $height, | ||||
865 | # Attr => $iAttr, | ||||
866 | # XXX not sure if there's a better way to keep the indexing stuff | ||||
867 | # intact rather than just going straight to #xxxxxx | ||||
868 | # XXX also not sure what it means for the color tag to be missing, | ||||
869 | # just assuming black for now | ||||
870 | Color => ( | ||||
871 | $_->first_child('s:color') | ||||
872 | 3 | 20µs | 24 | 447µs | ? $self->_color($workbook->{Color}, $_->first_child('s:color')) # spent 404µs making 18 calls to XML::Twig::Elt::first_child, avg 22µs/call
# spent 33µs making 3 calls to Spreadsheet::ParseXLSX::_color, avg 11µs/call
# spent 10µs making 3 calls to Spreadsheet::ParseExcel::Font::new, avg 3µs/call |
873 | : '#000000' | ||||
874 | ), | ||||
875 | Super => ( | ||||
876 | $vert | ||||
877 | ? ( | ||||
878 | $vert->att('val') eq 'superscript' ? 1 | ||||
879 | : $vert->att('val') eq 'subscript' ? 2 | ||||
880 | : 0 | ||||
881 | ) | ||||
882 | : 0 | ||||
883 | ), | ||||
884 | # XXX not sure what the single accounting and double accounting | ||||
885 | # underline styles map to in xlsx. also need to map the new | ||||
886 | # underline styles | ||||
887 | UnderlineStyle => ( | ||||
888 | $under | ||||
889 | # XXX sometimes style xml files can contain just <u/> with no | ||||
890 | # val attribute. i think this means single underline, but not | ||||
891 | # sure | ||||
892 | ? ( | ||||
893 | !$under->att('val') ? 1 | ||||
894 | : $under->att('val') eq 'single' ? 1 | ||||
895 | : $under->att('val') eq 'double' ? 2 | ||||
896 | : 0 | ||||
897 | ) | ||||
898 | : 0 | ||||
899 | ), | ||||
900 | Name => $name, | ||||
901 | |||||
902 | Bold => $_->has_child('s:b') ? 1 : 0, | ||||
903 | Italic => $_->has_child('s:i') ? 1 : 0, | ||||
904 | Underline => $_->has_child('s:u') ? 1 : 0, | ||||
905 | Strikeout => $_->has_child('s:strike') ? 1 : 0, | ||||
906 | ) | ||||
907 | } $styles->find_nodes('//s:fonts/s:font'); | ||||
908 | |||||
909 | my @format = map { | ||||
910 | 16 | 12µs | 1 | 1.01ms | my $xml_fmt = $_; # spent 1.01ms making 1 call to XML::Twig::get_xpath |
911 | 15 | 7µs | 15 | 128µs | my $alignment = $xml_fmt->first_child('s:alignment'); # spent 128µs making 15 calls to XML::Twig::Elt::first_child, avg 9µs/call |
912 | 15 | 5µs | 15 | 130µs | my $protection = $xml_fmt->first_child('s:protection'); # spent 130µs making 15 calls to XML::Twig::Elt::first_child, avg 9µs/call |
913 | 15 | 95µs | 180 | 71µs | my %ignore = map { ("Ignore$_" => !$self->_xml_boolean($xml_fmt->att("apply$_"))) } qw(Font Fill Border Alignment NumberFormat Protection); # spent 36µs making 90 calls to Spreadsheet::ParseXLSX::_xml_boolean, avg 399ns/call
# spent 35µs making 90 calls to XML::Twig::Elt::att, avg 389ns/call |
914 | 15 | 40µs | my %opts = (%default_format_opts, %ignore,); | ||
915 | |||||
916 | 15 | 8µs | 15 | 5µs | $opts{FmtIdx} = 0 + ($xml_fmt->att('numFmtId') || 0); # spent 5µs making 15 calls to XML::Twig::Elt::att, avg 353ns/call |
917 | 15 | 8µs | 15 | 5µs | $opts{FontNo} = 0 + ($xml_fmt->att('fontId') || 0); # spent 5µs making 15 calls to XML::Twig::Elt::att, avg 347ns/call |
918 | 15 | 4µs | $opts{Font} = $font[$opts{FontNo}]; | ||
919 | 15 | 7µs | 15 | 4µs | $opts{Fill} = $fills[$xml_fmt->att('fillId') || 0]; # spent 4µs making 15 calls to XML::Twig::Elt::att, avg 293ns/call |
920 | 15 | 7µs | 15 | 5µs | $opts{BdrStyle} = $borders[$xml_fmt->att('borderId') || 0]{styles}; # spent 5µs making 15 calls to XML::Twig::Elt::att, avg 320ns/call |
921 | 15 | 6µs | 15 | 4µs | $opts{BdrColor} = $borders[$xml_fmt->att('borderId') || 0]{colors}; # spent 4µs making 15 calls to XML::Twig::Elt::att, avg 260ns/call |
922 | 15 | 6µs | 15 | 5µs | $opts{BdrDiag} = $borders[$xml_fmt->att('borderId') || 0]{diagonal}; # spent 5µs making 15 calls to XML::Twig::Elt::att, avg 313ns/call |
923 | |||||
924 | 15 | 2µs | if ($alignment) { | ||
925 | 5 | 2µs | 5 | 1µs | $opts{AlignH} = $halign{$alignment->att('horizontal') || 'general'}; # spent 1µs making 5 calls to XML::Twig::Elt::att, avg 240ns/call |
926 | 5 | 4µs | 10 | 3µs | $opts{Wrap} = $self->_xml_boolean($alignment->att('wrapText')); # spent 2µs making 5 calls to Spreadsheet::ParseXLSX::_xml_boolean, avg 360ns/call
# spent 1µs making 5 calls to XML::Twig::Elt::att, avg 260ns/call |
927 | 5 | 3µs | 5 | 2µs | $opts{AlignV} = $valign{$alignment->att('vertical') || 'bottom'}; # spent 2µs making 5 calls to XML::Twig::Elt::att, avg 320ns/call |
928 | 5 | 2µs | 5 | 1µs | $opts{Rotate} = $alignment->att('textRotation'); # spent 1µs making 5 calls to XML::Twig::Elt::att, avg 260ns/call |
929 | 5 | 2µs | 5 | 1µs | $opts{Indent} = $alignment->att('indent'); # spent 1µs making 5 calls to XML::Twig::Elt::att, avg 240ns/call |
930 | 5 | 4µs | 10 | 3µs | $opts{Shrink} = $self->_xml_boolean($alignment->att('shrinkToFit')); # spent 2µs making 5 calls to Spreadsheet::ParseXLSX::_xml_boolean, avg 320ns/call
# spent 1µs making 5 calls to XML::Twig::Elt::att, avg 240ns/call |
931 | # JustLast => $iJustL, | ||||
932 | } | ||||
933 | |||||
934 | 15 | 1µs | if ($protection) { | ||
935 | $opts{Lock} = | ||||
936 | defined $protection->att('locked') | ||||
937 | ? $self->_xml_boolean($protection->att('locked')) | ||||
938 | : 1; | ||||
939 | $opts{Hidden} = $self->_xml_boolean($protection->att('hidden')); | ||||
940 | } | ||||
941 | |||||
942 | # Style => $iStyle, | ||||
943 | # Key123 => $i123, | ||||
944 | # Merge => $iMerge, | ||||
945 | # ReadDir => $iReadDir, | ||||
946 | 15 | 36µs | 15 | 44µs | Spreadsheet::ParseExcel::Format->new(%opts) # spent 44µs making 15 calls to Spreadsheet::ParseExcel::Format::new, avg 3µs/call |
947 | } $styles->find_nodes('//s:cellXfs/s:xf'); | ||||
948 | |||||
949 | return { | ||||
950 | 1 | 11µs | FormatStr => \%format_str, | ||
951 | Font => \@font, | ||||
952 | Format => \@format, | ||||
953 | }; | ||||
954 | } | ||||
955 | |||||
956 | # spent 72.2ms (90µs+72.1) within Spreadsheet::ParseXLSX::_extract_files which was called:
# once (90µs+72.1ms) by Spreadsheet::ParseXLSX::_parse_workbook at line 134
sub _extract_files { | ||||
957 | 1 | 100ns | my $self = shift; | ||
958 | 1 | 200ns | my ($zip) = @_; | ||
959 | |||||
960 | 1 | 200ns | my $type_base = 'http://schemas.openxmlformats.org/officeDocument/2006/relationships'; | ||
961 | |||||
962 | 1 | 1µs | 2 | 4.01ms | my $rels = $self->_parse_xml($zip, $self->_rels_for(''),); # spent 4.00ms making 1 call to Spreadsheet::ParseXLSX::_parse_xml
# spent 4µs making 1 call to Spreadsheet::ParseXLSX::_rels_for |
963 | 1 | 2µs | 1 | 1.72ms | my $node = ($rels->find_nodes(qq<//packagerels:Relationship[\@Type="$type_base/officeDocument"]>))[0]; # spent 1.72ms making 1 call to XML::Twig::get_xpath |
964 | 1 | 100ns | die "invalid workbook" unless $node; | ||
965 | |||||
966 | 1 | 1µs | 1 | 2µs | my $wb_name = $node->att('Target'); # spent 2µs making 1 call to XML::Twig::Elt::att |
967 | 1 | 2µs | 1 | 1µs | $wb_name =~ s{^/}{}; # spent 1µs making 1 call to CORE::subst |
968 | 1 | 1µs | 1 | 2.27ms | my $wb_xml = $self->_parse_xml($zip, $wb_name); # spent 2.27ms making 1 call to Spreadsheet::ParseXLSX::_parse_xml |
969 | |||||
970 | 1 | 2µs | 1 | 5µs | my $path_base = $self->_base_path_for($wb_name); # spent 5µs making 1 call to Spreadsheet::ParseXLSX::_base_path_for |
971 | 1 | 1µs | 2 | 1.08ms | my $wb_rels = $self->_parse_xml($zip, $self->_rels_for($wb_name),); # spent 1.08ms making 1 call to Spreadsheet::ParseXLSX::_parse_xml
# spent 4µs making 1 call to Spreadsheet::ParseXLSX::_rels_for |
972 | |||||
973 | # spent 16µs (13+3) within Spreadsheet::ParseXLSX::__ANON__[lib/Spreadsheet/ParseXLSX.pm:979] which was called 5 times, avg 3µs/call:
# once (4µs+700ns) by Spreadsheet::ParseXLSX::_extract_files at line 1004
# once (3µs+1µs) by Spreadsheet::ParseXLSX::_extract_files at line 981
# once (2µs+400ns) by Spreadsheet::ParseXLSX::_extract_files at line 985
# once (2µs+500ns) by Spreadsheet::ParseXLSX::_extract_files at line 1010
# once (2µs+500ns) by Spreadsheet::ParseXLSX::_extract_files at line 983
my $get_path = sub { | ||||
974 | 5 | 1µs | my ($p) = @_; | ||
975 | |||||
976 | 5 | 19µs | 5 | 3µs | return $p =~ s{^/}{} # spent 3µs making 5 calls to CORE::subst, avg 680ns/call |
977 | ? $p | ||||
978 | : $path_base . $p; | ||||
979 | 1 | 1µs | }; | ||
980 | |||||
981 | 2 | 5µs | 4 | 600µs | my ($strings_xml) = map { $self->_zip_file_member($zip, $get_path->($_->att('Target'))) } $wb_rels->find_nodes(qq<//packagerels:Relationship[\@Type="$type_base/sharedStrings"]>); # spent 362µs making 1 call to Spreadsheet::ParseXLSX::_zip_file_member
# spent 232µs making 1 call to XML::Twig::get_xpath
# spent 4µs making 1 call to Spreadsheet::ParseXLSX::__ANON__[lib/Spreadsheet/ParseXLSX.pm:979]
# spent 1µs making 1 call to XML::Twig::Elt::att |
982 | |||||
983 | 2 | 5µs | 4 | 16.6ms | my ($styles_xml) = map { $self->_parse_xml($zip, $get_path->($_->att('Target'))) } $wb_rels->find_nodes(qq<//packagerels:Relationship[\@Type="$type_base/styles"]>); # spent 16.4ms making 1 call to Spreadsheet::ParseXLSX::_parse_xml
# spent 218µs making 1 call to XML::Twig::get_xpath
# spent 3µs making 1 call to Spreadsheet::ParseXLSX::__ANON__[lib/Spreadsheet/ParseXLSX.pm:979]
# spent 800ns making 1 call to XML::Twig::Elt::att |
984 | |||||
985 | 2 | 7µs | 5 | 17.3ms | my %worksheet_xml = map { ($_->att('Id') => $self->_zip_file_member($zip, $get_path->($_->att('Target')))) } $wb_rels->find_nodes(qq<//packagerels:Relationship[\@Type="$type_base/worksheet"]>); # spent 17.0ms making 1 call to Spreadsheet::ParseXLSX::_zip_file_member
# spent 260µs making 1 call to XML::Twig::get_xpath
# spent 3µs making 1 call to Spreadsheet::ParseXLSX::__ANON__[lib/Spreadsheet/ParseXLSX.pm:979]
# spent 2µs making 2 calls to XML::Twig::Elt::att, avg 750ns/call |
986 | |||||
987 | # If we have hyperlinks in cells we need the rels file to get the link details | ||||
988 | 1 | 300ns | my $worksheet_rels_xml; | ||
989 | |||||
990 | # Get each worksheet object | ||||
991 | 1 | 3µs | 1 | 51µs | foreach my $worksheet ($wb_rels->find_nodes(qq<//packagerels:Relationship[\@Type="$type_base/worksheet"]>)) { # spent 51µs making 1 call to XML::Twig::get_xpath |
992 | # Split the worksheet xml path so we can insert the _rels directory | ||||
993 | 1 | 5µs | 1 | 1µs | my @sheetname_parts = split('/', $worksheet->att('Target')); # spent 1µs making 1 call to XML::Twig::Elt::att |
994 | |||||
995 | # Insert _rels before the sheetname, and amend the filename to have .rels on the end | ||||
996 | 1 | 500ns | my $sheetname = pop(@sheetname_parts); | ||
997 | 1 | 900ns | push(@sheetname_parts, '_rels'); | ||
998 | 1 | 600ns | push(@sheetname_parts, $sheetname . '.rels'); | ||
999 | |||||
1000 | # Recreate the file path | ||||
1001 | 1 | 1µs | my $rels_name = join('/', @sheetname_parts); | ||
1002 | |||||
1003 | # Check if we have a rels file | ||||
1004 | 1 | 4µs | 2 | 29µs | if (my $relfile = $zip->memberNamed($get_path->($rels_name))) { # spent 25µs making 1 call to Archive::Zip::Archive::memberNamed
# spent 4µs making 1 call to Spreadsheet::ParseXLSX::__ANON__[lib/Spreadsheet/ParseXLSX.pm:979] |
1005 | # Add the XML to our hash for access later on | ||||
1006 | $worksheet_rels_xml->{$worksheet->att('Id')} = $relfile->contents; | ||||
1007 | } | ||||
1008 | } | ||||
1009 | |||||
1010 | 2 | 7µs | 5 | 28.3ms | my %themes_xml = map { $_->att('Id') => $self->_parse_xml($zip, $get_path->($_->att('Target'))) } $wb_rels->find_nodes(qq<//packagerels:Relationship[\@Type="$type_base/theme"]>); # spent 28.0ms making 1 call to Spreadsheet::ParseXLSX::_parse_xml
# spent 253µs making 1 call to XML::Twig::get_xpath
# spent 3µs making 1 call to Spreadsheet::ParseXLSX::__ANON__[lib/Spreadsheet/ParseXLSX.pm:979]
# spent 2µs making 2 calls to XML::Twig::Elt::att, avg 750ns/call |
1011 | |||||
1012 | return { | ||||
1013 | 1 | 11µs | 2 | 169µs | workbook => $wb_xml, # spent 169µs making 2 calls to XML::Twig::DESTROY, avg 85µs/call |
1014 | sheets => \%worksheet_xml, | ||||
1015 | themes => \%themes_xml, | ||||
1016 | ( | ||||
1017 | $styles_xml ? (styles => $styles_xml) | ||||
1018 | : () | ||||
1019 | ), | ||||
1020 | ( | ||||
1021 | $strings_xml ? (strings => $strings_xml) | ||||
1022 | : () | ||||
1023 | ), | ||||
1024 | (($worksheet_rels_xml && keys(%$worksheet_rels_xml)) ? (sheets_rels => $worksheet_rels_xml) : ()), | ||||
1025 | }; | ||||
1026 | } | ||||
1027 | |||||
1028 | # spent 51.8ms (26µs+51.7) within Spreadsheet::ParseXLSX::_parse_xml which was called 5 times, avg 10.4ms/call:
# once (6µs+28.0ms) by Spreadsheet::ParseXLSX::_extract_files at line 1010
# once (5µs+16.4ms) by Spreadsheet::ParseXLSX::_extract_files at line 983
# once (6µs+4.00ms) by Spreadsheet::ParseXLSX::_extract_files at line 962
# once (5µs+2.27ms) by Spreadsheet::ParseXLSX::_extract_files at line 968
# once (4µs+1.08ms) by Spreadsheet::ParseXLSX::_extract_files at line 971
sub _parse_xml { | ||||
1029 | 5 | 500ns | my $self = shift; | ||
1030 | 5 | 2µs | my ($zip, $subfile, $map_xmlns) = @_; | ||
1031 | |||||
1032 | 5 | 4µs | 5 | 3.15ms | my $xml = $self->_new_twig; # spent 3.15ms making 5 calls to Spreadsheet::ParseXLSX::_new_twig, avg 630µs/call |
1033 | 5 | 9µs | 10 | 48.6ms | $xml->parse($self->_zip_file_member($zip, $subfile)); # spent 46.4ms making 5 calls to XML::Twig::parse, avg 9.28ms/call
# spent 2.20ms making 5 calls to Spreadsheet::ParseXLSX::_zip_file_member, avg 440µs/call |
1034 | |||||
1035 | 5 | 10µs | return $xml; | ||
1036 | } | ||||
1037 | |||||
1038 | # spent 19.6ms (53µs+19.5) within Spreadsheet::ParseXLSX::_zip_file_member which was called 7 times, avg 2.80ms/call:
# 5 times (38µs+2.16ms) by Spreadsheet::ParseXLSX::_parse_xml at line 1033, avg 440µs/call
# once (9µs+17.0ms) by Spreadsheet::ParseXLSX::_extract_files at line 985
# once (6µs+356µs) by Spreadsheet::ParseXLSX::_extract_files at line 981
sub _zip_file_member { | ||||
1039 | 7 | 1µs | my $self = shift; | ||
1040 | 7 | 2µs | my ($zip, $name) = @_; | ||
1041 | |||||
1042 | 7 | 118µs | 21 | 393µs | my @members = $zip->membersMatching(qr/^\Q$name\E$/i); # spent 303µs making 7 calls to Archive::Zip::Archive::membersMatching, avg 43µs/call
# spent 82µs making 7 calls to CORE::regcomp, avg 12µs/call
# spent 8µs making 7 calls to CORE::qr, avg 1µs/call |
1043 | 7 | 1µs | die "no subfile named $name" unless @members; | ||
1044 | |||||
1045 | 7 | 22µs | 7 | 19.1ms | return scalar $members[0]->contents; # spent 19.1ms making 7 calls to Archive::Zip::Member::contents, avg 2.73ms/call |
1046 | } | ||||
1047 | |||||
1048 | sub _rels_for { | ||||
1049 | 2 | 200ns | my $self = shift; | ||
1050 | 2 | 300ns | my ($file) = @_; | ||
1051 | |||||
1052 | 2 | 2µs | my @path = split '/', $file; | ||
1053 | 2 | 400ns | my $name = pop @path; | ||
1054 | 2 | 400ns | $name = '' unless defined $name; | ||
1055 | 2 | 400ns | push @path, '_rels'; | ||
1056 | 2 | 1µs | push @path, "$name.rels"; | ||
1057 | |||||
1058 | 2 | 4µs | return join '/', @path; | ||
1059 | } | ||||
1060 | |||||
1061 | # spent 5µs within Spreadsheet::ParseXLSX::_base_path_for which was called:
# once (5µs+0s) by Spreadsheet::ParseXLSX::_extract_files at line 970
sub _base_path_for { | ||||
1062 | 1 | 100ns | my $self = shift; | ||
1063 | 1 | 400ns | my ($file) = @_; | ||
1064 | |||||
1065 | 1 | 2µs | my @path = split '/', $file; | ||
1066 | 1 | 300ns | pop @path; | ||
1067 | |||||
1068 | 1 | 2µs | return join('/', @path) . '/'; | ||
1069 | } | ||||
1070 | |||||
1071 | # spent 24µs (10+14) within Spreadsheet::ParseXLSX::_dimensions which was called:
# once (10µs+14µs) by Spreadsheet::ParseXLSX::__ANON__[lib/Spreadsheet/ParseXLSX.pm:246] at line 238
sub _dimensions { | ||||
1072 | 1 | 300ns | my $self = shift; | ||
1073 | 1 | 400ns | my ($dim) = @_; | ||
1074 | |||||
1075 | 1 | 3µs | my ($topleft, $bottomright) = split ':', $dim; | ||
1076 | 1 | 300ns | $bottomright = $topleft unless defined $bottomright; | ||
1077 | |||||
1078 | 1 | 2µs | 1 | 10µs | my ($rmin, $cmin) = $self->_cell_to_row_col($topleft); # spent 10µs making 1 call to Spreadsheet::ParseXLSX::_cell_to_row_col |
1079 | 1 | 900ns | 1 | 3µs | my ($rmax, $cmax) = $self->_cell_to_row_col($bottomright); # spent 3µs making 1 call to Spreadsheet::ParseXLSX::_cell_to_row_col |
1080 | |||||
1081 | 1 | 2µs | return ($rmin, $cmin, $rmax, $cmax); | ||
1082 | } | ||||
1083 | |||||
1084 | sub _is_merged { | ||||
1085 | my ($self, $sheet, $row, $col) = @_; | ||||
1086 | |||||
1087 | return unless $sheet->{MergedArea}; | ||||
1088 | |||||
1089 | foreach my $area (@{$sheet->{MergedArea}}) { | ||||
1090 | my ($topRow, $leftCol, $bottomRow, $rightCol) = @$area; | ||||
1091 | |||||
1092 | return 1 | ||||
1093 | if $topRow <= $row | ||||
1094 | && $leftCol <= $col | ||||
1095 | && $row <= $bottomRow | ||||
1096 | && $col <= $rightCol; | ||||
1097 | } | ||||
1098 | |||||
1099 | return 0; | ||||
1100 | } | ||||
1101 | |||||
1102 | # spent 1.04s (861ms+176ms) within Spreadsheet::ParseXLSX::_cell_to_row_col which was called 239270 times, avg 4µs/call:
# 202907 times (746ms+157ms) by Spreadsheet::ParseXLSX::__ANON__[lib/Spreadsheet/ParseXLSX.pm:443] at line 361, avg 4µs/call
# 18180 times (68.0ms+12.2ms) by Spreadsheet::ParseXLSX::__ANON__[lib/Spreadsheet/ParseXLSX.pm:302] at line 295, avg 4µs/call
# 18180 times (46.6ms+6.55ms) by Spreadsheet::ParseXLSX::__ANON__[lib/Spreadsheet/ParseXLSX.pm:302] at line 296, avg 3µs/call
# once (12µs+2µs) by Spreadsheet::ParseXLSX::__ANON__[lib/Spreadsheet/ParseXLSX.pm:338] at line 330
# once (8µs+2µs) by Spreadsheet::ParseXLSX::_dimensions at line 1078
# once (3µs+600ns) by Spreadsheet::ParseXLSX::_dimensions at line 1079
sub _cell_to_row_col { | ||||
1103 | 239270 | 26.6ms | my $self = shift; | ||
1104 | 239270 | 40.6ms | my ($cell) = @_; | ||
1105 | |||||
1106 | 239270 | 579ms | 239270 | 176ms | my ($col, $row) = $cell =~ /([A-Z]+)([0-9]+)/; # spent 176ms making 239270 calls to CORE::match, avg 734ns/call |
1107 | |||||
1108 | 239270 | 24.0ms | my $ncol = 0; | ||
1109 | 239270 | 104ms | for my $char (split //, $col) { | ||
1110 | 239270 | 26.3ms | $ncol *= 26; | ||
1111 | 239270 | 77.4ms | $ncol += ord($char) - ord('A') + 1; | ||
1112 | } | ||||
1113 | 239270 | 22.0ms | $ncol = $ncol - 1; | ||
1114 | |||||
1115 | 239270 | 44.2ms | my $nrow = $row - 1; | ||
1116 | |||||
1117 | 239270 | 587ms | return ($nrow, $ncol); | ||
1118 | } | ||||
1119 | |||||
1120 | # spent 13.4ms within Spreadsheet::ParseXLSX::_xml_boolean which was called 15730 times, avg 851ns/call:
# 15608 times (13.3ms+0s) by Spreadsheet::ParseXLSX::__ANON__[lib/Spreadsheet/ParseXLSX.pm:443] at line 354, avg 855ns/call
# 90 times (36µs+0s) by Spreadsheet::ParseXLSX::_parse_styles at line 913, avg 399ns/call
# 19 times (9µs+0s) by Spreadsheet::ParseXLSX::_color at line 1131, avg 468ns/call
# 5 times (2µs+0s) by Spreadsheet::ParseXLSX::_parse_styles at line 926, avg 360ns/call
# 5 times (2µs+0s) by Spreadsheet::ParseXLSX::_parse_styles at line 930, avg 320ns/call
# 2 times (1µs+0s) by Spreadsheet::ParseXLSX::_parse_styles at line 827, avg 650ns/call
# once (2µs+0s) by Spreadsheet::ParseXLSX::_parse_workbook at line 148
sub _xml_boolean { | ||||
1121 | 15730 | 2.05ms | my $self = shift; | ||
1122 | 15730 | 2.83ms | my ($bool) = @_; | ||
1123 | 15730 | 68.9ms | return defined($bool) && ($bool eq 'true' || $bool eq '1'); | ||
1124 | } | ||||
1125 | |||||
1126 | # spent 147µs (114+33) within Spreadsheet::ParseXLSX::_color which was called 28 times, avg 5µs/call:
# 20 times (85µs+26µs) by Spreadsheet::ParseXLSX::_parse_styles at line 822, avg 6µs/call
# 5 times (4µs+0s) by Spreadsheet::ParseXLSX::_parse_styles at line 848, avg 740ns/call
# 3 times (25µs+8µs) by Spreadsheet::ParseXLSX::_parse_styles at line 872, avg 11µs/call
sub _color { | ||||
1127 | 28 | 3µs | my $self = shift; | ||
1128 | 28 | 5µs | my ($colors, $color_node, $fill) = @_; | ||
1129 | |||||
1130 | 28 | 2µs | my $color; | ||
1131 | 28 | 16µs | 38 | 14µs | if ($color_node && !$self->_xml_boolean($color_node->att('auto'))) { # spent 9µs making 19 calls to Spreadsheet::ParseXLSX::_xml_boolean, avg 468ns/call
# spent 6µs making 19 calls to XML::Twig::Elt::att, avg 289ns/call |
1132 | 19 | 31µs | 49 | 15µs | if (defined $color_node->att('indexed')) { # spent 15µs making 49 calls to XML::Twig::Elt::att, avg 314ns/call |
1133 | # see https://rt.cpan.org/Public/Bug/Display.html?id=93065 | ||||
1134 | if ($fill && $color_node->att('indexed') == 64) { | ||||
1135 | return '#FFFFFF'; | ||||
1136 | } else { | ||||
1137 | $color = '#' . Spreadsheet::ParseExcel->ColorIdxToRGB($color_node->att('indexed')); | ||||
1138 | } | ||||
1139 | } elsif (defined $color_node->att('rgb')) { | ||||
1140 | $color = '#' . substr($color_node->att('rgb'), 2, 6); | ||||
1141 | } elsif (defined $color_node->att('theme')) { | ||||
1142 | 2 | 2µs | 2 | 900ns | my $theme = $colors->[$color_node->att('theme')]; # spent 900ns making 2 calls to XML::Twig::Elt::att, avg 450ns/call |
1143 | 2 | 1µs | if (defined $theme) { | ||
1144 | $color = "#$theme"; | ||||
1145 | } else { | ||||
1146 | return; | ||||
1147 | } | ||||
1148 | } | ||||
1149 | |||||
1150 | 11 | 5µs | 11 | 3µs | $color = $self->_apply_tint($color, $color_node->att('tint')) # spent 3µs making 11 calls to XML::Twig::Elt::att, avg 245ns/call |
1151 | if $color_node->att('tint'); | ||||
1152 | } | ||||
1153 | |||||
1154 | 20 | 16µs | return $color; | ||
1155 | } | ||||
1156 | |||||
1157 | sub _apply_tint { | ||||
1158 | my $self = shift; | ||||
1159 | my ($color, $tint) = @_; | ||||
1160 | |||||
1161 | my ($r, $g, $b) = map { oct("0x$_") } $color =~ /#(..)(..)(..)/; | ||||
1162 | my ($h, $l, $s) = rgb2hls($r, $g, $b); | ||||
1163 | |||||
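1164 | if ($tint < 0) { | ||||
1165 | $l = $l * (1.0 + $tint); # negative tint darkens: scale lightness down | ||||
1166 | } else { | ||||
1167 | $l = $l * (1.0 - $tint) + (1.0 - 1.0 * (1.0 - $tint)); # positive tint lightens: same as $l * (1 - $tint) + $tint | ||||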
1168 | } | ||||
1169 | |||||
1170 | return scalar hls2rgb($h, $l, $s); | ||||
1171 | } | ||||
1172 | |||||
1173 | # spent 12.0ms (31µs+12.0) within Spreadsheet::ParseXLSX::_new_twig which was called 7 times, avg 1.71ms/call:
# 5 times (22µs+3.13ms) by Spreadsheet::ParseXLSX::_parse_xml at line 1032, avg 630µs/call
# once (4µs+4.89ms) by Spreadsheet::ParseXLSX::_parse_sheet at line 445
# once (4µs+3.95ms) by Spreadsheet::ParseXLSX::_parse_shared_strings at line 657
sub _new_twig { | ||||
1174 | 7 | 800ns | my $self = shift; | ||
1175 | 7 | 2µs | my %opts = @_; | ||
1176 | |||||
1177 | 7 | 29µs | 7 | 12.0ms | return XML::Twig->new( # spent 12.0ms making 7 calls to XML::Twig::new, avg 1.71ms/call |
1178 | map_xmlns => { | ||||
1179 | 'http://schemas.openxmlformats.org/spreadsheetml/2006/main' => 's', | ||||
1180 | 'http://schemas.openxmlformats.org/package/2006/relationships' => 'packagerels', | ||||
1181 | 'http://schemas.openxmlformats.org/officeDocument/2006/relationships' => 'rels', | ||||
1182 | 'http://schemas.openxmlformats.org/drawingml/2006/main' => 'drawmain', | ||||
1183 | }, | ||||
1184 | no_xxe => 1, | ||||
1185 | keep_original_prefix => 1, | ||||
1186 | %opts, | ||||
1187 | ); | ||||
1188 | } | ||||
1189 | |||||
1190 | =head1 INCOMPATIBILITIES | ||||
1191 | |||||
1192 | This module returns data using classes from L<Spreadsheet::ParseExcel>, so for | ||||
1193 | the most part, it should just be a drop-in replacement. That said, there are a | ||||
1194 | couple of areas where the data returned is intentionally different: | ||||
1195 | |||||
1196 | =over 4 | ||||
1197 | |||||
1198 | =item Colors | ||||
1199 | |||||
1200 | In Spreadsheet::ParseExcel, colors are represented by integers which index into | ||||
1201 | the color table, and you have to use | ||||
1202 | C<< Spreadsheet::ParseExcel->ColorIdxToRGB >> in order to get the actual value | ||||
1203 | out. In Spreadsheet::ParseXLSX, while the color table still exists, cells are | ||||
1204 | also allowed to specify their color directly rather than going through the | ||||
1205 | color table. In order to avoid confusion, I normalize all color values in | ||||
1206 | Spreadsheet::ParseXLSX to their string RGB format (C<"#0088ff">). This affects | ||||
1207 | the C<Fill>, C<BdrColor>, and C<BdrDiag> properties of formats, and the | ||||
1208 | C<Color> property of fonts. Note that the default color is represented by | ||||
1209 | C<undef> (the same thing that C<ColorIdxToRGB> would return). | ||||
1210 | |||||
1211 | =item Formulas | ||||
1212 | |||||
1213 | Spreadsheet::ParseExcel doesn't support formulas. Spreadsheet::ParseXLSX | ||||
1214 | provides basic formula support by returning the text of the formula as part of | ||||
1215 | the cell data. You can access it via C<< $cell->{Formula} >>. Note that the | ||||
1216 | restriction still holds that formula cell values aren't available unless they | ||||
1217 | were explicitly provided when the spreadsheet was written. | ||||
1218 | |||||
1219 | =back | ||||
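
As an illustration of the color normalization described under "Colors", the sketch below turns a raw ARGB attribute value into the string RGB form and leaves the default color as C<undef>. The helper name C<normalize_color> is hypothetical and not part of the module's API; it assumes only that an explicit color arrives as an eight-digit C<AARRGGBB> hex string, and it ignores indexed and theme colors entirely.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration only; not part of Spreadsheet::ParseXLSX.
# %attr mimics the attributes of a color node, e.g. (rgb => 'FF0088FF').
sub normalize_color {
    my (%attr) = @_;

    # An ARGB value carries the alpha channel in its first two hex digits;
    # drop them and prefix '#' to get the string RGB form.
    if (defined $attr{rgb}) {
        return '#' . substr($attr{rgb}, 2, 6);
    }

    # No explicit color given: the default color is represented by undef.
    return;
}

print normalize_color(rgb => 'FF0088FF'), "\n";   # prints "#0088FF"
```

Code that previously passed a color index to C<ColorIdxToRGB> can compare against such strings directly, as long as it also handles the C<undef> case.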
1220 | |||||
1221 | =head1 BUGS | ||||
1222 | |||||
1223 | =over 4 | ||||
1224 | |||||
1225 | =item Large spreadsheets may cause segfaults on perl 5.14 and earlier | ||||
1226 | |||||
1227 | This module internally uses XML::Twig, which makes it potentially subject to | ||||
1228 | L<Bug #71636 for XML-Twig: Segfault with medium-sized document|https://rt.cpan.org/Public/Bug/Display.html?id=71636> | ||||
1229 | on perl versions 5.14 and below (the underlying bug with perl weak references | ||||
1230 | was fixed in perl 5.15.5). The larger and more complex the spreadsheet, the | ||||
1231 | more likely it is to be affected, but the actual size at which it segfaults is | ||||
1232 | platform dependent. On a 64-bit perl with 7.6 GB of memory, it was seen on | ||||
1233 | spreadsheets of about 300 MB and above. You can work around this by adding | ||||
1234 | C<XML::Twig::_set_weakrefs(0)> to your code before parsing the spreadsheet, | ||||
1235 | although this may have other consequences such as memory leaks. | ||||
1236 | |||||
1237 | =item Worksheets without the C<dimension> tag are not supported | ||||
1238 | |||||
1239 | =item Intra-cell formatting is discarded | ||||
1240 | |||||
1241 | =item Shared formulas are not supported | ||||
1242 | |||||
1243 | Shared formula support will require an actual formula parser and quite a bit of | ||||
1244 | custom logic, since the only thing stored in the document is the formula for | ||||
1245 | the base cell - updating the cell references in the formulas in the rest of the | ||||
1246 | cells is handled by the application. Values for these cells are still handled | ||||
1247 | properly. | ||||
1248 | |||||
1249 | =back | ||||
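
The segfault workaround from the first item could be applied only on affected perls, roughly as follows. This is a minimal sketch: the helper name C<needs_weakref_workaround> is hypothetical, the threshold is taken from the 5.15.5 fix mentioned above, and the C<eval> guard simply skips the call when XML::Twig is not installed.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the given perl version (numeric form, as in $])
# predates the weak-reference fix that landed in perl 5.15.5.
sub needs_weakref_workaround {
    my ($perl_version) = @_;
    return $perl_version < 5.015005 ? 1 : 0;
}

if (needs_weakref_workaround($])) {
    # Only touch XML::Twig if it can actually be loaded.
    if (eval { require XML::Twig; 1 }) {
        # Disabling weak references avoids the segfault but may leak memory.
        XML::Twig::_set_weakrefs(0);
    }
}
```

Running this before the first call to C<parse> keeps the memory-leak trade-off away from perls that do not need it.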
1250 | |||||
1251 | In addition, there are still a few areas which are not yet implemented (the | ||||
1252 | XLSX spec is quite large). If you run into any of those, bug reports are quite | ||||
1253 | welcome. | ||||
1254 | |||||
1255 | Please report any bugs to GitHub Issues at | ||||
1256 | L<https://github.com/MichaelDaum/spreadsheet-parsexlsx/issues>. | ||||
1257 | |||||
1258 | =head1 SEE ALSO | ||||
1259 | |||||
1260 | L<Spreadsheet::ParseExcel>: The equivalent, for XLS files. | ||||
1261 | |||||
1262 | L<Spreadsheet::XLSX>: An older, less robust and less featureful implementation. | ||||
1263 | |||||
1264 | =head1 SUPPORT | ||||
1265 | |||||
1266 | You can find documentation for this module with the perldoc command. | ||||
1267 | |||||
1268 | perldoc Spreadsheet::ParseXLSX | ||||
1269 | |||||
1270 | You can also look for information at: | ||||
1271 | |||||
1272 | =over 4 | ||||
1273 | |||||
1274 | =item * MetaCPAN | ||||
1275 | |||||
1276 | L<https://metacpan.org/release/Spreadsheet-ParseXLSX> | ||||
1277 | |||||
1278 | =item * RT: CPAN's request tracker | ||||
1279 | |||||
1280 | L<http://rt.cpan.org/NoAuth/Bugs.html?Dist=Spreadsheet-ParseXLSX> | ||||
1281 | |||||
1282 | =item * GitHub | ||||
1283 | |||||
1284 | L<https://github.com/MichaelDaum/spreadsheet-parsexlsx> | ||||
1285 | |||||
1286 | =item * CPAN Ratings | ||||
1287 | |||||
1288 | L<http://cpanratings.perl.org/d/Spreadsheet-ParseXLSX> | ||||
1289 | |||||
1290 | =back | ||||
1291 | |||||
1292 | =head1 SPONSORS | ||||
1293 | |||||
1294 | Parts of this code were paid for by | ||||
1295 | |||||
1296 | =over 4 | ||||
1297 | |||||
1298 | =item Socialflow L<http://socialflow.com> | ||||
1299 | |||||
1300 | =back | ||||
1301 | |||||
1302 | =cut | ||||
1303 | |||||
1304 | 1 | 2µs | 1; |