Code Monkey home page Code Monkey logo

xlsx.jl's Introduction

xlsx.jl's People

Contributors

amgad-naiem avatar ararslan avatar beastyblacksmith avatar best4innio avatar bjarthur avatar chris-b1 avatar divbyzerofordummies avatar fchorney avatar felipenoris avatar felixrehren avatar goggle avatar guilhermebodin avatar gustafsson avatar hhaensel avatar imresamu avatar jaakkor2 avatar junpei-n avatar kano31 avatar kdheepak avatar mo-gul avatar mortenpi avatar nathanrboyer avatar pascalr0410 avatar pkienscherf avatar quinnj avatar rben01 avatar tecosaur avatar yonghee-kim avatar zundertj avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

xlsx.jl's Issues

Cannot open file created by openpyxl (relative file paths)

I was trying to open an Excel file generated in Python by the openpyxl package (the most commonly recommended Excel package for Python) but XLSX fails with

julia> XLSX.readxlsx("test.xlsx")
ERROR: AssertionError: Couldn't find xl//xl/worksheets/sheet1.xml in test.xlsx.
Stacktrace:
 [1] open_internal_file_stream(::XLSX.XLSXFile, ::String) at C:\Julia\Packages\packages\XLSX\8Z2SV\src\stream.jl:48
 [2] read_worksheet_dimension(::XLSX.XLSXFile, ::String, ::String) at C:\Julia\Packages\packages\XLSX\8Z2SV\src\worksheet.jl:15
 [3] XLSX.Worksheet(::XLSX.XLSXFile, ::EzXML.Node) at C:\Julia\Packages\packages\XLSX\8Z2SV\src\worksheet.jl:7
 [4] parse_workbook!(::XLSX.XLSXFile) at C:\Julia\Packages\packages\XLSX\8Z2SV\src\read.jl:317
 [5] open_or_read_xlsx(::String, ::Bool, ::Bool, ::Bool) at C:\Julia\Packages\packages\XLSX\8Z2SV\src\read.jl:191
 [6] readxlsx(::String) at C:\Julia\Packages\packages\XLSX\8Z2SV\src\read.jl:34
 [7] top-level scope at none:0

After opening the file in Excel (which works fine) and saving it again, the problem goes away. Do you know if this is a problem with openpyxl not formatting files correctly or a problem with XLSX? I've attached the test file which was generated in Python with

from openpyxl import Workbook
wb = Workbook()
ws = wb.create_sheet(title="Test1")
ws.append(["One", "Two"])
ws.append([1, 2])
ws.append([3, 4])
wb.save("test.xlsx")

Thanks for all your work on this package!
test.xlsx

Migrate to EzXML.jl

I suspect that this line of code from read function is killing performance:

xf.data[f.name] = LightXML.parse_string(readstring(f))

The problem is that LightXML.jl doesn´t allow for reading from IO streams, AFAIK.
Changing to EzXML.jl will also allow for implementing #3.

encoding problem - exporting values with `&` char

Hi!
First, thank you for this package!

I have found a strange encoding problem: - with exporting values with & character - can you replicate ?

minimal code:

   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: https://docs.julialang.org
   _ _   _| |_  __ _   |  Type "?help" for help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.6.4 (2018-07-09 19:09 UTC)
 _/ |\__'_|_|_|\__'_|  |  Official http://julialang.org/ release
|__/                   |  x86_64-pc-linux-gnu

julia> import DataFrames, XLSX

julia> Pkg.status("XLSX")
 - XLSX                          0.2.2

julia> df1 = DataFrames.DataFrame(COL1=[10,20,30], COL2=["Fist%1row", "Sec&2row", "Third#3row"])
3×2 DataFrames.DataFrame
│ Row │ COL1 │ COL2       │
├─────┼──────┼────────────┤
│ 1   │ 10   │ Fist%1row  │
│ 2   │ 20   │ Sec&2row   │
│ 3   │ 30   │ Third#3row │

julia> XLSX.writetable("df1.xlsx", DataFrames.columns(df1), DataFrames.names(df1))

Result - with LibreOffice Calc

"Sec&2row" value is missing:

screenshot from 2018-07-18 18-39-51

Expecting:

│ COL1 │ COL2       │
┼──────┼────────────┤
│ 10   │ Fist%1row  │
│ 20   │ Sec&2row   │  <---------
│ 30   │ Third#3row │  <---------

Julia 0.7 test XLSX error

Will XLSX work on Julia 0.7 ?

test XLSX
......
┌ Warning: `method_exists(f, t)` is deprecated, use `hasmethod(f, t)` instead.
│   caller = top-level scope at none:0
└ @ Core none:0
WARNING: importing deprecated binding Base.Dates into XLSX.
WARNING: Base.Dates is deprecated, run `using Dates` instead
  likely near C:\Users\PC\.julia\packages\XLSX\ETIGH\src\structs.jl:45
WARNING: Base.Dates is deprecated, run `using Dates` instead
  likely near C:\Users\PC\.julia\packages\XLSX\ETIGH\src\structs.jl:45
WARNING: Base.Dates is deprecated, run `using Dates` instead
  likely near C:\Users\PC\.julia\packages\XLSX\ETIGH\src\structs.jl:45
WARNING: importing deprecated binding Base.Nullable into XLSX.
ERROR: LoadError: LoadError: TypeError: in Type{...} expression, expected UnionAll, got typeof(Base.Nullable)
Stacktrace:
 [1] top-level scope at none:0
 [2] include at .\boot.jl:317 [inlined]
 [3] include_relative(::Module, ::String) at .\loading.jl:1038
 [4] include at .\sysimg.jl:29 [inlined]
 [5] include(::String) at C:\Users\PC\.julia\packages\XLSX\ETIGH\src\XLSX.jl:3
 [6] top-level scope at none:0
 [7] include at .\boot.jl:317 [inlined]
 [8] include_relative(::Module, ::String) at .\loading.jl:1038
 [9] include(::Module, ::String) at .\sysimg.jl:29
 [10] top-level scope at none:2
 [11] eval at .\boot.jl:319 [inlined]
 [12] eval(::Expr) at .\client.jl:399
 [13] top-level scope at .\none:3
in expression starting at C:\Users\PC\.julia\packages\XLSX\ETIGH\src\structs.jl:153
in expression starting at C:\Users\PC\.julia\packages\XLSX\ETIGH\src\XLSX.jl:14
ERROR: LoadError: Failed to precompile XLSX [fdbf4ff8-1666-58a4-91e7-1b58723a45e0] to C:\Users\PC\.julia\compil
d\v0.7\XLSX\gPxqz.ji.
Stacktrace:
 [1] error(::String) at .\error.jl:33
 [2] macro expansion at .\logging.jl:313 [inlined]
 [3] compilecache(::Base.PkgId, ::String) at .\loading.jl:1185
 [4] _require(::Base.PkgId) at .\logging.jl:311
 [5] require(::Base.PkgId) at .\loading.jl:852
 [6] macro expansion at .\logging.jl:311 [inlined]
 [7] require(::Module, ::Symbol) at .\loading.jl:834
 [8] include at .\boot.jl:317 [inlined]
 [9] include_relative(::Module, ::String) at .\loading.jl:1038
 [10] include(::Module, ::String) at .\sysimg.jl:29
 [11] include(::String) at .\client.jl:398
 [12] top-level scope at none:0
in expression starting at C:\Users\PC\.julia\packages\XLSX\ETIGH\test\runtests.jl:2
ERROR: Package XLSX errored during testing

Paul

gettable 'stop in empty row' option

Current behavior is to stop reading rows if gettable finds an empty row.
There should be an option to keep reading rows until there´s no more data in the spreadsheet.

Excel file with charts can't be opened

When using readxlsx I get ERROR: Couldn't parse worksheet somefile.xlsx. A copy of the file with the tabs containing charts removed is read successfully.

error on openxlsx with PrinterSetting

I was trying to add new sheet to my working file, and encountered error

https://drive.google.com/file/d/1j3Ajooj6j8LmrEd7wrc8ylDK7Y5YldZy/view?usp=sharing

p = joinpath(@__DIR__, "temp.xlsx")
XLSX.openxlsx(p, mode="rw") do xf
    sheet = xf[1]
    @show xf
end

ERROR: MethodError: no method matching read(::ZipFile.ReadableFile, ::Type{UInt8}, ::UInt32)

it happens on here while trying to read xl/printerSettings/printerSettings1.bin
https://github.com/felipenoris/XLSX.jl/blob/master/src/read.jl#L181

ZipFile.read(f, UInt8, f.uncompressedsize) ->ZipFile.read(f) seems to work. But I don't know it's right thing to do

Warn for supported datatypes for setdata!

ERROR: LoadError: MethodError: no method matching setdata!(::XLSX.Worksheet, ::XLSX.CellRef, ::Symbol)

Closest candidates are:
setdata!(::XLSX.Worksheet, ::XLSX.CellRef, ::DateTime) at /Users/hannaekfalth/.julia/v0.6/XLSX/src/write.jl:248
setdata!(::XLSX.Worksheet, ::XLSX.CellRef, ::Base.Dates.Time) at /Users/hannaekfalth/.julia/v0.6/XLSX/src/write.jl:235
setdata!(::XLSX.Worksheet, ::XLSX.CellRef, ::Date) at /Users/hannaekfalth/.julia/v0.6/XLSX/src/write.jl:222

Erroneous parsing of defined names

I have a workbook that has traveled through many hands, so I don't know the origins of these "hidden" defined names, but they're breaking XLSX.jl. I replaced the error() at the end of parse_workbook! with println("Could not parse named range '$(ref_or_range_str)' in $defined_name_node."), yielding this:

Could not parse named range '2' in <definedName name="lssolver_est" localSheetId="1" hidden="1">2</definedName>.
Could not parse named range '1000' in <definedName name="lssolver_itr" localSheetId="1" hidden="1">1000</definedName>.
Could not parse named range '0' in <definedName name="lssolver_neg" localSheetId="1" hidden="1">0</definedName>.
Could not parse named range '0.000001' in <definedName name="lssolver_piv" localSheetId="1" hidden="1">0.000001</definedName>.

...

I don't know the meaning of "hidden" defined names, but the parser seems to be interpreting the value incorrectly as the name of the range.

How to insert vector verticaly to existing file ?

Hi,
XLSX is very usuefull, thanks!
Below, my new vector rand(5) is starting from cel B2, but in sheet is horizontaly. How to insert vector but verticaly ? In cels B1:B10 ?

julia> XLSX.openxlsx("my_new_file.xlsx", mode="rw") do xf
sheet = xf[1]
sheet["B2"] = rand(5)
end
5-element Array{Float64,1}:
0.19797082062820803
0.48317579235510166
0.09707096411823213
0.26370710993348934
0.8109768288800014

Thanks
Paul

Read from IO

Use case: Excel file inside a Zip or Gzip file.

Detect valid XLSX file before opening

I tried to open http://www.foodstandards.gov.au/science/monitoringnutrients/ausnut/Documents/8b.%20AUSNUT%202011-13%20AHS%20Food%20Nutrient%20Database.xls

MWE

using XLSX
nutrients_data_filename = download("http://www.foodstandards.gov.au/science/monitoringnutrients/ausnut/Documents/8b.%20AUSNUT%202011-13%20AHS%20Food%20Nutrient%20Database.xls")
xf = XLSX.openxlsx(nutrients_data_filename)

Error:

failed to find end of centeral directory record

Stacktrace:
 [1] _find_enddiroffset(::IOStream) at /home/wheel/oxinabox/.julia/v0.6/ZipFile/src/ZipFile.jl:292
 [2] ZipFile.Reader(::IOStream, ::Bool) at /home/wheel/oxinabox/.julia/v0.6/ZipFile/src/ZipFile.jl:111
 [3] XLSX.XLSXFile(::String, ::Bool, ::Bool) at /home/wheel/oxinabox/.julia/v0.6/XLSX/src/structs.jl:219
 [4] open_or_read_xlsx(::String, ::Bool, ::Bool, ::Bool) at /home/wheel/oxinabox/.julia/v0.6/XLSX/src/read.jl:54
 [5] #openxlsx#9 at /home/wheel/oxinabox/.julia/v0.6/XLSX/src/read.jl:46 [inlined]
 [6] openxlsx(::String) at /home/wheel/oxinabox/.julia/v0.6/XLSX/src/read.jl:4

This is with v0.2.0 on julia 0.6.2

XML streaming to handle large Excel files

One drawback of the current implementation for this package is that the whole XML file from a Excel package are parsed in the moment the file is read at XML.read("filename.xlsx") worksheet is parsed when accessing any information of the worksheet.

One idea is to stream XML for worksheet XML files.

I found that EzXML.jl has support for streaming XML files.

Can't open an XLSX file in Julia 1.0 with Linux Mint 19

I installed the package in Windows 10 for Julia 1.0 and was able to read xlsx files.

I took the very same steps for installation on my Linux Mint 19 partition (I'm running dual boot); I saw no errors in installing, building, and using the package, but whenever I attempted to open even the smallest file, I got this message:

ERROR: error compiling #7: error compiling unsafe_read: error compiling read: error compiling fillbuf: could not load library "libz"
libz.so: cannot open shared object file: No such file or directory

Stacktrace:
[1] macro expansion at /home/user/.julia/packages/EzXML/DUxj7/src/error.jl:50 [inlined]
[2] readxml(::ZipFile.ReadableFile) at /home/user/.julia/packages/EzXML/DUxj7/src/document.jl:153
[3] internal_xml_file_read(::XLSX.XLSXFile, ::String) at /home/user/.julia/packages/XLSX/8Z2SV/src/read.jl:419
[4] open_or_read_xlsx(::String, ::Bool, ::Bool, ::Bool) at /home/user/.julia/packages/XLSX/8Z2SV/src/read.jl:179
[5] readxlsx(::String) at /home/user/julia/packages/XLSX/8Z2SV/src/read.jl:34
[6] top-level scope at none:0

Get Table

When using sheet[:], dimension(sheet) is used to identify the range of cells to query.
Sometimes the result is not what´s expected, because the sheet´s dimension may contain empty cells. This is the case when the Excel file was generated from a template.

There should be a gettable method to return data based on columns selected by the user, or based on a cell anchor + inspection of the table area.

The result could be an array of columns and array of headers, just like readdlm function, to help integration with DataFrames or other table-like packages.

This should be based on a new row iterator, so that DataStreams integration could use this low-level iterator to stream data.

Accept matrix as input for writetable!

Hello,

currently, there is some functionality for writing an AbstractVector (see source). By default, a vector is written into a single row.

How about extending the interface to enable the following ? I could make a pull request, but would such an extension be desirable?

  1. enable writing vectors into columns (same interface as writing into rows)
  2. enable writing 2-D arrays, i.e. AbstractMatrix directly to the sheet (similar interface)

This could lead to some design changes:

  • The functions available for data::AbstractVector where the row is an Integer and the argument for the columns is a UnitRange or a Colon could be duplicated with equivalent functions where instead the column is an Integer and the argument for the rows can be a UnitRange or a Colon
  • The functions available for data::AbstractVector could be duplicated, but allowing data::AbstractMatrix and corresponding possibilities for the row/column arguments
  • Base.setindex!(sheet::Worksheet, data::AbstractVector, ref_str::AbstractString) currently converts ref_str to a CellRef. Instead, one could try to convert it to either a CellRef or to a CellRange and then proceed accordingly

Decouple `TableRow` from worksheet cells

Currently TableRow has a reference to the whole XLSX package.
The common use case is to read just the cell values from a table with Base.getindex.
If TableRow stores only the cell values, it can be used for streaming data in parallel, without having to serialize the whole XLSX package between processes.

There could also be a eachtablerowcells to recover the getcell feature.

XLSX.jl locks file on Windows / Julia 0.7

While attempting to make this package run on Julia 0.7-beta2, I ran into this (I cannot reproduce this on Julia 0.6.2!)

I'm on a Windows 10 machine, and this error keeps popping up while running the tests on the very last line where rm("output_tables.xlsx") is called:

ERROR: LoadError: unlink: resource busy or locked (EBUSY)
Stacktrace:
 [1] uv_error at .\libuv.jl:80 [inlined]
 [2] unlink(::String) at .\file.jl:741
 [3] #rm#9(::Bool, ::Bool, ::Function, ::String) at .\file.jl:253
 [4] rm(::String) at .\file.jl:245
 [5] top-level scope at none:0
 [6] include at .\boot.jl:317 [inlined]
 [7] include_relative(::Module, ::String) at .\loading.jl:1034
 [8] include(::Module, ::String) at .\sysimg.jl:29
 [9] include(::String) at .\client.jl:393
 [10] top-level scope at none:0
in expression starting at C:\Users\Gebruiker\.julia\dev\XLSX\test\runtests.jl:1035
ERROR: Package XLSX errored during testing

Apparantly, this is a common problem on Windows with respect to file IO, see for instance https://discourse.julialang.org/t/windows-specific-uv-error/3127/8.

I suspect that it might (partially) be a base Julia bug, but for some reason it only fails at the last rm("output_tables.xlsx"), and not on previous calls. Do you have any idea what is so special about this last time, are there operations in the final set of tests that make it lock the file?

Write API Improvements

  • Select style template cell: setdata!(ws, ref, value; template_cell::CellRef=ref)
  • Implement XLSX.addsheet!(wb::Workbook)
  • XLSX.writetable option to export multiple tables in one Excel file. One Worksheet per table.
  • Fix calcChain.xml issue when editing Excel files with formulas.
  • Check if sst relationship exists before saving sst xml.
  • Review README.md for new write/edit methods.

sst optimization

The shared string table (sst) is queried each time a string cell is read. The sst can be cached for performance optimization.

AssertionError: isempty(XML_GLOBAL_ERROR_STACK)

Hi,

When I try to readxlsx blank .xlsx file I am getting this error:

   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.0.1 (2018-09-29)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

ERROR: LoadError: AssertionError: isempty(XML_GLOBAL_ERROR_STACK)
Stacktrace:
 [1] macro expansion at C:\Users\pn\.julia\packages\EzXML\DUxj7\src\error.jl:49 [inlined]
 [2] readxml(::ZipFile.ReadableFile) at C:\Users\pn\.julia\packages\EzXML\DUxj7\src\document.jl:153
 [3] internal_xml_file_read(::XLSX.XLSXFile, ::String) at C:\Users\pn\.julia\packages\XLSX\4MZLm\src
\read.jl:419
 [4] open_or_read_xlsx(::String, ::Bool, ::Bool, ::Bool) at C:\Users\pn\.julia\packages\XLSX\4MZLm\s
rc\read.jl:179
 [5] readxlsx(::String) at C:\Users\pn\.julia\packages\XLSX\4MZLm\src\read.jl:34
 [6] top-level scope at none:0
 [7] include_string(::Module, ::String, ::String) at .\loading.jl:1005
 [8] (::getfield(Atom, Symbol("##129#135")){String,String,Module})() at C:\Users\pn\.julia\packages\
Atom\7rQ1O\src\eval.jl:125
 [9] withpath(::getfield(Atom, Symbol("##129#135")){String,String,Module}, ::String) at C:\Users\pn\
.julia\packages\CodeTools\hB4Hy\src\utils.jl:30
 [10] withpath at C:\Users\pn\.julia\packages\Atom\7rQ1O\src\eval.jl:46 [inlined]
 [11] #128 at C:\Users\pn\.julia\packages\Atom\7rQ1O\src\eval.jl:122 [inlined]
 [12] with_logstate(::getfield(Atom, Symbol("##128#134")){String,String,Module}, ::Base.CoreLogging.
LogState) at .\logging.jl:397
 [13] with_logger at .\logging.jl:493 [inlined]
 [14] #127 at C:\Users\pn\.julia\packages\Atom\7rQ1O\src\eval.jl:121 [inlined]
 [15] hideprompt(::getfield(Atom, Symbol("##127#133")){String,String,Module}) at C:\Users\pn\.julia\
packages\Atom\7rQ1O\src\repl.jl:85
 [16] macro expansion at C:\Users\pn\.julia\packages\Atom\7rQ1O\src\eval.jl:120 [inlined]
 [17] (::getfield(Atom, Symbol("##126#132")){Dict{String,Any}})() at .\task.jl:85

Thanks,
Piotr

Secret message error challenge

@assert sst.formatted_strings[i+1] == str_formatted "\"Congratulations. You've just discovered the secret message. Please send your answer to Old Pink, care of the Funny Farm, Chalfont…\"\nPlease, file an issue at https://github.com/felipenoris/XLSX.jl..."

Ref.

better API docs

Currently the API index is auto generated.
Public methods should be documented, and Private methods should not be shown in the API Index.

Writing Excel files is very slow

Opening an Excel file using XLSX.openxlsx( filename, mode = "rw" ) is unexpectedly slow even for small(ish) files. Is this expected behaviour?

The script below illustrates the point:

# This script performs and times the following actions:
# 1. generate an Excel file of 10 tabs with 100x25 of data.
# 2. open the file for reading.
# 3. open the file for read/writing.
using XLSX

# Generate XLSX file
print( "Data generation: " )
@time XLSX.openxlsx( "XLSXtest.xlsx", mode = "w" ) do xf
    for ii in 1:10
        fSheet = XLSX.addsheet!( xf )

        for jj in 1:100, kk in 1:25
            fSheet[ XLSX.CellRef( kk, jj ) ] = jj * kk
        end  # for jj in 1:25, kk in 1:100
    end  # for ii in 1:10
end  # XLSX.openxlsx( "XLSXtest.xlsx", mode = "w" ) do xf

print( "Open file in mode \"r\": " )
@time XLSX.openxlsx( "XLSXtest.xlsx", mode = "r" ) do xf
end  # XLSX.openxlsx( "XLSXtest.xlsx", mode = "r" ) do xf

print( "Open file in mode \"rw\": " )
@time XLSX.openxlsx( "XLSXtest.xlsx", mode = "rw" ) do xf
end  # XLSX.openxlsx( "XLSXtest.xlsx", mode = "rw" ) do xf

which gave as output:

Data generation:   0.193967 seconds (1.07 M allocations: 39.604 MiB, 10.18% gc time)
Open file in mode "r":   0.570344 seconds (367.24 k allocations: 14.159 MiB)
Open file in mode "rw":  15.477702 seconds (13.14 M allocations: 447.368 MiB, 0.47% gc time)

The test was performed on Julia 1.0 with XLSX 0.4.1. Similar results are obtained on Julia 0.6.3 and XLSX 0.3.0.

00:00:00 to 1899-12-30 bug with XLSX

In my data set, I have a data type Time in 24-hour notation where midnight is written as 00:00:00. Then if I read the data with XLSX.readdata(), then 00:00:00 is converted to the date 1899-12-30. This is with julia v0.7 and XLSX v0.4.2

For example, let's say that we have an excel file (test.xlsx) with only one column such that:

Time
23:58:00
23:59:00
00:00:00
00:01:00

If I extract the time from the excel files using XLSX.readdata(), I get the following :

julia> x = XLSX.readdata("test.xlsx", "Sheet1", "A2:A5") 

4×1 Array{String,2}:
 "23:58:00"  
 "23:59:00"  
 "1899-12-30"
 "00:01:00"

I know that this bug is common with spreadsheets software data manipulations. Is there a fix to that bug? (Besides writing a function to filter the date.)

Thank you in advance.

Columns argument in XLSX.readtable() not recognized

I tried to use readable to directly read into a dataframe but it seems that I cannot pass the column argument as described in the documentation. If I do

df = DataFrame(XLSX.readtable("my_data.xlsx", "raw_data", columns="C"; first_row=20, header=true)...)

I get the error

MethodError: no method matching readtable(::String, ::String; columns="C", first_row=20, header=true)
Closest candidates are:
  readtable(::AbstractString, ::Union{Int64, AbstractString}; first_row, column_labels, header, infer_eltypes, stop_in_empty_row, stop_in_row_function, enable_cache) at /Users/born/.julia/packages/XLSX/j2fqF/src/read.jl:538 got unsupported keyword argument "columns"
kwerr(::NamedTuple{(:columns, :first_row, :header),Tuple{String,Int64,Bool}}, ::Function, ::String, ::String) at error.jl:97
(::getfield(XLSX, Symbol("#kw##readtable")))(::NamedTuple{(:columns, :first_row, :header),Tuple{String,Int64,Bool}}, ::typeof(XLSX.readtable), ::String, ::String) at none:0
top-level scope at none:0
include_string(::Module, ::String, ::String) at loading.jl:1008
include_string(::Module, ::String, ::String, ::Int64) at eval.jl:30
(::getfield(Atom, Symbol("##114#119")){String,Int64,String})() at eval.jl:94
withpath(::getfield(Atom, Symbol("##114#119")){String,Int64,String}, ::String) at utils.jl:30
withpath at eval.jl:46 [inlined]
#113 at eval.jl:93 [inlined]
with_logstate(::getfield(Atom, Symbol("##113#118")){String,Int64,String}, ::Base.CoreLogging.LogState) at logging.jl:395
with_logger at logging.jl:491 [inlined]
#112 at eval.jl:92 [inlined]
hideprompt(::getfield(Atom, Symbol("##112#117")){String,Int64,String}) at repl.jl:85
macro expansion at eval.jl:91 [inlined]
(::getfield(Atom, Symbol("##111#116")))(::Dict{String,Any}) at eval.jl:86
handlemsg(::Dict{String,Any}, ::Dict{String,Any}) at comm.jl:164
(::getfield(Atom, Symbol("##19#21")){Array{Any,1}})() at task.jl:259

Looking at the source at

function readtable(filepath::AbstractString, sheet::Union{AbstractString, Int}; first_row::Int = 1, column_labels::Vector{Symbol}=Vector{Symbol}(), header::Bool=true, infer_eltypes::Bool=false, stop_in_empty_row::Bool=true, stop_in_row_function::Union{Nothing, Function}=nothing, enable_cache::Bool=false)

there also doesn't seem to exist a columns= argument. So is this just a mistake in the documentation or would it be possible to specify the columns to be extracted in readtable?

I'm using Julia 1.0.2 on MacOS. Thanks for your help.

Failed to precompile XLSX on julia 1.0.1

Hello,

I have some troubles to make XLSX package work on my computer. I am still a beginner in programming, thus it may be an obvious problem that I don't see, but when I try to use the package I have the following error message :

[ Info: Precompiling XLSX [fdbf4ff8-1666-58a4-91e7-1b58723a45e0]
ERROR: LoadError: LoadError: could not open file C:\Users\thiba\AppData\Local\JuliaPro-1.0.1.1\pkgs-
1.0.1.1\packages\ZipFile\p60bh\deps\deps.jl
Stacktrace:
[1] include at .\boot.jl:317 [inlined]
[2] include_relative(::Module, ::String) at .\loading.jl:1041
[3] include at .\sysimg.jl:29 [inlined]
[4] include(::String) at C:\Users\thiba\AppData\Local\JuliaPro-1.0.1.1\pkgs-1.0.1.1\packages\ZipFil
e\p60bh\src\Zlib.jl:26
[5] top-level scope at none:0
[6] include at .\boot.jl:317 [inlined]
[7] include_relative(::Module, ::String) at .\loading.jl:1041
[8] include at .\sysimg.jl:29 [inlined]
[9] include(::String) at C:\Users\thiba\AppData\Local\JuliaPro-1.0.1.1\pkgs-1.0.1.1\packages\ZipFil
e\p60bh\src\ZipFile.jl:36
[10] top-level scope at none:0
[11] include at .\boot.jl:317 [inlined]
[12] include_relative(::Module, ::String) at .\loading.jl:1041
[13] include(::Module, ::String) at .\sysimg.jl:29
[14] top-level scope at none:2
[15] eval at .\boot.jl:319 [inlined]
[16] eval(::Expr) at .\client.jl:389
[17] top-level scope at .\none:3
in expression starting at C:\Users\thiba\AppData\Local\JuliaPro-1.0.1.1\pkgs-1.0.1.1\packages\ZipFil
e\p60bh\src\Zlib.jl:50
in expression starting at C:\Users\thiba\AppData\Local\JuliaPro-1.0.1.1\pkgs-1.0.1.1\packages\ZipFil
e\p60bh\src\ZipFile.jl:43
ERROR: LoadError: Failed to precompile ZipFile [a5390f91-8eb1-5f08-bee0-b1d1ffed6cea] to C:\Users\th
iba\AppData\Local\JuliaPro-1.0.1.1\pkgs-1.0.1.1\compiled\v1.0\ZipFile\cOum2.ji.
Stacktrace:
[1] error(::String) at .\error.jl:33
[2] macro expansion at .\logging.jl:313 [inlined]
[3] compilecache(::Base.PkgId, ::String) at .\loading.jl:1187
[4] _require(::Base.PkgId) at .\logging.jl:311
[5] require(::Base.PkgId) at .\loading.jl:855
[6] macro expansion at .\logging.jl:311 [inlined]
[7] require(::Module, ::Symbol) at .\loading.jl:837
[8] include at .\boot.jl:317 [inlined]
[9] include_relative(::Module, ::String) at .\loading.jl:1041
[10] include(::Module, ::String) at .\sysimg.jl:29
[11] top-level scope at none:2
[12] eval at .\boot.jl:319 [inlined]
[13] eval(::Expr) at .\client.jl:389
[14] top-level scope at .\none:3
in expression starting at C:\Users\thiba\AppData\Local\JuliaPro-1.0.1.1\pkgs-1.0.1.1\packages\XLSX\B
v7r3\src\XLSX.jl:8

My package versions :

julia> Pkg.status()
Status C:\Users\thiba\AppData\Local\JuliaPro-1.0.1.1\pkgs-1.0.1.1\environments\JuliaPro_v1.0.1. 1\Project.toml
[c52e3926] Atom v0.7.14
[a076750e] CPLEX v0.4.3
[864edb3b] DataStructures v0.15.0
[31c24e10] Distributions v0.16.4
[c04bee98] ExcelReaders v0.10.2
[3c7084bd] GLPKMathProgInterface v0.4.4
[c91e804a] Gadfly v1.0.1
[2e9cd046] Gurobi v0.5.9
[682c06a0] JSON v0.20.0
[4076af6c] JuMP v0.18.5
[e5e0dc1b] Juno v0.5.4
[b964fa9f] LaTeXStrings v1.0.3
[fdba3010] MathProgBase v0.7.7
[6f286f6a] MultivariateStats v0.6.0
[86f7a689] NamedArrays v0.9.2
[73a701b4] NamedTuples v5.0.0
[24249f21] SymPy v0.8.3
[fdbf4ff8] XLSX v0.4.4

And I am running on windows 10.

Thanks

pretty printing for repl?

Currently, it's cumbersome to know how many sheets are in XLSXFile object or which cells have values inside XLSX.Worksheet

Here is my ideas for show function.

function Base.show(io::IO, xf::XLSX.XLSXFile) 
    wb = xf.workbook
    print(io, "XLSXFile(\"$(basename(xf.filepath))\") ",
              "containing $(XLSX.sheetcount(wb)) Worksheets\n")
    @printf(io, "%20s %-6s %-10s\n", "sheetname", "range", "size")
    println(io, "-"^(20+1+6+1+10))

    for s in wb.sheets
        dim = s.dimension
        _size = size(dim) |> x -> string(x[1], "x", x[2])
        @printf("%20s %-6s %-10s\n", s.name, dim, _size)
    end
end
# displayed as
Julia> XLSXFile("myfile.xlsx") containing 3 Worksheets
           sheetname range  size
--------------------------------------
             mysheet A1:B4  4x2
          othersheet A1:A1  1x1
               named B4:B4  1x1
function Base.show(io::IO, ws::XLSX.Worksheet) 
    range = XLSX.get_dimension(ws)
    dim = size(range)
    print(io, dim[1], "x", dim[2], " XLSX.Worksheet[\"$(ws.name)!$range\"]")
end
# displayed as
Julia> 4x2 XLSX.Worksheet["mysheet!A1:B4"]

Julia 0.7 Win7/ 64 error

D:\pawel\Julia_7.0_otwarty>C:\Users\PC\AppData\Local\Julia-0.7.0\bin\julia.exe -L start.jl

           _

_ _ ()_ | A fresh approach to technical computing
() | () () | Documentation: https://docs.julialang.org
_ _ | | __ _ | Type "?" for help, "]?" for Pkg help.
| | | | | | |/ ` | |
| | |
| | | | (
| | | Version 0.7.0 (2018-08-08 06:46 UTC)
/ |_'|||_'_| | Official http://julialang.org/ release
|__/ | x86_64-w64-mingw32

(v0.7) pkg> up
Updating registry at C:\Users\PC\.julia\registries\General
Updating git-repo https://github.com/JuliaRegistries/General.git
Resolving package versions...
Updating C:\Users\PC\.julia\environments\v0.7\Project.toml
[no changes]
Updating C:\Users\PC\.julia\environments\v0.7\Manifest.toml
[no changes]

(v0.7) pkg> add XLSX
Resolving package versions...
Installed ZipFile ─ v0.6.0
Installed EzXML ─── v0.8.0
Installed XLSX ──── v0.2.2
Updating C:\Users\PC\.julia\environments\v0.7\Project.toml
[fdbf4ff8] + XLSX v0.2.2
Updating C:\Users\PC\.julia\environments\v0.7\Manifest.toml
[8f5d6c58] + EzXML v0.8.0
[fdbf4ff8] + XLSX v0.2.2
[a5390f91] + ZipFile v0.6.0
Building EzXML → C:\Users\PC\.julia\packages\EzXML\3U9Ow\deps\build.log

(v0.7) pkg> up
Updating registry at C:\Users\PC\.julia\registries\General
Updating git-repo https://github.com/JuliaRegistries/General.git
Resolving package versions...
Updating C:\Users\PC\.julia\environments\v0.7\Project.toml
[no changes]
Updating C:\Users\PC\.julia\environments\v0.7\Manifest.toml
[no changes]

julia> using XLSX
[ Info: Precompiling XLSX [fdbf4ff8-1666-58a4-91e7-1b58723a45e0]
┌ Warning: method_exists(f, t) is deprecated, use hasmethod(f, t) instead.
│ caller = top-level scope at none:0
└ @ Core none:0
WARNING: importing deprecated binding Base.Dates into XLSX.
WARNING: Base.Dates is deprecated, run using Dates instead
likely near C:\Users\PC.julia\packages\XLSX\ETIGH\src\structs.jl:45
WARNING: Base.Dates is deprecated, run using Dates instead
likely near C:\Users\PC.julia\packages\XLSX\ETIGH\src\structs.jl:45
WARNING: Base.Dates is deprecated, run using Dates instead
likely near C:\Users\PC.julia\packages\XLSX\ETIGH\src\structs.jl:45
WARNING: importing deprecated binding Base.Nullable into XLSX.
ERROR: LoadError: LoadError: TypeError: in Type{...} expression, expected UnionAll, got typeof(Base.Nullable)
Stacktrace:
[1] top-level scope at none:0
[2] include at .\boot.jl:317 [inlined]
[3] include_relative(::Module, ::String) at .\loading.jl:1038
[4] include at .\sysimg.jl:29 [inlined]
[5] include(::String) at C:\Users\PC.julia\packages\XLSX\ETIGH\src\XLSX.jl:3
[6] top-level scope at none:0
[7] include at .\boot.jl:317 [inlined]
[8] include_relative(::Module, ::String) at .\loading.jl:1038
[9] include(::Module, ::String) at .\sysimg.jl:29
[10] top-level scope at none:2
[11] eval at .\boot.jl:319 [inlined]
[12] eval(::Expr) at .\client.jl:399
[13] top-level scope at .\none:3
in expression starting at C:\Users\PC.julia\packages\XLSX\ETIGH\src\structs.jl:153
in expression starting at C:\Users\PC.julia\packages\XLSX\ETIGH\src\XLSX.jl:14
ERROR: Failed to precompile XLSX [fdbf4ff8-1666-58a4-91e7-1b58723a45e0] to C:\Users\PC.julia\compiled\v0.7\XLSX
\gPxqz.ji.
Stacktrace:
[1] error(::String) at .\error.jl:33
[2] macro expansion at .\logging.jl:313 [inlined]
[3] compilecache(::Base.PkgId, ::String) at .\loading.jl:1185
[4] _require(::Base.PkgId) at .\logging.jl:311
[5] require(::Base.PkgId) at .\loading.jl:852
[6] macro expansion at .\logging.jl:311 [inlined]
[7] require(::Module, ::Symbol) at .\loading.jl:834

julia>

Paul

Write to IO

writetable should be able to write to IO.

Use case: Create XLSX using HTTP and download the file not saving to disk.

Tables.jl integration

This is a super exciting package! I've wanted a pure-Julia excel reader for a while. I'd be very interested in helping put together a DataStreams integration to hook into all the other data formats out there if you're open to a PR.

Feature request: creating & applying cell styles and conditional formatting

I'm currently working on a large project in Julia, and a part of it is generating Excel reports. Ideally, these reports should mark anomalous results by different text & background colour such that it's easy to see at a glance if everything is as expected or where things went wrong. In particular, I'm looking for a way to do the following 3 things from Julia:

  1. create a cellstyle (font + text/background colour) for use in the generated Excel file;
  2. assign a cellstyle to a specific cell; and
  3. create a conditional formatting and apply it to specific cells.

Would it be possible to provide this functionality in a reasonably intuitive manner?

Add support for Julia v0.7 and v1.0

I'm on 'XLSX#master' but precomfile fails on MAC (OS: macOS (x86_64-apple-darwin14.5.0)

(v0.7) pkg> precompile
Precompiling project...
Precompiling XLSX
[ Info: Precompiling XLSX [fdbf4ff8-1666-58a4-91e7-1b58723a45e0]
┌ Warning: `method_exists(f, t)` is deprecated, use `hasmethod(f, t)` instead.
│   caller = top-level scope at none:0
└ @ Core none:0
WARNING: importing deprecated binding Base.Dates into XLSX.
WARNING: Base.Dates is deprecated, run `using Dates` instead
  likely near /Users/devsisters/.julia/packages/XLSX/kKqNW/src/structs.jl:45
WARNING: Base.Dates is deprecated, run `using Dates` instead
  likely near /Users/devsisters/.julia/packages/XLSX/kKqNW/src/structs.jl:45
WARNING: Base.Dates is deprecated, run `using Dates` instead
  likely near /Users/devsisters/.julia/packages/XLSX/kKqNW/src/structs.jl:45
WARNING: importing deprecated binding Base.Nullable into XLSX.
ERROR: LoadError: LoadError: TypeError: in Type{...} expression, expected UnionAll, got typeof(Base.Nullable)
Stacktrace:

Add helper methods for DataFrames

All table methods avoid using DataFrames directly. I'm planning to add some helper methods to bring some sugar to the code. But this will add a depencency on DataFrames. Would like to know if someone is against this.

Performance issue

Hi, Felipe. Thanks for the quick fix!

Next issue: when I replace ExcelReaders with XLSX, the time to setup my model increases from 0.6 sec to 5 sec. I was hoping to get rid of the Python dependencies with a pure Julia reader, but the performance is prohibitive. Any thoughts on improving performance?

Support for writing Excel files

The data structures for this package were designed to allow for writing Excel files.
With some effort it will be possible to write simple Excel files, and generate new files from templates.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.