queryverse / excelreaders.jl Goto Github PK

View Code? Open in Web Editor NEW

58.0 4.0 20.0 314 KB

ExcelReaders is a package that provides functionality to read Excel files.

License: Other

Julia 100.00%

julia data excel queryverse

excelreaders.jl's People

Contributors

Stargazers

Watchers

Forkers

dhoegh nagyist pdobacz scottpjones tranquilhero benjaminbenwu tkelman hedgefair whrl jorgeiv500 gregfa hhc2tech stjordanis cmcaine pstaabp standardgalactic playfloor ayeshakiran1

excelreaders.jl's Issues

Reading a spredsheet with skipstartrows=:blanks and nrows=someInt results in error

I found an error with the readxlsheet function. It seems like doing something like readxlsheet("sheet.xlsx", 1, nrows=7) makes Julia try to add 7 + :blanks.

This is the error it throws

MethodError: no method matching +(::Int64, ::Symbol)
Closest candidates are:
  +(::Any, ::Any, ::Any, ::Any...) at operators.jl:424
  +(::PyCall.PyObject, ::Any) at C:\Users\mijail\.julia\v0.6\PyCall\src\pyoperators.jl:11
  +(::T<:Union{Int128, Int16, Int32, Int64, Int8, UInt128, UInt16, UInt32, UInt64, UInt8}, ::T<:Union{Int128, Int16, Int32, Int64, Int8, UInt128, UInt16, UInt32, UInt64, UInt8}) where T<:Union{Int128, Int16, Int32, Int64, Int8, UInt128, UInt16, UInt32, UInt64, UInt8} at int.jl:32
  ...

Stacktrace:
 [1] #convert_args_to_row_col#5(::Symbol, ::Int64, ::Int64, ::Int64, ::Function, ::PyCall.PyObject) at C:\Users\mijail\.julia\v0.6\ExcelReaders\src\ExcelReaders.jl:153
 [2] (::ExcelReaders.#kw##convert_args_to_row_col)(::Array{Any,1}, ::ExcelReaders.#convert_args_to_row_col, ::PyCall.PyObject) at .\<missing>:0
 [3] #readxlsheet#4(::Array{Any,1}, ::Function, ::ExcelReaders.ExcelFile, ::String) at C:\Users\mijail\.julia\v0.6\ExcelReaders\src\ExcelReaders.jl:96
 [4] (::ExcelReaders.#kw##readxlsheet)(::Array{Any,1}, ::ExcelReaders.#readxlsheet, ::ExcelReaders.ExcelFile, ::String) at .\<missing>:0
 [5] #readxlsheet#2(::Array{Any,1}, ::Function, ::ExcelReaders.ExcelFile, ::Int64) at C:\Users\mijail\.julia\v0.6\ExcelReaders\src\ExcelReaders.jl:86
 [6] (::ExcelReaders.#kw##readxlsheet)(::Array{Any,1}, ::ExcelReaders.#readxlsheet, ::ExcelReaders.ExcelFile, ::Int64) at .\<missing>:0
 [7] include_string(::String, ::String) at .\loading.jl:515

It works just fine if I do readxlsheet("sheet.xlsx", 1, skipstartrows=0, nrows=7).
Same thing happens with skipstartcols and ncols.

DataFrames v0.11 roadmap

ExcelReaders.jl will drop DataFrames support, and folks will instead be encourage to use ExcelFiles.jl to read things into a DataFrame.

julia = "1.6" in toml?

Does this mean that version 0.11 will be installed by julia 1.10?

Use the conda.jl package to install xlrd

ExcelReaders.jl is holding DataFrames and DataArray update?

Julia seems to think that ExcelReaders is holding DataFrame update.
But there is no upper bound in require file.

Reading from bytes / IOBuffer

I think it would be reasonable to support this directly as an overload of readxl / readxlsheet. Below is a example/workaround using xlrd directly.

using HTTP
using ExcelReaders

r = HTTP.get("url returning.xls")
wb = ExcelReaders.xlrd.open_workbook(; file_contents=r.body)
xf = ExcelReaders.ExcelFile(wb, "")
data = ExcelReaders.readxlsheet(xf, "Sheet1");

Empty cells in xls Julia showing as NA. Julia does not recognize this sign.

Empty cells in xls Julia showing as NA. Julia does not recognize this sign.
`julia> a=readxlsheet("dane/dane.xlsx","Arkusz1")
3×3 DataArrays.DataArray{Any,2}:
0.946871 0.753045 0.789162
0.867544 NA 0.940965
0.381519 0.456175 0.0888175

julia> sum(a)
NA
julia> findin(a,NA)
ERROR: UndefVarError: NA not defined
julia> findin(a,"NA")
0-element Array{Int64,1}
julia> findin(a,'NA')
ERROR: syntax: invalid character literal
julia> typeof(a[2,2])
DataArrays.NAtype
julia> (a[2,2])
NA`

IS posible to change it fo native for Julia NaN ? Paul

xlrd not automatically installed

The package uses the Python xlrd library. If either Python or the xlrd package are not installed on your system, the package will use the conda.jl package to install all necessary dependencies automatically.

I had to set ENV["PYTHON"]="", rebuild PyCall, restart Julia and then import ExcelReaders again to get PyCall to install xlrd.

I don't know if this is intended behaviour, but it certainly adds caveats to the above statement from the Readme (i.e. "package will use the conda.jl package to install all necessary dependencies automatically if you already have a Julia conda environment installed. If you do not...")

Reading excel file: ArgumentError: ref of NULL PyObject

Same issue as #20 but I wasnt as lucky.
Using Julia Pro 0.6 on Windows 7 64 bits. Also using Anaconda 4.4 Python 2.7. I installed xlrd with conda from console because ExcelReaders.jl couldn't do it for some reason.

openxl or readxl gives

ArgumentError: ref of NULL PyObject
in readxl at ExcelReaders\src\ExcelReaders.jl:195
in openxl at ExcelReaders\src\ExcelReaders.jl:75
in getindex at PyCall\src\PyCall.jl:239

and Pkg.test("ExcelReaders") gives:

INFO: Testing ExcelReaders
ERROR: LoadError: InitError: 
Failed to import required Python module xlrd.

For automated xlrd installation, try configuring PyCall to use the Conda.jl package's Python "Miniconda" distribution within Julia. Relaunch Julia and run:
    ENV["PYTHON"]=""
    Pkg.build("PyCall")
before trying again.

The pyimport exception was: PyError (ccall(@pysym(:PyImport_ImportModule), PyPtr, (Cstring,), name)) <class 'ImportError'>
ImportError("No module named 'xlrd'",)


Stacktrace:
 [1] pyimport_conda(::String, ::String, ::String) at E:\JuliaPro-0.6.0.1\pkgs-0.6.0.1\v0.6\PyCall\src\PyCall.jl:552
 [2] __init__() at E:\JuliaPro-0.6.0.1\pkgs-0.6.0.1\v0.6\ExcelReaders\src\ExcelReaders.jl:16
 [3] _include_from_serialized(::String) at .\loading.jl:157
 [4] _require_from_serialized(::Int64, ::Symbol, ::String, ::Bool) at .\loading.jl:200
 [5] _require_search_from_serialized(::Int64, ::Symbol, ::String, ::Bool) at .\loading.jl:236
 [6] _require(::Symbol) at .\loading.jl:434
 [7] require(::Symbol) at .\loading.jl:398
 [8] include_from_node1(::String) at .\loading.jl:569
 [9] include(::String) at .\sysimg.jl:14
 [10] process_options(::Base.JLOptions) at .\client.jl:305
 [11] _start() at .\client.jl:371
during initialization of module ExcelReaders
while loading E:\JuliaPro-0.6.0.1\pkgs-0.6.0.1\v0.6\ExcelReaders\test\runtests.jl, in expression starting on line 1
============================[ ERROR: ExcelReaders ]=============================

failed process: Process(`'E:\JuliaPro-0.6.0.1\Julia-0.6.0\bin\julia.exe' -Cx86-64 '-JE:\JuliaPro-0.6.0.1\Julia-0.6.0\lib\julia\sys.dll' --compile=yes --depwarn=yes --check-bounds=yes --code-coverage=none --color=no --compilecache=yes 'E:\JuliaPro-0.6.0.1\pkgs-0.6.0.1\v0.6\ExcelReaders\test\runtests.jl'`, ProcessExited(1)) [1]


================================================================================
Base.Pkg.Entry.PkgTestError("ExcelReaders had test errors")

Fail if first cell empty

When loading to a DataFrame, if the first cell of the first column is empty (remaining of the column has content), it fails with

LoadError: type Void has no field captures
 in convert_ref_to_sheet_row_col at C:\User\.julia\v0.4\ExcelReaders\src\ExcelReaders.jl:177
 in readxl at C:\User\.julia\v0.4\ExcelReaders\src\ExcelReaders.jl:246
 in include_string at loading.jl:282
 in include_string at C:\User\.julia\v0.4\CodeTools\src\eval.jl:32
 in anonymous at C:\User\.julia\v0.4\Atom\src\eval.jl:84
 in withpath at C:\User\.julia\v0.4\Requires\src\require.jl:37
 in withpath at C:\User\.julia\v0.4\Atom\src\eval.jl:53
 [inlined code] from C:\User\.julia\v0.4\Atom\src\eval.jl:83
 in anonymous at task.jl:58

Wouldn't it better to fill it with something like A (as DataFrame needs a name for the column)?

To reproduce:

using ExcelReaders
using DataFrames

f = openxl("Filename.xlsx")
df = readxl(DataFrame, f, "Sheet1")

use pyimport_conda

As of PyCall 1.5, you can now just do pyimport_conda("xlrd", "xlrd") to auto-install xlrd via Conda (if PyCall is using Conda) or print an informative error message, rather than duplicating that logic here. This will also let you remove the explicit Conda dependency.

Reading excel: ArgumentError: ref of NULL PyObject

i've tried to read xlsx file bying using readxl, by returning the NULL error. the Excel Readers tests passed though, with pycall functioning well. Any thing else i can try to fix ?

Read named references

Hi!
I was trying to read an specific range of an excel file (defined in Excel), like this:
dfData = readxl(DataFrame, file0, "Control!Investments", header=false)

But I think it can't be done right now. Is it possible?
If not, I think it is useful to incorporate this option.

Thanks!

TagBot trigger issue

This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.

If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours, please be patient!

Add tests for readxlnames and readxlrange

Support for error cells

Support "A3:A" style references?

In Excel, this refers to all the cells in the A column from row 3 till the bottom.

It would have been convenient for my use case to have this select from row 3 till the last row with any non-missing cells.

readxlsheet reads the empty cells as #NA

Hi, I'm using Julia 0.6.4, with version 0.9.0. This used to work fine in the earlier versions but I've recently updated the package to 0.9 and now readxlsheet reads the empty cells in the file as "#NA" and julia does not register this as a missing value. typeof() returns DataValue(Union{})().

I need these to register as missing values, are there any suggestions?

Add documentation for readxlnames and readxlrange

Reading sheets without column headings

Thanks for a useful package.

I just got a large excel file where the data has been split into separate sheets so only the first sheet has column headings. Is there a way I can manually give the column headings/names when reading in a sheet? (This way I can read the first sheet and then use its column headings when reading the other sheets)

Thanks for any pointers.

Feature request: read entire sheet into dataframe

As far as I can tell, it's possible to read an entire sheet using readxlsheet or read to a dataframe using readxl(DataFrame ...). Could we also have an option that lets us read an entire sheet into a dataframe (or is this possible and I'm just missing it?)

pycall api updates

PyCall 1.90.0 is now released, which change o[:foo] and o["foo"] to o.foo and o."foo", respectively, for python objects o; see also JuliaPy/PyCall.jl#629.

The old getindex methods still work but are deprecated, so you'll want to put out a new release that uses the new methods and REQUIREs PyCall 1.90.0 to avoid having zillions of deprecation messages.

Add a readxlsheet function

Error in kwfunc

Hello! I need Help!

My code - data = readxl(DataFrame, "A.xlsx", "Sheet!A1:X1834", header=true) - results in

function readxl does not accept keyword arguments
kwfunc(::Any) at boot.jl:237
include_string(::String, ::String) at loading.jl:522
include_string(::String, ::String, ::Int64) at eval.jl:30
include_string(::Module, ::String, ::String, ::Int64, ::Vararg{Int64,N} where N) at eval.jl:34
(::Atom.##102#107{String,Int64,String})() at eval.jl:82
withpath(::Atom.##102#107{String,Int64,String}, ::String) at utils.jl:30
withpath(::Function, ::String) at eval.jl:38
hideprompt(::Atom.##101#106{String,Int64,String}) at repl.jl:67
macro expansion at eval.jl:80 [inlined]
(::Atom.##100#105{Dict{String,Any}})() at task.jl:80

Before, this code it resulted in PyError (IOError, 2, not such file or directory). I think maybe be one error in PyCall, butt I am not right. When i closed Atom and reopen, the error changed for function readxl... keyword arguments. I am confused!

What is this?

Thank you so much!

Best

Backport Conda support to 0.3.1

Mainly so that there is a julia 0.3 compatible version that has the Conda support.

This should happen on the release-0.3 branch.

Tests fail on PkgEval now

http://pkg.julialang.org/logs/ExcelReaders_0.3.log

Is something not right with the new Conda stuff?

problem with instal on Julia 1.2

Win10 (64), updated.

Importand :
On my machine no dir C:\cygwin\

``
julia> Pkg.add("ExcelReaders")
Resolving package versions...
ERROR: MethodError: Cannot convert an object of type Nothing to an object of type String
Closest candidates are:
convert(::Type{T<:AbstractString}, ::T<:AbstractString) where T<:AbstractString at strings/basic.jl:208
convert(::Type{T<:AbstractString}, ::AbstractString) where T<:AbstractString at strings/basic.jl:209
convert(::Type{T}, ::T) where T at essentials.jl:167
Stacktrace:
[1] setindex!(::Dict{Base.UUID,String}, ::Nothing, ::Base.UUID) at .\dict.jl:380
[2] deps_graph(::Pkg.Types.Context, ::Dict{Base.UUID,String}, ::Dict{Base.UUID,Pkg.Types.VersionSpec}, ::Dict{Base.UUID,Pkg.Types.Fixed}) at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\Pkg\src\Operations.jl:424
[3] resolve_versions!(::Pkg.Types.Context, ::Array{Pkg.Types.PackageSpec,1}) at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\Pkg\src\Operations.jl:316
[4] #add#100(::Bool, ::typeof(Pkg.Operations.add), ::Pkg.Types.Context, ::Array{Pkg.Types.PackageSpec,1}, ::Array{Base.UUID,1}) at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\Pkg\src\Operations.jl:962
[5] #add at .\none:0 [inlined]
[6] #add#25(::Bool, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(Pkg.API.add), ::Pkg.Types.Context, ::Array{Pkg.Types.PackageSpec,1}) at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\Pkg\src\API.jl:99
[7] add at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\Pkg\src\API.jl:69 [inlined]
[8] #add#24 at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\Pkg\src\API.jl:67 [inlined]
[9] add at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\Pkg\src\API.jl:67 [inlined]
[10] #add#21 at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\Pkg\src\API.jl:65 [inlined]
[11] add at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\Pkg\src\API.jl:65 [inlined]
[12] #add#20(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(Pkg.API.add), ::String) at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\Pkg\src\API.jl:64
[13] add(::String) at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\Pkg\src\API.jl:64
[14] top-level scope at REPL[9]:1

julia> Pkg.build("ExcelReaders")
ERROR: The following package names could not be resolved:

ExcelReaders (not found in project or manifest)
Please specify by known name=uuid.
Stacktrace:
[1] pkgerror(::String) at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\Pkg\src\Types.jl:112
[2] #ensure_resolved#107(::Bool, ::typeof(Pkg.Types.ensure_resolved), ::Pkg.Types.EnvCache, ::Array{Pkg.Types.PackageSpec,1}) at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\Pkg\src\Types.jl:910
[3] ensure_resolved at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\Pkg\src\Types.jl:886 [inlined]
[4] #build#73(::Bool, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(Pkg.API.build), ::Pkg.Types.Context, ::Array{Pkg.Types.PackageSpec,1}) at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\Pkg\src\API.jl:398
[5] build at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\Pkg\src\API.jl:382 [inlined]
[6] #build#72 at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\Pkg\src\API.jl:380 [inlined]
[7] build at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\Pkg\src\API.jl:380 [inlined]
[8] #build#69 at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\Pkg\src\API.jl:377 [inlined]
[9] build(::String) at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\Pkg\src\API.jl:377
[10] top-level scope at REPL[10]:1

julia> versioninfo()
Julia Version 1.2.0
Commit c6da87ff4b (2019-08-20 00:03 UTC)
Platform Info:
OS: Windows (x86_64-w64-mingw32)
CPU: Intel(R) Core(TM) i7-2630QM CPU @ 2.00GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-6.0.1 (ORCJIT, sandybridge)

julia>
``

makes things simpler?

Hello. So far I've enjoyed the use of ExcelReaders. I think there are a few tweaks that could make it simpler and easier.

Is there a reason to have a separated readxl and readxlsheet functions?. Could the readxl funtion detect when you are not supplying a range and assume you want the whole sheet?. Maybe just with the number of arguments?.
Can the module require DataFrames and give a DataFrame as output by default?. If a DataArray is more convenient fore a certain application can this be selected with a keyword argument?.

thanks

Braulio

Is there reason to prevent constructing DataFrame with empty column name?

I found that if top row has any empty cell, I cannot use readxlsheet(DataFrame......

This code below would allow constructing DataFrame with an empty column name, But I don't know if this is an intentional restriction or not. 14ca83c

        if header
            headervec = data[1, :]
            NAcol = Bool.(isna(headervec))
            headervec[NAcol] = DataFrames.gennames(countnz(NAcol))

            colnames = convert(Array{Symbol},vec(headervec))
        else
            colnames = DataFrames.gennames(ncol)
        end

Reading excel file: ArgumentError: ref of NULL PyObject

Hi, I am new with Julia. I am trying with ExcelReaders but I get an error.
First, in order to use ExcelReaders I have to do:
ENV["PYTHON"]=""
Pkg.build("PyCall")

It seems to work, but when I try to read some excel file like data = readxl("File.xlsx", "Sheet1!A1:C4"), it gives the error ArgumentError: ref of NULL PyObject

I do not know what is it.
Thank you

xlrd 2.0.1 issues

the xlrd documentation now (as of Dec 2020) says
This library will no longer read anything other than .xls files.

I gave it a try, and ExcelReaders+xlrd 2.0.1 indeed fail on an xlsx file, so I had to revert to xlrd 1.2.0.

May I ask what the plans for ExcelReaders.jl are?

Why not use AbstractArray{Nullable{T}, N} or NullableArray{T, N}?

OK, it's one thing not to do so when reading in as DataArray.

But, the point of DataFrames is they already provide support for NA. But, it appears that when you scan though a column containing any NA even if all other values are of the same type you create a DataArray of that type for the column. So, that just breaks NA handling.

At the risk of increasing memory usage (quite real risk) why not convert the raw array you read in from the Excel sheet to DataFrames and let DataFrames figure out the types and do it's default NA handling? Maybe I don't understand so just a sort of brute force thought.

Given this limitation I'll just export the Excel data I was using (some state dept of education data) as csv and use the DataFrames package csv reader, which might handle the NAs. I'll report back if this works.

Thanks--I realize this is new work and relying on python xlrd, so no automatic handling for NAs there.

Tests crash on 0.4 nightlies for some windows build configuration

See here.

Probably needs to be fixed upstream JuliaPy/PyCall.jl#185.

Provide better error message for reading a DataFrame that has an empty cell in the header row

https://groups.google.com/forum/#!topic/julia-users/yvtOS21frnI

ArgumentError: ref of NULL PyObject

Hello! I just installed the newest version of JuliaPro (1.1.1.1) and ExcelReaders stopped working.

using ProgressMeter, BenchmarkTools, DataFrames, GLM, Combinatorics

# Set Directory

cd("...\\julia_codes")


global Country_Sample = "vsmall"

Data1 = readxlsheet("gravity_$(Country_Sample).xlsx", "Sheet1") '''

The following error appears 

```ArgumentError: ref of NULL PyObject
in top-level scope at base\none
in readxlsheet at ExcelReaders\PgE1r\src\ExcelReaders.jl:79
in #readxlsheet#3 at ExcelReaders\PgE1r\src\ExcelReaders.jl:79 
in openxl at ExcelReaders\PgE1r\src\ExcelReaders.jl:64
in getproperty at PyCall\ttONZ\src\PyCall.jl:313 
in getproperty at PyCall\ttONZ\src\PyCall.jl:303```

I have removed, added and built Conda and PyCall several times. 

Also, I added xldr through Conda.

Finally, this is the output when I run test ExcelReaders:

``` Testing ExcelReaders
    Status `C:\Users\jmcastro\AppData\Local\Temp\jl_8D0.tmp\Manifest.toml`
  [00ebfdb7] CSTParser v0.6.2
  [34da2185] Compat v2.1.0
  [8f4d0f93] Conda v1.3.0
  [864edb3b] DataStructures v0.17.0
  [e2d170a0] DataValueInterfaces v1.0.0
  [e7dc6d0d] DataValues v0.4.12
  [c04bee98] ExcelReaders v0.11.0
  [682c06a0] JSON v0.21.0
  [1914dd2f] MacroTools v0.5.1
  [bac558e1] OrderedCollections v1.1.0
  [69de0a69] Parsers v0.3.6
  [438e738f] PyCall v1.91.2
  [0796e94c] Tokenize v0.5.5
  [81def892] VersionParsing v1.1.3
  [2a0f44e3] Base64  [`@stdlib/Base64`]
  [b99e7846] BinaryProvider  [`@stdlib/BinaryProvider`]
  [864e158e] CredentialsHandler  [`@stdlib/CredentialsHandler`]
  [ade2ca70] Dates  [`@stdlib/Dates`]
  [8bb1440f] DelimitedFiles  [`@stdlib/DelimitedFiles`]
  [8ba89e20] Distributed  [`@stdlib/Distributed`]
  [cd3eb016] HTTP  [`@stdlib/HTTP`]
  [83e8ac13] IniFile  [`@stdlib/IniFile`]
  [b77e0a4c] InteractiveUtils  [`@stdlib/InteractiveUtils`]
  [76f85450] LibGit2  [`@stdlib/LibGit2`]
  [8f399da3] Libdl  [`@stdlib/Libdl`]
  [37e2e46d] LinearAlgebra  [`@stdlib/LinearAlgebra`]
  [56ddb016] Logging  [`@stdlib/Logging`]
  [d6f4376e] Markdown  [`@stdlib/Markdown`]
  [739be429] MbedTLS  [`@stdlib/MbedTLS`]
  [a63ad114] Mmap  [`@stdlib/Mmap`]
  [44cfe95a] Pkg  [`@stdlib/Pkg`]
  [de0858da] Printf  [`@stdlib/Printf`]
  [3fa0cd96] REPL  [`@stdlib/REPL`]
  [9a3f8284] Random  [`@stdlib/Random`]
  [ea8e919c] SHA  [`@stdlib/SHA`]
  [9e88b42a] Serialization  [`@stdlib/Serialization`]
  [1a1011a3] SharedArrays  [`@stdlib/SharedArrays`]
  [6462fe0b] Sockets  [`@stdlib/Sockets`]
  [2f01184e] SparseArrays  [`@stdlib/SparseArrays`]
  [10745b16] Statistics  [`@stdlib/Statistics`]
  [9d418dce] TOML  [`@stdlib/TOML`]
  [8dfed614] Test  [`@stdlib/Test`]
  [cf7118a7] UUIDs  [`@stdlib/UUIDs`]
  [4ec0a83e] Unicode  [`@stdlib/Unicode`]
Test Summary: | Pass  Total
ExcelReaders  |  245    245
   Testing ExcelReaders tests passed ```

Crash while precompiling, v0.7 mac

On my computer, ExcelReaders seems to crash on precompilation. This happened both on v0.10.2 and (as below) on master.

               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: https://docs.julialang.org
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.7.0 (2018-08-08 06:46 UTC)
 _/ |\__'_|_|_|\__'_|  |  Official http://julialang.org/ release
|__/                   |  x86_64-apple-darwin14.5.0

(v0.7) pkg> add ExcelReaders#master
   Cloning git-repo `https://github.com/queryverse/ExcelReaders.jl.git`
  Updating git-repo `https://github.com/queryverse/ExcelReaders.jl.git`
 Resolving package versions...
  Updating `~/.julia/environments/v0.7/Project.toml`
  [c04bee98] ↑ ExcelReaders v0.10.2 ⇒ v0.10.2+ #master (https://github.com/queryverse/ExcelReaders.jl.git)
  Updating `~/.julia/environments/v0.7/Manifest.toml`
  [c04bee98] ↑ ExcelReaders v0.10.2 ⇒ v0.10.2+ #master (https://github.com/queryverse/ExcelReaders.jl.git)

julia> using ExcelReaders
[ Info: Precompiling ExcelReaders [c04bee98-12a5-510c-87df-2a230cb6e075]

signal (11): Segmentation fault: 11
in expression starting at no file:0

Issue with "readxl"

I tried to read some excel files into julia and I kept getting the below error.

The code that I ran was:

file="path to the file"
using ExcelReaders
using DataFrames

data=readxl(DataFrame,file,"Sheet1")

ERROR: Invalid Excel range specified.
Stacktrace:
[1] convert_ref_to_sheet_row_col(::String) at C:\JuliaPro-0.6.2.1\pkgs-0.6.2.1\v0.6\ExcelReaders\src\ExcelReaders.jl:181
[2] #readxl#7(::Bool, ::Array{Symbol,1}, ::Function, ::Type{DataFrames.DataFrame}, ::ExcelReaders.ExcelFile, ::String) at C:\JuliaPro-0.6.2.1\pkgs-0.6.2.1\v0.6\ExcelReaders\src\ExcelReaders.jl:263
[3] (::ExcelReaders.#kw##readxl)(::Array{Any,1}, ::ExcelReaders.#readxl, ::Type{DataFrames.DataFrame}, ::ExcelReaders.ExcelFile, ::String) at .<missing>:0
[4] readxl(::Type{DataFrames.DataFrame}, ::String, ::String) at C:\JuliaPro-0.6.2.1\pkgs-0.6.2.1\v0.6\ExcelReaders\src\ExcelReaders.jl:257

Additionally, I ran a couple tests.

Pkg.test("ExcelReaders")
INFO: Testing ExcelReaders
INFO: ExcelReaders tests passed

julia> using PyCall

julia> @pyimport math
julia> math.sin(math.pi/4)-sin(pi/4)
0.0

Thanks.

Mention XLSX.jl in Alternatives section in readme?

XLSX.jl is a pure-julia implementation for parsing xlsx files. In my
testing it doesn't work for some sheets that are produced by strange and
bad software, but should work for most use cases.

Provide a way to read cell formulas

There is a module in xlrd for reading formulas (https://github.com/python-excel/xlrd/blob/master/xlrd/formula.py). I am not sure how complete it is though (the two main functions both say they are under construction). Ideally, the formula would come in parsed into a tree of function calls and references.

My interest is in providing on-the-fly translation from a sheet in an Excel document to a Mimi model component, so non-technical users have a way to make changes to certain components.

Make things work with julia 0.4

PyCall is throwing an error in combination with Excelreaders

Here is the error message, it pops up when I use readxlsheet:

PyCall.PyError("ccall(@pysym(:PyObject_Call), PyPtr, (PyPtr, PyPtr, PyPtr), o, arg, C_NULL)", PyCall.PyObject(Ptr{PyCall.PyObject_struct} @0x000000001eab2e60), PyCall.PyObject(Ptr{PyCall.PyObject_struct} @0x0000000023c6f2b0), PyCall.PyObject(Ptr{PyCall.PyObject_struct} @0x0000000023c5fb48))
pyerr_check at exception.jl:56 [inlined]
pyerr_check at exception.jl:61 [inlined]
macro expansion at exception.jl:81 [inlined]
#_pycall#66(::Array{Any,1}, ::Function, ::PyCall.PyObject, ::String, ::Vararg{String,N} where N) at PyCall.jl:596
_pycall(::PyCall.PyObject, ::String, ::Vararg{String,N} where N) at PyCall.jl:584
#pycall#70(::Array{Any,1}, ::Function, ::PyCall.PyObject, ::Type{PyCall.PyAny}, ::String, ::Vararg{String,N} where N) at PyCall.jl:618
pycall(::PyCall.PyObject, ::Type{PyCall.PyAny}, ::String, ::Vararg{String,N} where N) at PyCall.jl:618
#call#71(::Array{Any,1}, ::PyCall.PyObject, ::String, ::Vararg{String,N} where N) at PyCall.jl:621
(::PyCall.PyObject)(::String, ::Vararg{String,N} where N) at PyCall.jl:621
openxl at ExcelReaders.jl:75 [inlined]
#readxlsheet#10(::Bool, ::Array{Symbol,1}, ::Array{Any,1}, ::Function, ::Type{DataFrames.DataFrame}, ::String, ::String) at ExcelReaders.jl:275
readxlsheet(::Type{DataFrames.DataFrame}, ::String, ::String) at ExcelReaders.jl:275
include_string(::String, ::String) at loading.jl:515
include_string(::String, ::String, ::Int64) at eval.jl:30
include_string(::Module, ::String, ::String, ::Int64, ::Vararg{Int64,N} where N) at eval.jl:34
(::Atom.##49#52{String,Int64,String})() at eval.jl:50
withpath(::Atom.##49#52{String,Int64,String}, ::String) at utils.jl:30
withpath(::Function, ::String) at eval.jl:38
macro expansion at eval.jl:49 [inlined]
(::Atom.##48#51{Dict{String,Any}})() at task.jl:80

EDIT: Works fine after I substituted the full path for my file instead of just the filename. Is there a reason for why that stopped working?

Add inline docs for readxl and readxlsheet functions

Reading into a DataFrame crashes for specific column types

When there is a column that is of only one data type but has missing values, reading into a DataFrame crashes.

This should be fixed when JuliaStats/DataArrays.jl#107 is fixed.

Support for date cells

Problem with missing values

Hi,

I would like to read values from A1:B2 cells using readxl with following script:

using ExcelReaders

path = Base.Filesystem.joinpath(@__DIR__, "test.xlsx")
range = "test!A1:B2"

data = ExcelReaders.readxl(path, range)
dump(data)

Currently the script works fine if the values are there and I get PyError in case values are missing. I would like to get 2x2 array regardless if the values are there or not.

I tried to use readxlsheet like this:

data = ExcelReaders.readxlsheet(path, "test"; skipstartrows=0, skipstartcols=0, nrows=2, ncols=2)
dump(data)

but I get 0x0 array in case data is missing.

Thanks,
Piotr

Reading excel file: ArgumentError: ref of NULL PyObject

I have the same problem as #20 and when I run Pkg.test("ExcelReaders") I had the following output. Any idea how I can fix this?

INFO: Testing ExcelReaders
ERROR: LoadError: InitError: Failed to import required Python module xlrd.

For automated xlrd installation, try configuring PyCall to use the Conda.jl package's Python "Miniconda" distribution within J
ulia. Relaunch Julia and run:
ENV["PYTHON"]=""
Pkg.build("PyCall")
before trying again.

The pyimport exception was: PyError (ccall(@pysym(:PyImport_ImportModule), PyPtr, (Cstring,), name)

The Python package xlrd could not be found by pyimport. Usually this means
that you did not install xlrd in the Python version being used by PyCall.

PyCall is currently configured to use the Python version at:

C:\Users\dzsaf_000\AppData\Local\JuliaPro-0.6.2.2\Python\python.exe

and you should use whatever mechanism you usually use (apt-get, pip, conda,
etcetera) to install the Python package containing the xlrd module.

One alternative is to re-configure PyCall to use a different Python
version on your system: set ENV["PYTHON"] to the path/name of the python
executable you want to use, run Pkg.build("PyCall"), and re-launch Julia.

Another alternative is to configure PyCall to use a Julia-specific Python
distribution via the Conda.jl package (which installs a private Anaconda
Python distribution), which has the advantage that packages can be installed
and kept up-to-date via Julia. As explained in the PyCall documentation,
set ENV["PYTHON"]="", run Pkg.build("PyCall"), and re-launch Julia. Then,
To install the xlrd module, you can use pyimport_conda("xlrd", PKG),
where PKG is the Anaconda package the contains the module xlrd,
or alternatively you can use the Conda package directly (via
using Conda followed by Conda.add etcetera).

) <class 'ImportError'>
ImportError("No module named 'xlrd'",)

Stacktrace:
[1] pyimport_conda(::String, ::String, ::String) at C:\Users\dzsaf_000\AppData\Local\JuliaPro-0.6.2.2\pkgs-0.6.2.2\v0.6\PyCal
l\src\PyCall.jl:609
[2] init() at C:\Users\dzsaf_000\AppData\Local\JuliaPro-0.6.2.2\pkgs-0.6.2.2\v0.6\ExcelReaders\src\ExcelReaders.jl:16
[3] _include_from_serialized(::String) at .\loading.jl:157
[4] _require_from_serialized(::Int64, ::Symbol, ::String, ::Bool) at .\loading.jl:200
[5] _require_search_from_serialized(::Int64, ::Symbol, ::String, ::Bool) at .\loading.jl:236
[6] _require(::Symbol) at .\loading.jl:441
[7] require(::Symbol) at .\loading.jl:405
[8] include_from_node1(::String) at .\loading.jl:576
[9] include(::String) at .\sysimg.jl:14
[10] process_options(::Base.JLOptions) at .\client.jl:305
[11] _start() at .\client.jl:371
during initialization of module ExcelReaders
while loading C:\Users\dzsaf_000\AppData\Local\JuliaPro-0.6.2.2\pkgs-0.6.2.2\v0.6\ExcelReaders\test\runtests.jl, in expression
starting on line 1
===================================================[ ERROR: ExcelReaders ]====================================================

failed process: Process('C:\Users\dzsaf_000\AppData\Local\JuliaPro-0.6.2.2\Julia-0.6.2\bin\julia.exe' -Cx86-64 '-JC:\Users\dz saf_000\AppData\Local\JuliaPro-0.6.2.2\Julia-0.6.2\lib\julia\sys.dll' --compile=yes --depwarn=yes --check-bounds=yes --code-co verage=none --color=yes --compilecache=yes 'C:\Users\dzsaf_000\AppData\Local\JuliaPro-0.6.2.2\pkgs-0.6.2.2\v0.6\ExcelReaders\t est\runtests.jl', ProcessExited(1)) [1]

========================================

ExcelReaders: suggestion about adding a way to close an .xlsx file after opening it with “openxl”

Hello,

I am using Julia 0.6.4 and the ExcelReaders package to read several sheets from a very large .xlsx file.
Thus, instead of repeating foo_n = readxlsheet("data.xlsx", "sheetname_n") for n times, it seemed to me to be more convenient and faster to use:

main_file = openxl("data.xlsx")
foo_1 = readxlsheet(main_file, "sheetname_1")
foo_2 = readxlsheet(main_file, "sheetname_2")
...

Which, if I got it right, keeps the whole large file "data.xlsx" in memory, so to permit a faster access whenever e.g. a new sheet needs to be read.
However, how about adding a feature to close such a large file (once not needed anymore in the program)? It could be something like: closexl("data.xlsx").
I presume that by freeing up some memory, the latter could be used to run faster a very long script for instance.

Thank you in advance!

What to do about XLDateAmbiguous

See https://discourse.julialang.org/t/loading-data/18639/5

queryverse / excelreaders.jl Goto Github PK

excelreaders.jl's People

Contributors

Stargazers

Watchers

Forkers

excelreaders.jl's Issues

Recommend Projects

Recommend Topics

Recommend Org