queryverse / excelreaders.jl Goto Github PK
View Code? Open in Web Editor NEWExcelReaders is a package that provides functionality to read Excel files.
License: Other
ExcelReaders is a package that provides functionality to read Excel files.
License: Other
I found an error with the readxlsheet
function. It seems like doing something like readxlsheet("sheet.xlsx", 1, nrows=7)
makes Julia try to add 7 + :blanks
.
This is the error it throws
MethodError: no method matching +(::Int64, ::Symbol)
Closest candidates are:
+(::Any, ::Any, ::Any, ::Any...) at operators.jl:424
+(::PyCall.PyObject, ::Any) at C:\Users\mijail\.julia\v0.6\PyCall\src\pyoperators.jl:11
+(::T<:Union{Int128, Int16, Int32, Int64, Int8, UInt128, UInt16, UInt32, UInt64, UInt8}, ::T<:Union{Int128, Int16, Int32, Int64, Int8, UInt128, UInt16, UInt32, UInt64, UInt8}) where T<:Union{Int128, Int16, Int32, Int64, Int8, UInt128, UInt16, UInt32, UInt64, UInt8} at int.jl:32
...
Stacktrace:
[1] #convert_args_to_row_col#5(::Symbol, ::Int64, ::Int64, ::Int64, ::Function, ::PyCall.PyObject) at C:\Users\mijail\.julia\v0.6\ExcelReaders\src\ExcelReaders.jl:153
[2] (::ExcelReaders.#kw##convert_args_to_row_col)(::Array{Any,1}, ::ExcelReaders.#convert_args_to_row_col, ::PyCall.PyObject) at .\<missing>:0
[3] #readxlsheet#4(::Array{Any,1}, ::Function, ::ExcelReaders.ExcelFile, ::String) at C:\Users\mijail\.julia\v0.6\ExcelReaders\src\ExcelReaders.jl:96
[4] (::ExcelReaders.#kw##readxlsheet)(::Array{Any,1}, ::ExcelReaders.#readxlsheet, ::ExcelReaders.ExcelFile, ::String) at .\<missing>:0
[5] #readxlsheet#2(::Array{Any,1}, ::Function, ::ExcelReaders.ExcelFile, ::Int64) at C:\Users\mijail\.julia\v0.6\ExcelReaders\src\ExcelReaders.jl:86
[6] (::ExcelReaders.#kw##readxlsheet)(::Array{Any,1}, ::ExcelReaders.#readxlsheet, ::ExcelReaders.ExcelFile, ::Int64) at .\<missing>:0
[7] include_string(::String, ::String) at .\loading.jl:515
It works just fine if I do readxlsheet("sheet.xlsx", 1, skipstartrows=0, nrows=7)
.
Same thing happens with skipstartcols
and ncols
.
ExcelReaders.jl will drop DataFrames support, and folks will instead be encourage to use ExcelFiles.jl to read things into a DataFrame.
Does this mean that version 0.11 will be installed by julia 1.10?
Julia seems to think that ExcelReaders is holding DataFrame update.
But there is no upper bound in require file.
I think it would be reasonable to support this directly as an overload of readxl
/ readxlsheet
. Below is a example/workaround using xlrd directly.
using HTTP
using ExcelReaders
r = HTTP.get("url returning.xls")
wb = ExcelReaders.xlrd.open_workbook(; file_contents=r.body)
xf = ExcelReaders.ExcelFile(wb, "")
data = ExcelReaders.readxlsheet(xf, "Sheet1");
Empty cells in xls Julia showing as NA. Julia does not recognize this sign.
`julia> a=readxlsheet("dane/dane.xlsx","Arkusz1")
3×3 DataArrays.DataArray{Any,2}:
0.946871 0.753045 0.789162
0.867544 NA 0.940965
0.381519 0.456175 0.0888175
julia> sum(a)
NA
julia> findin(a,NA)
ERROR: UndefVarError: NA not defined
julia> findin(a,"NA")
0-element Array{Int64,1}
julia> findin(a,'NA')
ERROR: syntax: invalid character literal
julia> typeof(a[2,2])
DataArrays.NAtype
julia> (a[2,2])
NA`
IS posible to change it fo native for Julia NaN ? Paul
The package uses the Python xlrd library. If either Python or the xlrd package are not installed on your system, the package will use the conda.jl package to install all necessary dependencies automatically.
I had to set ENV["PYTHON"]=""
, rebuild PyCall, restart Julia and then import ExcelReaders again to get PyCall to install xlrd.
I don't know if this is intended behaviour, but it certainly adds caveats to the above statement from the Readme (i.e. "package will use the conda.jl package to install all necessary dependencies automatically if you already have a Julia conda environment installed. If you do not...")
Same issue as #20 but I wasnt as lucky.
Using Julia Pro 0.6 on Windows 7 64 bits. Also using Anaconda 4.4 Python 2.7. I installed xlrd with conda from console because ExcelReaders.jl couldn't do it for some reason.
openxl or readxl gives
ArgumentError: ref of NULL PyObject
in readxl at ExcelReaders\src\ExcelReaders.jl:195
in openxl at ExcelReaders\src\ExcelReaders.jl:75
in getindex at PyCall\src\PyCall.jl:239
and Pkg.test("ExcelReaders")
gives:
INFO: Testing ExcelReaders
ERROR: LoadError: InitError:
Failed to import required Python module xlrd.
For automated xlrd installation, try configuring PyCall to use the Conda.jl package's Python "Miniconda" distribution within Julia. Relaunch Julia and run:
ENV["PYTHON"]=""
Pkg.build("PyCall")
before trying again.
The pyimport exception was: PyError (ccall(@pysym(:PyImport_ImportModule), PyPtr, (Cstring,), name)) <class 'ImportError'>
ImportError("No module named 'xlrd'",)
Stacktrace:
[1] pyimport_conda(::String, ::String, ::String) at E:\JuliaPro-0.6.0.1\pkgs-0.6.0.1\v0.6\PyCall\src\PyCall.jl:552
[2] __init__() at E:\JuliaPro-0.6.0.1\pkgs-0.6.0.1\v0.6\ExcelReaders\src\ExcelReaders.jl:16
[3] _include_from_serialized(::String) at .\loading.jl:157
[4] _require_from_serialized(::Int64, ::Symbol, ::String, ::Bool) at .\loading.jl:200
[5] _require_search_from_serialized(::Int64, ::Symbol, ::String, ::Bool) at .\loading.jl:236
[6] _require(::Symbol) at .\loading.jl:434
[7] require(::Symbol) at .\loading.jl:398
[8] include_from_node1(::String) at .\loading.jl:569
[9] include(::String) at .\sysimg.jl:14
[10] process_options(::Base.JLOptions) at .\client.jl:305
[11] _start() at .\client.jl:371
during initialization of module ExcelReaders
while loading E:\JuliaPro-0.6.0.1\pkgs-0.6.0.1\v0.6\ExcelReaders\test\runtests.jl, in expression starting on line 1
============================[ ERROR: ExcelReaders ]=============================
failed process: Process(`'E:\JuliaPro-0.6.0.1\Julia-0.6.0\bin\julia.exe' -Cx86-64 '-JE:\JuliaPro-0.6.0.1\Julia-0.6.0\lib\julia\sys.dll' --compile=yes --depwarn=yes --check-bounds=yes --code-coverage=none --color=no --compilecache=yes 'E:\JuliaPro-0.6.0.1\pkgs-0.6.0.1\v0.6\ExcelReaders\test\runtests.jl'`, ProcessExited(1)) [1]
================================================================================
Base.Pkg.Entry.PkgTestError("ExcelReaders had test errors")
When loading to a DataFrame, if the first cell of the first column is empty (remaining of the column has content), it fails with
LoadError: type Void has no field captures
in convert_ref_to_sheet_row_col at C:\User\.julia\v0.4\ExcelReaders\src\ExcelReaders.jl:177
in readxl at C:\User\.julia\v0.4\ExcelReaders\src\ExcelReaders.jl:246
in include_string at loading.jl:282
in include_string at C:\User\.julia\v0.4\CodeTools\src\eval.jl:32
in anonymous at C:\User\.julia\v0.4\Atom\src\eval.jl:84
in withpath at C:\User\.julia\v0.4\Requires\src\require.jl:37
in withpath at C:\User\.julia\v0.4\Atom\src\eval.jl:53
[inlined code] from C:\User\.julia\v0.4\Atom\src\eval.jl:83
in anonymous at task.jl:58
Wouldn't it better to fill it with something like A
(as DataFrame needs a name for the column)?
To reproduce:
using ExcelReaders
using DataFrames
f = openxl("Filename.xlsx")
df = readxl(DataFrame, f, "Sheet1")
As of PyCall 1.5, you can now just do pyimport_conda("xlrd", "xlrd")
to auto-install xlrd via Conda (if PyCall is using Conda) or print an informative error message, rather than duplicating that logic here. This will also let you remove the explicit Conda dependency.
i've tried to read xlsx file bying using readxl, by returning the NULL error. the Excel Readers tests passed though, with pycall functioning well. Any thing else i can try to fix ?
Hi!
I was trying to read an specific range of an excel file (defined in Excel), like this:
dfData = readxl(DataFrame, file0, "Control!Investments", header=false)
But I think it can't be done right now. Is it possible?
If not, I think it is useful to incorporate this option.
Thanks!
This issue is used to trigger TagBot; feel free to unsubscribe.
If you haven't already, you should update your TagBot.yml
to include issue comment triggers.
Please see this post on Discourse for instructions and more details.
If you'd like for me to do this for you, comment TagBot fix
on this issue.
I'll open a PR within a few hours, please be patient!
In Excel, this refers to all the cells in the A column from row 3 till the bottom.
It would have been convenient for my use case to have this select from row 3 till the last row with any non-missing cells.
Hi, I'm using Julia 0.6.4, with version 0.9.0. This used to work fine in the earlier versions but I've recently updated the package to 0.9 and now readxlsheet reads the empty cells in the file as "#NA" and julia does not register this as a missing value. typeof() returns DataValue(Union{})().
I need these to register as missing values, are there any suggestions?
Thanks for a useful package.
I just got a large excel file where the data has been split into separate sheets so only the first sheet has column headings. Is there a way I can manually give the column headings/names when reading in a sheet? (This way I can read the first sheet and then use its column headings when reading the other sheets)
Thanks for any pointers.
As far as I can tell, it's possible to read an entire sheet using readxlsheet or read to a dataframe using readxl(DataFrame ...). Could we also have an option that lets us read an entire sheet into a dataframe (or is this possible and I'm just missing it?)
PyCall 1.90.0 is now released, which change o[:foo]
and o["foo"]
to o.foo
and o."foo"
, respectively, for python objects o
; see also JuliaPy/PyCall.jl#629.
The old getindex
methods still work but are deprecated, so you'll want to put out a new release that uses the new methods and REQUIREs PyCall 1.90.0 to avoid having zillions of deprecation messages.
Hello! I need Help!
My code - data = readxl(DataFrame, "A.xlsx", "Sheet!A1:X1834", header=true) - results in
function readxl does not accept keyword arguments
kwfunc(::Any) at boot.jl:237
include_string(::String, ::String) at loading.jl:522
include_string(::String, ::String, ::Int64) at eval.jl:30
include_string(::Module, ::String, ::String, ::Int64, ::Vararg{Int64,N} where N) at eval.jl:34
(::Atom.##102#107{String,Int64,String})() at eval.jl:82
withpath(::Atom.##102#107{String,Int64,String}, ::String) at utils.jl:30
withpath(::Function, ::String) at eval.jl:38
hideprompt(::Atom.##101#106{String,Int64,String}) at repl.jl:67
macro expansion at eval.jl:80 [inlined]
(::Atom.##100#105{Dict{String,Any}})() at task.jl:80
Before, this code it resulted in PyError (IOError, 2, not such file or directory). I think maybe be one error in PyCall, butt I am not right. When i closed Atom and reopen, the error changed for function readxl... keyword arguments. I am confused!
What is this?
Thank you so much!
Best
Mainly so that there is a julia 0.3 compatible version that has the Conda support.
This should happen on the release-0.3 branch.
http://pkg.julialang.org/logs/ExcelReaders_0.3.log
Is something not right with the new Conda stuff?
Win10 (64), updated.
Importand :
On my machine no dir C:\cygwin\
``
julia> Pkg.add("ExcelReaders")
Resolving package versions...
ERROR: MethodError: Cannot convert
an object of type Nothing to an object of type String
Closest candidates are:
convert(::Type{T<:AbstractString}, ::T<:AbstractString) where T<:AbstractString at strings/basic.jl:208
convert(::Type{T<:AbstractString}, ::AbstractString) where T<:AbstractString at strings/basic.jl:209
convert(::Type{T}, ::T) where T at essentials.jl:167
Stacktrace:
[1] setindex!(::Dict{Base.UUID,String}, ::Nothing, ::Base.UUID) at .\dict.jl:380
[2] deps_graph(::Pkg.Types.Context, ::Dict{Base.UUID,String}, ::Dict{Base.UUID,Pkg.Types.VersionSpec}, ::Dict{Base.UUID,Pkg.Types.Fixed}) at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\Pkg\src\Operations.jl:424
[3] resolve_versions!(::Pkg.Types.Context, ::Array{Pkg.Types.PackageSpec,1}) at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\Pkg\src\Operations.jl:316
[4] #add#100(::Bool, ::typeof(Pkg.Operations.add), ::Pkg.Types.Context, ::Array{Pkg.Types.PackageSpec,1}, ::Array{Base.UUID,1}) at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\Pkg\src\Operations.jl:962
[5] #add at .\none:0 [inlined]
[6] #add#25(::Bool, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(Pkg.API.add), ::Pkg.Types.Context, ::Array{Pkg.Types.PackageSpec,1}) at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\Pkg\src\API.jl:99
[7] add at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\Pkg\src\API.jl:69 [inlined]
[8] #add#24 at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\Pkg\src\API.jl:67 [inlined]
[9] add at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\Pkg\src\API.jl:67 [inlined]
[10] #add#21 at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\Pkg\src\API.jl:65 [inlined]
[11] add at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\Pkg\src\API.jl:65 [inlined]
[12] #add#20(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(Pkg.API.add), ::String) at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\Pkg\src\API.jl:64
[13] add(::String) at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\Pkg\src\API.jl:64
[14] top-level scope at REPL[9]:1
julia> Pkg.build("ExcelReaders")
ERROR: The following package names could not be resolved:
name=uuid
.julia> versioninfo()
Julia Version 1.2.0
Commit c6da87ff4b (2019-08-20 00:03 UTC)
Platform Info:
OS: Windows (x86_64-w64-mingw32)
CPU: Intel(R) Core(TM) i7-2630QM CPU @ 2.00GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-6.0.1 (ORCJIT, sandybridge)
julia>
``
Hello. So far I've enjoyed the use of ExcelReaders. I think there are a few tweaks that could make it simpler and easier.
Is there a reason to have a separated readxl and readxlsheet functions?. Could the readxl funtion detect when you are not supplying a range and assume you want the whole sheet?. Maybe just with the number of arguments?.
Can the module require DataFrames and give a DataFrame as output by default?. If a DataArray is more convenient fore a certain application can this be selected with a keyword argument?.
thanks
Braulio
I found that if top row has any empty cell, I cannot use readxlsheet(DataFrame......
This code below would allow constructing DataFrame with an empty column name, But I don't know if this is an intentional restriction or not. 14ca83c
if header
headervec = data[1, :]
NAcol = Bool.(isna(headervec))
headervec[NAcol] = DataFrames.gennames(countnz(NAcol))
colnames = convert(Array{Symbol},vec(headervec))
else
colnames = DataFrames.gennames(ncol)
end
Hi, I am new with Julia. I am trying with ExcelReaders but I get an error.
First, in order to use ExcelReaders I have to do:
ENV["PYTHON"]=""
Pkg.build("PyCall")
It seems to work, but when I try to read some excel file like data = readxl("File.xlsx", "Sheet1!A1:C4"), it gives the error ArgumentError: ref of NULL PyObject
I do not know what is it.
Thank you
the xlrd documentation now (as of Dec 2020) says
This library will no longer read anything other than .xls files.
I gave it a try, and ExcelReaders+xlrd 2.0.1 indeed fail on an xlsx file, so I had to revert to xlrd 1.2.0.
May I ask what the plans for ExcelReaders.jl are?
OK, it's one thing not to do so when reading in as DataArray.
But, the point of DataFrames is they already provide support for NA. But, it appears that when you scan though a column containing any NA even if all other values are of the same type you create a DataArray of that type for the column. So, that just breaks NA handling.
At the risk of increasing memory usage (quite real risk) why not convert the raw array you read in from the Excel sheet to DataFrames and let DataFrames figure out the types and do it's default NA handling? Maybe I don't understand so just a sort of brute force thought.
Given this limitation I'll just export the Excel data I was using (some state dept of education data) as csv and use the DataFrames package csv reader, which might handle the NAs. I'll report back if this works.
Thanks--I realize this is new work and relying on python xlrd, so no automatic handling for NAs there.
See here.
Probably needs to be fixed upstream JuliaPy/PyCall.jl#185.
Hello! I just installed the newest version of JuliaPro (1.1.1.1) and ExcelReaders stopped working.
using ProgressMeter, BenchmarkTools, DataFrames, GLM, Combinatorics
# Set Directory
cd("...\\julia_codes")
global Country_Sample = "vsmall"
Data1 = readxlsheet("gravity_$(Country_Sample).xlsx", "Sheet1") '''
The following error appears
```ArgumentError: ref of NULL PyObject
in top-level scope at base\none
in readxlsheet at ExcelReaders\PgE1r\src\ExcelReaders.jl:79
in #readxlsheet#3 at ExcelReaders\PgE1r\src\ExcelReaders.jl:79
in openxl at ExcelReaders\PgE1r\src\ExcelReaders.jl:64
in getproperty at PyCall\ttONZ\src\PyCall.jl:313
in getproperty at PyCall\ttONZ\src\PyCall.jl:303```
I have removed, added and built Conda and PyCall several times.
Also, I added xldr through Conda.
Finally, this is the output when I run test ExcelReaders:
``` Testing ExcelReaders
Status `C:\Users\jmcastro\AppData\Local\Temp\jl_8D0.tmp\Manifest.toml`
[00ebfdb7] CSTParser v0.6.2
[34da2185] Compat v2.1.0
[8f4d0f93] Conda v1.3.0
[864edb3b] DataStructures v0.17.0
[e2d170a0] DataValueInterfaces v1.0.0
[e7dc6d0d] DataValues v0.4.12
[c04bee98] ExcelReaders v0.11.0
[682c06a0] JSON v0.21.0
[1914dd2f] MacroTools v0.5.1
[bac558e1] OrderedCollections v1.1.0
[69de0a69] Parsers v0.3.6
[438e738f] PyCall v1.91.2
[0796e94c] Tokenize v0.5.5
[81def892] VersionParsing v1.1.3
[2a0f44e3] Base64 [`@stdlib/Base64`]
[b99e7846] BinaryProvider [`@stdlib/BinaryProvider`]
[864e158e] CredentialsHandler [`@stdlib/CredentialsHandler`]
[ade2ca70] Dates [`@stdlib/Dates`]
[8bb1440f] DelimitedFiles [`@stdlib/DelimitedFiles`]
[8ba89e20] Distributed [`@stdlib/Distributed`]
[cd3eb016] HTTP [`@stdlib/HTTP`]
[83e8ac13] IniFile [`@stdlib/IniFile`]
[b77e0a4c] InteractiveUtils [`@stdlib/InteractiveUtils`]
[76f85450] LibGit2 [`@stdlib/LibGit2`]
[8f399da3] Libdl [`@stdlib/Libdl`]
[37e2e46d] LinearAlgebra [`@stdlib/LinearAlgebra`]
[56ddb016] Logging [`@stdlib/Logging`]
[d6f4376e] Markdown [`@stdlib/Markdown`]
[739be429] MbedTLS [`@stdlib/MbedTLS`]
[a63ad114] Mmap [`@stdlib/Mmap`]
[44cfe95a] Pkg [`@stdlib/Pkg`]
[de0858da] Printf [`@stdlib/Printf`]
[3fa0cd96] REPL [`@stdlib/REPL`]
[9a3f8284] Random [`@stdlib/Random`]
[ea8e919c] SHA [`@stdlib/SHA`]
[9e88b42a] Serialization [`@stdlib/Serialization`]
[1a1011a3] SharedArrays [`@stdlib/SharedArrays`]
[6462fe0b] Sockets [`@stdlib/Sockets`]
[2f01184e] SparseArrays [`@stdlib/SparseArrays`]
[10745b16] Statistics [`@stdlib/Statistics`]
[9d418dce] TOML [`@stdlib/TOML`]
[8dfed614] Test [`@stdlib/Test`]
[cf7118a7] UUIDs [`@stdlib/UUIDs`]
[4ec0a83e] Unicode [`@stdlib/Unicode`]
Test Summary: | Pass Total
ExcelReaders | 245 245
Testing ExcelReaders tests passed ```
On my computer, ExcelReaders seems to crash on precompilation. This happened both on v0.10.2 and (as below) on master.
_
_ _ _(_)_ | A fresh approach to technical computing
(_) | (_) (_) | Documentation: https://docs.julialang.org
_ _ _| |_ __ _ | Type "?" for help, "]?" for Pkg help.
| | | | | | |/ _` | |
| | |_| | | | (_| | | Version 0.7.0 (2018-08-08 06:46 UTC)
_/ |\__'_|_|_|\__'_| | Official http://julialang.org/ release
|__/ | x86_64-apple-darwin14.5.0
(v0.7) pkg> add ExcelReaders#master
Cloning git-repo `https://github.com/queryverse/ExcelReaders.jl.git`
Updating git-repo `https://github.com/queryverse/ExcelReaders.jl.git`
Resolving package versions...
Updating `~/.julia/environments/v0.7/Project.toml`
[c04bee98] ↑ ExcelReaders v0.10.2 ⇒ v0.10.2+ #master (https://github.com/queryverse/ExcelReaders.jl.git)
Updating `~/.julia/environments/v0.7/Manifest.toml`
[c04bee98] ↑ ExcelReaders v0.10.2 ⇒ v0.10.2+ #master (https://github.com/queryverse/ExcelReaders.jl.git)
julia> using ExcelReaders
[ Info: Precompiling ExcelReaders [c04bee98-12a5-510c-87df-2a230cb6e075]
signal (11): Segmentation fault: 11
in expression starting at no file:0
I tried to read some excel files into julia and I kept getting the below error.
The code that I ran was:
file="path to the file"
using ExcelReaders
using DataFrames
data=readxl(DataFrame,file,"Sheet1")
ERROR: Invalid Excel range specified.
Stacktrace:
[1] convert_ref_to_sheet_row_col(::String) at C:\JuliaPro-0.6.2.1\pkgs-0.6.2.1\v0.6\ExcelReaders\src\ExcelReaders.jl:181
[2] #readxl#7(::Bool, ::Array{Symbol,1}, ::Function, ::Type{DataFrames.DataFrame}, ::ExcelReaders.ExcelFile, ::String) at C:\JuliaPro-0.6.2.1\pkgs-0.6.2.1\v0.6\ExcelReaders\src\ExcelReaders.jl:263
[3] (::ExcelReaders.#kw##readxl)(::Array{Any,1}, ::ExcelReaders.#readxl, ::Type{DataFrames.DataFrame}, ::ExcelReaders.ExcelFile, ::String) at .<missing>:0
[4] readxl(::Type{DataFrames.DataFrame}, ::String, ::String) at C:\JuliaPro-0.6.2.1\pkgs-0.6.2.1\v0.6\ExcelReaders\src\ExcelReaders.jl:257
Additionally, I ran a couple tests.
Pkg.test("ExcelReaders")
INFO: Testing ExcelReaders
INFO: ExcelReaders tests passed
julia> using PyCall
julia> @pyimport math
julia> math.sin(math.pi/4)-sin(pi/4)
0.0
Thanks.
XLSX.jl is a pure-julia implementation for parsing xlsx files. In my
testing it doesn't work for some sheets that are produced by strange and
bad software, but should work for most use cases.
There is a module in xlrd for reading formulas (https://github.com/python-excel/xlrd/blob/master/xlrd/formula.py). I am not sure how complete it is though (the two main functions both say they are under construction). Ideally, the formula would come in parsed into a tree of function calls and references.
My interest is in providing on-the-fly translation from a sheet in an Excel document to a Mimi model component, so non-technical users have a way to make changes to certain components.
Here is the error message, it pops up when I use readxlsheet:
PyCall.PyError("ccall(@pysym(:PyObject_Call), PyPtr, (PyPtr, PyPtr, PyPtr), o, arg, C_NULL)", PyCall.PyObject(Ptr{PyCall.PyObject_struct} @0x000000001eab2e60), PyCall.PyObject(Ptr{PyCall.PyObject_struct} @0x0000000023c6f2b0), PyCall.PyObject(Ptr{PyCall.PyObject_struct} @0x0000000023c5fb48))
pyerr_check at exception.jl:56 [inlined]
pyerr_check at exception.jl:61 [inlined]
macro expansion at exception.jl:81 [inlined]
#_pycall#66(::Array{Any,1}, ::Function, ::PyCall.PyObject, ::String, ::Vararg{String,N} where N) at PyCall.jl:596
_pycall(::PyCall.PyObject, ::String, ::Vararg{String,N} where N) at PyCall.jl:584
#pycall#70(::Array{Any,1}, ::Function, ::PyCall.PyObject, ::Type{PyCall.PyAny}, ::String, ::Vararg{String,N} where N) at PyCall.jl:618
pycall(::PyCall.PyObject, ::Type{PyCall.PyAny}, ::String, ::Vararg{String,N} where N) at PyCall.jl:618
#call#71(::Array{Any,1}, ::PyCall.PyObject, ::String, ::Vararg{String,N} where N) at PyCall.jl:621
(::PyCall.PyObject)(::String, ::Vararg{String,N} where N) at PyCall.jl:621
openxl at ExcelReaders.jl:75 [inlined]
#readxlsheet#10(::Bool, ::Array{Symbol,1}, ::Array{Any,1}, ::Function, ::Type{DataFrames.DataFrame}, ::String, ::String) at ExcelReaders.jl:275
readxlsheet(::Type{DataFrames.DataFrame}, ::String, ::String) at ExcelReaders.jl:275
include_string(::String, ::String) at loading.jl:515
include_string(::String, ::String, ::Int64) at eval.jl:30
include_string(::Module, ::String, ::String, ::Int64, ::Vararg{Int64,N} where N) at eval.jl:34
(::Atom.##49#52{String,Int64,String})() at eval.jl:50
withpath(::Atom.##49#52{String,Int64,String}, ::String) at utils.jl:30
withpath(::Function, ::String) at eval.jl:38
macro expansion at eval.jl:49 [inlined]
(::Atom.##48#51{Dict{String,Any}})() at task.jl:80
EDIT: Works fine after I substituted the full path for my file instead of just the filename. Is there a reason for why that stopped working?
When there is a column that is of only one data type but has missing values, reading into a DataFrame crashes.
This should be fixed when JuliaStats/DataArrays.jl#107 is fixed.
Hi,
I would like to read values from A1:B2 cells using readxl
with following script:
using ExcelReaders
path = Base.Filesystem.joinpath(@__DIR__, "test.xlsx")
range = "test!A1:B2"
data = ExcelReaders.readxl(path, range)
dump(data)
Currently the script works fine if the values are there and I get PyError
in case values are missing. I would like to get 2x2 array regardless if the values are there or not.
I tried to use readxlsheet
like this:
data = ExcelReaders.readxlsheet(path, "test"; skipstartrows=0, skipstartcols=0, nrows=2, ncols=2)
dump(data)
but I get 0x0 array in case data is missing.
Thanks,
Piotr
I have the same problem as #20 and when I run Pkg.test("ExcelReaders") I had the following output. Any idea how I can fix this?
INFO: Testing ExcelReaders
ERROR: LoadError: InitError: Failed to import required Python module xlrd.
For automated xlrd installation, try configuring PyCall to use the Conda.jl package's Python "Miniconda" distribution within J
ulia. Relaunch Julia and run:
ENV["PYTHON"]=""
Pkg.build("PyCall")
before trying again.
The pyimport exception was: PyError (ccall(@pysym(:PyImport_ImportModule), PyPtr, (Cstring,), name)
The Python package xlrd could not be found by pyimport. Usually this means
that you did not install xlrd in the Python version being used by PyCall.
PyCall is currently configured to use the Python version at:
C:\Users\dzsaf_000\AppData\Local\JuliaPro-0.6.2.2\Python\python.exe
and you should use whatever mechanism you usually use (apt-get, pip, conda,
etcetera) to install the Python package containing the xlrd module.
One alternative is to re-configure PyCall to use a different Python
version on your system: set ENV["PYTHON"] to the path/name of the python
executable you want to use, run Pkg.build("PyCall"), and re-launch Julia.
Another alternative is to configure PyCall to use a Julia-specific Python
distribution via the Conda.jl package (which installs a private Anaconda
Python distribution), which has the advantage that packages can be installed
and kept up-to-date via Julia. As explained in the PyCall documentation,
set ENV["PYTHON"]="", run Pkg.build("PyCall"), and re-launch Julia. Then,
To install the xlrd module, you can use pyimport_conda("xlrd", PKG)
,
where PKG is the Anaconda package the contains the module xlrd,
or alternatively you can use the Conda package directly (via
using Conda
followed by Conda.add
etcetera).
) <class 'ImportError'>
ImportError("No module named 'xlrd'",)
Stacktrace:
[1] pyimport_conda(::String, ::String, ::String) at C:\Users\dzsaf_000\AppData\Local\JuliaPro-0.6.2.2\pkgs-0.6.2.2\v0.6\PyCal
l\src\PyCall.jl:609
[2] init() at C:\Users\dzsaf_000\AppData\Local\JuliaPro-0.6.2.2\pkgs-0.6.2.2\v0.6\ExcelReaders\src\ExcelReaders.jl:16
[3] _include_from_serialized(::String) at .\loading.jl:157
[4] _require_from_serialized(::Int64, ::Symbol, ::String, ::Bool) at .\loading.jl:200
[5] _require_search_from_serialized(::Int64, ::Symbol, ::String, ::Bool) at .\loading.jl:236
[6] _require(::Symbol) at .\loading.jl:441
[7] require(::Symbol) at .\loading.jl:405
[8] include_from_node1(::String) at .\loading.jl:576
[9] include(::String) at .\sysimg.jl:14
[10] process_options(::Base.JLOptions) at .\client.jl:305
[11] _start() at .\client.jl:371
during initialization of module ExcelReaders
while loading C:\Users\dzsaf_000\AppData\Local\JuliaPro-0.6.2.2\pkgs-0.6.2.2\v0.6\ExcelReaders\test\runtests.jl, in expression
starting on line 1
===================================================[ ERROR: ExcelReaders ]====================================================
failed process: Process('C:\Users\dzsaf_000\AppData\Local\JuliaPro-0.6.2.2\Julia-0.6.2\bin\julia.exe' -Cx86-64 '-JC:\Users\dz saf_000\AppData\Local\JuliaPro-0.6.2.2\Julia-0.6.2\lib\julia\sys.dll' --compile=yes --depwarn=yes --check-bounds=yes --code-co verage=none --color=yes --compilecache=yes 'C:\Users\dzsaf_000\AppData\Local\JuliaPro-0.6.2.2\pkgs-0.6.2.2\v0.6\ExcelReaders\t est\runtests.jl'
, ProcessExited(1)) [1]
========================================
Hello,
I am using Julia 0.6.4 and the ExcelReaders
package to read several sheets from a very large .xlsx file.
Thus, instead of repeating foo_n = readxlsheet("data.xlsx", "sheetname_n")
for n times, it seemed to me to be more convenient and faster to use:
main_file = openxl("data.xlsx")
foo_1 = readxlsheet(main_file, "sheetname_1")
foo_2 = readxlsheet(main_file, "sheetname_2")
...
Which, if I got it right, keeps the whole large file "data.xlsx"
in memory, so to permit a faster access whenever e.g. a new sheet needs to be read.
However, how about adding a feature to close such a large file (once not needed anymore in the program)? It could be something like: closexl("data.xlsx")
.
I presume that by freeing up some memory, the latter could be used to run faster a very long script for instance.
Thank you in advance!
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.