Comments (7)
Hi, I came back to the problem today and the data race no longer occurs. My guess is that after I last updated CSV from v0.10.10 to v0.10.11, a temporary file still existed on my local machine, so the bug kept occurring. #1073 definitely fixes this issue. Thanks for the hard work. I will close this issue.
from csv.jl.
Hi, and thank you for the bug report! Would you mind testing whether this still occurs after updating CSV.jl? Version 0.10.11 (tagged yesterday) includes #1073, which is intended to fix this kind of issue.
I tested: the data race occurs less frequently, but the problem is still there. Moreover, this plugin now sometimes causes Pluto to hang for about 5 minutes, I think because of the data race.
bandicam.2023-06-07.07-58-26-840.mp4
My thought: if I run the code a single time, that is, run it and wait until it finishes before continuing, there is no problem with the type. But if I run it many times in quick succession, like I do in the video, a data race happens across multiple cores (8 cores in my example). I don't know whether my guess is right, so please explain it to me.
Ah, that's unfortunate and unexpected. It seems I cannot reproduce the issue: I tried running a Pluto notebook with the same environment (JULIA_NUM_THREADS=8 JULIA_REVISE_WORKER_ONLY=1 ~/julia-1.9.0/bin/julia --startup-file=no -e "using Pluto; Pluto.run()") and I put the code of your initial message in it, one line per cell. Then I did as in your video, refreshing the df definition cell repeatedly, even just leaving Shift+Enter pressed down for a while, but I never saw the type of the first column change.
I also tried the following to automate things a bit:
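using CSV, DataFrames, HTTP
# `filename` and `headers` are the ones defined in the initial message.
body = HTTP.get(filename).body
for _ in 1:10000
    df2 = CSV.read(body, DataFrame, header=headers)
    # The first column is expected to always parse as Int64.
    if eltype(df2[!, 1]) != Int64
        error("Encountered: $(eltype(df2[!, 1]))")
    end
end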
but no error occurs.
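Since the wrong type only shows up intermittently, a variant of the loop above that counts every element type it sees, rather than stopping at the first mismatch, may give a clearer picture of how often it happens. This is only a sketch: `tally_eltypes` is a hypothetical helper, not part of CSV.jl, and the closure passed to it stands in for the `CSV.read` call above.

```julia
# Hypothetical helper (not part of CSV.jl): call `get_column` n times and
# count how often each element type is observed in the returned vector.
function tally_eltypes(get_column, n::Integer)
    counts = Dict{Type,Int}()
    for _ in 1:n
        T = eltype(get_column())
        counts[T] = get(counts, T, 0) + 1
    end
    return counts
end

# Usage in the notebook (assumes `body` and `headers` from the snippet above):
# tally_eltypes(() -> CSV.read(body, DataFrame, header=headers)[!, 1], 10_000)
```

If the parse is stable, the returned dictionary has a single key; more than one key means the inferred type changed between runs.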
Just to check whether it could be something else in the configuration, can you please check the output of Base.Threads.nthreads() in one cell of your Pluto notebook, as well as that of import Pkg; Pkg.status()? Mine yield 8 and the following, respectively:
Status `/tmp/jl_pNSR9l/Project.toml`
[336ed68f] CSV v0.10.11
[a93c6f00] DataFrames v1.5.0
[cd3eb016] HTTP v1.9.6
[44cfe95a] Pkg v1.9.0
[10745b16] Statistics v1.9.0
I can reproduce the error with your setup; maybe your OS is different from mine. I'm testing on Windows 11 with PowerShell 7.2.
Untitled.mp4
Thanks for checking: apparently you are still using CSV v0.10.10, but the bugfix I mentioned was only released in CSV v0.10.11, which explains why you are still seeing this bug.
Would you mind updating the package and letting us know whether the bug still occurs afterwards? To update, run Pkg.update("CSV") from a cell of your notebook (or simply Pkg.update() to update all packages in your environment). Somewhere in the output you should see a line stating
[336ed68f] ↑ CSV v0.10.10 ⇒ v0.10.11
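To confirm which version the notebook session actually loaded, a check along these lines may help. It is a minimal sketch, assuming Julia 1.9 or later (where `pkgversion` is available); `includes_fix` is a hypothetical helper name introduced here for illustration.

```julia
# Hypothetical helper: true when a CSV.jl version is v0.10.11 or newer,
# i.e. a release that includes #1073.
includes_fix(version::VersionNumber) = version >= v"0.10.11"

# In a notebook cell (assumes CSV is already loaded and Julia >= 1.9):
# includes_fix(pkgversion(CSV))
```

Checking from inside the notebook matters because the environment Pluto uses can differ from the one updated in a separate Julia session.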
I realized that I had only updated my local environment, not Pluto's. Sorry about that. The first time I checked, the data race still occurred, but the second and third times everything was OK. There is something weird here, or maybe a problem with multithreading. We need more people to validate this behavior. Thanks.