Comments (3)
This is a hard one. If we have a lot of tuples of different types, then we don't want to specialize, but if we have a lot of tuples of the same type, then we do. I can think of a couple possible heuristics:
- Specialize on tuples in leaf-typed slots (or for arrays with leaf eltypes) but not in other cases.
- Specialize on types only after we have saved them n times. Otherwise use an unspecialized code path.
Either of these should solve this particular case. But actually writing the code in a way that allows either a specialized or unspecialized code path to be executed seems kinda hard.
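To make the second heuristic concrete, here is a minimal sketch of a count-based switch between the two code paths. It is not JLD2 code: the names `SPECIALIZE_AFTER`, `seen_counts`, `write_generic`, `write_specialized`, and `write_value` are all hypothetical, the threshold value is arbitrary, and the "writing" is just printing fields to an `IO`. It only illustrates the dispatch pattern, assuming `@nospecialize` is enough to keep the generic path from being recompiled per type.

```julia
# Hypothetical sketch; none of these names come from JLD2.
const SPECIALIZE_AFTER = 3                     # assumed threshold "n"
const seen_counts = Dict{DataType,Int}()       # how often each concrete type was written

# Generic path: @nospecialize asks the compiler not to specialize on typeof(x),
# so one compiled method instance is shared by all argument types.
function write_generic(io::IO, @nospecialize(x))
    for i in 1:nfields(x)
        print(io, getfield(x, i), ' ')
    end
    return :generic
end

# Specialized path: compiled separately for each concrete type T it is called with.
function write_specialized(io::IO, x::T) where {T}
    for i in 1:fieldcount(T)
        print(io, getfield(x, i), ' ')
    end
    return :specialized
end

# Entry point: count occurrences of each type and switch paths past the threshold.
function write_value(io::IO, @nospecialize(x))
    T = typeof(x)
    n = get(seen_counts, T, 0) + 1
    seen_counts[T] = n
    # The call to write_specialized is a dynamic dispatch, so its per-type
    # compilation cost is only paid once a type has been seen often enough.
    return n > SPECIALIZE_AFTER ? write_specialized(io, x) : write_generic(io, x)
end
```

With this sketch, the first three writes of a given tuple type take the generic path and later writes take the specialized one. A real implementation would also have to keep the two paths producing identical output, which is the hard part mentioned above.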
I'd also like to decrease the amount of codegen time for each new type, so that the coefficient is closer to that of serialize, but it's difficult to know what needs to be optimized because we don't have a good way of profiling codegen time.
Ultimately, the problem here extends well beyond JLD. For example:
julia> vals = [Val{i}() for i = 1:5];
julia> @time valst = [(vals[i],vals[j]) for i = 1:length(vals), j = 1:length(vals)];
0.061875 seconds (13.18 k allocations: 815.614 KB)
julia> @time show(IOBuffer(), valst);
0.766540 seconds (714.30 k allocations: 36.741 MB, 1.21% gc time)
julia> vals = [Val{i}() for i = 6:35];
julia> @time valst = [(vals[i],vals[j]) for i = 1:length(vals), j = 1:length(vals)];
2.797126 seconds (411.34 k allocations: 25.280 MB, 0.16% gc time)
julia> @time show(IOBuffer(), valst);
19.339135 seconds (10.96 M allocations: 602.963 MB, 0.64% gc time)
Ideally Julia itself would be smart enough to interpret code until it's hot enough to deserve compilation (modern JS engines do this), but that's a long ways off.
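One way to see why the second run above is so much slower: the number of distinct concrete tuple types grows quadratically with the number of `Val` instances. The helper below is a small illustrative sketch (`count_tuple_types` is a made-up name, not part of Julia or JLD2) that counts those types for an n-by-n grid built the same way as `valst`.

```julia
# Count the distinct concrete tuple types in an n-by-n grid of Val pairs.
function count_tuple_types(n::Integer)
    vals = [Val{i}() for i in 1:n]
    pairs = [(vals[i], vals[j]) for i in 1:n, j in 1:n]
    return length(unique(map(typeof, pairs)))
end

count_tuple_types(5)    # 25 distinct types, as in the first run
count_tuple_types(30)   # 900 distinct types, as in the second run
```

Every one of those types is a candidate for a fresh specialization of whatever generic code touches the elements, which is consistent with the timings shown, although the snippet itself does not measure compilation time.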
from jld2.jl.
CCing @carnaval and @vtjnash, not because I'm hoping for a quick fix, but because I think you'll find this one fun.
I hadn't thought of waiting to compile until you've actually run the code via the interpreter; seems like a smarter strategy than trying to forecast two costs (compilation and interpretation) and choosing in advance.
Julia Version 1.6.0 (2021-03-24), official https://julialang.org/ release
julia> include("jldtest.jl")
5.887887 seconds (10.20 M allocations: 565.706 MiB, 2.23% gc time, 16.08% compilation time)
0.517091 seconds (1.11 M allocations: 58.193 MiB, 2.11% gc time, 99.70% compilation time)
110.533142 seconds (216.16 M allocations: 12.303 GiB, 3.57% gc time)
julia> include("jldtest.jl")
0.000898 seconds (1.39 k allocations: 59.016 KiB)
0.001258 seconds (5.25 k allocations: 202.031 KiB)
0.013244 seconds (26.71 k allocations: 1.225 MiB)
Ouch, this has gotten significantly worse since then.