
checkpoints.jl's People

Contributors

aisopous, bencottier, bsnelling, fchorney, github-actions[bot], mjram0s, mzgubic, nickrobinson251, nicoleepp, oxinabox, rofinn, timmylev


checkpoints.jl's Issues

Threadsafe checkpoints

We don't have a significant use case right now, but we may want to introduce a lock to ensure that multiple threads don't update checkpoint storage at the same time.
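A minimal sketch of the idea, assuming checkpoint storage can be modelled as a shared Dict. The names STORAGE, STORAGE_LOCK and store! are hypothetical and are not part of Checkpoints.jl; only ReentrantLock and lock(f, l) are standard Julia.

```julia
# Hypothetical stand-in for checkpoint storage shared between threads.
const STORAGE = Dict{String,Any}()
const STORAGE_LOCK = ReentrantLock()

# Serialise every update to the shared storage through one lock.
function store!(name::AbstractString, value)
    lock(STORAGE_LOCK) do
        STORAGE[name] = value
    end
    return nothing
end

# Many threads can now write without corrupting the Dict.
Threads.@threads for i in 1:100
    store!("checkpoint_$i", i)
end
```

A single global lock is the simplest choice; a per-handler lock would allow more concurrency but needs the lock to live inside each handler.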

Incorrect `prefixes` when indexing checkpoints that have no tags

If I save a checkpoint called "Forecasters.predicted" in example/path without tags, the filepath is example/path/Forecasters/predicted.jlso.

This IndexEntry constructor assumes that prefixes are those path segments which do not contain "=", i.e. are not tags. However, when there are no tags in the first place, all path segments are included in this operation, because first_tag_ind defaults to 1:

first_tag_ind = something(findfirst(contains("="), filepath.segments), 1)
segments = filepath.segments[first_tag_ind:end-1]
prefixes = filter(!contains("="), segments)

MWE:

julia> Checkpoints.register("Forecasters", ["predicted"])

julia> Checkpoints.config("Forecasters.predicted", "example/path")

julia> checkpoint("Forecasters.predicted", [1])
13361

julia> IndexEntry("example/path/Forecasters/predicted.jlso").prefixes
("example", "path", "Forecasters")
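One possible direction, sketched on plain string segments rather than the package's IndexEntry: drop the base directory segments before looking for tags, so that the default of 1 no longer pulls in "example" and "path". The helper prefixes_from_segments and its base_len keyword are hypothetical; how the real constructor should learn the base directory is left open.

```julia
# Hypothetical helper: `segments` are the path segments of a checkpoint file,
# `base_len` is how many leading segments belong to the base directory.
function prefixes_from_segments(segments::Vector{String}; base_len::Int=0)
    dirs = segments[base_len+1:end-1]            # drop base dir and file name
    first_tag = findfirst(s -> occursin("=", s), dirs)
    start = something(first_tag, 1)              # no tags: keep all remaining dirs
    return filter(s -> !occursin("=", s), dirs[start:end])
end

untagged = ["example", "path", "Forecasters", "predicted.jlso"]
tagged = ["example", "path", "date=2020", "Forecasters", "predicted.jlso"]

prefixes_from_segments(untagged; base_len=2)  # ["Forecasters"]
prefixes_from_segments(tagged; base_len=2)    # ["Forecasters"]
```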

TagBot trigger issue

This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.

If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours, please be patient!

Introduce `force` argument to JLSOHandler

We introduced a force=false kwarg to the DictHandler constructor in #50 to ensure checkpoints aren't accidentally overwritten. We may also want to introduce that in the JLSOHandler. This would be a breaking change, as our code currently overwrites checkpoint JLSOs if a job fails.
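A rough sketch of what such a guard could look like for a file-backed handler. The function save_checkpoint below is hypothetical and writes raw bytes rather than a real JLSO file; it only illustrates the force=false behaviour.

```julia
# Hypothetical file writer: refuse to overwrite an existing checkpoint
# unless the caller explicitly passes force=true.
function save_checkpoint(path::AbstractString, bytes::Vector{UInt8}; force::Bool=false)
    if isfile(path) && !force
        throw(ArgumentError("$path already exists; pass force=true to overwrite"))
    end
    dir = dirname(path)
    isempty(dir) || mkpath(dir)   # create parent directories if needed
    write(path, bytes)
    return path
end
```

Jobs that rely on overwriting after a failure would need to opt in with force=true, which is why the change would be breaking.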

Error indexing S3Path with JLSO file at top level

Not quite reproducible example using some files I have on S3:

julia> using Checkpoints, FilePathsBase

julia> dir = p"s3://eis-jobresultsbucket-15m8pleoysiwx/backrun/2021-10-28T22h56m45.360/";

julia> index_checkpoint_files(dir)
ERROR: ArgumentError: . cannot be parsed as AWSS3.S3Path{AWS.AWSConfig}
Stacktrace:
  [1] #parse#6
    @ ~/.julia/packages/FilePathsBase/YFK4h/src/path.jl:74 [inlined]
  [2] parse
    @ ~/.julia/packages/FilePathsBase/YFK4h/src/path.jl:73 [inlined]
  [3] relative(fp::AWSS3.S3Path{AWS.AWSConfig}, start::AWSS3.S3Path{AWS.AWSConfig})
    @ FilePathsBase ~/.julia/packages/FilePathsBase/YFK4h/src/path.jl:440
  [4] relpath
    @ ~/.julia/packages/FilePathsBase/YFK4h/src/aliases.jl:22 [inlined]
  [5] IndexEntry(filepath::AWSS3.S3Path{AWS.AWSConfig}, base_dir::AWSS3.S3Path{AWS.AWSConfig})
    @ Checkpoints ~/JuliaEnvs/Checkpoints.jl/src/indexing.jl:28
  [6] #27
    @ ~/JuliaEnvs/Checkpoints.jl/src/indexing.jl:159 [inlined]
  [7] iterate
    @ ./generator.jl:47 [inlined]
  [8] grow_to!(dest::Vector{IndexEntry}, itr::Base.Generator{Base.Iterators.Filter{ComposedFunction{Base.Fix2{typeof(==), String}, typeof(extension)}, Channel{AWSS3.S3Path{AWS.AWSConfig}}}, Checkpoints.var"#27#28"{AWSS3.S3Path{AWS.AWSConfig}}})
    @ Base ./array.jl:739
  [9] collect
    @ ./array.jl:676 [inlined]
 [10] map
    @ ./abstractarray.jl:2323 [inlined]
 [11] index_checkpoint_files(dir::AWSS3.S3Path{AWS.AWSConfig})
    @ Checkpoints ~/JuliaEnvs/Checkpoints.jl/src/indexing.jl:158
 [12] top-level scope
    @ REPL[5]:1

The problem arises from a JLSO file existing at the top level in dir, which means the dirname and the base_dir of this file path are the same. FilePathsBase then calls parse(S3Path, ".") under the hood, but AWSS3 doesn't accept "."; it only accepts paths with the "s3://" URI.

We should be able to fix the above specific problem in this package by just not looking for prefixes/tags if the dirname and base_dir are equal.
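The proposed guard can be sketched with plain strings and Base path functions, as an illustration only. The helper relative_segments is hypothetical, and the real fix would operate on FilePathsBase path types instead of strings.

```julia
# Hypothetical helper: directory segments of `filepath` relative to `base_dir`.
# When the file sits directly in `base_dir`, skip the relative-path computation
# entirely, so "." is never produced or parsed.
function relative_segments(filepath::AbstractString, base_dir::AbstractString)
    dir = String(rstrip(dirname(filepath), '/'))
    base = String(rstrip(base_dir, '/'))
    dir == base && return String[]          # no prefixes or tags to look for
    return splitpath(relpath(dir, base))
end
```

With this shape, a top-level file yields an empty list of segments, and nested files behave as before.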

add checkpoint_fullname(x::IndexEntry)

If we know a checkpoint's name expressed in the MODULE.SUBMODULE.NAME form,
then right now you need to do
split(fullname, ".")[1:end-1] == collect(prefixes(x)) && last(split(fullname, ".")) == checkpoint_name(x)
to see if it matches some x from the index.

That is pretty gross.

We should add checkpoint_fullname(x::IndexEntry) = join([prefixes(x)..., checkpoint_name(x)], ".")
or something like that,
so that it can be checked easily.
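As a self-contained illustration, here is the proposed accessor on a stand-in type. The struct Entry is hypothetical and only mimics the two pieces of IndexEntry that matter here; note that Julia's join takes the collection first and the delimiter second.

```julia
# Hypothetical stand-in for IndexEntry with just the relevant fields.
struct Entry
    prefixes::Tuple{Vararg{String}}
    checkpoint_name::String
end

# Join the prefixes and the name back into MODULE.SUBMODULE.NAME form.
checkpoint_fullname(x::Entry) = join((x.prefixes..., x.checkpoint_name), ".")

entry = Entry(("Forecasters",), "predicted")
checkpoint_fullname(entry) == "Forecasters.predicted"   # simple equality check
```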

Index checkpoint files incrementally

index_checkpoint_files walks a given path to find checkpoint files, and organises the segments of each checkpoint path into tags.

One might want to call this many times on the same top-level checkpoint directory, to analyse checkpoint data while the program is running and new checkpoints are added. For example, a checkpoint might be made at regular time intervals, with the timestamp used as a tag.

If there are many checkpoint files (e.g. hundreds), re-walking the whole tree on every call is wasteful. One could index a subdirectory of the top-level checkpoint directory, but then not all of the tags would be found, because tags are part of the path.

Is there a way to update the checkpoint index incrementally, based on diffs in the file tree? For example, when reindexing per timestep, it would search only the checkpoints for that timestep and add them to an existing index, while still knowing all of the tags.
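One way this could work, sketched with Base file functions only: walk just the new subdirectory, but compute tags from each file's path relative to the top-level directory, and merge the results into an existing index. CheckpointIndex, tags_relative_to and index_subdir! are hypothetical names, and the Dict-based index is a simplification of whatever index_checkpoint_files returns.

```julia
# Hypothetical index: checkpoint file path => tags parsed from its directories.
const CheckpointIndex = Dict{String,Vector{Pair{String,String}}}

# Parse key=value directory names between `top` and the file.
function tags_relative_to(top::AbstractString, file::AbstractString)
    tags = Pair{String,String}[]
    for s in splitpath(relpath(dirname(file), top))
        occursin("=", s) || continue
        k, v = split(s, "="; limit=2)
        push!(tags, String(k) => String(v))
    end
    return tags
end

# Walk only `sub`, but keep tags that come from directories above it.
function index_subdir!(index::CheckpointIndex, top::AbstractString, sub::AbstractString)
    for (root, _, files) in walkdir(sub)
        for f in files
            endswith(f, ".jlso") || continue
            path = joinpath(root, f)
            index[path] = tags_relative_to(top, path)
        end
    end
    return index
end
```

Calling index_subdir! once per new timestep directory keeps the cost proportional to the new files rather than the whole tree.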

Making checkpoint a macro

We might like to make checkpoint into a macro.

Reasons for macro:

  • The main reason: we can set things up so that if a checkpoint isn’t enabled, the functions called to compute its values are never run, which matters when they are expensive, e.g. @checkpoint("RegressionSummary", value=expensive_summary_function(foo)). (This is what the Base Logging macros do.)
  • We can get rid of the need to register checkpoints in __init__ by having the macro register them at parse time. (Not 100% sure this will work, since it mutates a global variable at parse time; I think it does. If it doesn’t, then we shouldn’t do this.)
  • We can also automatically record the names of all the things being saved, which the user can then query with a function like checkpoint_info that prints a list, as a kind of documentation.
  • We can store the filename and line number, so we can look up afterwards where a checkpoint came from. (If we are really clever we can store the exact git commit and then generate a link to that file and line; I had a proof of concept for this ages ago.)
  • We can do as Base's logging macros do and have writing just a be the same as :a=>a (though we also get this if we change to storing data in the kwarg position, #16).

On the other hand, macros are harder to reason about, so the gains might not be worth it.
I think this is low priority.
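The first point, lazy evaluation, can be sketched as follows. This is not the package's implementation: ENABLED and SAVED are hypothetical stand-ins for the registry and the storage backend, and the macro only handles key=value arguments.

```julia
const ENABLED = Set{String}()      # hypothetical set of enabled checkpoint names
const SAVED = Dict{String,Any}()   # hypothetical stand-in for real storage

macro checkpoint(name, exprs...)
    kvs = map(exprs) do ex
        # Accept `key=value` in both space-separated and parenthesised calls.
        (ex isa Expr && ex.head in (:(=), :kw)) ||
            throw(ArgumentError("expected key=value, got $ex"))
        :($(QuoteNode(ex.args[1])) => $(esc(ex.args[2])))
    end
    return quote
        local n = $(esc(name))
        # The value expressions sit inside this branch, so they are only
        # evaluated when the checkpoint is enabled.
        if n in ENABLED
            SAVED[n] = Dict{Symbol,Any}($(kvs...))
        end
        nothing
    end
end

calls = Ref(0)
expensive_summary() = (calls[] += 1; 42)

@checkpoint "RegressionSummary" value=expensive_summary()   # disabled: not called
push!(ENABLED, "RegressionSummary")
@checkpoint "RegressionSummary" value=expensive_summary()   # enabled: called once
```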

add `checkpoint_basepath(::IndexEntry)`

Consider a checkpoint index entry ind constructed from the path results/backrun/2021-10-08T15:59:04.829/foo=BAR/sim_now=2019-02-14T10:15:00-05:00/strategy=1/Forecasters/predicted.jlso

checkpoint_path(ind) would return that full path, results/backrun/2021-10-08T15:59:04.829/foo=BAR/sim_now=2019-02-14T10:15:00-05:00/strategy=1/Forecasters/predicted.jlso

I propose a new function, checkpoint_basepath(ind), that returns results/backrun/2021-10-08T15:59:04.829/,
i.e. whatever part of the path comes before the tags and is therefore common to all of them.
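A sketch of the computation on a plain string path, assuming "/" separators and that the base path is everything before the first key=value segment. The helper basepath_before_tags is hypothetical; what to return when a path has no tags is left undecided here, so the sketch returns nothing.

```julia
# Hypothetical helper: everything before the first tag segment, with a
# trailing "/" to match the form used in the proposal.
function basepath_before_tags(path::AbstractString)
    segments = split(path, "/")
    first_tag = findfirst(s -> occursin("=", s), segments)
    first_tag === nothing && return nothing   # no tags: base path is ambiguous
    base = join(segments[1:first_tag-1], "/")
    return isempty(base) ? "" : base * "/"
end

path = "results/backrun/2021-10-08T15:59:04.829/foo=BAR/sim_now=2019-02-14T10:15:00-05:00/strategy=1/Forecasters/predicted.jlso"
basepath_before_tags(path)
```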

Change checkpoint to be `checkpoint(name, tags...; data...)`

As of #15 (cc @mzgubic)
we are allowed duplicate tags, and because of this with_tag is used as with_tag(:tag1=>1, :tag2=>2) do.
Since we need a function that can actually take those duplicate tags, they can't be in the keyword position.

Conversely, the data keys must be unique, so having them in the varargs position is suboptimal.

Also as of #15 it will be very rare to pass tags directly to a checkpoint.
So calls will be checkpoint("FooBar"; foo=1, bar=[1,2,3]), which will create JLSO files containing both foo and bar,
in a location determined by the context tags.
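The shape of the proposed signature can be illustrated with a stand-in function. checkpoint_sketch is hypothetical and only returns what it received; it shows that varargs pairs may repeat a key while keyword data keys are unique by construction.

```julia
# Hypothetical stand-in for the proposed checkpoint(name, tags...; data...).
function checkpoint_sketch(name::AbstractString, tags::Pair...; data...)
    return (name=name, tags=collect(tags), data=Dict{Symbol,Any}(data))
end

# Duplicate tag keys are fine in the varargs position.
r = checkpoint_sketch("FooBar", :date => 1, :date => 2; foo=1, bar=[1, 2, 3])

# The common case after #15: no explicit tags, data as keywords.
s = checkpoint_sketch("FooBar"; foo=1, bar=[1, 2, 3])
```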
