
Comments (10)

benmccann avatar benmccann commented on May 27, 2024

+1

I would like to be able to cache quotes retrieved from Yahoo (or other sources) instead of fetching them on every run of the algorithm. Along with hdf5 support, I would also like load_from_yahoo to take an existing DataFrame and add stocks to it. Also, I'm not really familiar with DataFrame yet, but I wonder whether storing every stock in the same DataFrame or hdf5 file is the best way to go. If the data sets grow very large (e.g. a thousand stocks with per-second quotes), will this become difficult to deal with? Does a DataFrame need to fit into memory?
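
To make the caching request concrete, here is a minimal sketch of a fetch-once, read-from-disk-afterwards wrapper. It is only an illustration: `load_cached`, `fake_fetch`, and the one-JSON-file-per-symbol layout are hypothetical and not part of zipline, and JSON stands in for hdf5 so the example runs with the standard library alone.

```python
import json
import os
import tempfile


def load_cached(symbol, fetch, cache_dir):
    """Return quotes for `symbol`, calling `fetch` only on a cache miss.

    `fetch` is any callable that takes a symbol and returns
    JSON-serializable quote data (hypothetical; not a zipline API).
    """
    path = os.path.join(cache_dir, symbol + ".json")
    if os.path.exists(path):
        # Cache hit: read the stored quotes instead of downloading again.
        with open(path) as f:
            return json.load(f)
    # Cache miss: download once, then store the result for later runs.
    quotes = fetch(symbol)
    os.makedirs(cache_dir, exist_ok=True)
    with open(path, "w") as f:
        json.dump(quotes, f)
    return quotes


# Stand-in for a network download; records how often it is called.
calls = []


def fake_fetch(symbol):
    calls.append(symbol)
    return [{"date": "2013-01-02", "close": 10.0}]


cache_dir = tempfile.mkdtemp()
first = load_cached("AAPL", fake_fetch, cache_dir)   # downloads and stores
second = load_cached("AAPL", fake_fetch, cache_dir)  # served from disk
```

The second call never reaches `fake_fetch`, which is the behavior being asked for. A real cache would also need some way to refresh stale data.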

from zipline.

ehebert avatar ehebert commented on May 27, 2024

@benmccann, I agree, caching the Yahoo data would be a great improvement.
(With regard to this ticket, I think hdf5 would be a good format for that cache.)

Would the existing DataFrame you would like to pass to load_from_yahoo contain OHLCV data, or other types of data?

A DataFrame does need to fit into memory.
There is a thread here, https://groups.google.com/forum/?fromgroups=&hl=en#!topic/zipline/fLojh3EfJp0, about using PyTables directly (which provides a generator from an hdf5 source).

benmccann avatar benmccann commented on May 27, 2024

My thought with passing a DataFrame to load_from_yahoo was this: if I load 10 securities into a DataFrame by calling load_from_yahoo and then want to add another 10 to my dataset, there's no real way to do that right now.
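
A rough sketch of what that could look like, assuming the DataFrame holds one price column per symbol indexed by date. `add_securities` and `fake_fetch` are hypothetical names used only for illustration, not zipline functions.

```python
import pandas as pd


def add_securities(existing, symbols, fetch):
    """Return a new DataFrame with a column for each symbol not yet present.

    `fetch` is a hypothetical callable returning a date-indexed price
    Series for one symbol; it is not part of zipline.
    """
    # Only download symbols that are not already in the DataFrame.
    missing = [s for s in symbols if s not in existing.columns]
    if not missing:
        return existing.copy()
    new = pd.DataFrame({s: fetch(s) for s in missing})
    # An outer join keeps every date that appears in either data set.
    return existing.join(new, how="outer")


dates = pd.to_datetime(["2013-01-02", "2013-01-03"])


def fake_fetch(symbol):
    # Stand-in for a download; returns fixed prices on the same dates.
    return pd.Series([1.0, 2.0], index=dates, name=symbol)


existing = pd.DataFrame({"AAPL": [10.0, 11.0]}, index=dates)
combined = add_securities(existing, ["AAPL", "MSFT"], fake_fetch)
```

Here `AAPL` is left alone and only `MSFT` is fetched and joined on the date index.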

MichaelWS avatar MichaelWS commented on May 27, 2024

Something like this works, with each node being a date in ISO format:

https://gist.github.com/MichaelWS/e5eb873e32b089a4487e

ehebert avatar ehebert commented on May 27, 2024

Michael:

Apologies for the delay in follow up.
The gist link appears to be dead.

- Eddie


MichaelWS avatar MichaelWS commented on May 27, 2024

Sorry about that. This should work

https://gist.github.com/MichaelWS/e5eb873e32b089a4487e

MichaelWS avatar MichaelWS commented on May 27, 2024

Here's a pull request to fix this. I had to throw something together for a friend, so I figured I would contribute this back.
#244

llllllllll avatar llllllllll commented on May 27, 2024

@ehebert has this been addressed?

ehebert avatar ehebert commented on May 27, 2024

It has not, but we could make a wrapper for BcolzDailyBarWriter which reformats an hdf5 file to bcolz (which is a quick ctable.fromhdf5(table_path).copy(rootdir=output_path)).

With the incoming changes on the lazy-mainline branch, the backtest data (not just pipeline) will be sourced from files created by the BcolzDailyBarWriter and BcolzMinuteBarWriter classes.

llllllllll avatar llllllllll commented on May 27, 2024

I think we have settled on bcolz as the internal format for zipline. With the new data bundle changes we also support the case of caching yahoo data instead of downloading on each run.
