
btc-etf-tracker's People

Contributors

buildwithdata, h-blues

btc-etf-tracker's Issues

--date

python runner/db/inflows_btc.py -d "2024-04-12"
python runner/db/inflows_btc_bfill.py -d "2024-04-12"

No warning message is raised when no data are found for the given date.
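A minimal sketch of the missing warning, assuming the runners read from a SQLite table keyed by ref_date. The function name `fetch_rows_for_date` and the demo table `holdings` are hypothetical and are not taken from the repo.

```python
import sqlite3
import warnings


def fetch_rows_for_date(conn, table, ref_date):
    """Return all rows for ref_date and warn when nothing is found.

    Hypothetical helper: the table name is interpolated into the query,
    so it must come from trusted code and never from user input.
    """
    rows = conn.execute(
        f"SELECT * FROM {table} WHERE ref_date = ?", (ref_date,)
    ).fetchall()
    if not rows:
        warnings.warn(f"No data found in {table} for ref_date={ref_date}")
    return rows


# Tiny in-memory database standing in for the real one (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE holdings (ref_date TEXT, total REAL)")
conn.execute("INSERT INTO holdings VALUES ('2024-04-11', 1.0)")
```

Calling `fetch_rows_for_date(conn, "holdings", "2024-04-12")` returns an empty list and emits a `UserWarning`, which a runner could also escalate to an error.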

week datatype

week        TEXT    NOT NULL,

change to

week        INT    NOT NULL,

This is needed because a TEXT column sorts lexicographically, so ordering by week fails like this:

week
1
10
11
...
2
3
4
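The sorting problem can be reproduced with a small in-memory database. This is only a sketch: the table names `as_text` and `as_int` are made up for the demonstration, and the `CAST` query is shown as a possible stopgap that leaves the schema untouched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE as_text (week TEXT NOT NULL)")
conn.execute("CREATE TABLE as_int (week INT NOT NULL)")

weeks = [(w,) for w in (1, 2, 3, 10, 11)]
conn.executemany("INSERT INTO as_text VALUES (?)", weeks)
conn.executemany("INSERT INTO as_int VALUES (?)", weeks)

# TEXT affinity stores the weeks as strings, so ORDER BY compares characters.
text_order = [r[0] for r in conn.execute("SELECT week FROM as_text ORDER BY week")]

# INT affinity keeps them numeric, so ORDER BY compares values.
int_order = [r[0] for r in conn.execute("SELECT week FROM as_int ORDER BY week")]

# Stopgap without a schema change: cast at query time.
cast_order = [
    r[0]
    for r in conn.execute("SELECT week FROM as_text ORDER BY CAST(week AS INTEGER)")
]
```

Here `text_order` comes back as `['1', '10', '11', '2', '3']`, while `int_order` is `[1, 2, 3, 10, 11]`.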

HK Visualization

Add visualization for Hong Kong ETFs

TODO

  • add functions to generate graphs to src/visualization/hk.py
  • code src/runner/visualization/hk.py

OTHERS

  • Can some functions in src/visualization/us.py be made general? For example, current_holdings.

Add More Browsers

The scrapers here run with Chrome only. It would be better to support at least one other browser to give users more choice:

  • firefox
  • safari
  • microsoft edge

--ticker

Add a --ticker option to runner/db/update_raw.py to load data for the target ETF only:

python ./runner/db/update_raw.py --ticker BRRR
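One way the option could be wired up with argparse is sketched below. The `KNOWN_TICKERS` tuple is a hypothetical subset built from tickers mentioned in these issues, and `select_tickers` is an illustrative helper, not existing repo code.

```python
import argparse

# Hypothetical subset of supported tickers, taken from the issues in this list.
KNOWN_TICKERS = ("ARKB", "BRRR", "BTCO", "BTCW", "DEFI", "FBTC", "GBTC", "IBIT")


def build_parser():
    parser = argparse.ArgumentParser(description="Load raw data into the database.")
    parser.add_argument(
        "--ticker",
        type=str.upper,         # accept "brrr" as well as "BRRR"
        choices=KNOWN_TICKERS,  # reject tickers the loader does not know
        default=None,
        help="load data for this ETF only (default: all ETFs)",
    )
    return parser


def select_tickers(args):
    """Return the tickers to process: the requested one, or all of them."""
    return [args.ticker] if args.ticker else list(KNOWN_TICKERS)
```

With no `--ticker` argument the runner keeps its current behaviour and processes every ETF.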

selenium - headless

When scraping FBTC and ARKB, we cannot pass this option:

options.add_argument('--headless')

If we do, scraping fails.

If we want to run this on a remote server and automate it, this is a blocker. It would be possible only if this option always works.

BRRR - DOM

It looks like BRRR is failing to extract data from 2024-04-17 onwards because the DOM has changed.

FBTC - holdings

Scenario: data from 2024-03-28 are missing

Problem: likely the xls file structure has changed

ref_date

Scenario:

Suppose that on day X there is no trading because TradFi is shut down (e.g. Christmas). Data are scraped anyway and extracted with ref_date = X - 1.

On day X + 1 trading resumes, so on day X + 2 the data are updated, and when they are scraped ref_date = X + 1.

Problem:

Any consumption table will be missing a row with ref_date = X

Solution:

Add a row for ref_date = X with data from X - 1
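The proposed fix can be sketched with the standard library only. The function name `fill_missing_weekdays` is hypothetical, and skipping weekends is an assumption of this sketch (the issue only mentions market holidays).

```python
from datetime import date, timedelta


def fill_missing_weekdays(rows):
    """Fill gaps in a {ref_date: value} mapping with the previous day's value.

    Only Monday to Friday are filled; weekends are skipped (assumption).
    """
    if not rows:
        return {}
    days = sorted(rows)
    filled = {}
    current = days[0]
    last_value = rows[current]
    while current <= days[-1]:
        if current in rows:
            last_value = rows[current]
            filled[current] = last_value
        elif current.weekday() < 5:
            # Missing weekday, e.g. a market holiday: reuse data from X - 1.
            filled[current] = last_value
        current += timedelta(days=1)
    return filled


# Christmas 2024 falls on a Wednesday, so no row exists for that ref_date.
scraped = {date(2024, 12, 24): 100.0, date(2024, 12, 26): 105.0}
complete = fill_missing_weekdays(scraped)
```

After the call, `complete` has a row for 2024-12-25 carrying the value from 2024-12-24.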

Add new etf BTC

Grayscale has released its mini ETF with ticker BTC; we have to add it to the repo.

The scraper has already been added (see branch feature/btc). Next is to write the code to extract data and create tables.

BTCW - Scraper

add scraper to scrape data from here

TODO

  • add module src/product/etp/btcw.py
  • define scraper class BTCW and implement scrape method
  • add it to scraper runner src/runner/scarping/scrape.py

NEXT STEP

  • implement extract method

ETH ETF

Let's scrape ETH ETFs

all scrapers can be found on this branch

What is left to do now is to code the extract and update_db methods.

Let's start with the first one. When it is done, please push the code to a new branch called feature/etf_ticker so we can all test it.

If testing is successful, we will merge it and then code the second method.

sqlite3 - significant digits

For some reason, sqlite sometimes fails to round to 2 decimal digits. For example:

    select * from inflows_btc_bxfill
    
    +------------+-------------------+
    |  ref_date  |       TOTAL       |
    +------------+-------------------+
    | 2024-01-11 | 13796.2           |
    | 2024-01-12 | 4023.2            |
    | 2024-01-15 |                   |
    | 2024-01-16 | -1947.1           |
    | 2024-01-17 | 9256.8            |
    | 2024-01-18 | -4514.5           |
    | 2024-01-19 | -545.900000000002 |
    | 2024-01-22 | -2350.3           |

But this seems somewhat random. For example, inflows_btc_bfill is created with data from inflows_btc, yet:

  select total from inflows_btc_bfill where ref_date = '2024-03-22'

  +---------+
  |  TOTAL  |
  +---------+
  | -310.93 |
  +---------+

  select total from inflows_btc where ref_date = '2024-03-22'

  +-------------------+
  |       TOTAL       |
  +-------------------+
  | -310.929999999999 |
  +-------------------+

What is weirder is that rounding is done in both cases:

out["TOTAL"] = out.iloc[:, :-1].sum(axis=1).round(2)

out = (extracted.iloc[:, 1:] - extracted.iloc[:, 1:].shift(1)).round(2)
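The sketch below does not diagnose the bug. It only illustrates one relevant fact: SQLite stores REAL values as 8-byte binary floats, and most two-decimal values have no exact binary representation, so any value that reaches the table unrounded shows its floating-point tail. The table name `demo` is made up, and rounding at query time is shown as one possible way to present two decimals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (ref_date TEXT, total REAL)")

# 0.1 + 0.2 is not exactly 0.3 in binary floating point.
conn.execute("INSERT INTO demo VALUES ('2024-01-19', ?)", (0.1 + 0.2,))

# Read the stored value as-is and rounded by SQLite at query time.
raw, rounded = conn.execute("SELECT total, ROUND(total, 2) FROM demo").fetchone()
```

Here `raw` still carries the floating-point tail, while `rounded` is 0.3 to within floating-point precision. If exact cents matter, storing integers (value * 100) would be another option to consider.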

setup - db

The database path is not handled properly when it is specified in the config file.

--force

Add an option to force the upload of data. This would be especially handy for non-coders, so they do not have to connect to the db and run a query themselves:

python runner/db/update_raw.py --from-date "2024-04-13" --force

Implement this in all runners.

Getting started videos

For developers:

  • voice over
  • zip and upload historical data
  • update links in README

If no coding skills:

  • video
  • voice over
  • update links in README

GBTC - weekly outflow

Create a table or graph that aggregates data on a weekly and monthly basis to show the overall trend.
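A minimal sketch of the aggregation, assuming daily flows are available as a mapping from date to value. The function name `aggregate_flows` and the sample numbers are illustrative only, not real GBTC data.

```python
from collections import defaultdict
from datetime import date


def aggregate_flows(daily):
    """Sum daily flows by ISO (year, week) and by calendar (year, month)."""
    weekly = defaultdict(float)
    monthly = defaultdict(float)
    for day, flow in daily.items():
        iso_year, iso_week = day.isocalendar()[:2]
        weekly[(iso_year, iso_week)] += flow
        monthly[(day.year, day.month)] += flow
    return dict(weekly), dict(monthly)


# Made-up sample values for demonstration.
sample = {
    date(2024, 1, 29): -10.0,
    date(2024, 1, 31): -5.0,
    date(2024, 2, 1): -2.0,
}
weekly, monthly = aggregate_flows(sample)
```

Keying weeks by (year, week) as integers also avoids the text-sorting problem described in the week datatype issue.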

DEFI - Scraper

hashdex ETF

TODO

  • add module src/product/etp/defi.py
  • define scraper class DEFI and implement scrape method
  • add it to scraper runner src/runner/scarping/scrape.py

NEXT STEP

  • implement extract method

setup

When the user enters a data path more than once, like here:

Path:
    root: /Users/bwd/BTC-ETF-Tracker
    data: /Users/bwd/BTC-ETF-Tracker/data
    db:   /Users/bwd/BTC-ETF-Tracker/data_refactor/db

then DATA_PATH contains multiple paths. Instead, it should keep only the one specified at data.

chinaAMC

  • BTC
    • ticker
    • html
    • xlsx
    • 30/04
    • 01/05
    • xlsx -> Total_NAV / Price
  • ETH
    • ticker
    • html
    • xlsx
    • 30/04
    • 01/05

UTC

Scraped data are saved with a timestamp in their filename. The timestamp must be in UTC.
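A sketch of a UTC-stamped filename follows. The naming pattern `TICKER_YYYYMMDDTHHMMSSZ.ext` and the helper name `scraped_filename` are assumptions, since the issue does not state the current format.

```python
from datetime import datetime, timedelta, timezone


def scraped_filename(ticker, extension, now=None):
    """Build a filename whose timestamp is always expressed in UTC.

    `now` should be timezone-aware; a naive datetime would be interpreted
    as local time by astimezone().
    """
    now = now or datetime.now(timezone.utc)
    stamp = now.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"{ticker}_{stamp}.{extension}"


# 01:30 on 1 May in a UTC+2 zone is still 30 April in UTC.
local_time = datetime(2024, 5, 1, 1, 30, 0, tzinfo=timezone(timedelta(hours=2)))
example = scraped_filename("FBTC", "html", now=local_time)
```

The example shows why this matters: a scrape shortly after local midnight would otherwise be filed under the wrong day.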

BTCO - holdings

Holdings are not reported here. The only way we can calculate them is:

holdings ~= AUM / btc_ref_price

Problem:

  • There is no reference price available on the website, so we have to pick one. I'd go with the daily closing price, but I am open to discussion here.
  • For the price we could use data from Yahoo Finance, but since Coinbase is the custodian here, it would be better to get the BTC price from Coinbase.
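The approximation above can be written as a small helper. The function name `estimate_holdings` and the sample numbers are illustrative; where the reference price comes from is left open, as in the discussion.

```python
def estimate_holdings(aum_usd, btc_ref_price_usd):
    """Approximate BTC holdings as AUM divided by a reference BTC price.

    This is an estimate only: the result depends on which reference price
    is chosen (e.g. the daily close), which the issue leaves undecided.
    """
    if btc_ref_price_usd <= 0:
        raise ValueError("reference price must be positive")
    return aum_usd / btc_ref_price_usd


# Made-up numbers: 650,000 USD of AUM at 65,000 USD per BTC.
approx = estimate_holdings(650_000.0, 65_000.0)
```

With these sample inputs the estimate is 10 BTC.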

ETN Notes - London Stock Exchange

source

  • Are these ETFs or derivatives?
  • Keep up with the news: the deadline for submission is April 15th and the first trading day is May 22.
  • If these are ETFs, find out who the issuers are and where they publish data.

ARKB - Scraper

Scenario: it seems the website has changed, and on the landing page a banner pops up asking whether to continue to the USA or EU website.

Problem:

  • the scraper fails because this banner prevents the next banner from appearing
  • not sure this happens 100% of the time

Solution: catch this banner and click the link to the USA website

--from-date

Add an option to ingest data starting from a target reference date:

python runner/db/update_raw.py --from-date "yyyy-mm-dd"

This option cannot be passed together with --date, and vice versa.
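The exclusivity rule maps directly onto an argparse mutually exclusive group. This is a sketch under the assumption that the runners use argparse; the `-d` short flag mirrors the commands in the --date issue.

```python
import argparse
from datetime import date


def build_parser():
    parser = argparse.ArgumentParser(description="Ingest raw data.")
    # argparse rejects the command line if both options are given.
    group = parser.add_mutually_exclusive_group()
    group.add_argument(
        "-d", "--date",
        type=date.fromisoformat,  # expects yyyy-mm-dd
        help="ingest data for this reference date only",
    )
    group.add_argument(
        "--from-date",
        type=date.fromisoformat,
        help="ingest data from this reference date onwards",
    )
    return parser


args = build_parser().parse_args(["--from-date", "2024-04-13"])
```

Passing both `--date` and `--from-date` makes argparse print a usage error and exit, so no extra validation code is needed.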

Request for Updating Scraped Data

Hi Team,

Thanks for your great work! I really appreciate it.

This issue is to kindly ask for an update of the scraped data, which start on Feb 23rd and end on May 19th.

Many thanks for your help.

ARKB Scraper

Since June, when the webpage is scraped and saved as an html file, relevant data such as the number of shares are missing from the saved file.
This is probably because the webpage has changed and the scraping commands are no longer effective.

This causes the scraper to fail when the extract method is called, as either empty strings or None values are found.

I do not know how to tackle this bug, but we are not using any data from the webpage itself, as holdings are found in the csv file.
So let's turn this off for now while we figure out how to fix the bug.

HARVEST

  • BTC
    • ticker 9439
    • html
    • 30/04
    • 01/05
    • holdings -> html -> fund_size * % virtual asset / avg(market_price_borsera, market_price_amc)
  • ETH
    • ticker 9179
    • html
    • 30/04
    • 01/05

BRRR - url has changed

Since late July, BRRR data are available here: https://valkyrie-funds.com/brrr/

We found out about this in early August, so we have pretty much been scraping an empty webpage.
Not a big deal: holdings on the 24th of July were 8877.57, whereas on the 2nd of Aug they were 8876.9,
so we did not miss much.

BORSERA

  • BTC
    • ticker BRRAP
    • html
    • xlsx
    • 30/04
    • 01/05
    • holdings -> xlsx -> market_value * net_asset_% / market_price
  • ETH
    • ticker
    • html
    • xlsx
    • 30/04
    • 01/05

IBIT - extract

The method to extract data from html fails from 2024-05-22 onwards because the DOM has changed.
