Comments (7)

johncronan commented on July 28, 2024

Thousands of time-series datasets accessed through a cart-based ordering system. The biggest category is radiometry (MFRSR is a network of solar irradiance sensors). Many of the other datasets are analyses and experimental results.

I don't see an obvious way to bulk download these. Suggest low priority.

from datasets.

donbright commented on July 28, 2024

The website above appears to be dead.

But it looks like this data might be available in bulk at ftp://ftp.arm.gov/

Especially pub/sites which appears to have a massive amount of data from the various ARM research sites, using abbreviations for each site.

Sizes:

0.8 T	./pub/project
2.9 T	./pub/sites
3.7 T	./pub

Looking at their website, https://www.arm.gov/about/history, they have about 25 years of climate data that is used to develop climate models. They claim to have a petabyte (1024 terabytes), so this FTP site would be only a small fraction of it.
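
For scale, the fraction can be worked out directly. This is only a rough sketch: it assumes the 3.7 T figure from the listing above and the claimed petabyte use the same (binary) terabyte.

```python
# Rough share of the claimed archive that the public FTP tree represents.
# Assumption: both figures use the same (binary) terabyte.
ftp_pub_tb = 3.7       # size of ./pub from the listing above
archive_tb = 1024.0    # "a Petabyte (1024 Terabytes)"

share = ftp_pub_tb / archive_tb
print(f"{share:.2%} of the claimed archive")  # prints "0.36% of the claimed archive"
```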

gabefair commented on July 28, 2024

I don't think it's going to be possible to mirror this dataset without some serious coordination.

donbright commented on July 28, 2024

lol i crashed my machine trying to do too many 'lftp du -h' at the same time, ran out of memory or something. will keep trying.
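
One way to avoid launching too many size queries at once is to cap the number of concurrent lftp processes. The following is a minimal sketch, not what was actually run: the helper names (du_command, remote_du, run_limited) are hypothetical, and it assumes lftp is installed and that its built-in du accepts -s and -h.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def du_command(host: str, path: str) -> list[str]:
    """Build an lftp invocation that prints a summarized size for one path."""
    # `lftp -c` runs the given commands and exits.
    return ["lftp", "-c", f"open {host}; du -sh {path}"]


def remote_du(host: str, path: str) -> str:
    """Run one size query over FTP and return its raw output."""
    result = subprocess.run(
        du_command(host, path), capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def run_limited(paths, worker, max_workers=2):
    """Apply `worker` to each path with at most `max_workers` running at once."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of `paths` in its results.
        return list(pool.map(worker, paths))
```

A call such as run_limited(["/pub/project", "/pub/sites"], lambda p: remote_du("ftp.arm.gov", p)) would then keep at most two lftp processes alive at a time. Whether that is enough to stay within memory depends on the machine and the size of the trees being walked.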

donbright commented on July 28, 2024

pub/projects is .8 T

pub/sites is 2.9 T, but 2.4T of that is under 'pub/sites/aafF1/hiscale/video'

bkirkbri commented on July 28, 2024

Seems like an ideal candidate to reach out to and get bulk access to priority data.

donbright commented on July 28, 2024

attempting offline copy of everything in 'sites' except 'Video'. getting good xfer rate, 1.7M/s


update

local copy of 'pub/sites' without Video subdirectory is complete.

commands used: 
$ lftp ftp.arm.gov:/pub/sites
> mirror -v -v -n -x ".*Video.*"

Total: 625 directories, 30318 files, 0 symlinks
New: 29166 files, 0 symlinks
Modified: 3 files, 0 symlinks
306781631320 bytes transferred in 465523 seconds (643.6 KiB/s)
lftp ftp.arm.gov:/pub/sites> 
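
The exclusion pattern passed to mirror -x above is a regular expression, so its behavior can be tried out locally before starting a long transfer. The sketch below uses Python's re module as a stand-in. The file names are hypothetical, and it is an assumption that lftp's matching is case-sensitive in the same way Python's is by default.

```python
import re

# Assumption: same pattern text as the `mirror -x` argument above.
EXCLUDE = re.compile(r".*Video.*")


def is_excluded(path, pattern=EXCLUDE):
    """Return True if the pattern matches anywhere in the path."""
    return pattern.search(path) is not None


candidates = [
    "aafF1/hiscale/Video/clip.mp4",  # hypothetical file name
    "aafF1/hiscale/video/clip.mp4",  # same path, lowercase spelling
    "aafF1/acmev/data.nc",           # hypothetical file name
]
for p in candidates:
    print(p, is_excluded(p))
```

In Python only the first path is excluded, because the pattern is case-sensitive. If the remote directory were spelled 'video', a case-insensitive pattern (for example re.compile("video", re.IGNORECASE) here) would be needed to skip it.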

( Size du -h is 326G )
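
The figures in the transfer summary can be cross-checked with a little arithmetic. This sketch only uses the byte and second counts printed by lftp above.

```python
bytes_transferred = 306_781_631_320  # from the lftp summary
seconds = 465_523                    # from the lftp summary

rate_kib_s = bytes_transferred / seconds / 1024
days = seconds / 86_400
size_gib = bytes_transferred / 1024**3

print(f"{rate_kib_s:.1f} KiB/s")  # 643.6 KiB/s, matching what lftp reported
print(f"{days:.1f} days")         # 5.4 days
print(f"{size_gib:.1f} GiB")      # 285.7 GiB
```

The roughly 286 GiB transferred is less than the 326G that du reports. One plausible explanation, not confirmed in the thread, is that the summary counts 29166 new files out of 30318 in total, so some files were already present locally before this run.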


OK ... in preparation for organizing this better in the future: the vast majority of the pub/sites data (excluding Video) comes from three projects run by their "Aerial Facility" (measuring from aircraft) https://dis.arm.gov/sites/aaf

HISCALE https://www.arm.gov/research/campaigns/aaf2016hiscale
Holistic Interactions of Shallow Clouds, Aerosols, and Land-Ecosystems (HI-SCALE)

ACMEV https://www.arm.gov/research/campaigns/aaf2014armacmev
ARM Airborne Carbon Measurements (ARM-ACME V)

ICARUS https://www.arm.gov/news/features/post/37859
Inaugural Campaigns for ARM Research using Unmanned Systems (ICARUS)

under ftp.arm.gov/pub/sites/aafF1:
137G	./acmev
13G	./icarus
177G	./hiscale
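
As a sanity check, the three per-project sizes above can be summed and compared with the 326G total reported for the local copy. A minimal sketch, assuming every line uses the G suffix as in the listing (the function name is hypothetical):

```python
du_output = """\
137G\t./acmev
13G\t./icarus
177G\t./hiscale
"""


def total_gigabytes(text: str) -> int:
    """Sum the leading `<number>G` field of each non-empty du line."""
    total = 0
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on any run of whitespace: size first, path second.
        size, _path = line.split(None, 1)
        total += int(size.rstrip("G"))
    return total


print(total_gigabytes(du_output), "G")  # 327 G
```

The sum of 327G is within rounding of the 326G total, which is consistent with these three projects accounting for nearly all of the mirrored data.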

examining pub/projects.


ok looks like im out of this game for this month. used more data in a week than i usually use in half a year
