
Comments (15)

commented on July 28, 2024

The link is not working, according to both my ISP and Is It Down Right Now: http://www.isitdownrightnow.com/nassgeodata.gmu.edu.html


nickrsan commented on July 28, 2024

It seems to be up now if you'd like to try again. Thank you!


alex-kazda commented on July 28, 2024

I will try to get this data, but it might be a tricky dataset to get. So far, I could only find the data as TIFF images downloaded from a web interface, one image for each year, plus a dBase database of crops.

I'm new to archiving, and it will probably take me several days to get the data. I estimate the size of the data (uncompressed) to be about 6 GB per year, so about 120 GB for the 20-year dataset.

Metadata (reasonably small) should be here: https://www.nass.usda.gov/Research_and_Science/Cropland/metadata/meta.php

EDITED: @mxplusb, is there a mirror of this somewhere, so that I don't actually need to download the dataset myself?


gabefair commented on July 28, 2024

@alex-kazda were you able to download a copy, or does this dataset still need help?


alex-kazda commented on July 28, 2024

@gabefair I have not downloaded it yet. I had a preliminary look at the database interface and downloaded some small samples. I'm in Austria and need to go to sleep now, so if you think this needs doing more quickly, you can start downloading the data. I recommend clicking the little red-white-and-blue outline of the US and selecting the regions to download on a state-by-state basis (that is the best partition of the data I have found so far; the web interface did not let me select the whole USA, and states seem like reasonable units).


rustyguts commented on July 28, 2024

You can download the entire dataset for each year; it should be about 1.5 GB per year. Is that all the data?

https://nassgeodata.gmu.edu/nass_data_cache/tar/2016_cdls.tar.gz
(replace 2016 with the year you want to download)
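
If you want to sanity-check that size claim before committing bandwidth, a small sketch like the following (my own addition, assuming the URL pattern above and the requests library; verify=False matches the certificate workaround used later in this thread) asks the server for each year's advertised size:

import requests

# Hypothetical size check: HEAD each yearly archive and report the
# Content-Length the server advertises, without downloading anything.
URL = "https://nassgeodata.gmu.edu/nass_data_cache/tar/{year}_cdls.tar.gz"

for year in range(1996, 2017):
    r = requests.head(URL.format(year=year), verify=False, allow_redirects=True)
    size = r.headers.get("Content-Length")
    if r.ok and size:
        print("{}: {:.2f} GB".format(year, int(size) / 1e9))
    else:
        print("{}: unavailable (HTTP {})".format(year, r.status_code))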


RoboDonut commented on July 28, 2024

Can I mirror from a public S3 bucket?


nickrsan commented on July 28, 2024

Hi @RoboDonut - nice to see you here! Yes, a public S3 bucket is totally fine - whatever tech makes the most sense for you. Lots of people are using S3 and posting the URLs back here; we're trying to line up additional storage now too.
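
For anyone taking the same route, a minimal sketch of the upload step, assuming boto3, locally configured AWS credentials, and a hypothetical bucket name (nass-mirror-example is a placeholder; the download directory matches the script later in this thread):

import os
import boto3

BUCKET = "nass-mirror-example"  # placeholder, not the actual mirror bucket
LOCAL_DIR = r"C:\NASS"

s3 = boto3.client("s3")
for name in sorted(os.listdir(LOCAL_DIR)):
    if name.endswith(".tar.gz"):
        # public-read makes each object fetchable over plain HTTP(S)
        s3.upload_file(os.path.join(LOCAL_DIR, name), BUCKET, name,
                       ExtraArgs={"ACL": "public-read"})
        print("uploaded", name)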



RoboDonut commented on July 28, 2024

Right on @nickrsan, glad to be here and hope to contribute. I'll start with this data tonight.


RoboDonut commented on July 28, 2024

Weeeee!

import requests
from os.path import join
from requests.packages.urllib3.exceptions import InsecureRequestWarning

# The server's certificate was not verifying cleanly at the time, so
# suppress the warnings here and skip verification below.
requests.packages.urllib3.disable_warnings(InsecureRequestWarning)

def download_file(url, dl_dir):
    """Stream url into dl_dir and return the local file path."""
    local_filename = join(dl_dir, url.split('/')[-1])
    # NOTE the stream=True parameter: stream to disk in 1 KB chunks
    # instead of loading the whole multi-GB tarball into memory.
    r = requests.get(url, stream=True, verify=False)
    with open(local_filename, 'wb') as f:
        for chunk in r.iter_content(chunk_size=1024):
            if chunk:  # filter out keep-alive chunks
                f.write(chunk)
                # (no f.flush() here, per J.F.Sebastian's recommendation)
    return local_filename


files= ['http://nassgeodata.gmu.edu/nass_data_cache/tar/1996_cdls.tar.gz',
        'http://nassgeodata.gmu.edu/nass_data_cache/tar/1997_cdls.tar.gz',
        'http://nassgeodata.gmu.edu/nass_data_cache/tar/1998_cdls.tar.gz',
        'http://nassgeodata.gmu.edu/nass_data_cache/tar/1999_cdls.tar.gz',
        'http://nassgeodata.gmu.edu/nass_data_cache/tar/2000_cdls.tar.gz',
        'http://nassgeodata.gmu.edu/nass_data_cache/tar/2001_cdls.tar.gz',
        'http://nassgeodata.gmu.edu/nass_data_cache/tar/2002_cdls.tar.gz',
        'http://nassgeodata.gmu.edu/nass_data_cache/tar/2003_cdls.tar.gz',
        'http://nassgeodata.gmu.edu/nass_data_cache/tar/2004_cdls.tar.gz',
        'http://nassgeodata.gmu.edu/nass_data_cache/tar/2005_cdls.tar.gz',
        'http://nassgeodata.gmu.edu/nass_data_cache/tar/2006_cdls.tar.gz',
        'http://nassgeodata.gmu.edu/nass_data_cache/tar/2007_cdls.tar.gz',
        'http://nassgeodata.gmu.edu/nass_data_cache/tar/2008_cdls.tar.gz',
        'http://nassgeodata.gmu.edu/nass_data_cache/tar/2009_cdls.tar.gz',
        'http://nassgeodata.gmu.edu/nass_data_cache/tar/2010_cdls.tar.gz',
        'http://nassgeodata.gmu.edu/nass_data_cache/tar/2011_cdls.tar.gz',
        'http://nassgeodata.gmu.edu/nass_data_cache/tar/2012_cdls.tar.gz',
        'http://nassgeodata.gmu.edu/nass_data_cache/tar/2013_cdls.tar.gz',
        'http://nassgeodata.gmu.edu/nass_data_cache/tar/2014_cdls.tar.gz',
        'http://nassgeodata.gmu.edu/nass_data_cache/tar/2015_cdls.tar.gz']

download_directory = r"C:\NASS"
for f in files:
    print("Downloading: {0}".format(f))
    download_file(f, download_directory)
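
One follow-up worth doing once those finish (my addition, not part of the comment above): streamed downloads can truncate silently, so walk each gzip stream to the end before mirroring. A standard-library-only sketch:

import glob
import gzip
import os

# Hypothetical integrity check: reading a gzip stream to EOF raises an
# error if the archive is truncated or corrupt.
for path in sorted(glob.glob(os.path.join(r"C:\NASS", "*.tar.gz"))):
    try:
        with gzip.open(path, "rb") as gz:
            while gz.read(1024 * 1024):  # 1 MB at a time
                pass
        print("OK  {}".format(path))
    except (OSError, EOFError) as err:
        print("BAD {} ({})".format(path, err))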


RoboDonut commented on July 28, 2024

uploading slowly to

http://s3-external-1.amazonaws.com/nass-mirror
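
Since the bucket is public, it can be read without AWS credentials. A sketch using boto3's unsigned mode (the bucket name is taken from the URL above; pagination of the listing is omitted for brevity):

import boto3
from botocore import UNSIGNED
from botocore.config import Config

# Anonymous read of the public mirror: no credentials required.
s3 = boto3.client("s3", config=Config(signature_version=UNSIGNED))
resp = s3.list_objects_v2(Bucket="nass-mirror")
for obj in resp.get("Contents", []):
    print("{:>12} {}".format(obj["Size"], obj["Key"]))
    # To fetch: s3.download_file("nass-mirror", obj["Key"], obj["Key"])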


alex-kazda commented on July 28, 2024

@rustyguts thanks for the URL and @RoboDonut thanks for the backup. Since you have the Data Layer in hand, I will at least download the metadata at https://www.nass.usda.gov/Research_and_Science/Cropland/metadata/meta.php


alex-kazda commented on July 28, 2024

#!/bin/sh
# Fetch the CDL metadata: one zip covering 1997-1999, one zip per year
# for 2000-2015, and two standalone metadata pages, pausing between
# requests to be polite to the server.

wget https://www.nass.usda.gov/Research_and_Science/Cropland/metadata/XMLs_1997-1999.zip

for i in `seq -w 0 15`; do   # -w zero-pads: 00..15 -> XMLs_2000 .. XMLs_2015
   sleep 5
   wget https://www.nass.usda.gov/Research_and_Science/Cropland/metadata/XMLs_20$i.zip
done

wget https://www.nass.usda.gov/Research_and_Science/Cropland/metadata/2015_cultivated_layer_metadata.php
sleep 5
wget https://www.nass.usda.gov/Research_and_Science/Cropland/metadata/crop_frequency_2015_metadata.php


alex-kazda commented on July 28, 2024

The collected metadata (about 6 MB) is at http://atrey.karlin.mff.cuni.cz/~alexak/dokumenty/USDA_Cropland_Data_Layer_Metadata.zip for now (this sharing method does not scale well, so please attach it to whatever you have collected).

