
Movebank dataset summarizer

This project scrapes and summarizes the data sets available on Movebank. The data sources are hugely diverse: they vary in license terms, in whether they are publicly accessible, and in taxa. This is an attempt to better understand how accessible the relevant data are for my own project. I'm hoping to wrap this up as a more general-use template, so if that is something you would be interested in, please open an issue and we can discuss. I'd rather do that with an idea of what others are looking for. In the meantime, this can be used and adapted for your own needs.

We use a number of packages for downloading (rmoveapi, move), processing (data.table, anytime) and visualizing (ggplot2, leaflet) the data. All steps are wrapped up in a targets workflow, and package versions are tracked with renv. The result is a bookdown document with each study on its own page. A minimal working example of combining targets and bookdown is available at robitalec/targets-parameterized-bookdown. Thank you to the developers of all of these great packages.

Setup

  1. Register for a Movebank account
  2. Run the setup script to install packages with renv, save your credentials with keyring and download study data
  3. Edit the credentials and path sections in the targets file (_targets.R)
  4. Run targets::tar_make()

I've decided to remove the study data download step from the targets workflow because it was a huge step that I didn't want to rerun during development. I figure folks will do this step once, so it's not worth holding up the workflow. Because the paths are tracked, however, targets will only rerun what is needed as data are added or removed.
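As a rough illustration of how tracked paths let targets skip unchanged work, here is a minimal sketch of a `_targets.R` fragment. It is not this project's actual pipeline: the directory `data/studies`, the target names and the helper `summarize_studies()` are all hypothetical. It relies on `format = "file"` in `tar_target()`, which, as I understand it, makes targets hash the files behind the returned path (including the contents of a directory), so downstream targets are invalidated only when those files change.

```r
# _targets.R (sketch only; paths, names and the helper are assumptions)
library(targets)

# Hypothetical helper: list the downloaded files and record their sizes
summarize_studies <- function(dir) {
  files <- list.files(dir, full.names = TRUE)
  data.frame(file = basename(files), size_bytes = file.size(files))
}

list(
  # Track the download directory as a file target, so that adding,
  # removing or changing files in it invalidates this target
  tar_target(study_dir, "data/studies", format = "file"),

  # Depends on study_dir, so it reruns only when study_dir is invalidated
  tar_target(study_summary, summarize_studies(study_dir))
)
```

With a layout like this, a second call to `targets::tar_make()` right after the first should skip both targets, since nothing in the tracked directory has changed.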

Caveats/lessons learned

If you are hoping to do something similar, I have two main lessons learned:

  • Reduce the data to only what you need as quickly as possible, without going far into analysis or processing. This is directly related to...
  • The data are not homogeneous. Data types do not appear to be strictly enforced, there are many duplicated study names, and there are NAs or errors throughout. This makes it hard to apply the same functions to every study, or to combine different datasets programmatically.

So reduce down to only the data you need before doing anything too specific.

Some things to watch out for:

  • duplicated study names
  • errors in dates (e.g. typos, impossible study period start/end date ranges)
  • the taxa provided are often a list of different taxonomies or taxonomic ranks
  • pay attention to the Movebank API documentation
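The checks above can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the data frame and its column names (`name`, `start`, `end`, `license`) are made up for the example and are not the fields returned by the Movebank API, and the project itself uses data.table rather than base R for processing.

```r
# Hypothetical study metadata; the columns and values are invented
studies <- data.frame(
  name    = c("Study A", "Study B", "Study A", "Study C"),
  start   = c("2015-01-01", "2018-06-01", "2016-03-10", "2019-13-01"),
  end     = c("2016-01-01", "2017-06-01", "2017-03-10", "2020-01-01"),
  license = c("CC0", "custom", "CC-BY", NA),
  stringsAsFactors = FALSE
)

# Reduce to only the columns needed before doing anything more specific
studies <- studies[, c("name", "start", "end")]

# Flag every row whose study name occurs more than once
studies$dup_name <- duplicated(studies$name) |
  duplicated(studies$name, fromLast = TRUE)

# Parse dates with an explicit format; strings that do not parse become NA
studies$start_date <- as.Date(studies$start, format = "%Y-%m-%d")
studies$end_date <- as.Date(studies$end, format = "%Y-%m-%d")

# Flag unparseable dates, and study periods that end before they start
studies$bad_date <- is.na(studies$start_date) | is.na(studies$end_date)
studies$bad_range <- !studies$bad_date & studies$end_date < studies$start_date

# Rows that need a closer look before any further processing
flagged <- studies[studies$dup_name | studies$bad_date | studies$bad_range, ]
print(flagged)
```

Flagging rows rather than dropping them keeps the decision about what to do with each problem separate from the detection step, which helps when different studies fail in different ways.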

Citing

This project has been released under a GPL-3 license and with a Zenodo DOI:

Alec L. Robitaille. robitalec/move-book. Zenodo.
