
iberianbees's Introduction

DOI License: CC BY 4.0

IberianBees database v.1.0.0 🐝

This is a repository to document the distribution and diversity of bee species of the Iberian Peninsula. You can see a summary of the data here.

How to contribute:

If you have occurrence data for Iberian bees, fill in this template and send it to [email protected]

How to use this repo

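  • The IberianBees database can be found in Data/iberian_bees.csv.gz. This is a gzip-compressed file, not a zip archive: decompress it with a gzip-aware tool (e.g. gunzip), or read it directly in R, which handles gzip compression transparently.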

  • Metadata can be consulted here.

  • Records whose names are not accepted in the Iberian bee species masterlist have been excluded from the final dataset but can be found in Data/Processing_iberian_bees_raw/removed.csv.

  • If you spot any issue, please let @ibartomeus know by creating an issue that includes the unique identifier (uid) of the record that needs to be fixed. This avoids duplicated effort.

  • If you are curious about the process, keep reading.
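As a minimal sketch of the usage notes above, the snippet below writes a tiny gzip-compressed csv to a temporary file, reads it back without decompressing it first, and looks up one record by its identifier. The toy records and the column name `uid` are assumptions made for illustration; check the metadata for the real column names.

```r
# Toy records standing in for the real database (values are made up)
toy <- data.frame(
  uid = c("rec_1", "rec_2"),
  Accepted_name = c("Xylocopa violacea", "Apis mellifera"),
  Year = c(2005, 1998),
  stringsAsFactors = FALSE
)

# Write a gzip-compressed csv to a temporary file
path <- tempfile(fileext = ".csv.gz")
con <- gzfile(path, "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# read.csv() detects gzip compression, so no manual unzipping is needed
bees <- read.csv(path, stringsAsFactors = FALSE)

# Look up a single record by its identifier, e.g. before reporting an issue
record <- bees[bees$uid == "rec_2", ]
print(record)
```

Quoting the uid of the affected record in an issue lets maintainers locate it with a lookup like the last line.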

Process:

To build this database, we follow a reproducible workflow to clean and assemble the data.

1- Use Scripts/1_1_Fetch_data.R to update data from the internet (i.e. GBIF, iNaturalist).

2- Add new datasets (i.e. csv files) locally to Data/Rawdata/csvs/.

3- Process and clean individual files and assign a unique identifier within the folder Scripts/1_2_Processing_raw_data/.

4- Run Scripts/2_Run_all-Merge_all.R. This will run all individual files in Scripts/1_2_Processing_raw_data/ and bind the data. The data can also be merged directly, without re-running all files, by running only the second section of the code, "2 Merge all files".

5- Conduct a final cleaning (things that weren't fixed in the individual files in step 3). This is done in Scripts/3_1_Final_cleaning.R and will generate the final dataset Data/iberian_bees.csv.gz.

5.1- Records with non-accepted species names are excluded and saved to Data/Processing_iberian_bees_raw/removed.csv.

5.2- The non-accepted species names (e.g., synonyms) are checked manually from Data/Processing_iberian_bees_raw/to_check.csv and added to Data/Processing_iberian_bees_raw/manual_checks.csv once they have been reviewed, with taxonomic advice when necessary. After running Scripts/3_1_Final_cleaning.R again, the fixed species will be included in the final IberianBees dataset.
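The logic of steps 4 and 5 can be sketched roughly as follows. This is not the code in the repository scripts: the toy data frames, the column names `Species`, `original` and `fixed`, and the in-memory masterlist are all hypothetical, and the sketch applies the manual fixes before splitting the records, which may differ from the order used in Scripts/3_1_Final_cleaning.R.

```r
# Hypothetical cleaned data frames, one per source (output of step 3)
source_a <- data.frame(uid = c("a_1", "a_2"),
                       Species = c("Xylocopa violacea", "Xylocopa violacia"),
                       stringsAsFactors = FALSE)
source_b <- data.frame(uid = c("b_1", "b_2"),
                       Species = c("Apis mellifera", "Unknown bee"),
                       stringsAsFactors = FALSE)

# Step 4: bind all sources into a single table
all_records <- do.call(rbind, list(source_a, source_b))

# Hypothetical masterlist of accepted names and table of reviewed names
masterlist <- c("Xylocopa violacea", "Apis mellifera")
manual_checks <- data.frame(original = "Xylocopa violacia",
                            fixed = "Xylocopa violacea",
                            stringsAsFactors = FALSE)

# Step 5.2: replace names that have already been reviewed manually
idx <- match(all_records$Species, manual_checks$original)
all_records$Accepted_name <- ifelse(is.na(idx),
                                    all_records$Species,
                                    manual_checks$fixed[idx])

# Step 5.1: keep records with accepted names, set the rest aside
accepted <- all_records$Accepted_name %in% masterlist
final <- all_records[accepted, ]
removed <- all_records[!accepted, ]

print(final)
print(removed)
```

Keeping the removed records in their own table, rather than dropping them, is what allows a name to be recovered later once it has been reviewed.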

Metadata is generated using DataSpice.

Example:

Here, we provide an example of how to select, filter and plot the distribution of the species Xylocopa violacea for the records after the year 1999.

  • First, read compressed data in gzip format:
data <- read.table("../Data/iberian_bees.csv.gz",
                   header = TRUE, quote = "\"", sep = ",", row.names = 1)
  • Second, select records of X. violacea after 1999
library(dplyr) # Library to filter data
xylocopa <- data %>%
  filter(Accepted_name == "Xylocopa violacea" & Year > 1999)
  • Finally, load map and plot records:
library(ggplot2) # To load the world map and plot (map_data() requires the 'maps' package)
# Load map
world <- map_data("world")
# Plot records and adjust map to the Iberian Peninsula
ggplot(data = xylocopa, aes(Longitude, Latitude)) +
  geom_map(data = world, map = world,
           aes(long, lat, map_id = region),
           color = "white", fill = "grey", size = 0.1) +
  coord_sf(xlim = c(-9, 4), ylim = c(36, 44)) +
  geom_point()
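Before plotting, it may help to drop records without coordinates or outside the map window, since points with missing coordinates cannot be drawn. The sketch below uses a made-up data frame with the same column names as the example above and reuses the bounding box from the `coord_sf()` call; it is an optional extra step, not part of the original example.

```r
# Made-up records with the columns used in the example above
toy <- data.frame(
  Accepted_name = rep("Xylocopa violacea", 4),
  Year = c(2001, 2010, 2015, 2020),
  Longitude = c(-3.7, NA, 2.1, 25.0),
  Latitude = c(40.4, 41.0, 41.4, 60.0),
  stringsAsFactors = FALSE
)

# Records with both coordinates present
has_coords <- !is.na(toy$Longitude) & !is.na(toy$Latitude)

# Records inside the window used for the Iberian Peninsula map
in_box <- toy$Longitude >= -9 & toy$Longitude <= 4 &
  toy$Latitude >= 36 & toy$Latitude <= 44

# Keep only records that can be drawn on the map
plottable <- toy[has_coords & in_box, ]
print(plottable)
```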

iberianbees's People

Contributors

josebsl, ibartomeus, ethanwhite, weecologydeploy, gmyenni, unnibarge, miguelangelcollado


iberianbees's Issues

Add an author contribution table to Data

It would be nice to add a simple author contribution table with name, affiliation, etc. It can be generated from the database, or perhaps simply from the Gdocs we used to compile authors for the publication.

Possible date error?

"But I have noticed an inconsistency: some records have "1970" as the year, while the "start.date" and "end.date" columns show different dates, with a day, a month and a year other than 1970. It happens mostly with that year; I don't know whether it happens with other dates."
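One way to list the affected records is to compare the year column with the years extracted from the two date columns. The sketch below is only an illustration on made-up rows: it assumes the columns are named `Year`, `start.date` and `end.date`, that a `uid` column exists, and that dates are stored as ISO strings (YYYY-MM-DD), none of which is confirmed here.

```r
# Made-up records; the second one reproduces the reported inconsistency
toy <- data.frame(
  uid = c("r1", "r2", "r3"),
  Year = c(1970, 1970, 2003),
  start.date = c("1970-05-12", "1998-06-01", "2003-04-20"),
  end.date = c("1970-05-12", "1998-06-03", "2003-04-20"),
  stringsAsFactors = FALSE
)

# Extract the year from each date column (assumes ISO formatted dates)
start_year <- as.integer(format(as.Date(toy$start.date), "%Y"))
end_year <- as.integer(format(as.Date(toy$end.date), "%Y"))

# Flag records whose Year falls outside the start/end year range;
# records with a missing date are not flagged
mismatch <- !is.na(start_year) & !is.na(end_year) &
  (toy$Year < start_year | toy$Year > end_year)

# Identifiers of the records to review
flagged <- toy$uid[mismatch]
print(flagged)
```

Running the same comparison for every year, not only 1970, would show whether the problem is limited to that value.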

Things to do and datasets to add

Datasets to add

[x] Species master list
[x] Contributors via .xls in "/rawdata" [missing: Curro, Cap de creus, ...]
[x] Thomas Wood et al. data (Check data from Ian Cross)
[o] Historical papers [MA]
[ ] E. Asensio data
[o] Museo ciencias Naturales [Piluca]
[ ] Museo bcn [Curro]
[ ] Other datasets: Felix Torres, Leopoldo Castro, Obeso[x], Aguado, Piluca, Ortiz PDFs ...

Seladonia 26_Martinez-Lopez

These two species (Seladonia gemmea, Seladonia gemmella) are missing from the database, although they seem to be included in the masterlist of Iberian bees.
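A quick way to spot cases like this is to compare the names in the masterlist with the names present in the database. In the sketch below both vectors are made up for illustration; in practice they would come from the masterlist file and from the `Accepted_name` column of the database.

```r
# Made-up name vectors standing in for the masterlist and the database
masterlist_names <- c("Seladonia gemmea", "Seladonia gemmella",
                      "Xylocopa violacea")
database_names <- c("Xylocopa violacea", "Apis mellifera")

# Names in the masterlist that have no record in the database
missing_from_db <- setdiff(masterlist_names, database_names)
print(missing_from_db)
```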
