
sits's Introduction

SITS - Satellite Image Time Series Analysis for Earth Observation Data Cubes



Overview

sits is an open source R package for satellite image time series analysis. It enables users to apply machine learning techniques for classifying image time series obtained from earth observation data cubes. The basic workflow in sits is:

  1. Select an image collection available on cloud providers AWS, Brazil Data Cube, Digital Earth Africa, Copernicus Data Space, Digital Earth Australia, Microsoft Planetary Computer, NASA Harmonized Landsat/Sentinel, and Swiss Data Cube.
  2. Build a regular data cube from analysis-ready image collections.
  3. Extract labelled time series from data cubes to be used as training samples.
  4. Perform sample quality control using self-organised maps.
  5. Train machine learning and deep learning models.
  6. Tune deep learning models for improved accuracy.
  7. Classify data cubes using machine learning and deep learning models.
  8. Run spatial-temporal segmentation methods for object-based time series classification.
  9. Post-process classified images with Bayesian smoothing to remove outliers.
  10. Estimate uncertainty values of classified images.
  11. Evaluate classification accuracy using best practices.
  12. Improve results with active learning and self-supervised learning methods.

Conceptual view of data cubes (source: authors)

Documentation

Detailed documentation on how to use sits is available in the e-book “Satellite Image Time Series Analysis on Earth Observation Data Cubes”.

sits on Kaggle

Those who want to evaluate the sits package before installing it are invited to run the examples available on Kaggle. If you are new to Kaggle, please follow the instructions to set up your account. These examples provide a fast-track introduction to the package. We recommend running them in the following order:

  1. Introduction to SITS
  2. Working with time series in SITS
  3. Creating data cubes in SITS
  4. Improving the quality of training samples
  5. Machine learning for data cubes
  6. Classification of raster data cubes
  7. Bayesian smoothing for post-processing
  8. Uncertainty and active learning
  9. Object-based time series classification

Installation

Pre-Requisites

The sits package relies on the geospatial packages sf, stars, gdalcubes and terra, which depend on the external libraries GDAL and PROJ. Please follow the instructions for installing sits from the Setup chapter of the on-line sits book.

Obtaining sits

sits can be installed from CRAN:

install.packages("sits")

The latest supported version is available on GitHub. It may include fixes that are not yet in the version available from CRAN.

devtools::install_github("e-sensing/sits", dependencies = TRUE)
# load the sits library
library(sits)
#> SITS - satellite image time series analysis.
#> Loaded sits v1.5.1.
#>         See ?sits for help, citation("sits") for use in publication.
#>         Documentation avaliable in https://e-sensing.github.io/sitsbook/.

Support for GPU

Classification using torch-based deep learning models in sits uses CUDA-compatible NVIDIA GPUs if available, which provides up to a 10-fold speed-up compared to using CPUs only. Please see the installation instructions for more information on how to install the required drivers.

Building Earth Observation Data Cubes

Image Collections Accessible by sits

Users create data cubes from analysis-ready data (ARD) image collections available in cloud services. The collections accessible in sits 1.5.1 are:

  • Brazil Data Cube - BDC: Open data collections of Sentinel-2, Landsat-8 and CBERS-4 images.
  • Copernicus Data Space Environment CDSE: Open data collections from the EU Copernicus programme.
  • Earth on AWS - AWS: Sentinel-2/2A level 2A collections.
  • Digital Earth Africa - DEAFRICA: Open data collection of Sentinel-2/2A and Landsat-8 for Africa.
  • Digital Earth Australia - DEAUSTRALIA: Open data collections for the Australian subcontinent.
  • Microsoft Planetary Computer - MPC: Open data collection of Sentinel-2/2A and Landsat-8.
  • NASA Harmonized Landsat/Sentinel Collection HLS.
  • Swiss Data Cube (SDC): Open data collection of Sentinel-2/2A and Landsat-8.
  • USGS: Landsat-4/5/7/8 collections, which are not open data.

Open data collections do not require payment of access fees. Except for those in the Brazil Data Cube, these collections are not regular. Irregular collections require further processing before they can be used for classification using machine learning models.

Building a Data Cube from an ARD Image Collection

The following code defines an irregular data cube of Sentinel-2/2A images available in the Microsoft Planetary Computer, using the open data collection "SENTINEL-2-L2A". The geographical area of the data cube is defined by the tiles "20LKP" and "20LLP", and the temporal extent by a start and end date. Access to other cloud services works in similar ways.

s2_cube <- sits_cube(
  source = "MPC",
  collection = "SENTINEL-2-L2A",
  tiles = c("20LKP", "20LLP"),
  bands = c("B03", "B08", "B11", "SCL"),
  start_date = as.Date("2018-07-01"),
  end_date = as.Date("2019-06-30"),
  progress = FALSE
)

This cube is irregular. The timelines of tiles "20LKP" and "20LLP" and the resolutions of the bands are different. Sentinel-2 bands "B03" and "B08" have 10-meter resolution, while band "B11" and the cloud band "SCL" have 20-meter resolution. Irregular collections need an additional processing step to be converted to regular data cubes, as described below.

Conceptual view of data cubes (source: authors)

After defining an irregular ARD image collection from a cloud service using sits_cube(), users should run sits_regularize() to build a regular data cube. This function uses the gdalcubes R package, described in Appel and Pebesma, 2019.

gc_cube <- sits_regularize(
  cube          = s2_cube,
  output_dir    = tempdir(),
  period        = "P15D",
  res           = 60,
  multicores    = 4
)

The above command builds a regular data cube with all bands interpolated to 60 m spatial resolution and 15-days temporal resolution. Regular data cubes are the input to the sits functions for time series retrieval, building machine learning models, and classification of raster images and time series.
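To make the idea of temporal regularization more concrete, the following base R sketch bins irregular acquisition dates into 15-day intervals and keeps one value per interval. It is only an illustration under stated assumptions: the dates and values are made up, and the median aggregation is a choice made for this sketch, not necessarily what sits_regularize() or gdalcubes does internally.

```r
# Hypothetical irregular acquisition dates and band values for one pixel
dates <- as.Date(c("2018-07-01", "2018-07-06", "2018-07-18",
                   "2018-07-21", "2018-08-02"))
values <- c(0.40, 0.44, 0.50, 0.54, 0.60)

# Index of the 15-day interval each date falls into, counted from the first date
period_index <- as.integer(dates - min(dates)) %/% 15L

# One value per interval (the median is an assumption made for this sketch)
regular_values <- as.numeric(tapply(values, period_index, median))
regular_dates <- min(dates) + 15L * sort(unique(period_index))

# Regular series: one row per 15-day interval
regular_series <- data.frame(date = regular_dates, value = regular_values)
print(regular_series)
```

The five irregular observations collapse into three regularly spaced ones, which is the property that the time series and classification functions rely on.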

The cube can be shown in a leaflet using sits_view().

# View a color composite on a leaflet
sits_view(s2_cube[1, ], green = "B08", blue = "B03", red = "B11")

Working with Time Series in sits

Accessing Time Series in Data Cubes

sits has been designed to use satellite image time series to derive machine learning models. After the data cube has been created, time series can be retrieved individually or by using CSV or SHP files, as in the following example. The example below uses a data cube in a local directory, whose images have been obtained from the "MOD13Q1-6" collection of the Brazil Data Cube.

library(sits)
# this data cube uses images from the Brazil Data Cube that have
# been downloaded to a local directory
data_dir <- system.file("extdata/raster/mod13q1", package = "sits")
# create a cube from downloaded files
raster_cube <- sits_cube(
  source = "BDC",
  collection = "MOD13Q1-6",
  data_dir = data_dir,
  delim = "_",
  parse_info = c("X1", "X2", "tile", "band", "date"),
  progress = FALSE
)
# obtain a set of samples defined by a CSV file
csv_file <- system.file("extdata/samples/samples_sinop_crop.csv",
  package = "sits"
)
# retrieve the time series associated with the samples from the data cube
points <- sits_get_data(raster_cube, samples = csv_file)
# show the time series
points[1:3, ]
#> # A tibble: 3 × 7
#>   longitude latitude start_date end_date   label    cube      time_series      
#>       <dbl>    <dbl> <date>     <date>     <chr>    <chr>     <list>           
#> 1     -55.8    -11.7 2013-09-14 2014-08-29 Cerrado  MOD13Q1-6 <tibble [12 × 2]>
#> 2     -55.8    -11.7 2013-09-14 2014-08-29 Cerrado  MOD13Q1-6 <tibble [12 × 2]>
#> 3     -55.7    -11.7 2013-09-14 2014-08-29 Soy_Corn MOD13Q1-6 <tibble [12 × 2]>

After a time series has been obtained, it is loaded in a tibble. The first six columns contain the metadata: spatial and temporal location, the label assigned to the sample, and the data cube from which the data has been extracted. The spatial location is given in longitude and latitude coordinates. In the output above, the first sample is labelled “Cerrado”, lies at approximately (-55.8, -11.7), and is considered valid for the period (2013-09-14, 2014-08-29).
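The nested layout, with one time series per row, can be mimicked in base R to show how individual series are accessed. This is a minimal sketch, not the sits data structure itself: the values are invented, and the inner column names Index and NDVI are assumptions made for the illustration.

```r
# Hypothetical stand-in for a sits samples tibble, built with base R only
samples <- data.frame(
  longitude = c(-55.8, -55.7),
  latitude  = c(-11.7, -11.7),
  label     = c("Cerrado", "Soy_Corn")
)

# Each row holds its own time series as a nested data frame (list column)
samples$time_series <- I(list(
  data.frame(Index = as.Date(c("2013-09-14", "2013-09-30")),
             NDVI  = c(0.50, 0.55)),
  data.frame(Index = as.Date(c("2013-09-14", "2013-09-30")),
             NDVI  = c(0.30, 0.35))
))

# Extract the time series of the first sample
first_ts <- samples$time_series[[1]]

# Compute one summary value per sample from its nested series
mean_ndvi <- vapply(samples$time_series, function(ts) mean(ts$NDVI), numeric(1))
```

With a real sits tibble, the same `[[` indexing on the time_series column returns the series of a given sample.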

Time Series Classification

Training Machine Learning Models

sits supports the classification of both individual time series and data cubes. The following machine learning methods are available in sits:

  • Support vector machines (sits_svm())
  • Random forests (sits_rfor())
  • Extreme gradient boosting (sits_xgboost())
  • Multi-layer perceptrons (sits_mlp())
  • 1D convolutional neural networks (sits_tempcnn())
  • Temporal self-attention encoder (sits_tae())
  • Lightweight temporal attention encoder (sits_lighttae())

The following example illustrates how to train a model and classify an individual time series. First we use the sits_train() function with two parameters: the training dataset (described above) and the chosen machine learning model (in this case, TempCNN). The trained model is then used to classify a time series from the Brazilian state of Mato Grosso, using sits_classify(). The results can be shown in text format using the function sits_show_prediction() or graphically using plot.

# training data set
data("samples_modis_ndvi")
# point to be classified
data("point_mt_6bands")
# Train a deep learning model
tempcnn_model <- sits_train(
  samples = samples_modis_ndvi,
  ml_method = sits_tempcnn()
)
# Select NDVI band of the  point to be classified
# Classify using TempCNN model
# Plot the result
point_mt_6bands |>
  sits_select(bands = "NDVI") |>
  sits_classify(tempcnn_model) |>
  plot()

Classification of NDVI time series using TempCNN

The following example shows how to classify a data cube organized as a set of raster images. The result can also be visualized interactively using sits_view().

# Create a data cube to be classified
# Cube is composed of MOD13Q1 images from the Sinop region in Mato Grosso (Brazil)
data_dir <- system.file("extdata/raster/mod13q1", package = "sits")
sinop <- sits_cube(
  source = "BDC",
  collection = "MOD13Q1-6",
  data_dir = data_dir,
  delim = "_",
  parse_info = c("X1", "X2", "tile", "band", "date"),
  progress = FALSE
)
# Classify the raster cube, generating a probability cube
probs_cube <- sits_classify(
  data = sinop,
  ml_model = tempcnn_model,
  output_dir = tempdir()
)
# apply Bayesian smoothing to remove outliers
bayes_cube <- sits_smooth(
  cube = probs_cube,
  output_dir = tempdir()
)
# generate a thematic map
label_cube <- sits_label_classification(
  cube = bayes_cube,
  output_dir = tempdir()
)
# plot the labelled cube
plot(label_cube,
  title = "Land use and Land cover in Sinop, MT, Brazil in 2018"
)

Land use and Land cover in Sinop, MT, Brazil in 2018

References

Citable papers for sits

If you use sits, please cite the following paper:

  • Rolf Simoes, Gilberto Camara, Gilberto Queiroz, Felipe Souza, Pedro R. Andrade, Lorena Santos, Alexandre Carvalho, and Karine Ferreira. “Satellite Image Time Series Analysis for Big Earth Observation Data”. Remote Sensing, 13: 2428, 2021. doi:10.3390/rs13132428.

Additionally, the sample quality control methods that use self-organized maps are described in the following reference:

  • Lorena Santos, Karine Ferreira, Gilberto Camara, Michelle Picoli, Rolf Simoes, “Quality control and class noise reduction of satellite image time series”. ISPRS Journal of Photogrammetry and Remote Sensing, 177:75-88, 2021. doi:10.1016/j.isprsjprs.2021.04.014.

Papers that use sits to produce LUCC maps

Papers that describe software used by the sits package

We thank the authors of these papers for making their code available to be used in connection with sits.

  • Marius Appel and Edzer Pebesma, “On-Demand Processing of Data Cubes from Satellite Image Collections with the Gdalcubes Library.” Data 4 (3): 1–16, 2019. doi:10.3390/data4030092.

  • Ron Wehrens and Johannes Kruisselbrink, “Flexible Self-Organising Maps in kohonen 3.0”. Journal of Statistical Software, 87(7), 2018. doi:10.18637/jss.v087.i07.

  • Charlotte Pelletier, Geoffrey I. Webb, and Francois Petitjean. “Temporal Convolutional Neural Network for the Classification of Satellite Image Time Series.” Remote Sensing 11 (5), 2019. doi:10.3390/rs11050523.

  • Vivien Garnot, Loic Landrieu, Sebastien Giordano, and Nesrine Chehata, “Satellite Image Time Series Classification with Pixel-Set Encoders and Temporal Self-Attention”, Conference on Computer Vision and Pattern Recognition, 2020. doi:10.1109/CVPR42600.2020.01234.

  • Vivien Garnot, Loic Landrieu, “Lightweight Temporal Self-Attention for Classifying Satellite Images Time Series”, 2020. arXiv:2007.00586.

  • Maja Schneider, Marco Körner, “[Re] Satellite Image Time Series Classification with Pixel-Set Encoders and Temporal Self-Attention.” ReScience C 7 (2), 2021. doi:10.5281/zenodo.4835356.

  • Jakub Nowosad, Tomasz Stepinski, “Extended SLIC superpixels algorithm for applications to non-imagery geospatial rasters”. International Journal of Applied Earth Observation and Geoinformation, 112, 102935, 2022.

  • Martin Tennekes, “tmap: Thematic Maps in R.” Journal of Statistical Software, 84(6), 1–39, 2018.

Acknowledgements for community support

The authors are thankful for the contributions of Edzer Pebesma, Jakub Nowosad, Marius Appel, Martin Tennekes, Robert Hijmans, Ron Wehrens, and Tim Appelhans, respectively chief developers of the packages sf/stars, supercells, gdalcubes, tmap, terra, kohonen, and leafem. The sits package recognises the great work of the RStudio team, including the tidyverse. Many thanks to Daniel Falbel for his great work in the torch and luz packages. Charlotte Pelletier shared the Python code that has been reused for the TempCNN machine learning model. We would like to thank Maja Schneider for sharing the Python code that helped the implementation of the sits_lighttae() and sits_tae() models. We recognise the importance of the work by Chris Holmes and Matthias Mohr on the STAC specification and API.

Acknowledgements for Financial and Material Support

We acknowledge and thank the project funders that provided financial and material support:

  1. Amazon Fund, established by the Brazilian government with financial contribution from Norway, through the project contract between the Brazilian Development Bank (BNDES) and the Foundation for Science, Technology and Space Applications (FUNCATE), for the establishment of the Brazil Data Cube, process 17.2.0536.1.

  2. Coordenação de Aperfeiçoamento de Pessoal de Nível Superior-Brasil (CAPES) and the Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), for providing MSc and PhD scholarships.

  3. Sao Paulo Research Foundation (FAPESP) under eScience Program grant 2014/08398-6, for providing MSc, PhD and post-doc scholarships, equipment, and travel support.

  4. International Climate Initiative of the German Federal Ministry for the Environment, Nature Conservation, Building and Nuclear Safety (IKI) under grant 17-III-084-Global-A-RESTORE+ (“RESTORE+: Addressing Landscape Restoration on Degraded Land in Indonesia and Brazil”).

  5. Microsoft Planetary Computer under the GEO-Microsoft Cloud Computer Grants Programme.

  6. The Open-Earth-Monitor Cyberinfrastructure project, which has received funding from the European Union’s Horizon Europe research and innovation programme under grant agreement No. 101059548.

  7. FAO-EOSTAT initiative, which uses next generation Earth observation tools to produce land cover and land use statistics.

How to contribute

The sits project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

sits's People

Contributors

albhasan, ammaciel, gilbertocamara, gqueiroz, lorenalves, lubiavinhas, m3nin0, mdsumner, oldlipe, pedro-andrade-inpe, richardscottoz, rolfsimoes, vwmaus, ywatacarvalho


sits's Issues

Check if .sits_convert_resolution is necessary

.sits_convert_resolution() does not work. Check if it is really necessary. The first two lines of this function are:

    res <- vector()
    names[res] <- c("xres", "yres")

They should be as follows to work:

    res <- vector(length = 2)
    names(res) <- c("xres", "yres")

This function is only called by .sits_fromSHP(), in an if that is never true in the tests:

    if (coverage$xres > 1) {
        res <- .sits_convert_resolution(coverage)
        xres <- res["xres"]
        yres <- res["yres"]
    }

Misleading argument

The function sits::sits_classify_raster has the argument file, which is actually a prefix rather than a path to a file. This may mislead users. I suggest renaming this argument to file_prefix.

About sits_select and consistency with other sits functions

Some sits functions work as interfaces to dplyr functions. For example: sits_transmute calls dplyr::transmute, sits_mutate calls dplyr::mutate, sits_sample calls dplyr::sample_n and dplyr::sample_frac. In the transmute and mutate cases, they act upon $time_series, which cannot be easily handled with dplyr directly. However, sits_select calls dplyr::filter (and not dplyr::select), and in this case dplyr could be used directly:

data.tb <- sits_select(cerrado_2classes, label == "Cerrado")

is the same as

data.tb <- dplyr::filter(cerrado_2classes, label == "Cerrado")

It will for sure confuse the user that knows dplyr. The other objective of sits_select is to call sits_select_bands, which is very useful and calls dplyr::select over $time_series. My suggestions to keep consistency are (1) rename sits_select_bands to sits_select (as the other functions already work with $time_series), OR (2) remove sits_select and rename sits_mutate to sits_mutate_bands, sits_transmute to sits_transmute_bands, and so on.

Roll back GBM tests

GBM tests needed to be removed in order to allow the Continuous Integration server to work properly with the updated version of R (3.5.1) and its packages. The lines removed will be shown below.

Error in sits_dendrogram

I am trying to build the sits vignette. The vignette builder is failing in the call

dendro <- sits_dendrogram(cerrado_2classes)
Error in dtwclust::tsclust(values.tb, type = "hierarchical", k = NROW(data.tb), :
Cannot have more clusters than series in the dataset

Could you check what is the problem?

confusion in the names of functions

Some functions/data of sits do not follow a convention in their names. Usually, sits uses only lowercase (even for acronyms - and sits is an acronym written in lowercase) with words separated by underscores:

sits_bayes_postprocess()
sits_csv_error_file()
sits_info_services()
sits_ndvi_arima_filter()
point_ndvi

However, some functions/data do not follow this convention:

sits_TWDTW_classify()
sits_data_toCSV()
.kalmanfilter()
samples_MT_ndvi
sits_shp_toCSV()

Shouldn't they be named as below?

sits_TWDTW_classify() -> sits_twdtw_classify()
sits_data_toCSV()     -> sits_data_to_csv()
.kalmanfilter()       -> .kalman_filter()
samples_MT_ndvi       -> samples_mt_ndvi
sits_shp_toCSV()      -> sits_shp_to_csv()
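The proposed renaming is mostly mechanical, so it can be sketched with a small helper. The function to_snake_case() below is hypothetical and not part of sits. It inserts an underscore wherever a lowercase letter or digit is followed by an uppercase letter and then lowercases the result. It cannot split a name such as .kalmanfilter, which has no case boundary, so that case would still need a manual rename.

```r
# Hypothetical helper (not part of sits): convert a mixed-case name to snake_case
to_snake_case <- function(name) {
  # Insert "_" where a lowercase letter or digit meets an uppercase letter
  with_breaks <- gsub("([a-z0-9])([A-Z])", "\\1_\\2", name)
  tolower(with_breaks)
}

old_names <- c("sits_TWDTW_classify", "sits_data_toCSV",
               "samples_MT_ndvi", "sits_shp_toCSV")
new_names <- to_snake_case(old_names)
print(data.frame(old = old_names, new = new_names))
```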

Raster Projection Problem

Hi! Great package!

I am trying the command sits_TWDTW_classify to map cropland using a Sentinel-2 tile.

I found that this package fails to handle other projections, such as UTM. When I use the command sits_getdata to extract the time series of points from a tibble coverage, it always reports the error "Error in spTransform(y, px) : error in pj_transform: latitude or longitude exceeded limits".

I was wondering how these functions work, and whether there is any solution when the projection of the input data is UTM.

Best regards

Jie

Output of tests

The tests of the sits package print several messages and also create some files. Remove these files during the tests and suppress the messages to get a clean output.

improve sits_classify error when bands do not match

data(point_MT_6bands)
data(samples_MT_9classes)

samples.tb <- sits_select(samples_MT_9classes,
                          bands = c("ndvi"))
model <- sits_train(samples.tb, sits_svm())
point.tb <- sits_select(point_MT_6bands, bands = c("ndvi", "evi"))

sits_classify(point.tb, model)

shows

bands in the data do not match bands in the model

improve to

bands in the data (ndvi, evi) do not match bands in the model (ndvi)
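A message of this kind could be produced by a small check such as the one below. This is only a sketch: check_bands_match() is a hypothetical name, and how sits actually obtains the bands of the data and of the model is not shown here.

```r
# Hypothetical check (not the sits implementation): report both band sets on mismatch
check_bands_match <- function(data_bands, model_bands) {
  if (!setequal(data_bands, model_bands)) {
    stop(sprintf(
      "bands in the data (%s) do not match bands in the model (%s)",
      paste(data_bands, collapse = ", "),
      paste(model_bands, collapse = ", ")
    ), call. = FALSE)
  }
  invisible(TRUE)
}

# Capture the message for the case described in this issue
msg <- tryCatch(
  check_bands_match(c("ndvi", "evi"), c("ndvi")),
  error = function(e) conditionMessage(e)
)
print(msg)
```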

Create a README.Rmd

The current README.md file uses R markdown, and not GitHub markdown. Create a README.Rmd in order to allow syntax highlighting on the main page of the sits package, in the same way as dtwSat.

sits_plot samples with different labels and bands

sits_plot currently plots all combinations of labels and bands in separate plots. Investigate if it would be interesting to plot them in a single plot, using one axis for labels and another for bands.

data <- samples_MT_9classes %>%
    sits_select_bands(evi, ndvi) %>%
    dplyr::filter(label %in% c("Pasture", "Forest"))
par(mfcol = c(2, 2)) # sits ignores this
sits_plot(data)

Build fails

A fresh download from GitHub fails on "R CMD build sits" with the following error:

Quitting from lines 481-515 (sits.Rmd)
Error: processing vignette 'sits.Rmd' failed with diagnostics:
conditions failed for call 'tools::buildVignett .. ".", tangle = TRUE)':
* length(.) == nrow(dist1_DT)
Description: sits_classify_raster - number of classified pixels is different
from number of input pixels
Execution halted

sits_svm example

Good morning

the example code provided for sits::sits_svm runs but it doesn't work. It seems the classifier is unable to classify the given point. See the code below

data(samples_MT_ndvi)
data(point_ndvi)
# label before classification
(label_before <- point_ndvi$label)
# NoClass is not in the label set
(point_ndvi$label %in% unique(samples_MT_ndvi$label))
class.tb <- sits_classify (point_ndvi, samples_MT_ndvi, ml_method = sits_svm(kernel = "radial", cost = 10))
# label after classification. Neither the label on the point_ndvi nor the class.tb changed!
(label_after1 <- point_ndvi$label)
(label_after2 <- class.tb$label)

Coordinate EOCubes with SITS

There is a need to coordinate the EOCubes and SITS packages. SITS should rely on EOCubes to provide information about organized collections of satellite data (cubes). There should be a way for SITS to ask EOCubes to provide information about a requested coverage. This should work in a similar way to WTSS and Google Earth Engine:

  1. The user asks EOCube about what collections are available.
  2. The user defines a coverage as a partition of a collection in space and time.
  3. EOCubes supplies to SITS all information about this coverage, without the need for entering any list of files. This should be transparent to the user.

Call sits_coverage failed

Hello

I am getting an error when I use the function sits_coverage. I use the command raster_cov <- sits_coverage(service = "RASTER", name = "Sinop-crop", timeline = timeline, bands = c("ndvi"), scale_factors = 1e-04, files = filename).

The error is "Error: conditions failed for call 'sits_coverage(service = "RASTER", name = "Sinop-crop", timeline = timeline, ': * !(purrr::is_null(.))
Description: Not able to obtain scale factors for raster data"

However, I had already set scale_factors in the command above.
Any idea how to solve this?

Best,
Jie

Compilation fail

Hello @vwmaus!

I am getting an error when compiling sits from scratch:

  • installing to library ‘/.../R_libs’
  • installing source package ‘sits’ ...
    g++ -I/usr/share/R/include -DNDEBUG -fpic -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c RcppExports.cpp -o RcppExports.o
    ** libs
    RcppExports.cpp:4:18: fatal error: Rcpp.h: No such file or directory

Any idea how to solve this?

Missing RASTER_minimum_value

The property RASTER_minimum_value is missing values for the bands:

  • swir1
  • swir2

This throws errors during Landsat classifications.

sits_classify error when model has more bands than data

When the model has more bands than the data in sits_classify, it stops with an uninformative error.

data(point_MT_6bands)
data(samples_MT_9classes)

samples.tb <- sits_select(samples_MT_9classes,
                          bands = c("ndvi", "evi"))
model <- sits_train(samples.tb, sits_svm())
point.tb <- sits_select(point_MT_6bands, bands = c("ndvi"))

sits_classify(point.tb, model)

shows the following error:

Item 26 of j is 428 which is outside the column number range [1,ncol=414] 

It should be

bands in the data do not match bands in the model

in the same way as when the model has fewer bands than the data.

sits_coverage is ignoring some arguments

sits_coverage, when used with some services, ignores some arguments. One possible action is to compare the passed (and currently ignored) argument to the available information about the service and use it if it is allowed (a subset of the available values, a value within the available ranges, etc.), or stop with an error otherwise. Current output:

coverage_wtss <- 
    sits_coverage(service = "WTSS-INPE",
                  name    = "MOD13Q1",
                  bands = c("ndvi", "evi"))

coverage_wtss$bands # [1] "mir"  "blue" "nir"  "red"  "evi"  "ndvi"

coverage_wtss <- 
    sits_coverage(service = "WTSS-INPE",
                  name    = "MOD13Q1",
                  bands = c("ndvi", "eviee"))

coverage_wtss$bands # [1] "mir"  "blue" "nir"  "red"  "evi"  "ndvi"
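The suggested behaviour, to use the requested bands when they are a subset of the available ones and to stop otherwise, can be sketched as follows. The function select_coverage_bands() is hypothetical and only illustrates the check. The list of available bands is copied from the output above.

```r
# Hypothetical validation (not part of sits): honour the bands argument or fail
select_coverage_bands <- function(requested, available) {
  unknown <- setdiff(requested, available)
  if (length(unknown) > 0) {
    stop(sprintf("bands not available in the coverage: %s",
                 paste(unknown, collapse = ", ")), call. = FALSE)
  }
  # Keep only what the user asked for, in the order requested
  requested
}

available <- c("mir", "blue", "nir", "red", "evi", "ndvi")

# A valid request returns exactly the requested bands
ok_bands <- select_coverage_bands(c("ndvi", "evi"), available)

# An invalid request stops with an error naming the unknown band
bad_msg <- tryCatch(
  select_coverage_bands(c("ndvi", "eviee"), available),
  error = function(e) conditionMessage(e)
)
```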

sits_kohonen parameters

The first two parameters of "sits_kohonen.R" are a tibble and the time series extracted from the same tibble. Do we need two parameters or could we use only one?

distance in the output of sits_classify

In the output of sits_classify, there is an attribute called distance that comes with only zeros. Is it necessary? Could it be removed?

data(samples_MT_ndvi)
svm_model <- sits_train(samples_MT_ndvi, sits_svm(kernel = "radial", cost = 10))
data(point_ndvi)
class.tb <- sits_classify(point_ndvi, svm_model)

class.tb$predicted[[1]]$distance

sits_services() is not printing products nor coverages

The documentation of sits_services() says that it "Uses the configuration file to print information about the services, products and coverages." However, its output is currently only the services:

> sits_services()
Service - WTSS-INPE
------------------
Service - SATVEG
------------------

Fix it.

Bad error message

An error message produced by sits_select_bands wrongly references the sits_select function

library(sits)
data(samples_MT_9classes)
bs <- c(sits_bands(samples_MT_9classes), "fake_bands")
sits_select_bands(samples_MT_9classes, bs)

Error: conditions failed for call 'sits_select_bands(samples_MT_9classes, bs)':
* all(bands %in% sits_bands(.))
Description: sits_select: some band(s) not found in input data

Bug in sits_coverage function

sits_coverage() function is not working properly with WTSS coverages other than "MOD13Q1". Example:

sits_services()
# shows 'MOD13Q1' and 'MOD13Q1_M' (a new coverage for MOD13Q1 quality bands)
sits_coverage("WTSS-INPE", "MOD13Q1")
# it's working!
sits_coverage("WTSS-INPE", "MOD13Q1_M")
# fail!

Some possible related issues: #109

Review "sits table"

Review the term "sits table" in the documentation, as it does not exist any more. It should be replaced by "tibble" or "data.table" accordingly.

Remove some sits dependencies

Try to remove some dependencies of sits. Some candidates are:

(1) "mclust" - clustering functions of this package take too much time to process
(2) "signal" - implements the Savitzky-Golay filter, which could be implemented directly in sits
(3) "tidyr" - only needed to run fastApply in the past but it is no longer used
(4) "openxlsx"- create an excel spreadsheet with the confusion matrix
(5) "pryr" - find out the free memory available.

sits_plot output of sits_classify with more than one band

When calling sits_plot with the output of sits_classify using more than one band, it only draws the first band by default. Shouldn't it draw all of them? In the example below, it shows only ndvi.

data(point_MT_6bands)
data(samples_MT_9classes)

samples.tb <- sits_select(samples_MT_9classes,
                          bands = c("ndvi", "evi"))
model <- sits_train(samples.tb, sits_svm())
point.tb <- sits_select(point_MT_6bands, bands = c("ndvi", "evi"))

result <- sits_classify(point.tb, model)

sits_plot(result)

sits_conf_matrix error

sits_conf_matrix fails when there are fewer than two classes in a predicted & reference tibble

The code below throws the error:

data and reference should be factors with the same levels.

library(sits)
data(cerrado_2classes)
pred_ref.tb <-  sits_kfold_validate(cerrado_2classes, folds = 2)
conf.mx <- sits_conf_matrix(pred_ref.tb[1:10,])

R version

> R.Version()
$platform
[1] "x86_64-pc-linux-gnu"

$arch
[1] "x86_64"

$os
[1] "linux-gnu"

$system
[1] "x86_64, linux-gnu"

$status
[1] ""

$major
[1] "3"

$minor
[1] "4.4"

$year
[1] "2018"

$month
[1] "03"

$day
[1] "15"

$`svn rev`
[1] "74408"

$language
[1] "R"

$version.string
[1] "R version 3.4.4 (2018-03-15)"

$nickname
[1] "Someone to Lean On"
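
The error message quoted in this report asks for factors with the same levels. The following is a minimal base R sketch of one way to satisfy that requirement, under stated assumptions: the label vectors are hypothetical, and the code is neither the sits implementation nor a proposed patch. It builds both factors over the union of the observed labels before cross-tabulating them.

```r
# Hypothetical predicted and reference labels; the predictions happen to
# contain a single class, similar to the small subset in the report.
predicted <- c("Cerrado", "Cerrado", "Cerrado")
reference <- c("Cerrado", "Pasture", "Cerrado")

# Build both factors over the union of labels so their levels match.
all_levels <- sort(union(predicted, reference))
predicted_f <- factor(predicted, levels = all_levels)
reference_f <- factor(reference, levels = all_levels)

# Cross-tabulate; a class absent from one side shows up as zero counts.
conf_mx <- table(predicted = predicted_f, reference = reference_f)
print(conf_mx)
```

With a shared level set, a class that is missing from the predictions still gets a row in the table instead of causing a level mismatch.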

Overwrite results

The function sits::sits_classify_raster overwrites former results and provides no way to control this behavior. This could lead to undesired results, especially when classifying large data sets.

The overwrite parameter is hard-coded in sits:::.sits_classify_multicores
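
One possible shape for a user-controlled overwrite switch is sketched below in base R. The helper name safe_output_path and its overwrite argument are hypothetical and are not part of sits. The sketch only illustrates the guard itself: stop when the output file exists, unless the caller explicitly allows replacement.

```r
# Hypothetical helper, not part of sits: refuse to clobber existing output
# unless the caller opts in with overwrite = TRUE.
safe_output_path <- function(path, overwrite = FALSE) {
    if (file.exists(path) && !overwrite) {
        stop("output file already exists: ", path,
             " (set overwrite = TRUE to replace it)")
    }
    path
}

# Usage with a temporary file standing in for a classified raster.
out_file <- tempfile(fileext = ".tif")
file.create(out_file)

# The default refuses to reuse the existing path.
blocked <- inherits(try(safe_output_path(out_file), silent = TRUE), "try-error")

# An explicit opt-in returns the path unchanged.
allowed <- identical(safe_output_path(out_file, overwrite = TRUE), out_file)

unlink(out_file)
```

Defaulting to overwrite = FALSE makes the destructive case the one that has to be requested.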

documentation of sits_convnets

Fix the documentation of sits_convnets:

* checking Rd \usage sections ... WARNING
Undocumented arguments in documentation object 'sits_convnets'
  ‘filters’ ‘kernels’
Documented arguments not in \usage in documentation object 'sits_convnets':
  ‘filter’ ‘kernel’

Please do not fix this issue yet. I'll use it to test the continuous integration service.

Review sits_patterns

Review sits_patterns:

  1. The timeline argument is useless.
  2. The arguments start_date and end_date do not accept month-day strings. They must be a Date, or a string that can be converted to a Date. The code below does not work:
data(cerrado_2classes)
patterns.tb <- sits_patterns(cerrado_2classes, start_date = "09-13", end_date = "08-29")
  3. If start_date is given but end_date is not, start_date is overwritten, and vice versa:
        if (purrr::is_null(start_date) || purrr::is_null(end_date)) {
            start_date <- lubridate::as_date(utils::head(sample_dates, n = 1))
            end_date   <- lubridate::as_date(utils::tail(sample_dates, n = 1))
        }
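
The overwrite described in the last item comes from the `||` test, which resets both dates when either one is missing. A minimal base R sketch of an alternative is shown below. The helper resolve_date_range is hypothetical and not part of sits, and it assumes that sample_dates can be converted with as.Date. Each bound is filled in only when it is missing.

```r
# Hypothetical helper, not part of sits: default each bound independently.
resolve_date_range <- function(sample_dates, start_date = NULL, end_date = NULL) {
    sample_dates <- sort(as.Date(sample_dates))
    if (is.null(start_date)) {
        # Only the missing start falls back to the first sample date.
        start_date <- sample_dates[1]
    }
    if (is.null(end_date)) {
        # Only the missing end falls back to the last sample date.
        end_date <- sample_dates[length(sample_dates)]
    }
    list(start_date = as.Date(start_date), end_date = as.Date(end_date))
}

# Usage: the user-supplied start_date survives; end_date is derived.
dates <- c("2000-09-13", "2000-09-29", "2001-08-29")
range_1 <- resolve_date_range(dates, start_date = "2000-09-29")
print(range_1)
```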

Review sits_classify_raster memory and multicores

By default, sits_classify_raster uses 4GB of memory and the maximum number of cores minus one. Review these values, as the user might not have 4GB of memory available and might not want to use almost all processors. For comparison, sits_classify has multicores = 1 as its default.

Missing configuration values

config.yaml is missing some values. The property "RASTER_missing_value" has no entry for the LANDSAT blue band, nor for other indexes such as savi and msavi.

review sits_select_bands()

What about reviewing sits_select_bands() so that it works the same way as dplyr::select()? Currently it works as follows:

sits_select_bands(samples_MT_9classes,  c("ndvi", "evi"))

The code above could be updated to:

sits_select_bands(samples_MT_9classes, ndvi, evi)
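
A base R sketch of how unquoted band names could be captured is shown below. The helper capture_bands is hypothetical, not the sits API, and it relies on substitute() rather than on dplyr's own tidy-selection machinery, so it covers only bare names.

```r
# Hypothetical helper, not the sits API: turn unquoted arguments into strings.
capture_bands <- function(...) {
    # substitute() returns the unevaluated call list(ndvi, evi);
    # dropping its first element leaves only the argument expressions.
    exprs <- as.list(substitute(list(...)))[-1]
    vapply(exprs, deparse, character(1))
}

bands <- capture_bands(ndvi, evi)
print(bands)
```

The resulting character vector could then be passed to the existing string-based selection, which would keep the current behavior available.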

'sits tibble' class

Good morning,

The datasets in the package do not all have the same class. Some are of class "sits" and others are of class "sits_tibble". Is there any difference between them?

Kind regards,

library(sits)
data(point_MT_6bands)
data(samples_MT_ndvi)
data(point_ndvi)
data(prodes_226_064)
data(cerrado_2classes)

# class sits
class(sits::sits_tibble()) # template function!
class(point_MT_6bands)
class(samples_MT_9classes)
# class sits_tibble
class(samples_MT_ndvi)
class(cerrado_2classes)
class(point_ndvi)
class(prodes_226_064)
