hannameyer / cast

Developer Version of the R package CAST: Caret Applications for Spatio-Temporal models

Home Page: https://hannameyer.github.io/CAST/

R 100.00%

autocorrelation caret feature-selection machine-learning overfitting predictive-modeling spatial spatio-temporal variable-selection

cast's Issues

Supplying manual variable weights to aoa doesn't work?

Thank you for the great package!

I am attempting to run aoa without specifying a model (just newdata and weights). However, I'm getting the same results whether I supply a table of weights or not. Reading through the code, it seems as though if weight cannot be extracted from a trained model early on, it gets assigned as an object of class error, over-writing any table of weights initially supplied to the function. Do I have this right?
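The overwrite pattern described in this report can be illustrated with a minimal base-R sketch. It is not CAST's actual code: `extract_weights()`, `resolve_overwrite()` and `resolve_guarded()` are hypothetical names, and the "model" is assumed to be a plain list. The first resolver reproduces the suspected behaviour (the `tryCatch()` result replaces whatever the caller supplied); the second only falls back to the model when no weight table was given.

```r
# Hypothetical stand-in for "extract variable importance from a trained model".
extract_weights <- function(model) {
  if (is.null(model)) stop("no model supplied")
  model$importance
}

# Suspected pattern: the tryCatch() result is assigned to `weight`,
# so a user-supplied table is lost when no model is available.
resolve_overwrite <- function(model = NULL, weight = NULL) {
  weight <- tryCatch(extract_weights(model), error = function(e) e)
  if (inherits(weight, "error")) NULL else weight
}

# Guarded variant: a user-supplied table always wins.
resolve_guarded <- function(model = NULL, weight = NULL) {
  if (!is.null(weight)) return(weight)
  w <- tryCatch(extract_weights(model), error = function(e) e)
  if (inherits(w, "error")) NULL else w
}

user_w <- data.frame(x = 0.7, y = 0.3)
resolve_overwrite(weight = user_w)  # the table is dropped
resolve_guarded(weight = user_w)    # the table is kept
```

If the package follows the first pattern, supplying weights without a model would give the same result as supplying none, which matches the observation above.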

an alternative distance metric, categorical variable handling, and optimization ideas

First of all, thank you for the excellent package and companion articles.

While looking over the aoa code it occurred to me that some of the complexity associated with handling categorical variables can be simplified by switching to a different distance metric. Gower's generalized distance metric is ideal because it can integrate mixtures of ratio, nominal, and ordinal data types. Also, the metric automatically includes scaling / centering. There are a couple of implementations:

It would appear that the knnx.dist function does all of the heavy lifting in aoa.

A quick benchmark of a couple candidate methods.

library(gower)
library(cluster)
library(FNN)
library(microbenchmark)

set.seed(10101)
n <- 1000
a <- rnorm(n = n, mean = 0, sd = 2)
x <- rnorm(n = n, mean = 0, sd = 2)
y <- rnorm(n = n, mean = 0, sd = 2)

z <- data.frame(x, y, a)

microbenchmark(
  gower = gower_dist(z[1:10, ], z),
  knn = knnx.dist(data = z, query = z[1:10, ], k = 1),
  daisy = daisy(z, metric = 'gower')
)

The interface and resulting objects aren't directly compatible, but it does seem like gower::gower_dist() is a reasonable candidate in terms of speed. The main reason to consider cluster::daisy is that it can accommodate all variable types, while gower::gower_dist() does not yet differentiate between nominal / ordinal factors.

Unit: microseconds
  expr     min       lq       mean   median      uq      max neval cld
 gower   395.7   444.70    523.737   497.35   559.0    874.3   100  a 
   knn   772.6   794.05    892.615   842.70   925.2   1382.7   100  a 
 daisy 56398.0 73496.70 100253.478 78571.80 88727.8 276262.1   100   b
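To make the idea concrete, here is a small base-R sketch of a Gower-style distance between two rows of a mixed data frame. It is a simplified illustration, not the implementation used by gower or cluster: `gower_pair()` and the `toy` data are made up for this example, numeric columns are scaled by their range, and factors are treated as nominal (ordinal handling is left out).

```r
# Gower-style distance between rows i and j of a data frame.
# Numeric columns: absolute difference divided by the column range.
# Factor/character columns: 0 if equal, 1 otherwise.
gower_pair <- function(df, i, j) {
  per_column <- vapply(names(df), function(nm) {
    col <- df[[nm]]
    if (is.numeric(col)) {
      rng <- diff(range(col))
      if (rng == 0) 0 else abs(col[i] - col[j]) / rng
    } else {
      as.numeric(col[i] != col[j])
    }
  }, numeric(1))
  mean(per_column)
}

toy <- data.frame(
  elev = c(100, 200, 300),
  soil = factor(c("clay", "sand", "clay"))
)

gower_pair(toy, 1, 2)  # elev contributes 0.5, soil contributes 1
gower_pair(toy, 1, 3)  # elev contributes 1, soil contributes 0
```

Because every column contributes a value between 0 and 1, no separate centering or scaling step is needed before mixing variable types.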

Profiling data for aoa run in a single thread:

This was performed with a model based on 1,030 observations as applied to a raster stack
dimensions : 3628, 2351, 8529428, 18 (nrow, ncol, ncell, nlayers)

I'll follow up with a small example dataset that contains nominal and ordinal variables.

Parallel implementation of `ffs()`

Hi there,

More of an enhancement suggestion, but also a question: any advice on parallelizing ffs()? I'm using ranger to create species distribution models for many plant species and have ~70 covariates, resulting in ffs reporting over 4000 individual models being trained. I have ~20 cores at my disposal, and I think I could see major speed improvements with a multicore implementation similar to aoa.

Thanks,

Rob
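As a rough aid for judging the workload, the upper bound on the number of models can be estimated in base R. This is a sketch under an assumption: the first round trains every combination of `min_var` predictors, and each later round tests every remaining predictor once. `max_ffs_models()` is a hypothetical helper, not a CAST function. The assumption agrees with a log further down in this tracker, where a run with 12 predictors reports 120 models remaining after the first one.

```r
# Upper bound on the number of models trained by a forward selection
# that starts from all combinations of `min_var` predictors.
max_ffs_models <- function(n, min_var = 2) {
  choose(n, min_var) + sum(seq_len(n - min_var))
}

max_ffs_models(12)  # 66 pairs + 55 later candidates
max_ffs_models(70)  # 2415 pairs + 2346 later candidates
```

The models within one round are independent of each other, which is why a per-round parallel loop is a natural fit for this kind of search.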

aoa() and calibrate_aoa() with parallelization

Hi,

I have been using the aoa() function with parallelization as follows:

```
library(parallel)
library(doParallel)

cl <- makePSOCKcluster(detectCores()-2)
registerDoParallel(cl)
AOA <- aoa(df, model_sp, cl)
stopCluster(cl)
```

When I check the task manager processes and performance tabs, it doesn't seem like the computer is using its cores. CPU usage is quite low (below 20%), and it takes quite some time to finish.
(Dataset: 70,000 rows, 17 columns, 10-fold CV with 5 repeats.)
Any idea what might be happening? :)

In addition, I tried using the calibrate_aoa() function with multiCV=T. It took days and ended by throwing an error. I'll run it again when I have some time and post the message.

Threshold calculated by `trainDI()` can become larger than the maximum training distance

Some sets of training distances produce thresholds that are larger than the maximum training distance.

I expected based on Meyer & Pebesma 2021 that the threshold would lie within the range of training distances:

"The outlier-removed maximum DI of the training data is the one used as threshold for the AOA (boxplot in Figure 2b) where outliers are defined as values greater than the upper whisker (i.e. larger than the 75-percentile plus 1.5 times the IQR of the DI values of the cross-validated training data)."

Issue #46 and commit c92a3f2 changed how this threshold is calculated.

Here are two examples where the updated calculation produces a threshold value that is larger than the maximum training distance:

check_threshold <- function(di) {
  cat("max(di) = ")
  cat(
    max(di), 
    fill = TRUE
  )
  
  # threshold calculated by `trainDI()` with code from `CAST` <= `v0.7.0`
  cat("threshold <= v0.7.0: ")
  cat(
    max(di[!(di > (stats::quantile(di, 0.75) + 1.5 * stats::IQR(di)))])
  )
  cat("  ")
  cat(
    grDevices::boxplot.stats(di)$stats[5], 
    fill = TRUE
  )
  
  # threshold calculated by `trainDI()` with code from `CAST` >= `v0.7.1`
  #   * issue github.com/HannaMeyer/CAST/issues/46
  #   * commit https://github.com/HannaMeyer/CAST/commit/c92a3f2923545268db86e5bc1da6d8966d797d94
  cat("threshold >= v0.7.1: ")
  cat(
    stats::quantile(di, 0.75, na.rm = TRUE) + 1.5 * stats::IQR(di, na.rm = TRUE),
    fill = TRUE
  )
}

set.seed(17)
check_threshold(rpois(100, 2))
#> max(di) = 5
#> threshold <= v0.7.0: 5  5
#> threshold >= v0.7.1: 6

set.seed(17)
check_threshold(rnorm(100))
#> max(di) = 2.442327
#> threshold <= v0.7.0: 2.442327  2.442327
#> threshold >= v0.7.1: 2.560418

Created on 2023-06-30 with reprex v2.0.2

Meyer, H., & Pebesma, E. (2021). Predicting into unknown space? Estimating the area of applicability of spatial prediction models. Methods in Ecology and Evolution, 12(9), 1620–1633. https://doi.org/10.1111/2041-210X.13650

Incompatibility with terra SpatRaster objects

Hello @HannaMeyer,

In a current use case I came across the issue that terra's SpatRaster objects are not supported by CAST.
Of course, users can coerce their SpatRaster objects to RasterStack before using CAST functionality. However, in a recent PR #15 I included that functionality within the aoa and calibrate_aoa functions so that SpatRasters no longer throw an error and the output is also a SpatRaster object. It's a rather "dirty" solution, but I was wondering whether a transition to terra is something you are interested in for the package. If so, should Raster objects still be supported, since existing code might rely on the raster package? In that regard, if a transition to terra is considered valuable for CAST, I guess it would make sense to "flip" the dependencies so that the general code base works on SpatRasters and Raster objects are coerced beforehand.
Let me know if that is something you are interested in, and I could put some work into the mentioned PR.

ffs with recipe

When I try to use a recipe with ffs, it doesn't recognize the "response" from the recipe object. Is there a workaround for this? Thanks.

rec_test <-
  recipe(stream ~ ., data = testing) %>%
  update_role(sb12, new_role = "performance var") %>%
  step_center(all_predictors()) %>%
  step_scale(all_predictors()) %>%
  step_pca(contains("_30agg"), prefix = "pca_B", threshold = 0.9)

sb12 <- CreateSpacetimeFolds(train, spacevar = "sb12", k = 10, seed = 123)
ctrltest <- trainControl(method = "repeatedcv",
                         repeats = 5,
                         allowParallel = TRUE,
                         returnResamp = "all",
                         verbose = FALSE,
                         index = sb12$index)

set.seed(1234)
ffstest <- ffs(rec_test,
               data = testing,
               metric = "Kappa",
               method = "rf",
               trControl = ctrltest)

Error in ffs(rec_test, data = testing, metric = "Kappa", method = "rf", :
argument "response" is missing, with no default

knndm function

Hi!
Thank you for creating this package; it has been a pleasure using it so far.
I am using the knndm function to see if I can better manage spatial autocorrelation in my RF model, but I am having some issues at the global_validation step.
I get an error:
"Error in global_validation(model4) : Global performance could not be estimated because predictions were not saved. Train model with savePredictions='final'."

I figured this had to do with how the data is fed into the knndm function; as this is a classification model, I don't have any reference. The parameters used in the 'trainControl' function should be okay (including the line savePredictions='final'), but I'm confused.

Any assistance would be greatly appreciated I'm still a beginner!
Thank you
Emme

knndm_folds4 <- knndm(pts4, modeldomain=studyArea, k = 5)
knndm_folds4
plot(knndm_folds4)

ctrl4 <- trainControl(method="cv",
index=knndm_folds4$indx_train,
savePredictions='final')

model4 <- train(abpres~., data = Train4[,-c(2:3)], method="rf", trcontrol = ctrl4)
model4
lengths(model4$pred)
global_validation(model4)

global_validation(model4)
Error in global_validation(model4) :
Global performance could not be estimated because predictions were not saved.
Train model with savePredictions='final'

(Coordinate data is projected)
head(pts4)
  abpres BtmSalinity_fall_max BtmStress_fall_max BtmStress_fall_min BtmTemp_fall_max BtmTemp_fall_min
1    abs             35.04016        0.084448554       0.0318813547         5.066860        4.3416319
3   pres             31.07378        0.003851054       0.0003281421        11.879930        4.8315330
5   pres             32.51310        0.005012250       0.0002675029         5.575860        0.2846721
6    abs             34.97558        0.047943920       0.0008219921         5.541814        3.9962437
7   pres             34.75494        0.058296647       0.0227264166         4.417707        2.7072399
8   pres             33.39942        0.019661102       0.0067749284         2.136045       -0.4165318
  MLD_fall_max MLD_fall_min SurfaceTemp_fall_max geometry
1    195.44728    12.422209             9.134846 POINT (926113.1 1210112)
3     39.75787     8.601914            18.005045 POINT (-144911.3 -436612)
5     33.26155     4.898076            13.206921 POINT (491246.7 -249519.5)
6     29.63660     4.868274            22.026590 POINT (-128115.7 -818321.2)
7     33.98904     5.724236            11.875775 POINT (841782.8 -177986.8)
8     31.39414     7.639641            16.741175 POINT (483544 -490805.2)
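One possible cause, offered as an assumption rather than a confirmed diagnosis: the call above passes `trcontrol = ctrl4`, while the caret argument is spelled `trControl`. R argument names are case-sensitive, and arguments that come after `...` in a function signature must match exactly, so a misspelled name is silently collected by `...` and the default control (without saved predictions) is used. The base-R sketch below shows the mechanism with `toy_train()`, a hypothetical stand-in that only mimics the shape of such a signature.

```r
# Hypothetical stand-in for a training function whose control argument
# comes after `...`, so it must be matched by its exact name.
toy_train <- function(formula, data, ...,
                      trControl = list(savePredictions = "none")) {
  list(
    savePredictions = trControl$savePredictions,
    passed_on = names(list(...))  # arguments swallowed by `...`
  )
}

ctrl <- list(savePredictions = "final")

wrong <- toy_train(y ~ ., data = NULL, trcontrol = ctrl)  # lower-case "c"
right <- toy_train(y ~ ., data = NULL, trControl = ctrl)

wrong$savePredictions  # default is used, control object was ignored
wrong$passed_on        # the misspelled argument ended up in `...`
right$savePredictions  # control object is respected
```

If this is what happened here, changing `trcontrol` to `trControl` in the `train()` call should make the saved predictions available to `global_validation()`.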

{mlr3spatiotempcv} plotting functions for `CAST::CreateSpacetimeFolds()`

Hi @HannaMeyer

here are some examples of how you/users could use the autoplot() generics in {mlr3spatiotempcv} to visualize partitions created via {CAST}.

For spacetime, one can choose whether or not to show the omitted observations (show_omitted = TRUE/FALSE).

All Cstf functions also have a 2D plotting generic, but when the dataset is spatiotemporal, this option is limited due to overplotting in a 2D space.

Available from {mlr3spatiotempcv} >= 0.2.1.9003 (important bugfix for spacevar + timevar plotting in 0.3.0.9005).

library(mlr3)
library(mlr3spatiotempcv)

data <- cookfarm_sample

# tweak Date variable for plotting
data$Date <- rep(c(
  "2020-01-01", "2020-02-01", "2020-03-01", "2020-04-01",
  "2020-05-01"
), times = 1, each = 100)
b <- mlr3::as_data_backend(data)
b$hash <- "_mlr3_tasks_cookfarm_"
task <- TaskRegrST$new(
  id = "cookfarm", b, target = "PHIHOX",
  extra_args = list(
    coordinate_names = c("x", "y"), coords_as_features = FALSE,
    crs = 26911
  )
)

# time out --------------------------------------------------------------------
rsp <- rsmp("sptcv_cstf", folds = 5, time_var = "Date")
set.seed(42)
rsp$instantiate(task)

# without omitted, we have no values on the y-axis and the plot is not shown
autoplot(rsp, task, fold_id = 5, show_omitted = TRUE, plot3D = TRUE)


# space out -------------------------------------------------------------------

rsp <- rsmp("sptcv_cstf", folds = 5, space_var = "SOURCEID")
set.seed(42)
rsp$instantiate(task)

# without omitted, we have no values on the y-axis and the plot is not shown
autoplot(rsp, task, fold_id = 5, show_omitted = TRUE, plot3D = TRUE)

# spacetime out --------------------------------------------------------------------

rsp <- rsmp("sptcv_cstf", folds = 5, time_var = "Date", space_var = "SOURCEID")
set.seed(42)
rsp$instantiate(task)

# without omitted, we have no values on the y-axis and the plot is not shown
autoplot(rsp, task, fold_id = 5, show_omitted = TRUE, plot3D = TRUE)

Created on 2021-04-01 by the reprex package (v1.0.0)

[Figures: Time only, Space only, Spacetime]

Tutorial 2 - scam package not called

Sorry, me again.
When running the calibrate_aoa function I'm getting the error
Error: Package "scam" needed for this function to work. Please install it.

Installing the package resolves it.
Cheers

Error when running trainDI / aoa example

Hi Hanna,
when running the example code from the function reference of aoa() or trainDI() I get the following error from the trainDI() call:

> #...then calculate the DI of the trained model:
> DI = trainDI(model=model)
negative weights were set to 0
Error in get.knnx(data, query, k, algorithm) : 
 DLL requires the use of native symbols

The example uses the default arguments (method = "L2"). For method = "MD" there is no error.

I haven't been able to figure out a workaround so far. The only lead I found was this thread.
Thanks for your help!

Session info

R version 4.3.0 (2023-04-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.2 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0 
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

time zone: Europe/Vaduz
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] latticeExtra_0.6-29 viridis_0.6.2       viridisLite_0.4.0   caret_6.0-90        lattice_0.21-8      ggplot2_3.4.2      
 [7] terra_1.5-21        CAST_0.8.1          dplyr_1.1.2         sf_1.0-12          

loaded via a namespace (and not attached):
 [1] gtable_0.3.0         recipes_1.0.6        vctrs_0.6.2          tools_4.3.0          generics_0.1.2      
 [6] stats4_4.3.0         parallel_4.3.0       tibble_3.2.1         proxy_0.4-26         fansi_1.0.2         
[11] pkgconfig_2.0.3      ModelMetrics_1.2.2.2 Matrix_1.5-4.1       KernSmooth_2.23-20   data.table_1.14.2   
[16] RColorBrewer_1.1-2   lifecycle_1.0.3      FNN_1.1.3            compiler_4.3.0       stringr_1.5.0       
[21] munsell_0.5.0        codetools_0.2-19     class_7.3-21         prodlim_2019.11.13   pillar_1.9.0        
[26] MASS_7.3-59          classInt_0.4-3       gower_1.0.0          iterators_1.0.14     rpart_4.1.19        
[31] foreach_1.5.2        nlme_3.1-162         parallelly_1.30.0    lava_1.6.10          tidyselect_1.2.0    
[36] digest_0.6.29        stringi_1.7.6        future_1.24.0        reshape2_1.4.4       purrr_1.0.1         
[41] listenv_0.8.0        splines_4.3.0        grid_4.3.0           colorspace_2.0-2     cli_3.6.1           
[46] magrittr_2.0.2       randomForest_4.7-1.1 survival_3.5-3       utf8_1.2.2           future.apply_1.8.1  
[51] e1071_1.7-9          withr_2.5.0          scales_1.2.1         lubridate_1.8.0      jpeg_0.1-9          
[56] globals_0.14.0       nnet_7.3-18          gridExtra_2.3        timeDate_3043.102    png_0.1-7           
[61] hardhat_1.3.0        rlang_1.1.0          Rcpp_1.0.10          glue_1.6.2           DBI_1.1.2           
[66] pROC_1.18.0          ipred_0.9-12         rstudioapi_0.13      R6_2.5.1             plyr_1.8.6          
[71] units_0.8-0

Tutorial 2 - AOA - csample function not defined

Dear @HannaMeyer and colleagues,
thanks for the nice package and clear tutorials.
Please be aware that the csample function used in Tutorial 2 is not defined there. I found it in Tutorial 3, but you might want to define it beforehand.

Allow tibbles for x in CreateSpacetimeFolds

CreateSpacetimeFolds currently returns an error when passing a tibble for x.

TALK: What about backward, stepwise and other approaches?

I know this might not be the best place to start a discussion, but this way others might be able to take part.

As a scientist, I'm torn. An automatic selection of variables is driven by statistical rather than theoretical considerations. Of course, the whole thing depends on the research question and the research approach. Fortunately, that's a new field to be explored :-)

Are you planning to integrate backward, stepwise and other approaches of selection?

no package called ‘twosamples’

Hello CAST team,
thank you for the nice package!
A dependency on "twosamples" package is not installed during initial installation:

indices_knndm <- knndm(splotdata,predictors_sp,k=3)

error: "Error in loadNamespace(x) : there is no package called ‘twosamples’ "
Sincerely,
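Until such packages are installed automatically, a quick check for missing optional dependencies can be run before calling the affected functions. This is a base-R sketch; `missing_packages()` is a hypothetical helper, and the package list only contains the two names that appear in reports in this tracker ("twosamples" and "scam").

```r
# Return the subset of `pkgs` that cannot be loaded in this R library.
missing_packages <- function(pkgs) {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!available]
}

to_install <- missing_packages(c("twosamples", "scam"))
if (length(to_install) > 0) {
  message("Missing optional packages: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to install them
}
```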

Inf/NA for large factorial

Dear Hanna,
if n in factorial(n) is larger than 170, the result is NaN or Inf (R version 4.0.1). I wonder about this because it did not happen before (before = 18 months ago :-). Did you add this line recently? I ask because for the rainfall paper there were more than 400 variables. While there are ways to compute the factorial for larger n with big integers (gmp::factorialZ), this does not solve the problem because the matrix definition won't accept the result.

Is perf_all the matrix you mentioned, which is only relevant if you want to look up the ffs/error function? If so, one could make its computation optional.

Cheers
Thomas

CAST/R/ffs.R

Line 142 in f7a2824

nrow=factorial(n) / (factorial(n-minVar)* factorial(minVar))+
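The overflow can be reproduced in base R, and the binomial coefficient can be computed without large intermediate values by using `choose()`. This is a sketch of the arithmetic only; whether it fits the rest of the expression on that line (which is cut off here) is not checked. The values of `n` and `min_var` are example inputs.

```r
n <- 400       # number of predictors (example value)
min_var <- 2   # size of the initial variable combinations

# The factorial form overflows: factorial(n) is Inf for n > 170,
# and Inf / Inf evaluates to NaN.
via_factorial <- suppressWarnings(
  factorial(n) / (factorial(n - min_var) * factorial(min_var))
)

# choose() computes the same binomial coefficient without overflow.
via_choose <- choose(n, min_var)

via_factorial  # NaN
via_choose     # 79800
```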

TuneGrid causes error with ranger method in ffs

Hi Hanna,

Thank you a lot for the great package!
I noticed that when using the package with the ranger method, it is not possible to use a tuneGrid dataframe as with caret. It is just possible to set the tuneLength argument (luckily).

When using a tune grid, R throws the following error:

[1] "model using NDVI,soil_moist will be trained now..."
Something is wrong; all the RMSE metric values are missing:
      RMSE        Rsquared        MAE
 Min.   : NA   Min.   : NA   Min.   : NA
 1st Qu.: NA   1st Qu.: NA   1st Qu.: NA
 Median : NA   Median : NA   Median : NA
 Mean   :NaN   Mean   :NaN   Mean   :NaN
 3rd Qu.: NA   3rd Qu.: NA   3rd Qu.: NA
 Max.   : NA   Max.   : NA   Max.   : NA
 NA's   :1     NA's   :1     NA's   :1
Error: Stopping
In addition: Warning message:
In nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :

I tried adding the metric argument metric = c("RMSE"). But it didn't work.

Here is my code:

train_ffs_model <- function(data){
  #tuneGrid_ffs <- expand.grid(mtry = 3, splitrule = "variance", min.node.size = 5)
  predictors <- setdiff(names(data), c("x","y","region","Lstmean","geometry"))
  folds <- CAST::CreateSpacetimeFolds(data,spacevar = "region",k=7)
  model <- CAST::ffs(data[,predictors],data$Lstmean,
                     method="ranger",
                     importance = "permutation",
                     tuneLength = 1,
                     #tuneGrid = tuneGrid_ffs,
                     trControl=trainControl(method="cv",number=10,
                                            index = folds$index,indexOut = folds$indexOut))
  return(model)
  
}

library(parallel)
library(doParallel)
cl <- makePSOCKcluster(4)
registerDoParallel(cl)

ffs_models_03_11 <- purrr::map(.x = data_years_03_11, .f = train_ffs_model)

stopCluster(cl)

I ended up commenting out the tuneGrid and relying only on tuneLength. However, I would like to have more control over the hyperparameters. And since the data frames are quite big (80k rows), ranger is much faster than RF.
Or am I doing something wrong?

Thank you!
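One plausible explanation, stated as an assumption and not a confirmed diagnosis: the failing model uses only two predictors (NDVI and soil_moist), while the commented-out grid asks for mtry = 3. A random forest cannot sample more variables per split than there are predictors, so every resample would fail and all metrics come back as NA. The base-R sketch below caps `mtry` at the number of predictors; `clamp_mtry()` is a hypothetical helper, not part of CAST or caret.

```r
# Cap mtry at the number of available predictors and drop duplicate rows.
clamp_mtry <- function(grid, n_predictors) {
  grid$mtry <- pmin(grid$mtry, n_predictors)
  unique(grid)
}

grid <- expand.grid(
  mtry = c(2, 3, 5),
  splitrule = "variance",
  min.node.size = 5
)

clamp_mtry(grid, n_predictors = 2)   # collapses to a single row with mtry = 2
clamp_mtry(grid, n_predictors = 12)  # grid is unchanged
```

A simpler check of the same idea is to run the original code with a grid that only contains mtry = 2 and see whether the two-variable models then train.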

request: calculation of variable importance of an FFS-model

When trying to calculate the variable importance of my ffsmodel using varImp(ffsmodel_LLO), it breaks with:

Error in varImp[, "%IncMSE"] : subscript out of bounds

According to https://stackoverflow.com/a/24043890, the importance hasn't been calculated in ffs(). Could you add it in future?

What if remaining variables (VR) interact?

With reference to Algorithm 1 in Meyer et al., 2018.
If I have 4 predictors, Algorithm 1 evaluates all possible pairs to find the best one.
The algorithm then iteratively adds the remaining variables (i = 3, 4).
The algorithm stops if mean(error of model_i) > mean(error of model_best).
This means that if the error increases with 3 variables, the algorithm stops and doesn't check what happens with 4.
What if there is a pair of remaining variables that interact and improve model performance? In other words,
what if performance gets worse with 3 variables but improves with 4? If the algorithm stops at 3, we will never know.
Is that right, or am I missing something?

Thanks
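The scenario in this question can be made concrete with a toy example in base R. Everything here is hypothetical: `toy_error()` is a made-up error function in which variables `c` and `d` each hurt on their own but help when used together, and the loop is a simplified greedy search with the stopping rule described above, not the ffs() implementation.

```r
vars <- c("a", "b", "c", "d")

# Made-up error: a and b help; c or d alone hurt slightly;
# c and d together interact and reduce the error.
toy_error <- function(s) {
  10 - 4 * ("a" %in% s) - 3 * ("b" %in% s) +
    0.5 * ("c" %in% s) + 0.5 * ("d" %in% s) -
    2 * all(c("c", "d") %in% s)
}

# Round 1: evaluate all pairs and keep the best one.
pairs <- combn(vars, 2, simplify = FALSE)
pair_err <- vapply(pairs, toy_error, numeric(1))
selected <- pairs[[which.min(pair_err)]]
best_err <- min(pair_err)

# Later rounds: add one variable at a time, stop when nothing improves.
repeat {
  remaining <- setdiff(vars, selected)
  if (length(remaining) == 0) break
  cand_err <- vapply(remaining,
                     function(v) toy_error(c(selected, v)), numeric(1))
  if (min(cand_err) >= best_err) break
  selected <- c(selected, remaining[which.min(cand_err)])
  best_err <- min(cand_err)
}

selected         # "a" "b": the search stops after the best pair
best_err         # 3
toy_error(vars)  # 2: the full set would have been better
```

So under this kind of interaction a greedy forward search does miss the better four-variable model, which is the usual trade-off against an exhaustive search over all subsets.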

Tutorial 2 - RMSE does not exclude na

Hi again
The line
###for the spatial CV: RMSE(AOA_spatial$AOA)==1],values)(AOA_spatial$AOA)==1])
fails in my R session.
The same happens for the RMSE of the random CV.
You probably want to add the argument na.rm=TRUE there.
Thanks!
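The effect of missing values on the RMSE can be shown with a small base-R sketch. `rmse()` is a hypothetical helper written for this illustration; it is not the function used in the tutorial, and the vectors are made-up values standing in for raster cells where some predictions are NA.

```r
# Root mean squared error that can skip pairs with missing values.
rmse <- function(pred, obs, na.rm = FALSE) {
  sqrt(mean((pred - obs)^2, na.rm = na.rm))
}

pred <- c(1, 2, NA)  # e.g. cells without a prediction
obs  <- c(1, 4, 3)

rmse(pred, obs)                # NA: one missing value propagates
rmse(pred, obs, na.rm = TRUE)  # sqrt(2): computed from the complete pairs
```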

Incorrect hyperparameter for ranger.

Thanks for this package, it is very helpful. One point to note:

It looks like the minimum value reset in line 183 of ffs should be .mtry instead of mtry.

When running the example from the ffs documentation with method = "ranger", it is clear that the instances with mtry > number of predictors aren't skipped in the same way as when the tuneLength argument is set. See the example below.

library(CAST) 
#> Warning: package 'CAST' was built under R version 4.1.3
library(doParallel)
#> Warning: package 'doParallel' was built under R version 4.1.3
#> Loading required package: foreach
#> Warning: package 'foreach' was built under R version 4.1.3
#> Loading required package: iterators
#> Warning: package 'iterators' was built under R version 4.1.3
#> Loading required package: parallel
library(lubridate)
#> Warning: package 'lubridate' was built under R version 4.1.2
#> 
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#> 
#>     date, intersect, setdiff, union
library(ggplot2)
#> Warning: package 'ggplot2' was built under R version 4.1.2
library(caret)
#> Warning: package 'caret' was built under R version 4.1.3
#> Loading required package: lattice
#> Warning: package 'lattice' was built under R version 4.1.2
library(tictoc)
#> Warning: package 'tictoc' was built under R version 4.1.1
 cl <- makeCluster(3)
 registerDoParallel(cl)

 #load and prepare dataset:
 dat <- get(load(system.file("extdata","Cookfarm.RData",package="CAST")))
 trainDat <- dat[dat$altitude==-0.3&year(dat$Date)==2012&week(dat$Date)%in%c(13:14),]

 #visualize dataset:
 ggplot(data = trainDat, aes(x=Date, y=VW)) + geom_line(aes(colour=SOURCEID))

 #create folds for Leave Location Out Cross Validation:
 set.seed(10)
 indices <- CreateSpacetimeFolds(trainDat,spacevar = "SOURCEID",k=3)
 ctrl <- trainControl(method="cv",index = indices$index)

 #define potential predictors:
 predictors <- c("DEM","TWI","BLD","Precip_cum","cday","MaxT_wrcc",
 "Precip_wrcc","NDRE.M","Bt","MinT_wrcc","Northing","Easting")
 
 tuneGrid<- data.frame(
   .mtry = c(2,5,7),
   .splitrule = "variance",
   .min.node.size = c(5))

 tic()
 
 #run ffs model with Leave Location out CV
 set.seed(10)
 ffsmodel <- ffs(trainDat[,predictors],trainDat$VW,method="ranger",
 tuneGrid = tuneGrid, num.trees = 100,trControl=ctrl)
#> [1] "model using DEM,TWI will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.
#> Warning in train.default(predictors[, minGrid[i, ]], response, method =
#> method, : missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 120"
#> [1] "model using DEM,BLD will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 119"
#> [1] "model using DEM,Precip_cum will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 118"
#> [1] "model using DEM,cday will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 117"
#> [1] "model using DEM,MaxT_wrcc will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 116"
#> [1] "model using DEM,Precip_wrcc will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 115"
#> [1] "model using DEM,NDRE.M will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 114"
#> [1] "model using DEM,Bt will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 113"
#> [1] "model using DEM,MinT_wrcc will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 112"
#> [1] "model using DEM,Northing will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 111"
#> [1] "model using DEM,Easting will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 110"
#> [1] "model using TWI,BLD will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 109"
#> [1] "model using TWI,Precip_cum will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 108"
#> [1] "model using TWI,cday will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 107"
#> [1] "model using TWI,MaxT_wrcc will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 106"
#> [1] "model using TWI,Precip_wrcc will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 105"
#> [1] "model using TWI,NDRE.M will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 104"
#> [1] "model using TWI,Bt will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 103"
#> [1] "model using TWI,MinT_wrcc will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 102"
#> [1] "model using TWI,Northing will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 101"
#> [1] "model using TWI,Easting will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 100"
#> [1] "model using BLD,Precip_cum will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 99"
#> [1] "model using BLD,cday will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 98"
#> [1] "model using BLD,MaxT_wrcc will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 97"
#> [1] "model using BLD,Precip_wrcc will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 96"
#> [1] "model using BLD,NDRE.M will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 95"
#> [1] "model using BLD,Bt will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 94"
#> [1] "model using BLD,MinT_wrcc will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 93"
#> [1] "model using BLD,Northing will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 92"
#> [1] "model using BLD,Easting will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 91"
#> [1] "model using Precip_cum,cday will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 90"
#> [1] "model using Precip_cum,MaxT_wrcc will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 89"
#> [1] "model using Precip_cum,Precip_wrcc will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 88"
#> [1] "model using Precip_cum,NDRE.M will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 87"
#> [1] "model using Precip_cum,Bt will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 86"
#> [1] "model using Precip_cum,MinT_wrcc will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 85"
#> [1] "model using Precip_cum,Northing will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 84"
#> [1] "model using Precip_cum,Easting will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 83"
#> [1] "model using cday,MaxT_wrcc will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 82"
#> [1] "model using cday,Precip_wrcc will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 81"
#> [1] "model using cday,NDRE.M will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 80"
#> [1] "model using cday,Bt will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 79"
#> [1] "model using cday,MinT_wrcc will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 78"
#> [1] "model using cday,Northing will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 77"
#> [1] "model using cday,Easting will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 76"
#> [1] "model using MaxT_wrcc,Precip_wrcc will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 75"
#> [1] "model using MaxT_wrcc,NDRE.M will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 74"
#> [1] "model using MaxT_wrcc,Bt will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 73"
#> [1] "model using MaxT_wrcc,MinT_wrcc will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 72"
#> [1] "model using MaxT_wrcc,Northing will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 71"
#> [1] "model using MaxT_wrcc,Easting will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 70"
#> [1] "model using Precip_wrcc,NDRE.M will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 69"
#> [1] "model using Precip_wrcc,Bt will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 68"
#> [1] "model using Precip_wrcc,MinT_wrcc will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 67"
#> [1] "model using Precip_wrcc,Northing will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 66"
#> [1] "model using Precip_wrcc,Easting will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 65"
#> [1] "model using NDRE.M,Bt will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 64"
#> [1] "model using NDRE.M,MinT_wrcc will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 63"
#> [1] "model using NDRE.M,Northing will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 62"
#> [1] "model using NDRE.M,Easting will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 61"
#> [1] "model using Bt,MinT_wrcc will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 60"
#> [1] "model using Bt,Northing will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 59"
#> [1] "model using Bt,Easting will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 58"
#> [1] "model using MinT_wrcc,Northing will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 57"
#> [1] "model using MinT_wrcc,Easting will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 56"
#> [1] "model using Northing,Easting will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 55"
#> [1] "vars selected: DEM,BLD with RMSE NaN"
#> [1] "model using additional variable TWI will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.
#> Warning in train.default(predictors[, c(startvars, nextvars[i])], response, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 54"
#> [1] "model using additional variable Precip_cum will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 53"
#> [1] "model using additional variable cday will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 52"
#> [1] "model using additional variable MaxT_wrcc will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 51"
#> [1] "model using additional variable Precip_wrcc will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 50"
#> [1] "model using additional variable NDRE.M will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 49"
#> [1] "model using additional variable Bt will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 48"
#> [1] "model using additional variable MinT_wrcc will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 47"
#> [1] "model using additional variable Northing will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 46"
#> [1] "model using additional variable Easting will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 45"
#> [1] "vars selected: DEM,BLD,Northing with RMSE NaN"
#> [1] "model using additional variable TWI will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 44"
#> [1] "model using additional variable Precip_cum will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 43"
#> [1] "model using additional variable cday will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 42"
#> [1] "model using additional variable MaxT_wrcc will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 41"
#> [1] "model using additional variable Precip_wrcc will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 40"
#> [1] "model using additional variable NDRE.M will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 39"
#> [1] "model using additional variable Bt will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 38"
#> [1] "model using additional variable MinT_wrcc will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 37"
#> [1] "model using additional variable Easting will be trained now..."
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> There were missing values in resampled performance measures.

#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
#> missing values found in aggregated results
#> [1] "maximum number of models that still need to be trained: 36"
#> [1] "vars selected: DEM,BLD,Northing with RMSE NaN"
#> Note: No increase in performance found using more than 3 variables
 ffsmodel
#> Random Forest 
#> 
#> 490 samples
#>   3 predictor
#> 
#> No pre-processing
#> Resampling: Cross-Validated (10 fold) 
#> Summary of sample sizes: 332, 326, 322 
#> Resampling results across tuning parameters:
#> 
#>   mtry  RMSE        Rsquared   MAE       
#>   2     0.06282679  0.2913655  0.04683066
#>   5            NaN        NaN         NaN
#>   7            NaN        NaN         NaN
#> 
#> Tuning parameter 'splitrule' was held constant at a value of variance
#> 
#> Tuning parameter 'min.node.size' was held constant at a value of 5
#> RMSE was used to select the optimal model using the smallest value.
#> The final values used for the model were mtry = 2, splitrule = variance
#>  and min.node.size = 5.

 toc()
#> 36.04 sec elapsed

 stopCluster(cl)

Created on 2022-04-24 by the reprex package (v2.0.1)

Please remove dependencies on rgdal, rgeos, and/or maptools

This package depends on (depends, imports or suggests) raster and one or more of the retiring packages rgdal, rgeos or maptools (https://r-spatial.org/r/2022/04/12/evolution.html, https://r-spatial.org/r/2022/12/14/evolution2.html). Since raster 3.6.3, all use of external FOSS library functionality has been transferred to terra, making the retiring packages very likely redundant. It would help greatly if you could remove dependencies on the retiring packages as soon as possible.

Account for tibble (non-)drop behavior in aoa

First, thanks for your work on CAST. It is a very nice package and I am looking forward to further developments.

I recently ran into an issue while trying to run the tutorial https://cran.r-project.org/web/packages/CAST/vignettes/AOA-tutorial.html with my own data. I ran the function aoa, but the AOA$AOA results were only zeros.

AOA <- aoa(newdata = newdata, model = mod1, returnTrainDI = TRUE, cl = cl)

I found the issue was that I am using a tibble when training the model as below:

mod1 <- train(x = mytbl[,predictorNames], 
               y = mytbl$response,
               method = "rf",
               importance = TRUE,
               tuneGrid = expand.grid(mtry = c(2:length(predictorNames))),
               trControl = trainControl(method = "cv", savePredictions = TRUE))

Because of that, model$trainingData is also a tibble, and on line 168 every value of newdata[,catvar] is set to NA (I have one categorical predictor). A tibble has a different dropping behavior than a data.frame when a single column is selected: unique(train[,catvar]) returns a one-column tibble instead of a vector.

CAST/R/aoa.R

Line 168 in b34bc35

newdata[!newdata[,catvar]%in%unique(train[,catvar]),catvar] <- NA

The solution for me was to call mytbl <- as.data.frame(mytbl) before training the model, but I would suggest adding the following at the beginning of the aoa function so that it also handles tibbles robustly:

if(is.null(train)){train <- as.data.frame(model$trainingData)}

I don't have a ready reprex but I hope my description is sufficient to understand the issue.
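
For readers without a tibble at hand, the following is a minimal base-R sketch of the behavior described above. It is not taken from the CAST source: the column name `soil` and its values are made up, and the tibble's non-dropping behavior is emulated with `drop = FALSE` so that no extra package is needed.

```r
# Made-up training and prediction data with one categorical predictor.
train <- data.frame(soil = c("clay", "sand", "clay"), stringsAsFactors = FALSE)
newdata <- data.frame(soil = c("clay", "sand", "loam"), stringsAsFactors = FALSE)
catvar <- "soil"

# data.frame default: selecting a single column drops to a plain vector,
# so %in% compares value against value.
known_vec <- unique(train[, catvar])
in_vec <- newdata[, catvar] %in% known_vec

# Tibble-like behavior (emulated here with drop = FALSE): the selection stays
# a one-column data.frame, and %in% no longer matches the individual values.
known_df <- unique(train[, catvar, drop = FALSE])
in_df <- newdata[, catvar] %in% known_df

# [[ ]] returns a vector for both data.frames and tibbles.
in_safe <- newdata[[catvar]] %in% unique(train[[catvar]])

print(in_vec)
print(in_df)
print(in_safe)
```

With the one-column table nothing matches, so the negated condition on line 168 would mark every row and set the whole column to NA. Converting with as.data.frame(), or indexing with [[ ]], avoids this.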

Classification threshold

Hello,

I've been reading through the documentation for the ffs function, and I haven't been able to figure out a way to change what the threshold is for classifying predicted values. Am I missing something, or is this a missing feature? Perhaps the issue stems more from the caret package, but it is with the ffs function that I run into issues.

Here is an example of the problem I'm having:

library(caret)
library(CAST)

test_data <- structure(list(presence = c("no", "yes", "no", "no", "no", "no", 
"yes", "no", "no", "no", "no", "no", "no", "yes", "no", "no", 
"no", "no", "no", "no", "yes", "no", "no", "yes", "yes", "yes", 
"no", "no", "no", "no", "yes", "no", "no", "no", "no", "no", 
"no", "no", "no", "no", "no", "no", "no", "no", "no", "yes", 
"no", "no", "no", "no", "yes", "no", "no", "yes", "no", "no", 
"yes", "no", "no", "no", "no", "no", "no", "no", "no", "no", 
"no", "no", "no", "no", "no", "no", "no", "no", "no", "no", "no", 
"no", "no", "no", "yes", "no", "yes", "no", "no", "no", "no", 
"no", "yes", "no", "no", "no", "no", "no", "yes", "no", "no", 
"no", "no", "no"), annual_precip = c(153L, 200L, 235L, 281L, 
296L, 200L, 130L, 127L, 294L, 169L, 221L, 242L, 105L, 173L, 420L, 
212L, 116L, 252L, 153L, 167L, 243L, 186L, 412L, 179L, 237L, 107L, 
147L, 231L, 157L, 286L, 185L, 154L, 176L, 205L, 84L, 209L, 87L, 
247L, 380L, 146L, 218L, 119L, 420L, 420L, 200L, 195L, 199L, 411L, 
419L, 188L, 127L, 156L, 108L, 195L, 183L, 397L, 152L, 122L, 148L, 
152L, 219L, 159L, 152L, 107L, 367L, 393L, 115L, 252L, 241L, 169L, 
297L, 310L, 199L, 147L, 142L, 226L, 118L, 289L, 246L, 237L, 153L, 
113L, 203L, 220L, 76L, 101L, 346L, 133L, 154L, 305L, 156L, 233L, 
442L, 130L, 125L, 127L, 117L, 199L, 211L, 109L), precip_wettest_Q = c(74L, 
92L, 118L, 125L, 130L, 85L, 69L, 61L, 142L, 84L, 104L, 104L, 
54L, 84L, 183L, 101L, 56L, 125L, 74L, 75L, 112L, 91L, 175L, 94L, 
102L, 58L, 75L, 102L, 87L, 125L, 91L, 76L, 86L, 115L, 46L, 100L, 
45L, 123L, 156L, 69L, 103L, 66L, 172L, 183L, 85L, 100L, 85L, 
169L, 177L, 92L, 70L, 75L, 53L, 95L, 88L, 167L, 74L, 66L, 71L, 
75L, 104L, 78L, 75L, 62L, 153L, 162L, 70L, 124L, 121L, 82L, 140L, 
143L, 84L, 75L, 70L, 113L, 65L, 127L, 122L, 108L, 75L, 61L, 94L, 
116L, 39L, 52L, 136L, 66L, 75L, 129L, 76L, 117L, 183L, 69L, 64L, 
63L, 65L, 92L, 94L, 55L), mean_diurnal_range = c(7L, 7L, 7L, 
9L, 5L, 7L, 7L, 6L, 7L, 7L, 7L, 6L, 6L, 7L, 6L, 8L, 6L, 8L, 7L, 
7L, 6L, 7L, 5L, 7L, 6L, 6L, 7L, 7L, 6L, 9L, 7L, 6L, 8L, 6L, 5L, 
7L, 6L, 7L, 4L, 6L, 7L, 7L, 4L, 6L, 7L, 7L, 8L, 4L, 5L, 7L, 7L, 
7L, 6L, 7L, 7L, 5L, 6L, 6L, 7L, 6L, 6L, 7L, 6L, 5L, 4L, 4L, 6L, 
7L, 8L, 7L, 7L, 7L, 8L, 7L, 7L, 6L, 7L, 8L, 8L, 7L, 7L, 6L, 7L, 
6L, 5L, 6L, 4L, 6L, 7L, 6L, 6L, 7L, 4L, 7L, 6L, 7L, 6L, 7L, 7L, 
6L), isothermality = c(14L, 13L, 14L, 17L, 11L, 16L, 13L, 14L, 
15L, 14L, 14L, 14L, 14L, 14L, 13L, 16L, 14L, 15L, 14L, 16L, 14L, 
14L, 12L, 14L, 13L, 14L, 15L, 14L, 14L, 17L, 14L, 13L, 15L, 15L, 
13L, 16L, 13L, 14L, 10L, 13L, 14L, 15L, 11L, 13L, 16L, 15L, 16L, 
10L, 11L, 14L, 14L, 14L, 14L, 14L, 14L, 11L, 13L, 14L, 14L, 13L, 
14L, 14L, 13L, 12L, 10L, 11L, 12L, 14L, 15L, 14L, 14L, 14L, 16L, 
15L, 14L, 13L, 14L, 16L, 15L, 14L, 14L, 14L, 13L, 14L, 13L, 13L, 
11L, 14L, 14L, 14L, 13L, 14L, 11L, 15L, 13L, 14L, 13L, 13L, 14L, 
14L)), row.names = c(NA, -100L), class = c("tbl_df", "tbl", "data.frame"
))

If I create a model using all 4 predictor variables, the Kappa statistic calculated by the train function is 0.08:

set.seed(2354)

model <- train(presence ~ .,
            trControl = trControlCon, 
            method = 'glm', 
            family = 'binomial', 
            metric = 'Kappa',  
            data = test_data
)

Generalized Linear Model

100 samples
4 predictor
2 classes: 'no', 'yes'

No pre-processing
Resampling: Cross-Validated (3 fold)
Summary of sample sizes: 66, 67, 67
Resampling results:

Accuracy Kappa
0.8404635 0.08159167

However, this is based on the default threshold of 0.5. According to the calculations below, the threshold that would maximize Kappa is 0.2.

dt <- model$pred[,c("rowIndex", "obs", "yes")] %>%
  arrange(rowIndex) %>%
  mutate(obs = ifelse(obs == "yes", TRUE, FALSE),
         rowIndex = as.character(rowIndex))

ths <- optimal.thresholds(dt, opt.methods = "MaxKappa")

Method yes
MaxKappa 0.2

If I calculate Kappa using this threshold, I get a much higher Kappa statistic of about 0.24.

cmx_test <- cmx(dt, ths$yes[1])

Kappa(cmx_test)

Kappa Kappa.sd
0.2412141 0.1228811

This becomes an issue when I use the ffs function with Kappa as the metric for variable selection: in many cases all of the 2- and 3-variable combinations have a Kappa statistic of 0 or less, so the algorithm stops. When this happens, all of the predicted values are returned as "no", because the probability of "yes" is less than 0.5 for every observation. If the threshold had been 0.2 instead of 0.5, I suspect the Kappa value would have varied more between predictor combinations, and more variables would likely have been selected.

FF <- ffs(predictors = test_data[,2:5],
          response = test_data$presence,
          trControl = trainControl(method = 'cv', number = 3, classProbs = TRUE,
                                   savePredictions = TRUE),
          minVar = 2,
          method = 'glm',
          family = 'binomial',
          metric = "Kappa")

FF$perf_all
  var1               var2               var3               Kappa       SE         nvar
1 annual_precip      precip_wettest_Q                       0.00000000 0.00000000 2
2 annual_precip      mean_diurnal_range                     0.00000000 0.00000000 2
3 annual_precip      isothermality                          0.00000000 0.00000000 2
4 precip_wettest_Q   mean_diurnal_range                    -0.01587302 0.01587302 2
5 precip_wettest_Q   isothermality                          0.00000000 0.00000000 2
6 mean_diurnal_range isothermality                          0.00000000 0.00000000 2
7 annual_precip      precip_wettest_Q   mean_diurnal_range -0.01587302 0.01587302 3
8 annual_precip      precip_wettest_Q   isothermality       0.00000000 0.00000000 3

Do you have any suggestions as to how I could customize the threshold used to calculate the classification metrics for ffs to avoid this issue?
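
One possible direction, sketched under assumptions rather than taken from the CAST documentation: caret's trainControl() accepts a summaryFunction with the signature function(data, lev, model), and when classProbs = TRUE the data it receives holds one probability column per class next to obs and pred. A custom summary function could therefore re-classify at a chosen cut-off before computing Kappa. The helper names cohen_kappa and kappa_at_threshold below are made up for illustration, and the toy data is invented.

```r
# Cohen's kappa from observed and predicted class labels (base R only).
cohen_kappa <- function(obs, pred, levels) {
  tab <- table(factor(obs, levels = levels), factor(pred, levels = levels))
  n <- sum(tab)
  po <- sum(diag(tab)) / n                        # observed agreement
  pe <- sum(rowSums(tab) * colSums(tab)) / n^2    # agreement expected by chance
  if (pe == 1) return(0)                          # single-class case: avoid 0/0
  (po - pe) / (1 - pe)
}

# Hypothetical factory: returns a function with the (data, lev, model)
# signature that caret expects for trainControl(summaryFunction = ...).
kappa_at_threshold <- function(threshold = 0.2, positive = "yes") {
  function(data, lev = NULL, model = NULL) {
    negative <- setdiff(lev, positive)[1]
    # Re-classify from the class probability instead of using data$pred,
    # which reflects the default 0.5 cut-off.
    pred <- ifelse(data[[positive]] >= threshold, positive, negative)
    c(Kappa = cohen_kappa(data$obs, pred, lev))
  }
}

# Toy check on invented hold-out predictions.
toy <- data.frame(
  obs = factor(c("no", "no", "no", "yes", "yes", "no"), levels = c("no", "yes")),
  yes = c(0.05, 0.10, 0.30, 0.25, 0.40, 0.15)
)
toy$no <- 1 - toy$yes

k02 <- kappa_at_threshold(0.2)(toy, lev = c("no", "yes"))
k05 <- kappa_at_threshold(0.5)(toy, lev = c("no", "yes"))
print(k02)
print(k05)
```

If this works as assumed, it would be passed as trainControl(method = 'cv', number = 3, classProbs = TRUE, summaryFunction = kappa_at_threshold(0.2)) in the trControl argument of ffs, with metric = "Kappa" matching the name of the returned value. Note that a threshold tuned on the same cross-validated predictions will give an optimistic Kappa.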

small glitch in man: `ffs()` instead of `train()`

There's a glitch in

CAST/R/ffs.R

Lines 102 to 103 in 7742933

    
#' model <- ffs(trainDat[,predictors],trainDat$VW,method="rf",
#' tuneLength=1, trControl=ctrl)

which generates

CAST/man/ffs.Rd

Lines 125 to 126 in 7742933

    
model <- ffs(trainDat[,predictors],trainDat$VW,method="rf",
tuneLength=1, trControl=ctrl)

I think you want train() instead of ffs() here.

`aoa()` appears to return incorrect thresholds (different from Meyer & Pebesma 2021)

Hi all,

Adapting some code from the MEE-AOA repo, I believe I can calculate an AOA like this:

set.seed(123)

library(CAST)
library(caret)
library(virtualspecies)

npoints <- 50
meansPCA <- c(3, -1)
sdPCA <- c(2, 2)
simulateResponse <- c("bio2","bio5","bio10", "bio13", "bio14","bio19")
studyarea <- c(-15, 65, 30, 75)
predictors_global <- raster::brick(
  system.file(
    "extdata/bioclim_global.grd", 
    package = "CAST"
  )
)

predictors <- crop(predictors_global, extent(studyarea))
mask <- predictors[[1]]
values(mask)[!is.na(values(mask))] <- 1
response_vs <- generateSpFromPCA(
  predictors[[simulateResponse]],
  means = meansPCA,
  sds = sdPCA, 
  plot = FALSE
)
response <- response_vs$suitab.raster
mask <- rasterToPolygons(mask,dissolve=TRUE)

samplepoints <- spsample(mask,npoints,"random")
trainDat <- extract(predictors,samplepoints,df=TRUE)
trainDat$response <- extract(response,samplepoints)
trainDat <- trainDat[complete.cases(trainDat),]

model <- train(trainDat[,names(predictors)],
               trainDat$response,
               method="rf",
               importance=TRUE,
               trControl = trainControl(method="none"))

AOA <- aoa(trainDat, model=model)

According to the 2021 paper, I believe the AOA threshold after this should be equal to "the 75-percentile plus 1.5 times the IQR of the DI values of the cross-validated training data". Calculating that using quantile and IQR gives us these results:

di <- attr(AOA$AOA, "TrainDI")

(threshold_quantile <- stats::quantile(di, 0.75))
#>       75% 
#> 0.3059488
(threshold_iqr <- (1.5 * stats::IQR(di)))
#> [1] 0.3392091
threshold_quantile + threshold_iqr
#>       75% 
#> 0.6451579

But the AOA threshold returned by aoa() doesn't match that calculation:

AOA$parameters$threshold
#> [1] 0.4770295

If I'm right and this is unexpected, it seems to be due to the use of boxplot.stats() here:

CAST/R/trainDI.R

Line 221 in afcba3f

thres <- grDevices::boxplot.stats(TrainDI)$stats[5]

That gives us the threshold that CAST returns:

grDevices::boxplot.stats(di)$stats[5]
#> [1] 0.4770295

But I'm not entirely sure what boxplot.stats() actually does. For instance, imagine that we cut off the last di value in our vector:

di[50]
#> [1] 0.2120274
di <- di[1:49]

Because it's a rather low number, both our 75th percentile and IQR increase:

(threshold_quantile <- stats::quantile(di, 0.75))
#>       75% 
#> 0.3101567
(threshold_iqr <- (1.5 * stats::IQR(di)))
#> [1] 0.3523555
threshold_quantile + threshold_iqr
#>       75% 
#> 0.6625121

But boxplot.stats() returns the same value as before:

grDevices::boxplot.stats(di)$stats[5]
#> [1] 0.4770295

Created on 2022-12-11 by the reprex package (v2.0.1)

Apologies if I'm misunderstanding something here! The return here just didn't match my expectations.
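
On what boxplot.stats() does: according to its documentation, the fifth element of $stats is the extreme of the upper whisker, that is, the largest observed value lying no more than coef (default 1.5) times the box length above the upper hinge. It is therefore always one of the data values and not the fence itself, and the box is built from Tukey's hinges (fivenum) instead of quantile(). A small sketch on made-up DI-like numbers (not the values from the reprex above):

```r
# Hypothetical DI-like values, chosen only for illustration.
x <- c(0.10, 0.12, 0.15, 0.18, 0.20, 0.22, 0.25, 0.28, 0.30, 0.33, 0.48, 0.90)

# boxplot.stats() builds the box from Tukey's hinges.
hinges <- stats::fivenum(x)
box_length <- hinges[4] - hinges[2]
upper_fence <- hinges[4] + 1.5 * box_length

# Upper whisker: the largest data point that does not exceed the fence.
upper_whisker <- max(x[x <= upper_fence])

# Quantile-based threshold (75th percentile plus 1.5 times the IQR), for comparison.
quantile_threshold <- unname(stats::quantile(x, 0.75) + 1.5 * stats::IQR(x))

# Dropping a low value shifts the quantile-based threshold, but the whisker
# stays at the same observation as long as it remains below the fence.
x_short <- x[-1]
whisker_short <- grDevices::boxplot.stats(x_short)$stats[5]
quantile_threshold_short <- unname(
  stats::quantile(x_short, 0.75) + 1.5 * stats::IQR(x_short)
)

print(c(whisker = upper_whisker, quantile_based = quantile_threshold))
print(c(whisker = whisker_short, quantile_based = quantile_threshold_short))
```

This would explain why the value returned by boxplot.stats() did not move when one low DI value was removed, while the quantile-based calculation did.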

rsample version

Would you be willing to make a version of CreateSpacetimeFolds for rsample?

model filesize because of perf_all in the ffs

The file size (and especially the RAM usage) gets quite large with many predictors.
This is because perf_all is created as a large data.frame with rows and columns corresponding to the number of predictors and model runs.
Line 224 removes the empty rows at the bottom of the data frame, but empty columns remain after ffs stops. For example, with 116 predictors, of which ffs selected 8, perf_all still has 119 columns:

length(colnames(perf_all_big$perf_all))
[1] 119

To get rid of the columns you could use e.g.

bestmodel$perf_all <- bestmodel$perf_all[,colSums(is.na(bestmodel$perf_all)) != nrow(bestmodel$perf_all)]

Again the example with reduced size:

cutting <- big_perf_all[, colSums(is.na(big_perf_all)) != nrow(big_perf_all)]
> object.size(big_perf_all)
3604768 bytes (3.4 mb)
> object.size(cutting)
480784 bytes (0.4 mb)

Greetings Marvin
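
The same idea can be tried on a small made-up data frame. This is only a sketch: the column names and values are invented to resemble a perf_all table, and drop_empty_cols is a hypothetical helper, not part of CAST.

```r
# Invented stand-in for a perf_all table in which var3 and var4 were never filled.
perf_all <- data.frame(
  var1 = c("a", "b"),
  var2 = c("b", "c"),
  var3 = NA_character_,
  var4 = NA_character_,
  RMSE = c(1.2, 1.1),
  SE   = c(0.1, 0.2),
  nvar = c(2, 2)
)

# Keep only columns that contain at least one non-missing value.
# drop = FALSE keeps the result a data frame even if one column remains.
drop_empty_cols <- function(df) {
  df[, colSums(!is.na(df)) > 0, drop = FALSE]
}

trimmed <- drop_empty_cols(perf_all)
print(names(trimmed))
print(c(before = object.size(perf_all), after = object.size(trimmed)))
```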

Predict for other class object

Is it possible to use the predict function with objects of other classes besides terra rasters?
