The emma from modeloriented

[test no. 2] VIM_IRMI (PipeImpute)

test R script: script
log: vim_irmi log
successful usage: 5/10 tasks

Only two errors (internal ones, as far as I remember) occurred, each a few times.
On tasks: 3722, 29, 14954, 48:

Error in 1L:ncol(Y) : argument of length 0

and

INFO [18:15:01.269] Applying learner 'imput_VIM_IRMI.encodeimpact.classif.glmnet' on task 'Task 3561: profb (Supervised Classification)' (iter 1/5)
[1] "IRMI dont work on selcted params runing on defoult"
Error in VIM::irmi(df, imp_var = F) :
factor with less than 2 levels detected! - Overtime

[test no. 2] Amelia (PipeOpTaskPreproc)

test R script: script
Amelia (ver. Preproc) log: amelia log
successful usage: 3/10 tasks

INFO [14:57:48.084] Applying learner 'imput_Amelia.encodeimpact.classif.glmnet' on task 'Task 48: heart-c (Supervised Classification)' (iter 1/5)
Warning in 'amelia.prep(x = x, m = m, idvars = idvars, empri = empri, ts = ts, ':
You have a small number of observations, relative to the number, of variables in the imputation model. Consider removing some variables, or reducing the order of time polynomials to reduce the number of parameters.

error: inv_sympd(): matrix is singular or not positive definite

error: inv_sympd(): matrix is singular or not positive definite

error: inv_sympd(): matrix is singular or not positive definite

error: inv_sympd(): matrix is singular or not positive definite

The resulting variance matrix was not invertible. Please check your data for highly collinear variables.

[test no. 2] VIM_regrImp (PipeOpTaskPreproc)

test R script: script
log: VIM_regrImp log
successful usage: 4/10 tasks

INFO [22:14:35.457] Applying learner 'imput_VIM_regrImp.encodeimpact.classif.glmnet' on task 'Task 3543: irish (Supervised Classification)' (iter 4/5)
Warning in 'multinom(form, data[TFna, ])':
group ‘Senior_cycle_incomplete-secondary_school’ is empty
Warning in 'multinom(form, data[TFna, ])':
group ‘Senior_cycle_incomplete-secondary_school’ is empty
Warning in 'multinom(form, data[TFna, ])':
group ‘Senior_cycle_incomplete-secondary_school’ is empty
Warning in 'multinom(form, data[TFna, ])':
group ‘Senior_cycle_incomplete-secondary_school’ is empty
character(0)
[1] "Error in doTryCatch(return(expr), name, parentenv, handler): \n"
Error in try({ :

[test no. 2] missMDA_MCA_PCA_FMAD (PipeOpTaskPreproc)

test R script: script
log: missMDA_MCA_PCA_FMAD log
successful usage: 2/10 tasks

Tasks in which missing values were probably left after imputation (the same situation as discussed here):

Task 3830: cars
Task 3847: analcatdata_draft
During imputation, no errors were thrown other than those linked above.

[test no. 3] mice (PipeImpute)

Test version without preprocessing of datasets.
Test log: mice log

INFO [09:28:14.248] Applying learner 'imput_mice.encodeimpact.classif.glmnet' on task '3807' (iter 1/5)
Warning: Number of logged events: 1
INFO [09:28:15.452] Applying learner 'imput_mice.encodeimpact.classif.glmnet' on task '3807' (iter 2/5)
Warning: Number of logged events: 25
INFO [09:28:17.239] Applying learner 'imput_mice.encodeimpact.classif.glmnet' on task '3807' (iter 3/5)
INFO [09:28:18.542] Applying learner 'imput_mice.encodeimpact.classif.glmnet' on task '3807' (iter 4/5)
Warning: Number of logged events: 40
INFO [09:28:19.834] Applying learner 'imput_mice.encodeimpact.classif.glmnet' on task '3807' (iter 5/5)

of 5 iterations

Task: 3807

Learner: imput_mice.encodeimpact.classif.glmnet

Warnings: 0 in 0 iterations

Errors: 0 in 0 iterations
PROBABLY LEFT MISSINGS AFTER IMPUTATION!

[test no. 2] softImpute (PipeOpTaskPreproc)

test R script: script
log: softImpute log
successful usage: 3/10 tasks

INFO [22:12:39.235] Applying learner 'imput_softImpute.encodeimpact.classif.glmnet' on task 'Task 3830: cars (Supervised Classification)' (iter 3/5)
Error : Processed output task during prediction of imput_softImpute does not match output task during training.

[test no. 2] missForest (PipeOpTaskPreproc)

test R script: script
log: missForest log
successful usage: 6/10 tasks

INFO [15:07:05.868] Applying learner 'imput_missForest.encodeimpact.classif.glmnet' on task 'Task 3722: hungarian (Supervised Classification)' (iter 2/5)
Warning in 'randomForest.default(x = obsX, y = obsY, ntree = ntree, mtry = mtry, ':
The response has five or fewer unique values. Are you sure you want to do regression?
Warning in 'randomForest.default(x = obsX, y = obsY, ntree = ntree, mtry = mtry, ':
The response has five or fewer unique values. Are you sure you want to do regression?
Warning in 'randomForest.default(x = obsX, y = obsY, ntree = ntree, mtry = mtry, ':
The response has five or fewer unique values. Are you sure you want to do regression?
Warning in 'randomForest.default(x = obsX, y = obsY, ntree = ntree, mtry = mtry, ':
The response has five or fewer unique values. Are you sure you want to do regression?
Warning in 'randomForest.default(x = obsX, y = obsY, ntree = ntree, mtry = mtry, ':
The response has five or fewer unique values. Are you sure you want to do regression?
Error in [.data.frame(final, , i) : undefined columns selected

[test no. 2] VIM_HD (PipeImpute)

test R script: script
log: VIM_HD log
successful usage: 9/10 tasks

INFO [22:12:59.469] Applying learner 'imput_VIM_HD.encodeimpact.classif.glmnet' on task 'Task 3722: hungarian (Supervised Classification)' (iter 5/5)

of 5 iterations

Task: Task 3722: hungarian (Supervised Classification)

Learner: imput_VIM_HD.encodeimpact.classif.glmnet

Warnings: 0 in 0 iterations

Errors: 0 in 0 iterations
PROBABLY LEFT MISSINGS AFTER IMPUTATION!

[test no. 2] Amelia (PipeOpTaskPreproc)

test R script: script
Amelia (ver. Preproc) log: amelia log
successful usage: 3/10 tasks

INFO [14:57:59.713] Applying learner 'imput_Amelia.encodeimpact.classif.glmnet' on task 'Task 3838: autos (Supervised Classification)' (iter 1/5)
Warning in 'amcheck(x = x, m = m, idvars = numopts$idvars, priors = priors, ':

The number of categories in one of the variables marked nominal has greater than 10 categories. Check nominal specification.

Error in contrasts<-(*tmp*, value = contr.funs[1 + isOF[nn]]) :
contrasts can be applied only to factors with 2 or more levels

[test no. 2] missMDA_MCA_PCA_FMAD (PipeOpTaskPreproc)

test R script: script
log: missMDA_MCA_PCA_FMAD log
successful usage: 2/10 tasks

INFO [11:51:44.076] Applying learner 'imput_missMDA_MCA_PCA_FMAD.encodeimpact.classif.glmnet' on task 'Task 3561: profb (Supervised Classification)' (iter 1/5)
[1] "Fail to estimate ncp"
Error in if (any(MM[[g]] < 0)) stop(paste("The algorithm fails to converge. Choose a number of components (ncp) less or equal than ", :
missing value where TRUE/FALSE needed

Information

I am conducting the first tests: taking a single imputation pipe and building a simple learning graph
on 3 datasets with missing values, using factor encoding and glmnet. I save logs to separate files (EMMA_package/tests/logs),
where the full messages are written.
Below I leave my comments on documentation and usage.
By a passed dataset I mean one on which the learner was trained and scored
(despite possible warnings during imputation).

General remarks

I suggest adding, to each Pipe wrapper, a link to the documentation of the imputation function it wraps in the original package.
In general, the Pipes produce a lot of output (warnings etc.) from their native functions. For further development it might be worth
considering hiding these native prints and replacing them with EMMA's own messages (for example: method used, successful or not, optimized or not), with a verbose option to show or hide the native output.
Parameter descriptions would be easier to understand if they were divided into sections (or otherwise distinguished): parameters
passed directly to the native imputation functions, and additional parameters created in EMMA for pipe control (optimize, out_file etc.).
to do after testing is complete
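
The suggestion about hiding native output could look roughly like the sketch below. It uses base R only; `run_quietly` and `noisy_impute` are hypothetical names made up for illustration and are not part of EMMA, and the wrapped function stands in for any native imputation call.

```r
# Hypothetical helper (an assumption, not EMMA code): run an imputation
# function and hide its printed output, messages and warnings unless
# verbose = TRUE.
run_quietly <- function(impute_fun, data, ..., verbose = FALSE) {
  if (verbose) {
    return(impute_fun(data, ...))
  }
  warnings_seen <- character(0)
  result <- NULL
  # capture.output() swallows print()/cat() output; the calling handlers
  # record warnings and muffle both warnings and messages.
  invisible(utils::capture.output(
    result <- withCallingHandlers(
      impute_fun(data, ...),
      warning = function(w) {
        warnings_seen <<- c(warnings_seen, conditionMessage(w))
        invokeRestart("muffleWarning")
      },
      message = function(m) invokeRestart("muffleMessage")
    )
  ))
  message(sprintf("Imputation finished: %d warning(s) suppressed.",
                  length(warnings_seen)))
  result
}

# Toy stand-in for a noisy native imputation function.
noisy_impute <- function(df) {
  print("native output")
  warning("native warning")
  df[is.na(df)] <- 0
  df
}

imputed <- run_quietly(noisy_impute, data.frame(a = c(1, NA)))
```

Errors are deliberately not caught here, so a failing imputation still stops the pipeline as before.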

Detailed comments

[test no. 2] VIM_IRMI (PipeOpTaskPreproc)

test R script: script
log: vim_irmi log
successful usage: 3/10 tasks

INFO [14:59:09.899] Applying learner 'imput_VIM_IRMI.encodeimpact.classif.glmnet' on task 'Task 29: credit-approval (Supervised Classification)' (iter 1/5)
...
Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
Warning in 'multinom(form, data = x_reg, summ = 2, maxit = 50, trace = FALSE, ':
group ‘3’ is empty
Warning in 'multinom(form, data = x_reg, summ = 2, maxit = 50, trace = FALSE, ':
groups ‘2’ ‘3’ are empty
[1] "IRMI dont work on selcted params runing on defoult"
Warning in 'multinom(form, data = x_reg, summ = 2, maxit = 50, trace = FALSE, ':
groups ‘2’ ‘3’ are empty
Error in 1L:ncol(Y) : argument of length 0

[test no. 2] VIM_IRMI (PipeOpTaskPreproc)

test R script: script
log: vim_irmi log
successful usage: 3/10 tasks

INFO [14:59:36.674] Applying learner 'imput_VIM_IRMI.encodeimpact.classif.glmnet' on task 'Task 3561: profb (Supervised Classification)' (iter 1/5)
[1] "IRMI dont work on selcted params runing on defoult"
Error in VIM::irmi(df, imp_var = F) :
factor with less than 2 levels detected! - Overtime

[test no. 2] Amelia (PipeOpTaskPreproc)

test R script: script
Amelia (ver. Preproc) log: amelia log
successful usage: 3/10 tasks

INFO [14:57:37.314] Applying learner 'imput_Amelia.encodeimpact.classif.glmnet' on task 'Task 3722: hungarian (Supervised Classification)' (iter 1/5)
Amelia Error Code: 4
The data has a column that is completely missing or only has one,observation. Remove these columns: ca

[test no. 2] Amelia (PipeOpTaskPreproc)

test R script: script
Amelia (ver. Preproc) log: amelia log
successful usage: 3/10 tasks

INFO [14:57:45.406] Applying learner 'imput_Amelia.encodeimpact.classif.glmnet' on task 'Task 3830: cars (Supervised Classification)' (iter 1/5)
Amelia Error Code: 36
The number of categories in the nominal variable 'name' is greater than one-third of the observations.

[test no. 2] missRanger (PipeOpTaskPreproc)

test R script: script
log: missRanger log
successful usage: 9/10 tasks

INFO [22:15:13.400] Applying learner 'imput_missRanger.encodeimpact.classif.glmnet' on task 'Task 3722: hungarian (Supervised Classification)' (iter 3/5)
Warning: Dropped unused factor level(s) in dependent variable: 3.
Error in pmm(xtrain = fit$predictions, xtest = pred, ytrain = data[[v]][!v.na], :
sum(ok <- !is.na(xtrain) & !is.na(ytrain)) >= 1L is not TRUE

[test no. 2] missMDA_MCA_PCA_FMAD (PipeOpTaskPreproc)

test R script: script
log: missMDA_MCA_PCA_FMAD log
successful usage: 2/10 tasks
Task 3543: irish

Error in impute(X, group = group, ncp = ncp, type = type, method = method, :
The algorithm fails to converge. Choose a number of components (ncp) less or equal than 0 or a number of iterations (maxiter) less or equal than 997

[test no. 2] VIM_HD (PipeOpTaskPreproc)

test R script: script
log: VIM_HD log
successful usage: 9/10 tasks

INFO [22:12:57.002] Applying learner 'imput_VIM_HD.encodeimpact.classif.glmnet' on task 'Task 3722: hungarian (Supervised Classification)' (iter 1/5)
INFO [22:12:57.537] Applying learner 'imput_VIM_HD.encodeimpact.classif.glmnet' on task 'Task 3722: hungarian (Supervised Classification)' (iter 2/5)
INFO [22:12:58.449] Applying learner 'imput_VIM_HD.encodeimpact.classif.glmnet' on task 'Task 3722: hungarian (Supervised Classification)' (iter 3/5)
INFO [22:12:58.997] Applying learner 'imput_VIM_HD.encodeimpact.classif.glmnet' on task 'Task 3722: hungarian (Supervised Classification)' (iter 4/5)
INFO [22:12:59.469] Applying learner 'imput_VIM_HD.encodeimpact.classif.glmnet' on task 'Task 3722: hungarian (Supervised Classification)' (iter 5/5)

of 5 iterations

Task: Task 3722: hungarian (Supervised Classification)

Learner: imput_VIM_HD.encodeimpact.classif.glmnet

Warnings: 0 in 0 iterations

Errors: 0 in 0 iterations
PROBABLY LEFT MISSINGS AFTER IMPUTATION!

Comment:
- model prediction contained NA values, which usually occurs when missing values are present in a data frame (so probably were not imputed in our case)
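
A quick way to confirm the suspicion in the comment above is to count the missing values that remain in each column after the imputation step. The sketch below is base R only; `columns_with_na` is a hypothetical helper and the small data frame is made-up illustration data, not output from the test run.

```r
# Hypothetical check (an assumption, not EMMA code): return the number of
# remaining NA values for every column that still contains at least one.
columns_with_na <- function(df) {
  na_counts <- vapply(df, function(col) sum(is.na(col)), integer(1))
  na_counts[na_counts > 0]
}

# Made-up "imputed" data in which one column was entirely missing.
imputed <- data.frame(
  age = c(54, 61, 47),
  ca  = c(NA, NA, NA),
  sex = factor(c("m", "f", NA))
)

left <- columns_with_na(imputed)
if (length(left) > 0) {
  message("Columns with missing values after imputation: ",
          paste(names(left), collapse = ", "))
}
```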

[test no. 2] softImpute (PipeOpTaskPreproc)

test R script: script
log: softImpute log
successful usage: 3/10 tasks

INFO [22:12:48.043] Applying learner 'imput_softImpute.encodeimpact.classif.glmnet' on task 'Task 3675: pbc (Supervised Classification)' (iter 3/5)
Warning in '[<-.factor(*tmp*, is.na(col_to_imp), value = "NA's")':
invalid factor level, NA generated
INFO [22:12:48.876] Applying learner 'imput_softImpute.encodeimpact.classif.glmnet' on task 'Task 3675: pbc (Supervised Classification)' (iter 4/5)
INFO [22:12:49.340] Applying learner 'imput_softImpute.encodeimpact.classif.glmnet' on task 'Task 3675: pbc (Supervised Classification)' (iter 5/5)

of 5 iterations

Task: Task 3675: pbc (Supervised Classification)

Learner: imput_softImpute.encodeimpact.classif.glmnet

Warnings: 0 in 0 iterations

Errors: 0 in 0 iterations
PROBABLY LEFT MISSINGS AFTER IMPUTATION!

Comment:
- model prediction contained NA values, which usually occurs when missing values are present in a data frame (so probably were not imputed in our case)

[test no. 3] VIM_IRMI (PipePreproc)

Test version without preprocessing of datasets.
Test log: IRMI

INFO [09:37:51.610] Applying learner 'imput_VIM_IRMI.encodeimpact.classif.glmnet' on task '3802' (iter 1/5)
[1] "IRMI dont work on selcted params runing on defoult"
Error in colnames(final) : object 'final' not found

[test no. 2] missMDA_MCA_PCA_FMAD (PipeOpTaskPreproc)

test R script: script
log: missMDA_MCA_PCA_FMAD log
successful usage: 2/10 tasks

INFO [22:32:23.834] Applying learner 'imput_missMDA_MCA_PCA_FMAD.encodeimpact.classif.glmnet' on task 'Task 3722: hungarian (Supervised Classification)' (iter 1/5)
[1] "Fail to estimate ncp"
Error in eigen(crossprod(X, X), symmetric = TRUE) :
infinite or missing values in 'x'

[test no. 2] missForest (PipeOpTaskPreproc)

test R script: script
log: missForest log
successful usage: 6/10 tasks

INFO [15:08:02.454] Applying learner 'imput_missForest.encodeimpact.classif.glmnet' on task 'Task 3561: profb (Supervised Classification)' (iter 1/5)
Error in [<-.data.frame(*tmp*, misi, res$varInd, value = structure(c(1L, :
replacement has 537 rows, data has 513

[test no. 2] mice (PipeOpTaskPreproc)

test R script: script
log: mice log
successful usage: 5/10 tasks

INFO [22:10:54.599] Applying learner 'imput_mice.encodeimpact.classif.glmnet' on task 'Task 3722: hungarian (Supervised Classification)' (iter 1/5)
Warning: Number of logged events: 51
Error in lognet(x, is.sparse, ix, jx, y, weights, offset, alpha, nobs, :
NA/NaN/Inf in foreign function call (arg 5)

[test no. 2] mice (PipeOpTaskPreproc)

test R script: script
log: mice log
successful usage: 5/10 tasks

INFO [22:11:41.559] Applying learner 'imput_mice.encodeimpact.classif.glmnet' on task 'Task 14954: cylinder-bands (Supervised Classification)' (iter 1/5)
Error in solve.default(xtx + diag(pen)) :
system is computationally singular: reciprocal condition number = 1.96433e-16

[test no. 2] Amelia (PipeOpTaskPreproc)

test R script: script
Amelia (ver. Preproc) log: amelia log
successful usage: 3/10 tasks

INFO [14:57:46.758] Applying learner 'imput_Amelia.encodeimpact.classif.glmnet' on task 'Task 14954: cylinder-bands (Supervised Classification)' (iter 1/5)
Warning in 'amcheck(x = x, m = m, idvars = numopts$idvars, priors = priors, ':

The number of categories in one of the variables marked nominal has greater than 10 categories. Check nominal specification.

Warning in 'amcheck(x = x, m = m, idvars = numopts$idvars, priors = priors, ':

The number of categories in one of the variables marked nominal has greater than 10 categories. Check nominal specification.

Amelia Error Code: 43
You have a variable in your dataset that does not vary. Please remove this variable. Variables that do not vary: cylinder_division, ink_color
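
Errors like the one above (a variable that does not vary within a resampling fold) can in principle be avoided by dropping such columns before the imputation call. The following is only a sketch under that assumption; `drop_constant_columns` is a hypothetical helper, not part of EMMA or Amelia, and the example data are invented.

```r
# Hypothetical preprocessing step (an assumption, not EMMA code): drop
# columns that have fewer than two distinct observed (non-NA) values.
drop_constant_columns <- function(df) {
  n_distinct <- vapply(
    df,
    function(col) length(unique(col[!is.na(col)])),
    integer(1)
  )
  constant <- names(n_distinct)[n_distinct < 2]
  if (length(constant) > 0) {
    message("Dropping columns that do not vary: ",
            paste(constant, collapse = ", "))
  }
  df[, n_distinct >= 2, drop = FALSE]
}

# Invented example: one constant factor and one completely missing column.
bands <- data.frame(
  press     = c(821, 802, NA, 816),
  ink_color = factor(c("key", "key", "key", "key")),
  all_na    = c(NA_real_, NA_real_, NA_real_, NA_real_),
  band_type = factor(c("band", "noband", "band", NA))
)

cleaned <- drop_constant_columns(bands)
```

Because the check is applied to whatever data it receives, it would have to run inside each training fold to catch columns that are constant only in that fold.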

[test no. 2] VIM_IRMI (PipeOpTaskPreproc)

test R script: script
log: vim_irmi log
successful usage: 3/10 tasks

INFO [14:59:23.079] Applying learner 'imput_VIM_IRMI.encodeimpact.classif.glmnet' on task 'Task 3830: cars (Supervised Classification)' (iter 1/5)
...
Warning in 'predict.lm(object, newdata, se.fit, scale = residual.scale, type = if (type == ':
prediction from a rank-deficient fit may be misleading
Error : Processed output task during prediction of imput_VIM_IRMI does not match output task during training.

[test no. 2] VIM_IRMI (PipeImpute)

test R script: script
log: vim_irmi log
successful usage: 3/10 tasks

INFO [15:05:12.329] Applying learner 'imput_VIM_IRMI.encodeimpact.classif.glmnet' on task 'Task 3838: autos (Supervised Classification)' (iter 1/5)
...
Warning: glm.fit: algorithm did not converge
Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
Error : Processed output task during prediction of imput_VIM_IRMI does not match output task during training.

[test no. 2] softImpute (PipeImpute)

test R script: script
log: softImpute log
successful usage: 3/10 tasks

INFO [22:12:39.235] Applying learner 'imput_softImpute.encodeimpact.classif.glmnet' on task 'Task 3830: cars (Supervised Classification)' (iter 3/5)
Error : Processed output task during prediction of imput_softImpute does not match output task during training.

[test no. 3] VIM_regrImp (PipeImpute)

Test version with preprocessing of datasets.
Test log: VIM_regrImp log

INFO [10:15:48.372] Applying learner 'imput_VIM_regrImp.encodeimpact.classif.glmnet' on task '3604' (iter 1/5)
Warning in 'multinom(form, data[TFna, ])': group ‘4’ is empty
Warning in 'multinom(form, data[TFna, ])': group ‘4’ is empty
Warning in 'multinom(form, data[TFna, ])': group ‘4’ is empty
Warning in 'multinom(form, data[TFna, ])': group ‘4’ is empty
[1] "Error in apply(pre, 1, function(x) sample(1:length(x), 1, prob = x)): dim(X) must have a positive length\n"
[1] "Error in apply(pre, 1, function(x) sample(1:length(x), 1, prob = x)): dim(X) must have a positive length\n"
Error in apply(pre, 1, function(x) sample(1:length(x), 1, prob = x)) :
dim(X) must have a positive length

This is probably an internal problem that has already been detected, but with a different error message. I am reporting it because I am not sure.

[test no. 2] softImpute (PipeImpute)

test R script: script
log: softImpute log
successful usage: 3/10 tasks

INFO [22:12:33.475] Applying learner 'imput_softImpute.encodeimpact.classif.glmnet' on task 'Task 3722: hungarian (Supervised Classification)' (iter 1/5)
Warning in '[<-.factor(*tmp*, is.na(col_to_imp), value = "NA's")':
invalid factor level, NA generated
Warning in '[<-.factor(*tmp*, is.na(col_to_imp), value = "NA's")':
invalid factor level, NA generated
Error in lognet(x, is.sparse, ix, jx, y, weights, offset, alpha, nobs, :
NA/NaN/Inf in foreign function call (arg 5)

This error is probably thrown by glmnet when missing values are present in the data.

Test no. 3 was run for both the PipePreproc and PipeImpute versions on a sample of 10 tasks with missing values.
This time two setups were tested: with and without a dataset preprocessing step.
For readability, a summary of performance is provided in a Google sheet.
Sheet with summary: sheet
Tasks test sample: sample

[test no. 2] VIM_IRMI (PipeImpute)

test R script: script
log: vim_irmi log
successful usage: 3/10 tasks

INFO [15:05:10.585] Applying learner 'imput_VIM_IRMI.encodeimpact.classif.glmnet' on task 'Task 3847: analcatdata_draft (Supervised Classification)' (iter 1/5)
No missings in x. Nothing to impute
Warning in 'kNN(x, imp_var = FALSE, mixed = mixed, mixed.constant = mixed.constant)':
Nothing to impute, because no NA are present (also after using makeNA)
Error : Processed output task during prediction of imput_VIM_IRMI does not match output task during training.

[test no. 2] missMDA_MFA (PipeOpTaskPreproc)

test R script: script
log: missMDA_MFA log
successful usage: 1/10 tasks

INFO [22:26:33.665] Applying learner 'imput_missMDA_MFA.encodeimpact.classif.glmnet' on task 'Task 3722: hungarian (Supervised Classification)' (iter 1/5)
Error in eigen(crossprod(X, X), symmetric = TRUE) :
infinite or missing values in 'x'

[test no. 2] softImpute (PipeOpTaskPreproc)

test R script: script
log: softImpute log
successful usage: 3/10 tasks

INFO [22:12:33.475] Applying learner 'imput_softImpute.encodeimpact.classif.glmnet' on task 'Task 3722: hungarian (Supervised Classification)' (iter 1/5)
Warning in '[<-.factor(*tmp*, is.na(col_to_imp), value = "NA's")':
invalid factor level, NA generated
Warning in '[<-.factor(*tmp*, is.na(col_to_imp), value = "NA's")':
invalid factor level, NA generated
Error in lognet(x, is.sparse, ix, jx, y, weights, offset, alpha, nobs, :
NA/NaN/Inf in foreign function call (arg 5)

[test no. 2] VIM_IRMI (PipeOpTaskPreproc)

test R script: script
log: vim_irmi log
successful usage: 3/10 tasks

INFO [15:05:12.329] Applying learner 'imput_VIM_IRMI.encodeimpact.classif.glmnet' on task 'Task 3838: autos (Supervised Classification)' (iter 1/5)
Warning in 'predict.lm(object, newdata, se.fit, scale = 1, type = if (type == ':
prediction from a rank-deficient fit may be misleading
...
Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
Warning: glm.fit: algorithm did not converge
...
Error : Processed output task during prediction of imput_VIM_IRMI does not match output task during training.

[test no. 2] mice (PipeOpTaskPreproc)

test R script: script
log: mice log
successful usage: 5/10 tasks

INFO [22:12:03.409] Applying learner 'imput_mice.encodeimpact.classif.glmnet' on task 'Task 3847: analcatdata_draft (Supervised Classification)' (iter 2/5)
Error in edit.setup(data, setup, ...) :
mice detected constant and/or collinear variables. No predictors were left after their removal.

[test no. 2] missMDA_MFA (PipeOpTaskPreproc)

test R script: script
log: missMDA_MFA log
successful usage: 1/10 tasks

INFO [22:26:55.985] Applying learner 'imput_missMDA_MFA.encodeimpact.classif.glmnet' on task 'Task 3838: autos (Supervised Classification)' (iter 2/5)
Error in apply(tabdisj[, (vec[i] + 1):vec[i + 1]], 1, which.max) :
dim(X) must have a positive length

[test no. 2] VIM_KNN (PipeOpTaskPreproc)

test R script: script
log: VIM_KNN log
successful usage: 9/10 tasks

INFO [22:13:38.048] Applying learner 'imput_VIM_kNN.encodeimpact.classif.glmnet' on task 'Task 3722: hungarian (Supervised Classification)' (iter 3/5)
[1] 5
[1] 5
Warning in 'VIM::kNN(df, k = k, numFun = numFun, catFun = catFun, imp_var = F)':
All observations of feature are missing, therefore the variable will not be imputed!

Warning in 'FUN(newX[, i], ...)':
no non-missing arguments to min; returning Inf
Warning in 'FUN(newX[, i], ...)':
no non-missing arguments to max; returning -Inf
Warning in 'FUN(newX[, i], ...)':
no non-missing arguments to min; returning Inf
Warning in 'FUN(newX[, i], ...)':
no non-missing arguments to max; returning -Inf
Warning in 'FUN(newX[, i], ...)':
no non-missing arguments to min; returning Inf
Warning in 'FUN(newX[, i], ...)':
no non-missing arguments to max; returning -Inf
Error in indexNA2s[, variable[j]] : subscript out of bounds

[test no. 2] Amelia (PipeOpTaskPreproc)

test R script: script
Amelia (ver. Preproc) log: amelia log
successful usage: 3/10 tasks

INFO [14:57:49.855] Applying learner 'imput_Amelia.encodeimpact.classif.glmnet' on task 'Task 3561: profb (Supervised Classification)' (iter 1/5)
Warning in 'amcheck(x = x, m = m, idvars = numopts$idvars, priors = priors, ':

The number of categories in one of the variables marked nominal has greater than 10 categories. Check nominal specification.

Warning in 'amcheck(x = x, m = m, idvars = numopts$idvars, priors = priors, ':

The number of categories in one of the variables marked nominal has greater than 10 categories. Check nominal specification.

Amelia Error Code: 43
You have a variable in your dataset that does not vary. Please remove this variable. Variables that do not vary: Overtime

[test no. 2] missForest (PipeOpTaskPreproc)

test R script: script
log: missForest log
successful usage: 6/10 tasks

INFO [15:07:47.108] Applying learner 'imput_missForest.encodeimpact.classif.glmnet' on task 'Task 3830: cars (Supervised Classification)' (iter 1/5)
Error in { :
task 1 failed - "Can not handle categorical predictors with more than 53 categories."

INFO [15:07:48.802] Applying learner 'imput_missForest.encodeimpact.classif.glmnet' on task 'Task 14954: cylinder-bands (Supervised Classification)' (iter 1/5)
Error in { :
task 1 failed - "Can not handle categorical predictors with more than 53 categories."
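
The limit of 53 categories quoted in the error above can be checked before the imputation call. A minimal sketch in base R follows; `high_cardinality_factors` is a hypothetical helper (not part of EMMA or missForest) and the data frame is invented for illustration.

```r
# Hypothetical check (an assumption, not EMMA code): list categorical
# columns whose number of distinct observed values exceeds max_levels.
# The default of 53 is taken from the error message in the log.
high_cardinality_factors <- function(df, max_levels = 53) {
  is_cat <- vapply(
    df,
    function(col) is.factor(col) || is.character(col),
    logical(1)
  )
  n_levels <- vapply(
    df[is_cat],
    function(col) length(unique(col[!is.na(col)])),
    integer(1)
  )
  as.character(names(n_levels)[n_levels > max_levels])
}

# Invented example: a name-like column with 60 distinct values.
cars_like <- data.frame(
  name = paste0("car_", 1:60),
  cyl  = factor(rep(c(4, 6, 8), 20)),
  mpg  = seq_len(60),
  stringsAsFactors = FALSE
)

too_many <- high_cardinality_factors(cars_like)
```

What to do with the flagged columns (drop them, lump rare levels, or treat them as identifiers) is a separate decision that this sketch does not make.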

Adding columns with information about where values were imputed.

A separate PipeOpPreproces function is needed to create columns recording where imputation will happen. I will do this. This issue is only to inform you about the problem with the current solution.
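
As a rough illustration of what such indicator columns could look like, here is a base R sketch. The function name `add_missing_indicators` and the `_was_missing` suffix are assumptions made for this example; they are not the planned EMMA implementation.

```r
# Hypothetical sketch (an assumption, not EMMA code): for every column that
# contains NA, add a 0/1 column marking the rows where a value is missing.
# This has to run before imputation, while the NA positions are still known.
add_missing_indicators <- function(df, suffix = "_was_missing") {
  has_na <- vapply(df, anyNA, logical(1))
  for (col in names(df)[has_na]) {
    df[[paste0(col, suffix)]] <- as.integer(is.na(df[[col]]))
  }
  df
}

df <- data.frame(a = c(1, NA, 3), b = c("x", "y", "z"))
flagged <- add_missing_indicators(df)
```

Note that a column with no missing values in the training data gets no indicator here, so the set of indicator columns would need to be fixed during training and reused at prediction time to keep both outputs consistent.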

[test no. 4] Summary

Test purpose: usage of auto-optimization of parameters in missForest, mice and missRanger.
Test script: script
Test logs: logs
Performance: all pipes performed successful imputation in 5/5 tasks

[test no. 2] VIM_IRMI (PipeOpTaskPreproc)

test R script: script
log: vim_irmi log
successful usage: 3/10 tasks

INFO [15:05:10.585] Applying learner 'imput_VIM_IRMI.encodeimpact.classif.glmnet' on task 'Task 3847: analcatdata_draft (Supervised Classification)' (iter 1/5)
No missings in x. Nothing to impute
Warning in 'kNN(x, imp_var = FALSE, mixed = mixed, mixed.constant = mixed.constant)':
Nothing to impute, because no NA are present (also after using makeNA)
Error : Processed output task during prediction of imput_VIM_IRMI does not match output task during training.

[test no. 2] VIM_regrImp (PipeImpute)

test R script: script
log: VIM_regrImp log
successful usage: 4/10 tasks
The same errors occurred as in the PipePreProc version: issue

[test no. 2] VIM_IRMI (PipeOpTaskPreproc)

test R script: script
log: vim_irmi log
successful usage: 3/10 tasks

INFO [14:58:23.920] Applying learner 'imput_VIM_IRMI.encodeimpact.classif.glmnet' on task 'Task 3722: hungarian (Supervised Classification)' (iter 1/5)
Warning in 'multinom(form, data = x_reg, summ = 2, maxit = 50, trace = FALSE, ':
group ‘0’ is empty
[1] "IRMI dont work on selcted params runing on defoult"
Warning in 'predict.lm(object, newdata, se.fit, scale = 1, type = if (type == ':
prediction from a rank-deficient fit may be misleading
Warning: glm.fit: algorithm did not converge
Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
Warning in 'predict.lm(object, newdata, se.fit, scale = 1, type = if (type == ':
prediction from a rank-deficient fit may be misleading
Warning in 'multinom(form, data = x_reg, summ = 2, maxit = 50, trace = FALSE, ':
group ‘0’ is empty
Error in 1L:ncol(Y) : argument of length 0

[test no. 2] Summary

test R script: script

PipeOpTaskPreproc version results:

Amelia (PipeOpTaskPreproc)

log: amelia log
successful usage: 3/10 tasks

VIM_IRMI (PipeOpTaskPreproc)

log: vim_irmi log
successful usage: 3/10 tasks

missForest (PipeOpTaskPreproc)

log: missForest log
successful usage: 6/10 tasks

mice (PipeOpTaskPreproc)

log: mice log
successful usage: 5/10 tasks

softImpute (PipeOpTaskPreproc)

log: softImpute log
successful usage: 7/10 tasks

VIM_HD (PipeOpTaskPreproc)

log: VIM_HD log
successful usage: 10/10 tasks

VIM_KNN (PipeOpTaskPreproc)

log: VIM_KNN log
successful usage: 9/10 tasks

VIM_regrImp (PipeOpTaskPreproc)

log: VIM_regrImp log
successful usage: 4/10 tasks

missRanger (PipeOpTaskPreproc)

log: missRanger log
successful usage: 9/10 tasks

missMDA_MFA (PipeOpTaskPreproc)

log: missMDA_MFA log
successful usage: 1/10 tasks

missMDA_MCA_PCA_FMAD (PipeOpTaskPreproc)

log: missMDA_MCA_PCA_FMAD log
successful usage: 2/10 tasks

[test no. 2] mice (PipeImpute)

test R script: script
log: mice log
successful usage: 5/10 tasks

INFO [22:10:54.599] Applying learner 'imput_mice.encodeimpact.classif.glmnet' on task 'Task 3722: hungarian (Supervised Classification)' (iter 1/5)
Warning: Number of logged events: 51
Error in lognet(x, is.sparse, ix, jx, y, weights, offset, alpha, nobs, :
NA/NaN/Inf in foreign function call (arg 5)

Experiment arrangements

update 23.07.2020:

database of datasets:

basic statistics of datasets with missing data
searching for patterns in observations with missing values in real incomplete data
generating missing entries in real complete data according to the patterns found

pipeline to support imputation methods:

R package
for multiple imputation we keep the first imputed dataset by default; optionally we can use multiple versions of the imputed data
implementing out-of-range imputation
implementing a dummy-encoded mask indicating whether an entry is missing
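
One common reading of out-of-range imputation is to replace missing numeric values with a value below the observed minimum and missing factor values with a new level. The sketch below follows that reading in base R; it is an assumption about what is meant, and `impute_out_of_range`, `offset` and the `.MISSING` level are names invented for this example.

```r
# Hypothetical sketch (an assumption, not EMMA code): numeric NA values are
# set to min - offset * range, factor NA values to a new level.
impute_out_of_range <- function(df, offset = 1, new_level = ".MISSING") {
  for (col in names(df)) {
    x <- df[[col]]
    if (!anyNA(x)) next
    if (is.numeric(x) && !all(is.na(x))) {
      rng <- range(x, na.rm = TRUE)
      # max(..., 1) keeps the fill value below the minimum for constant columns.
      x[is.na(x)] <- rng[1] - offset * max(diff(rng), 1)
    } else if (is.factor(x)) {
      levels(x) <- c(levels(x), new_level)
      x[is.na(x)] <- new_level
    }
    df[[col]] <- x
  }
  df
}

df <- data.frame(age = c(30, NA, 50), grp = factor(c("a", NA, "b")))
filled <- impute_out_of_range(df)
```

Columns that are entirely missing, or of a type other than numeric or factor, are left untouched by this sketch.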

[test no. 2] missMDA_MFA (PipeOpTaskPreproc)

test R script: script
log: missMDA_MFA log
successful usage: 1/10 tasks
Tasks in which missing values were probably left after imputation (the same situation as discussed here):

Task 3543: irish
Task 29: credit-approval
Task 3830: cars
Task 48: heart-c
Task 3847: analcatdata_draft
During imputation, no errors were thrown other than those linked above.

[test no. 2] missMDA_MFA (PipeOpTaskPreproc)

test R script: script
log: missMDA_MFA log
successful usage: 1/10 tasks

INFO [22:26:44.621] Applying learner 'imput_missMDA_MFA.encodeimpact.classif.glmnet' on task 'Task 14954: cylinder-bands (Supervised Classification)' (iter 1/5)
Error in if (any(MM[[g]] < 0)) stop(paste("The algorithm fails to converge. Choose a number of components (ncp) less or equal than ", :
missing value where TRUE/FALSE needed
