Comments (8)
I think the example is outdated, but I am not sure and need to check. I will add the code below as the example in my next push.
Try this (change the working directory in the setwd() call and the LightGBM path in the lgbm_path argument):
library(Laurae)
library(stringi)
library(Matrix)
library(sparsity)
library(data.table)
remove(list = ls()) # WARNING: CLEANS EVERYTHING IN THE ENVIRONMENT
setwd("D:/Data Science/HousePrices") # CHANGE THIS TO WHATEVER TEMPORARY DIRECTORY WHERE YOU WANT TEMPORARY FILES
DT <- data.table(Split1 = c(rep(0, 50), rep(1, 50)), Split2 = rep(c(rep(0, 25), rep(0.5, 25)), 2))
DT$Split3 <- rep(c(rep(0, 10), rep(0.25, 15)), 4)
DT$Split4 <- rep(c(rep(0, 5), rep(0.1, 5), rep(0, 5), rep(0.1, 10)), 4)
DT$Split5 <- rep(c(rep(0, 5), rep(0.05, 5), rep(0, 10), rep(0.05, 5)), 4)
# Only the last of the three label assignments below takes effect; the first two are overwritten
label <- c(rep(0, 25), rep(1, 25), rep(0, 25), rep(1, 25))
label <- as.numeric((DT$Split2 == 0) & (DT$Split1 == 0) & (DT$Split3 == 0))
label <- as.numeric((DT$Split2 == 0) & (DT$Split1 == 0) & (DT$Split3 == 0) & (DT$Split4 == 0) | ((DT$Split2 == 0.5) & (DT$Split1 == 1) & (DT$Split3 == 0.25) & (DT$Split4 == 0.1) & (DT$Split5 == 0)) | ((DT$Split1 == 0) & (DT$Split2 == 0.5)))
trained <- lgbm.cv(y_train = label,
x_train = DT,
bias_train = NA,
folds = 5,
unicity = TRUE,
application = "binary",
num_iterations = 1,
early_stopping_rounds = 1,
learning_rate = 5,
num_leaves = 16,
min_data_in_leaf = 1,
min_sum_hessian_in_leaf = 1,
tree_learner = "serial",
num_threads = 1,
lgbm_path = "C:/xgboost/LightGBM/windows/x64/Release/lightgbm.exe",
workingdir = file.path(getwd()),
validation = FALSE,
files_exist = FALSE,
verbose = TRUE,
is_training_metric = TRUE,
save_binary = TRUE,
metric = "binary_logloss")
str(trained)
I am getting this output:
***************
Fold no: 1 / 5
***************
Using LightGBM path: C:/xgboost/LightGBM/windows/x64/Release/lightgbm.exe
Working directory of LightGBM: D:/Data Science/HousePrices/temp
Training configuration file saved to: D:/Data Science/HousePrices/temp/lgbm_train.conf
Saving train data (data.table) file to: D:/Data Science/HousePrices/temp/lgbm_train.csv
No list columns are present. Setting sep2='' otherwise quote='auto' would quote fields containing sep2.
maxLineLen=24 from sample. Found in 0.000s
Writing column names ... done in 0.000s
Writing 80 rows in 1 batches of 80 rows (each buffer size 8MB, showProgress=1, nth=1) ... done (actual nth=1, anyBufferGrown=no, maxBuffUsed=0%)
Saving validation data (data.table) file to: D:/Data Science/HousePrices/temp/lgbm_val.csv
No list columns are present. Setting sep2='' otherwise quote='auto' would quote fields containing sep2.
maxLineLen=24 from sample. Found in 0.000s
Writing column names ... done in 0.000s
Writing 20 rows in 1 batches of 20 rows (each buffer size 8MB, showProgress=1, nth=1) ... done (actual nth=1, anyBufferGrown=no, maxBuffUsed=0%)
Starting to work on model as of Sat Dec 10 2016 10:25:44 PM
[LightGBM] [Info] Loading parameters .. finished
[LightGBM] [Info] Loading data set from binary file
[LightGBM] [Info] Finish loading data, use 0.000138 seconds
[LightGBM] [Info] Number of postive:27, number of negative:53
[LightGBM] [Info] Number of data:80, Number of features:5
[LightGBM] [Info] Finish training initilization.
[LightGBM] [Info] Start train
[LightGBM] [Info] cannot find more split with gain = 0.000000 , current #leaves=8
[LightGBM] [Info] Iteration:1, training's log loss: 0.000045
[LightGBM] [Info] 0.000052 seconds elapsed, finished 1 iteration
[LightGBM] [Info] Finish train
Model completed, results saved in D:/Data Science/HousePrices/temp
[LightGBM] [Info] Loading parameters .. finished
[LightGBM] [Info] 1 models has been loaded
[LightGBM] [Info] Finish predict initilization.
[LightGBM] [Info] Start prediction for data D:/Data Science/HousePrices/temp/lgbm_val.csv without label
[LightGBM] [Info] Finish predict.
Ended to work on model as of Sat Dec 10 2016 10:25:45 PM
***************
Fold no: 2 / 5
***************
Using LightGBM path: C:/xgboost/LightGBM/windows/x64/Release/lightgbm.exe
Working directory of LightGBM: D:/Data Science/HousePrices/temp
Training configuration file saved to: D:/Data Science/HousePrices/temp/lgbm_train.conf
Saving train data (data.table) file to: D:/Data Science/HousePrices/temp/lgbm_train.csv
No list columns are present. Setting sep2='' otherwise quote='auto' would quote fields containing sep2.
maxLineLen=24 from sample. Found in 0.000s
Writing column names ... done in 0.000s
Writing 80 rows in 1 batches of 80 rows (each buffer size 8MB, showProgress=1, nth=1) ... done (actual nth=1, anyBufferGrown=no, maxBuffUsed=0%)
Saving validation data (data.table) file to: D:/Data Science/HousePrices/temp/lgbm_val.csv
No list columns are present. Setting sep2='' otherwise quote='auto' would quote fields containing sep2.
maxLineLen=24 from sample. Found in 0.000s
Writing column names ... done in 0.000s
Writing 20 rows in 1 batches of 20 rows (each buffer size 8MB, showProgress=1, nth=1) ... done (actual nth=1, anyBufferGrown=no, maxBuffUsed=0%)
Starting to work on model as of Sat Dec 10 2016 10:25:45 PM
[LightGBM] [Info] Loading parameters .. finished
[LightGBM] [Info] Loading data set from binary file
[LightGBM] [Info] Finish loading data, use 0.000140 seconds
[LightGBM] [Info] Number of postive:27, number of negative:53
[LightGBM] [Info] Number of data:80, Number of features:5
[LightGBM] [Info] Finish training initilization.
[LightGBM] [Info] Start train
[LightGBM] [Info] cannot find more split with gain = 0.000000 , current #leaves=8
[LightGBM] [Info] Iteration:1, training's log loss: 0.000045
[LightGBM] [Info] 0.000076 seconds elapsed, finished 1 iteration
[LightGBM] [Info] Finish train
Model completed, results saved in D:/Data Science/HousePrices/temp
[LightGBM] [Info] Loading parameters .. finished
[LightGBM] [Info] 1 models has been loaded
[LightGBM] [Info] Finish predict initilization.
[LightGBM] [Info] Start prediction for data D:/Data Science/HousePrices/temp/lgbm_val.csv without label
[LightGBM] [Info] Finish predict.
Ended to work on model as of Sat Dec 10 2016 10:25:46 PM
***************
Fold no: 3 / 5
***************
Using LightGBM path: C:/xgboost/LightGBM/windows/x64/Release/lightgbm.exe
Working directory of LightGBM: D:/Data Science/HousePrices/temp
Training configuration file saved to: D:/Data Science/HousePrices/temp/lgbm_train.conf
Saving train data (data.table) file to: D:/Data Science/HousePrices/temp/lgbm_train.csv
No list columns are present. Setting sep2='' otherwise quote='auto' would quote fields containing sep2.
maxLineLen=24 from sample. Found in 0.000s
Writing column names ... done in 0.000s
Writing 80 rows in 1 batches of 80 rows (each buffer size 8MB, showProgress=1, nth=1) ... done (actual nth=1, anyBufferGrown=no, maxBuffUsed=0%)
Saving validation data (data.table) file to: D:/Data Science/HousePrices/temp/lgbm_val.csv
No list columns are present. Setting sep2='' otherwise quote='auto' would quote fields containing sep2.
maxLineLen=24 from sample. Found in 0.000s
Writing column names ... done in 0.000s
Writing 20 rows in 1 batches of 20 rows (each buffer size 8MB, showProgress=1, nth=1) ... done (actual nth=1, anyBufferGrown=no, maxBuffUsed=0%)
Starting to work on model as of Sat Dec 10 2016 10:25:47 PM
[LightGBM] [Info] Loading parameters .. finished
[LightGBM] [Info] Loading data set from binary file
[LightGBM] [Info] Finish loading data, use 0.000151 seconds
[LightGBM] [Info] Number of postive:27, number of negative:53
[LightGBM] [Info] Number of data:80, Number of features:5
[LightGBM] [Info] Finish training initilization.
[LightGBM] [Info] Start train
[LightGBM] [Info] cannot find more split with gain = 0.000000 , current #leaves=8
[LightGBM] [Info] Iteration:1, training's log loss: 0.000045
[LightGBM] [Info] 0.000050 seconds elapsed, finished 1 iteration
[LightGBM] [Info] Finish train
Model completed, results saved in D:/Data Science/HousePrices/temp
[LightGBM] [Info] Loading parameters .. finished
[LightGBM] [Info] 1 models has been loaded
[LightGBM] [Info] Finish predict initilization.
[LightGBM] [Info] Start prediction for data D:/Data Science/HousePrices/temp/lgbm_val.csv without label
[LightGBM] [Info] Finish predict.
Ended to work on model as of Sat Dec 10 2016 10:25:48 PM
***************
Fold no: 4 / 5
***************
Using LightGBM path: C:/xgboost/LightGBM/windows/x64/Release/lightgbm.exe
Working directory of LightGBM: D:/Data Science/HousePrices/temp
Training configuration file saved to: D:/Data Science/HousePrices/temp/lgbm_train.conf
Saving train data (data.table) file to: D:/Data Science/HousePrices/temp/lgbm_train.csv
No list columns are present. Setting sep2='' otherwise quote='auto' would quote fields containing sep2.
maxLineLen=24 from sample. Found in 0.000s
Writing column names ... done in 0.000s
Writing 80 rows in 1 batches of 80 rows (each buffer size 8MB, showProgress=1, nth=1) ... done (actual nth=1, anyBufferGrown=no, maxBuffUsed=0%)
Saving validation data (data.table) file to: D:/Data Science/HousePrices/temp/lgbm_val.csv
No list columns are present. Setting sep2='' otherwise quote='auto' would quote fields containing sep2.
maxLineLen=24 from sample. Found in 0.000s
Writing column names ... done in 0.000s
Writing 20 rows in 1 batches of 20 rows (each buffer size 8MB, showProgress=1, nth=1) ... done (actual nth=1, anyBufferGrown=no, maxBuffUsed=0%)
Starting to work on model as of Sat Dec 10 2016 10:25:48 PM
[LightGBM] [Info] Loading parameters .. finished
[LightGBM] [Info] Loading data set from binary file
[LightGBM] [Info] Finish loading data, use 0.000135 seconds
[LightGBM] [Info] Number of postive:27, number of negative:53
[LightGBM] [Info] Number of data:80, Number of features:5
[LightGBM] [Info] Finish training initilization.
[LightGBM] [Info] Start train
[LightGBM] [Info] cannot find more split with gain = 0.000000 , current #leaves=8
[LightGBM] [Info] Iteration:1, training's log loss: 0.000045
[LightGBM] [Info] 0.000070 seconds elapsed, finished 1 iteration
[LightGBM] [Info] Finish train
Model completed, results saved in D:/Data Science/HousePrices/temp
[LightGBM] [Info] Loading parameters .. finished
[LightGBM] [Info] 1 models has been loaded
[LightGBM] [Info] Finish predict initilization.
[LightGBM] [Info] Start prediction for data D:/Data Science/HousePrices/temp/lgbm_val.csv without label
[LightGBM] [Info] Finish predict.
Ended to work on model as of Sat Dec 10 2016 10:25:49 PM
***************
Fold no: 5 / 5
***************
Using LightGBM path: C:/xgboost/LightGBM/windows/x64/Release/lightgbm.exe
Working directory of LightGBM: D:/Data Science/HousePrices/temp
Training configuration file saved to: D:/Data Science/HousePrices/temp/lgbm_train.conf
Saving train data (data.table) file to: D:/Data Science/HousePrices/temp/lgbm_train.csv
No list columns are present. Setting sep2='' otherwise quote='auto' would quote fields containing sep2.
maxLineLen=24 from sample. Found in 0.000s
Writing column names ... done in 0.000s
Writing 80 rows in 1 batches of 80 rows (each buffer size 8MB, showProgress=1, nth=1) ... done (actual nth=1, anyBufferGrown=no, maxBuffUsed=0%)
Saving validation data (data.table) file to: D:/Data Science/HousePrices/temp/lgbm_val.csv
No list columns are present. Setting sep2='' otherwise quote='auto' would quote fields containing sep2.
maxLineLen=24 from sample. Found in 0.000s
Writing column names ... done in 0.000s
Writing 20 rows in 1 batches of 20 rows (each buffer size 8MB, showProgress=1, nth=1) ... done (actual nth=1, anyBufferGrown=no, maxBuffUsed=0%)
Starting to work on model as of Sat Dec 10 2016 10:25:49 PM
[LightGBM] [Info] Loading parameters .. finished
[LightGBM] [Info] Loading data set from binary file
[LightGBM] [Info] Finish loading data, use 0.000138 seconds
[LightGBM] [Info] Number of postive:27, number of negative:53
[LightGBM] [Info] Number of data:80, Number of features:5
[LightGBM] [Info] Finish training initilization.
[LightGBM] [Info] Start train
[LightGBM] [Info] cannot find more split with gain = 0.000000 , current #leaves=8
[LightGBM] [Info] Iteration:1, training's log loss: 0.000045
[LightGBM] [Info] 0.000055 seconds elapsed, finished 1 iteration
[LightGBM] [Info] Finish train
Model completed, results saved in D:/Data Science/HousePrices/temp
[LightGBM] [Info] Loading parameters .. finished
[LightGBM] [Info] 1 models has been loaded
[LightGBM] [Info] Finish predict initilization.
[LightGBM] [Info] Start prediction for data D:/Data Science/HousePrices/temp/lgbm_val.csv without label
[LightGBM] [Info] Finish predict.
Ended to work on model as of Sat Dec 10 2016 10:25:50 PM
and
List of 3
$ Models :List of 5
..$ 1:List of 8
.. ..$ Model : chr [1:14] "max_feature_idx=-1" "sigmoid=1" "" "Tree=0" ...
.. ..$ Path : chr "D:/Data Science/HousePrices/temp"
.. ..$ Name : chr "lgbm_model.txt"
.. ..$ lgbm : chr "C:/xgboost/LightGBM/windows/x64/Release/lightgbm.exe"
.. ..$ Train : chr "lgbm_train.csv"
.. ..$ Valid : chr "lgbm_val.csv"
.. ..$ Test : logi NA
.. ..$ Validation: num [1:20] 1 1 1 1 1 ...
..$ 2:List of 8
.. ..$ Model : chr [1:14] "max_feature_idx=-1" "sigmoid=1" "" "Tree=0" ...
.. ..$ Path : chr "D:/Data Science/HousePrices/temp"
.. ..$ Name : chr "lgbm_model.txt"
.. ..$ lgbm : chr "C:/xgboost/LightGBM/windows/x64/Release/lightgbm.exe"
.. ..$ Train : chr "lgbm_train.csv"
.. ..$ Valid : chr "lgbm_val.csv"
.. ..$ Test : logi NA
.. ..$ Validation: num [1:20] 1 1 1 1 1 ...
..$ 3:List of 8
.. ..$ Model : chr [1:14] "max_feature_idx=-1" "sigmoid=1" "" "Tree=0" ...
.. ..$ Path : chr "D:/Data Science/HousePrices/temp"
.. ..$ Name : chr "lgbm_model.txt"
.. ..$ lgbm : chr "C:/xgboost/LightGBM/windows/x64/Release/lightgbm.exe"
.. ..$ Train : chr "lgbm_train.csv"
.. ..$ Valid : chr "lgbm_val.csv"
.. ..$ Test : logi NA
.. ..$ Validation: num [1:20] 1 1 1 1 1 ...
..$ 4:List of 8
.. ..$ Model : chr [1:14] "max_feature_idx=-1" "sigmoid=1" "" "Tree=0" ...
.. ..$ Path : chr "D:/Data Science/HousePrices/temp"
.. ..$ Name : chr "lgbm_model.txt"
.. ..$ lgbm : chr "C:/xgboost/LightGBM/windows/x64/Release/lightgbm.exe"
.. ..$ Train : chr "lgbm_train.csv"
.. ..$ Valid : chr "lgbm_val.csv"
.. ..$ Test : logi NA
.. ..$ Validation: num [1:20] 1 1 1 1 1 ...
..$ 5:List of 8
.. ..$ Model : chr [1:14] "max_feature_idx=-1" "sigmoid=1" "" "Tree=0" ...
.. ..$ Path : chr "D:/Data Science/HousePrices/temp"
.. ..$ Name : chr "lgbm_model.txt"
.. ..$ lgbm : chr "C:/xgboost/LightGBM/windows/x64/Release/lightgbm.exe"
.. ..$ Train : chr "lgbm_train.csv"
.. ..$ Valid : chr "lgbm_val.csv"
.. ..$ Test : logi NA
.. ..$ Validation: num [1:20] 1 1 1 1 1 ...
$ Validation:List of 2
..$ : num [1:100] 1 1 1 1 1 ...
..$ :List of 5
.. ..$ : num [1:20] 1 1 1 1 1 ...
.. ..$ : num [1:20] 1 1 1 1 1 ...
.. ..$ : num [1:20] 1 1 1 1 1 ...
.. ..$ : num [1:20] 1 1 1 1 1 ...
.. ..$ : num [1:20] 1 1 1 1 1 ...
$ Weights : num [1:5] 0.2 0.2 0.2 0.2 0.2
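The structure above suggests that the first element of Validation holds one prediction per training row. Assuming those are out-of-fold predictions in the original row order (an assumption, the thread does not confirm it), a cross-validated log loss could be computed along these lines. The logloss helper, label_mock and trained_mock are hypothetical stand-ins with made-up values, not objects from the Laurae package:

```r
# Binary log loss; predictions are clipped away from 0 and 1 to avoid log(0).
logloss <- function(y, p, eps = 1e-15) {
  p <- pmin(pmax(p, eps), 1 - eps)
  -mean(y * log(p) + (1 - y) * log(1 - p))
}

# Mock objects mirroring the shape shown by str(trained); values are made up.
label_mock <- c(1, 0, 1, 0)
trained_mock <- list(Validation = list(c(0.9, 0.1, 0.8, 0.2)))

# Assumed: the first Validation element is the out-of-fold prediction vector.
oof <- trained_mock$Validation[[1]]
cv_logloss <- logloss(label_mock, oof)
print(cv_logloss)
```

With the real result, label and trained$Validation[[1]] would take the place of the mock objects, provided the row-order assumption holds.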
from laurae.
Thanks!
from laurae.
(You can close this if you want or leave it open)
from laurae.
Another (potentially silly) question: If I followed the installation guide in the readme for linux, what might my lightgbm path be?
from laurae.
I fixed the LightGBM functions' documentation in commit @4fe8e2b35acabbe8979cd3181dca8f004a03ee38.
Another (potentially silly) question: If I followed the installation guide in the readme for linux, what might my lightgbm path be?
Your LightGBM executable should be in the same directory as your LightGBM download.
You can check where it was compiled by running this from your LightGBM folder:
ls -d */
If you installed into a folder named "(...)/LightGBM", the lgbm_path should be "(...)/LightGBM/lightgbm" (unless my memory is wrong, the build creates the executable in the root directory of that folder; you do not need to specify the extension, the shell takes care of it automatically).
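The same search can be done from R instead of the shell. This is only a sketch: find_lightgbm is a hypothetical helper (not part of Laurae), and "~/LightGBM" is a placeholder for wherever the repository was actually cloned.

```r
# Hypothetical helper: search a LightGBM checkout for the command-line binary.
find_lightgbm <- function(root) {
  # With recursive = TRUE, list.files() returns files only, not directories.
  list.files(root, pattern = "^lightgbm(\\.exe)?$",
             recursive = TRUE, full.names = TRUE)
}

# Sys.which() returns "" when the command is not on the PATH.
print(Sys.which("lightgbm"))

# Placeholder location; returns character(0) if nothing is found there.
print(find_lightgbm(path.expand("~/LightGBM")))
```

Any path returned this way is a candidate value for lgbm_path.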
from laurae.
I didn't even have lightgbm installed! lol. So for future reference, this error means lightgbm isn't installed, or you're pointing at the wrong path:
Error in outputs[["Models"]][[i]][["Validation"]] :
subscript out of bounds
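A quick pre-flight check in base R can catch a bad path before lgbm.cv runs. This is a minimal sketch; check_lgbm_path is a hypothetical helper name, not a function from the Laurae package, and the execute-permission test via file.access may be unreliable on Windows.

```r
# Hypothetical helper (not part of Laurae): classify a candidate lgbm_path.
check_lgbm_path <- function(lgbm_path) {
  if (!file.exists(lgbm_path)) {
    return("missing")          # nothing exists at that location
  }
  if (dir.exists(lgbm_path)) {
    return("directory")        # a folder, not the executable itself
  }
  # file.access() returns 0 when the permission is granted; mode = 1 is execute.
  if (file.access(lgbm_path, mode = 1) != 0) {
    return("not executable")
  }
  "ok"
}

print(check_lgbm_path(tempdir()))
print(check_lgbm_path(file.path(tempdir(), "no_such_lightgbm_binary")))
```

Only a result of "ok" means the path is worth passing on as lgbm_path.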
from laurae.
I also got this by omitting the path.
***************
Fold no: 1 / 5
***************
Error in outputs[["Models"]][[i]][["Validation"]] :
subscript out of bounds
I installed on OS X as shown here...
cannot install lightgbm in R with devtools on macOS
Doing the R install as shown there with...
R CMD INSTALL --build . --no-multiarch
I believe this installs to the default R package location as shown by...
> .libPaths()
[1] "/Library/Frameworks/R.framework/Versions/3.4/Resources/library"
system("ls -l /Library/Frameworks/R.framework/Versions/3.4/Resources/library/lightgbm")
total 32
-rw-rw-r-- 1 mjh admin 2027 Jun 23 17:48 DESCRIPTION
-rw-rw-r-- 1 mjh admin 2044 Jun 23 17:50 INDEX
Might it be possible to make the .libPaths() location the default path?
I just tried...
lgbm_path = '/Library/Frameworks/R.framework/Versions/3.4/Resources/library',
and got...
***************
Fold no: 1 / 5
***************
done (actual nth=1, anyBufferGrown=no, maxBuffUsed=35%)
Saving validation data (data.table) file to: /Users/mjh/ml/kaggle/HomeCredit/code/lgbm_val_1.csv
No list columns are present. Setting sep2='' otherwise quote='auto' would quote fields containing sep2.
Column writers: 3 12 12 12 12 3 5 5 5 5 12 12 12 12 12 5 3 5 5 3 5 3 3 12 5 3 3 12 3 12 ... 5 5 5 5 3 5 5 5 5 5
maxLineLen=1559 from sample. Found in 0.016s
Writing column names ... done in 0.000s
Writing 61502 rows in 23 batches of 2690 rows (each buffer size 8MB, showProgress=1, nth=1) ... done (actual nth=1, anyBufferGrown=no, maxBuffUsed=35%)
Starting to work on model as of Tue Jun 26 2018 08:46:11
/bin/sh: /Library/Frameworks/R.framework/Versions/3.4/Resources/library: is a directory
Model completed, results saved in /Users/mjh/ml/kaggle/HomeCredit/code
Error in file(con, "r") : cannot open the connection
In addition: Warning message:
In file(con, "r") :
cannot open file '/Users/mjh/ml/kaggle/HomeCredit/code/lgbm_model_1.txt': No such file or directory
It successfully wrote the .conf and train_1.csv and val_1.csv files. I'm not sure what the other errors are about: it appears to look for a /bin/sh type executable, and then the connection fails because there is no model_1.txt.
from laurae.
On macOS, the lgbm_path is the location of the Unix executable that you built from source.
In my case it was in my Downloads folder, so the lgbm_path value would be something like "/Downloads/LightGBM/lightgbm"
from laurae.