gwann's Introduction

Geographically Weighted Artificial Neural Network

System Requirements

Java JDK 1.2 or higher (for JRI/REngine JDK 1.4 or higher). If it is not already installed, you can get it here.
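
Since gwann accesses Java through rJava, a quick optional sanity check is to start the JVM from R and query its version. This is only a sketch of such a check, not part of the gwann workflow:

library(rJava)
.jinit()   # start the JVM; this fails if no usable Java installation is found
.jcall("java/lang/System","S","getProperty","java.version")   # returns the Java version string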

Install

Sys.setenv("R_REMOTES_NO_ERRORS_FROM_WARNINGS" = "true")
if (!require("devtools"))
   install.packages("devtools",INSTALL_opts="--no-multiarch")
devtools::install_github("jhagenauer/gwann")

Example

options(java.parameters="-Xmx8g")

library(viridis)
library(gwann)
library(ggplot2)

data(toy4)

x<-as.matrix(toy4[,c("x1","x2")])
y<-as.numeric(toy4[,c("y")] )
dm<-as.matrix(dist(toy4[,c("lon","lat")])  )
idx_pred<-sample(nrow(x),0.3*nrow(x)) # indices of prediction samples

r<-gwann(x_train=x[-idx_pred,],y_train=y[-idx_pred],w_train=dm[-idx_pred,-idx_pred],
     x_pred=x[idx_pred,],w_pred=dm[-idx_pred,idx_pred],
     nrHidden=4,batchSize=50,lr=0.1,optimizer="adam",cv_patience=9999,
     adaptive=F,
     bwSearch="goldenSection",bwMin=min(dm)/4, bwMax=max(dm)/4,
     threads=8
)
p<-diag(r$predictions)
print(paste("RMSE: ",sqrt(mean((p-y[s_test])^2))))
print(paste("Iterations: ",r$iterations))
print(paste("Bandwidth: ",r$bandwidth))

# plot predictions
s<-cbind( Prediction=p, toy4[idx_pred,c("lon","lat")] )
ggplot(s,aes(lon,lat,fill=Prediction)) + geom_raster() + scale_fill_viridis() + coord_fixed()

Note

  • If you get java.lang.OutOfMemoryError: Java heap space, put options(java.parameters="-Xmx8g") before loading the package and adjust the value to your available memory. For this to take effect, you most likely have to restart R/RStudio.
  • The learning rate (lr), the batch size (batchSize), and the number of hidden neurons (nrHidden) have a substantial effect on performance and should therefore be chosen carefully. (The number of iterations and the bandwidth are also important, but by default they are determined automatically by GWANN using cross-validation.)
  • In particular, large values of batchSize (e.g., 50% of the total data size) have often proven useful.
  • Transforming the data so that their distributions are approximately normal often improves the performance of GWANN (see the sketch after this list).
  • Test different optimizers, e.g., 'nesterov' and 'adam'.
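
A minimal sketch of such a transformation, reusing x, y, dm, and idx_pred from the example above: the predictors are standardized with scale() and the response is log-transformed with log1p(). These particular transformations are illustrative assumptions, not recommendations of the package.

# illustrative pre-processing: standardize predictors, reduce skewness of the response
x_t<-scale(x)               # zero mean, unit variance per column
y_t<-log1p(y-min(y))        # shift to non-negative values, then log-transform

r<-gwann(x_train=x_t[-idx_pred,],y_train=y_t[-idx_pred],w_train=dm[-idx_pred,-idx_pred],
     x_pred=x_t[idx_pred,],w_pred=dm[-idx_pred,idx_pred],
     nrHidden=4,batchSize=50,lr=0.1,optimizer="adam",cv_patience=9999,
     adaptive=F,
     bwSearch="goldenSection",bwMin=min(dm)/4,bwMax=max(dm)/4,
     threads=8
)

# back-transform the predictions to the original scale of y
p<-expm1(diag(r$predictions))+min(y)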

References

Julian Hagenauer & Marco Helbich (2022) A geographically weighted artificial neural network, International Journal of Geographical Information Science, 36:2, 215-235, DOI: 10.1080/13658816.2021.1871618

gwann's Issues

Is it possible to export only the trained GWANN model?

Hello, jhagenauer,

I would like to know if it would be possible to export only the trained model for later prediction of values.

In my case, I have a dataset with 6253 observations and 14 independent variables. I want to test the impact on prediction by changing some independent variables. It has been computationally expensive to run the entire model to perform this test. Would it be possible to separate the code into training and prediction?

I thank you in advance.

Error installing the gwann package

Hello, Mr. Julian.
I tried to install gwann with this command on Windows 10:
Sys.setenv("R_REMOTES_NO_ERRORS_FROM_WARNINGS" = "true")
if (!require("devtools"))
install.packages("devtools")
devtools::install_github("jhagenauer/gwann",INSTALL_opts=c("--no-multiarch"))

And I got this error:
Downloading GitHub repo jhagenauer/gwann@HEAD
Error in utils::download.file(url, path, method = method, quiet = quiet, :
download from 'https://api.github.com/repos/jhagenauer/gwann/tarball/HEAD' failed
Warning message:
In devtools::install_github("jhagenauer/gwann", INSTALL_opts = c("--no-multiarch")) :
Arguments in ... must be used.
✖ Problematic argument:
• INSTALL_opts = c("--no-multiarch")
ℹ Did you misspell an argument name?

Do you have any suggestions on how to solve the problem?
Thank you so much in advance. May you have a blessed day.

rJava not recognised

Hi there,

I have installed and loaded rJava, but when I run the following code, I get the error message:

SCRIPT

library(rJava)

Sys.setenv("R_REMOTES_NO_ERRORS_FROM_WARNINGS" = "true")
if (!require("devtools"))
install.packages("devtools",INSTALL_opts="--no-multiarch")
devtools::install_github("jhagenauer/gwann")

CONSOLE

Error: package or namespace load failed for 'rJava':
.onLoad failed in loadNamespace() for 'rJava', details:
call: fun(libname, pkgname)
error: JAVA_HOME cannot be determined from the Registry
Error : package 'rJava' could not be loaded
Error: loading failed
Execution halted
*** arch - x64
ERROR: loading failed for 'i386'

  • removing 'C:/Users/patri/Documents/R/win-library/4.1/gwann'
    Warning message:

The method of organizing test data may affect the prediction results

Hello, jhagenauer,

Recently I have done some analysis using this wonderful package. A problem I came across is that when I organize the data in the test set in different ways, the prediction results may change. This really confused me. It can be illustrated with the example code in README.md. Below I give three cases to describe the problem in detail.

Case One

library(viridis)
library(gwann)
library(ggplot2)

set.seed(1)
data(toy4)

x<-as.matrix(toy4[,c("x1","x2")])
y<-as.numeric(toy4[,c("y")] )
dm<-as.matrix(dist(toy4[,c("lon","lat")])  )
s_test<-sample(nrow(x),0.3*nrow(x)) # indices of test samples

x_pred1 <- x[s_test,]
w_pred1 <- dm[-s_test, s_test]
r<-gwann(x_train=x[-s_test,],y_train=y[-s_test],w_train=dm[-s_test,-s_test],
         x_pred=x_pred1,w_pred=w_pred1,
         nrHidden=30,batchSize=50,lr=0.1,
         adaptive=F,
         bwSearch="goldenSection",bwMin=min(dm)/4, bwMax=max(dm)/4,
         threads=8
)
p<-diag(r$predictions)
print(p[1:5])

This is exactly the example code in the README, except that I have added a seed to make the results comparable. The prediction results, i.e., the variable p in the code, are what I am interested in.

The output is

Golden section search...
Cross-validation results for hyperparameter search (folds: 10, repeats: 1):
* Bandwidth: 2.0720938004226177
* Iterations: 1606
* RMSE: 0.3637624585267835
Building final model with bandwidth 2.0720938004226177 and 1606 iterations...
[1] 4.6746140 8.2461332 0.6771165 0.2575546 2.5992905

Case Two

library(viridis)
library(gwann)
library(ggplot2)

set.seed(1)
data(toy4)

x<-as.matrix(toy4[,c("x1","x2")])
y<-as.numeric(toy4[,c("y")] )
dm<-as.matrix(dist(toy4[,c("lon","lat")])  )
s_test<-sample(nrow(x),0.3*nrow(x)) # indices of test samples

x_pred2 <- x[s_test[1:5],]
w_pred2 <- dm[-s_test, s_test[1:5]]
r<-gwann(x_train=x[-s_test,],y_train=y[-s_test],w_train=dm[-s_test,-s_test],
         x_pred=x_pred2,w_pred=w_pred2,
         nrHidden=30,batchSize=50,lr=0.1,
         adaptive=F,
         bwSearch="goldenSection",
         bwMin=min(dm)/4, bwMax=max(dm)/4,
         threads=8
)
p<-diag(r$predictions)
print(p[1:5])

The output is

Golden section search...
Cross-validation results for hyperparameter search (folds: 10, repeats: 1):
* Bandwidth: 2.0720938004226177
* Iterations: 1606
* RMSE: 0.3637624585267835
Building final model with bandwidth 2.0720938004226177 and 1606 iterations...
[1] 4.6890841 8.4763250 0.4011214 0.4280682 2.5759752

Here I select only the first five observations of the test data from Case One to form a new test set x_pred2 and the corresponding weights w_pred2. From the output we can see that although the cross-validation results are the same, the prediction results are quite different from those in Case One (especially for the third and fourth observations), which is very strange.

Case Three

library(viridis)
library(gwann)
library(ggplot2)

set.seed(1)
data(toy4)

x<-as.matrix(toy4[,c("x1","x2")])
y<-as.numeric(toy4[,c("y")] )
dm<-as.matrix(dist(toy4[,c("lon","lat")])  )
s_test<-sample(nrow(x),0.3*nrow(x)) # indices of test samples

s_test3 <- s_test
s_test3[2] <- s_test3[1]
x_pred3 <- x[s_test3,]
w_pred3 <- dm[-s_test, s_test3]
r<-gwann(x_train=x[-s_test,],y_train=y[-s_test],w_train=dm[-s_test,-s_test],
         x_pred=x_pred3,w_pred=w_pred3,
         nrHidden=30,batchSize=50,lr=0.1,
         adaptive=F,
         bwSearch="goldenSection",
         bwMin=min(dm)/4, bwMax=max(dm)/4,
         threads=8
)
p<-diag(r$predictions)
print(p[1:5])

The output is

Golden section search...
Cross-validation results for hyperparameter search (folds: 10, repeats: 1):
* Bandwidth: 2.0720938004226177
* Iterations: 1606
* RMSE: 0.3637624585267835
Building final model with bandwidth 2.0720938004226177 and 1606 iterations...
[1] 4.6732400 4.6189392 0.6733432 0.2431097 2.5998484

In this case, I replace the second observation of the test set in Case One with the first observation, so the first two observations are identical. I would expect the predicted values of these two observations to be the same (because their data and distance vectors are the same). However, the output shows that they are different.


In summary, at least two strange things happen in the above three cases:

  1. Using a subset of the test data gives different predictions (Case One vs. Case Two);
  2. Two identical observations in the test set yield different predictions (Case Three).

Did I do something wrong? Hope for your help!

R aborted when loading gwann

Hi Julian, may I know which version of rJava is required? Every time I try to load gwann (this line: library(gwann)), the R session is aborted.

Is gwann syntax only for testing data?

Hello Hagenauer,
I would like to ask whether the gwann syntax is currently only meant for test data.
If so, can gwann not output the predictions for every location?

I thank you in advance.

Error in .jcall

Hello Mr. Julian,
I tried your program code and encountered an error in .jcall:
SCRIPT :
r<-.jcall(obj="supervised.nnet.gwann.GWANN_RInterface",method="run",returnSig = "Lsupervised/nnet/gwann/Return_R;",

      .jarray(x_train,dispatch=T),
      y_train,
      .jarray(w_train,dispatch=T),
      
      .jarray(x_pred,dispatch=T),
      y_pred,
      .jarray(w_pred,dispatch=T),
      
      norm,nrHidden,batchSize,optimizer,lr,linOut,
      kernel,bandwidth,adaptive,
      bwSearch,
      
      bwMin,bwMax,steps,iterations,patience,folds,repeats,permutations,threads)

CONSOLE :
Error in .jcall(obj = "supervised.nnet.gwann.GWANN_RInterface", method = "run", :
method run with signature ([[D[D[[D[[D[D[[DZDDLjava/lang/String;DZLjava/lang/String;DZLjava/lang/String;DDDDDDDDD)Lsupervised/nnet/gwann/Return_R; not found

Do you have any suggestions on how to solve the problem?
Thank you so much in advance. May you have a blessed day

Failed to install 'gwann'

Hi, I tried to install gwann with this command on Windows 10:
devtools::install_github("jhagenauer/gwann",INSTALL_opts=c("--no-multiarch"))
I got this error:
Error: (converted from warning) package 'rJava' was built under R version 3.6.3
Execution halted
ERROR: lazy loading failed for package 'gwann'
* removing 'C:/Users/boenz/Documents/R/win-library/3.6/gwann'
Error: Failed to install 'gwann' from GitHub:
(converted from warning) installation of package
'C:/Users/boenz/AppData/Local/Temp/Rtmp6pqJmL/filed243a522a3/gwann_0.0.1.tar.gz' had non-zero exit status
Do you have any suggestions on how to solve the problem?

Thank you very much!
