hturner / gnm Goto Github PK
View Code? Open in Web Editor NEWgnm package for generalized nonlinear models in R
License: GNU General Public License v2.0
gnm package for generalized nonlinear models in R
License: GNU General Public License v2.0
Please see the problems shown on
https://www.stats.ox.ac.uk/pub/bdr/BLAS/gnm.out
For reproduction details see
https://www.stats.ox.ac.uk/pub/bdr/BLAS/README.txt .
Hi there,
I'm wondering whether users are allowed to set upper and lower boundary for nonlinear parameters when fitting a GNM model.
Thanks,
Xiaotong
Prepare for release:
devtools::build_readme()
urlchecker::url_check()
devtools::check(remote = TRUE, manual = TRUE)
devtools::check_win_devel()
rhub::check_for_cran()
rhub::check(platform = 'ubuntu-rchk')
rhub::check_with_sanitizers()
revdepcheck::revdep_check(num_workers = 4)
cran-comments.md
Submit to CRAN:
usethis::use_version('patch')
devtools::submit_cran()
Wait for CRAN...
usethis::use_github_release()
usethis::use_dev_version()
The function Symm()
cannot handle more than two factors. This means that the information on the help page of Symm()
as well as in the vignette, viz. where it is said that ...
take take more than two factors, is misleading.
An illustration comes from trying to implement a Complete Symmetry model which Agresti (2010: 242-246) describes. The data are Table 8.5 in Agresti (2010: 242) and can be created as follows:
df8_5 <- data.frame(
A = rep(1:3, each = 9),
B = rep(rep(1:3, each = 3), times = 3),
C = rep(1:3, times = 9),
Freq = c(6, 4, 5, 3, 13, 10, 1, 8, 14,
2, 3, 2, 1, 3, 1, 2, 1, 2,
1, 0, 2, 0, 0, 0, 1, 1, 0)
)
The Symm()
function works properly when used with two factors:
with(df8_5, Symm(A, B))
[1] 1:1 1:1 1:1 1:2 1:2 1:2 1:3 1:3 1:3 1:2 1:2 1:2 2:2 2:2 2:2 2:3 2:3 2:3 1:3 1:3
[21] 1:3 2:3 2:3 2:3 3:3 3:3 3:3
Levels: 1:1 1:2 1:3 2:2 2:3 3:3
with(df8_5, Symm(A, C))
[1] 1:1 1:2 1:3 1:1 1:2 1:3 1:1 1:2 1:3 1:2 2:2 2:3 1:2 2:2 2:3 1:2 2:2 2:3 1:3 2:3
[21] 3:3 1:3 2:3 3:3 1:3 2:3 3:3
Levels: 1:1 1:2 1:3 2:2 2:3 3:3
with(df8_5, Symm(B, C))
[1] 1:1 1:2 1:3 1:2 2:2 2:3 1:3 2:3 3:3 1:1 1:2 1:3 1:2 2:2 2:3 1:3 2:3 3:3 1:1 1:2
[21] 1:3 1:2 2:2 2:3 1:3 2:3 3:3
Levels: 1:1 1:2 1:3 2:2 2:3 3:3
However, when used with three factors, the Symm()
function returns a factor with the wrong levels and only containing NA
values:
with(df8_5, Symm(A, B, C))
[1] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
[17] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
Levels: 1:1 1:2 1:3 2:2 2:3 3:3
When gnm is used in a function it looks in the global environment for it rather than inside the function environment.
Hi,
I'm wondering how do you estimate the dispersion parameter when doing conditional Poisson regression? I called gnm() and eliminate the strata (i.e. group, class) and set family = quasipoisson. My guess is this function first do conditional Poisson (which is a multinomial regression) and get the estimates of covariates' effects, then estimate the dispersion parameter (assumed common, for all stratae?) using the moment estimator. Can you clarify the algorithm, or provide any references? Thank you!
usethis::use_roxygen_md()
usethis::use_pkgdown_github_pages()
usethis::use_tidy_description()
usethis::use_package_doc()
@importFrom
directives here. usethis::use_import_from()
is handy for this.usethis::use_testthat(3)
and upgrade to 3e, testthat 3e vignetteR/
files and test/
files for workflow happiness. The docs for usethis::use_r()
include a helpful script. usethis::rename_files()
may be be useful.usethis::use_code_of_conduct()
Set up or update GitHub Actions.
Updating workflows to the latest version will often fix troublesome actions:
usethis::use_github_action('check-standard')
usethis::use_github_action('test-coverage')
Created on 2023-08-26 with usethis::use_upkeep_issue()
, using usethis v2.2.0
summary() takes a while on gnm models when there are many parameters, compared to glm(), perhaps due to parameter identification checks. Can this be improved?
(c.f. email query from @nalimilan)
Prepare for release:
git pull
urlchecker::url_check()
devtools::build_readme()
devtools::check(remote = TRUE, manual = TRUE)
devtools::check_win_devel()
revdepcheck::revdep_check(num_workers = 4)
cran-comments.md
git push
Submit to CRAN:
usethis::use_version('patch')
devtools::submit_cran()
Wait for CRAN...
usethis::use_github_release()
usethis::use_dev_version(push = TRUE)
I believe there is a subsetting bug in confint.profile.gnm()
which is triggered when there is only 1 covariate. Consider
library(gnm)
d <- as.data.frame(cautres)
fit <- gnm(Freq ~ vote, eliminate=class, family=poisson, data=d)
confint(fit)
For me, this returns an error message which appears to be due to lines 30-31 in file confint.profile.gnm.R:
std.err <- attr(object, "summary")$coefficients[, "Std. Error"]
parm <- parm[!is.na(std.err)[parm]]
After replacing these lines with
std.err <- attr(object, "summary")$coefficients[, "Std. Error", drop=FALSE]
parm <- parm[!is.na(std.err)[parm, ]]
The error disappears.
Many thanks for developing gnm
and making it freely available! Conditional Poisson regression is very useful in epidemiology.
Similar issue to here: hturner/BradleyTerry2#7
Consider adding a reference to the paper by van der Waal et al discussing the usefulness of Diagonal Reference Models compared to other standard types of analysis in Public Health: https://link.springer.com/article/10.1007/s00038-017-1018-x
@nalimilan implemented this for gnm in general so perhaps could be incorporated in gnm (with appropriate credit of course).
Calling gnm with 'eliminate = NULL' leads to failure despite the fact that NULL is the default value of eliminate. The reason is the code snippet
if (!missing(eliminate)) {
eliminate <- modelData$(eliminate)
if (!is.factor(eliminate))
stop("'eliminate' must be a factor")
Wouldn't it be better to replace the first line with 'if (!is.null(eliminate))'?
I had posted the same question in Stackoverflow.
https://stackoverflow.com/questions/77269879/how-to-extract-the-result-from-gnm-object-by-using-broomtidy-statsconfint-wi/77270019?noredirect=1#comment136222463_77270019
jared_mamrot
https://stackoverflow.com/users/12957340/jared-mamrot and I had attempted different methods, including
for loop,lapply(), Map() and purrr::map(), and still could not extract the result from broom::tidy.
I know that broom::tidy.gnm just directly uses the profile.gnm to obtain the estimate and confidence intervals from the gnm object. jared_mamrot
suspects that the error related to how the function handles the args.
Here is the sample dataset and the code I used:
library(tidyverse)
library(gnm)
library(broom)
library(haven)
data = read_dta('londondataset2002_2006.dta')
data$ozone10 <- data$ozone/10
# GENERATE MONTH AND YEAR
data$month <- as.factor(months(data$date))
data$year <- as.factor(format(data$date, format="%Y") )
data$dow <- as.factor(weekdays(data$date))
data$stratum <- as.factor(data$year:data$month:data$dow)
data <- data[order(data$date),]
# FIT A CONDITIONAL POISSON MODEL WITH A YEAR X MONTH X DOW STRATA
modelcpr = vector(mode = 'list',length = 12)
for(i in 1:12){
modelcpr1 <- gnm(numdeaths ~ ozone10 +
lag(temperature,i), data=data, family=poisson,
eliminate=factor(stratum))
modelcpr[[i]] = broom::tidy(modelcpr1,conf.int = T, exponentiate = T) %>% slice(n()) %>%
select(estimate, conf.low, conf.high) %>% as.matrix %>% unname
}
#vs
#only for i = 1
modelcpr1 <- gnm(numdeaths ~ ozone10 +
lag(temperature,1), data=data, family=poisson,
eliminate=factor(stratum))
#broom::tidy
broom::tidy(modelcpr1,conf.int = T, exponentiate = T) %>% slice(n()) %>%
select(estimate, conf.low, conf.high) %>% as.matrix %>% unname
The dataset and part of the code are from this paper:
https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/1471-2288-14-122#Sec13
After running the forloop, the following error popped up:
modelcpr = vector(mode = 'list',length = 12)
for(i in 1:12){
modelcpr1 <- gnm(numdeaths ~ ozone10 +
lag(temperature,i), data=data, family=poisson,
eliminate=factor(stratum))
modelcpr[[i]] = broom::tidy(modelcpr1,conf.int = T, exponentiate = T) %>% slice(n()) %>%
select(estimate, conf.low, conf.high) %>% as.matrix %>% unname
}
Error in profile.gnm(object, which = parm, alpha = 1 - level, trace = trace) :
profiling has found a better solution, so original fit had not converged
In addition: Warning message:
In sqrt((deviance(updated) - fittedDev)/disp) : NaNs produced
where i = 1.
However, when I run the for loop one by one:
> modelcpr1 <- gnm(numdeaths ~ ozone10 +
+ lag(temperature,1), data=data, family=poisson,
+ eliminate=factor(stratum))
>
> #broom::tidy
> broom::tidy(modelcpr1,conf.int = T, exponentiate = T) %>% slice(n()) %>%
+ select(estimate, conf.low, conf.high) %>% as.matrix %>% unname
[,1] [,2] [,3]
[1,] 1.000446 0.9988817 1.002013
I can obtain the result without any error. Is there any thing that I missed?
getContrasts
returns a "qv"
object so maybe a coef
method for "qv"
objects?
Hello,
I am one of the authors of the VFP package on CRAN which imports gnm.
Occasionally, using a custom nonlin model, the gnm fails with the error:
Error: no valid set of coefficients has been found: please supply starting values
This happens even if the exact parameters are provided as starting values.
This failure does not occur with all datasets. E.g., the following code runs, if the
data frame is restricted to lines 1:10.
I suppose it has to do with gamma models with identity link not being defined if the estimator becomes <0.
Of course this can be avoided sometimes using another parametrization, however, I want to get the variance estimates and the like on the original scale.
Here is a minimal example (using a linear model to get a reference):
Mean <- c(1237.1, 5605.55, 801.45, 2713.55, 570.4, 193.1, 97.2, 11.05,
202.5, 7031.75, 2679.6, 1252.55, 735.6, 4088.05, 9818.7, 4486.45,
3104.85, 1189.3, 217.6, 603.2, 28.45)
VC <- c(33696.08, 296681.045, 24842.205, 31777.205, 1705.28, 950.48,
2693.78, 244.205, 3026.42, 17578.125, 3.92, 8281.845, 1280.18,
76479.605, 4665.78, 130101.005, 0.125, 9800, 18355.28, 1152,
1618.805)
dat <- data.frame(Mean=Mean,VC=VC)
coeffs <- c(beta1 = 4792.94726157285, beta2 = 0.00366035757993686)
fitglm <- gnm(VC ~ I(Mean^2) , family = Gamma(link = "identity"), data = data.frame(dat), start=coeffs, trace=TRUE)
print(fitglm$coefficients)
powfun <- function(x)
{
list(
predictors=list(beta1 = 1, beta2 = 1),
variables=list(substitute(x)),
term=function(predictors,variables){
paste( predictors[1],"+",predictors[2],"*",variables[1],"^2")
}
)
}
class(powfun) <- "nonlin"
form <- VC ~ powfun(Mean)-1
fitgnm <- gnm(formula = form, family = Gamma(link = "identity"), data = data.frame(dat), start=coeffs, trace=TRUE,iterMax=100)
Kind regards
Florian
Hi,
I'm currently using gnm for constructing conditional poisson models. I had the model specified as
gnm(event_count ~ factor(exposure) + cov1 + ... + cov n, data = data, family = 'poisson', offset = logtime, eliminate = factor(analytical_unit))
I encountered the error message as Error in if (abs(dev - devold)/(0.1 + abs(dev)) < control$epsilon) { : missing value where TRUE/FALSE needed
However, when I tried the formula specified differently, using a subset of covariates, it would run. I also tried to run the same formula on a dataset with same structure, the function would run as well. Additionally, I checked the data that I feed into the function had no missing values.
I could not find documentation about this error on stackoverflow or CRAN. Do you know where the problem is? Thanks you so much!
Best,
Monica
Dref
and DrefWeights
designed for dependence on factors, can this be extended to continuous covariates?
(c.f. email query Xiaoguang Fan )
I believe this is related to issue 1. Consider
library(gnm)
f <- function(d) {
fit <- gnm(Freq ~ vote, eliminate=class, family=poisson, data=d)
confint(fit)
}
f(as.data.frame(cautres))
For me, this fails because d
could not be found - presumably because some function downstream from update.gnm()
(called by profile.gnm()
) does not search in the correct environment. I'm sorry I'm not able to locate the exact source of the error, but I don't fully understand what's going on with environments in the following evaluated calls to gnm()
, gnmTerms()
etc.
Thanks for quickly fixing the previous issue and for your ongoing efforts!
In general, better support for getting results out, e.g. with stargazer and similar packages. Not sure what would be most useful here.
Prepare for release:
git pull
urlchecker::url_check()
devtools::build_readme()
devtools::check(remote = TRUE, manual = TRUE)
devtools::check_win_devel()
revdepcheck::revdep_check(num_workers = 4)
cran-comments.md
git push
Submit to CRAN:
usethis::use_version('patch')
devtools::submit_cran()
Wait for CRAN...
usethis::use_github_release()
usethis::use_dev_version(push = TRUE)
Should be referred to for easier handling of association models.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.