iterake's People

Contributors

dwitherell, ttrodrigz

iterake's Issues

Need for Assistance?

Hi Tony,
Sorry, I couldn't find your email address. Do you have a vignette or documentation for your package? I was looking for a package for rim weighting in R. I am happy to help if you need any assistance.

Quite huge difference of one item result compared to cell weighting

Hello,

I'm new to R. I also don't know much about the details of raking. I just heard about it as an easier way to weight data (compared to cell weighting).

So, I installed your package. All results are very similar to those weighted cell by cell (differences usually do not exceed +/- 3 pp.). Sadly, in one case the difference is 8 pp. Do you have any idea why? It was a simple frequency table for a multiple response set (generated in SPSS using CTABLES).

This is the code (although I don't think it matters much for this issue):

#loading libraries
library(foreign)
library(iterake)
library(expss)
library(haven)
library(labelled)
library(rstudioapi)

#raking universe
#the target base was 753 so I increased the N to reach it
uni = universe(data = df, category(name = "q1", 
                                   buckets = c("a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p"), 
                                   targets = c(0.0752,0.0544,0.0554,0.0262,0.0669,
                                                 0.0870,0.1427,0.0244,0.0535,0.0296,
                                                 0.0593,0.1198,0.0331,0.0362,0.0920,
                                                 0.0441), sum.1 = TRUE), 
                          category(name = "q2",
                                   buckets = c("a","b", "c","d"), 
                                   targets = c(0.1760,0.2406,0.3397,0.2437), sum.1 = TRUE), N = 844)

#creation of the raked dataframe
df.wgt = iterake(universe = uni)

Below you can find results of the question I'm speaking about:

CELL WEIGHT
item1      2%        2
item2      13%      14
item3      15%      17
item4      16%      18
item5      18%      21
item6      21%      24
item7      22%      25
**item8      41%      47**
TOTAL     148%    113

R WEIGHT
item1      2%        2
item2      15%      19
item3      15%      19
item4      17%      21
item5      19%      24
item6      21%      26
item7      24%      31
**item8      33%      42**
TOTAL     146%    126

If you could help fix this problem in any way, it would be fantastic, because I find your package very useful.

Greetings,
Konrad
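
One general reason raking and cell weighting can disagree on a single item: raking only matches the marginal distributions of q1 and q2, while cell weighting matches every q1 x q2 cell, so the two can differ whenever an item varies across cells in a way the margins do not capture. The base-R sketch below illustrates this with invented numbers. The counts, population shares, item rates, and the helper `rake_cells` are all hypothetical and come neither from the data in this report nor from iterake. It illustrates the mechanism and is not a diagnosis of the 8 pp. gap.

```r
# All numbers below are made up for illustration.
dims <- list(q1 = c("a", "b"), q2 = c("x", "y"))

# Sample cell counts (n = 100) and hypothetical population cell shares.
sample_counts <- matrix(c(40, 10, 10, 40), nrow = 2, byrow = TRUE, dimnames = dims)
pop_cells     <- matrix(c(0.10, 0.40, 0.40, 0.10), nrow = 2, byrow = TRUE, dimnames = dims)

# Share of respondents in each cell who pick some item (also invented).
item_rate <- matrix(c(0.2, 0.8, 0.3, 0.5), nrow = 2, byrow = TRUE, dimnames = dims)

# Hypothetical helper: rake cell shares to the row and column targets only.
rake_cells <- function(counts, row_target, col_target, max_iter = 100, tol = 1e-10) {
  shares <- counts / sum(counts)
  for (i in seq_len(max_iter)) {
    shares <- sweep(shares, 1, row_target / rowSums(shares), `*`)  # match q1 margin
    shares <- sweep(shares, 2, col_target / colSums(shares), `*`)  # match q2 margin
    if (max(abs(rowSums(shares) - row_target)) < tol) break
  }
  shares
}

raked_cells <- rake_cells(sample_counts, rowSums(pop_cells), colSums(pop_cells))

# Both schemes reproduce the population margins ...
print(rowSums(raked_cells))
print(rowSums(pop_cells))

# ... but only cell weighting reproduces the cells, so item estimates differ.
est_raked <- sum(raked_cells * item_rate)
est_cell  <- sum(pop_cells * item_rate)
print(c(raked = est_raked, cell = est_cell))
```

In this toy case the sample margins already equal the targets, so raking leaves the data untouched and estimates the item at 0.39, while cell weighting gives 0.51.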

Setting a minimum weight value

Hello,

I am using the "max.wgt" argument to cap the maximum weight when using iterake.
I was wondering whether there is also a way to cap the minimum weight.

Many thanks,
Tina
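
Whether iterake offers a lower cap is not stated here, so the following is only a generic base-R sketch of two-sided trimming applied to an existing weight vector. The function `trim_weights` and its `min_wgt` and `max_wgt` parameters are hypothetical names, not part of iterake.

```r
# Hypothetical helper: clamp every weight into [min_wgt, max_wgt].
trim_weights <- function(w, min_wgt = 0.2, max_wgt = 5) {
  stopifnot(is.numeric(w), min_wgt > 0, min_wgt <= max_wgt)
  pmin(pmax(w, min_wgt), max_wgt)
}

w <- c(0.05, 0.4, 1, 2.5, 9)   # made-up weights
w_trimmed <- trim_weights(w)
print(range(w_trimmed))
```

Clamping after the fact shifts the weighted margins away from their targets, so trimming is usually alternated with further raking passes rather than applied once at the end.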

Non-convergence

I'm raking over a lot of characteristics and when I include them all it doesn't converge, but I notice that it stops at 50 iterations. Is there a way to increase the number of iterations to see if convergence happens later?
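
Other reports on this page pass a `max.iter` argument to `iterake()` (for example `max.iter = 1500`), which suggests the iteration cap can be raised that way. To show how such a cap interacts with a convergence threshold, here is a minimal base-R raking loop. The functions `rake` and `weighted_shares` and the toy data are hypothetical and are not iterake's implementation.

```r
# Weighted share of each bucket of x under weights w.
weighted_shares <- function(w, x) {
  vapply(split(w, x), sum, numeric(1)) / sum(w)
}

# Hypothetical raking loop: stops when every margin is within `threshold`
# of its target, or when `max_iter` passes have been made.
rake <- function(data, targets, max_iter = 50, threshold = 1e-10) {
  w <- rep(1, nrow(data))
  for (iter in seq_len(max_iter)) {
    for (v in names(targets)) {
      cur <- weighted_shares(w, data[[v]])
      adj <- targets[[v]][names(cur)] / cur          # per-bucket adjustment
      w <- w * unname(adj[as.character(data[[v]])])
    }
    # Largest remaining gap between any weighted share and its target.
    gap <- max(vapply(names(targets), function(v) {
      cur <- weighted_shares(w, data[[v]])
      max(abs(cur - targets[[v]][names(cur)]))
    }, numeric(1)))
    if (gap < threshold) {
      return(list(weight = w, iterations = iter, converged = TRUE, gap = gap))
    }
  }
  list(weight = w, iterations = max_iter, converged = FALSE, gap = gap)
}

# Made-up sample of 100 respondents and made-up targets.
toy <- data.frame(
  sex = rep(c("f", "m"), times = c(60, 40)),
  age = rep(c("young", "old", "young", "old"), times = c(45, 15, 10, 30))
)
targets <- list(sex = c(f = 0.5, m = 0.5), age = c(young = 0.4, old = 0.6))

short_run <- rake(toy, targets, max_iter = 1)
long_run  <- rake(toy, targets, max_iter = 200)
print(c(short = short_run$converged, long = long_run$converged))
print(long_run$iterations)
```

With a cap of one pass the loop stops before the margins settle. With a larger cap it stops as soon as the threshold is met, so a generous cap costs nothing when convergence comes early.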

Weighted N differs from original N

I used your package for weighting survey data. In some cases, the weighted N is smaller than the original N. Here is an example:


library(dplyr)
library(iterake)

spss_data_weighted = iterake(universe = spss_data_uni, 
                                   max.iter = 1500,  
                                   threshold = 0.001, 
                                   stuck.limit = 10)

-- iterake summary -------------------------------------------------------------
 Convergence: Success
  Iterations: 386

Unweighted N: 500.00
 Effective N: 217.73
  Weighted N: 445.76
  Efficiency: 43.6%
        Loss: 1.296

 NOTE: Threshold met, stopped at difference of 7.705e-01 between weighted sample and universe.


compare_margins(data = spss_data_weighted, weight = weight, universe = spss_data_uni)


print(compare_margins(data = spss_data_weighted, weight = weight, universe = spss_data_uni), n = 52)

# A tibble: 52 x 9
   category bucket uwgt_n  wgt_n uwgt_prop wgt_prop targ_prop   uwgt_diff  wgt_diff
   <chr>    <chr>   <int>  <dbl>     <dbl>    <dbl>     <dbl>       <dbl>     <dbl>
 1 RECAGE   1          72  54.4      0.144  0.122     0.122    0.0220     -8.77e-15
 2 RECAGE   2         143 151.       0.286  0.338     0.338   -0.0524     -7.25e-13
 3 RECAGE   3         204 160.       0.408  0.358     0.358    0.0498      7.75e-13
 4 RECAGE   4          81  80.9      0.162  0.181     0.181   -0.0194     -4.14e-14
 5 RECQ3    1         180  96.2      0.36   0.216     0.213    0.147       2.52e- 3
 6 RECQ3    2         150 110.       0.3    0.246     0.244    0.0563      2.67e- 3
 7 RECQ3    3         100 111.       0.2    0.249     0.251   -0.0506     -1.84e- 3
 8 RECQ3    4          70 129.       0.14   0.289     0.292   -0.152      -3.36e- 3
 9 Q3A      1          70  59.7      0.14   0.134     0.133    0.00728     1.31e- 3
10 Q3A      2          82  72.0      0.164  0.162     0.162    0.00229    -1.73e- 4
11 Q3A      3          35  21.3      0.07   0.0478    0.0481   0.0219     -2.11e- 4
12 Q3A      4          12  11.7      0.024  0.0262    0.0259  -0.00193     2.75e- 4
13 Q3A      5           7   3.33     0.014  0.00746   0.00763  0.00637    -1.67e- 4
14 Q3A      6          18  12.7      0.036  0.0285    0.0275   0.00854     1.09e- 3
15 Q3A      7          33  34.6      0.066  0.0777    0.0786  -0.0126     -8.80e- 4
16 Q3A      8           7   7.69     0.014  0.0173    0.0168  -0.00278     4.76e- 4
17 Q3A      9          41  40.6      0.082  0.0911    0.0923  -0.0103     -1.18e- 3
18 Q3A      10         99  99.5      0.198  0.223     0.221   -0.0232      1.90e- 3
19 Q3A      11         29  22.3      0.058  0.0500    0.0511   0.00689    -1.12e- 3
20 Q3A      12          4   5.32     0.008  0.0119    0.0122  -0.00420    -2.68e- 4
21 Q3A      13         23  19.3      0.046  0.0434    0.0435   0.00252    -1.10e- 4
22 Q3A      14         17   9.31     0.034  0.0209    0.0214   0.0126     -4.69e- 4
23 Q3A      15         13  15.6      0.026  0.0351    0.0359  -0.00985    -7.87e- 4
24 Q3A      16         10  10.7      0.02   0.0240    0.0236  -0.00365     3.25e- 4
25 Q5_1     1         243 244.       0.486  0.548     0.554   -0.0681     -6.38e- 3
26 Q5_1     2         121  90.5      0.242  0.203     0.205    0.0370     -2.01e- 3
27 Q5_1     3          59  29.4      0.118  0.0660    0.0640   0.0540      2.01e- 3
28 Q5_1     4          25  20.7      0.05   0.0465    0.0450   0.00503     1.58e- 3
29 Q5_1     5          15   7.36     0.03   0.0165    0.0160   0.0140      5.03e- 4
30 Q5_1     6          19  19.4      0.038  0.0436    0.0419  -0.00392     1.65e- 3
31 Q5_1     7          18  34.1      0.036  0.0766    0.0739  -0.0379      2.65e- 3
32 Q5_2     1         274 279.       0.548  0.626     0.592   -0.0444      3.32e- 2
33 Q5_2     2         117  66.9      0.234  0.150     0.140    0.0943      1.03e- 2
34 Q5_2     3          50  25.9      0.1    0.0582    0.0550   0.0450      3.22e- 3
35 Q5_2     4          18  22.1      0.036  0.0496    0.0450  -0.00904     4.53e- 3
36 Q5_2     5          14   6        0.028  0.0135    0.00840  0.0196      5.06e- 3
37 Q5_2     6          14  11.4      0.028  0.0255    0.0237   0.00434     1.88e- 3
38 Q5_2     7          13  34.6      0.026  0.0777    0.136   -0.110      -5.82e- 2
39 Q5_3     1         218 232.       0.436  0.521     0.530   -0.0937     -9.05e- 3
40 Q5_3     2         133 118.       0.266  0.264     0.266   -0.00000610 -2.15e- 3
41 Q5_3     3          54  20.7      0.108  0.0465    0.0457   0.0623      7.52e- 4
42 Q5_3     4          22  26.0      0.044  0.0582    0.0572  -0.0132      1.05e- 3
43 Q5_3     5          11   6.67     0.022  0.0150    0.0145   0.00752     4.74e- 4
44 Q5_3     6          32  18.5      0.064  0.0415    0.0404   0.0236      1.08e- 3
45 Q5_3     7          30  24.2      0.06   0.0543    0.0465   0.0135      7.84e- 3
46 Q5_4     1         204 136.       0.408  0.304     0.271    0.137       3.30e- 2
47 Q5_4     2         154 133.       0.308  0.298     0.265    0.0428      3.23e- 2
48 Q5_4     3          66  34.7      0.132  0.0778    0.0694   0.0626      8.44e- 3
49 Q5_4     4          35  52.6      0.07   0.118     0.105   -0.0352      1.28e- 2
50 Q5_4     5          15  12.2      0.03   0.0274    0.0244   0.00561     2.97e- 3
51 Q5_4     6          15  45        0.03   0.101     0.118   -0.0881     -1.72e- 2
52 Q5_4     7          11  33        0.022  0.0740    0.146   -0.124      -7.23e- 2

I fixed this by dividing the original N by the weighted N and multiplying the weights by that factor:

X = 500/445.76

fin = spss_data_weighted %>%
  mutate(weight = weight * X) 

This works but I don't understand the reason for the difference between the weighted N and the original N of the sample.
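
Rescaling the weights by a constant, as done above, changes their sum but not the weighted proportions, and Kish's effective sample size, (sum of w)^2 / (sum of w^2), is unchanged as well. The base-R check below uses made-up weights (the vectors `w` and `group` are illustrative, not the survey data). It does not explain why the weighted N came out below 500, only that the rescaling itself is harmless for these quantities.

```r
w     <- c(0.4, 0.7, 0.9, 1.1, 1.3)   # made-up weights, sum is 4.4
group <- c("a", "a", "b", "b", "b")   # made-up category per respondent
n     <- length(w)

# Rescale so the weights sum to the unweighted sample size.
w_rescaled <- w * n / sum(w)

# Kish's effective sample size.
effective_n <- function(w) sum(w)^2 / sum(w^2)

# Weighted share of each group.
weighted_prop <- function(w, g) vapply(split(w, g), sum, numeric(1)) / sum(w)

print(sum(w_rescaled))
print(c(before = effective_n(w), after = effective_n(w_rescaled)))
print(rbind(before = weighted_prop(w, group), after = weighted_prop(w_rescaled, group)))
```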

Iterake doesn't create weights and R is not showing any error

Hi,

I use the iterake package and its iterake function for iterative weighting of various survey results, and it works great. Now, however, I have a problem with it, and since RStudio doesn't show any error, I don't know what the cause could be.

I have a dataset "inp" with 22,000 observations (respondents) and 29 variables. Four of them should be used in iterative weighting: gender (values 1 and 2), age group (values 1 to 15, since there are 15 age groups), region group (values 1 to 8), and household size (values 1 and 2). I created these four columns in my sample dataset and calculated the corresponding shares in the whole population.

The next step was to use the function universe to create uni:

uni <- universe(
data = inp,
category(name="rim1",
buckets = c("1", "2"),
targets = c(0.51, 0.49)
),
category(name="rim2",
buckets = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15"),
targets = c(0.10, 0.06, 0,06, 0.07, 0.07, 0.07, 0.07, 0.08, 0.07, 0.07, 0.07, 0.06, 0.05, 0.05, 0.05)
),
category(name="rim3",
buckets = c("1", "2", "3", "4", "5", "6", "7", "8"),
targets = c(0.26, 0.10, 0.07, 0.08, 0.26, 0.09, 0.06, 0.08)
),
category(name="rim4",
buckets = c("1", "2"),
targets = c(0.22, 0.78)
)
)

That step works; it creates uni as a list.
The next step, which works with all the other surveys, doesn't work here: it starts running and never finishes, with no errors shown:

inpw <- iterake(universe=uni, threshold = 0.0001, max.wgt=10, max.iter=50)

I also tried without these arguments, with the same result.
Do you have any idea what could be the issue here? Any suggestion or help would be great.

Thanks for helping and thanks for such a good package, I use it every day. :)

Best regards,

Gordana
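
A possible lead, offered as a guess rather than a confirmed diagnosis: the rim2 targets as posted contain `0,06` (comma) where `0.06` was presumably intended. R reads that as two separate values, 0 and 6, so the vector has 16 entries for 15 buckets and no longer sums to 1. The base-R sketch below checks a category's buckets and targets before they are passed on. `check_category` is a hypothetical helper, not an iterake function, and `targets_intended` is an assumption about what was meant.

```r
# Hypothetical helper: return a character vector of problems (empty if none).
check_category <- function(buckets, targets, tol = 1e-8) {
  problems <- character(0)
  if (length(buckets) != length(targets)) {
    problems <- c(problems, sprintf("%d buckets but %d targets",
                                    length(buckets), length(targets)))
  }
  if (abs(sum(targets) - 1) > tol) {
    problems <- c(problems, sprintf("targets sum to %.4f, not 1", sum(targets)))
  }
  if (any(targets <= 0)) {
    problems <- c(problems, "some targets are zero or negative")
  }
  problems
}

buckets <- as.character(1:15)

# Exactly as posted, including the "0,06".
targets_as_posted <- c(0.10, 0.06, 0,06, 0.07, 0.07, 0.07, 0.07, 0.08,
                       0.07, 0.07, 0.07, 0.06, 0.05, 0.05, 0.05)

# Assumed intention: "0.06" in the third position.
targets_intended <- c(0.10, 0.06, 0.06, 0.07, 0.07, 0.07, 0.07, 0.08,
                      0.07, 0.07, 0.07, 0.06, 0.05, 0.05, 0.05)

print(check_category(buckets, targets_as_posted))
print(check_category(buckets, targets_intended))
```

Whether this mismatch is what makes the call hang is not established here, but correcting the comma is a cheap first thing to try.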

universe error

Thanks for the great package. After updating R and some packages since this spring (when iterake worked), the universe() function now fails. Any ideas? Below are examples from the package.

library(iterake)
data(dealer_data)

# build the 'universe'
dealer_uni <- universe(
  
  df = dealer_data,
  
  category(
    name = "Age",
    buckets = c("18-34", "35-54", "55+"),
    targets = c(.12, .58, .30)
  ),
  
  category(
    name = "Year",
    buckets = c(2015, 2016, 2017, 2018),
    targets = c(.22, .25, .32, .21)
  ),
  
  category(
    name = "Type",
    buckets = c("Car", "SUV", "Truck"),
    targets = c(.38, .47, .15)
  )
  
)
#> Error in rep(vec_seq_along(data), n): invalid 'times' argument

data(weight_me)
universe(
  df = weight_me,
  
  category(
    name = "costume",
    buckets = c("Bat Man", "Cactus"),
    targets = c(0.5, 0.5)),
  
  category(
    name = "seeds",
    buckets = c("Tornado", "Bird", "Earthquake"),
    targets = c(0.4, 0.3, 0.3))
)
#> Can't bind data because some arguments have the same name

Created on 2019-09-10 by the reprex package (v0.3.0)

my sessionInfo
R version 3.6.1 (2019-07-05)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.6

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] reprex_0.3.0 iterake_0.0.0.9000

loaded via a namespace (and not attached):
[1] tidyselect_0.2.5 xfun_0.9 purrr_0.3.2
[4] splines_3.6.1 haven_2.1.1 lattice_0.20-38
[7] labelled_2.2.1 colorspace_1.4-1 vctrs_0.2.0.9002
[10] htmltools_0.3.6 base64enc_0.1-3 survival_2.44-1.1
[13] rlang_0.4.0 pillar_1.4.2 foreign_0.8-71
[16] glue_1.3.1 RColorBrewer_1.1-2 stringr_1.4.0
[19] munsell_0.5.0 gtable_0.3.0 htmlwidgets_1.3
[22] evaluate_0.14 latticeExtra_0.6-28 knitr_1.24
[25] forcats_0.4.0 callr_3.2.0 ps_1.3.0
[28] htmlTable_1.13.1 Rcpp_1.0.2 clipr_0.7.0
[31] acepack_1.4.1 backports_1.1.4 scales_1.0.0
[34] checkmate_1.9.4 Hmisc_4.2-0 fs_1.3.1
[37] gridExtra_2.3 ggplot2_3.2.1 hms_0.5.1
[40] packrat_0.5.0 digest_0.6.20 stringi_1.4.3
[43] processx_3.3.1 dplyr_0.8.3 grid_3.6.1
[46] tools_3.6.1 magrittr_1.5 lazyeval_0.2.2
[49] tibble_2.1.3 Formula_1.2-3 cluster_2.1.0
[52] whisker_0.3-2 crayon_1.3.4 tidyr_0.8.3.9000
[55] pkgconfig_2.0.2 zeallot_0.1.0 Matrix_1.2-17
[58] data.table_1.12.2 rmarkdown_1.14 assertthat_0.2.1
[61] rstudioapi_0.10 R6_2.4.0 rpart_4.1-15
[64] nnet_7.3-12 compiler_3.6.1

Installing via conda

I'm trying to install the package with conda using:

conda skeleton cran <github_url>
conda build <package-name>

When I run the first command, I get the following error:

(base) ubuntu@ip-10-0-10-231:~$ /home/ubuntu/anaconda3/bin/conda skeleton cran https://github.com/ttrodrigz/iterake.git
Adding in variants from internal_defaults
INFO:conda_build.variants:Adding in variants from internal_defaults
Parsing input package https://github.com/ttrodrigz/iterake.git:
.. name: iterake location: https://github.com/ttrodrigz/iterake new_location: /home/ubuntu/r-iterake
Making/refreshing recipe for iterake
Cloning into '/home/ubuntu/anaconda3/conda-bld/skeleton_1610567786144/work'...
done.
checkout: 'HEAD'
Your branch is up to date with 'origin/_conda_cache_origin_head'.
==> git log -n1 <==

fatal: No names found, cannot describe anything.
commit 03d54cb21f90d321c56d296212f67e07b878fb27
Author: dwitherell <[email protected]>
Date:   Thu Jun 25 14:24:09 2020 -0600

    Minor edit to address funs() deprecation

==> git describe --tags --dirty <==

commit 03d54cb21f90d321c56d296212f67e07b878fb27
Author: dwitherell <[email protected]>
Date:   Thu Jun 25 14:24:09 2020 -0600

    Minor edit to address funs() deprecation

==> git status <==

On branch _conda_cache_origin_head
Your branch is up to date with 'origin/_conda_cache_origin_head'.

nothing to commit, working tree clean


Leaving build/test directories:
  Work:
 /home/ubuntu/anaconda3/conda-bld/skeleton_1610567786144/work
  Test:
 /home/ubuntu/anaconda3/conda-bld/skeleton_1610567786144/test_tmp
Leaving build/test environments:
  Test:
source activate  /home/ubuntu/anaconda3/conda-bld/skeleton_1610567786144/_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placeho
  Build:
source activate  /home/ubuntu/anaconda3/conda-bld/skeleton_1610567786144/_build_env


Error: no tags found

Per a comment on conda/conda#6674, it seems like the releases need to be tagged. Would this be possible?
