
catsim's Introduction


catsim is a Python package for computerized adaptive testing (CAT) simulations. It provides multiple methods for:

  • ability initialization;
  • item selection;
  • ability estimation;
  • stopping criteria.

These methods can either be used in a standalone fashion [1] to power other software or be used with catsim to simulate the application of computerized adaptive tests [2], given a sample of examinees, represented by their ability levels, and an item bank, whose items are represented by their parameters under some logistic Item Response Theory model.

What's a CAT

Computerized adaptive tests are educational assessments, usually taken by examinees on a computer or some other digital medium, in which the examinee's ability is re-evaluated after each answered item. The new ability estimate is then used to select the next item, closer to the examinee's real ability. This method of test application has several advantages over the traditional paper-and-pencil method, or even over linear tests applied electronically, since high-ability examinees are not required to answer all the easy items in a test, answering only the items that actually provide information about their true knowledge of the subject matter. A similar but inverse effect happens for examinees of low ability.

More information is available in the docs and over at Wikipedia.
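The adaptive loop described above can be sketched in plain Python with a toy 2PL model. Everything here (the item parameters, the grid-search maximum-likelihood estimator, the fixed test length) is illustrative only and independent of catsim's actual API:

```python
import math
import random

def p_correct(theta, a, b):
    """2PL item characteristic curve."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def info(theta, a, b):
    """Fisher information of a 2PL item at theta."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)

def estimate_theta(items, administered, responses):
    """Maximum-likelihood ability estimate via a simple grid search."""
    def log_lik(theta):
        ll = 0.0
        for idx, resp in zip(administered, responses):
            a, b = items[idx]
            p = p_correct(theta, a, b)
            ll += math.log(p) if resp else math.log(1.0 - p)
        return ll
    grid = [i / 100.0 for i in range(-400, 401)]
    return max(grid, key=log_lik)

random.seed(42)
items = [(random.uniform(0.5, 2.0), random.uniform(-2, 2)) for _ in range(50)]
true_theta, est_theta = 1.0, 0.0
administered, responses = [], []

for _ in range(10):  # fixed-length stopping rule: 10 items
    # select the unadministered item with maximum information at the estimate
    candidates = [i for i in range(len(items)) if i not in administered]
    next_item = max(candidates, key=lambda i: info(est_theta, *items[i]))
    # simulate a response using the true theta (unknown to the test)
    correct = random.random() < p_correct(true_theta, *items[next_item])
    administered.append(next_item)
    responses.append(correct)
    # re-estimate ability from all responses collected so far
    est_theta = estimate_theta(items, administered, responses)

print(round(est_theta, 2))
```

catsim packages each of these four roles (initialization, selection, estimation, stopping) as a pluggable component.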

Installation

Install it using pip install catsim.

Basic Usage

NEW: there is now a Colab Notebook teaching the basics of catsim!

  1. Have an item matrix;
  2. Have a sample of examinee proficiencies, or a number of examinees to be generated;
  3. Create an initializer, an item selector, an ability estimator and a stopping criterion;
  4. Pass them to a simulator and start the simulation;
  5. Access the simulator's properties to get specifics of the results;
  6. Plot your results.
from catsim.initialization import RandomInitializer
from catsim.selection import MaxInfoSelector
from catsim.estimation import NumericalSearchEstimator
from catsim.stopping import MaxItemStopper
from catsim.simulation import Simulator
from catsim.cat import generate_item_bank

initializer = RandomInitializer()       # random starting ability
selector = MaxInfoSelector()            # pick the most informative item
estimator = NumericalSearchEstimator()  # re-estimate ability numerically
stopper = MaxItemStopper(20)            # stop after 20 items

# 100 generated items, 10 generated examinees
Simulator(generate_item_bank(100), 10).simulate(initializer, selector, estimator, stopper)

Dependencies

All dependencies are listed on setup.py and should be installed automatically.

To run the tests, you'll need to install the testing requirements: pip install catsim[testing].

To generate the documentation, install the necessary dependencies with pip install catsim[docs].

To ensure code is valid and formatted before submission, install the necessary development dependencies with pip install catsim[dev].

Compatibility

catsim is compatible with and tested against Python 3.5, 3.6, 3.7, 3.8 and 3.9.

Citing catsim

You can cite the package using the following BibTeX entry:

@article{catsim,
  author = {Meneghetti, Douglas De Rizzo and Aquino Junior, Plinio Thomaz},
  title = {Application and simulation of computerized adaptive tests through the package catsim},
  year = 2018,
  month = jul,
  archiveprefix = {arXiv},
  eprint = {1707.03012},
  eprinttype = {arxiv},
  journal = {arXiv:1707.03012 [stat]},
  primaryclass = {stat}
}

If you are looking for IRT item parameter estimation...

catsim does not implement item parameter estimation. I have had great joy outsourcing that functionality to the mirt R package over the years. However, since many users request item parameter estimation capabilities in the Python ecosystem, here are a few links. While I have not used them personally, specialized packages like these are hard to come by, so I hope they are helpful.

catsim's People

Contributors

douglasrizzo, frankier, ramiroluz, rbu, woj-i


catsim's Issues

Issue with the estimated theta

In a 1PL model, all of my items have high difficulty parameters (> 2). We start the assessment with theta = 0. If the response to the first item is wrong, the newly estimated theta goes above 2, and every subsequent incorrect response increases theta further.
initializer = FixedPointInitializer(0)
selector = MaxInfoSelector()
estimator = NumericalSearchEstimator()

Stratification methods do not select the most informative item inside the stratum for a given examinee

When using one of the following item selectors: AStratSelector, AStratBBlockSelector, MaxInfoStratSelector, MaxInfoBBlockSelector; the method stratifies the item bank but does not select the item that is maximally informative for a given theta value inside that stratum.

This issue has already been fixed in both the master and dev branches, as well as in version 0.17.2, recently submitted to PyPI.

I'm just waiting for confirmation from the interested third party that reported the issue via e-mail before closing it.

The snippet below was used to reproduce the issue, using an item bank of 5000 items. When the response vector was changed, the same item was selected for both a low-proficiency and a high-proficiency examinee.

import numpy as np
from catsim.estimation import *
from catsim.initialization import *
from catsim.irt import *
from catsim.selection import *
from catsim.simulation import *
from catsim.stopping import *

items = np.genfromtxt("test.txt")

administered_items = [1, 2346, 17, 4444, 3490]
responses = [True] * len(administered_items)
# responses = [False] * len(administered_items)
print(f"Response vector is: {responses}")

initializer = FixedPointInitializer(0)
selector = MaxInfoStratSelector(test_size=2000)
estimator = NumericalSearchEstimator()
stopper = MinErrorStopper(.2)

new_theta = estimator.estimate(items=items,
                               response_vector=responses,
                               administered_items=administered_items,
                               est_theta=initializer.initialize())
print(f'Estimated Theta is: {new_theta}')

_, start_pointer, end_pointer = selector._get_stratum(items, 5)
organized_items = selector.postsort_items(items, est_theta=new_theta, using_simulator_props=False)
print(f"Items in current stratum are: {organized_items[start_pointer:end_pointer]}")

stratum_items = items[organized_items[start_pointer:end_pointer]]
print(f"Item information in current stratum for current theta is: {inf_hpc(new_theta, stratum_items)}")

item_index = selector.select(items=items,
                             administered_items=administered_items,
                             est_theta=new_theta)

print(f"Selected item index is: {item_index}")

Real Time data

Hi,
Can the code be applied to real-time data in place of simulated data?

No tutorial in the documentation

The base methods themselves are documented, but it is hard to understand how to actually use them.

A Python notebook showing how one would use the package on a small example dataset of student IDs and question responses would be greatly useful, and would help track down further errors if they are encountered.

Arbitrary error margin in NumericalSearchEstimator

Hi Douglas,

I noticed that you extend the range for bounded estimation search methods by an arbitrary error margin.
Why is that necessary?
In some of my test cases (1PL item bank, all scores correct except 1 item) I get estimates over 4.0, which is the upper limit of my item bank. Is there a way to limit the estimates to the [-4,4] range?

margin = (upper_bound - lower_bound) / 3

Thank you!

Kind regards,
Yaiza
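Until the search bounds are configurable, one workaround is to clamp whatever the estimator returns to the item bank's calibrated range after the fact. A trivial sketch (plain Python, not part of catsim's API):

```python
def clamp_theta(theta, lower=-4.0, upper=4.0):
    """Keep an ability estimate inside the item bank's calibrated range."""
    return max(lower, min(upper, theta))

print(clamp_theta(4.7))   # → 4.0  (estimate beyond the bank's upper limit)
print(clamp_theta(-5.2))  # → -4.0
print(clamp_theta(1.3))   # → 1.3  (in-range estimates pass through)
```

Clamping does not address why the search range is widened, but it guarantees reported estimates stay within [-4, 4].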

Serious bug in HillClimbingEstimator? Causing only first iteration to matter

Hey, I think that the HillClimbingEstimator often does only 1 round of estimation (rather than 10 iteratively more precise rounds). The issue is:

  • after round 1, round 2 checks values to the left and right of the best theta, starting from the left (the value checked just before the best theta, which the best theta already has a greater likelihood than).
  • the likelihood of that leftmost estimate is always lower than the best theta's (by definition), so round 2 ends right away.
  • crazy things then start happening, because the lower bound gets set as the upper bound, interval sizes can become negative, etc.


I need to patch this within the next couple of days on DataCamp/catsim, so I can submit a PR to fix it!

Solution

Code here:

https://github.com/douglasrizzo/catsim/blob/master/catsim/estimation.py#L138-L139

seems like the solution is that max_ll needs to be reset each round?
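If that diagnosis is right, the fix can be illustrated with a generic interval-narrowing search (a standalone sketch, not catsim's actual HillClimbingEstimator), where the running maximum is reset at the start of every round:

```python
def hill_climb_max(f, lower, upper, rounds=10, points=10):
    """Iteratively narrow an interval around the maximizer of f.

    best_val is reset at the start of every round; carrying it over
    between rounds (the bug described above) would make later rounds end
    immediately, since no freshly scanned point beats the stale maximum.
    """
    best_x = lower
    for _ in range(rounds):
        step = (upper - lower) / points
        best_val = float("-inf")  # reset each round
        best_x = lower
        for i in range(points + 1):
            x = lower + i * step
            v = f(x)
            if v > best_val:
                best_val, best_x = v, x
        # shrink the search interval around the current best point
        lower, upper = best_x - step, best_x + step
    return best_x

# maximize a concave log-likelihood-like function with its peak at 0.7
peak = hill_climb_max(lambda t: -(t - 0.7) ** 2, -6.0, 6.0)
print(round(peak, 3))  # → 0.7
```

Each round shrinks the interval by a constant factor, so ten rounds give the "iteratively more precise" behaviour the estimator is meant to have.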

Initializer

When trying to run your catsim Colab notebook, step #4 (initializing theta for the examinee) gives me the following error:

NameError: name 'initializer' is not defined.

I'm trying to read your code here, but I'm new to Python. Could you help?

new_theta from estimator

Hello Mr. Douglas,

I have a question about the theta estimation returned from the estimator in the example Colab notebook.

The example code goes like below:

# response data
responses = [True, True, False, False]
administered_items = [1435, 3221, 17, 881]

# create objects
initializer = FixedPointInitializer(0)
selector = MaxInfoSelector()
estimator = NumericalSearchEstimator()
stopper = MinErrorStopper(.2)

# initialize the theta
est_theta = initializer.initialize()

# new theta estimation
new_theta = estimator.estimate(items=items, administered_items=administered_items, response_vector=responses, est_theta=est_theta)

# stop?
_stop = stopper.stop(administered_items=items[administered_items], theta=est_theta)
item_index = selector.select(items=items, administered_items=administered_items, est_theta=est_theta)

# simulate a response
true_theta = 0.8
a, b, c, d = items[item_index]
prob = icc(true_theta, a, b, c, d)
correct = prob > random.uniform(0, 1)

administered_items.append(item_index)
responses.append(correct)

Then, I should go back to the new theta estimation step.

My question is whether I should replace est_theta with new_theta before re-calling the estimator method. But then, it would replace the est_theta with the result based on only a few responses, which is probably not good. But when conducting adaptive testing for a new user (say, est_theta is initialized to 0), this approach may make sense (or not).

An alternative (as in the example) is to keep using the est_theta as initialized and accumulating responses until the stop condition is met. In this case, the final estimate is the new_theta, which is not used in the process. In this case, the stop condition and item selection are based on the initial theta. Therefore, the information is based on the initial theta, too.

So, is either approach appropriate depending on the usage, or should I always follow the way the example indicates? I'd like to know your opinion.
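For what it's worth, for maximum-likelihood estimators the est_theta argument mainly seeds the numerical search: with a fixed response vector, the estimate is (approximately) a function of the administered items and responses alone. The usual CAT loop therefore feeds new_theta back as the est_theta for the next select/stop/estimate round, so every step uses the latest information. A self-contained toy 2PL illustration (hypothetical item parameters, not catsim code):

```python
import math

# toy 2PL item bank: (discrimination a, difficulty b), plus a response pattern
toy_items = [(1.2, -0.5), (0.8, 0.3), (1.5, 1.0), (1.0, 0.0)]
toy_responses = [True, True, False, False]

def log_lik(theta):
    """2PL log-likelihood of the response pattern at a given theta."""
    ll = 0.0
    for (a, b), x in zip(toy_items, toy_responses):
        p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
        ll += math.log(p if x else 1.0 - p)
    return ll

# the MLE depends only on items + responses, not on any starting value
grid = [i / 1000.0 for i in range(-4000, 4001)]
mle_val = max(grid, key=log_lik)
print(round(mle_val, 3))
```

Since the estimate is determined by the response vector, passing the stale initial theta to estimate() mostly affects search efficiency; but select() and stop() should receive the latest estimate, otherwise item selection and the stopping rule ignore everything learned so far.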

A-Stratification Selector

Hi,

I am trying to do some work on item exposure control and would like to use the a-stratification method as a basis for comparison. When I put in the test length argument for this selector, it seems to select 30 items and always administer them. Am I putting the incorrect value here?

Thanks!

The logic of b parameter

Hey Douglas 👋,

I have a question about the package. Shouldn't the logic of the package dictate that if the examinee answers one question correctly, the next question must have a higher "b" parameter, and vice versa? I'm wondering if it's possible that the question selection method isn't always following this logic.

I experimented with the package, and at times, I chose the correct answer, but some questions seemed to have lower "b" parameters than expected. I'd appreciate your input on this.

Thanks for your help! 🙏
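Not necessarily: maximum-information selection picks the item with the highest Fisher information at the current ability estimate, which depends on the discrimination parameter a (and, in 3PL/4PL models, on c and d) as well as on b. A highly discriminating item with a lower b can carry more information than a harder but flatter one. A toy 2PL illustration with made-up parameters:

```python
import math

def info_2pl(theta, a, b):
    """Fisher information of a 2PL item: a^2 * p * (1 - p)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

theta = 1.0  # current ability estimate after some correct answers
hard_flat = (0.6, 1.5)   # higher b, but low discrimination
easy_sharp = (2.0, 0.8)  # lower b, high discrimination

print(round(info_2pl(theta, *hard_flat), 2))   # → 0.09
print(round(info_2pl(theta, *easy_sharp), 2))  # → 0.96
```

So after a correct answer the estimate rises, but the next selected item's b may still be lower than the previous one's if its information at the new estimate is higher.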

Multidimensional IRT

I would like to know if there are plans to support MIRT in catsim. If not, does anyone know whether it is feasible, or whether major changes would be required?

item difficulty estimation

Hi,

I've just started studying IRT (using Item Response Theory by Frank Baker and Seock-Ho Kim) and the catsim library (thanks, it looks like great work).

While this might be in the documentation (I haven't found it yet): does catsim support estimation of the item difficulty parameter?

My problem at hand is: I have a bunch of questions, candidates and interactions without any further knowledge (the questions are on the same topic), and I need to properly estimate the question difficulties as well as the candidates' proficiencies.

Any examples or documentation referrals would be deeply appreciated. Thanks again.
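For readers who want a feel for what calibration packages do, here is a minimal, purely illustrative sketch of joint maximum-likelihood calibration of a Rasch (1PL) model on synthetic data. It is not catsim code, and real calibration should use a dedicated package such as mirt:

```python
import math
import random

random.seed(0)
sigmoid = lambda z: 1.0 / (1.0 + math.exp(-z))

# synthetic data: known abilities and difficulties, Rasch responses
true_theta = [random.gauss(0, 1) for _ in range(200)]
true_b = [-1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
X = [[1 if random.random() < sigmoid(t - b) else 0 for b in true_b]
     for t in true_theta]

# joint maximum likelihood: alternate gradient-ascent steps on abilities
# (theta) and difficulties (b); the scale is fixed by centering b at 0
theta = [0.0] * len(true_theta)
b = [0.0] * len(true_b)
lr = 0.1
for _ in range(300):
    for i, row in enumerate(X):
        theta[i] += lr * sum(x - sigmoid(theta[i] - b[j])
                             for j, x in enumerate(row))
    for j in range(len(b)):
        b[j] -= lr / len(X) * sum(X[i][j] - sigmoid(theta[i] - b[j])
                                  for i in range(len(X)))
    mean_b = sum(b) / len(b)
    b = [v - mean_b for v in b]

print([round(v, 2) for v in b])  # estimates should be ordered like true_b
```

Real software adds corrections (JML bias, handling of perfect scores) or uses marginal maximum likelihood instead, which is why a specialized package is the right tool.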

Type checking

Currently there are type hints in the source code, but it looks like no type checking is being performed. Possible set up steps are:

  • Ensure a type checker is installed as a dev requirement
  • Add type checking as a pre-commit hook
  • Add type checking in GitHub CI

Once the package type checks, it can be marked as typed, so that downstream packages can use its types, by adding a py.typed marker file: https://www.python.org/dev/peps/pep-0561/#packaging-type-information

Error running example

Hey,

Not sure if this project is still supported but I'm trying to run the "Autonomous Usage" code that was shown in the docs and I'm getting the below error. Can you say if something has changed or needs to be updated?

File "/Users/rram/PycharmProjects/CAT_Testing/mac_venv/lib/python3.7/site-packages/catsim/selection.py", line 83, in <listcomp>
    valid_indexes_low_r = [index for index in valid_indexes if items[index, 4] < self._r_max]
IndexError: index 4 is out of bounds for axis 1 with size 4
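The traceback shows the selector reading a fifth column (index 4, the item exposure rate r) that a 4-column item matrix doesn't have. Assuming your matrix holds only the a, b, c, d parameters, one possible workaround is to append a zeroed exposure column (hypothetical example data):

```python
import numpy as np

# a 4-column item matrix (a, b, c, d), e.g. from an external calibration
items = np.random.default_rng(0).uniform(size=(100, 4))

# append a fifth column of zeros as the initial item exposure rate,
# so selectors that read items[index, 4] don't go out of bounds
if items.shape[1] == 4:
    items = np.hstack([items, np.zeros((items.shape[0], 1))])

print(items.shape)  # → (100, 5)
```

This is a sketch of the shape fix only; check the version of the docs matching your installed catsim for the expected item matrix layout.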

error importing [SyntaxError: invalid syntax]

I pip installed it. When I try:

from catsim.estimation import *

I get:

File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/catsim/estimation.py", line 18
    def __init__(self, precision: int = 6, dodd: bool = False, verbose: bool = False):
                                ^
SyntaxError: invalid syntax

Any advice?

Unexpected behaviour from MaxInfoSelector

I am a bit confused by the current behaviour of MaxInfoSelector. I would expect it to select the item with the maximum Fisher information given the current ability estimate; however, it seems instead to select based upon where the individual item response functions reach their maximum, ignoring the differences in scale on the y-axis between them. See e.g.:
(screenshot of item information curves omitted)

(Figure taken from A Visual Guide to IRT: https://www.metheval.uni-jena.de/irt/VisualIRT.pdf)

Imagine there is a small distance on the x-axis between the peaks of the blue and the black curve. If I have an ability estimate at the peak of the black curve I would expect the next item chosen by MaxInfoSelector to be the blue one since it still has higher information even a bit away from its peak.

Basically, the behaviour I want is equivalent to RandomesqueSelector(bin_size=1) (which is what I've now moved to, temporarily at least), but the behaviour I actually get for the 2PL model is the same as UrrySelector's.

What is the specification for MaxInfoSelector? The citation given is Lion 1977, but that article contains no equations. If the current behaviour is the intended one, maybe the documentation could make that clear, since it's a bit surprising to me given the current name and docs.
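The expected behaviour described here (select the item maximizing Fisher information at the current estimate) differs from an Urry-style rule (pick the item whose difficulty is nearest the estimate) whenever discriminations vary. A toy 2PL comparison with made-up parameters, mirroring the black/blue curves above:

```python
import math

def info_2pl(theta, a, b):
    """Fisher information of a 2PL item: a^2 * p * (1 - p)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

theta = 0.0  # estimate sits exactly at the black item's peak
bank = {"black": (0.7, 0.0), "blue": (1.8, 0.4)}  # (a, b)

# Urry-style choice: difficulty closest to the current estimate
urry = min(bank, key=lambda k: abs(bank[k][1] - theta))
# max-information choice: highest Fisher information at the estimate
maxinfo = max(bank, key=lambda k: info_2pl(theta, *bank[k]))

print(urry, maxinfo)  # → black blue
```

Even though the estimate sits at the black item's peak, the blue item's higher discrimination makes it the more informative choice, which is the behaviour the issue expects from MaxInfoSelector.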
