Comments (4)
I've opened an issue to improve our st.slices() strategy. Once that's done - no eta, since I think it's a good intro for new contributors - the only thing it won't generate is slices which have start != end but select no elements due to step. This is usually helpful (because around half the time nothing is selected otherwise), but in your case you might want to roll your own to keep generating such cases.
I will probably continue to use my own. I think another difference between my use-case and hypothesis is that I always want to generate indices independent of shape, because ndindex operations should always work independently of array shape when possible.
Capping the product of side lengths without introducing weird biases or shrinking problems is surprisingly difficult... our usual approach is still to filter them. Note that these problems only really matter if you're creating arrays of the given shape though.
Yeah, that's exactly what I'm doing. I don't know what else you would do with a shape. Actually I don't really know what a "typical" use-case for the numpy hypothesis stuff is.
In the end filtering out a lot isn't actually a problem in itself, so if the performance is still OK you don't need to do anything. And congratulations on the bug - are you keeping a 'trophy case' list?
Right, I'm mostly concerned because of the healthcheck errors. And I the filtering means that it isn't actually running as many tests as I would hope.
As an aside, I realized that I've been misusing assume(False). I've used it on error condition cases, but I actually want those to be tested to ensure that they do give an error at the right place. I was getting a lot of healthcheck errors before in some cases because of this.
I'd suggest using separate strategies for "usually valid indices" and "always valid indices". I'm also keen to upgrade our upstream basic_indices() strategy to provide the latter - is there anything other than the "always a tuple" thing that we should change?
That and slices. I can't completely tell what you are doing with ellipses but I would generate ellipses even when they are redundant (an ellipsis at the end of the tuple or an ellipsis when there are already ndim
terms in the tuple).
No, we don't have a live chat kind of thing... they can devolve into unpaid support and none of us want to go there. There's a stackoverflow tag which is pretty helpful though, and you're welcome to tag me in an issue.
So where's the best place to ask questions about hypothesis?
from ndindex.
Hi. Thanks for taking notice of this project.
Regarding slices, I intentionally rolled my own, because for this library I care about a every possible corner case of slices to ensure correctness, whereas presumably for most use-cases you would want to avoid generating duplicate slices. For example, if a slice step=None
, it is always equivalent to step=1
. Depending on what level of abstraction you are working in, there is no need to check this. But I always want to make sure I handle every corner case here.
For shapes, I do think hypothesis can improve. See https://github.com/Quansight/ndindex/blob/0d9d131a5310915f3a51ebe95491e43afcd1dcad/ndindex/tests/helpers.py#L58-L61. I want to generate array shapes but only keep those that create arrays that are not too big (also note the NumPy bug that hypothesis found here). I think this is actually a pretty big source of filtering in my code. But I'm not sure if there's a better way to generate such shapes directly to avoid filtering.
There's also quite a bit of filtering in the tests themselves because I filter out invalid indices. Perhaps I should try to be smarter about this. I do want to generate these indices for the base tests, but generally want to avoid them for higher level tests. For example, tuple indices with more than one ellipsis should raise an error. This is tested in the constructor test_tuple_hypothesis
but uses assume(False)
in the other tests. I've been hesitant to get too clever here because I don't want to accidentally filter out too much, and because what is there has been working (although admittedly some bugs were not found until the tests were ran several times on CI). Or as a simpler example, I should probably avoid step=0
in all but the base Slice constructor test because it raises an exception there. But I haven't bothered, even for my manual exhaustive test.
Another question I had about hypothesis is if you had any kind of gitter or something like that where I could ask questions? On your website I only saw a link to an IRC channel, which didn't seem to have anyone in it when I checked.
from ndindex.
slices
I've opened an issue to improve our st.slices()
strategy. Once that's done - no eta, since I think it's a good intro for new contributors - the only thing it won't generate is slices which have start != end
but select no elements due to step
. This is usually helpful (because around half the time nothing is selected otherwise), but in your case you might want to roll your own to keep generating such cases.
shapes
Capping the product of side lengths without introducing weird biases or shrinking problems is surprisingly difficult... our usual approach is still to filter them. Note that these problems only really matter if you're creating arrays of the given shape though.
Drawing the side lengths from ints() | just(1) | sampled_from([1, 2])
or similar could help, by increasing the chance that several dimensions are of size one (or maybe two) to boost dimensionality without increasing the product (much).
In the end filtering out a lot isn't actually a problem in itself, so if the performance is still OK you don't need to do anything. And congratulations on the bug - are you keeping a 'trophy case' list?
indices
I'd suggest using separate strategies for "usually valid indices" and "always valid indices". I'm also keen to upgrade our upstream basic_indices()
strategy to provide the latter - is there anything other than the "always a tuple" thing that we should change?
admittedly some bugs were not found until the tests were ran several times on CI
For complicated new tests I often set them to run 10k or 100k examples and leave them running for a while - 100 examples just isn't much to hit edge cases in high-dimensional spaces. There are plans in the works for a proper long-running mode, and I'm hoping to announce something really cool later this year... but for now just turn up the max_examples
setting 😅
Q&A
No, we don't have a live chat kind of thing... they can devolve into unpaid support and none of us want to go there. There's a stackoverflow tag which is pretty helpful though, and you're welcome to tag me in an issue.
from ndindex.
(back from an outback road trip now)
That and slices. I can't completely tell what you are doing with ellipses but I would generate ellipses even when they are redundant (an ellipsis at the end of the tuple or an ellipsis when there are already
ndim
terms in the tuple).
We already do this upstream 😁
I've opened a new issue for the SciPy sprints to generate non-tuple basic indices though, which I think was the last thing here.
So where's the best place to ask questions about hypothesis?
Probably https://stackoverflow.com/questions/tagged/python-hypothesis for general questions; for OSS-project-specific questions you can ping me in issues too.
from ndindex.
Related Issues (20)
- Make contributors section in readme
- Old links are broken HOT 4
- Testing mode for hypothesis explain phase
- Use CSS to layout the examples in the slices docs
- Add is_view method HOT 1
- v1.6: test_integerarray_reduce_hypothesis falsifying example HOT 8
- More concise reprs for Slice
- ndindex 1.7 apparently introduced some glitches in wheels for Win HOT 4
- Build wheels in the release script
- test failure with python 3.12b4 HOT 6
- Function to list elements selected by an index
- Cleanup API docs
- Tuple.reduce with ellipses raises on the last invalid index HOT 1
- as_subchunks test failure
- Convert a (view) index into strides and offset
- Math is broken in the slices document HOT 3
- ndarray(x) == NDIndex(x) and NDIndex(x) == ndarray(x) should return a consistent result HOT 7
- Consider opening a NEP to allow overrride of ndarray.__getitem__ dispatch HOT 4
- Move repo to Quansight-Labs
- Rename ndindex.py
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from ndindex.