Comments (4)
Thank you for the explanation.
from polars.
The Python list gets passed to the Series constructor, which has defaulted to strict=True since 1.0.
You can build the Series yourself with a more lenient constructor call and pass that in instead:
>>> s = pl.Series([1, 2, 3, 4])
>>> s.is_in(pl.Series([2, 2.5], strict=False))
shape: (4,)
Series: '' [bool]
[
false
true
false
false
]
I think we should expose the strict argument on the is_in expression so that we can do s.is_in([2, 2.5], strict=False). Then that error message makes sense again.
Implicit casting from integer to float64 (when the integer is inside the exactly representable range of a float64) is standard in many systems: NumPy and base R both do it, and Python and many other programming languages implicitly cast integers to floats in arithmetic operations. Do you have anything written up on what motivated this decision, so I can understand the reasoning?
Given how I was impacted in the very innocuous setting of passing a deserialized JSON array of numbers to is_in, I would guess that this could become a sharp edge for many users.
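For reference, the "exactly representable range" mentioned here can be checked in plain Python. The sketch below assumes only that Python floats are IEEE 754 float64 values with a 53-bit significand; `casts_exactly` is an illustrative name.

```python
# Every integer with magnitude up to 2**53 has an exact float64 representation.
LIMIT = 2**53


def casts_exactly(n: int) -> bool:
    """Return True if converting n to float and back preserves its value."""
    return int(float(n)) == n


print(casts_exactly(LIMIT))      # True
print(casts_exactly(LIMIT + 1))  # False: 2**53 + 1 rounds to 2**53
```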
Polars is strict about data types. Dynamic user input can lead to data type bugs down the line, and we want to catch those bugs at the source.
E.g. if you read in [1] you get an Int64 column. The next time you run the query on other JSON data you get [1, 2.0] and the data is inferred to be Float64, and Polars will fail somewhere else because you expected an integer type.
Polars guards you from that by being strict by default. You can opt out of it or, even better, set the dtype you want the input to be parsed as.