Comments (5)
Hey, I'd be interested in taking a stab at this issue if it's available!
Sure, go ahead!
from polars.
Or is there some other input where users should define the mask?
The mask is the validity buffer of the Series. The user doesn't define it manually.
Hey, I'm beginning to look into this and just want to make sure I'm clear about what the source for the masked buffer is. Is this something that you envision being passed as part of the to_numpy function? I.e., I'd be able to write:
x = pl.Series([1, 2, -1, 4]).to_numpy(mask=[0, 0, 1, 0])
Or is there some other input where users should define the mask?
Hey, I've been very slow to get started on this but finally have some time. A quick question about the Series type: is there a way to access the validity buffer without having to know the underlying datatype of the ChunkedArray?
Also, I wanted to ask about the behavior for arrays that have a null (absent) bitmask. I assume this means that all entries are valid, and we should construct the Python array as such?
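For reference, a small pure-Python sketch of the conversion being asked about, assuming the Arrow convention that a set bit means "valid", that bits are numbered from the least significant bit, and that a missing bitmap means every entry is valid. The helper name `validity_to_mask` and its signature are hypothetical and not part of Polars.

```python
from typing import List, Optional


def validity_to_mask(bitmap: Optional[bytes], length: int, offset: int = 0) -> List[bool]:
    """Turn an Arrow-style validity bitmap into a NumPy-style mask.

    Arrow: bit == 1 means the slot is valid.
    NumPy: mask == True means the slot is masked (invalid).
    """
    if bitmap is None:
        # No bitmap at all: assumed to mean every entry is valid.
        return [False] * length

    mask = []
    for i in range(length):
        pos = i + offset
        # Bits are read least-significant-bit first within each byte.
        is_valid = (bitmap[pos // 8] >> (pos % 8)) & 1
        mask.append(not is_valid)
    return mask


# Bits 0, 1 and 3 are set, so only index 2 is null.
example = validity_to_mask(bytes([0b00001011]), 4)
all_valid = validity_to_mask(None, 4)
```

The `None` branch is the case from the question: nothing is masked, so the output array can be built directly from the values.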