Comments (4)
Yes, we will fix that. ;)
from polars.
I have the same issue.
from polars.
This is a stack overflow. We can improve on that. However, if you are doing such wide horizontal operations, I would consider transposing, as this will never be performant (and it will not be performant in pandas either once they transition to pyarrow).
from polars.
On my two computers, the code starts crashing at around 750 columns.
I would argue that this number of columns, while quite large, should not crash polars.
In my use case, I use polars to handle time series, and unfortunately in the worst case we can have ~1k columns for 10k-100k rows, so transposing the dataframe would not help. We would have to write custom code to split the dataframes and then aggregate the results.
(In my initial code, I used all_horizontal to drop all rows containing at least 1 nan)
As a temporary workaround, sum_horizontal does not crash, so maybe I will see if I can rely on this method instead.
from polars.