Comments (11)
Here it is. I was able to cut out a lot of the initial code:
import polars as pl
import numpy as np
df = pl.DataFrame({"index_1":np.repeat(np.arange(100), 10), "index_2":np.repeat(np.arange(100), 10)})
df = pl.concat([df[0:500], df[500:]])
df = df.filter(df["index_1"] == 0)
df = df.with_columns(index_2 = pl.Series(values=[0]*10))
df.set_sorted("index_2")  # Also crashes on write_parquet and some other operations
It crashes for me (Windows 11).
---------------------------------------------------------------------------
PanicException Traceback (most recent call last)
Cell In[9], line 8
6 df = df.filter(df["index_1"] == 0)
7 df = df.with_columns(index_2 = pl.Series(values=[0]*10))
----> 8 df.set_sorted("index_2")
File C:\...\polars\dataframe\frame.py:10674, in DataFrame.set_sorted(self, column, descending, *more_columns)
10653 def set_sorted(
10654 self,
10655 column: str | Iterable[str],
10656 *more_columns: str,
10657 descending: bool = False,
10658 ) -> DataFrame:
10659 """
10660 Indicate that one or multiple columns are sorted.
10661
(...)
10669 Whether the columns are sorted in descending order.
10670 """
10671 return (
10672 self.lazy()
10673 .set_sorted(column, *more_columns, descending=descending)
> 10674 .collect(_eager=True)
10675 )
File C:\...\polars\lazyframe\frame.py:1967, in LazyFrame.collect(self, type_coercion, predicate_pushdown, projection_pushdown, simplify_expression, slice_pushdown, comm_subplan_elim, comm_subexpr_elim, cluster_with_columns, no_optimization, streaming, background, _eager, **_kwargs)
1964 # Only for testing purposes atm.
1965 callback = _kwargs.get("post_opt_callback")
-> 1967 return wrap_df(ldf.collect(callback))
PanicException: index out of bounds: the len is 1 but the index is 1
from polars.
Surprisingly, I cannot reproduce this with the given data/code; however, I have the same issue.
I will try to find the time to put together a minimal repro for my case.
The original issue no longer reproduces for me thanks to #16852
I can confirm as well! Thanks @ritchie46! 💯
@ritchie46 here is a repro for #16605
Can reproduce.
In case it is useful for debugging: it does not seem to happen when using the Lazy API.
df = df.lazy()
df = df.filter(pl.col("val1") | pl.col("val3"))
df = df.with_columns(pl.col("val4").max().over("group1", "group2").fill_null(0).alias("val4"))
df = df.filter(pl.col("val4") > pl.col("val7").sum().over("group1", "group2"))
df.with_columns(pl.col("val4").floor()).collect()
# shape: (9, 10)
# ┌────────┬────────┬──────┬──────┬───┬───────┬──────┬──────────┬───────────┐
# │ group1 ┆ group2 ┆ val1 ┆ val2 ┆ … ┆ val5 ┆ val6 ┆ val7 ┆ val8 │
# │ --- ┆ --- ┆ --- ┆ --- ┆ ┆ --- ┆ --- ┆ --- ┆ --- │
# │ i64 ┆ i64 ┆ bool ┆ f64 ┆ ┆ f64 ┆ i64 ┆ f64 ┆ f64 │
# ╞════════╪════════╪══════╪══════╪═══╪═══════╪══════╪══════════╪═══════════╡
# │ 1001 ┆ 100004 ┆ true ┆ null ┆ … ┆ 87.0 ┆ 0 ┆ 2.705119 ┆ 40.904418 │
# │ 1001 ┆ 100007 ┆ true ┆ null ┆ … ┆ 173.0 ┆ 0 ┆ 2.6165 ┆ 34.486 │
# │ 1001 ┆ 100009 ┆ true ┆ null ┆ … ┆ 211.0 ┆ 0 ┆ 4.458603 ┆ 77.95037 │
# │ 1001 ┆ 100010 ┆ true ┆ null ┆ … ┆ 178.0 ┆ 0 ┆ 2.3165 ┆ 37.77 │
# │ 1001 ┆ 100011 ┆ true ┆ null ┆ … ┆ 174.0 ┆ 0 ┆ 5.548593 ┆ 71.207139 │
# │ 1001 ┆ 100012 ┆ true ┆ null ┆ … ┆ 196.0 ┆ 0 ┆ 2.1685 ┆ 32.888 │
# │ 1001 ┆ 100015 ┆ true ┆ null ┆ … ┆ 89.0 ┆ 0 ┆ 2.400406 ┆ 39.732588 │
# │ 1003 ┆ 100008 ┆ true ┆ null ┆ … ┆ 238.0 ┆ 0 ┆ 4.913397 ┆ 93.076396 │
# │ 1003 ┆ 100013 ┆ true ┆ null ┆ … ┆ 101.5 ┆ 0 ┆ 2.254043 ┆ 45.486928 │
# └────────┴────────┴──────┴──────┴───┴───────┴──────┴──────────┴───────────┘
I cannot reproduce this 🤔
That one I can reproduce, thanks!
Taking a look.
Perhaps the original issue could be platform specific? I can reproduce it on macOS (same as @maxzw).
@Elvynzs I can reproduce your example also.
It seems to be a slightly different problem, and it may have to do with your use of a Series as the filter mask.
Changing the filter to use expressions makes the example run for me:
df.filter(pl.col("index_1") == 0)
@Elvynzs I'm not sure your issue is the same as the one in the description, but I'll check whether the fix also works for mine 😃