
Comments (11)

Elvynzs commented on September 27, 2024

Here it is. I was able to cut out a lot of the initial code:

import polars as pl
import numpy as np

df = pl.DataFrame({"index_1": np.repeat(np.arange(100), 10), "index_2": np.repeat(np.arange(100), 10)})
df = pl.concat([df[0:500], df[500:]])
df = df.filter(df["index_1"] == 0)
df = df.with_columns(index_2 = pl.Series(values=[0]*10))
df.set_sorted("index_2")  # Also crashes on write_parquet and some other operations

It crashes for me (Windows 11).

---------------------------------------------------------------------------
PanicException                            Traceback (most recent call last)
Cell In[9], line 8
      6 df = df.filter(df["index_1"] == 0)
      7 df = df.with_columns(index_2 = pl.Series(values=[0]*10))
----> 8 df.set_sorted("index_2")

File C:\...\polars\dataframe\frame.py:10674, in DataFrame.set_sorted(self, column, descending, *more_columns)
  10653 def set_sorted(
  10654     self,
  10655     column: str | Iterable[str],
  10656     *more_columns: str,
  10657     descending: bool = False,
  10658 ) -> DataFrame:
  10659     """
  10660     Indicate that one or multiple columns are sorted.
  10661 
   (...)
  10669         Whether the columns are sorted in descending order.
  10670     """
  10671     return (
  10672         self.lazy()
  10673         .set_sorted(column, *more_columns, descending=descending)
> 10674         .collect(_eager=True)
  10675     )

File C:\...\polars\lazyframe\frame.py:1967, in LazyFrame.collect(self, type_coercion, predicate_pushdown, projection_pushdown, simplify_expression, slice_pushdown, comm_subplan_elim, comm_subexpr_elim, cluster_with_columns, no_optimization, streaming, background, _eager, **_kwargs)
   1964 # Only for testing purposes atm.
   1965 callback = _kwargs.get("post_opt_callback")
-> 1967 return wrap_df(ldf.collect(callback))

PanicException: index out of bounds: the len is 1 but the index is 1

from polars.

Elvynzs commented on September 27, 2024

Surprisingly, I cannot reproduce this using the given data/code; however, I have the same issue.

I will try to find the time to make a minimal repro for my case.


cmdlineluser commented on September 27, 2024

The original issue no longer reproduces for me, thanks to #16852.


maxzw commented on September 27, 2024

I can confirm as well! Thanks @ritchie46! 💯


maxzw commented on September 27, 2024

@ritchie46 here is a repro for #16605


cmdlineluser commented on September 27, 2024

Can reproduce.

If it is of use for debugging: It does not seem to happen using the Lazy API.

df = df.lazy()
df = df.filter(pl.col("val1") | pl.col("val3"))
df = df.with_columns(pl.col("val4").max().over("group1", "group2").fill_null(0).alias("val4"))
df = df.filter(pl.col("val4") > pl.col("val7").sum().over("group1", "group2"))
df.with_columns(pl.col("val4").floor()).collect()

# shape: (9, 10)
# ┌────────┬────────┬──────┬──────┬───┬───────┬──────┬──────────┬───────────┐
# │ group1 ┆ group2 ┆ val1 ┆ val2 ┆ … ┆ val5  ┆ val6 ┆ val7     ┆ val8      │
# │ ---    ┆ ---    ┆ ---  ┆ ---  ┆   ┆ ---   ┆ ---  ┆ ---      ┆ ---       │
# │ i64    ┆ i64    ┆ bool ┆ f64  ┆   ┆ f64   ┆ i64  ┆ f64      ┆ f64       │
# ╞════════╪════════╪══════╪══════╪═══╪═══════╪══════╪══════════╪═══════════╡
# │ 1001   ┆ 100004 ┆ true ┆ null ┆ … ┆ 87.0  ┆ 0    ┆ 2.705119 ┆ 40.904418 │
# │ 1001   ┆ 100007 ┆ true ┆ null ┆ … ┆ 173.0 ┆ 0    ┆ 2.6165   ┆ 34.486    │
# │ 1001   ┆ 100009 ┆ true ┆ null ┆ … ┆ 211.0 ┆ 0    ┆ 4.458603 ┆ 77.95037  │
# │ 1001   ┆ 100010 ┆ true ┆ null ┆ … ┆ 178.0 ┆ 0    ┆ 2.3165   ┆ 37.77     │
# │ 1001   ┆ 100011 ┆ true ┆ null ┆ … ┆ 174.0 ┆ 0    ┆ 5.548593 ┆ 71.207139 │
# │ 1001   ┆ 100012 ┆ true ┆ null ┆ … ┆ 196.0 ┆ 0    ┆ 2.1685   ┆ 32.888    │
# │ 1001   ┆ 100015 ┆ true ┆ null ┆ … ┆ 89.0  ┆ 0    ┆ 2.400406 ┆ 39.732588 │
# │ 1003   ┆ 100008 ┆ true ┆ null ┆ … ┆ 238.0 ┆ 0    ┆ 4.913397 ┆ 93.076396 │
# │ 1003   ┆ 100013 ┆ true ┆ null ┆ … ┆ 101.5 ┆ 0    ┆ 2.254043 ┆ 45.486928 │
# └────────┴────────┴──────┴──────┴───┴───────┴──────┴──────────┴───────────┘


stinodego commented on September 27, 2024

I cannot reproduce this 🤔


stinodego commented on September 27, 2024

That one I can reproduce, thanks!


ritchie46 commented on September 27, 2024

Taking a look.


cmdlineluser commented on September 27, 2024

Perhaps the original issue could be platform specific? I can reproduce it on macOS (same as @maxzw).

@Elvynzs I can reproduce your example also.

It seems it may be a slightly different problem, related to your use of a Series in the filter.

Changing the filter to use expressions makes the example run for me:

df.filter(pl.col("index_1") == 0)


maxzw commented on September 27, 2024

@Elvynzs I'm not sure your issue is the same as the one in the description, but I'll check if the fix also works for mine 😃

