
Comments (2)

stinodego commented on July 24, 2024

Converting from Arrow is not always zero-copy. We use a different string representation than most existing Arrow implementations, so the behavior here is expected.
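As a rough illustration of why a layout mismatch rules out zero-copy, here is a toy sketch in plain Python. It is not Polars or Arrow code: `share_ints`, `convert_strings`, and the "one bytes object per value" target layout are hypothetical names and assumptions chosen only to show the idea. The source side stores strings as offsets plus one contiguous byte buffer, which is the common Arrow string layout.

```python
from array import array

# Toy model only: neither function reflects Polars' or Arrow's real internals.

def share_ints(values: array) -> memoryview:
    # Same layout on both sides, so the consumer can simply wrap the
    # producer's buffer. No bytes are copied.
    return memoryview(values)

def convert_strings(offsets: array, data: bytearray) -> list[bytes]:
    # Source layout: one contiguous byte buffer plus n + 1 offsets.
    # Target layout (hypothetical): one independent bytes object per value.
    # Every slice allocates new memory, so this conversion must copy.
    return [
        bytes(data[offsets[i]:offsets[i + 1]])
        for i in range(len(offsets) - 1)
    ]

# Numeric column: the "converted" column still points at the original buffer.
ints = array("q", [1, 2, 3])
shared = share_ints(ints)
ints[0] = 99
shared_sees_change = shared[0] == 99

# String column: the converted column owns its own bytes.
offsets = array("q", [0, 2, 5])
data = bytearray(b"abcde")
copied = convert_strings(offsets, data)
data[0] = ord("z")
copy_sees_change = copied[0] != b"ab"
```

In this sketch a write to the source is visible through the shared numeric view but not through the converted strings, which is the observable difference between a zero-copy hand-off and a copying conversion.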

from polars.

useredsa commented on July 24, 2024

Hi, @stinodego,

I still have the following questions:

  1. If that is the case, shouldn't the documentation say so explicitly? Strings are a very common type, and I think most people would assume the conversion is zero-copy.
  2. This example suggests that the whole dataframe is copied, because the memory required is double the dataframe size. If the cause is the string representation you mention, shouldn't only the string columns be copied?
  3. Is there anything we can do to circumvent this, such as using a particular data type with pyarrow?
  4. Will something similar happen with categories? Or is converting to categories first a good alternative when the string columns have only a small number of distinct values?

Thanks in advance,

