In the overview below (t)ime, stdev and count (of pages):

Table.load very slow with dtype('O') about tablite HOT 5 CLOSED

root-11 commented on May 26, 2024

Table.load very slow with dtype('O')

from tablite.

Comments (5)

realratchet commented on May 26, 2024

I'm not exactly sure what you would use re-indexing for if it's not loaded. Unless you mean you already have it re-indexed based on some other criteria and just want to blindly select the values from the array.

If that's the case, then no: it cannot be improved as it is now, because the pickle format doesn't allow random access. However, the Nim implementation should definitely be faster. I implemented the unpickler fully in Nim, whereas Python's unpickler is written in native Python and is not a C binding, which makes it slow. I haven't benchmarked it against the Python implementation, but I'm pretty sure it would be faster, and there are places where it could be made even faster.

root-11 commented on May 26, 2024

...re-indexing for if it's not loaded...

When I have an index and need to re-arrange fields in the right table during a join, the actual value that is being re-ordered doesn't matter. It only matters that the values are put in the right order.

For example, in a join where the right-side index is [4,2,3,1], all fields on all pages would have to be re-ordered to match the order dictated by the index. So it doesn't matter whether it's a struct, int, ... whatever, as long as the page that is output contains the bytes in the correct order.
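To illustrate the idea, here is a minimal sketch of re-ordering records as opaque bytes. It assumes the page holds fixed-width values, that the index is 0-based, and that NumPy is available; `reorder_fixed_width` is a hypothetical helper, not part of tablite.

```python
import numpy as np


def reorder_fixed_width(raw: bytes, itemsize: int, index: list[int]) -> bytes:
    """Re-order fixed-width records by index without decoding their values.

    Hypothetical helper for illustration only; not a tablite function.
    """
    # View the buffer as rows of `itemsize` opaque bytes each.
    records = np.frombuffer(raw, dtype=np.uint8).reshape(-1, itemsize)
    # Fancy indexing copies whole records into the requested order.
    return records[index].tobytes()


page = np.array([10, 11, 12, 13, 14], dtype=np.int64)
index = [4, 2, 3, 1]  # the join index from the example, read as 0-based here

out = reorder_fixed_width(page.tobytes(), page.dtype.itemsize, index)
reordered = np.frombuffer(out, dtype=np.int64)
print(reordered)  # [14 12 13 11]
```

This only works because every record has the same width, so record `i` always starts at byte offset `i * itemsize`.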

realratchet commented on May 26, 2024

Then no, it is not possible with the pickle format. Re-ordering like that requires random access, and because the pickled bytes are not aligned you're forced to read the entire page. Since the entire file has to be read, the only saving grace would be speeding up the reading process.
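The difference can be seen with plain NumPy. The sketch below assumes pages are stored as `.npy` files and uses made-up file names and values. A fixed-width page can be memory-mapped so that only the selected records are touched, while an object page is pickled inside the file, cannot be memory-mapped, and has to be loaded in full before indexing.

```python
import os
import tempfile

import numpy as np

tmp = tempfile.mkdtemp()
index = [4, 2, 3, 1]  # 0-based selection order, for illustration

# Fixed-width page: values sit at predictable byte offsets.
int_path = os.path.join(tmp, "ints.npy")  # hypothetical page file
np.save(int_path, np.arange(10, 15, dtype=np.int64))
ints = np.load(int_path, mmap_mode="r")   # nothing is read eagerly
picked = np.asarray(ints[index])          # reads only the selected records

# Object page: NumPy stores the values as one pickle stream.
obj_path = os.path.join(tmp, "objs.npy")  # hypothetical page file
np.save(obj_path, np.array([1, "a", None, 2.5, "bc"], dtype=object))

try:
    np.load(obj_path, mmap_mode="r", allow_pickle=True)
    mmap_ok = True
except ValueError:
    # NumPy refuses to memory-map arrays that contain Python objects.
    mmap_ok = False

# The whole page must be unpickled before any value can be selected.
objs = np.load(obj_path, allow_pickle=True)
picked_objs = objs[index]

print(picked, mmap_ok, picked_objs)
```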

root-11 commented on May 26, 2024

Ok. So the advice is "keep datatypes simple if you want speed."
