Comments (11)

esc avatar esc commented on July 22, 2024

any hunches?

from python-blosc.

FrancescAlted avatar FrancescAlted commented on July 22, 2024

Apparently this line:

https://github.com/FrancescAlted/blosc/blob/master/blosc/blosc.c#L657

evaluates to true, and hence typesize is set to 1, then the shuffle does not trigger, and hence the data cannot be compressed.

I still need some more research on why the heck this does not work properly on Mac OSX (using Lion here) :-/

esc avatar esc commented on July 22, 2024

Potentially an issue with signed / unsigned, potentially also issues with casting?

FrancescAlted avatar FrancescAlted commented on July 22, 2024

It's funny: I have just discovered that with gcc the comparison works properly, but with clang it fails. The problem is that clang is the default compiler on Mac OSX. Anyway, here are the details for the faulty clang:

$ clang -v
Apple clang version 3.0 (tags/Apple/clang-211.12) (based on LLVM 3.0svn)
Target: x86_64-apple-darwin11.4.2
Thread model: posix

For the record, my current clang works properly if the condition is written as:

  if ((int)typesize > BLOSC_MAX_TYPESIZE) {

instead of the original:

  if (typesize > BLOSC_MAX_TYPESIZE) {

Maybe this is a problem with a pretty old clang. I'm going to try updating it, although that might affect more people like me. Hmm...

FrancescAlted avatar FrancescAlted commented on July 22, 2024

Just updated Xcode (which includes clang) on my box, but the issue is not resolved. Here is the version of the new clang:

$ clang -v
Apple LLVM version 4.2 (clang-425.0.28) (based on LLVM 3.2svn)
Target: x86_64-apple-darwin11.4.2
Thread model: posix

The funny thing is that, after upgrading Xcode, gcc also shows the same (bad) behavior. Here is the gcc version:

$ gcc -v
Using built-in specs.
Target: i686-apple-darwin11
Configured with: /private/var/tmp/llvmgcc42/llvmgcc42-2336.11~182/src/configure --disable-checking --enable-werror --prefix=/Applications/Xcode.app/Contents/Developer/usr/llvm-gcc-4.2 --mandir=/share/man --enable-languages=c,objc,c++,obj-c++ --program-prefix=llvm- --program-transform-name=/^[cg][^.-]*$/s/$/-4.2/ --with-slibdir=/usr/lib --build=i686-apple-darwin11 --enable-llvm=/private/var/tmp/llvmgcc42/llvmgcc42-2336.11~182/dst-llvmCore/Developer/usr/local --program-prefix=i686-apple-darwin11- --host=x86_64-apple-darwin11 --target=i686-apple-darwin11 --with-gxx-include-dir=/usr/include/c++/4.2.1
Thread model: posix
gcc version 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.11.00)

Anyway, using:

if ((int)typesize > BLOSC_MAX_TYPESIZE) {

seems to fix the issue in both compilers. The problem is that I'm not sure why the cast should be needed (to me it looks like a clang/gcc problem on Mac OSX). Anyway, I'm afraid that the only way to solve this is to add the cast in Blosc directly. Hmm, time for a new Blosc release...

esc avatar esc commented on July 22, 2024

Can you reproduce this with pure blosc?

esc avatar esc commented on July 22, 2024

Maybe the following line:
https://github.com/FrancescAlted/python-blosc/blob/master/blosc/blosc_extension.c#L125

Should be using n instead of i? But that is a wild guess.

FrancescAlted avatar FrancescAlted commented on July 22, 2024

Your wild guess turned out to be correct. I'm very surprised that was the problem. In fact, Blosc itself does not have the problem (which is another indicator that the problem was the format). That means that we should be very cautious when passing arguments to C extensions. You have been very inspired, man. Thanks!

esc avatar esc commented on July 22, 2024

I haven't fully grokked the whole size_t, ssize_t and Py_ssize_t business yet, but it seems to be quite important. Do you have any hypothesis about what exactly went wrong? I don't see how an integer value of 4 could be misinterpreted...

FrancescAlted avatar FrancescAlted commented on July 22, 2024

Frankly, I don't understand either what was happening, and why this affected just the Mac platform (Linux and Win seemed to be perfectly happy without the patch above).

Anyway, I have just committed another patch (rev dc1525c) to make the treatment of size_t uniform. It seems to pass all tests on my Mac box (I still need to test that on Win and Linux, but I don't expect surprises there).

FrancescAlted avatar FrancescAlted commented on July 22, 2024

I think I know what was happening here. On 64-bit platforms, size_t is 64 bits wide, so when asking for an 'i' conversion, only the lower 32 bits are set. The upper 32 bits are left untouched, so they could hold dirty values (i.e. they are not necessarily zeroed). Because of this, the comparison:

if (typesize > BLOSC_MAX_TYPESIZE) 

could fail depending on the dirty values in the upper 32 bits. Of course:

if ((int)typesize > BLOSC_MAX_TYPESIZE)

worked because the cast forces only the lower 32 bits to be considered in the comparison.
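
For anyone who wants to see the effect in isolation, here is a minimal sketch in plain C. It does not use the Python C API; the partial write is simulated with memcpy, the helper names are hypothetical, and MAX_TYPESIZE is only a stand-in for BLOSC_MAX_TYPESIZE (its value of 255 is an assumption here). It also assumes a platform where size_t is wider than int, as on 64-bit Mac OSX and Linux.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for BLOSC_MAX_TYPESIZE; the value 255 is an assumption. */
#define MAX_TYPESIZE 255

/* Simulate a parser that was told to store an int ('i') but was handed
   the address of a size_t: only sizeof(int) bytes are overwritten, the
   remaining bytes keep whatever was there before ("stale"). */
static size_t store_int_into_size_t(size_t stale, int value)
{
    size_t slot = stale;
    assert(sizeof(int) <= sizeof(size_t));
    memcpy(&slot, &value, sizeof(int));  /* partial write */
    return slot;
}

/* The original comparison: looks at every bit of the size_t. */
static int exceeds_max_raw(size_t typesize)
{
    return typesize > MAX_TYPESIZE;
}

/* The workaround: the cast drops everything above the int width. */
static int exceeds_max_cast(size_t typesize)
{
    return (int)typesize > MAX_TYPESIZE;
}
```

With zeroed stale bytes, both comparisons agree that a typesize of 4 is fine. With stale bytes that are all ones and a 64-bit size_t, exceeds_max_raw reports the value as too large while exceeds_max_cast does not, which matches the behavior described above. Whether the real stale bytes are zero or not depends on what happened to be on the stack, which would explain why the bug showed up on some platforms and compilers but not others.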
