Comments (3)

nicholasturnbull commented on June 12, 2024

Update: This doesn't appear to be an endianness issue. It seems instead to be related to read overruns in data_files.cc and pos_zone.cc, because the field lengths in each record are determined by successive reads rather than by sliced segments. The more likely cause is a difference in type definitions between 64-bit and 32-bit architectures.

from viewtouch.

nicholasturnbull commented on June 12, 2024

I'm currently working on this. Confirmed to be an arithmetic error in the base64 encoding scheme used for vt_data. On systems with an exceptionally large long int, the shift never reaches the sign bit, so the usual effect on the sign does not occur. As a result, very large integers are read in place of negative ints smaller than a certain size, because the original encoding scheme was faulty to begin with. I am currently working out how to unmangle the values. Since the shifting behaviour previously worked in both directions, this wasn't an issue before. It may be the gcc environment or the machine architecture; I'm not sure. By adding print statements to the load process in manager.cc, we find that it starts to go wrong at i=200 in the page records in vt_data, after which the fields get out of sync.
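To illustrate the kind of failure described above, here is a minimal sketch. It is not the actual ViewTouch decoder: the function names (`fold_digits`, `decode_narrow`, `decode_wide`, `unmangle`) and the digit layout (6-bit digits, most significant first) are assumptions made for the example. It shows how the same digits come out negative when the accumulator is 32 bits wide but as a large positive number when it is 64 bits wide, plus one possible repair that assumes every stored value was meant to fit in 32 bits.

```cpp
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical decoder, NOT ViewTouch's actual routine: fold 6-bit digits,
// most significant first, into an unsigned accumulator of type UInt.
// Unsigned arithmetic is used so that bits shifted off the top are simply
// discarded instead of triggering signed-overflow problems.
template <typename UInt>
UInt fold_digits(const std::vector<unsigned>& digits) {
    UInt acc = 0;
    for (unsigned d : digits) {
        acc = static_cast<UInt>((acc << 6) | static_cast<UInt>(d & 0x3Fu));
    }
    return acc;
}

// Reinterpret a 32-bit pattern as a two's-complement signed value.
std::int64_t as_signed32(std::uint32_t bits) {
    const std::int64_t v = bits;
    return (bits & 0x80000000u) ? v - 4294967296LL : v;
}

// What a 32-bit signed accumulator would effectively hold: bit 31 is the
// sign bit, so a pattern such as 0xFFFFFFFF reads back as -1.
std::int64_t decode_narrow(const std::vector<unsigned>& digits) {
    return as_signed32(fold_digits<std::uint32_t>(digits));
}

// The same fold in a 64-bit accumulator: the shift never reaches bit 63,
// so the value stays positive. (Assumes at most 10 digits, i.e. 60 bits.)
std::int64_t decode_wide(const std::vector<unsigned>& digits) {
    return static_cast<std::int64_t>(fold_digits<std::uint64_t>(digits));
}

// One possible repair, ASSUMING every stored value was meant to fit in a
// signed 32-bit int: values in (INT32_MAX, UINT32_MAX] are shifted back
// into the negative range; everything else is returned unchanged.
std::int64_t unmangle(std::int64_t v) {
    const std::int64_t int32_max = std::numeric_limits<std::int32_t>::max();
    const std::int64_t uint32_max = std::numeric_limits<std::uint32_t>::max();
    if (v > int32_max && v <= uint32_max) {
        return v - 4294967296LL;
    }
    return v;
}
```

With the digits `{3, 63, 63, 63, 63, 63}` (the 32-bit pattern 0xFFFFFFFF), `decode_narrow` yields -1 while `decode_wide` yields 4294967295. Whether the real encoding behaves exactly like this depends on details of the file format that this sketch does not attempt to reproduce.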


nicholasturnbull commented on June 12, 2024

We now have an internal experimental patch for base64 handling that cures this (temporarily), but we're going to rewrite data_file.cc completely: it needs to load the records properly rather than successively read tokens from the file, and base64 numerical values need to be parsed from vti_data rather than converted directly into integer values. Before going any further, however, if you're having this problem, there may be a quick fix:

Test that your vti_data file isn't corrupt by renaming it to vti_data.txt.gz and decompressing it with gzip. If you see a checksum error, it was mangled in the download. Gene kindly sent us a vti_data that seems to have fewer loading issues than the one available in vti_updates on viewtouch.com. Try getting a copy of it via e-mail: contact me or Gene and we can send you one.

For developers: Adding a print statement to manager.cc that outputs the i loop value for the vti_data loading process, along with the page number, is very helpful for seeing what's going on. The page IDs should be a few hundred negative numbers in descending order to -1. If you see page IDs with massive values (e.g. 6333429556), then it's a base64 maths error that is causing negative integer values to fail to load. This will be fixed in our patched version, which we aim to have on GitHub at scimatix/viewtouch within the next week.
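The check described above can also be automated. The following is a rough sketch only: `looks_mangled` and `first_mangled_index` are hypothetical helpers, not functions from manager.cc, and the threshold (anything above the signed 32-bit maximum) is an assumption about what counts as a "massive" page ID.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical check: system page IDs are expected to be small negative
// numbers, so any ID above the signed 32-bit maximum is treated as a sign
// that a negative value failed to load correctly.
bool looks_mangled(long long id) {
    return id > std::numeric_limits<std::int32_t>::max();
}

// Returns the loop index of the first suspicious ID, or ids.size() if
// every ID looks plausible. This is the index one would otherwise find by
// reading the printed i values by eye.
std::size_t first_mangled_index(const std::vector<long long>& ids) {
    for (std::size_t i = 0; i < ids.size(); ++i) {
        if (looks_mangled(ids[i])) {
            return i;
        }
    }
    return ids.size();
}
```

Collecting the page IDs into a vector during loading and calling `first_mangled_index` on it would point at the record where the fields start to drift, without having to scan the log output manually.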

