Code Monkey home page Code Monkey logo

Comments (11)

zeux avatar zeux commented on August 17, 2024

I'd say it's more likely to be a user error. This exact problem would happen if you try to use xml_node::next_sibling on a node in a subtree that was removed from the document.

pugixml test suite uses a custom allocator that marks freed allocation with no-access attribute: https://github.com/zeux/pugixml/blob/master/tests/allocator.cpp#L59-L67. There is a possibility that there is a edge case that does not occur in the test suite and triggers a bug in deallocation code but I'd say that's unlikely.

Please let me know if you get any more information about the issue.

from pugixml.

zeux avatar zeux commented on August 17, 2024

Note: tests on OSX/Linux (including Travis-CI builds) now run with the same page-heap-like allocator (catches buffer overrun and use-after-free).

from pugixml.

homer6 avatar homer6 commented on August 17, 2024

Thanks for your quick response and sorry for my delayed response.

The project that I'm applying this to has been in production for a while and is very stable. I've only added the hacky, experimental code (below) to try to move to mmap. The code below is by no means meant to be added to our production code as is. It is strictly a temporary proof of concept.

Also, I was wondering about increasing the pugixml memory page size beyond the 16 bit limitation. We process rather large documents that, even at 32k page size, result in a large number of allocations.

static unordered_map<void*,size_t> allocations_map;
static mutex allocations_map_mutex;

#include <sys/mman.h>
#include <new>
#include <cstdlib>
#include <cstring>

void* custom_allocate( size_t size ){

    void *p;

    unique_lock<mutex> current_lock( allocations_map_mutex );

    if( size < 1024 ){
        p = malloc( size );
        cout << "Allocating malloc: " << p << " " << std::to_string(size) << endl;
        if( p == nullptr ){
            throw std::bad_alloc();
        }
        allocations_map[ p ] = size;
        return p;
    }else{
        p = mmap( 0, size, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 );
        cout << "Allocating mmap: " << p << " " << std::to_string(size) << endl;
        if( p != MAP_FAILED ){
            allocations_map[ p ] = size;
            return p;
        }
        throw std::bad_alloc();
    }


}

void custom_deallocate( void* p ){

    unique_lock<mutex> current_lock( allocations_map_mutex );

    if( !allocations_map.count(p) ){
        throw std::bad_alloc();
    }

    size_t size = allocations_map[p];

    cout << "Deallocating: " << p << " " << std::to_string(size) << endl;

    memset( p, 0, size );

    if( size < 1024 ){
        free( p );
        allocations_map.erase( allocations_map.find(p) );
    }else{
        int result = munmap( p, size );
        if( result == -1 ){
            throw std::bad_alloc();
        }
        allocations_map.erase( allocations_map.find(p) );
    }

    cout << "Deallocated." << endl;

}

In main:

pugi::set_memory_management_functions( custom_allocate, custom_deallocate );

Configure:

-DPUGIXML_MEMORY_PAGE_SIZE=65456

Thanks again for a great, high quality library. I've been using it for a few years now. I love the API and the documentation is excellent. Please let me know if I can contribute in any way.

Thanks

from pugixml.

zeux avatar zeux commented on August 17, 2024

FWIW I ran the test suite with your allocation functions and all tests pass (except for the tests that rely on simulating "out of memory" conditions). So it's likely that the segmentation fault you're observing is due to the application code accessing nodes after they've been removed.

It's possible to relax the restrictions on page size somewhat; I'll look into this (issue #35).

from pugixml.

homer6 avatar homer6 commented on August 17, 2024

Thanks for taking the time and trying to replicate the segfault.

When I comment out the "pugi::set_memory_management_functions()" line, the program works as expected. This is what initially lead me to believe that it might actually be a bug with pugixml.

I know this is a lot to ask, but are you willing to provide an mmap-only allocator? That way, perhaps we can narrow down what might be happening. I strongly suspect that the error is in the custom allocator that I provided above.

Alternatively, I could try a few other allocators to see what the results of their use is.

Thanks

from pugixml.

zeux avatar zeux commented on August 17, 2024

Your allocator code looks correct to me.

There are three options here:

  1. The bug is in application code. The reason it only surfaces with a mmap allocator is that malloc does not actually release the freed memory blocks to the system when you call free(), so reading/writing usually does not crash but instead corrupts the content - which is not a problem for reading or if the access is immediately after free.
    For example, here's some pugixml code that will crash under mmap-based allocator but won't crash under malloc-based allocator:
    https://gist.github.com/zeux/68b6c2dda1917f90a961
    (comment out the setup for custom allocation functions to see the difference)
  2. The bug is in pugixml allocation code. To determine that this is the case we'd need to prove that application code is correct - e.g. by replacing pugixml pooled allocation code with 1 mmap/munmap per allocation.
  3. The bug is in custom allocator. You can try the functions from the gist above; they are correct as far as I can see.

from pugixml.

homer6 avatar homer6 commented on August 17, 2024

Thank you so much. I believe you're right about the application logic. The gist that you provided was a perfect example.

Will close once I confirm. Thanks.

from pugixml.

zeux avatar zeux commented on August 17, 2024

Any updates?

from pugixml.

homer6 avatar homer6 commented on August 17, 2024

Sorry Arseny,

We just had a big release that didn't contain this fix. However, this issue
is now at the top of the list. I'll have an update this week. Thanks for
your support.

Steve

On Monday, April 13, 2015, Arseny Kapoulkine [email protected]
wrote:

Any updates?


Reply to this email directly or view it on GitHub
#33 (comment).

from pugixml.

zeux avatar zeux commented on August 17, 2024

No worries and no rush! Just want to make sure there is a resolution at some point.

from pugixml.

homer6 avatar homer6 commented on August 17, 2024

Very sorry. We didn't have the resources to pursue this further.

from pugixml.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.