Comments (5)

jbms avatar jbms commented on May 14, 2024

Thanks for sending this example --- I'll look into it and hopefully get to the bottom of it.

from tensorstore.

jbms avatar jbms commented on May 14, 2024

Testing this myself, I'm not seeing significant growth when running this repeatedly, and pprof doesn't show any leaked memory either. (I ran pprof using a Google-internal mechanism; I'm not sure how easily the open-source pprof can be used with tensorstore.)

I might be missing something, though. Can you share the outputs you are seeing from this?

I do see some amount of memory retained in the free page list by the memory allocator. Additionally, because of how tensorstore uses threads, the arr[...] = data[s1] call may complete while a background thread still momentarily retains temporary buffers used for writing, such that it is possible the "end bytes" line is run before all memory is freed, but it should be freed immediately after.
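Because a background thread may hold buffers briefly after the assignment returns, a reading taken right away can overstate what is retained. A minimal sketch of one way to wait for the reading to settle is below. The helper name `settled_rss` and its parameters are hypothetical and not part of tensorstore; `read_rss` is any callable that returns the current resident set size in bytes.

```python
import gc
import time


def settled_rss(read_rss, max_wait=2.0, interval=0.1, tolerance=0):
    """Poll read_rss() until two consecutive readings agree or max_wait elapses.

    read_rss is a caller-supplied callable returning memory use in bytes
    (hypothetical helper, for illustration only).
    """
    gc.collect()  # drop Python-level garbage before measuring
    previous = read_rss()
    deadline = time.monotonic() + max_wait
    while time.monotonic() < deadline:
        time.sleep(interval)
        current = read_rss()
        if abs(current - previous) <= tolerance:
            return current
        previous = current
    # The reading never stabilised within max_wait; return the latest value.
    return previous
```

With psutil installed, this could be called as `settled_rss(lambda: psutil.Process().memory_info().rss, tolerance=1 << 20)`. Memory kept in the allocator's free page list would still be counted in the result, so a stable reading does not by itself rule a leak in or out.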

Separately, maybe you are already aware of this and this is just an example for testing memory leaks, but the way you are writing the sharded precomputed format is very inefficient because the writes are not shard aligned. When a write isn't shard aligned, tensorstore has to read each existing shard that is affected, modify it, and then rewrite it. (Modifying in place is not possible.) Therefore, there is a large write amplification factor here.
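As a rough illustration of shard alignment along one axis, the sketch below rounds a requested slab thickness up to a whole number of shards, so each write starts and ends on a shard boundary. It assumes each shard covers a fixed number of voxels along the slab axis; that number depends on the chunk size and sharding parameters and is a made-up value here. The function name `shard_aligned_slabs` is hypothetical, not a tensorstore API.

```python
import math


def shard_aligned_slabs(length, slab_size, shard_size):
    """Yield slices along one axis whose boundaries are multiples of shard_size.

    length:     extent of the array along the slab axis
    slab_size:  slab thickness the caller would like to write
    shard_size: voxels covered by one shard along that axis (assumed known)
    """
    # Round the requested thickness up to a whole number of shards.
    aligned = max(1, math.ceil(slab_size / shard_size)) * shard_size
    for start in range(0, length, aligned):
        # The final slab may be shorter if length is not a multiple of aligned.
        yield slice(start, min(start + aligned, length))


for sl in shard_aligned_slabs(length=1000, slab_size=100, shard_size=256):
    print(sl)
```

This only handles the slab axis. A write is fully shard aligned only if it also covers whole shards along the other dimensions, which is the case when each slab spans the full extent of those dimensions.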

d-v-b avatar d-v-b commented on May 14, 2024

This function logs memory use in bytes before and after the slab-wise saving of the array:

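import psutil
import tensorstore as ts
from tqdm import tqdm

# Assumption: `process` is a psutil handle; `arr`, `data`, `slices` come from the original example.
process = psutil.Process()

def test_mem(nruns=1):
    result = []
    for _ in range(nruns):
        start_bytes = process.memory_info().rss
        # Write the array one slab at a time.
        for sl in tqdm(slices):
            arr[ts.d['channel'][0]][sl] = data[sl]
        stop_bytes = process.memory_info().rss
        result.append((start_bytes, stop_bytes))
    return result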

Running this with nruns=10 returned the following output (rows are runs, columns are pre- and post-saving memory use in bytes):

[(17265467392, 21378891776),
 (21378891776, 22748270592),
 (22748270592, 22967046144),
 (22967046144, 23119863808),
 (23119863808, 23220432896),
 (23220436992, 23305125888),
 (23305129984, 23373012992),
 (23373012992, 23467606016),
 (23467606016, 24198459392),
 (24198459392, 24362373120)]

It looks like memory use climbs each time the full array is saved. I'm running this again with three times as many runs to see whether it plateaus somewhere.
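To make the trend easier to read, the per-run growth can be computed directly from the numbers reported above. This is only arithmetic on that output; the variable names are illustrative.

```python
# (start_bytes, stop_bytes) pairs copied from the reported output.
runs = [
    (17265467392, 21378891776),
    (21378891776, 22748270592),
    (22748270592, 22967046144),
    (22967046144, 23119863808),
    (23119863808, 23220432896),
    (23220436992, 23305125888),
    (23305129984, 23373012992),
    (23373012992, 23467606016),
    (23467606016, 24198459392),
    (24198459392, 24362373120),
]

MIB = 1024 ** 2

# Growth in resident set size within each run.
growth = [stop - start for start, stop in runs]
for i, delta in enumerate(growth, start=1):
    print(f"run {i:2d}: +{delta / MIB:8.1f} MiB")

# Growth from the first reading to the last one.
total_growth = runs[-1][1] - runs[0][0]
print(f"total:  +{total_growth / MIB:8.1f} MiB")
```

Every run shows positive growth. The increment shrinks through run 7 and then rises again at run 9, so ten runs alone do not show whether memory use levels off.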

And yes, I know this is a pathological way to write to the sharded format. I haven't gotten around to writing the logic for shard-aligned writes. (This would be a great feature for tensorstore :) )

2bndy5 avatar 2bndy5 commented on May 14, 2024

Building from source displays this warning:

path\to\tmp_dir\execroot\com_google_tensorstore\external\com_google_riegeli\riegeli\base\base.h(169) : warning C4722: 'riegeli::internal::UnreachableStream::~UnreachableStream': destructor never returns, potential memory leak

Not sure if that is related or a separate known issue.

jbms avatar jbms commented on May 14, 2024

That is a spurious warning --- that type is only used in the false branch of an assertion when debugging is disabled:

https://github.com/google/riegeli/blob/5b42e9a8fc1a79d73836a471969b0d3c4d67d47c/riegeli/base/base.h#L258
