Comments (5)
Thanks for sending this example --- I'll look into it and hopefully get to the bottom of it.
from tensorstore.
Testing this myself, I'm not seeing significant growth when running this repeatedly, and pprof also doesn't show any leaked memory. (I ran pprof using a google-internal mechanism; I'm not sure how easily the open-source pprof can be used with tensorstore.)
I might be missing something, though. Can you share the outputs you are seeing from this?
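If it helps narrow things down on your side: tracemalloc is a portable way to check whether the growth is in Python-level allocations at all. A minimal sketch (the list allocation here is just a stand-in for the real write loop):

```python
import tracemalloc

# tracemalloc only sees allocations made through Python's allocator,
# so if RSS keeps growing while these numbers stay flat, the growth is
# on the native side (allocator free lists, C++ buffers), not leaked
# Python objects.
tracemalloc.start()
junk = [bytes(1024) for _ in range(1000)]  # stand-in for the real workload
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()
print(current, peak)
```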
I do see some amount of memory retained in the allocator's free page list. Additionally, because of how tensorstore uses threads, the arr[...] = data[sl] call may complete while a background thread still momentarily retains temporary buffers used for writing, so it is possible that the "end bytes" line runs before all of that memory is freed; it should be freed immediately afterward, though.
Separately, maybe you are already aware of this and this is just an example for testing memory leaks, but the way you are writing the sharded precomputed format is very inefficient because the writes are not shard aligned. When a write isn't shard aligned, tensorstore has to read each existing shard that is affected, modify it, and then rewrite it. (Modifying in place is not possible.) Therefore, there is a large write amplification factor here.
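To illustrate the alignment point: if every write covers whole shards, no read-modify-write cycle is needed. A sketch of generating aligned slices along one dimension (the shard size of 512 is a made-up placeholder; the real value comes from the dataset's sharding spec):

```python
def aligned_slices(total, shard):
    """Yield slices that each cover whole shards along one dimension
    (only the last slice may be partial, at the array boundary)."""
    for start in range(0, total, shard):
        yield slice(start, min(start + shard, total))

# With a hypothetical shard size of 512 on a length-1300 dimension:
bounds = [(s.start, s.stop) for s in aligned_slices(1300, 512)]
# -> [(0, 512), (512, 1024), (1024, 1300)]
```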
This function logs memory use in bytes before and after the slab-wise saving of the array:
import psutil
from tqdm import tqdm

process = psutil.Process()  # arr, data, slices, and ts are defined earlier

def test_mem(nruns=1):
    result = []
    for n in range(nruns):
        start_bytes = process.memory_info().rss
        for sl in tqdm(slices):
            arr[ts.d['channel'][0]][sl] = data[sl]
        stop_bytes = process.memory_info().rss
        result.append((start_bytes, stop_bytes))
    return result
Running this with nruns=10 returned the following output (rows are runs; columns are pre- and post-saving memory use in bytes):
[(17265467392, 21378891776),
(21378891776, 22748270592),
(22748270592, 22967046144),
(22967046144, 23119863808),
(23119863808, 23220432896),
(23220436992, 23305125888),
(23305129984, 23373012992),
(23373012992, 23467606016),
(23467606016, 24198459392),
(24198459392, 24362373120)]
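The pairs are easier to read as per-run RSS growth in GiB; for example, for the first two rows above:

```python
# Per-run RSS growth in GiB, computed from the first two
# (start_bytes, stop_bytes) rows of the output above.
pairs = [
    (17265467392, 21378891776),
    (21378891776, 22748270592),
]
growth_gib = [(stop - start) / 2**30 for start, stop in pairs]
# first run grows ~3.83 GiB, the second ~1.28 GiB
```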
It looks like memory use climbs each time the full array is saved; I'm running this again with 3x as many runs to see if I hit a plateau somewhere.
And yes, I know this is a pathological way to write to the sharded format. I haven't gotten around to writing the logic for shard-aligned writes. (This would be a great feature for tensorstore :) )
Building from source emits this warning:
path\to\tmp_dir\execroot\com_google_tensorstore\external\com_google_riegeli\riegeli\base\base.h(169) : warning C4722: 'riegeli::internal::UnreachableStream::~UnreachableStream': destructor never returns, potential memory leak
Not sure if that is related or a separate known issue.
That is a spurious warning: that type is only used in the false branch of an assertion when debugging is disabled.