acts-project / vecmem
Vectorised data model base and helper classes.
Home Page: https://acts-project.github.io/vecmem/
License: Mozilla Public License 2.0
The latest versions of the Intel compiler started producing the following types of warnings:
[ 92%] Building SYCL object tests/sycl/CMakeFiles/vecmem_test_sycl.dir/test_sycl_containers.sycl.o
In file included from /data/ssd-1tb/projects/vecmem/vecmem/tests/sycl/test_sycl_containers.sycl:15:
In file included from /data/ssd-1tb/projects/vecmem/vecmem/core/include/vecmem/containers/const_device_vector.hpp:10:
In file included from /data/ssd-1tb/projects/vecmem/vecmem/core/include/vecmem/containers/device_vector.hpp:12:
In file included from /data/ssd-1tb/projects/vecmem/vecmem/core/include/vecmem/memory/atomic.hpp:109:
/data/ssd-1tb/projects/vecmem/vecmem/core/include/vecmem/memory/impl/atomic.ipp:70:12: error: 'atomic<unsigned int, sycl::access::address_space::global_space>' is deprecated: sycl::atomic is deprecated since SYCL 2020 [-Werror,-Wdeprecated-declarations]
return __VECMEM_SYCL_ATOMIC_CALL0(load, m_ptr);
^
/data/ssd-1tb/projects/vecmem/vecmem/core/include/vecmem/memory/impl/atomic.ipp:19:19: note: expanded from macro '__VECMEM_SYCL_ATOMIC_CALL0'
cl::sycl::atomic<value_type>(cl::sycl::global_ptr<value_type>(PTR)))
^
/data/ssd-1tb/projects/vecmem/vecmem/core/include/vecmem/containers/impl/device_vector.ipp:399:22: note: in instantiation of member function 'vecmem::atomic<unsigned int>::load' requested here
return asize.load();
^
/data/ssd-1tb/projects/vecmem/vecmem/core/include/vecmem/containers/impl/device_vector.ipp:80:18: note: in instantiation of member function 'vecmem::device_vector<int>::size' requested here
assert(pos < size());
^
/data/ssd-1tb/projects/vecmem/vecmem/tests/sycl/test_sycl_containers.sycl:73:27: note: in instantiation of member function 'vecmem::device_vector<int>::at' requested here
outputvec.at(i) = inputvec.at(i) * constantarray1.at(0) +
^
/home/krasznaa/software/intel/clang/nightly-20220217/x86_64-ubuntu2004-gcc9-opt/bin/../include/sycl/CL/sycl/atomic.hpp:171:7: note: 'atomic<unsigned int, sycl::access::address_space::global_space>' has been explicitly marked deprecated here
class __SYCL2020_DEPRECATED(
^
/home/krasznaa/software/intel/clang/nightly-20220217/x86_64-ubuntu2004-gcc9-opt/bin/../include/sycl/CL/sycl/detail/defines_elementary.hpp:54:40: note: expanded from macro '__SYCL2020_DEPRECATED'
#define __SYCL2020_DEPRECATED(message) __SYCL_DEPRECATED(message)
^
/home/krasznaa/software/intel/clang/nightly-20220217/x86_64-ubuntu2004-gcc9-opt/bin/../include/sycl/CL/sycl/detail/defines_elementary.hpp:45:38: note: expanded from macro '__SYCL_DEPRECATED'
#define __SYCL_DEPRECATED(message) [[deprecated(message)]]
^
In file included from /data/ssd-1tb/projects/vecmem/vecmem/tests/sycl/test_sycl_containers.sycl:15:
In file included from /data/ssd-1tb/projects/vecmem/vecmem/core/include/vecmem/containers/const_device_vector.hpp:10:
In file included from /data/ssd-1tb/projects/vecmem/vecmem/core/include/vecmem/containers/device_vector.hpp:12:
In file included from /data/ssd-1tb/projects/vecmem/vecmem/core/include/vecmem/memory/atomic.hpp:109:
So we need to start using sycl::atomic_ref instead of sycl::atomic. Unfortunately, we must first wait for a resolution to intel/llvm#5647, since NVIDIA backend support is of course a must for our code...
Ignore this if you think it is a duplicate of #95.
A vector buffer with zero size doesn't work. The following is minimal code for reproducing the problem:
TEST_F(cuda_containers_test, zero_size) {
    vecmem::cuda::device_memory_resource device_resource;

    vecmem::data::vector_buffer<int> buff0(10, 0, device_resource);
    // OK
    ASSERT_EQ(buff0.capacity(), 10);
    // Not OK
    ASSERT_EQ(buff0.size(), 0);  // Seg fault

    vecmem::data::vector_buffer<int> buff1(10, 10, device_resource);
    // OK
    ASSERT_EQ(buff1.capacity(), 10);
    // OK
    ASSERT_EQ(buff1.size(), 10);
}
Segmentation fault (core dumped)
I stumbled upon this error by chance; currently we have no use for creating non-static-sized buffers with a starting size different from 0, but it could come in handy in the future.
Right now only the capacities of the created buffers are initialised, not their sizes.
vecmem/core/include/vecmem/containers/impl/jagged_vector_buffer.ipp
Lines 104 to 145 in 971a7ef
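For context, the intended capacity/size split follows std::vector's reserve/resize distinction, which makes the expected post-conditions easy to state; a host-side sketch (the helper name is hypothetical):

```cpp
#include <cstddef>
#include <vector>

// A resizable buffer with capacity 10 and starting size 0 should behave like
// a vector that reserved 10 elements but holds none yet: reserve() fixes the
// capacity (first constructor argument), resize() the starting size (second).
inline std::vector<int> make_buffer_like(std::size_t capacity, std::size_t size) {
    std::vector<int> v;
    v.reserve(capacity);
    v.resize(size);
    return v;
}
```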
Is it possible for a vecmem vector to have a sort function that runs on device? There are some use cases (ACTS seeding) for this functionality.
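vecmem does not currently provide this, but one device-friendly algorithm would be odd-even transposition sort: each phase only touches disjoint neighbour pairs, so on a GPU every pair could map to one thread with a barrier between phases. A host-side sketch of the algorithm:

```cpp
#include <cstddef>
#include <utility>

// Odd-even transposition sort: n phases, each swapping out-of-order
// neighbours starting at an alternating offset. All swaps within one phase
// are independent, which is what makes the scheme parallelisable on device.
template <typename T>
void odd_even_sort(T* data, std::size_t n) {
    for (std::size_t phase = 0; phase < n; ++phase) {
        for (std::size_t i = phase % 2; i + 1 < n; i += 2) {
            if (data[i + 1] < data[i]) {
                std::swap(data[i], data[i + 1]);
            }
        }
    }
}
```

On device the inner loop would become one comparison per thread, with a group barrier between phases; the host-side shape above is just to pin down the algorithm.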
I tried using a static_vector in device code, which unfortunately is not possible due to the call to memmove in push_back.
warning #20011-D: calling a __host__ function("memmove") from a __host__ __device__ function ("vecmem::static_vector::insert") is not allowed
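One way such an insert could avoid memmove is an element-wise shift loop, which only emits element assignments; a hypothetical sketch (on device this would carry the usual host/device annotations, and production code would placement-new into the uninitialised one-past-the-end slot rather than assign, which is fine here only for trivial types):

```cpp
#include <cstddef>
#include <utility>

// Shifts [pos, size) one slot to the right, back to front, then writes the
// new element. No memmove call is emitted, only per-element moves.
template <typename T>
void insert_by_shifting(T* data, std::size_t& size, std::size_t pos,
                        const T& value) {
    for (std::size_t i = size; i > pos; --i) {
        data[i] = std::move(data[i - 1]);
    }
    data[pos] = value;
    ++size;
}
```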
This is to bring it in sync with the jagged device vector. Having these two classes have different size types is unlikely to cause any real problems, but it is a little bit of sloppy design that I think we should fix.
CUDA version: 11.5.0
vector_buffer doesn't work with unsigned long, failing with the following error message:
what(): /home/beomki/projects/vecmem/vecmem/cuda/src/memory/cuda/managed_memory_resource.cpp:33 Failed to execute: cudaFree(p) (misaligned address)
int and unsigned int were OK. You can reproduce the error with the following code:
TEST_F(cuda_containers_test, buffer_type_test) {
    // The memory resource(s).
    vecmem::cuda::managed_memory_resource managed_resource;

    vecmem::data::vector_buffer<int> int_vec(3, 0, managed_resource);
    m_copy.setup(int_vec);
    int_buffer_test(int_vec);  // OK

    vecmem::data::vector_buffer<unsigned int> uint_vec(3, 0, managed_resource);
    m_copy.setup(uint_vec);
    uint_buffer_test(uint_vec);  // OK

    vecmem::data::vector_buffer<unsigned long> ulong_vec(3, 0, managed_resource);
    m_copy.setup(ulong_vec);
    ulong_buffer_test(ulong_vec);  // does not work
}
void int_buffer_test(vecmem::data::vector_view<int> vec);
void uint_buffer_test(vecmem::data::vector_view<unsigned int> vec);
void ulong_buffer_test(vecmem::data::vector_view<unsigned long> vec);
__global__ void int_buffer_test_kernel(
    vecmem::data::vector_view<int> vec_data) {
    vecmem::device_vector<int> vec(vec_data);
    vec.push_back(0);
}

void int_buffer_test(vecmem::data::vector_view<int> vec) {
    int_buffer_test_kernel<<<1, 1>>>(vec);
    // Check whether it succeeded to run.
    VECMEM_CUDA_ERROR_CHECK(cudaGetLastError());
    VECMEM_CUDA_ERROR_CHECK(cudaDeviceSynchronize());
}

__global__ void uint_buffer_test_kernel(
    vecmem::data::vector_view<unsigned int> vec_data) {
    vecmem::device_vector<unsigned int> vec(vec_data);
    vec.push_back(0);
}

void uint_buffer_test(vecmem::data::vector_view<unsigned int> vec) {
    uint_buffer_test_kernel<<<1, 1>>>(vec);
    // Check whether it succeeded to run.
    VECMEM_CUDA_ERROR_CHECK(cudaGetLastError());
    VECMEM_CUDA_ERROR_CHECK(cudaDeviceSynchronize());
}

__global__ void ulong_buffer_test_kernel(
    vecmem::data::vector_view<unsigned long> vec_data) {
    vecmem::device_vector<unsigned long> vec(vec_data);
    vec.push_back(0);
}

void ulong_buffer_test(vecmem::data::vector_view<unsigned long> vec) {
    ulong_buffer_test_kernel<<<1, 1>>>(vec);
    // Check whether it succeeded to run.
    VECMEM_CUDA_ERROR_CHECK(cudaGetLastError());
    VECMEM_CUDA_ERROR_CHECK(cudaDeviceSynchronize());
}
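A plausible mechanism for the failure, which is an assumption on my part rather than something confirmed above, is that a 4-byte size field placed directly in front of the payload pushes 8-byte elements onto a 4-byte boundary: that offset satisfies the alignment of int and unsigned int but not of unsigned long on LP64 platforms. The arithmetic is trivial to check:

```cpp
#include <cstddef>

// If the payload starts right after a header of header_bytes bytes, the
// payload is correctly aligned only when that offset is a multiple of the
// element type's alignment requirement.
inline bool payload_is_aligned(std::size_t header_bytes,
                               std::size_t element_align) {
    return header_bytes % element_align == 0;
}
```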
It would be good for the user to have these two constructors use the same argument ordering rather than opposing ones. (I'm aware that for the jagged case we're not actually using the capacities, as mentioned in #95, but this would be a good change regardless.)
We could also template the size_type for the second one.
vecmem/core/include/vecmem/containers/data/jagged_vector_buffer.hpp
Lines 104 to 105 in ee21039
vecmem/core/include/vecmem/containers/data/vector_buffer.hpp
Lines 55 to 56 in ee21039
I have been trying to use the binary page memory resource and the contiguous memory resource (c_mr) for the traccc CUDA examples. Using the contiguous memory resource works fine for file IO, but when using c_mr for the CUDA algorithms (the seeding algorithm here) there is an invalid global read error due to a misaligned address.
I tested with a 2 GB contiguous memory allocation limit as well, but got the same error.
Side note: I got the same issue when testing with the clusterization CUDA algorithm from traccc PR #206.
repo to reproduce the error - branch mem_resource_issues
The changes made:
...
vecmem::host_memory_resource host_mr;
vecmem::cuda::managed_memory_resource mng_mr;
vecmem::contiguous_memory_resource c_mr(mng_mr,pow(2,30));//<--
traccc::clusterization_algorithm ca(mng_mr);
traccc::spacepoint_formation sf(mng_mr);
traccc::seeding_algorithm sa(host_mr);
traccc::track_params_estimation tp(host_mr);
traccc::cuda::seeding_algorithm sa_cuda(c_mr); //<--
traccc::cuda::track_params_estimation tp_cuda(c_mr); //<--
...
traccc::cell_container_types::host cells_per_event =
traccc::read_cells_from_event(
event, i_cfg.cell_directory, common_opts.input_data_format,
surface_transforms, digi_cfg, c_mr);//<--
...
Error:
========= Invalid __global__ read of size 8 bytes
========= at 0xeb8 in /home/chamodya/myrepos/recent/traccc/build/_deps/vecmem-src/core/include/vecmem/containers/impl/vector_view.ipp:57:vecmem::data::vector_view<traccc::internal_spacepoint<traccc::spacepoint>>::size_ptr() const
========= by thread (253,0,0) in block (47,0,0)
========= Address 0x7fdf7ec10244 is misaligned
========= Device Frame:/home/chamodya/myrepos/recent/traccc/build/_deps/vecmem-src/core/include/vecmem/containers/impl/device_vector.ipp:20:vecmem::device_vector<traccc::internal_spacepoint<traccc::spacepoint>>::device_vector(const vecmem::data::vector_view<traccc::internal_spacepoint<traccc::spacepoint>> &) [0xe10]
========= Device Frame:/home/chamodya/myrepos/recent/traccc/build/_deps/vecmem-src/core/include/vecmem/containers/impl/jagged_device_vector.ipp:64:vecmem::jagged_device_vector<traccc::internal_spacepoint<traccc::spacepoint>>::at(unsigned long) [0xe10]
========= Device Frame:/home/chamodya/myrepos/recent/traccc/build/_deps/detray-src/core/include/detray/grids/grid2.hpp:187:detray::grid2<detray::attach_populator, detray::axis::circular, detray::axis::regular, detray::serializer2, vecmem::device_vector, vecmem::jagged_device_vector, detray::darray, detray::dtuple, traccc::internal_spacepoint<traccc::spacepoint>, (bool)0, (unsigned int)1>::bin(unsigned long, unsigned long) [0xe10]
========= Device Frame:/home/chamodya/myrepos/recent/traccc/build/_deps/detray-src/core/include/detray/grids/grid2.hpp:203:detray::grid2<detray::attach_populator, detray::axis::circular, detray::axis::regular, detray::serializer2, vecmem::device_vector, vecmem::jagged_device_vector, detray::darray, detray::dtuple, traccc::internal_spacepoint<traccc::spacepoint>, (bool)0, (unsigned int)1>::bin(unsigned long) [0xe10]
========= Device Frame:/home/chamodya/myrepos/recent/traccc/device/common/include/traccc/seeding/device/impl/populate_grid.ipp:51:traccc::device::populate_grid(unsigned long, const traccc::seedfinder_config &, const traccc::container_view<const unsigned long, const traccc::spacepoint> &, const vecmem::data::vector_view<const std::pair<unsigned long, unsigned long>> &, detray::grid2_view<detray::grid2<detray::attach_populator, detray::axis::circular, detray::axis::regular, detray::serializer2, detray::dvector, detray::djagged_vector, detray::darray, detray::dtuple, traccc::internal_spacepoint<traccc::spacepoint>, (bool)0, (unsigned int)1>>) [0xe10]
========= Device Frame:/home/chamodya/myrepos/recent/traccc/device/cuda/src/seeding/spacepoint_binning.cu:43:traccc::cuda::kernels::populate_grid(traccc::seedfinder_config, traccc::container_view<const unsigned long, const traccc::spacepoint>, vecmem::data::vector_view<const std::pair<unsigned long, unsigned long>>, detray::grid2_view<detray::grid2<detray::attach_populator, detray::axis::circular, detray::axis::regular, detray::serializer2, detray::dvector, detray::djagged_vector, detray::darray, detray::dtuple, traccc::internal_spacepoint<traccc::spacepoint>, (bool)0, (unsigned int)1>>) [0xf8]
========= Saved host backtrace up to driver entry point at kernel launch time
...
running the example :
compute-sanitizer build/bin/traccc_seq_example_cuda --detector_file=tml_detector/trackml-detector.csv --digitization_config_file=tml_detector/default-geometric-config-generic.json --cell_directory=tml_full/ttbar_mu200/ --events=2 --input-binary --run_cpu=0
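The symptom is consistent with a downstream resource carving sub-allocations out of one block without rounding each offset up to the requested alignment: a 4-byte-aligned sub-allocation followed by an 8-byte type reproduces exactly this kind of misaligned read. A minimal aligned bump-allocator sketch (hypothetical, only to show where the padding has to happen; not vecmem's actual implementation):

```cpp
#include <cstddef>
#include <cstdint>

// Bump allocator over a pre-allocated block: each allocation first rounds
// the running offset up to the requested alignment (this is the step whose
// omission causes misaligned addresses), then advances past the size.
struct bump_allocator {
    std::uintptr_t base;
    std::size_t capacity;
    std::size_t offset = 0;

    void* allocate(std::size_t size, std::size_t align) {
        // Round offset up to the next multiple of align (align must be a
        // power of two).
        std::size_t aligned = (offset + align - 1) & ~(align - 1);
        if (aligned + size > capacity) return nullptr;
        offset = aligned + size;
        return reinterpret_cast<void*>(base + aligned);
    }
};
```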
Moreover, the binary page memory resource did not produce the expected improvements for file IO (it actually worsened it); I will create another issue after looking into it further.
Using the binary page memory resource for file IO and algorithm IO in traccc takes more time than not using any downstream memory resource at all, which contradicts expectations. The times also get worse with each event. Below are file IO times for traccc_seq_example_cuda computing 10 events, using the managed memory resource directly and the binary page memory resource with managed memory as upstream.
Managed memory resource
$ build/bin/traccc_seq_example_cuda --detector_file=tml_detector/trackml-detector.csv --digitization_config_file=tml_detector/default-geometric-config-generic.json --cell_directory=tml_full/ttbar_mu200/ --events=10 --input-binary --run_cpu=0
Running build/bin/traccc_seq_example_cuda tml_detector/trackml-detector.csv tml_full/ttbar_mu200/ 10
0 File_IO: 0.127675
1 File_IO: 0.0567025
2 File_IO: 0.0524904
3 File_IO: 0.0517798
4 File_IO: 0.0586979
5 File_IO: 0.0564024
6 File_IO: 0.0545458
7 File_IO: 0.0526243
8 File_IO: 0.0541083
9 File_IO: 0.0492084
Binary page memory resource
$ build/bin/traccc_seq_example_cuda --detector_file=tml_detector/trackml-detector.csv --digitization_config_file=tml_detector/default-geometric-config-generic.json --cell_directory=tml_full/ttbar_mu200/ --events=10 --input-binary --run_cpu=0
Running build/bin/traccc_seq_example_cuda tml_detector/trackml-detector.csv tml_full/ttbar_mu200/ 10
0 File_IO: 0.849157
1 File_IO: 1.45001
2 File_IO: 1.58589
3 File_IO: 1.69902
4 File_IO: 1.74604
5 File_IO: 1.86683
6 File_IO: 1.9433
7 File_IO: 2.03939
8 File_IO: 2.17419
9 File_IO: 2.15853
In addition, when using the binary page memory resource with the host memory resource as upstream for traccc_seq_example (CPU), algorithm IO takes a really long time. Below are timings for 2 events.
host memory resource
$ build/bin/traccc_seq_example --detector_file=tml_detector/trackml-detector.csv --digitization_config_file=tml_detector/default-geometric-config-generic.json --cell_directory=tml_full/ttbar_mu200/ --events=2 --input-binary
Running build/bin/traccc_seq_example tml_detector/trackml-detector.csv tml_full/ttbar_mu200/ 2
0 File_IO: 0.0103446
0 Clusterization: 0.0153028
0 Spacepoint: 0.00112334
0 Seeding: 0.174703
0 Track params est : 0.0041265
1 File_IO: 0.00498015
1 Clusterization: 0.0167117
1 Spacepoint: 0.0012795
1 Seeding: 0.230505
1 Track params est : 0.0055673
binary page memory resource
$ build/bin/traccc_seq_example --detector_file=tml_detector/trackml-detector.csv --digitization_config_file=tml_detector/default-geometric-config-generic.json --cell_directory=tml_full/ttbar_mu200/ --events=2 --input-binary
Running build/bin/traccc_seq_example tml_detector/trackml-detector.csv tml_full/ttbar_mu200/ 2
0 File_IO: 0.795708
0 Clusterization: 251.087
0 Spacepoint: 10.6834
0 Seeding: 7.35312
0 Track params est : 0.0117272
1 File_IO: 1.34328
1 Clusterization: 466.343
1 Spacepoint: 10.4089
1 Seeding: 8.56579
1 Track params est : 0.0199618
Following #181, this is a long-term goal we should complete; we have a lot of memory resources which are not currently able to support alignment.
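For host-accessible resources the fix is largely mechanical, since C++17 already has alignment-aware allocation; a sketch of what an alignment-honouring resource could forward to (a hypothetical class mirroring the std::pmr::memory_resource do_allocate/do_deallocate contract, not vecmem's actual interface):

```cpp
#include <cstddef>
#include <cstdint>
#include <new>

// Forwards the caller's alignment request to the C++17 aligned operator
// new/delete overloads instead of silently ignoring it.
struct aligned_host_resource {
    void* do_allocate(std::size_t bytes, std::size_t alignment) {
        return ::operator new(bytes, std::align_val_t(alignment));
    }
    void do_deallocate(void* p, std::size_t bytes, std::size_t alignment) {
        ::operator delete(p, bytes, std::align_val_t(alignment));
    }
};
```

Device-side resources would instead need their own rounding logic, since cudaMalloc-style APIs take no alignment argument.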
A jagged_vector_buffer to jagged_vector copy produces an error when the size of the jagged_vector is larger than one and the element size is 512 bytes or an integer multiple of it:
unknown file: Failure
C++ exception with description "/home/beomki/projects/vecmem/vecmem/cuda/src/utils/cuda/copy.cpp:52 Failed to execute: cudaMemcpy(to_ptr, from_ptr, size, copy_type_translator[cptype]) (invalid argument)" thrown in the test body.
The following code is a minimal reproduction of the error:
template <std::size_t ROWS, std::size_t COLS>
using matrix = vecmem::static_array<vecmem::static_array<double, ROWS>, COLS>;

/// A 512-byte object (double, 8x8 matrix) makes an error with the buffer2 copy!
/// Any other dimension seems OK unless it is an integer multiple of 512 bytes
using value_type = matrix<8, 8>;

/// Test buffers with "large" elements (for which alignment becomes important)
TEST_F(cuda_containers_test, very_large_buffer) {
    // The memory resource(s).
    vecmem::cuda::managed_memory_resource managed_resource;

    // Test a (2D) jagged vector with a (2D) matrix
    vecmem::data::jagged_vector_buffer<value_type> buffer1({200},
                                                           managed_resource);
    m_copy.setup(buffer1);
    vecmem::jagged_vector<value_type> buffer1_host(&managed_resource);
    /// This is OK because the size of the jagged vector is 1
    m_copy(buffer1, buffer1_host);

    // Test a (2D) jagged vector with a (2D) matrix
    vecmem::data::jagged_vector_buffer<value_type> buffer2({1, 1},
                                                           managed_resource);
    m_copy.setup(buffer2);
    vecmem::jagged_vector<value_type> buffer2_host(&managed_resource);
    /// This is NOT OK because the size of the jagged vector is larger than 1
    m_copy(buffer2, buffer2_host);
}
It's quite bad (or lucky) that I happened to be playing with an 8x8 matrix on the detray side :/ ..
The Intel compiler changed some things, making VecMem fail to compile against their latest tag.
[ 38%] Building SYCL object sycl/CMakeFiles/vecmem_sycl.dir/src/utils/sycl/device_selector.sycl.o
In file included from /mnt/hdd1/krasznaa/projects/vecmem/vecmem/sycl/src/utils/sycl/device_selector.sycl:9:
/mnt/hdd1/krasznaa/projects/vecmem/vecmem/sycl/src/utils/sycl/device_selector.hpp:18:42: warning: 'device_selector' is deprecated: Use Callable instead to select device. [-Wdeprecated-declarations]
class device_selector : public cl::sycl::device_selector {
^
/software/intel/clang-2022-09/x86_64-centos9-gcc11-opt/bin/../include/sycl/device_selector.hpp:34:21: note: 'device_selector' has been explicitly marked deprecated here
class __SYCL_EXPORT __SYCL2020_DEPRECATED(
^
/software/intel/clang-2022-09/x86_64-centos9-gcc11-opt/bin/../include/sycl/detail/defines_elementary.hpp:52:40: note: expanded from macro '__SYCL2020_DEPRECATED'
#define __SYCL2020_DEPRECATED(message) __SYCL_DEPRECATED(message)
^
/software/intel/clang-2022-09/x86_64-centos9-gcc11-opt/bin/../include/sycl/detail/defines_elementary.hpp:43:38: note: expanded from macro '__SYCL_DEPRECATED'
#define __SYCL_DEPRECATED(message) [[deprecated(message)]]
^
In file included from /mnt/hdd1/krasznaa/projects/vecmem/vecmem/sycl/src/utils/sycl/device_selector.sycl:9:
/mnt/hdd1/krasznaa/projects/vecmem/vecmem/sycl/src/utils/sycl/device_selector.hpp:29:15: warning: 'default_selector' is deprecated: Use the callable sycl::default_selector_v instead. [-Wdeprecated-declarations]
cl::sycl::default_selector m_defaultSelector;
^
/software/intel/clang-2022-09/x86_64-centos9-gcc11-opt/bin/../include/sycl/device_selector.hpp:50:21: note: 'default_selector' has been explicitly marked deprecated here
class __SYCL_EXPORT __SYCL2020_DEPRECATED(
^
/software/intel/clang-2022-09/x86_64-centos9-gcc11-opt/bin/../include/sycl/detail/defines_elementary.hpp:52:40: note: expanded from macro '__SYCL2020_DEPRECATED'
#define __SYCL2020_DEPRECATED(message) __SYCL_DEPRECATED(message)
^
/software/intel/clang-2022-09/x86_64-centos9-gcc11-opt/bin/../include/sycl/detail/defines_elementary.hpp:43:38: note: expanded from macro '__SYCL_DEPRECATED'
#define __SYCL_DEPRECATED(message) [[deprecated(message)]]
^
/mnt/hdd1/krasznaa/projects/vecmem/vecmem/sycl/src/utils/sycl/device_selector.sycl:31:18: error: no type named 'device' in namespace 'sycl::info'; did you mean 'CUdevice'?
using info = cl::sycl::info::device;
^~~~~~~~~~~~~~~~~~~~~~
CUdevice
/software/intel/clang-2022-09/x86_64-centos9-gcc11-opt/bin/../include/sycl/detail/backend_traits_cuda.hpp:25:13: note: 'CUdevice' declared here
typedef int CUdevice;
^
/mnt/hdd1/krasznaa/projects/vecmem/vecmem/sycl/src/utils/sycl/device_selector.sycl:33:26: error: use of undeclared identifier 'info'; did you mean '::sycl::info'?
(device.get_info<info::name>() == m_deviceName)) {
^~~~
::sycl::info
/software/intel/clang-2022-09/x86_64-centos9-gcc11-opt/bin/../include/sycl/info/info_desc.hpp:24:11: note: '::sycl::info' declared here
namespace info {
^
/mnt/hdd1/krasznaa/projects/vecmem/vecmem/sycl/src/utils/sycl/device_selector.sycl:33:32: error: no member named 'name' in namespace 'sycl::info'
(device.get_info<info::name>() == m_deviceName)) {
~~~~~~^
2 warnings and 3 errors generated.
make[2]: *** [sycl/CMakeFiles/vecmem_sycl.dir/build.make:105: sycl/CMakeFiles/vecmem_sycl.dir/src/utils/sycl/device_selector.sycl.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:1129: sycl/CMakeFiles/vecmem_sycl.dir/all] Error 2
make: *** [Makefile:166: all] Error 2
I don't think there is anything magic to be done, but the code will have to be taught how to choose the right incantations depending on the version of the compiler being used.
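The dispatch would presumably hinge on a compiler-provided version macro; a stand-in sketch of the pattern, with a placeholder macro rather than the real detection logic:

```cpp
// TOY_SYCL_VERSION is a placeholder standing in for whatever the real
// compiler exposes (e.g. a date-encoded version macro in DPC++); only the
// shape of the dispatch is the point here.
#ifndef TOY_SYCL_VERSION
#define TOY_SYCL_VERSION 20220901
#endif

// Returns which selector API generation the build would use:
// the SYCL 2020 callable selectors, or the deprecated class selectors.
inline int selector_api_generation() {
#if TOY_SYCL_VERSION >= 20220901
    return 2020;  // use sycl::default_selector_v and friends
#else
    return 2017;  // fall back to cl::sycl::default_selector
#endif
}
```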
Can we have a vecmem::get function for vecmem::static_array, similar to the std::get defined here?
detray uses std::get for arrays in several places.
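The accessor itself is short to write; a sketch against a minimal stand-in type (the real vecmem::static_array has more members; this only shows the std::get-style shape):

```cpp
#include <cstddef>

// Minimal stand-in for a static_array-like aggregate.
template <typename T, std::size_t N>
struct toy_static_array {
    T m_data[N];
};

// std::get-style accessor: the index is a compile-time template parameter,
// bounds-checked with static_assert, so it can stay constexpr and in
// principle be usable in device code.
template <std::size_t I, typename T, std::size_t N>
constexpr T& get(toy_static_array<T, N>& a) {
    static_assert(I < N, "index out of bounds");
    return a.m_data[I];
}

template <std::size_t I, typename T, std::size_t N>
constexpr const T& get(const toy_static_array<T, N>& a) {
    static_assert(I < N, "index out of bounds");
    return a.m_data[I];
}
```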
While integrating vecmem functionality into detray, I've found that vecmem::static_array doesn't support constexpr usage. Is this intended, or is it just missing?
The standard library dictates that std::vector<bool> (and, by extension, vecmem::vector) is a specialization without a guaranteed .data() member function, on which vecmem relies. This means booleans cannot be stored in vecmem vectors.
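A common workaround, where the interface can tolerate it, is to store chars instead, which restores contiguous storage and a real .data() pointer; a brief sketch (the alias name is illustrative, not a vecmem type):

```cpp
#include <cstddef>
#include <vector>

// std::vector<char> gives contiguous storage and a real data() pointer,
// unlike the bit-packed std::vector<bool> specialization.
using bool_vector = std::vector<char>;

inline bool read_flag(const bool_vector& v, std::size_t i) {
    return v.data()[i] != 0;  // data() is guaranteed for char
}
```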
As I mentioned in #94, this has been bothering me for a while... With a constructor like
/// Resizable data constructor
vector_buffer(size_type capacity, size_type size,
              memory_resource& resource);
one should assume that the created vector buffer would have a starting size of size. But in the current implementation it doesn't; it always starts from 0.
When you set up a resizable (jagged) vector currently, you do it like:
vecmem::data::vector_buffer<int> output_buffer(input.size(), 0,
device_resource);
m_copy.setup(output_buffer);
Where m_copy is one of the "platform specific" copy objects. The problem here is that the buffer constructor currently has no way of storing the starting size that it received in the memory that it allocates with the memory resource that it receives. If the memory resource allocates non-host-accessible memory, one needs to write to that memory using the appropriate "copy class". But by the time we call the vecmem::copy::setup(...) function, that original size variable is lost. (The buffer does not store it anywhere.)
There are a couple of ways out of it, but they either require some bigger re-organisation in the code, or a setup that I wouldn't really like. I think the following are all our options here:
1. Keep using vecmem::copy::setup(...) when initialising the (device) memory. Technically this could work (I think...), but I just really don't like this design. (But maybe you guys could convince me otherwise...)
2. Pass the copy object to the buffer constructor, so that it can write the starting size itself. (vecmem::copy already knows about the buffer types...)
3. Add factory functions to vecmem::copy for constructing resizable (jagged) vector buffers. This is very similar to option number 2, and would introduce a design that we don't use anywhere else in the project so far.
So... Any objections to having to create resizable buffers with code like:
vecmem::cuda::device_memory_resource resource;
vecmem::cuda::copy copy;
vecmem::data::vector_buffer<int> buffer1(1000, 100, resource, copy);
vecmem::data::jagged_vector_buffer<int> buffer2({1000, 1000, 1000}, {10, 100, 200}, resource, copy);
?
@konradkusiak97, I'm interested in your opinion as well.