Code Monkey home page Code Monkey logo

async-rdma's Introduction

async-rdma

A framework for writing RDMA applications with high-level abstraction and asynchronous APIs.

Join the chat at https://gitter.im/datenlord/async-rdma Crates.io Docs GPL licensed Build Status

It provides a few major components:

  • Tools for establishing connections with rdma endpoints such as RdmaBuilder.

  • High-level APIs for data transmission between endpoints including read, write, send, receive.

  • High-level APIs for rdma memory region management including alloc_local_mr, request_remote_mr, send_mr, receive_local_mr, receive_remote_mr.

  • A framework including agent and event_listener working behind APIs for memory region management and executing rdma requests such as post_send and poll.

The ChangeLog file contains a brief summary of changes for each release.

Environment Setup

This section is for RDMA novices who want to try this library.

You can skip if your Machines have been configured with RDMA.

Next we will configure the RDMA environment in an Ubuntu20.04 VM. If you are using another operating system distribution, please search and replace the relevant commands.

1. Check whether the current kernel supports RXE

Run the following command and if the CONFIG_RDMA_RXE = y or m, the current operating system supports RXE. If not you need to search how to recompile the kernel to support RDMA.

cat /boot/config-$(uname -r) | grep RXE

2. Install Dependencies

sudo apt install -y libibverbs1 ibverbs-utils librdmacm1 libibumad3 ibverbs-providers rdma-core libibverbs-dev iproute2 perftest build-essential net-tools git librdmacm-dev rdmacm-utils cmake libprotobuf-dev protobuf-compiler clang curl

3. Configure RDMA netdev

(1) Load kernel driver

modprobe rdma_rxe

(2) User mode RDMA netdev configuration.

sudo rdma link add rxe_0 type rxe netdev ens33

rxe_0 is the RDMA device name, and you can name it whatever you want. ens33 is the name of the network device. The name of the network device may be different in each VM, and we can see it by running command "ifconfig".

(3) Check the RDMA device state

Run the following command and check if the state is ACTIVE.

rdma link

(4) Test it

Ib_send_bw is a program used to test the bandwidth of RDMA SEND operations.

Run the following command in a terminal.

ib_send_bw -d rxe_0

And run the following command in another terminal.

ib_send_bw -d rxe_0 localhost

4. Install Rust

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
source $HOME/.cargo/env

5. Try an example

git clone https://github.com/datenlord/async-rdma.git
cd async-rdma
cargo run --example rpc

if run rpc example failed, you can try run it with sudo permission.

cargo build --example rpc
sudo ./target/debug/examples/rpc

Example

A simple example: client request a remote memory region and put data into this remote memory region by rdma write. And finally client send_mr to make server aware of this memory region. Server receive_local_mr, and then get data from this mr.

use async_rdma::{LocalMrReadAccess, LocalMrWriteAccess, RdmaBuilder};
use portpicker::pick_unused_port;
use std::{
    alloc::Layout,
    io::{self, Write},
    net::{Ipv4Addr, SocketAddrV4},
    time::Duration,
};

async fn client(addr: SocketAddrV4) -> io::Result<()> {
    let layout = Layout::new::<[u8; 8]>();
    let rdma = RdmaBuilder::default().connect(addr).await?;
    // alloc 8 bytes remote memory
    let mut rmr = rdma.request_remote_mr(layout).await?;
    // alloc 8 bytes local memory
    let mut lmr = rdma.alloc_local_mr(layout)?;
    // write data into lmr
    let _num = lmr.as_mut_slice().write(&[1_u8; 8])?;
    // write the second half of the data in lmr to the rmr
    rdma.write(&lmr.get(4..8).unwrap(), &mut rmr.get_mut(4..8).unwrap())
        .await?;
    // send rmr's meta data to the remote end
    rdma.send_remote_mr(rmr).await?;
    Ok(())
}

#[tokio::main]
async fn server(addr: SocketAddrV4) -> io::Result<()> {
    let rdma = RdmaBuilder::default().listen(addr).await?;
    // receive mr's meta data from client
    let lmr = rdma.receive_local_mr().await?;
    let data = *lmr.as_slice();
    println!("Data written by the client using RDMA WRITE: {:?}", data);
    assert_eq!(data, [[0_u8; 4], [1_u8; 4]].concat());
    Ok(())
}

#[tokio::main]
async fn main() {
    let addr = SocketAddrV4::new(Ipv4Addr::new(127, 0, 0, 1), pick_unused_port().unwrap());
    std::thread::spawn(move || server(addr));
    tokio::time::sleep(Duration::new(1, 0)).await;
    client(addr)
        .await
        .map_err(|err| println!("{}", err))
        .unwrap();
}

Getting Help

First, see if the answer to your question can be found in the found API doc or Design doc. If the answer is not here, please open an issue and describe your problem in detail.

Related Projects

  • rdma-sys: Rust bindings for RDMA fundamental libraries: libibverbs-dev and librdmacm-dev.

async-rdma's People

Contributors

gipsyh avatar gitter-badger avatar gtwhy avatar joshtriplett avatar mond77 avatar my-vegetable-has-exploded avatar nugine avatar pwang7 avatar rogercloud avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

async-rdma's Issues

Unnecessary overflow check

left.overflow_shl(32) never panics because 0 <= time < 1e6. It can be written as left.wrapping_shl(32).

async-rdma/src/id.rs

Lines 5 to 20 in 2a55f32

/// Creat a random u64 id.
///
/// Both `WorkRequetId` and `AgentRequestId` depend on this, so make this fn independent.
/// To avoid id duplication, this fn concatenates `SystemTime` and random number into a U64.
/// The syscall may have some overhead, which can be improved later by balancing the pros and cons.
pub(crate) fn random_u64() -> u64 {
let start = SystemTime::now();
// No time can be earlier than Unix Epoch
#[allow(clippy::unwrap_used)]
let since_the_epoch = start.duration_since(UNIX_EPOCH).unwrap();
let time = since_the_epoch.subsec_micros();
let rand = rand::thread_rng().gen::<u32>();
let left: u64 = time.into();
let right: u64 = rand.into();
left.overflow_shl(32) | right
}

self.addr().overflow_add(i.start) and i.end.overflow_sub(i.start) never panic because the bounds have been checked.

(Should "wrong range of lmr" be an IO error?)

/// Get a local mr slice
#[inline]
pub fn get(&self, i: Range<usize>) -> io::Result<LocalMrSlice> {
// SAFETY: `self` is checked to be valid and in bounds above.
if i.start >= i.end || i.end > self.length() {
Err(io::Error::new(io::ErrorKind::Other, "wrong range of lmr"))
} else {
Ok(LocalMrSlice::new(
self,
self.addr().overflow_add(i.start),
i.len(),
))
}
}
/// Get a mutable local mr slice
#[inline]
pub fn get_mut(&mut self, i: Range<usize>) -> io::Result<LocalMrSliceMut> {
// SAFETY: `self` is checked to be valid and in bounds above.
if i.start >= i.end || i.end > self.length() {
Err(io::Error::new(io::ErrorKind::Other, "wrong range of lmr"))
} else {
Ok(LocalMrSliceMut::new(
self,
self.addr().overflow_add(i.start),
i.len(),
))
}
}

/// Get a remote mr slice
#[inline]
pub fn get(&self, i: Range<usize>) -> io::Result<RemoteMrSlice> {
// SAFETY: `self` is checked to be valid and in bounds above.
if i.start >= i.end || i.end > self.length() {
Err(io::Error::new(io::ErrorKind::Other, "wrong range of rmr"))
} else {
let slice_token = MrToken {
addr: self.addr().overflow_add(i.start),
len: i.end.overflow_sub(i.start),
rkey: self.rkey(),
};
Ok(RemoteMrSlice::new_from_token(self, slice_token))
}
}
/// Get a mutable remote mr slice
#[inline]
pub fn get_mut(&mut self, i: Range<usize>) -> io::Result<RemoteMrSliceMut> {
// SAFETY: `self` is checked to be valid and in bounds above.
if i.start >= i.end || i.end > self.length() {
Err(io::Error::new(io::ErrorKind::Other, "wrong range of rmr"))
} else {
let slice_token = MrToken {
addr: self.addr().overflow_add(i.start),
len: i.end.overflow_sub(i.start),
rkey: self.rkey(),
};
Ok(RemoteMrSliceMut::new_from_token(self, slice_token))
}
}

start.overflow_add(self.max_sr_data_len) can be writtern as start.saturating_add(self.max_sr_data_len)

end.overflow_sub(start) never panics because start < end.

async-rdma/src/agent.rs

Lines 175 to 178 in 2441414

let end = (start.overflow_add(self.max_sr_data_len)).min(lm_len);
let kind = RequestKind::SendData(SendDataRequest {
len: end.overflow_sub(start),
});

[BUG] mr_allocator related bug when dealing two big lmr with one rdma object

The following code should reproduce the memory problem.

The case is that in server there are two lmr to read data.

The first read is successful. When the first lmr is dropped, jemalloc dalloc is triggered (if the lmr is fairly large), and the EXTENT_TOKEN_MAP will remove the raw_mr item.

However, when creating the second lmr, the lookup_raw_mr function in mr_allocator.rs will get error. There are actually three error situaitions in this function and I have seen all of them (still wondering why...)

Some thing about my system setting:

  • I tried using sudo to run this, still failed
  • I have set my user ulimit to unlimited
  • I use softiwarp on ubuntu 20.04, but I think the bug is only related to mr_allocator
use async_rdma::{LocalMrWriteAccess, RdmaBuilder};
use portpicker::pick_unused_port;
use std::{
    alloc::Layout,
    io::{self, Write},
    net::{Ipv4Addr, SocketAddrV4},
    time::Duration,
};

const SIZE: usize = 44444444;

async fn client(addr: SocketAddrV4) -> io::Result<()> {
    let rdma = RdmaBuilder::default().connect(addr).await?;
    let data = vec![0u8; SIZE];

    // first send
    let layout = Layout::from_size_align(SIZE, 1).unwrap();
    let mut lmr = rdma.alloc_local_mr(layout)?;
    lmr.as_mut_slice().write(&data)?;
    rdma.send_local_mr(lmr).await?;

    // second send
    let layout = Layout::from_size_align(SIZE, 1).unwrap();
    let mut lmr = rdma.alloc_local_mr(layout)?;
    lmr.as_mut_slice().write(&data)?;
    rdma.send_local_mr(lmr).await?;

    // wait for server to read, otherwise this client will early exit
    tokio::time::sleep(Duration::from_secs(5)).await;

    Ok(())
}

#[tokio::main]
async fn server(addr: SocketAddrV4) -> io::Result<()> {
    let rdma = RdmaBuilder::default().listen(addr).await?;

    {
        let layout = Layout::from_size_align(SIZE, 1).unwrap();
        println!("layout: {:?}", layout);
        let mut lmr = rdma.alloc_local_mr(layout)?;
        println!("lmr: {:?}", lmr);
        let rmr = rdma.receive_remote_mr().await?;
        rdma.read(&mut lmr, &rmr).await?;
        println!("rdma read\n-------------");
    }

    // lmr will drop here

    {
        let layout = Layout::from_size_align(SIZE, 1).unwrap();
        println!("layout: {:?}", layout);
        // the memory bug occurs here
        let mut lmr = rdma.alloc_local_mr(layout)?;
        println!("lmr: {:?}", lmr);
        let rmr = rdma.receive_remote_mr().await?;
        rdma.read(&mut lmr, &rmr).await?;
        println!("rdma read\n-------------");
    }

    Ok(())
}

#[tokio::main]
async fn main() {
    let addr = SocketAddrV4::new(Ipv4Addr::new(127, 0, 0, 1), pick_unused_port().unwrap());
    std::thread::spawn(move || server(addr));
    tokio::time::sleep(Duration::from_secs(1)).await;
    client(addr)
        .await
        .map_err(|err| println!("{}", err))
        .unwrap();
}

bump rust toolchain

For #111 , the better solution is to bump rust toolchain to a higher version.
And it needs some change to satisfy clippy's checks.

insufficient contiguous memory was available to service the allocation request

Hi there,

I'm trying to allocate an obscene amount of memory (2 Gbyte) but my program crashes with

insufficient contiguous memory was available to service the allocation request

this is the relevant code snippet:

            let layout = Layout::from_size_align(request.message_size as usize, 1).unwrap();
            let mut lmr = match rdma.alloc_local_mr(layout){
                Ok(lmr) => lmr,
                Err(e) => {
                    panic!("alloc_local_mr error: {}", e);
                }
            };

request.message_size is set to 2 Gbyte.
My machine has plenty of RAM (256 GB) and I freshly rebooted hoping to have more contiguous memory available. I also enabled CMA (Contiguous Memory Allocator) and reserved 50 Gbyte but that didn't help.
I also tried to concurrently allocate multiple 1 Gbyte memory areas, leading to the same effect.
I don't believe that my machine has <2 Gbyte contiguous memory right after a reboot.
Any ideas?

Thanks

Support working with existing apps

The current implementation defines its own protocol to establish connections and send/receive messages. This brings a limitation that both client and server must use this library to communicate. We can not use this library to build a client of an existing rdma server. So I have 2 suggestions that might help:

  1. Support setting up connections through RDMA CM.
  2. Add lower-level APIs to send/receive raw messages.

Crate examples won't compile.

It seems like something happened to one of the deps, possibly recently?

cargo build --example rpc
Updating crates.io index
error: failed to select a version for the requirement simd-abstraction = "^0.5.0"
candidate versions found which didn't match: 0.7.1
location searched: crates.io index
required by package hex-simd v0.5.0
... which satisfies dependency hex-simd = "^0.5.0" of package async-rdma v0.5.0 (/home/mike/dev/rust/async-rdma)
perhaps a crate was updated and forgotten to be re-vendored?

It looks like all prior versions to simd-abstraction 0.7.1 have been yanked from crates.io...?

contribution

How can I contribute to the project? Is there any discord or slack channel to discuss?

Failed to run example on ubuntu server 20.04

I tried to run the example as show in repo index page like:

cargo run --example rpc
cargo run --example rpc
    Finished dev [unoptimized + debuginfo] target(s) in 0.09s
     Running `target/debug/examples/rpc`
rpc server started
2023-08-21T15:50:53.641480Z ERROR async_rdma::error_utilities: OS error Os { code: 22, kind: InvalidInput, message: "Invalid argument" }
2023-08-21T15:50:53.641514Z ERROR async_rdma::error_utilities: OS error Os { code: 22, kind: InvalidInput, message: "Invalid argument" }
Invalid argument (os error 22)
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: ()', examples/rpc.rs:147:14
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 22, kind: InvalidInput, message: "Invalid argument" }', examples/rpc.rs:32:71

And I also run this example in another ubuntu 22.04 machine, and it succeed!

support append for memory region

async-rdma memory region need to use with lock_utilities::MappedRwLockReadGuard.

lmr.as_mut_slice().write(&[1_u8; 8])

I want to append(this can be done by wrapping with std::io::cursor) some bytes to mr to avoid useless copy.

But we can't build a cursor with (MappedRwLockReadGuard), like

    let cursor = std::io::Cursor::new(lmr.as_mut_slice());
	cursor.write(&[1_u8; 8])?;

so, i think we can consider providing a MappedRwLockWriteGuard<Cursor<&mut [u8]>>.

If needed, i can open a pr to add it.

How to forget memory with the jemalloc strategy.

Hello, I was using PyO3 to bind this rust code to python and is now struggling with the memory ownership problem (more specifically, double free).

When I try to pass an lmr to python, I do std::mem::forget(lmr) to make rust forget about the memory, as the memory is already somehow transfered to python by some unsafe pointer operations.

However, according to my reasoning, I think the jemalloc strategy does not give up this memory even lmr.drop() is not called. Is this expected? Is there way to forget this memory in jemalloc strategy?

P.S. raw strategy works fine and no double free occurs.

Potential unsoundness

There is no guarantee that the safety requirements are met here.

https://doc.rust-lang.org/std/slice/fn.from_raw_parts.html#safety

/// Get the memory region as slice
#[inline]
fn as_slice(&self) -> &[u8] {
unsafe { slice::from_raw_parts(self.as_ptr(), self.length()) }
}

/// Get the memory region as mut slice
#[inline]
fn as_mut_slice(&mut self) -> &mut [u8] {
unsafe { slice::from_raw_parts_mut(self.as_mut_ptr(), self.length()) }
}

Cancellation safety

Disclaimer: I know nothing about the details of RDMA before.

I have some questions about the example at first glance.

/// Server can't aware of rdma `read` or `write`, so we need sync with client
/// before client `read` and after `write`.
async fn sync_with_client(rdma: &Rdma) {
let mut lmr_sync = rdma
.alloc_local_mr(Layout::new::<Request>())
.map_err(|err| println!("{}", &err))
.unwrap();
//write data to lmr
unsafe { *(lmr_sync.as_mut_ptr() as *mut Request) = Request::Sync };
rdma.send(&lmr_sync)
.await
.map_err(|err| println!("{}", &err))
.unwrap();
}

  1. When will the local memory region be deallocated?
  2. Does the unsafe statement have any safety requirements?
  3. What if the future returned by sync_with_client is dropped (cancelled) before the request completes?

Related article: https://zhuanlan.zhihu.com/p/346219893

run example server panic

I use the command-server example. It looks None for the value `imm" and can't unwrap.

/data/async-rdma$ RUST_BACKTRACE=1 cargo run --example server localhost 9527
    Finished dev [unoptimized + debuginfo] target(s) in 0.08s
     Running `target/debug/examples/server localhost 9527`
server start
accepted
[1, 1, 1, 1, 1, 1, 1, 1]
[1, 1, 1, 1, 1, 1, 1, 1], None
thread 'main' panicked at 'called `Option::unwrap()` on a `None` value', examples/server.rs:32:27
stack backtrace:
   0: rust_begin_unwind
             at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/std/src/panicking.rs:575:5
   1: core::panicking::panic_fmt
             at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/core/src/panicking.rs:64:14
   2: core::panicking::panic
             at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/core/src/panicking.rs:114:5
   3: core::option::Option<T>::unwrap
             at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/core/src/option.rs:823:21
   4: server::receive_data_with_imm_from_client::{{closure}}
             at ./examples/server.rs:32:23
   5: server::main::{{closure}}
             at ./examples/server.rs:103:45
   6: tokio::runtime::park::CachedParkThread::block_on::{{closure}}
             at /home/zhaoxu/.cargo/registry/src/rsproxy.cn-8f6827c7555bfaf8/tokio-1.28.1/src/runtime/park.rs:283:63
   7: tokio::runtime::coop::with_budget
             at /home/zhaoxu/.cargo/registry/src/rsproxy.cn-8f6827c7555bfaf8/tokio-1.28.1/src/runtime/coop.rs:107:5
   8: tokio::runtime::coop::budget
             at /home/zhaoxu/.cargo/registry/src/rsproxy.cn-8f6827c7555bfaf8/tokio-1.28.1/src/runtime/coop.rs:73:5
   9: tokio::runtime::park::CachedParkThread::block_on
             at /home/zhaoxu/.cargo/registry/src/rsproxy.cn-8f6827c7555bfaf8/tokio-1.28.1/src/runtime/park.rs:283:31
  10: tokio::runtime::context::BlockingRegionGuard::block_on
             at /home/zhaoxu/.cargo/registry/src/rsproxy.cn-8f6827c7555bfaf8/tokio-1.28.1/src/runtime/context.rs:315:13
  11: tokio::runtime::scheduler::multi_thread::MultiThread::block_on
             at /home/zhaoxu/.cargo/registry/src/rsproxy.cn-8f6827c7555bfaf8/tokio-1.28.1/src/runtime/scheduler/multi_thread/mod.rs:66:9
  12: tokio::runtime::runtime::Runtime::block_on
             at /home/zhaoxu/.cargo/registry/src/rsproxy.cn-8f6827c7555bfaf8/tokio-1.28.1/src/runtime/runtime.rs:304:45
  13: server::main


/data/async-rdma$ RUST_BACKTRACE=1 cargo run --example client localhost 9527
    Finished dev [unoptimized + debuginfo] target(s) in 0.08s
     Running `target/debug/examples/client localhost 9527`
client start
connected
[0, 0, 0, 0, 0, 0, 0, 0]
[1, 1, 1, 1, 1, 1, 1, 1]
[1, 1, 1, 1, 2, 2, 2, 2]
^[[A^[[A^[[A^[[Athread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Custom { kind: TimedOut, error: "Timeout for waiting for a response." }', examples/client.rs:134:37
stack backtrace:
   0: rust_begin_unwind
             at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/std/src/panicking.rs:575:5
   1: core::panicking::panic_fmt
             at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/core/src/panicking.rs:64:14
   2: core::result::unwrap_failed
             at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/core/src/result.rs:1790:5
   3: core::result::Result<T,E>::unwrap
             at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/core/src/result.rs:1112:23
   4: client::main::{{closure}}
             at ./examples/client.rs:134:5
   5: tokio::runtime::park::CachedParkThread::block_on::{{closure}}
             at /home/zhaoxu/.cargo/registry/src/rsproxy.cn-8f6827c7555bfaf8/tokio-1.28.1/src/runtime/park.rs:283:63
   6: tokio::runtime::coop::with_budget
             at /home/zhaoxu/.cargo/registry/src/rsproxy.cn-8f6827c7555bfaf8/tokio-1.28.1/src/runtime/coop.rs:107:5
   7: tokio::runtime::coop::budget
             at /home/zhaoxu/.cargo/registry/src/rsproxy.cn-8f6827c7555bfaf8/tokio-1.28.1/src/runtime/coop.rs:73:5
   8: tokio::runtime::park::CachedParkThread::block_on
             at /home/zhaoxu/.cargo/registry/src/rsproxy.cn-8f6827c7555bfaf8/tokio-1.28.1/src/runtime/park.rs:283:31
   9: tokio::runtime::context::BlockingRegionGuard::block_on
             at /home/zhaoxu/.cargo/registry/src/rsproxy.cn-8f6827c7555bfaf8/tokio-1.28.1/src/runtime/context.rs:315:13
  10: tokio::runtime::scheduler::multi_thread::MultiThread::block_on
             at /home/zhaoxu/.cargo/registry/src/rsproxy.cn-8f6827c7555bfaf8/tokio-1.28.1/src/runtime/scheduler/multi_thread/mod.rs:66:9
  11: tokio::runtime::runtime::Runtime::block_on
             at /home/zhaoxu/.cargo/registry/src/rsproxy.cn-8f6827c7555bfaf8/tokio-1.28.1/src/runtime/runtime.rs:304:45
  12: client::main
             at ./examples/client.rs:141:5
  13: core::ops::function::FnOnce::call_once
             at /rustc/2c8cc343237b8f7d5a3c3703e3a87f2eb2c54a74/library/core/src/ops/function.rs:250:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

Connection timeout between servers

I'm new to RDMA in general so I'm just trying to explore how this works. I tried the cargo run --example rpc on my localhost and it runs fine. However when I try to run this between 2 servers I get a timeout error once the client connects. I'm using the cargo run --example client and cargo run --example server. Any troubleshooting tips?

Optimize the poll workflow of CQ

Unnecessary allocation: the only one entry can be writted into a stack slot (MaybeUninit<WorkCompletion>) instead of a Vec.

/// Poll one work completion from CQ
pub(crate) fn poll_single(&self) -> io::Result<WorkCompletion> {
let polled = self.poll(1)?;
polled
.into_iter()
.next()
.ok_or_else(|| io::Error::new(io::ErrorKind::WouldBlock, ""))
}

Yes, poll only used by single_poll for now, so the allocation of vector is unnecessary.
single_poll is an interim scheme, we will optimize the poll workflow latter and track it in a new issue.

Originally posted by @GTwhy in #49 (comment)

Currently, event_listener only poll one CQE at a time, and the poll workflow can be optimized to poll more at a time.

Check more hardware limitations

    Shall we also check other hardware limitations?

Originally posted by @rogercloud in #89 (comment)

We may get errors returned by ibv APIs with ambiguous error messages because of hardware limitations.

We can check more hardware limitations, capture more errors and add some debug information.

[Feature] How to register Memory Region on specific memory address?

The API to create MR, i.e. alloc_local_mr(layout) function, only has the option to allocate new memory in an ad-hoc way (use malloc inside the alloc_local_mr function).

Is it possible that I register an existing memory block as MR. As I want to transfer a large block of memory and don't want it to be copied which may cause performance issue. Many thanks in advance!

Encounter a doubt when using MR in async-rdma

In studying this library, the following result were encountered. And I was confused about that.
request structure:
Screenshot 2023-03-01 at 22 48 29
client side:
Screenshot 2023-03-01 at 22 48 37
server side:
Screenshot 2023-03-01 at 22 48 58
main:
Screenshot 2023-03-01 at 22 48 20
result:

rpc server started
header size: 28
size: 80, 
Msg { header: RequestHeader { id: 0, type: 0, flags: 0, total_length: 0, file_path_length: 0, meta_data_length: 0, data_length: 0 }, meta_data: [], data: [] }
server side size: 80, 
Msg { header: RequestHeader { id: 0, type: 0, flags: 0, total_length: 6, file_path_length: 0, meta_data_length: 3, data_length: 3 }, meta_data: [1, 2, 3, 4, 5], data: [2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2] }
response: Msg { header: ResponseHeader { id: 0, status: 0, flags: 0, total_length: 0, meta_data_length: 0, data_length: 0 }, meta_data: [1, 2, 3, 4, 5], data: [4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4] }

RDMA Soundness Scope

It is impossible to prevent an incorrect remote process from triggering UB in the local process.
Like mmap and /proc/self/mem, such a situation is out of the control of Rust language.

There are two solutions:

  • Document the behavior and remove it from soundness concerns.
    Like rust-lang/rust#97837
  • Put an unsafe function on the way from network connections to active RDMA connections.
    The function means "trust the remote process" while it is impossible to check whether the remote process is correct actually.

Timeout from single side is still unsound because UB may happen when system time goes back.

Related:

Large array transfer problem

Env

Two Ubuntu20.04 LTS clients for Vmware

Description

The two clients can perform RDMA large array operations separately by binding 127.0.0.1, However, when one client performs RDMA operations to the other on a large array, no packets are sent, I used Wireshark but did not catch packets, Operating with a small array can catch packets.

By the way, for arrays that exceed the size of the ulimit-L parameter, I do not need to unlock the limited memory size permissions for RDMA on a single client. When two clients perform RDMA operations, the large array needs to unlock the limit of Ulimit-L in order to connect

the two clients rust code is as following:

// server
use async_rdma::{LocalMrReadAccess, LocalMrWriteAccess, Rdma, RdmaListener};
use portpicker::pick_unused_port;
use std::{
    alloc::Layout,
    io,
    net::{Ipv4Addr, SocketAddrV4},
    time::Duration,
};

const LEN:usize = 30 * 1024 * 1024;

async fn server(addr: SocketAddrV4) -> io::Result<()> {
    let rdma_listener = RdmaListener::bind(addr).await?;
    let rdma = rdma_listener.accept(1, 1, 512).await?;
    // receive the immediate writen by client
    let imm_v = rdma.receive_write_imm().await?;
    //  print the immdedaite value
    print!("immediate value: {}", imm_v);
    // receive the metadata of the mr sent by client
    let lmr = rdma.receive_local_mr().await?;
    // print the content of lmr, which was `write` by client
    unsafe { println!("{:?}", *(lmr.as_ptr() as *const [i32;LEN])) };
    // wait for the agent thread to send all reponses to the remote.
    tokio::time::sleep(Duration::from_secs(1)).await;
    Ok(())
}
#[tokio::main]
async fn main() {
    let addr = SocketAddrV4::new(Ipv4Addr::new(192, 168, 59, 131), 8081);
    server(addr).await.unwrap();
    tokio::time::sleep(Duration::from_secs(3)).await;
}


// client
use async_rdma::{LocalMrReadAccess, LocalMrWriteAccess, Rdma, RdmaListener};
use portpicker::pick_unused_port;
use std::{
    alloc::Layout,
    io,
    net::{Ipv4Addr, SocketAddrV4},
    time::Duration,
};


const LEN:usize = 30 * 1024 * 1024;

async fn client(addr: SocketAddrV4) -> io::Result<()> {
    let rdma = Rdma::connect(addr, 1, 1, 512).await?;
    let mut lmr = rdma.alloc_local_mr(Layout::new::<[i32;LEN]>())?;
    let mut rmr = rdma.request_remote_mr(Layout::new::<[i32;LEN]>()).await?;
    // load data into lmr
    unsafe { *(lmr.as_mut_ptr() as *mut [i32;LEN]) = [0;LEN] };
    // write the content of local mr into remote mr  with immediate value
    rdma.write_with_imm(&lmr, &mut rmr, 1).await?;
    // then send rmr's metadata to server to make server aware of it
    rdma.send_remote_mr(rmr).await?;
    Ok(())
}
#[tokio::main]
async fn main() {
    let addr = SocketAddrV4::new(Ipv4Addr::new(192, 168, 59, 131), 8081);
    client(addr)
        .await
        .map_err(|err| println!("{}", err))
        .unwrap();
    tokio::time::sleep(Duration::from_secs(3)).await;
}

JEMALLOC_RETAIN defined cause memory allocate failed

When I tried to run client and server of example, I got a message as follow:
thread 'tokio-runtime-worker' panicked at 'internal error: entered unreachable code: dalloc failed, check if JEMALLOC_RETAIN defined. If so, then please undefined it', src/mr_allocator.rs:485:13

I found that JEMALLOC_RETAIN is defined in C file inside tikv-jemalloc-sys, but there is no feature associated with this, so how could I un define it?

Claiming the name on crates.io?

There have been some "bad actors" who falsely claim names of successful projects on crates.io. It seems like datenlord hasn't really published any work on crates.io yet. This leaves the room for the "bad actors". Would you consider publish this crate or even just claim the name first?

confusion about support for Infiniband and mlx5?

(First I have to declare that I am really a novice to RDMA ...)

This library works pretty fine when I am playing RDMA based on SoftiWARP. However now when I am trying to apply all my code and tool to a platform equipped with the MLX5 card, I failed to connect my rdma server and client (by specifying ip:port) anymore.

Sorry that this question might be stupid but I am not quite clear about the relationship and difference between these concepts: RoCE, RXE, iWARP, MLX, Infiniband(IB) ...

Does this repo only support RoCE but not Infiniband(IB)?

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.