
const-concat's Introduction

Const string concatenation

Rust has some great little magic built-in macros that you can use. A particularly helpful one for building up paths and other text at compile time is concat!. It takes any number of comma-separated literals and returns their concatenation as a &'static str:

const HELLO_WORLD: &str = concat!("Hello", ", ", "world!");

assert_eq!(HELLO_WORLD, "Hello, world!");
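concat! is not limited to string literals. As a minimal sketch (the constant names MIXED and NAMED are made up for illustration), it also accepts integer, character and boolean literals, as well as the output of other built-in macros such as stringify!:

```rust
// Literals of different kinds are stringified and joined at compile time.
pub const MIXED: &str = concat!("test", 10, 'b', true);

// Built-in macros are expanded first, so their output counts as a literal.
pub const NAMED: &str = concat!("id_", stringify!(answer));
```

Here MIXED is "test10btrue" and NAMED is "id_answer".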

This is nice, but it falls apart pretty quickly. You can use concat! on the strings returned from magic macros like env! and include_str!, but you can't use it on constants:

const GREETING: &str = "Hello";
const PLACE: &str = "world";
const HELLO_WORLD: &str = concat!(GREETING, ", ", PLACE, "!");

This produces the error:

error: expected a literal
 --> src/main.rs:3:35
  |
3 | const HELLO_WORLD: &str = concat!(GREETING, ", ", PLACE, "!");
  |                                   ^^^^^^^^

error: expected a literal
 --> src/main.rs:3:51
  |
3 | const HELLO_WORLD: &str = concat!(GREETING, ", ", PLACE, "!");
  |                                                   ^^^^^

Well, with const_concat! you can! It works just like the concat! macro:

#[macro_use]
extern crate const_concat;

const GREETING: &str = "Hello";
const PLACE: &str = "world";
const HELLO_WORLD: &str = const_concat!(GREETING, ", ", PLACE, "!");

assert_eq!(HELLO_WORLD, "Hello, world!");

All this, and it's implemented entirely without hooking into the compiler. So how does it work? Through dark, evil magicks.

Firstly, why can't this just work the same as runtime string concatenation? Well, runtime string concatenation allocates a new String, but allocation isn't possible at compile time, so we have to do everything on the stack. Also, we can't do iteration at compile time, so there's no way to copy the characters from the source strings to the destination string.

Let's look at the implementation. The "workhorse" of this macro is the concat function:

pub const unsafe fn concat<First, Second, Out>(a: &[u8], b: &[u8]) -> Out
where
    First: Copy,
    Second: Copy,
    Out: Copy,
{
    #[repr(C)]
    #[derive(Copy, Clone)]
    struct Both<A, B>(A, B);

    let arr: Both<First, Second> =
        Both(*transmute::<_, &First>(a), *transmute::<_, &Second>(b));

    transmute(arr)
}

So what we do is convert both the (arbitrarily-sized) input slices to pointers to constant-size arrays (well, actually to pointer-to-First and pointer-to-Second, but the intent is that First and Second are fixed-size arrays). Then, we dereference them. This is wildly unsafe: there's nothing saying that a.len() is the same as the length of the First type parameter.

We put them next to one another in a #[repr(C)] tuple struct, which essentially concatenates them together in memory. Finally, we transmute it to the Out type parameter. If First is [u8; N0] and Second is [u8; N1] then Out should be [u8; N0 + N1].

Why not just use a trait with associated constants? Well, here's an example of what that would look like:

trait ConcatHack {
    const A_LEN: usize;
    const B_LEN: usize;
}

pub const unsafe fn concat<C>(
    a: &[u8],
    b: &[u8],
) -> [u8; C::A_LEN + C::B_LEN]
where
    C: ConcatHack,
{
    #[repr(C)]
    #[derive(Copy, Clone)]
    struct Both<A, B>(A, B);

    let arr: Both<[u8; C::A_LEN], [u8; C::B_LEN]> =
        Both(*transmute::<_, &[u8; C::A_LEN]>(a), *transmute::<_, &[u8; C::B_LEN]>(b));

    transmute(arr)
}

This doesn't work though, because the compiler does not allow a fixed-size array length to depend on a type parameter, so C::A_LEN and C::B_LEN can't be used there. So instead we use individual type parameters for each constant-size array.
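The layout trick itself (two byte arrays side by side in a #[repr(C)] struct, reinterpreted as one longer array) can be checked in isolation with concrete types. The following is only a sketch: join and both_size are hypothetical helpers, the lengths 5 and 8 are picked to fit the example strings, and it uses std::mem::transmute in an ordinary runtime function rather than in a const fn:

```rust
use std::mem::{size_of, transmute};

#[repr(C)]
#[derive(Copy, Clone)]
struct Both<A, B>(A, B);

/// Joins a 5-byte and an 8-byte array by reinterpreting the struct that
/// holds both of them as a single 13-byte array.
pub fn join(a: [u8; 5], b: [u8; 8]) -> [u8; 13] {
    // SAFETY: both fields are byte arrays (alignment 1), so `#[repr(C)]`
    // lays them out back to back with no padding, giving exactly 13 bytes.
    unsafe { transmute::<Both<[u8; 5], [u8; 8]>, [u8; 13]>(Both(a, b)) }
}

/// Size of the struct; equal to the sum of the two array lengths.
pub fn both_size() -> usize {
    size_of::<Both<[u8; 5], [u8; 8]>>()
}
```

Calling join(*b"Hello", *b", world!") yields the bytes of "Hello, world!". With field types that have stricter alignment the struct could contain padding, so the trick depends on the fields being byte arrays.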

Wait, though, if you look at the documentation for std::mem::transmute, at the time of writing it's not a const fn. What's going on here then? Well, I wrote my own transmute:

#[allow(unions_with_drop_fields)]
pub const unsafe fn transmute<From, To>(from: From) -> To {
    union Transmute<From, To> {
        from: From,
        to: To,
    }

    Transmute { from }.to
}

This is allowed in a const fn where std::mem::transmute is not. Finally, let's look at the macro itself:

#[macro_export]
macro_rules! const_concat {
    ($a:expr, $b:expr) => {{
        let bytes: &'static [u8] = unsafe {
            &$crate::concat::<
                [u8; $a.len()],
                [u8; $b.len()],
                [u8; $a.len() + $b.len()],
            >($a.as_bytes(), $b.as_bytes())
        };

        unsafe { $crate::transmute::<_, &'static str>(bytes) }
    }};
    ($a:expr, $($rest:expr),*) => {{
        const TAIL: &str = const_concat!($($rest),*);
        const_concat!($a, TAIL)
    }};
}

So first we create a &'static [u8] and then we transmute it to &'static str. This works for now because &[u8] and &str have the same layout, but it's not guaranteed to work forever. Borrowing the result as &'static [u8] works even though the right-hand side of that assignment is local to this scope because of something called "rvalue static promotion".
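Rvalue static promotion can be seen on its own, outside the macro. In this small sketch (promoted_bytes is a made-up name) a reference to an array literal written inside a function body is given the 'static lifetime and returned:

```rust
/// Returns a `'static` slice even though the array literal appears
/// inside the function body.
pub fn promoted_bytes() -> &'static [u8] {
    // The array literal is a constant expression, so the compiler places it
    // in static memory instead of on this function's stack frame.
    let bytes: &'static [u8; 5] = &[b'H', b'e', b'l', b'l', b'o'];
    bytes
}
```

Without promotion the borrow would refer to a temporary that dies at the end of the function, and the code would be rejected.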

The eagle-eyed among you may have also noticed that &[u8; N] and &[u8] have different sizes, since the latter is a fat pointer. Well my constant transmute doesn't check size (union fields can have different sizes) and for now the layout of both of these types puts the pointer first. There's no way to fix that on the current version of the compiler, since &slice[..] isn't implemented for constant expressions.
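The size difference is easy to confirm. This sketch (Mixed and the three helper functions are hypothetical, used only for illustration) compares a reference to an array with a reference to a slice, and shows that a union happily holds fields of different sizes:

```rust
use std::mem::size_of;

/// A union is as large as its largest field; the fields need not match.
#[allow(dead_code)]
pub union Mixed {
    small: u8,
    large: u32,
}

/// `&[u8; 4]` is a thin pointer: just an address.
pub fn thin_pointer_size() -> usize {
    size_of::<&[u8; 4]>()
}

/// `&[u8]` is a fat pointer: an address plus a length.
pub fn fat_pointer_size() -> usize {
    size_of::<&[u8]>()
}

/// Size of the union, determined by its `u32` field.
pub fn union_size() -> usize {
    size_of::<Mixed>()
}
```

The slice reference is twice the size of the array reference, which is why a size-checked transmute between the two would be rejected.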

This currently doesn't work in trait associated constants. I do have a way to support trait associated constants but again, you can't access type parameters in array lengths so that unfortunately doesn't work. Finally, it requires quite a few nightly features:

#![feature(const_fn, const_str_as_bytes, const_str_len, const_let, untagged_unions)]
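For comparison only: on recent stable compilers, loops and const generics are allowed in const fn, so a loop-based variant can be sketched without any transmute or feature gates. This is not how const-concat is implemented. The names concat_bytes, LEN and BYTES are made up for this sketch, and the output length still has to be supplied separately because an array length can't be computed from other generic parameters:

```rust
/// Copies `a` followed by `b` into a new array of length `N`.
pub const fn concat_bytes<const N: usize>(a: &[u8], b: &[u8]) -> [u8; N] {
    // Fails at compile time if the caller picked the wrong output length.
    assert!(a.len() + b.len() == N);
    let mut out = [0u8; N];
    let mut i = 0;
    while i < a.len() {
        out[i] = a[i];
        i += 1;
    }
    let mut j = 0;
    while j < b.len() {
        out[a.len() + j] = b[j];
        j += 1;
    }
    out
}

const GREETING: &str = "Hello";
const PLACE: &str = ", world!";
// The combined length is computed once and used as the array length.
const LEN: usize = GREETING.len() + PLACE.len();
const BYTES: [u8; LEN] = concat_bytes(GREETING.as_bytes(), PLACE.as_bytes());

/// The concatenated string, validated as UTF-8 at compile time.
pub const HELLO_WORLD: &str = match std::str::from_utf8(&BYTES) {
    Ok(s) => s,
    Err(_) => panic!("concatenation produced invalid UTF-8"),
};
```

HELLO_WORLD is then "Hello, world!", computed entirely at compile time.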

UPDATE

I fixed the issue where the transmute relies on the pointer in &[u8] being first by instead transmuting a pointer to the first element of the array. The code now looks like so:

pub const unsafe fn concat<First, Second, Out>(a: &[u8], b: &[u8]) -> Out
where
    First: Copy,
    Second: Copy,
    Out: Copy,
{
    #[repr(C)]
    #[derive(Copy, Clone)]
    struct Both<A, B>(A, B);

    let arr: Both<First, Second> =
        Both(*transmute::<_, &First>(&a[0]), *transmute::<_, &Second>(&b[0]));

    transmute(arr)
}
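The same idea, reinterpreting a pointer to the first byte as a pointer to a whole array, can be sketched on a recent stable compiler with a raw-pointer cast. This is an illustration under assumptions, not the crate's code: first_n is a hypothetical helper, it runs at runtime, and it uses as_ptr rather than &bytes[0] because indexing with [0] fails on an empty slice:

```rust
/// Copies the first `N` bytes of `bytes` into a fixed-size array.
pub fn first_n<const N: usize>(bytes: &[u8]) -> [u8; N] {
    // Without this check the read below could run past the end of the slice.
    assert!(bytes.len() >= N, "slice is shorter than the requested array");
    // SAFETY: the pointer is valid for `N` byte reads (checked above), and
    // `[u8; N]` has alignment 1, so any byte pointer is suitably aligned.
    unsafe { *bytes.as_ptr().cast::<[u8; N]>() }
}
```

For example, first_n::<3>(b"Hello") returns the bytes of "Hel". Unlike the concat function above, this sketch checks the length before reading.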

const-concat's People

Contributors

eira-fransham, pthariensflame
