const-concat's Introduction

Const string concatenation

Rust has some great little magic built-in macros that you can use. A particularly helpful one for building up paths and other text at compile-time is concat!. This takes any number of comma-separated literals and returns their concatenation as a &'static str:

const HELLO_WORLD: &str = concat!("Hello", ", ", "world!");

assert_eq!(HELLO_WORLD, "Hello, world!");
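As a small aside, concat! is not limited to string literals: according to the standard library documentation, integer, floating point, boolean and character literals are stringified before being joined. A minimal sketch (the constant name VERSION_TAG is made up for illustration):

```rust
// Illustrative constant; the name is not part of the original README.
// Each literal is stringified in order and the pieces are joined.
const VERSION_TAG: &str = concat!("v", 1, ".", 2, '-', true);
```

Every argument here is still a literal, which is exactly the restriction that the rest of this README works around.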

This is nice, but it falls apart pretty quickly. You can use concat! on the strings returned from magic macros like env! and include_str! but you can't use it on constants:

const GREETING: &str = "Hello";
const PLACE: &str = "world";
const HELLO_WORLD: &str = concat!(GREETING, ", ", PLACE, "!");

This produces the error:

error: expected a literal
 --> src/main.rs:3:35
  |
3 | const HELLO_WORLD: &str = concat!(GREETING, ", ", PLACE, "!");
  |                                   ^^^^^^^^

error: expected a literal
 --> src/main.rs:3:51
  |
3 | const HELLO_WORLD: &str = concat!(GREETING, ", ", PLACE, "!");
  |                                                   ^^^^^

Well with const_concat! you can! It works just like the concat! macro:

#[macro_use]
extern crate const_concat;

const GREETING: &str = "Hello";
const PLACE: &str = "world";
const HELLO_WORLD: &str = const_concat!(GREETING, ", ", PLACE, "!");

assert_eq!(HELLO_WORLD, "Hello, world!");

All this, and it's implemented entirely without hooking into the compiler. So how does it work? Through dark, evil magicks.

Firstly, why can't this just work the same as runtime string concatenation? Well, runtime string concatenation allocates a new String, but allocation isn't possible at compile-time, so we have to do everything on the stack. Also, at the time of writing we can't do iteration at compile-time, so there's no way to copy the characters from the source strings to the destination string one at a time.

Let's look at the implementation. The "workhorse" of this macro is the concat function:

pub const unsafe fn concat<First, Second, Out>(a: &[u8], b: &[u8]) -> Out
where
    First: Copy,
    Second: Copy,
    Out: Copy,
{
    #[repr(C)]
    #[derive(Copy, Clone)]
    struct Both<A, B>(A, B);

    let arr: Both<First, Second> =
        Both(*transmute::<_, &First>(a), *transmute::<_, &Second>(b));

    transmute(arr)
}

So what we do is convert both the (arbitrarily-sized) input slices to references to constant-size arrays (well, actually to reference-to-First and reference-to-Second, but the intent is that First and Second are fixed-size arrays). Then, we dereference them. This is wildly unsafe: there's nothing saying that a.len() is the same as the length of the First type parameter. We put them next to one another in a #[repr(C)] tuple struct, which essentially concatenates them together in memory. Finally, we transmute it to the Out type parameter. If First is [u8; N0] and Second is [u8; N1] then Out should be [u8; N0 + N1].

Why not just use a trait with associated constants? Well, here's an example of what that would look like:

trait ConcatHack {
    const A_LEN: usize;
    const B_LEN: usize;
}

pub const unsafe fn concat<C>(
    a: &[u8],
    b: &[u8],
) -> [u8; C::A_LEN + C::B_LEN]
where
    C: ConcatHack,
{
    #[repr(C)]
    #[derive(Copy, Clone)]
    struct Both<A, B>(A, B);

    let arr: Both<[u8; C::A_LEN], [u8; C::B_LEN]> =
        Both(*transmute::<_, &[u8; C::A_LEN]>(a), *transmute::<_, &[u8; C::B_LEN]>(b));

    transmute(arr)
}

This doesn't work though, because the compiler doesn't let a fixed-size array length depend on a type parameter, so a type like [u8; C::A_LEN] is rejected. So instead we use individual type parameters for each constant-size array.
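The layout trick that concat relies on can be checked in isolation with concrete types. The sketch below is not the crate's code: it uses std::mem::transmute at runtime (which verifies that the sizes match at compile time), and the function name join_hello_world is hypothetical.

```rust
use std::mem::transmute;

// Same shape as the helper struct inside the README's `concat` function.
#[repr(C)]
#[derive(Copy, Clone)]
struct Both<A, B>(A, B);

// Hypothetical helper for illustration: joins two fixed-size byte arrays.
fn join_hello_world() -> [u8; 12] {
    let pair: Both<[u8; 5], [u8; 7]> = Both(*b"Hello", *b", world");
    // SAFETY: both fields are `u8` arrays (alignment 1), so `repr(C)` places
    // them back to back with no padding, and 5 + 7 == 12 matches the output.
    unsafe { transmute::<Both<[u8; 5], [u8; 7]>, [u8; 12]>(pair) }
}
```

With concrete lengths the compiler can do the size check for us; the generic const version in this README has to give that check up, which is where the unsafety comes from.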

Wait, though: if you look at the documentation for std::mem::transmute, at the time of writing it's not a const fn. What's going on here then? Well, I wrote my own transmute:

#[allow(unions_with_drop_fields)]
pub const unsafe fn transmute<From, To>(from: From) -> To {
    union Transmute<From, To> {
        from: From,
        to: To,
    }

    Transmute { from }.to
}

This is allowed in a const fn where std::mem::transmute is not. Finally, let's look at the macro itself:

#[macro_export]
macro_rules! const_concat {
    ($a:expr, $b:expr) => {{
        let bytes: &'static [u8] = unsafe {
            &$crate::concat::<
                [u8; $a.len()],
                [u8; $b.len()],
                [u8; $a.len() + $b.len()],
            >($a.as_bytes(), $b.as_bytes())
        };

        unsafe { $crate::transmute::<_, &'static str>(bytes) }
    }};
    ($a:expr, $($rest:expr),*) => {{
        const TAIL: &str = const_concat!($($rest),*);
        const_concat!($a, TAIL)
    }};
}

So first we create a &'static [u8] and then we transmute it to &'static str. This works for now because &[u8] and &str have the same layout, but it's not guaranteed to work forever. The cast to &'static [u8] works even though the right-hand side of that assignment is local to this scope because of something called "rvalue static promotion".
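A minimal sketch of rvalue static promotion on its own, outside the macro. The function names are hypothetical, and the second function uses the checked std::str::from_utf8 at runtime rather than the layout-dependent transmute described above.

```rust
// Hypothetical function names, for illustration only.
fn promoted_bytes() -> &'static [u8] {
    // `[72, 105]` is a constant expression, so borrowing it yields a
    // `&'static` reference instead of a reference to a stack temporary.
    &[72, 105]
}

fn promoted_str() -> &'static str {
    // Checked alternative to transmuting `&[u8]` into `&str`.
    std::str::from_utf8(promoted_bytes()).expect("valid UTF-8")
}
```

Without promotion, promoted_bytes would be rejected for returning a reference to a local temporary.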

The eagle-eyed among you may have also noticed that &[u8; N] and &[u8] have different sizes, since the latter is a fat pointer. Well my constant transmute doesn't check size (union fields can have different sizes) and for now the layout of both of these types puts the pointer first. There's no way to fix that on the current version of the compiler, since &slice[..] isn't implemented for constant expressions.
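The size difference is easy to observe. The sketch below assumes a current compiler, where a slice reference is stored as a pointer plus a length (the language does not promise this layout), and the helper as_array4 is a hypothetical name showing the checked runtime conversion that was not available in constant expressions.

```rust
use std::convert::TryFrom;
use std::mem::size_of;

// A reference to a fixed-size array is a thin pointer.
const THIN: usize = size_of::<&[u8; 4]>();
// A slice reference currently carries a pointer and a length.
const FAT: usize = size_of::<&[u8]>();

// Hypothetical helper: view a slice as a fixed-size array, with a length
// check. Returns `None` unless the slice holds exactly four bytes.
fn as_array4(bytes: &[u8]) -> Option<&[u8; 4]> {
    <&[u8; 4]>::try_from(bytes).ok()
}
```

A union-based transmute skips the size check entirely, so reading the thin pointer out of the fat one only works because the data pointer happens to come first.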

This currently doesn't work in trait associated constants. I do have a way to support trait associated constants but again, you can't access type parameters in array lengths so that unfortunately doesn't work.

Finally, it requires quite a few nightly features:

#![feature(const_fn, const_str_as_bytes, const_str_len, const_let, untagged_unions)]

UPDATE

I fixed the issue where the transmute relies on the pointer in &[u8] being first by instead transmuting a pointer to the first element of the array. The code now looks like so:

pub const unsafe fn concat<First, Second, Out>(a: &[u8], b: &[u8]) -> Out
where
    First: Copy,
    Second: Copy,
    Out: Copy,
{
    #[repr(C)]
    #[derive(Copy, Clone)]
    struct Both<A, B>(A, B);

    let arr: Both<First, Second> = Both(
        *transmute::<_, *const First>(a.as_ptr()),
        *transmute::<_, *const Second>(b.as_ptr()),
    );

    transmute(arr)
}
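The pointer-based read in the updated concat can be sketched at runtime with an explicit length check, which is the guarantee the const version leaves to the caller. The helper name first_four is hypothetical.

```rust
// Hypothetical helper mirroring the pointer-based read in the updated `concat`.
fn first_four(bytes: &[u8]) -> Option<[u8; 4]> {
    if bytes.len() < 4 {
        return None;
    }
    // SAFETY: the length check guarantees four readable bytes behind the
    // pointer, and `[u8; 4]` has alignment 1, so the read is in bounds
    // and aligned.
    Some(unsafe { *bytes.as_ptr().cast::<[u8; 4]>() })
}
```

Because only the data pointer is used, this no longer depends on where the pointer sits inside the fat &[u8] reference.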

const-concat's People

Contributors

cad97, eira-fransham, hsjoihs, pthariensflame

const-concat's Issues

Does not compile: unions in const fn are unstable

Looks like we need to enable a new unstable feature const_fn_union. Tested with rustc 1.30.0-nightly (33b923fd4 2018-08-18).

error[E0658]: unions in const fn are unstable (see issue #51909)
  --> src/lib.rs:10:5
   |
10 |     Transmute { from }.to
   |     ^^^^^^^^^^^^^^^^^^^^^
   |
   = help: add #![feature(const_fn_union)] to the crate attributes to enable

@Vurich

Stable impl

#![no_std]

use core::mem::ManuallyDrop;

const unsafe fn transmute_prefix<From, To>(from: From) -> To {
    union Transmute<From, To> {
        from: ManuallyDrop<From>,
        to: ManuallyDrop<To>,
    }

    ManuallyDrop::into_inner(
        Transmute {
            from: ManuallyDrop::new(from),
        }
        .to,
    )
}

/// # Safety
///
/// `Len1 + Len2 >= Len3`
#[doc(hidden)]
#[allow(non_upper_case_globals)]
pub const unsafe fn concat<const Len1: usize, const Len2: usize, const Len3: usize>(
    arr1: [u8; Len1],
    arr2: [u8; Len2],
) -> [u8; Len3] {
    #[repr(C)]
    struct Concat<A, B>(A, B);
    transmute_prefix(Concat(arr1, arr2))
}

#[macro_export]
macro_rules! const_concat {
    () => ("");
    ($a:expr) => ($a);

    ($a:expr, $b:expr $(,)?) => {{
        const A: &str = $a;
        const B: &str = $b;
        const BYTES: [u8; { A.len() + B.len() }] = unsafe {
            $crate::concat::<
                { A.len() },
                { B.len() },
                { A.len() + B.len() }
            >(
                *A.as_ptr().cast(),
                *B.as_ptr().cast(),
            )
        };
        unsafe { ::core::str::from_utf8_unchecked(&BYTES) }
    }};
    
    ($a:expr, $b:expr, $($rest:expr),+ $(,)?) => {{
        const TAIL: &str = $crate::const_concat!($b, $($rest),+);
        $crate::const_concat!($a, TAIL)
    }}
}

#[test]
fn tests() {
    const SALUTATION: &str = "Hello";
    const TARGET: &str = "world";
    const GREETING: &str = const_concat!(SALUTATION, ", ", TARGET, "!");
    const GREETING_TRAILING_COMMA: &str = const_concat!(SALUTATION, ", ", TARGET, "!",);

    assert_eq!(GREETING, "Hello, world!");
    assert_eq!(GREETING_TRAILING_COMMA, "Hello, world!");
}

Consider also const_format::concatcp!.

(For anyone following after me)
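For comparison, here is a sketch that avoids transmute altogether by copying bytes in a loop. It assumes a reasonably recent stable compiler (const generics, while loops and panics in const fn, and a const std::str::from_utf8); the helper name concat_bytes and the constant names are hypothetical and not part of the crate.

```rust
// Hypothetical helper: copies `a` then `b` into an array of length `N`.
const fn concat_bytes<const N: usize>(a: &[u8], b: &[u8]) -> [u8; N] {
    // Fails const evaluation (a compile error) if the lengths don't add up.
    assert!(a.len() + b.len() == N);
    let mut out = [0u8; N];
    let mut i = 0;
    while i < a.len() {
        out[i] = a[i];
        i += 1;
    }
    let mut j = 0;
    while j < b.len() {
        out[a.len() + j] = b[j];
        j += 1;
    }
    out
}

const GREETING: &str = "Hello, ";
const PLACE: &str = "world!";
const LEN: usize = GREETING.len() + PLACE.len();
const BYTES: [u8; LEN] = concat_bytes::<LEN>(GREETING.as_bytes(), PLACE.as_bytes());
// Checked conversion: invalid UTF-8 would abort compilation instead of
// producing an invalid `&str`.
const HELLO_WORLD: &str = match std::str::from_utf8(&BYTES) {
    Ok(s) => s,
    Err(_) => panic!("concatenation produced invalid UTF-8"),
};
```

The trade-off is verbosity at the call site: the length has to be computed in a separate constant, which is the part a macro would normally hide.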

Support constants that are numbers

I would like numbers to work in const_concat.

const S: &str = "abc";
const N: u32 = 123;
const C: &str = const_concat!(S, ", ", N); // "abc, 123"

Proof of concept:

#![feature(const_fn, const_fn_union, existential_type, untagged_unions)]

macro_rules! const_cond {
    (default => $e:expr,) => {
        $e
    };
    ($cond:expr => $e:expr, $($rest:tt)*) => {
        const_if($cond, $e, const_cond!($($rest)*))
    };
}

macro_rules! const_to_string {
    ($e:expr) => {{
        const fn const_define() -> This {
            $e
        }

        existential type This: ConstToString;
        const VALUE: This = const_define();
        const FAKE: This = unsafe { transmute::<usize, This>(1) };
        const TYPE: ConstType = <This as ConstToString>::TYPE;

        const IS_STR: bool = const_type_eq(TYPE, ConstType::StaticStr);
        const SEL_STR: This = const_if(IS_STR, VALUE, FAKE);
        const AS_STR: &str = unsafe { transmute::<This, &str>(SEL_STR) };

        const IS_U32: bool = const_type_eq(TYPE, ConstType::U32);
        const SEL_U32: This = const_if(IS_U32, VALUE, FAKE);
        const AS_U32: u32 = unsafe { transmute::<This, u32>(SEL_U32) };

        const DIGITS: [u8; 10] = [
            b'0' + (AS_U32 / 1_000_000_000 % 10) as u8,
            b'0' + (AS_U32 / 100_000_000 % 10) as u8,
            b'0' + (AS_U32 / 10_000_000 % 10) as u8,
            b'0' + (AS_U32 / 1_000_000 % 10) as u8,
            b'0' + (AS_U32 / 100_000 % 10) as u8,
            b'0' + (AS_U32 / 10_000 % 10) as u8,
            b'0' + (AS_U32 / 1_000 % 10) as u8,
            b'0' + (AS_U32 / 100 % 10) as u8,
            b'0' + (AS_U32 / 10 % 10) as u8,
            b'0' + (AS_U32 % 10) as u8,
        ];

        const NUM_STR: &str = unsafe {
            transmute::<&'static [u8], &'static str>(const_cond! {
                AS_U32 >= 1_000_000_000 => &DIGITS,
                AS_U32 >= 100_000_000 => &[DIGITS[1], DIGITS[2], DIGITS[3], DIGITS[4], DIGITS[5], DIGITS[6], DIGITS[7], DIGITS[8], DIGITS[9]],
                AS_U32 >= 10_000_000 => &[DIGITS[2], DIGITS[3], DIGITS[4], DIGITS[5], DIGITS[6], DIGITS[7], DIGITS[8], DIGITS[9]],
                AS_U32 >= 1_000_000 => &[DIGITS[3], DIGITS[4], DIGITS[5], DIGITS[6], DIGITS[7], DIGITS[8], DIGITS[9]],
                AS_U32 >= 100_000 => &[DIGITS[4], DIGITS[5], DIGITS[6], DIGITS[7], DIGITS[8], DIGITS[9]],
                AS_U32 >= 10_000 => &[DIGITS[5], DIGITS[6], DIGITS[7], DIGITS[8], DIGITS[9]],
                AS_U32 >= 1_000 => &[DIGITS[6], DIGITS[7], DIGITS[8], DIGITS[9]],
                AS_U32 >= 100 => &[DIGITS[7], DIGITS[8], DIGITS[9]],
                AS_U32 >= 10 => &[DIGITS[8], DIGITS[9]],
                default => &[DIGITS[9]],
            })
        };

        const STRING: &str = const_cond! {
            IS_STR => AS_STR,
            IS_U32 => NUM_STR,
            default => "ERROR",
        };

        STRING
    }};
}

enum ConstType {
    StaticStr,
    U32,
}

const fn const_type_eq(a: ConstType, b: ConstType) -> bool {
    a as u8 == b as u8
}

unsafe trait ConstToString: Copy {
    const TYPE: ConstType;
}

unsafe impl ConstToString for &'static str {
    const TYPE: ConstType = ConstType::StaticStr;
}

unsafe impl ConstToString for u32 {
    const TYPE: ConstType = ConstType::U32;
}

const fn const_if<T: Copy>(condition: bool, then: T, otherwise: T) -> T {
    [otherwise, then][condition as usize]
}

const unsafe fn transmute<From, To>(from: From) -> To {
    #[repr(C)]
    struct Pad<From> {
        from: From,
        zeros: [u8; 16],
    }

    #[allow(unions_with_drop_fields)]
    union Transmute<From, To> {
        from: Pad<From>,
        to: To,
    }

    Transmute {
        from: Pad {
            from,
            zeros: [0; 16],
        },
    }.to
}

fn main() {
    const N: u32 = 12345;
    const N_TO_STRING: &str = const_to_string!(N);

    const S: &str = "&str";
    const S_TO_STRING: &str = const_to_string!(S);

    println!("{:?}", N_TO_STRING); // "12345"
    println!("{:?}", S_TO_STRING); // "&str"
}
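On a recent stable compiler the digit extraction can be written with ordinary loops instead of the lookup chain above. This is only a sketch under that assumption (loops in const fn, const generics, const std::str::from_utf8); digit_count and u32_to_ascii are hypothetical names, and it covers u32 only.

```rust
// Number of decimal digits in `n` (at least 1, so that 0 renders as "0").
const fn digit_count(mut n: u32) -> usize {
    let mut count = 1;
    while n >= 10 {
        n /= 10;
        count += 1;
    }
    count
}

// Writes the decimal digits of `n` right-aligned into `LEN` bytes,
// padding with ASCII zeros on the left if `LEN` exceeds the digit count.
const fn u32_to_ascii<const LEN: usize>(mut n: u32) -> [u8; LEN] {
    let mut out = [b'0'; LEN];
    let mut i = LEN;
    while i > 0 {
        i -= 1;
        out[i] = b'0' + (n % 10) as u8;
        n /= 10;
    }
    out
}

const N: u32 = 12345;
const N_LEN: usize = digit_count(N);
const N_BYTES: [u8; N_LEN] = u32_to_ascii::<N_LEN>(N);
const N_TO_STRING: &str = match std::str::from_utf8(&N_BYTES) {
    Ok(s) => s,
    Err(_) => panic!("digits are always ASCII"),
};
```

Dispatching on the argument type, which the proof of concept does with existential types, is a separate problem that this sketch does not attempt.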

@Vurich
