burntsushi / byteorder
Rust library for reading/writing numbers in big-endian and little-endian.
License: The Unlicense
I updated to a recent nightly and I can no longer build byteorder because of mismatched types.
b.rs:278:52: 278:57 error: mismatched types:
expected `*mut u8`,
found `*const u8`
(values differ in mutability) [E0308]
/home/jp/.cargo/registry/src/github.com-1ecc6299db9ec823/byteorder-0.3.3/src/lib.rs:278 copy_nonoverlapping($dst.as_mut_ptr(), bytes, $size);
Looks like it's just a one-liner
While this sounds redundant (the write methods are designed for fixed-width types), it would allow the caller to more easily compute the amount of data written. As an example use case, consider sending a small packet of data over a UDP socket. Something like:
let mut buf = [0u8; 16];
let mut n = 0;
{
    let mut packet = &mut buf[..];
    n += try!(packet.write_u16::<BigEndian>(1u16));
    n += try!(packet.write_u16::<BigEndian>(2u16));
    n += try!(packet.write_u16::<BigEndian>(3u16));
}
let sock = try!(std::net::UdpSocket::bind("0.0.0.0:0"));
try!(sock.send_to(&buf[..n], "127.0.0.1:1234"));
It gets harder to determine how many bytes may have been written for increasingly complicated packets (perhaps with conditional or variable-length fields). And if new fields are added in the future, there is less risk of the sent buffer size not getting updated to match. Writing into a vector would also provide the necessary bookkeeping, but requires a heap allocation. This would also bring the API closer in line with the `write` signature on std::io::Write.
Is there much of a performance hit in returning a hard-coded usize value instead of the empty tuple?
extern crate byteorder;
fn main() {
byteorder::LittleEndian::default();
}
thread 'main' panicked at 'internal error: entered unreachable code'
Unreachable doesn't mean "please don't call this," it means "you can't get here." I think it is okay for this method to panic but it should be something other than unreachable. A handwritten message explaining why this method exists would be nice.
cc @fitzgen
@Tobba and I have worked on https://github.com/QuiltOS/core_io, a copy of Read and Write with an associated error type, so that it only needs core. Perhaps it would be nice to (optionally) extend these traits for no_std users?
byteorder is very heavily used, but its API has mostly remained the same since it was first released (it was inspired both by Go's encoding/binary package and by the pre-existing methods in Rust's old standard library that fulfilled a similar role). There was, however, significant discussion of its API in #27, but I feel that no consensus was reached, and I don't think there's an obviously better API given Rust in its current form. Therefore, I'd like to propose that we cut a 1.0 release in the next few weeks.
I think the only outstanding issue that we should try to resolve before 1.0 is #52.
Several types do not directly contain examples, nor do many methods. It's not clear how much example coverage we need, though the policy in std is for everything to have an example.
This is the tracking issue for the evaluation performed by the libs team last week.
My Cargo.toml file looks like this:
[package]
name = "lambda-db"
version = "0.0.1"
authors = ["asdasd < [email protected]>"]
[dependencies]
bincode = "*"
I'm using the latest rustc:
rustc 1.0.0-nightly (b0aad7dd4 2015-03-22) (built 2015-03-23)
However, I am getting build errors about the undeclared trait names Reader and Writer.
error: use of undeclared trait name `Writer`
/home/pez/.cargo/registry/src/github.com-1ecc6299db9ec823/byteorder-0.2.14/src/old.rs:225 impl<W: Writer> WriterBytesExt for W {}
I'm very new to Rust, otherwise I would fix this issue myself. Is it possible that I'm using the wrong version of Rust?
$ rustc -V
rustc 1.0.0-beta (built 2015-04-04)
$ cargo build
Updating registry `https://github.com/rust-lang/crates.io-index`
Downloading quickcheck v0.2.14
Compiling byteorder v0.3.8 (file:///Users/zimbatm/code/github.com/BurntSushi/byteorder)
src/lib.rs:87:1: 201:2 error: parameter `Self` is never used
src/lib.rs:87 pub trait ByteOrder {
src/lib.rs:88 /// Reads an unsigned 16 bit integer from `buf`.
src/lib.rs:89 ///
src/lib.rs:90 /// Task failure occurs when `buf.len() < 2`.
src/lib.rs:91 fn read_u16(buf: &[u8]) -> u16;
src/lib.rs:92
...
src/lib.rs:201:2: 201:2 help: consider removing `Self` or using a marker such as `core::marker::PhantomFn`
error: aborting due to previous error
Could not compile `byteorder`.
To learn more, run the command again with --verbose.
As I understand it, the purpose of this crate is to prepare for a world where we no longer have the endian reading/writing functions on Reader and Writer. As someone who uses those functions a lot, I would like to prepare my crates (namely bincode) for the new io crate. I'd be willing to help do the port if you are interested.
It makes me a little sad to see unsafe being used to convert a &[u32] into a &[u8] in octavo:
fn crypt(&mut self, input: &[u8], output: &mut [u8]) {
    assert_eq!(input.len(), output.len());
    if self.index == STATE_BYTES { self.update() }
    let buffer = unsafe {
        slice::from_raw_parts(self.buffer.as_ptr() as *const u8, STATE_BYTES)
    };
    for i in self.index..input.len() {
        output[i] = input[i] ^ buffer[i];
    }
    self.index = input.len();
}
We really ought to have a place to centralize this functionality so that it's well tested and safe across our ecosystem. Would it make sense to have this functionality be in byteorder?
It would also be interesting to support the inverse operation, where a &[u8] is converted into a (&[u8], &[u32], &[u8]), where the first and last slices are there to read a byte at a time until the slice is aligned. This style of operation could be useful for safely massaging a slice into something that can use SIMD (or at least SIMD-ish operations over a usize value).
The write_* methods of the ByteOrder trait accept a buffer and don't guarantee that they won't read from it. The drawback is that, strictly speaking, the provided buffer must not be uninitialized. I suggest providing some way of guaranteeing that the buffer won't be read from, so that passing an uninitialized buffer is fine.
Using ReadBytesExt is very noisy due to having to specify the endianness in every call:
let foo = reader.read_u16::<BigEndian>()?;
let bar = reader.read_u16::<BigEndian>()?;
let baz = reader.read_u32::<BigEndian>()?;
It would be nice if there was a reader adapter which stored the endianness to DRY this up:
pub struct ReadBytes<R, E> {
inner: R,
endianness: PhantomData<E>
}
impl<R, E> ReadBytes<R, E> where R: Read, E: ByteOrder {
// Alternately place this as an adapter method on `ReadBytesExt`
pub fn new(inner: R) -> Self {...}
// Duplicate methods of `ReadBytesExt` without the endianness parameter
}
Of course, all of this applies to WriteBytesExt as well.
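To check that the shape of this proposal works, here is a compiling std-only sketch. The `Endian` trait below is a hypothetical stand-in for byteorder's `ByteOrder` (just enough surface for one method), and the names are illustrative, not byteorder's API:

```rust
use std::io::{self, Read};
use std::marker::PhantomData;

// Hypothetical stand-in for byteorder::ByteOrder.
trait Endian {
    fn u16_from(bytes: [u8; 2]) -> u16;
}
struct Le;
impl Endian for Le {
    fn u16_from(bytes: [u8; 2]) -> u16 { u16::from_le_bytes(bytes) }
}

// The proposed adapter: the endianness is fixed once, in the type.
struct ReadBytes<R, E> {
    inner: R,
    endianness: PhantomData<E>,
}

impl<R: Read, E: Endian> ReadBytes<R, E> {
    fn new(inner: R) -> Self {
        ReadBytes { inner, endianness: PhantomData }
    }
    // No per-call endianness parameter needed.
    fn read_u16(&mut self) -> io::Result<u16> {
        let mut buf = [0u8; 2];
        self.inner.read_exact(&mut buf)?;
        Ok(E::u16_from(buf))
    }
}
```

Each subsequent call site then shrinks to `reader.read_u16()?` with no turbofish.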
From a discussion on #rust-beginners, there seems to be a need for a ByteOrder implementation that dispatches to LE/BE based on runtime information.
Essentially something like this:
enum Endianess { Little, Big }
impl ByteOrder for Endianess {
// boilerplate methods with `match` that dispatch to LE or BE
}
let endianess = get_endianess_at_runtime();
endianess.read_i32(&some_bytes);
The byteorder docs don't seem to say that the crate is focused solely on static/type-level checking, so I'm guessing this would be in scope for the library. Of course, this isn't strictly necessary, as you can probably just write the reading/writing code generically and move the LE/BE decision to a higher level, but it may simplify some use cases regardless.
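The match-based dispatch above can be sketched with std alone (the names here are illustrative; byteorder's trait methods are static, so a real impl would need the boilerplate per method):

```rust
use std::convert::TryInto;

#[derive(Clone, Copy)]
enum Endianness { Little, Big }

// Reads an i32 from the front of `buf`, dispatching on runtime endianness.
// Panics if `buf` holds fewer than 4 bytes.
fn read_i32_at(e: Endianness, buf: &[u8]) -> i32 {
    let bytes: [u8; 4] = buf[..4].try_into().unwrap();
    match e {
        Endianness::Little => i32::from_le_bytes(bytes),
        Endianness::Big => i32::from_be_bytes(bytes),
    }
}
```

A file parser can then pick the variant after reading the header byte and thread it through.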
It currently suggests the following:
[dependencies]
byteorder = "*"
In semantic versioning, the major version number indicates breaking changes, so this pattern could cause unexpected breakage in downstream libraries when 1.0 comes around. I would suggest the following:
[dependencies]
byteorder = "0.*"
This still uses a wildcard, so it's easy to keep up to date.
It seems like they are really missing! The 1.1.0 release introduced slice methods for the ReadBytesExt trait (read_*_into). It would be useful to have the corresponding write methods for the WriteBytesExt trait.
Please add support for the 24-bit int (little, big, signed, unsigned). In most cases, this is how i24s are implemented: take the three bytes (0x00, 0x00, 0x00), pad them with another null byte, and read the result as a regular int (0x00, 0x00, 0x00, 0x00).
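The pad-then-read approach can be sketched with std alone (function names are hypothetical; for the signed case, the pad goes on the low end and an arithmetic shift performs the sign extension):

```rust
// Unsigned: pad the high end with a zero byte, then read as u32.
fn read_u24_le(bytes: [u8; 3]) -> u32 {
    u32::from_le_bytes([bytes[0], bytes[1], bytes[2], 0x00])
}

// Signed: place the 3 bytes in the high end, then use an arithmetic
// right shift to sign-extend down to 24 bits of meaningful value.
fn read_i24_le(bytes: [u8; 3]) -> i32 {
    i32::from_le_bytes([0x00, bytes[0], bytes[1], bytes[2]]) >> 8
}
```

The big-endian variants are symmetric, with the pad byte on the other side.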
The current implementation uses &[u8] for reading and &mut [u8] for writing, which panics if the lengths are invalid. I suggest implementing alternative functions like read_arr_u16(buf: &[u8; 2]) (name subject to bikeshedding) that take a reference to an array. This would come in handy in combination with const generics, where code could guarantee the absence of panics (due to programmer error) and the elision of (unnecessary) checks. Once const generics are in Rust, the old (&[u8]) versions could be implemented using the new ones. I guess this could be done with macros.
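A minimal sketch of the proposed shape, with hypothetical names, using std conversions: the array-typed function cannot panic, and slice callers do the fallible length check exactly once.

```rust
use std::convert::TryInto;

// Length is enforced by the type; this function has no panic path.
fn read_arr_u16_le(buf: &[u8; 2]) -> u16 {
    u16::from_le_bytes(*buf)
}

// Slice-based callers convert once, handling the short-slice case
// explicitly instead of via a panic.
fn read_u16_le_checked(slice: &[u8]) -> Option<u16> {
    let arr: &[u8; 2] = slice.get(..2)?.try_into().ok()?;
    Some(read_arr_u16_le(arr))
}
```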
I use byteorder in my rawloader crate to read binary files with specific orderings, for example reading 16-bit unpacked little-endian data. On little-endian machines this could just be a memcpy. Currently rustc isn't able to optimize loops into memcpy, but maybe it would make sense to have an explicit API for this inside byteorder, the same way copy_from_slice already allows it for standard types. That way I could call a LittleEndian::read_u16s(from, to, n) function in byteorder; if the architecture matches, it does a memcpy, and if not, it does a loop or even uses a BSWAP instruction. Would that make sense as part of the scope of this crate?
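For reference, the bulk conversion being described can be written as a plain loop with std (the function name mirrors the proposed read_u16s, but this is a sketch, not byteorder's API); a dedicated API could replace the loop body with a memcpy on matching architectures:

```rust
// Decodes `dst.len()` little-endian u16s from `src`.
// Panics if `src` is not exactly twice as long as `dst`.
fn read_u16s_le(src: &[u8], dst: &mut [u16]) {
    assert_eq!(src.len(), 2 * dst.len());
    for (chunk, out) in src.chunks_exact(2).zip(dst.iter_mut()) {
        *out = u16::from_le_bytes([chunk[0], chunk[1]]);
    }
}
```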
I'm frequently in need of accessing individual bytes of an integer (be it when writing emulators or implementing network protocols), and I thought this functionality would be a good addition to the byteorder crate (since it is inherently dependent on the byte order). One way it could work is by defining extension traits for all integer types that look like this:
trait U32Ext {
fn bytes<B: ByteOrder>(self) -> [u8; 4];
}
It can be used like this:
let value = 123456i32;
let msb = value.bytes::<BigEndian>()[0];
let lsb = value.bytes::<BigEndian>()[3];
// These are equivalent:
let msb = value.bytes::<LittleEndian>()[3];
let lsb = value.bytes::<LittleEndian>()[0];
Of course, the methods must be marked #[inline] for reasonable performance (I've confirmed that these expressions are optimized properly, so only the computation for the accessed byte remains).
Thoughts?
Sometimes it is impossible to statically determine the required endianness in advance. For example, the TIFF image format defines endianness in the first byte of an input file, so it may be either big or little, but which one is unknown statically. It would be nice if I could use byteorder for this task too.
Please support usize; currently it's not possible to write one without something like this:
use std::mem;
use byteorder::{BigEndian, WriteBytesExt};

let length = "Hello World!".len();
let mut wtr = vec![];
wtr.write_uint::<BigEndian>(length as u64, mem::size_of::<usize>()).unwrap();
When I follow the documentation link on crates.io or in the README, the documentation is not in sync with the repo/crates.io. For example, I was looking at the Result returned by ReadBytesExt::read_u32(), which is different in 0.5.0.
If there is no problem with this, then I can steal the implementation from old_io and file a pull request.
u128/i128 support?
Upstream tracking issue: rust-lang/rust#35118
It would be nice if we could change our git dependencies to version dependencies on crates.io.
I think the APIs should return io::Result instead of byteorder::Result. It would be much easier to migrate the libraries that depend on old_io to the new io. I suggest the UnexpectedEOF error could be put into io::Error with the io::ErrorKind::Other kind. I could submit a PR.
As of this moment, the read_f32 and read_f64 methods will bitcast any sequence of 4/8 bytes to an f32/f64 and will never fail, even if the resulting float is a "signaling NaN." In particular, materializing signaling NaNs appears to be undefined behavior, although the topic is pretty cloudy. See rust-lang/rust#39271 for more details.
I think what this means is that these functions need to be modified to return a Result, and probably a custom error type as well, although it would be nice to just use whatever error type we end up with from rust-lang/rust#39271. This is a pretty major breaking change, so it will require a 2.0 release.
There is some confusion (at least in my head) around whether signaling NaN's are actually unsafe or not.
Like:
for word in file.iter_u16::<byteorder::LittleEndian>() {
    ...
}
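The requested adapter could be built on top of io::Read with std alone; a minimal sketch with hypothetical names (a trailing odd byte is treated as end-of-input here, which a real API might instead report as an error):

```rust
use std::io::{self, ErrorKind, Read};

// Hypothetical adapter: yields little-endian u16s until EOF.
struct U16LeIter<R> {
    inner: R,
}

impl<R: Read> Iterator for U16LeIter<R> {
    type Item = io::Result<u16>;

    fn next(&mut self) -> Option<Self::Item> {
        let mut buf = [0u8; 2];
        match self.inner.read_exact(&mut buf) {
            Ok(()) => Some(Ok(u16::from_le_bytes(buf))),
            // A clean EOF (or trailing odd byte) ends the iteration.
            Err(e) if e.kind() == ErrorKind::UnexpectedEof => None,
            Err(e) => Some(Err(e)),
        }
    }
}
```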
The documentation at http://burntsushi.net/rustdoc/byteorder seems to be outdated. The biggest issue that really confused me was that the Error type was removed (http://burntsushi.net/rustdoc/byteorder/enum.Error.html).
Please consider adding a NativeEndian byte order which corresponds to the byte order used on the host system. While NativeEndian can be dangerous for cross-platform compatibility, sometimes it is necessary. For example, some (poorly defined) file formats may use platform endianness for multi-byte numbers.
pub trait ByteOrder:
    Clone + Copy + Debug + Default + Eq + Hash + Ord + PartialEq + PartialOrd
I understand why these exist (#60), but I am worried about what happens if Rust adds another trait with a built-in derive that is not on this list. Our options would be to add it to the list, which would require a 2.0 release of byteorder, or to leave it off, which means the new derive won't work nicely with structs parameterized over byteorder. I think we should take steps to prevent extending this list from being a breaking change. In serde_json we use a trick like this in a similar situation:
mod private {
    pub trait Sealed {}
    impl Sealed for super::LittleEndian {}
    impl Sealed for super::BigEndian {}
}

/// This trait is sealed and cannot be implemented for types outside of byteorder.
pub trait ByteOrder:
    Clone + Copy + Debug + ... + private::Sealed
Now we are free to add supertraits and even trait methods (u128/i128 support, #65) without a major release.
The selling point of byteorder is that you can read and write little- / big-endian numbers, not that you can define your own wild and crazy byte orders, so I think it is reasonable to limit implementations to within the byteorder crate.
cc @fitzgen
Cannot build or run cargo test on 0.3.8:
$ cargo test
Compiling byteorder v0.3.8 (file:///Users/xavierlange/code/rust/byteorder)
Compiling libc v0.1.6
src/lib.rs:87:1: 201:2 error: parameter `Self` is never used
src/lib.rs:87 pub trait ByteOrder {
src/lib.rs:88 /// Reads an unsigned 16 bit integer from `buf`.
src/lib.rs:89 ///
src/lib.rs:90 /// Task failure occurs when `buf.len() < 2`.
src/lib.rs:91 fn read_u16(buf: &[u8]) -> u16;
src/lib.rs:92
...
src/lib.rs:201:2: 201:2 help: consider removing `Self` or using a marker such as `core::marker::PhantomFn`
error: aborting due to previous error
Build failed, waiting for other jobs to finish...
Could not compile `byteorder`.
Running rustc from Mac Homebrew:
$ rustc --version
rustc 1.0.0-beta (built 2015-04-03)
$ brew info rust
rust: stable 1.0.0-beta (bottled), HEAD
http://www.rust-lang.org/
/usr/local/Cellar/rust/1.0.0-beta (5619 files, 274M) *
Poured from bottle
From: https://github.com/Homebrew/homebrew/blob/master/Library/Formula/rust.rb
Currently, it looks like the byte order must be selected in the code. It would be nice to have:
enum Endian {
Little,
Big,
}
Or something like that.
At the moment, the byteorder crate uses an enum to generalize over the endianness and the method name to select the type. Neither is very ergonomic to use. I propose the following interface:
trait WriteBytesExt<T> {
    fn write_le(&mut self, n: T) -> io::Result<()>;
    fn write_be(&mut self, n: T) -> io::Result<()>;
}

impl<W> WriteBytesExt<u8> for W where W: Write {
    fn write_le(&mut self, n: u8) -> io::Result<()> {
        ....
    }
    ....
}
First of all, it gets rid of the enum. Since the enum is purely a compile-time parameter, it cannot be used for dynamic dispatch; this is as good or as bad as having it directly in the method name, so I do not see the point of having it. Secondly, it gets rid of the redundant type name in the signature.
This shortens the method call significantly:
w.write_u16::<LittleEndian>(42u16)
becomes
w.write_le(42u16)
My two points are: the enum is purely a compile-time parameter, so it adds nothing over putting the endianness in the method name; and the *BytesExt traits cannot be used to write generic code that abstracts over endianness. Again, no benefit for the user.

Can you release a new version with the change from #9?
Most docs say the function will panic if the buffer length is less than the value's size:
/// Panics when `buf.len() < 8`.
#[inline]
fn read_i64(buf: &[u8]) -> i64 {
but the macro that reads and writes bytes compares with <=:
macro_rules! read_num_bytes {
($ty:ty, $size:expr, $src:expr, $which:ident) => ({
assert!($size == ::core::mem::size_of::<$ty>());
assert!($size <= $src.len());
Should it be assert!($size < $src.len())? Or should the docs say <=? (Edit: never mind, the macro is comparing the value's size with the buffer length, so the assertion and the docs agree.)
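To spell out why the two agree: the assertion passes exactly when the panic condition in the docs does not hold. Extracting the macro's guard as a plain function makes this checkable:

```rust
// The macro's guard, extracted: it passes iff the buffer holds at
// least `size` bytes, i.e. it fails exactly when buf.len() < size,
// which is the panic condition the docs describe.
fn guard_passes(size: usize, buf: &[u8]) -> bool {
    size <= buf.len()
}
```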
Some protocols specify 3-byte numbers (e.g. u24 in RFC 5246), and it would be really handy if byteorder supported a read_u24 function out of the box.
Right now, information about panic and error conditions is written inline with the main text. Per convention, it should be in sections titled "Panics" and "Errors".
Right now, making something compile-time generic over endianness is a huge pain:
extern crate byteorder;

use byteorder::{ByteOrder, LittleEndian};
use std::marker::PhantomData;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct EndianBuf<'a, Endian>(&'a [u8], PhantomData<Endian>) where Endian: ByteOrder;

impl<'a, Endian> EndianBuf<'a, Endian> where Endian: ByteOrder {
    fn new(buf: &'a [u8]) -> EndianBuf<'a, Endian> {
        EndianBuf(buf, PhantomData)
    }
}

fn main() {
    let buf = [1, 2, 3, 4, 5, 6];
    let a = EndianBuf::<LittleEndian>::new(&buf);
    let b = EndianBuf::<LittleEndian>::new(&buf);
    // Compiler error regarding this line!
    assert_eq!(a, b);
}
If you run rustc -Z unstable-options --pretty expanded endian_buf.rs, you will see this:
#[automatically_derived]
#[allow(unused_qualifications)]
impl<'a, Endian: ::std::clone::Clone> ::std::clone::Clone for EndianBuf<'a, Endian>
    where Endian: ByteOrder
{
    // ...snip...
}
The derived impls require that the type parameters also implement the trait.
If ByteOrder implied Eq, Clone, Copy, etc., the problems would go away. Arguably, this is a bug in the #[derive(Foo)] expansion code for not understanding PhantomData, but adding this would work around that bug.
Since one can't even instantiate BigEndian or LittleEndian, all the impls could be unreachable!().
Would you be willing to accept a patch that does this?
I can't write my converter from byteorder::new::Error, i.e.:
impl FromError<byteorder::new::Error> for MyError { .. }
Therefore converting errors becomes a pain, and I must map_err every time I'm reading primitives.
Thanks for the great work in creating this library! It has been working great for me. I do have one question, though: how does one read an arbitrary number of bytes, i.e. not a fixed size? For example, I want to read 6 bytes and put them into a [u8; 6] array.
Thanks,
Superhac
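For a fixed 6-byte chunk, io::Read::read_exact from std is enough; a small sketch (the function name is illustrative):

```rust
use std::io::{self, Read};

// Reads exactly 6 bytes from any reader into a fixed-size array,
// returning an error (including UnexpectedEof) on failure.
fn read_6<R: Read>(mut r: R) -> io::Result<[u8; 6]> {
    let mut buf = [0u8; 6];
    r.read_exact(&mut buf)?;
    Ok(buf)
}
```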
The byteorder crate casts a u8 pointer to, for example, *const u64 and reads from it. This requires that the pointer is well aligned for the new element type; otherwise this is UB.
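An alignment-safe alternative is to copy into a local array first and convert from that; compilers typically lower this to a single unaligned load, so there is usually no cost. A sketch with std (not byteorder's actual fix):

```rust
// Alignment-safe read: no pointer cast, so no alignment requirement
// on `buf`. Panics if `buf` holds fewer than 8 bytes.
fn read_u64_le(buf: &[u8]) -> u64 {
    let mut bytes = [0u8; 8];
    bytes.copy_from_slice(&buf[..8]);
    u64::from_le_bytes(bytes)
}
```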
Point it to docs.rs. byteorder doesn't have any docs there yet.
Is there any reason for it to exist?
Both io::Read and io::Write have no such bound.
I'd like to depend on this for an experiment where I'm rewriting a Python script which examines GIF files in a performance-optimized manner.
However, because of flaws in the Unlicense, it's on my blacklist, to ensure proper safety for my users regardless of the jurisdiction they're in.
Is there any chance you'd be willing to offer byteorder under something more carefully designed like the Creative Commons CC0 public domain dedication?
(CC0 is also what the FSF recommends if you want to release your code into the public domain.)