Working with mostly-UTF-8 byte strings, the two biggest things my current project is m

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

A from_str(_radix) analogue about bstr HOT 9 OPEN

burntsushi commented on May 31, 2024

A from_str(_radix) analogue

from bstr.

Comments (9)

BurntSushi commented on May 31, 2024 3

Yeah, I do kind of think this sort of thing is in scope. I've certainly wanted functions like this in the past as well.

Adding these sorts of routines for integers seems pretty reasonable, but this does open the door for wanting analogous parsing routines for floating point. That is definitely not a road I want to go down. It looks like the lexical project is already meeting this need anyway: https://docs.rs/lexical-core/0.7.4/lexical_core/

So I'd probably want to draw a line in the sand here: integer parsing APIs are okay, but not floats.

from bstr.

BurntSushi commented on May 31, 2024

This feels like a good "first issue" to me. In particular, routines could be added that are implemented by first converting the input to a &str and then reusing std's integer parsing. This isn't optimal, but we can always optimize it later.

For anyone who wants to work on this, it would probably be good to start by posting an API proposal in this issue. The API is probably the hardest part of this issue to be honest. For example, do we want a new sub-module with a bunch of concrete functions for each integer? Do we want to define a trait that we implement for all of the integer APIs? I kind of lean towards the latter.

from bstr.

alexjago commented on May 31, 2024

API Proposal: `impl FromBStrRadix for {integer}`

Rationale

Currently, we call e.g. i32::from_str_radix(src, radix).

By analogy, it seems correct to call i32::from_bstr_radix(src, radix).

Defining a new trait with that function and then implementing it for the integers makes that work (at the small cost of importing the trait).

It seems reasonable to name the trait after the function.

Implementation

I'm not quite ready to make a PR but I do have a fork up:
alexjago@3487786

What's in it: a new file src/from_bstr_radix.rs with the trait, impl and tests; plus two lines of changes to src/lib.rs to use the module and export the trait.

There are a couple of slight differences to the stdlib function.

Obviously, it takes src: &BStr rather than src: &str. Question: should this be an AsRef<[u8]> instead?
radix is the output type rather than always u32 (easily changeable if desired).
The error type is slightly different. The stdlib returns Err(ParseIntError {kind: IntErrorKind}) but since kind is private and there's no constructor, I just return Err(IntErrorKind::{variant}) directly. IMHO this is actually more useful.

edit to add: well, IntErrorKind would be more useful if it didn't require 1.55. Without it, might as well just wrap the std integer parsing.

from bstr.

BurntSushi commented on May 31, 2024

@alexjago Thanks! Nice work. Some responses:

It should definitely take AsRef<[u8]>, not &BStr. &[u8] is what all of the APIs in bstr are defined on.
I'm not sure why radix would be tied to the output type? u32 feels right to me here.
The error type should definitely not be just a bare IntErrorKind. We'll want to define our own ParseIntError. We can expose the kind through a method, just like std does.

The "bstr" in the FromBStrRadix name bugs me a bit, particularly since we're not taking a BStr but a &[u8]. Probably a better name is FromBytesRadix::from_bytes_radix. I don't love it, but I like it better than FromBStrRadix.

Feel free to submit a PR and we can work through the actual code itself.

from bstr.

BurntSushi commented on May 31, 2024

Requiring Rust 1.55 is fine. I'm going to bump the MSRV of bstr up higher. See the 1.0 issue.

from bstr.

alexjago commented on May 31, 2024

There's another API question which occurred to me around panicking and allocating.

The stdlib function (and therefore this one, for now) has an assert on the radix value... which includes the erroneous value in the error message.

That's an allocation, right? I don't think any other part of the functionality really needs to allocate.

So it's probably fixable with some clever #[cfg] work (non-alloc could just... not have a message).

But it also might be an option to redesign the API so that out-of-range radices are a recoverable error, presumably with an additional variant of IntErrorKind. Obviously if we do that then we can't just re-export the stdlib enum.

from bstr.

BurntSushi commented on May 31, 2024

You're saying an assert is an alloc? No, it isn't.

Let's stick with how std does things I think wherever we can. The radix is almost always going to be a static input determined by the programmer, so it seems sensible to panic for invalid radix values.

from bstr.

alexjago commented on May 31, 2024

Good to know that asserts don't alloc. (I got this misunderstanding because the format! macro returns String, but I suppose the underlying machinery just writes to some buffer...)

And yes, I agree that sticking to how std does things makes life more straightforward for everyone.

from bstr.

BurntSushi commented on May 31, 2024

Well, the assert might alloc when the assert fails. Sure. But that's okay.

from bstr.

A from_str(_radix) analogue about bstr HOT 9 OPEN

Comments (9)

API Proposal: `impl FromBStrRadix for {integer}`

Rationale

Implementation

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent

Comments (9)

API Proposal: impl FromBStrRadix for {integer}

Rationale

Implementation

Related Issues (20)

Recommend Projects

Recommend Topics

Recommend Org

API Proposal: `impl FromBStrRadix for {integer}`