Comments (11)

camshaft avatar camshaft commented on June 14, 2024

Direct mode simply maps the byte patterns produced by the rng into the type and rejects the value if that mapping doesn't succeed. This is useful for drivers that record coverage metrics and how they relate to the input. Forced mode instead tries its best to get the generator to succeed.

But as you've seen, DirectRng is almost never what you want, especially when the type is really picky about what is valid. That being said, I think its biggest benefit is being able to sample, say, 10,000 calls and estimate the probability that a valid value is generated. This would give you a "pickiness score" for your type that may help you change it to be less picky.
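
A minimal sketch of the "pickiness score" idea, under stated assumptions: `XorShift64` and `acceptance_rate` are hypothetical helpers written for this illustration and are not part of bolero's API. The closures stand in for direct-style generators that return `None` when the raw bits are rejected.

```rust
/// Tiny xorshift64 generator so the sketch needs no external crates.
/// Hypothetical stand-in for whatever RNG the driver actually uses.
struct XorShift64(u64);

impl XorShift64 {
    fn next_u32(&mut self) -> u32 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        // Use the upper half of the 64-bit state as the output.
        (x >> 32) as u32
    }
}

/// Fraction of `samples` attempts for which `generate` returned `Some`.
fn acceptance_rate<T>(
    rng: &mut XorShift64,
    samples: u32,
    mut generate: impl FnMut(&mut XorShift64) -> Option<T>,
) -> f64 {
    let mut accepted = 0u32;
    for _ in 0..samples {
        if generate(&mut *rng).is_some() {
            accepted += 1;
        }
    }
    f64::from(accepted) / f64::from(samples)
}

fn main() {
    let mut rng = XorShift64(0x9E37_79B9_7F4A_7C15);
    // Direct-style char: accept the raw u32 only if it is a valid code point.
    let char_rate = acceptance_rate(&mut rng, 10_000, |r| char::from_u32(r.next_u32()));
    // Direct-style u8: every bit pattern is valid, so nothing is rejected.
    let u8_rate = acceptance_rate(&mut rng, 10_000, |r| Some(r.next_u32() as u8));
    println!("char acceptance: {char_rate:.4}, u8 acceptance: {u8_rate:.4}");
}
```

Only 1,112,064 of the 2^32 possible `u32` values are valid code points, so the `char` rate comes out far below 1%, while the `u8` rate is exactly 1.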

from bolero.

shaobo-he-aws avatar shaobo-he-aws commented on June 14, 2024

> Direct mode simply maps the byte patterns produced by the rng into the type and rejects the value if that mapping doesn't succeed. This is useful for drivers that record coverage metrics and how they relate to the input. Forced mode instead tries its best to get the generator to succeed.

Thank you for your reply, but I'd like more clarification on this. IMO, the RNG should produce bytes on demand from the generator instead of generating a fixed number of bytes ahead of generation, as the fuzzer does. Unless we adopt something like generate-and-test, I think generation with an RNG should almost certainly succeed.

camshaft avatar camshaft commented on June 14, 2024

> RNG should produce bytes on demand from the generator instead of generating a fixed number of bytes ahead of generation, as the fuzzer does.

This is exactly what happens in both modes. Neither mode limits the length of the input. They only specify what to do when the bytes don't satisfy the constraints for the type.

shaobo-he-aws avatar shaobo-he-aws commented on June 14, 2024

> This is exactly what happens in both modes. Neither mode limits the length of the input. They only specify what to do when the bytes don't satisfy the constraints for the type.

For the particular example in this issue, the constraints are the same, right? Put another way, could you provide an example where the bytes don't satisfy the constraints and DirectRng and ForcedRng behave differently?

shaobo-he-aws avatar shaobo-he-aws commented on June 14, 2024

Another example is the string generation example in #157. I don't see why the generation results should differ, given that both modes produce an unlimited amount of random bytes.

camshaft avatar camshaft commented on June 14, 2024

For both of these examples it doesn't matter how long the byte input is. Consider the first one: you've said "generate a usize that is always 5". The DirectRng will generate a usize and only return Some(usize) if the value it picked was 5 (very unlikely). The ForcedRng will ensure that the usize it generated is coerced into range, so it always produces Some(5).
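
The difference for the "always 5" case can be sketched as follows. This is an illustration under assumptions, not bolero's implementation: `Mode` and `sample_usize` are hypothetical names, `raw` stands for the value decoded from the RNG's bytes, and folding with a modulo is just one possible way to coerce a value into range.

```rust
use std::ops::RangeInclusive;

/// Hypothetical stand-in for the two driver modes discussed in this thread.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Mode {
    Direct,
    Forced,
}

/// `raw` plays the role of the usize decoded from RNG-supplied bytes.
fn sample_usize(raw: usize, bounds: RangeInclusive<usize>, mode: Mode) -> Option<usize> {
    if bounds.is_empty() {
        // No value can satisfy an empty range in either mode.
        return None;
    }
    match mode {
        // Direct: keep the raw value only if it already satisfies the bounds.
        Mode::Direct => bounds.contains(&raw).then_some(raw),
        // Forced: fold the raw value into the bounds so generation succeeds.
        // (Modulo is an assumption of this sketch.)
        Mode::Forced => {
            let (lo, hi) = (*bounds.start(), *bounds.end());
            match (hi - lo).checked_add(1) {
                Some(span) => Some(lo + raw % span),
                // The bounds cover every usize, so any raw value is fine.
                None => Some(raw),
            }
        }
    }
}

fn main() {
    for raw in [0usize, 5, 12_345] {
        let direct = sample_usize(raw, 5..=5, Mode::Direct);
        let forced = sample_usize(raw, 5..=5, Mode::Forced);
        println!("raw={raw}: direct={direct:?}, forced={forced:?}");
    }
}
```

With bounds `5..=5`, the direct branch succeeds only when `raw` happens to be 5, while the forced branch returns `Some(5)` for every input.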

For the String example, it's the same thing. The String generator says "generate N random chars". In Direct mode, the char generator says "generate a random u32 and return Some(char) if the u32 value is a valid codepoint". If any of those fail, the whole String generator will fail. In the case of Forced mode, it'll make sure that u32 value is always mapped somehow to a valid char and will always return Some(char).
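
To illustrate why a single bad code point fails the whole String in Direct mode, here is a small sketch. `direct_string` is a hypothetical helper, not bolero's generator, and the `raw` slice stands in for the u32 values the RNG would supply.

```rust
/// Direct-style sketch: every raw u32 must already be a valid code point.
fn direct_string(raw: &[u32]) -> Option<String> {
    // Collecting into Option<String> short-circuits on the first None,
    // so one invalid value rejects the entire string.
    raw.iter().map(|&value| char::from_u32(value)).collect()
}

fn main() {
    // 'h' and 'i' are valid code points.
    println!("{:?}", direct_string(&[0x68, 0x69]));
    // 0xD800 is a surrogate, so the whole string is rejected.
    println!("{:?}", direct_string(&[0x68, 0xD800, 0x69]));
}
```

If each char independently succeeds with probability p, a string of N chars succeeds with probability p^N, which is why rejection compounds so quickly for longer strings.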

camshaft avatar camshaft commented on June 14, 2024

Direct mode generator for char - should only succeed if the generated u32 directly maps to a valid char:

    if fill.mode() == DriverMode::Direct {
        let value = u32::sample(fill, Bound::Unbounded, Bound::Unbounded)?;
        return char::from_u32(value);
    }

Forced mode generator for char - should always succeed:

    const START: u32 = 0xD800;
    const LEN: u32 = 0xE000 - START;
    fn map_to_u32(c: &char) -> u32 {
        match *c as u32 {
            c if c >= START => c - LEN,
            c => c,
        }
    }
    let lower = BoundExt::map(min, map_to_u32);
    let upper = match max {
        Bound::Excluded(v) => Bound::Excluded(map_to_u32(v)),
        Bound::Included(v) => Bound::Included(map_to_u32(v)),
        Bound::Unbounded => Bound::Included(map_to_u32(&char::MAX)),
    };
    let mut value = u32::sample(fill, BoundExt::as_ref(&lower), BoundExt::as_ref(&upper))?;
    if value >= START {
        value += LEN;
    }
    char::from_u32(value)
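
The snippet above depends on bolero internals (`fill`, `BoundExt`, `u32::sample`). A self-contained sketch of the same gap-skipping idea, for the unbounded case only, might look like the following. `direct_char`, `forced_char`, and `SCALARS` are hypothetical names, and reducing the raw value with a modulo is an assumption of this sketch rather than what bolero's sampler does.

```rust
/// First surrogate code point; surrogates are not valid `char` values.
const START: u32 = 0xD800;
/// Number of surrogate code points (0xD800..0xE000).
const LEN: u32 = 0xE000 - START;
/// Number of valid Unicode scalar values once the surrogate gap is removed.
const SCALARS: u32 = char::MAX as u32 + 1 - LEN;

/// Direct-style: reject any raw value that is not already a valid code point.
fn direct_char(raw: u32) -> Option<char> {
    char::from_u32(raw)
}

/// Forced-style: fold the raw value into the dense index space 0..SCALARS,
/// then shift indices at or past the gap over the surrogate range.
fn forced_char(raw: u32) -> Option<char> {
    let mut value = raw % SCALARS; // modulo is an assumption of this sketch
    if value >= START {
        value += LEN;
    }
    char::from_u32(value)
}

fn main() {
    for raw in [0x41, 0xD800, u32::MAX] {
        println!(
            "raw={raw:#x}: direct={:?}, forced={:?}",
            direct_char(raw),
            forced_char(raw)
        );
    }
}
```

Because every index in `0..SCALARS` lands on a valid scalar value after the shift, `forced_char` never returns `None`, whereas `direct_char` rejects surrogates and anything above `char::MAX`.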

shaobo-he-aws avatar shaobo-he-aws commented on June 14, 2024

> For the String example, it's the same thing. The String generator says "generate N random chars". In Direct mode, the char generator says "generate a random u32 and return Some(char) if the u32 value is a valid codepoint".

I'm sorry, I'm still confused. IIUC, DirectRng does generate-and-test. For instance, gen_usize(Bound::Included(&5), Bound::Included(&5)) means generating a usize and testing whether it satisfies the constraint (== 5). Following the same reasoning, gen_char(Bound::Included(&'a'), Bound::Included(&'z')) should generate a char and test whether it's a lower-case letter. If we don't provide any bounds, it should produce a valid char, just like gen_u8(Bound::Unbounded, Bound::Unbounded) should produce a u8?

camshaft avatar camshaft commented on June 14, 2024

I think I see where you're getting hung up. The problem is that u8 and char differ: any 8-bit pattern is a valid u8, so random input maps trivially to a valid u8. OTOH, not every bit pattern is a valid char, so arbitrary bytes can't simply be mapped to one. By using a char you've placed constraints on the underlying bytes that you don't have with u8. Does that make sense?

You can see this same thing with Kani. The primitive integers just construct arbitrary bit patterns and they're done: https://github.com/model-checking/kani/blob/06c0fbe60d793f8b47fadc5fc5a83932e96f2389/library/kani/src/arbitrary.rs#L29-L67

In contrast, the char implementation needs to place constraints on the bit pattern to ensure it's valid: https://github.com/model-checking/kani/blob/06c0fbe60d793f8b47fadc5fc5a83932e96f2389/library/kani/src/arbitrary.rs#L85-L95

shaobo-he-aws avatar shaobo-he-aws commented on June 14, 2024

Thank you for your patience. I think I get the idea now. That being said, I think users familiar with property-based testing but not with fuzzing could easily be confused. Maybe it's a good idea to describe the two modes in more detail in the Rust docs.

shaobo-he-aws avatar shaobo-he-aws commented on June 14, 2024

Closed this issue as my questions have been answered.
