Code Monkey home page Code Monkey logo

weightedrand's Introduction

weightedrand โš–๏ธ

PkgGoDev CodeFactor Build Status codecov

Fast weighted random selection for Go.

Randomly selects an element from some kind of list, where the chances of each element to be selected are not equal, but rather defined by relative "weights" (or probabilities). This is called weighted random selection.

Usage

import (
    /* ...snip... */
    "github.com/mroth/weightedrand/v2"
)

func main() {
    chooser, _ := weightedrand.NewChooser(
        weightedrand.NewChoice('๐Ÿ’', 0),
        weightedrand.NewChoice('๐Ÿ‹', 1),
        weightedrand.NewChoice('๐ŸŠ', 1),
        weightedrand.NewChoice('๐Ÿ‰', 3),
        weightedrand.NewChoice('๐Ÿฅ‘', 5),
    )
    // The following will print ๐Ÿ‹ and ๐ŸŠ with 0.1 probability, ๐Ÿ‰ with 0.3
    // probability, and ๐Ÿฅ‘ with 0.5 probability. ๐Ÿ’ will never be printed. (Note
    // the weights don't have to add up to 10, that was just done here to make
    // the example easier to read.)
    result := chooser.Pick()
    fmt.Println(result)
}

Performance

The existing Go library that has a comparable implementation of this is github.com/jmcvetta/randutil, which optimizes for the single operation case. In contrast, this library creates a presorted cache optimized for binary search, allowing repeated selections from the same set to be significantly faster, especially for large data sets.

Comparison of this library versus randutil.ChooseWeighted on my workstation. For repeated samplings from large collections, weightedrand will be much quicker:

Num choices randutil weightedrand weightedrand -cpu=8*
10 201 ns/op 38 ns/op 2.9 ns/op
100 267 ns/op 51 ns/op 4.1 ns/op
1,000 1012 ns/op 67 ns/op 5.4 ns/op
10,000 8683 ns/op 83 ns/op 6.9 ns/op
100,000 123500 ns/op 105 ns/op 12.0 ns/op
1,000,000 2399614 ns/op 218 ns/op 17.2 ns/op
10,000,000 26804440 ns/op 432 ns/op 35.1 ns/op

*: Since v0.3.0 weightedrand can efficiently utilize a single Chooser across multiple CPU cores in parallel, making it even faster in overall throughput. See PR#2 for details. Informal benchmarks conducted on an Intel Xeon W-2140B CPU (8 core @ 3.2GHz, hyperthreading enabled).

Don't be mislead by these numbers into thinking weightedrand is always the right choice! If you are only picking from the same distribution once, randutil will be faster. weightedrand optimizes for repeated calls at the expense of some initialization time and memory storage.

Requirements

weightedrand >= v2 requires go1.18 or greater. For support on earlier versions of go, use weightedrand v1.

Credits

To better understand the algorithm used in this library (as well as the one used in randutil) check out this great blog post: Weighted random generation in Python.

weightedrand's People

Contributors

dependabot[bot] avatar mroth avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar

weightedrand's Issues

Feature Request: Choice Removal

Hi,
First of all, thanks for your great library.
Do you accept feature request? It would be great for Chooser to have a method or an option to remove already picked items. Would you consider this to be on your roadmap?
Thank you.

RFC: int vs uint for Choice.Weight

Currently, Choice is defined such that:

type Choice struct {
	Item   interface{}
	Weight uint
}

My original intention here was from he "make invalid states unrepresentable" perspective -- and negative weights could cause errors, however #4 will introduce safety checks at creation time in NewChooser, making this less necessary.

However, it is potentially more idiomatic to utilize an int here, since it tends to be the numeric value type most commonly used by the Go community. The definition would then change slightly such that any Choice with Weight <= 0 would represent something that would never be selected. This could avoid potential typecasting (an ergonomics issue, not relevant to performance) when creating new Choices.

There are primary downside I can think of:

  • This would be another breaking API change, but since we will be pushing those anyhow in the next semver release due to #4, now is the right time to consider it.
  • The running total algorithm used in a Chooser relies on all weights being >= 0, while zero values are fine, negative values will cause state corruption during initialization. This could be easily handled by having NewChooser simply skip any items with weight <= 0 and not insert them in the internal data table at all. This is still fine from a behavior perspective, since they can never be selected by a Pick, however it would mean that we could likely not have options in the future to enumerate all items in a Chooser, or specify that the enumeration would only reflect selectable items. This does not seem to be a likely needed use case anyhow, however.

I'm currently mildly in favor of making this change now from an ergonomics perspective to hone in on the best API before stabilizing this library, but would love feedback from users. Otherwise it's probably not worth the disruption unless it's really better for users.

request to publish a new version

Hello.

I'm using this library and found there was a bug patch recently. (#6)

Can you please release new version when you have time?

thanks in advance

[HELP] How to programatically add a varying amount of choices

Hey, was I wondering if I could get some help on how to utilize this package. Feel free to close this issue if it is out of place or obvious.

I need to add a varying amount of choices with different weights based on the context of the usage. So I obviously created a for loop, but I can't for the life of me use the NewChoice() command correctly. The chooser does not accept arrays and I am not sure on how I can add choices one by one to the same instance. I tried to create an array of weightedrand.Choice structs but the NewChooser() does not want them.

Have I totally misunderstood or is this not possible? If it is possible, could I get an example?

Thank you so much for reading my issue.

Nit: undeprecate PickSource

I do want to set my seed, unrelated to lock contention. I want consistent reproduction of my random results with the same seed. Seeding global rand is deprecated (for good reason), so PickSource is the only option using this library.

Keeping the note in the method doc makes sense, but marking the entire method deprecated leaves no supported route for this use case in the library.

Use of int for internal counter

Wouldn't it make more sense to use a uint64 for the total and max fields inside the chooser?

As it stands you allow the user to use a uint64 for the weight but the max weight total cannot exceed a maxInt. And that maxInt is different for users on 32 and 64 bit systems creating uncertainty in how the software will run on users computers.

I've had a bug report of my software being unable to run on 32 bit systems because I need the total weight to exceed a maxInt32 (and ideally maxInt64 for that matter).

It seems like if you're going to allow uint64s as weights then internally the counter should be able to handle that maximum value.

Feature: Add and remove choice on in the runtime

Hey,

I Just wanted to know if you can add a feature to add, remove choice in the run time without creating new choise everytime.

something like

	Choice := wr.NewChoice("hahaha", 50)
	chooser.Add(Choice)

update docs regarding PickSource for go1.21

As per golang/go@0b9974d, global rand should no longer suffer from lock contention issues, removing the primary use case (imo) for this function. Once this commit is live (in go1.21 I believe) PickSource will still have value for using alternate randomness sources so should probably not be deprecated(?), but should be likely be unnecessary for high-throughput environments utilizing multiple goroutines.

At that point, the documentation should be adjusted to no longer recommend it for that particular use case.

(Notes on deprecation: https://github.com/golang/go/wiki/Deprecated)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.