Comments (21)

c-cube avatar c-cube commented on July 23, 2024 1

set_stat does not make sense, but set_stats can be done.

from qcheck.

c-cube avatar c-cube commented on July 23, 2024 1

Ok, no problem.

c-cube avatar c-cube commented on July 23, 2024 1
stats mod4:
  num: 100, avg: 1.52, stddev: 11.44, median 1, min 0, max 3
                0: #################################################                25
                1: ###################################################              26
                2: #########################################                        21
                3: #######################################################          28
stats num:
  num: 100, avg: 57.68, stddev: 353.53, median 55, min 0, max 120
              0-5: ######################                                            4
             6-11: ######################                                            4
            12-17: #######################################################          10
            18-23: ######################                                            4
            24-29: ######################################                            7
            30-35: ###########################                                       5
            36-41: ###########################                                       5
            42-47: ################                                                  3
            48-53: #################################                                 6
            54-59: ######################################                            7
            60-65: #################################                                 6
            66-71: ######################                                            4
            72-77: ################                                                  3
            78-83: ###########                                                       2
            84-89: #####                                                             1
            90-95: ######################################                            7
           96-101: ######################################                            7
          102-107: ###########################                                       5
          108-113: #################################                                 6
          114-119: ################                                                  3
          120-125: #####                                                             1

c-cube avatar c-cube commented on July 23, 2024

This is interesting, and definitely useful. The best option is probably an argument ?stats : (string * ('a -> int)) list added to arbitrary, pairing the name of each statistic with its counting function. Each such statistic would be collected and displayed individually (based on its name).
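
As a rough illustration of the proposed shape, here is a minimal stdlib-only sketch. The names stat and collect_stats are hypothetical and are not taken from QCheck; the sketch only shows how a list of named counting functions could be applied to generated values.

```ocaml
(* Hypothetical sketch, not QCheck's implementation.
   A statistic is a name paired with a counting function. *)
type 'a stat = string * ('a -> int)

(* Apply every named counting function to every value and return,
   for each statistic, its name with the list of measurements. *)
let collect_stats (stats : 'a stat list) (values : 'a list)
  : (string * int list) list =
  List.map (fun (name, f) -> (name, List.map f values)) stats

let () =
  let stats = [ ("mod4", (fun n -> n mod 4)); ("num", (fun n -> n)) ] in
  collect_stats stats [0; 5; 10; 15]
  |> List.iter (fun (name, xs) ->
       Printf.printf "stats %s: %s\n" name
         (String.concat " " (List.map string_of_int xs)))
(* prints:
   stats mod4: 0 1 2 3
   stats num: 0 5 10 15 *)
```

Keeping the measurements grouped by name makes it straightforward to display each statistic individually.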

If I implement this, would you be willing to test it?

A few comments:

  • computing basic statistics (average, median, etc.) should be simple enough to be integrated directly without any dependency
  • maybe the stats could be grouped by buckets, for the histogram
  • a horizontal histogram, as found somewhere in the stdlib (I think), is easy to generate. See below.
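
The first two points can be sketched with the standard library alone. The function names below (average, median, buckets) are illustrative assumptions, as is the median convention (upper middle element for even lengths); inputs are assumed non-empty and width is assumed positive.

```ocaml
(* Average of a non-empty list of measurements. *)
let average (xs : int list) : float =
  float_of_int (List.fold_left (+) 0 xs) /. float_of_int (List.length xs)

(* Median of a non-empty list: the middle element after sorting
   (the upper middle one when the length is even; an assumed convention). *)
let median (xs : int list) : int =
  let sorted = List.sort compare xs in
  List.nth sorted (List.length sorted / 2)

(* Count measurements per bucket of [width] consecutive values, starting
   at the minimum [lo]. Cell [i] covers lo + i*width .. lo + (i+1)*width - 1. *)
let buckets ~(width : int) (xs : int list) : int array =
  let lo = List.fold_left min max_int xs in
  let hi = List.fold_left max min_int xs in
  let counts = Array.make ((hi - lo) / width + 1) 0 in
  List.iter
    (fun x ->
       let i = (x - lo) / width in
       counts.(i) <- counts.(i) + 1)
    xs;
  counts
```

For example, buckets ~width:6 [0; 5; 6; 11; 12] yields [|2; 2; 1|], which can then be fed to a histogram printer.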

Histogram printing

The snippet below displays the bucket statistics of a hashtable (using containers and sequence):

#use "topfind";;
#require "sequence";;
#require "containers";;

let tbl = CCHashtbl.of_seq Sequence.(1 -- 10000 |> zip_i |> zip);;
let hist = Hashtbl.((stats tbl).bucket_histogram);;

let max = Sequence.of_array hist |> Sequence.max |> CCOpt.get_or ~default:0

let () =
  Array.iteri
    (fun i n ->
       let m = n * 20 / max in
       Printf.printf "%-5d: %-22s %10d\n" i (CCString.repeat "*" m) n)
    hist
;;
$ ocaml truc.ml
0    : ****************             2449
1    : ********************         2936
2    : ***********                  1751
3    : *****                         736
4    : *                             255
5    :                                56
6    :                                 9

jmid avatar jmid commented on July 23, 2024

Yes, I would be happy to test it.
Also: the horizontal histogram looks great considering the limited amount of code it takes to generate it!
👍

c-cube avatar c-cube commented on July 23, 2024

Here we go, with an output as follows. Can you take a look?

Collect results for test with_stats:

stats mod4:
         0: ###########                        16
         1: #########################          36
         2: ################                   24
stats num:
       0-6: #########################           9
       1-7: ######################              8
       2-8: #############                       5
       3-9: ################                    6
      4-10: #############                       5
      5-11: ################                    6
      6-12: ################                    6
      7-13: ################                    6
      8-14: #############                       5
      9-15: ################                    6
     10-16: ###########                         4
     11-17: #############                       5
     12-18: ################                    6
     13-19: ################                    6
     14-20: ###########                         4
     15-21: #############                       5
     16-22: ################                    6
     17-23: ################                    6
     18-24: ###################                 7
     19-25: ################                    6

c-cube avatar c-cube commented on July 23, 2024

Wait, there are a few bugs to fix.

c-cube avatar c-cube commented on July 23, 2024

Now looks like this:

stats mod4:
  num: 100, avg: 1.56
         0: #####################              24
         1: #####################              24
         2: #####################              24
         3: #########################          28
stats num:
  num: 100, avg: 62.00
       0-5: ########                            3
      6-11: #############                       5
     12-17: ########                            3
     18-23: ###########                         4
     24-29: #############                       5
     30-35: #############                       5
     36-41: ######################              8
     42-47: ###################                 7
     48-53: ##                                  1
     54-59: ######################              8
     60-65: #############                       5
     66-71: ########                            3
     72-77: ######################              8
     78-83: ################                    6
     84-89: ################                    6
     90-95: ##                                  1
    96-101: ########                            3
   102-107: #############                       5
   108-113: ###########                         4
   114-119: #########################           9
   120-125: ##                                  1
================================================================================

jmid avatar jmid commented on July 23, 2024

Ok. Looking at this now. I would prefer a consistent naming:

  • ~print and set_print
  • ~shrink and set_shrink
  • ~collect and set_collect

These are all very consistent. In this light I expected ~stats and set_stats, but was surprised to find that the latter is called add_stat: add instead of set, and a missing s.

c-cube avatar c-cube commented on July 23, 2024

It's because in this case I allow for several statistics to be collected. I agree that it's not perfect though :/

jmid avatar jmid commented on July 23, 2024

Ah, I see now it is singular. A set_stats binding would be welcome though for consistency.
Perhaps two bindings: set_stat and set_stats?

jmid avatar jmid commented on July 23, 2024

Ah, it has the semantics of cons'ing a stat-entry. Sorry, in that case I agree. set_stats would still make sense though (with the semantics of overwriting with the given list).
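
The difference between the two semantics can be sketched on a toy record. The type arb below is a hypothetical stand-in, not QCheck's arbitrary, and the signature of set_stats is an assumption, since only add_stat exists at this point in the discussion.

```ocaml
(* Toy stand-in for an arbitrary: only the fields needed here. *)
type 'a arb = {
  gen_name : string;
  stats : (string * ('a -> int)) list;
}

(* add_stat conses one statistic onto those already attached. *)
let add_stat (s : string * ('a -> int)) (a : 'a arb) : 'a arb =
  { a with stats = s :: a.stats }

(* set_stats (assumed signature) overwrites the attached statistics. *)
let set_stats (l : (string * ('a -> int)) list) (a : 'a arb) : 'a arb =
  { a with stats = l }

let () =
  let base = { gen_name = "int list"; stats = [] } in
  let a =
    base
    |> add_stat ("len", List.length)
    |> add_stat ("sum", List.fold_left (+) 0)
  in
  (* Both statistics are kept, most recently added first. *)
  assert (List.map fst a.stats = ["sum"; "len"]);
  let b = set_stats [ ("len", List.length) ] a in
  (* Only the given list remains. *)
  assert (List.map fst b.stats = ["len"])
```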

jmid avatar jmid commented on July 23, 2024

I just tried:

# let t = Test.make (add_stat ("len",List.length) (list int)) (fun _ -> true);;
val t : QCheck.Test.t = QCheck.Test.Test <abstr>
# QCheck_runner.run_tests ~verbose:true [t];;
random seed: 245450799
generated; error;  fail; pass / total -     time -- test name
[✓] ( 100)    0 ;    0 ;  100 /  100 --     0.0s -- anon_test_1

+++ Collect ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Collect results for test anon_test_1:

stats len:
  num: 100, avg: 372.07
     0-393: #######################################################          83
   394-787: ####                                                              7
  788-1181: ###                                                               5
  1182-1575:                                                                   0
  1576-1969:                                                                   1
  1970-2363:                                                                   0
  2364-2757:                                                                   0
  2758-3151:                                                                   0
  3152-3545:                                                                   0
  3546-3939:                                                                   0
  3940-4333:                                                                   0
  4334-4727:                                                                   0
  4728-5121:                                                                   1
  5122-5515:                                                                   0
  5516-5909:                                                                   0
  5910-6303: #                                                                 2
  6304-6697:                                                                   0
  6698-7091:                                                                   0
  7092-7485:                                                                   0
  7486-7879:                                                                   1
  7880-8273:                                                                   0
================================================================================
success (ran 1 tests)
- : int = 0

It works very nicely! A few observations:

  • I was first surprised by the many empty lines printed and only subsequently realized that there were actually sparse upper entries
  • The indentation of the 100's is off by one. I just tried with ~count:10000 and got a similar indentation mismatch at the last entry, 10000-10499.
  • min, max, median, stddev, would be warmly welcome :-)

c-cube avatar c-cube commented on July 23, 2024

I'm not used to statistics in day-to-day programming; would you be interested in writing median and stddev?

c-cube avatar c-cube commented on July 23, 2024

About the indentation, it's a problem of limiting width, I believe. I'll increase the limit.
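
One alternative to a fixed width limit, shown here only as a sketch and not as the fix that was actually applied, is to size the label column from the widest label. The function render_rows and the 20-character bar width are illustrative assumptions.

```ocaml
(* Render histogram rows with the label column sized to the widest label,
   so that long ranges such as "1182-1575" stay aligned with short ones. *)
let render_rows (rows : (string * int) list) : string list =
  let label_width =
    List.fold_left (fun w (label, _) -> max w (String.length label)) 0 rows
  in
  (* Start from 1 to avoid dividing by zero when all counts are 0. *)
  let max_count = List.fold_left (fun m (_, n) -> max m n) 1 rows in
  List.map
    (fun (label, n) ->
       let bar = String.make (n * 20 / max_count) '#' in
       (* %*s right-aligns the label in a column of label_width characters. *)
       Printf.sprintf "%*s: %-20s %5d" label_width label bar n)
    rows

let () =
  render_rows [ ("0-393", 83); ("394-787", 7); ("1182-1575", 0) ]
  |> List.iter print_endline
```

Because every row is padded to the same label width, the colons and the counts line up regardless of how many digits the bucket bounds have.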

jmid avatar jmid commented on July 23, 2024

I can have a look at median and stddev at some point, but I have too much on the TODO list right now. I already promised you some feedback for the new function generators, and the statistics came up in preparing the release of our effect-driven program generator. Your contributions and QCheck are highly appreciated! I'm sorry my own contributions are limited to complaints and feature requests at this point :-)

c-cube avatar c-cube commented on July 23, 2024

Should be fixed in the last commit.

c-cube avatar c-cube commented on July 23, 2024

Not sure about stddev though…

c-cube avatar c-cube commented on July 23, 2024

@jmid can you elaborate on your program generator? Sounds interesting! :-)

jmid avatar jmid commented on July 23, 2024

This is great - you are the man! For the standard deviation, I think you are just missing a division by num in line 1273, before the square root. Following the Wikipedia link in the code I found
https://en.wikipedia.org/wiki/Unbiased_estimation_of_standard_deviation and
https://en.wikipedia.org/wiki/Bessel%27s_correction
which suggest a division by num - 1 instead (but I'm not a statistician either).
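
Both variants can be sketched as follows. This is a standalone illustration, not the code at the line mentioned above; the function name stddev and the ?bessel flag are assumptions. A missing division would also be consistent with the output in this thread that reports stddev 11.44 for mod4 values between 0 and 3.

```ocaml
(* Standard deviation of a non-empty list of integer measurements.
   By default the sum of squared deviations is divided by n (population
   form); with ~bessel:true it is divided by n - 1 (Bessel's correction,
   which needs at least two measurements). *)
let stddev ?(bessel = false) (xs : int list) : float =
  let n = float_of_int (List.length xs) in
  let mean = List.fold_left (fun acc x -> acc +. float_of_int x) 0. xs /. n in
  let sum_sq =
    List.fold_left
      (fun acc x ->
         let d = float_of_int x -. mean in
         acc +. d *. d)
      0. xs
  in
  let denom = if bessel then n -. 1. else n in
  (* The division must happen before the square root. *)
  sqrt (sum_sq /. denom)

let () =
  let xs = [2; 4; 4; 4; 5; 5; 7; 9] in
  (* mean = 5, sum of squared deviations = 32 *)
  Printf.printf "population: %.3f, Bessel: %.3f\n"
    (stddev xs) (stddev ~bessel:true xs)
```

For 100 samples the two results differ by roughly half a percent, so the choice mostly matters for small test counts.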

jmid avatar jmid commented on July 23, 2024

Re. program generator: I'll send you an email
