Comments (5)
I am observing similar behavior: when I add a second benchmark, the first one takes longer. The two pieces of code I am benchmarking are completely unrelated, and I have tried this with many combinations, consistently seeing the same result.
The measurement should cover only the function under test and should be independent of the surrounding code. I do not think the generated code of the two benchmarked functions per se differs between the two cases; they are completely unrelated.
It's hard to trust the benchmarking results because of this.
Here is a simplified test code:
import qualified Data.Text as T
import qualified Data.Text.ICU as TI
import Control.DeepSeq (deepseq)
import Criterion.Main

main :: IO ()
main = do
  str <- readFile "data/English.txt"
  let str' = take 1000000 (cycle str)
      txt  = T.pack str'
  str' `deepseq` txt `deepseq` defaultMain
    [ bgroup "text-icu"   [bench "1" (nf (TI.normalize TI.NFD) txt)]
    , bgroup "just-count" [bench "1" (nf (show . length) str')]
    ]
The first benchmark measures text-icu's normalize function. When I run it with only the first benchmark, it reports:
benchmarking text-icu/1
time 2.830 ms (2.777 ms .. 2.913 ms)
When I add the second one it becomes:
cueball:/vol/hosts/cueball/workspace/play/criterion$ ./Benchmark1
benchmarking text-icu/1
time 3.709 ms (3.570 ms .. 3.846 ms)
benchmarking just-count/1
time 2.677 ms (2.516 ms .. 2.872 ms)
A 30% degradation just from adding a line. The difference is even more marked in several other cases.
This problem is forcing me to run criterion with only one benchmark at a time. Also note that the benchmark result is wrong even when only one benchmark is selected from many on the command line; the mere presence of another benchmark is enough, irrespective of the runtime selection.
I am running criterion-1.1.1.0 and ghc-7.10.3.
from criterion.
It seems my problem was due to sharing input data across benchmarks, which caused undue memory pressure for the later benchmarks. The problem was resolved by using env. I rewrote the above code like this:
setup = fmap (take 1000000 . cycle) (readFile "data/English1.txt")

main :: IO ()
main = defaultMain
  [ bgroup "text-icu"
      [ env (fmap T.pack setup) (\txt -> bench "1" (nf (TI.normalize TI.NFD) txt)) ]
  , bgroup "just-count"
      [ env setup (\str -> bench "1" (nf (show . length) str)) ]
  ]
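The reason env helps is that it builds the input inside IO, forces it to normal form, and hands it to the benchmark body, so the data is scoped to that benchmark instead of being retained as a shared value across all of them. A minimal stand-alone sketch of that pattern (withEnv here is a hypothetical stand-in for illustration, not criterion's actual env, whose real type involves Benchmark):

```haskell
import Control.DeepSeq (NFData, force)
import Control.Exception (evaluate)

-- Hypothetical stand-in for criterion's env: construct the input in IO,
-- force it to normal form so no lazy thunks are measured, pass it to the
-- consumer, and let it become garbage afterwards instead of living as a
-- long-lived shared value.
withEnv :: NFData a => IO a -> (a -> IO b) -> IO b
withEnv setup run = do
  x <- setup >>= evaluate . force
  run x

main :: IO ()
main = do
  n <- withEnv (pure (take 1000000 (cycle "abc"))) (pure . length)
  print n
```

Because each withEnv call owns its input, two benchmarks wrapped this way no longer keep each other's data alive.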
One possible enhancement would be to strongly recommend env in the documentation when multiple benchmarks are used, or, better still, to detect when env is not being used and issue a warning at runtime.
Yes, being cognizant of the working set is hard with Haskell's lazy semantics and GHC's optimizations. I don't know what would be detected here for a warning, though -- what would the check for a linter be?
The original issue is a tricky one of compilation units. You can always put benchmarks in separate modules, but that's a pain, and I'm not sure what a good solution to alleviate it would be. TH doesn't seem sufficient.
It kind of seems like you'd want something like fragnix to create minimal compilation units that isolate your benchmarks.
FWIW, the hspec test suite has a tool called hspec-discover that automates discovering modules in a directory that contain tests (modules whose names end with the suffix Spec). If isolating benchmarks into separate modules is the recommended approach to this particular issue, we could consider implementing hspec-discover-style functionality to automate discovery of benchmarks in other modules.
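As a sketch of what such discovery could look like (the bench/ directory and the Bench.hs naming convention are assumptions for illustration, not an existing criterion feature), a generator could scan a directory for benchmark modules and emit the import lines a generated Main would need:

```haskell
-- Hypothetical hspec-discover-style benchmark discovery: list modules
-- ending in "Bench.hs" under a directory and produce the import lines
-- for a generated driver Main.
import Data.List (isSuffixOf, sort)
import System.Directory (listDirectory)

discoverImports :: FilePath -> IO [String]
discoverImports dir = do
  files <- listDirectory dir
  -- Keep files matching the naming convention; strip the extension to
  -- recover the module name.
  let mods = [ takeWhile (/= '.') f
             | f <- sort files
             , "Bench.hs" `isSuffixOf` f ]
  pure [ "import qualified " ++ m | m <- mods ]

main :: IO ()
main = discoverImports "bench" >>= mapM_ putStrLn
```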
See #166 for another example.
Related Issues (20)
- criterion lower bound on aeson seems to be too loose
- Presence of double quotes in benchmark name produces broken HTML report
- 0.1.2.0 build failure on Apple Silicon
- Log scale discoverability
- Criterion/Main/Options.hs:38:48: error: Module ‘Options.Applicative.Help.Pretty’ does not export ‘Ann’ | 38 | import Options.Applicative.Help.Pretty ((.$.), Ann)
- Regressions in Config Parameter are Ignored
- Support aeson-2
- Is there any way to generate a static output?
- Add a variant of runBenchmark which also takes iteration range.
- [FR] support benchmarking max memory usage
- use performMajorGC
- Fails to compile with optparse-applicative-0.17.0.0
- Update tutorial to suggest `v2-`style `cabal` commands
- Avoid only a subset of benchmarks being specialized by SpecConstr.
- please document that (and how) criterion runs GC
- first benchmark is expensive
- please document that `nf` is (contains) an IO action
- Add or migrate to `peakAllocated`
- Website is down
- Examples directory link is wrong on Hackage Readme