Comments (18)

dadhi commented on August 23, 2024

Benchmark

Here is the source.

The DryIoc setup is shown as an example; the other containers are configured the same way.

public static DryIoc.IContainer PrepareDryIoc()
{
    var container = new Container();

    container.Register<Parameter1>(Reuse.Transient);
    container.Register<Parameter2>(Reuse.Singleton);
    container.Register<ScopedBlah>(Reuse.Scoped);

    return container;
}

public static object Measure(DryIoc.IContainer container)
{
    using (var scope = container.OpenScope())
        return scope.Resolve<ScopedBlah>();
}

Register, then Open Scope and Resolve for the first time

[Benchmark(Baseline = true)]
public object BmarkDryIoc() => Measure(PrepareDryIoc());

Results:

BenchmarkDotNet=v0.11.3, OS=Windows 10.0.17134.345 (1803/April2018Update/Redstone4)
Intel Core i7-8750H CPU 2.20GHz (Coffee Lake), 1 CPU, 12 logical and 6 physical cores
Frequency=2156252 Hz, Resolution=463.7677 ns, Timer=TSC
.NET Core SDK=2.1.500
  [Host]     : .NET Core 2.1.6 (CoreCLR 4.6.27019.06, CoreFX 4.6.27019.05), 64bit RyuJIT
  DefaultJob : .NET Core 2.1.6 (CoreCLR 4.6.27019.06, CoreFX 4.6.27019.05), 64bit RyuJIT


           Method |       Mean |     Error |    StdDev |  Ratio | RatioSD | Gen 0/1k Op | Gen 1/1k Op | Gen 2/1k Op | Allocated Memory/Op |
----------------- |-----------:|----------:|----------:|-------:|--------:|------------:|------------:|------------:|--------------------:|
     BmarkAutofac |  29.831 us | 0.2230 us | 0.2086 us |   7.36 |    0.06 |      5.2185 |           - |           - |            24.15 KB |
      BmarkDryIoc |   4.053 us | 0.0178 us | 0.0167 us |   1.00 |    0.00 |      1.2131 |           - |           - |              5.6 KB |
       BmarkGrace | 507.573 us | 5.6479 us | 5.2830 us | 125.24 |    1.59 |      5.8594 |      2.9297 |           - |            30.21 KB |
 BmarkLightInject | 401.432 us | 3.0346 us | 2.8386 us |  99.05 |    0.82 |      6.8359 |      3.4180 |           - |            32.31 KB |

Open Scope and Resolve for the first time

private static readonly DryIoc.IContainer _dryioc = PrepareDryIoc();

[Benchmark(Baseline = true)]
public object BmarkDryIoc() => Measure(_dryioc);

Results:

           Method |       Mean |     Error |    StdDev | Ratio | RatioSD | Gen 0/1k Op | Gen 1/1k Op | Gen 2/1k Op | Allocated Memory/Op |
----------------- |-----------:|----------:|----------:|------:|--------:|------------:|------------:|------------:|--------------------:|
     BmarkAutofac | 1,577.7 ns | 3.7877 ns | 3.3577 ns | 12.13 |    0.04 |      0.5302 |           - |           - |              2504 B |
      BmarkDryIoc |   130.0 ns | 0.3659 ns | 0.3422 ns |  1.00 |    0.00 |      0.0558 |           - |           - |               264 B |
       BmarkGrace |   152.0 ns | 0.3930 ns | 0.3676 ns |  1.17 |    0.00 |      0.0608 |           - |           - |               288 B |
 BmarkLightInject |   609.2 ns | 2.0553 ns | 1.8220 ns |  4.68 |    0.02 |      0.1488 |           - |           - |               704 B |

from dryioc.

jeremydmiller commented on August 23, 2024

Shame on me, I haven't written any docs for that. Let me write a blog post on that and all the struggles I went through w/ optimizing cold start. And then you know that Lamar uses bits and pieces of ImTools and its own copy of FastExpressionCompiler for internal types. So even if Lamar is competitive, you still get credit;-)

ahydrax commented on August 23, 2024

Hi @dadhi, the thing is that Activator.CreateInstance is slower than expression-based activators. Proof: https://blogs.msdn.microsoft.com/seteplia/2017/02/01/dissecting-the-new-constraint-in-c-a-perfect-example-of-a-leaky-abstraction/

dadhi commented on August 23, 2024

Yep, I know :)

But the idea is different: I am talking about first-time resolution, where compiling the expression and then calling the resulting delegate is much slower than Activator.CreateInstance.
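
To make the trade-off concrete, here is a minimal sketch (not DryIoc's actual code). The `Service` type, the `FirstTimeActivation` class and the `Stopwatch` timing are assumptions made up for illustration. It creates one instance through `Activator.CreateInstance` and one through a freshly compiled expression, timing the first use of each:

```csharp
using System;
using System.Diagnostics;
using System.Linq.Expressions;

// Hypothetical service type, used only for this illustration.
public sealed class Service { }

public static class FirstTimeActivation
{
    // Reflection-based creation: no compilation step before the first call.
    public static object CreateWithActivator() =>
        Activator.CreateInstance(typeof(Service));

    // Expression-based creation: Compile() has to run before the first call,
    // after which the resulting delegate can be invoked repeatedly.
    public static Func<object> CompileFactory() =>
        Expression.Lambda<Func<object>>(
            Expression.Convert(Expression.New(typeof(Service)), typeof(object)))
        .Compile();

    public static void Main()
    {
        var sw = Stopwatch.StartNew();
        var viaActivator = CreateWithActivator();
        sw.Stop();
        Console.WriteLine($"Activator, first call:  {sw.Elapsed.TotalMilliseconds} ms");

        sw.Restart();
        var factory = CompileFactory();
        var viaDelegate = factory();
        sw.Stop();
        Console.WriteLine($"Compile + first invoke: {sw.Elapsed.TotalMilliseconds} ms");

        // Both paths produce an instance of the same type.
        Console.WriteLine(viaActivator.GetType() == viaDelegate.GetType());
    }
}
```

A single `Stopwatch` measurement is noisy and includes JIT time, so treat the printed numbers as a rough indication only; a proper comparison would use BenchmarkDotNet, as in the tables in this thread.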

ahydrax commented on August 23, 2024

Oh, ok, I get the idea :)

dadhi commented on August 23, 2024

Other things to consider:

  1. Two types of cache, for default and for keyed services
  2. Collection resolution
  3. Nested lambdas
  4. Func and Func with arguments
  5. Partly interpreted singletons

dadhi commented on August 23, 2024

Remaining work:

  • Interpret scoped dependency creation.
  • Make Interpreter public, to allow the client to interpret the resolved expression.
  • Switch off the UseInterpretation rule for expression generation (DryIocZero).
  • Directly call full Resolve from expression instead of Invoke.
  • Tighten Resolve loop for inlining.
  • Separate code with lambdas from the hot-path for inlining, e.g. cache Swap methods.
  • Replace Activator.CreateInstance with ctor .Invoke for singletons, or consider reusing TryInterpret for singletons too.
  • Replace .SingleMethod calls with a faster, less-allocating alternative (benchmark the alternatives).
  • Handle Resolve for keyed service.
  • Interpret Expression.Invoke.
  • Optimize Interpreter to avoid recursion and stack growth where possible, similar to how FEC does it.
  • Benchmark with IoC Performance.
  • Benchmark with Autofac, Grace, LightInject and compare with MS.DI for reference. Include both approaches: activation-based and compilation-based (maybe adding one with Roslyn compilation, like Lamar).
  • Benchmark with scoped dependency which is not interpreted until #52.
  • Minimize memory allocations where possible

dadhi commented on August 23, 2024

Here is the state of fast .NET DI containers in a certain (not uncommon) scenario, or in other words: always treat benchmarks in context!

Compared to the benchmark above, I have just added a scoped dependency. DryIoc at the moment does not support interpreting a scoped dependency, because it uses a nested lambda expression.

CreateContainerAndRegister_FirstTimeOpenScopeResolve:

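                            Method |       Mean |     Error |    StdDev |  Ratio | RatioSD | Gen 0/1k Op | Gen 1/1k Op | Gen 2/1k Op | Allocated Memory/Op |
---------------------------------- |-----------:|----------:|----------:|-------:|--------:|------------:|------------:|------------:|--------------------:|
                      BmarkAutofac |  35.672 us | 0.3983 us | 0.3326 us |   7.32 |    0.08 |      6.4697 |           - |           - |            29.83 KB |
                       BmarkDryIoc | 529.302 us | 2.3636 us | 2.0953 us | 108.60 |    0.69 |      1.9531 |      0.9766 |           - |            12.61 KB |
 BmarkMicrosoftDependencyInjection |   4.873 us | 0.0267 us | 0.0250 us |   1.00 |    0.00 |      1.0529 |           - |           - |             4.87 KB |
                        BmarkGrace | 783.044 us | 3.8283 us | 3.5810 us | 160.68 |    1.18 |      8.7891 |      3.9063 |           - |            42.44 KB |
                  BmarkLightInject | 666.277 us | 6.2531 us | 5.8492 us | 136.72 |    1.38 |      8.7891 |      3.9063 |           - |            43.12 KB |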

FirstTimeOpenScopeResolve:

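                            Method |       Mean |     Error |     StdDev | Ratio | RatioSD | Gen 0/1k Op | Gen 1/1k Op | Gen 2/1k Op | Allocated Memory/Op |
---------------------------------- |-----------:|----------:|-----------:|------:|--------:|------------:|------------:|------------:|--------------------:|
                      BmarkAutofac | 1,970.3 ns | 11.519 ns | 10.7747 ns |  7.17 |    0.06 |      0.6676 |           - |           - |              3152 B |
                       BmarkDryIoc |   207.0 ns |  1.062 ns |  0.9931 ns |  0.75 |    0.01 |      0.0966 |           - |           - |               456 B |
 BmarkMicrosoftDependencyInjection |   274.9 ns |  2.064 ns |  1.9308 ns |  1.00 |    0.00 |      0.0758 |           - |           - |               360 B |
                        BmarkGrace |   264.8 ns |  1.653 ns |  1.5462 ns |  0.96 |    0.01 |      0.1216 |           - |           - |               576 B |
                  BmarkLightInject |   998.6 ns |  4.589 ns |  4.0676 ns |  3.64 |    0.02 |      0.2422 |           - |           - |              1144 B |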

PS. MS.DI performs great for what it is designed for 👍 (again, in this specific start-up / first resolution scenario).

ahydrax commented on August 23, 2024

Hi @dadhi,
Could you also add Simple Injector to the comparison chart?

dadhi commented on August 23, 2024

@ahydrax
Maybe later, but I would expect results similar to (slightly slower than) LightInject. It boils down to the approach: similar approaches produce similar results. It should be measured though ;)

dadhi commented on August 23, 2024

Considering that @jeremydmiller has just announced the Lamar 2.0 release, let's check it out too, because it uses yet another approach with Roslyn-based compilation. But I expect it to perform slower in the above use case, exactly because of the approach. Let's see.

dadhi commented on August 23, 2024

@ahydrax

Here are the results with SimpleInjector.

CreateContainerAndRegister_FirstTimeOpenScopeResolve.BmarkSimpleInjector: DefaultJob
Runtime = .NET Core 2.1.6 (CoreCLR 4.6.27019.06, CoreFX 4.6.27019.05), 64bit RyuJIT; GC = Concurrent Workstation
Mean = 1.3705 ms, StdErr = 0.0216 ms (1.58%); N = 91, StdDev = 0.2064 ms
Min = 1.2566 ms, Q1 = 1.2671 ms, Median = 1.2764 ms, Q3 = 1.2922 ms, Max = 1.9380 ms
IQR = 0.0251 ms, LowerFence = 1.2295 ms, UpperFence = 1.3298 ms
ConfidenceInterval = [1.2969 ms; 1.4441 ms] (CI 99.9%), Margin = 0.0736 ms (5.37% of Mean)
Skewness = 1.84, Kurtosis = 4.78, MValue = 2
-------------------- Histogram --------------------
[1.241 ms ; 1.322 ms) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[1.322 ms ; 1.371 ms) |
[1.371 ms ; 1.451 ms) | @@
[1.451 ms ; 1.500 ms) |
[1.500 ms ; 1.580 ms) | @@
[1.580 ms ; 1.678 ms) | @@@@
[1.678 ms ; 1.762 ms) | @
[1.762 ms ; 1.842 ms) | @@
[1.842 ms ; 1.944 ms) | @@@@@@@@
---------------------------------------------------

// * Summary *

BenchmarkDotNet=v0.11.3, OS=Windows 10.0.17134.407 (1803/April2018Update/Redstone4)
Intel Core i7-8750H CPU 2.20GHz (Coffee Lake), 1 CPU, 12 logical and 6 physical cores
Frequency=2156248 Hz, Resolution=463.7685 ns, Timer=TSC
.NET Core SDK=2.1.500
  [Host]     : .NET Core 2.1.6 (CoreCLR 4.6.27019.06, CoreFX 4.6.27019.05), 64bit RyuJIT
  DefaultJob : .NET Core 2.1.6 (CoreCLR 4.6.27019.06, CoreFX 4.6.27019.05), 64bit RyuJIT


                            Method |         Mean |      Error |     StdDev |       Median |  Ratio | RatioSD | Gen 0/1k Op | Gen 1/1k Op | Gen 2/1k Op | Allocated Memory/Op |
---------------------------------- |-------------:|-----------:|-----------:|-------------:|-------:|--------:|------------:|------------:|------------:|--------------------:|
 BmarkMicrosoftDependencyInjection |     5.474 us |  0.6720 us |   1.981 us |     3.949 us |   1.00 |    0.00 |      0.9155 |           - |           - |             4.27 KB |
                      BmarkAutofac |    44.365 us |  3.7849 us |  11.160 us |    47.858 us |   9.03 |    3.60 |      5.2490 |           - |           - |            24.22 KB |
                  BmarkLightInject |   633.471 us | 16.6356 us |  21.039 us |   626.910 us | 119.65 |   40.94 |      7.8125 |      3.9063 |           - |             38.4 KB |
                       BmarkDryIoc |   676.467 us | 79.1501 us | 233.376 us |   505.889 us | 139.14 |   66.56 |      1.9531 |           - |           - |            10.83 KB |
                        BmarkGrace |   800.199 us | 88.9617 us | 162.672 us |   742.909 us | 143.31 |   54.79 |      7.8125 |      3.9063 |           - |            40.25 KB |
               BmarkSimpleInjector | 1,370.494 us | 73.5985 us | 206.378 us | 1,276.409 us | 275.71 |   98.10 |     15.6250 |      7.8125 |           - |            77.75 KB |

// * Warnings *
MultimodalDistribution
  CreateContainerAndRegister_FirstTimeOpenScopeResolve.BmarkMicrosoftDependencyInjection: Default -> It seems that the distribution can have several modes (mValue = 2.9)
  CreateContainerAndRegister_FirstTimeOpenScopeResolve.BmarkAutofac: Default                      -> It seems that the distribution is bimodal (mValue = 3.72)
  CreateContainerAndRegister_FirstTimeOpenScopeResolve.BmarkDryIoc: Default                       -> It seems that the distribution is bimodal (mValue = 3.3)

FirstTimeOpenScopeResolve.BmarkSimpleInjector:

                             Method |       Mean |     Error |    StdDev |     Median | Ratio | RatioSD | Gen 0/1k Op | Gen 1/1k Op | Gen 2/1k Op | Allocated Memory/Op |
----------------------------------- |-----------:|----------:|----------:|-----------:|------:|--------:|------------:|------------:|------------:|--------------------:|
                       BmarkAutofac | 2,346.6 ns | 216.86 ns | 639.42 ns | 2,553.5 ns |  6.82 |    2.12 |      0.5455 |           - |           - |              2576 B |
                        BmarkDryIoc |   216.5 ns |  10.51 ns |  28.60 ns |   203.6 ns |  0.64 |    0.13 |      0.0830 |           - |           - |               392 B |
 BmarkMicrosoftSDependencyInjection |   349.6 ns |  25.51 ns |  70.25 ns |   325.5 ns |  1.00 |    0.00 |      0.0687 |           - |           - |               328 B |
                         BmarkGrace |   323.0 ns |  12.65 ns |  36.11 ns |   306.7 ns |  0.95 |    0.19 |      0.1149 |           - |           - |               544 B |
                   BmarkLightInject | 1,100.0 ns |  18.29 ns |  16.22 ns | 1,098.9 ns |  2.84 |    0.47 |      0.2346 |           - |           - |              1112 B |
                BmarkSimpleInjector |   582.9 ns |  46.52 ns | 134.96 ns |   546.2 ns |  1.74 |    0.52 |      0.1101 |           - |           - |               520 B |

jeremydmiller commented on August 23, 2024

@dadhi "But I expect it to perform slower in the above use-case exactly because of the approach." A big yes and maybe no. If you're using it simply, yeah, the cold start time isn't super awesome on the first usage of Roslyn, but they (Roslyn team) have made huge strides on that one.

Lamar also has a model where it can drop the generated C# code into your code once, and just use the already compiled resolver strategies for much, much faster cold start times.

dadhi commented on August 23, 2024

@jeremydmiller,

"Lamar also has a model where it can drop the generated C# code into your code once, and just use the already compiled resolver strategies for much, much faster cold start times."

This is super interesting. How can I test that?

dadhi commented on August 23, 2024

Hey, here is the benchmark with the first version of DryIoc that interprets scoped dependencies:

CreateContainerAndRegister_FirstTimeOpenScopeResolve:

                            Method |       Mean |      Error |      StdDev |     Median |          P95 |  Ratio | RatioSD | Gen 0/1k Op | Gen 1/1k Op | Gen 2/1k Op | Allocated Memory/Op |
---------------------------------- |-----------:|-----------:|------------:|-----------:|-------------:|-------:|--------:|------------:|------------:|------------:|--------------------:|
 BmarkMicrosoftDependencyInjection |   4.797 us |  0.1125 us |   0.2904 us |   4.724 us |     5.754 us |   1.00 |    0.00 |      1.0529 |           - |           - |             4.87 KB |
                       BmarkDryIoc |   5.547 us |  0.0401 us |   0.0313 us |   5.535 us |     5.594 us |   1.17 |    0.01 |      1.6251 |           - |           - |             7.49 KB |
                      BmarkAutofac |  36.195 us |  0.1435 us |   0.1198 us |  36.232 us |    36.332 us |   7.64 |    0.03 |      6.4697 |           - |           - |            29.83 KB |
                        BmarkGrace | 776.478 us |  5.3626 us |   4.4780 us | 774.993 us |   783.422 us | 163.89 |    1.13 |      8.7891 |      3.9063 |           - |            42.44 KB |
                  BmarkLightInject | 799.472 us | 79.1998 us | 231.0294 us | 658.761 us | 1,299.696 us | 169.62 |   51.94 |      8.7891 |      3.9063 |           - |            43.12 KB |

FirstTimeOpenScopeResolve:

                             Method |        Mean |      Error |       StdDev |      Median | Ratio | RatioSD | Gen 0/1k Op | Gen 1/1k Op | Gen 2/1k Op | Allocated Memory/Op |
----------------------------------- |------------:|-----------:|-------------:|------------:|------:|--------:|------------:|------------:|------------:|--------------------:|
                        BmarkDryIoc |    244.9 ns |  24.425 ns |    71.634 ns |    198.8 ns |  0.81 |    0.24 |      0.0896 |           - |           - |               424 B |
                         BmarkGrace |    269.0 ns |   5.140 ns |     4.556 ns |    267.4 ns |  0.95 |    0.02 |      0.1216 |           - |           - |               576 B |
 BmarkMicrosoftSDependencyInjection |    282.9 ns |   5.552 ns |     4.636 ns |    280.9 ns |  1.00 |    0.00 |      0.0758 |           - |           - |               360 B |
                   BmarkLightInject |  1,072.9 ns | 111.797 ns |    99.105 ns |  1,046.4 ns |  3.80 |    0.38 |      0.2422 |           - |           - |              1144 B |
                       BmarkAutofac |  2,203.3 ns | 599.133 ns |   560.429 ns |  1,991.7 ns |  7.88 |    1.96 |      0.6676 |           - |           - |              3152 B |
                         BmarkLamar | 20,724.7 ns | 727.986 ns | 2,100.407 ns | 20,637.7 ns | 76.88 |    5.79 |           - |           - |           - |              1512 B |

Btw, @jeremydmiller, here is my head-to-head benchmark with Lamar v2. Adding it here just for reference:

                            Method |           Mean |         Error |        StdDev |         Median |     Ratio |  RatioSD | Gen 0/1k Op | Gen 1/1k Op | Gen 2/1k Op | Allocated Memory/Op |
---------------------------------- |---------------:|--------------:|--------------:|---------------:|----------:|---------:|------------:|------------:|------------:|--------------------:|
 BmarkMicrosoftDependencyInjection |       4.676 us |     0.0201 us |     0.0178 us |       4.679 us |      1.00 |     0.00 |      1.0529 |           - |           - |             4.87 KB |
                       BmarkDryIoc |       5.519 us |     0.1487 us |     0.1527 us |       5.480 us |      1.18 |     0.04 |      1.6251 |           - |           - |             7.49 KB |
                      BmarkAutofac |      35.191 us |     0.1632 us |     0.1447 us |      35.197 us |      7.53 |     0.04 |      6.4697 |           - |           - |            29.83 KB |
                  BmarkLightInject |     751.269 us |    52.9342 us |   152.7272 us |     666.809 us |    161.81 |    35.92 |      8.7891 |      3.9063 |           - |            43.12 KB |
                        BmarkGrace |   1,054.674 us |   117.4721 us |   346.3691 us |     826.261 us |    223.94 |    79.97 |      8.7891 |      3.9063 |           - |            42.44 KB |
                        BmarkLamar | 110,034.715 us | 9,271.5789 us | 8,672.6407 us | 106,728.911 us | 23,553.16 | 1,957.17 |   2000.0000 |   1000.0000 |           - |         10695.63 KB |

dadhi commented on August 23, 2024

For completeness, here is the OpenScope-Resolve-Dispose load (your usual Unit-of-Work or Request) after the warm-up (5 iterations of the cycle):

                             Method |       Mean |      Error |     StdDev | Ratio | RatioSD | Gen 0/1k Op | Gen 1/1k Op | Gen 2/1k Op | Allocated Memory/Op |
----------------------------------- |-----------:|-----------:|-----------:|------:|--------:|------------:|------------:|------------:|--------------------:|
                        BmarkDryIoc |   199.2 ns |  1.5780 ns |  1.4761 ns |  0.74 |    0.01 |      0.0896 |           - |           - |               424 B |
                         BmarkGrace |   258.3 ns |  1.4635 ns |  1.2973 ns |  0.96 |    0.01 |      0.1216 |           - |           - |               576 B |
 BmarkMicrosoftSDependencyInjection |   269.7 ns |  0.7576 ns |  0.6326 ns |  1.00 |    0.00 |      0.0758 |           - |           - |               360 B |
                   BmarkLightInject |   976.5 ns |  3.7448 ns |  3.3197 ns |  3.62 |    0.02 |      0.2422 |           - |           - |              1144 B |
                       BmarkAutofac | 2,185.3 ns | 15.1710 ns | 13.4487 ns |  8.10 |    0.06 |      0.6676 |           - |           - |              3152 B |
                         BmarkLamar | 2,360.9 ns | 11.5168 ns | 10.2094 ns |  8.75 |    0.05 |      0.3166 |           - |           - |              1512 B |

Again, we should benchmark a more real-world object graph (maybe 10 levels deep and 5 to 10 dependencies wide on each level, with a variety of lifestyles). The current setup is a 2-level root with 3 dependencies. Take that into account when looking at the benchmarks.

jeremydmiller commented on August 23, 2024

Psst, here's a cheap way to do more realistic benchmarking: https://github.com/JasperFx/lamar/tree/master/src/Benchmarks. Use an ASP.NET Core app to quickly get yourself a much better set of registrations.

dadhi commented on August 23, 2024

Considering this complete.
