Code Monkey home page Code Monkey logo

linqaf's Introduction

LinqAF

A low allocation re-implementation of LINQ-to-Objects using some dubious techniques.

Compatibility

LinqAF aims to be "type inference compatible" with LINQ-to-Objects, a very weak form of source compatible I made up. Essentially, if your LINQ code has no type names in it then LinqAF will probably just work - this is common in LINQ code due to extensive use of anonymous delegates and var.

As an illustration, the following code will works seamlessly with LINQ-to-Objects and LinqAF.

var range  = Enumerable.Range(0, 100);
var expanded = range.SelectMany(x => new [] { x, x * 2 });
var reversed = expanded.Reverse();
var asString = reversed.Select(y => y.ToString());

foreach(var str in asString)
{
  Console.WriteLine(str);
}

and could be written more concisely as

var e = 
  Enumerable.Range(0, 100)
  .SelectMany(x => new [] { x, x * 2 })
  .Reverse()
  .Select(y => y.ToString());
 
foreach(var str in e)
{
  Console.WriteLine(str);
}

Several convenience types and extension methods are provided so LinqAF's IEnumerable<T>-ish types can be used in common ways without requiring .AsEnumerable()-calls all over the place.

Specifically:

  • LinqAFString provides implementations of System.String's Concat and Join methods that take all LinqAF enumerable types, and pass throughs for all other methods
  • LinqAFTask does likewise for System.Thread.Task's WaitAll, WaitAny, WhenAll, and WhenAny methods, and also provides pass through methods
  • LinqAFNew provides methods that are equivalent to enumerable consuming constructors for the builtin HashSet<T>, LinkedList<T>, List<T>, Queue<T>, SortedSet<T>, and Stack<T> types.
  • Extension methods for the following methods on HashSet<T>, ISet<T>, and SortedSet<T> that take all LinqAF enumerables are provided:
    • ExceptWith
    • IntersectWith
    • IsProperSubset
    • IsProperSuperset
    • IsSubsetOf
    • IsSupersetOf
    • Overlaps
    • SetEquals
    • SymmetricExceptWith
    • UnionWith
  • Likewise extension methods are provided for AddRange, and InsertRange on List<T>

Dealing with compatibility exceptions

There are cases where LinqAF is not a direct replacement for LINQ-to-Objects. Speaking broadly, these are:

  1. Where the IEnumerable<T> type appears
  2. When different kinds of enumerables need to be assigned or returned

Dealing with IEnumerable

There are a few fixes for the first case: use AsEnumerable() to create an IEnumerable<T> wrapper (this allocates), change the type to var, or change the type to the appropriate LinqAF enumerable.

This LINQ-to-Objects code, for example

IEnumerable<string> e = Enumerable.Repeat("foo", 3);
IEnumerable<char> f = e.Select((string x, int ix) => x[ix]);

could be rewritten as

var e = Enumerable.Repeat("foo", 3);
var f = e.Select((x, ix) => x[ix]);

or

RepeatEnumerable<string> e = Enumerable.Repeat("foo", 3);
SelectIndexedEnumerable<string, char, RepeatEnumerable<string>, RepeatEnumerator<string>> f = e.Select((x, ix) => x[ix]);

or (if allocations are acceptable)

IEnumerable<string> e = Enumerable.Repeat("foo", 3).AsEnumerable();
IEnumerable<char> f = e.Select((string x, int ix) => x[ix]).AsEnumerable();

obviously, the first option is the preferable one.

Coalescing different kinds of enumerables

All LINQ-to-Objects operators return IEnumerable<T> or a subtype of it, which means that you can assign the result of almost all operators to the same variables, parameters, or returns (modulo the generic type parameter T). In LinqAF every operator returns a different type, which breaks code like the following:

// this works with LINQ-to-Objects, but not LinqAF
int a = ...;
var e = a > 0 ? Enumerable.Repeat(1, 1) : Enumerable.Empty<int>();

There are two ways to fix this, either introduce calls to Box() or AsEnumerable(). AsEnumerable() returns an IEnumerable (and is thus ideal for interoperating with non-LinqAF code), while Box() returns a BoxedEnumerable<T> (there is also an explicit cast from all LinqAF enumerables to BoxedEnumerable<T>, but using .Box() lets you avoid typing out type names).

So the previous code would become either

// this works with both LINQ-to-Objects and LinqAF
int a = ...;
var e = a > 0 ? Enumerable.Repeat(1, 1).AsEnumerable() : Enumerable.Empty<int>().AsEnumerable();

or

// this works with LinqAF, but not LINQ-to-Objects
int a = ...;
var e = a > 0 ? Enumerable.Repeat(1, 1).Box() : Enumerable.Empty<int>().Box();

Optimizations

LinqAF's primary purpose (besides scratching an intellectual itch) is to minimize the number of allocations LINQ-like code involves. Accordingly almost all operations are zero-allocation, with the exceptions of the ToXXX, Distinct, Except, GroupJoin, Intersect, Join, OrderBy(Descending), ThenBy(Descending), Reverse, TakeLast, SkipLast, and Union operations (and even then, certain cases will be zero allocation).

LinqAF achieves this, in part, by representing all operations as structs and exploiting C#'s extensive duck-typing around LINQ and foreach. This blatently disregards the official guidance on struct use.

Note that LinqAF does not change any of C#'s LINQ -> method rewrite rules, so allocations introduced as part of those will still occur. It is for this reason that I encourage the direct use of methods rather than LINQ query keywords (in all cases, not just with LinqAF) as that syntax makes allocations from intermediate objects and delegates explicit.

LinqAF leverages the much more extensive type information it has available at compile time (compared to LINQ-to-Objects) to optimize numerous cases. For example, Enumerable.Range(0, 100).Reverse() is optimized into a ReverseRangeEnumerable that does no allocations.

LinqAF also has a few alternative collection implementations that aim to allocate less than what LINQ-to-Objects would.

LinqAF also enables custom allocation rules (enabling collection reuse, or allocation monitoring) via registering an IAllocator implementation with LinqAF.Config.Allocator.

More details can be found in an upcoming blog series.

Using LinqAF

LinqAF is available on Nuget.

Be aware that LinqAF is full of dubious ideas: large structs, mutable structs, lots of generic constraints, lots of generic parameters, and truly gargantuan amounts of generated code.

LinqAF can be useful, but it's not a painless or trivial dependency. Be very cautious in using it.

Also the LinqAF.dll is ~26MB, so that's a thing.

Building, testing, and benchmarking LinqAF

LinqAF is composed of several projects:

  • LinqAF - the template for generating the final libraries
    • Should always compile, but the produced DLL is non-functional
    • All enumerables are in the top level
    • Config/ contains non-LINQ-to-Objects replacing types
    • Templates/ contains the templates for each set of operators, these are used to generate all the type-specific implementations of each operator
    • Impl/ contains the actual code used to implement templates, as well as the interfaces for each operator (for static type checking in builds)
    • Overrides/ contains concrete, per-enumerable methods that replace the ones generated from templates.
  • LinqAF.Generator - a command line app that generates the final code
    • Must be run in the root directory of the solution
    • Looks for the LinqAF project to use as a template
    • Updates the LinqAF.Generated project as part of output
  • LinqAF.Tests - library containing tests for LinqAF, links against the LinqAF.Generated project
    • Uses the Microsoft.VisualStudio.TestTools.UnitTesting framework
    • Individual tests can run in Visual Studio
    • Attempting to run several tests will likely result in an OutOfMemoryException in Visual Studio's test runner
    • Any test that, by itself, results in an OutOfMemoryException should be split into multiple tests
  • LinqAF.TestRunner - command line app that runs all tests in LinqAF.Tests, links against the LinqAF.Tests project
    • Exists to work around the OutOfMemoryException issues noted above
    • Also produces code coverage statistics
  • LinqAF.Generated* - final library projects, updated by LinqAF.Generator
    • Only the LinqAF.Generated project contains files, all other projects reference those files instead
    • Each generated project is for a different target platform
    • All produce a LinqAF.dll
  • LinqAF.Benchmark - a command line app that runs a series of benchmarks
    • Uses the BenchmarkDotNet libary
    • Benchmarks are defined in Benchmarks/, but must be added to the Main() as well
    • All benchmarks needs a LINQ2Objects and a LinqAF method
    • the LINQ2Objects method should have Baseline = true as part of it's [Benchmark]
    • If a directory is provided as a command line parameter, results are logged to that directory

Acknowledgements

The initial motivation for LinqAF came from playing around with the Rust Language and its Iterators.

A great deal of my understanding of LINQ-to-Objects (and 600+ test cases) came from Jon Skeet's EduLinq blog series.

Thanks to Benjamin Hodgson for Box(), as he pointed out that explicit casting couldn't handle anonymous types and would be quite painful for long type names regardless.

linqaf's People

Contributors

kevin-montrose avatar mderriey avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

linqaf's Issues

possible to make DLL size smaller?

Is there way to strip off some unused methods to make dll smaller like 1mb? With unity3d engine it just freezes when trying to load dll, this engine uses mono.

Select and Count

I was reading an article about .NET Core, where early versions had a bug/feature where if you had something like collection.Select(foo).Count(); it wouldn't actually perform the Select operation, since it doesn't matter, you only need the count. But, since this would prevent side effects from happening, it was a breaking change and they had to fix it.

But, this seems like an optimization that LinqAF could keep. Maybe you already do, but if not just wanted to get the idea out there.

how to generate code?

When i try to build solution it complains i missing generated files, how to generate those files? I cant find instructions in readme sorry if i dont see it :)

Collaboration or ideas sharing...

@kevin-montrose

Hey Kevin, I took a crack at providing a more optimized System.Linq here.

It was kind of in the opposite direction to you, I was not minimizing garbage, but rather changing the implementation of enumerable consumers to be push fed rather than pull, which in the implementation meant that I was creating more garbage, not less, but it is faster in pretty much all but the plain vanilla IEnumerable being consumed directly.

Anyway, understandably for such a large change I'm struggling to find traction within the Microsoft camp to accept this as a PR, so I'm thinking of spinning it off into it's own repo and nuget package (it's somewhere in the order of 75%-90% done I guess - main feature still to migrate is OrderBy, but need probably better implementations of some other things that I have done).

Wondering if you would be interested in collaboration/review/whatever.

Things needed for a PoC port of the Stack Overflow codebase to LinqAF

Just for kicks, here's basically what would be needed to make this not hellaciously awful.

  • AddRange on IList extension method
  • ILookup<T,V> defined
  • IGrouping<T,V> defined
  • IOrderedEnumerable defined
  • AsEnumerable() on the various sorted enumerables should return an IOrderedEnumerable

Obviously this code can't be released, but it'll give a baseline of before & after compile times from conversion.

Spoiler alert: I bet they're reaaaaaaal bad.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.