In this short post I would like to write about LINQ and uncover something that may not be obvious, or that many of us have probably forgotten: how exactly the LINQ operators work and what types of execution there are. Knowing this can help you write code that runs faster and uses less memory.

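So, in short, LINQ operators execute in one of two ways, immediate or deferred, and deferred operators are either streaming or non-streaming. The four headings below cover each in turn, and you should be careful when choosing one operator over another.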

  1. Immediate execution
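    Just like the name suggests, these operators execute at the exact spot they are called.
    All operators that return a single, non-enumerable result, for example SingleOrDefault or FirstOrDefault, fall under this category. Conversion operators such as ToList and ToArray also execute immediately, even though they return a collection.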

    So, every time you do something like this:
    User user = dataSource.FirstOrDefault(x => x.Age > 18);
    you’re making LINQ execute the query immediately, on the spot.

  2. Deferred execution
    This is where things start to look very interesting. Deferred execution means that the operation is not performed on the spot, but rather when the query is triggered, for example by using a foreach statement.

    So, let’s say you are running an e-commerce site and you have a huge product catalogue of 10 million products.
    Let’s say you want to get all products that cost more than 5 dollars:
    IEnumerable<Product> products = dataSource.Where(x => x.Price > 5);

    The code above shows the magic of deferred execution in action. It does not return the results you asked for, but rather an IEnumerable<Product> that describes the query and has not run yet.

    If you want to get the actual results, you have to trigger the query, for example by iterating over it with foreach, like so:

    foreach (var product in products) { ... }

    This will trigger the actual execution of the query and this is why the Where operator is a deferred execution type of operator – it defers the execution until it’s actually needed in your code. It is as lazy as possible.
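
    To make the deferral visible, here is a minimal sketch that counts how often the Where predicate runs. It assumes a small in-memory list of prices standing in for dataSource; the DeferredDemo class and the PredicateCalls counter are illustrative names, not part of LINQ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Counts how many times the Where predicate has actually run.
    public static int PredicateCalls;

    public static void Main()
    {
        // Hypothetical stand-in for dataSource: a small in-memory list of prices.
        var prices = new List<decimal> { 3m, 7m, 12m };

        // Building the query does not run the predicate.
        IEnumerable<decimal> expensive = prices.Where(p =>
        {
            PredicateCalls++;
            return p > 5m;
        });
        Console.WriteLine($"After Where: {PredicateCalls} predicate calls"); // 0

        // Iterating with foreach is what triggers the query.
        int matches = 0;
        foreach (var price in expensive) matches++;
        Console.WriteLine($"First pass: {matches} matches, {PredicateCalls} predicate calls"); // 2 matches, 3 calls

        // The query runs again on every enumeration, so it sees the newly added item.
        prices.Add(20m);
        matches = 0;
        foreach (var price in expensive) matches++;
        Console.WriteLine($"Second pass: {matches} matches, {PredicateCalls} predicate calls"); // 3 matches, 7 calls
    }
}
```

    Note that the second foreach re-runs the whole query. If you need to iterate the same results several times, materializing them once with ToList is usually the cheaper option.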

    Deferred operators themselves can be classified in two groups: streaming and non-streaming.

  3. Deferred – Streaming
    Streaming operators can yield results as soon as they find one. They don’t have to read the whole data source before returning any results, and this is the key difference between streaming and non-streaming operators.

    IEnumerable<Product> products = dataSource.Where(x => x.Price > 5);

    The code above uses Where, which is a deferred, streaming operator. Another one is Take.

    IEnumerable<Product> products = dataSource.Where(x => x.Price > 5).Take(10);

    Because this query uses only streaming operators, it starts returning results as soon as it finds the first match and stops after it has found 10 matches.

    Neither query has to read the whole data source first. (Of course, that only happens once you trigger them, as they are just queries for now, like we’ve learned above.)
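
    One way to check the streaming behaviour is to count how many elements are actually pulled from the source. The sketch below assumes a hypothetical CountingSource iterator in place of a real data source; StreamingDemo and ElementsRead are illustrative names only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StreamingDemo
{
    // Number of elements pulled from the source so far.
    public static int ElementsRead;

    // Hypothetical source: yields 1..count lazily and counts every element it hands out.
    public static IEnumerable<int> CountingSource(int count)
    {
        for (int i = 1; i <= count; i++)
        {
            ElementsRead++;
            yield return i;
        }
    }

    // Where and Take are both streaming, so reading stops once 5 matches are found.
    public static List<int> FirstFiveEvens()
    {
        ElementsRead = 0;
        return CountingSource(1_000_000)
            .Where(i => i % 2 == 0)
            .Take(5)
            .ToList();
    }

    public static void Main()
    {
        List<int> evens = FirstFiveEvens();
        Console.WriteLine(string.Join(", ", evens));          // 2, 4, 6, 8, 10
        Console.WriteLine($"Elements read: {ElementsRead}");  // Elements read: 10
    }
}
```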

  4. Deferred – Non-streaming
    Non-streaming operators, unlike their streaming counterparts, need to read the whole data source before they start returning even a single result.

    For example,
    IEnumerable<Product> allProducts = dataSource.OrderBy(x => x.Price);

    OrderBy is a deferred, non-streaming operator. This means that whenever you execute this query, LINQ has to read the whole dataSource first, and only then can it start returning results.

    Another example:
    IEnumerable<Product> products = dataSource.OrderBy(x => x.Price).Take(10);

    Unlike the Where example earlier, this query uses the non-streaming operator OrderBy, so it has to read the whole data source before it can take 10 results.

    So, if we have 10 million products, the earlier example (Where followed by Take) reads only as many products as it needs to find 10 matches.
    Here, with the same 10 million products, we have to read all 10 million first and only then return 10 of them.

    This is key to understand, because overlooking it can lead to serious memory and performance problems.
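
    The cost of a non-streaming operator can be measured by counting reads from the source. This is a rough sketch, assuming a hypothetical CountingSource iterator rather than a real product catalogue; NonStreamingDemo and ElementsRead are made-up names for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NonStreamingDemo
{
    // Number of elements pulled from the source so far.
    public static int ElementsRead;

    // Hypothetical source: yields 1..count lazily and counts every element it hands out.
    public static IEnumerable<int> CountingSource(int count)
    {
        for (int i = 1; i <= count; i++)
        {
            ElementsRead++;
            yield return i;
        }
    }

    // OrderByDescending is non-streaming: it must see every element before
    // it can tell which ones are the largest, even though Take only wants 5.
    public static List<int> LargestFiveEvens(int count)
    {
        ElementsRead = 0;
        return CountingSource(count)
            .Where(i => i % 2 == 0)
            .OrderByDescending(i => i)
            .Take(5)
            .ToList();
    }

    public static void Main()
    {
        List<int> largest = LargestFiveEvens(1000);
        Console.WriteLine(string.Join(", ", largest));        // 1000, 998, 996, 994, 992
        Console.WriteLine($"Elements read: {ElementsRead}");  // Elements read: 1000
    }
}
```

    Five results come back, but every one of the 1000 source elements had to be read to produce them.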

    If you want to test this, you could try something like this:
// Reads ten elements, yields 5 results.
Enumerable.Range(1, int.MaxValue).Where(i => i % 2 == 0)
    .Take(5)
    .ToList();

// Reads 2147483647 elements, yields 5 results.
// Careful: OrderByDescending buffers about a billion ints before sorting,
// so this may run very slowly or run out of memory.
Enumerable.Range(1, int.MaxValue).Where(i => i % 2 == 0)
    .OrderByDescending(i => i)
    .Take(5)
    .ToList();

And if you want to see how non-streaming operators can eat up your memory compared to streaming ones, you could try this code.

// Puts half a million elements in memory, sorts, then outputs them.
var sortedNumbers = Enumerable.Range(1, 1000000).Where(i => i % 2 == 0)
    .OrderByDescending(i => i);
foreach (var number in sortedNumbers) Console.WriteLine(number);

// Puts one element in memory at a time.
var streamedNumbers = Enumerable.Range(1, 1000000).Where(i => i % 2 == 0);
foreach (var number in streamedNumbers) Console.WriteLine(number);

After you start executing, just open Visual Studio’s diagnostic tools and watch the memory usage.
In my case, the first one consumed 24 MB of RAM, while the second one used just 12 MB.

If you want to learn more about which operators are which, I suggest you bookmark Microsoft’s docs page so it’s a click away for reference the next time you need LINQ.

I hope this short post helps you choose the right operators for the job the next time you use LINQ.
Happy coding!

Last Update: December 21, 2020