LINQ

2019-06-28

When I first discovered LINQ, it completely changed the way I program. I'd already been coding for a very long time (mostly using LAMP - Linux, Apache, MySQL, and PHP) so I was familiar with SQL queries, but the idea of being able to query data directly in C# was entirely new to me.

After a while, it became the new way of doing things for me - so much so that I ended up finding a plugin so that my dev team could use LINQ in PHP as well.

That said, there are, as always, arguments for and against LINQ - as with any other tool, it's important to know when to use it, and when not to use it.

What is LINQ?

LINQ stands for 'Language-Integrated Query', and in short, what it allows you to do is to query data within a programming language (usually C#) in much the same way as you would query data within a database (such as SQL Server or MySQL).

Additionally, if you are using a database and a framework which supports it, such as the Entity Framework, you can query your database (or other data source) using the same approach - the framework will generate any SQL queries necessary and populate your datasets without requiring you to write any SQL code.

In C#, there are two primary syntaxes you can use to construct LINQ queries - Query Syntax and Method Syntax. Query Syntax is similar to writing a SQL query, whereas Method Syntax uses a more C#-like approach. However, both approaches ultimately work in much the same way, and to my knowledge it usually just comes down to personal preference. I for one prefer the Method Syntax approach - it looks clean to me, and to my mind, produces very readable code.
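To make the comparison concrete, here is a small sketch of the same query written in both syntaxes. The class name `SyntaxComparison` and the sample numbers are made up for illustration, and the code is plain C# with no Unity dependency.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class SyntaxComparison
{
    // Hypothetical sample data, used only for this illustration
    static readonly int[] numbers = { 5, 12, 7, 30, 1 };

    // Query Syntax: reads much like a SQL statement
    public static List<int> WithQuerySyntax()
    {
        var query = from n in numbers
                    where n > 4
                    orderby n
                    select n;
        return query.ToList();
    }

    // Method Syntax: the same query as a chain of method calls
    public static List<int> WithMethodSyntax()
    {
        return numbers.Where(n => n > 4)
                      .OrderBy(n => n)
                      .ToList();
    }
}
```

Both methods return 5, 7, 12, 30.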

For this post, I'm going to focus on what I use in Unity, which is LINQ on data sets (usually not databases), using the Method Syntax approach.

Getting started with LINQ

Here's a simple example of some code, with, and without using LINQ to obtain the same results (there's no real reason for this code to be run in a Unity MonoBehaviour, but that's what I'm testing with for now):

using System;
using System.Collections.Generic;
using System.Linq;
using UnityEngine;

public class LINQTest : MonoBehaviour
{
    // a random-ish array of integers
    int[] testData = new int[] { 144, 252, 3, 82, 122, 991, 8882, 31231, 2299, 4, 55, 21, 32, 64 };

    void Start()
    {
        TestStandard();
        TestLINQ();
    }

    void TestStandard()
    {
        List<int> results = new List<int>();

        // add all values that are less than or equal to 1000
        foreach (var t in testData)
        {
            if (t <= 1000) results.Add(t);
        }

        // Sort the results in ascending order
        results.Sort((a, b) => a.CompareTo(b));

        // Build a string containing the first 5 entries, separated by ", "
        string resultString = "";
        int length = (Math.Min(5, results.Count));
        for (int x = 0; x < length; x++)
        {
            resultString += results[x];
            if (x < (length - 1))
            {
                resultString += ", ";
            }
        }

        // Display the results
        Debug.LogFormat("Results : {0}", resultString);
    }

    void TestLINQ()
    {
        var resultsLINQ = testData.Where(t => t <= 1000)    // only include values less than or equal to 1000
                                  .OrderBy(t => t)          // sort the values in ascending order
                                  .Take(5);                 // only include the first 5 values

        // Build a string containing the results using LINQ to convert the integers to strings
        Debug.LogFormat("Results : {0}", string.Join(", ", resultsLINQ.Select(t => t.ToString()).ToArray()));
    }
}

Executing both of these methods outputs the same result to the console, specifically:

Results : 3, 4, 21, 32, 55

Now, to me, the advantage here of LINQ is pretty clear - it's easier to read, and truth be told, much faster to write and understand. To make this example, I actually wrote the LINQ code first, which took about a minute, and then wrote the non-LINQ code second which took somewhere around ten minutes (admittedly, I am now much more familiar with LINQ code than not, and as such it sometimes takes me longer to think about how to do something without it than it may have in the past).

And that is just a very simple use-case - LINQ is extraordinarily powerful for writing clean, clear code quickly and easily.

Due to the way that LINQ is constructed (through the use of Extension Methods on IEnumerable<T>), you can use LINQ on just about any collection in C#, whether it is an array, a List, a Dictionary, or something else entirely.
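As a rough illustration, the sketch below runs the same kind of query against an array, a List, and a Dictionary. The class name `CollectionExamples` and all of the sample data are invented for this example.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class CollectionExamples
{
    // Array: sum of the even values
    public static int SumOfEvens()
    {
        int[] values = { 1, 2, 3, 4, 5, 6 };
        return values.Where(v => v % 2 == 0).Sum();   // 2 + 4 + 6
    }

    // List: names longer than three characters, in upper case
    public static List<string> LongNames()
    {
        var names = new List<string> { "Ann", "Brian", "Cleo", "Di" };
        return names.Where(n => n.Length > 3)
                    .Select(n => n.ToUpperInvariant())
                    .ToList();
    }

    // Dictionary: each element is a KeyValuePair<string, int>
    public static List<string> HighScorers()
    {
        var scores = new Dictionary<string, int>
        {
            { "alice", 90 },
            { "bob", 40 },
            { "carol", 75 }
        };
        return scores.Where(kv => kv.Value >= 50)
                     .OrderByDescending(kv => kv.Value)
                     .Select(kv => kv.Key)
                     .ToList();
    }
}
```

Note that a Dictionary enumerates as key/value pairs, so the lambdas receive a `KeyValuePair` rather than just the value.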

LINQ Results

One minor thing to remember is that most LINQ methods return a generic sequence of type IEnumerable<T>. This type represents a sequence of data that can be enumerated (e.g. with a foreach loop) to read the results. IEnumerable<T> is an interface, not a class: the underlying 'collection' can be any number of different types, all of which implement the members the interface defines, so they can be used in much the same way.

In some cases, however, you will need to convert the results into a more appropriate type for your code. The primary reason for this, in my experience, is needing to iterate through the results more than once. A LINQ query is usually evaluated lazily, so every pass through it runs the query again against the source, and some IEnumerable<T> implementations cannot safely be enumerated a second time at all.
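One way to see why repeated iteration can be surprising: a query built with Where is not run when it is defined, but each time it is enumerated. The sketch below is only an illustration (the class name `DeferredExecutionDemo` is made up); it changes the source list after the query has been built and compares the live query with a stored copy.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    public static int[] Run()
    {
        var source = new List<int> { 1, 2, 3 };

        // Building the query does not run it yet
        IEnumerable<int> query = source.Where(n => n > 1);

        // ToList() runs the query now and stores a snapshot of the results
        List<int> snapshot = query.ToList();

        // Change the source after the query was defined
        source.Add(4);

        // Count() enumerates the query again, so it sees the new element;
        // the snapshot still holds what was there when ToList() ran
        return new[] { query.Count(), snapshot.Count };
    }
}
```

Here `Run()` returns 3 and 2: the live query picks up the added element, while the snapshot does not.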

Fortunately, converting the collection to the type you need is very easy - LINQ provides a few conversion methods, specifically:

ToArray()
ToList()
ToDictionary()

As you might expect, each of these converts the query results into the desired type (although, in the case of ToDictionary(), you will need to provide instructions as to how the keys and values should be defined).

One other minor advantage of these conversion methods is that they can be called on any collection that implements IEnumerable<T>, even one that did not come from a LINQ query, so converting between collection types is very easy.
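A short sketch of the three conversions follows. The class name `ConversionExamples` and the word list are assumptions for the sake of the example.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ConversionExamples
{
    // Hypothetical sample data
    static readonly string[] words = { "apple", "kiwi", "banana" };

    public static string[] ShortWordsAsArray()
    {
        // Query results converted to an array
        return words.Where(w => w.Length <= 5).ToArray();
    }

    public static List<string> AllWordsAsList()
    {
        // Called directly on the array; no query involved
        return words.ToList();
    }

    public static Dictionary<string, int> LengthsByWord()
    {
        // The first lambda picks the key, the second picks the value
        return words.ToDictionary(w => w, w => w.Length);
    }
}
```

Keep in mind that ToDictionary() throws an ArgumentException if two elements produce the same key, so the key lambda needs to yield unique values.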

Lambda Expressions

In the LINQ example above, you may have noticed that each of the LINQ methods takes a (sometimes optional) argument which instructs LINQ on how to perform the query, e.g.

var resultsLINQ = testData.Where(t => t <= 1000)    // only include values less than or equal to 1000

This argument is called a lambda expression, and it is specifically this part:

t => t <= 1000

A lambda expression is basically a method which returns a result that LINQ can use, in this case, to compare the values and exclude any that don't match the condition.

This expression could also be written as follows:

var resultsLINQ = testData.Where(t =>
{
    return t <= 1000;
});

and will still function in exactly the same way. The Where method expects a method (or lambda expression) which returns true if the value should be included in the results, and false if not.

This method could also be defined elsewhere - it doesn't specifically have to be inline - so you can avoid lambda expressions entirely if you wish, e.g.

    void TestLINQ()
    {
        var resultsLINQ = testData.Where(LessThanOrEqualTo1000)
                                  .OrderBy(t => t)          // sort the values in ascending order
                                  .Take(5);                 // only include the first 5 values

        // Build a string containing the results using LINQ to convert the integers to strings
        Debug.LogFormat("Results : {0}", string.Join(", ", resultsLINQ.Select(t => t.ToString()).ToArray()));
    }

    private bool LessThanOrEqualTo1000(int v)
    {
        return v <= 1000;
    }

That said, I usually prefer to stick to inline lambda expressions - once you become used to them, they are very easy to read, and keeping the code all in one place can help with readability.

The structure of a lambda expression is as follows:

(a) => b

Here, a is the parameter list, which can consist of zero or more parameters (the parentheses are optional if there is exactly one parameter, as in the examples above), and b is the body of the method. The body can be a single expression with no return keyword, whose value becomes the return value (or is discarded if the lambda doesn't return anything), or a block in braces with explicit return statements.
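The different shapes can be sketched as follows, using the built-in `Func` and `Action` delegate types to hold each lambda. The class name `LambdaForms` and the field names are purely illustrative.

```csharp
using System;

public static class LambdaForms
{
    // No parameters: empty parentheses are required
    public static readonly Func<int> Answer = () => 42;

    // One parameter: parentheses are optional
    public static readonly Func<int, int> Square = x => x * x;

    // Several parameters: parentheses are required
    public static readonly Func<int, int, int> Add = (a, b) => a + b;

    // Block body: braces and explicit return statements
    public static readonly Func<int, string> Describe = n =>
    {
        if (n < 0) return "negative";
        return n == 0 ? "zero" : "positive";
    };

    // Nothing returned: Action instead of Func
    public static readonly Action<string> Print = message => Console.WriteLine(message);
}
```

Each field can then be called like an ordinary method, e.g. `LambdaForms.Square(7)` or `LambdaForms.Add(2, 3)`.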

One thing to remember is that while LINQ can use lambda expressions liberally, lambda expressions are not limited to LINQ either.
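The non-LINQ example earlier in this post already did this when it passed a lambda to Sort(). As a further sketch, the code below uses only methods on List itself, with no `System.Linq` import at all; the class name `LambdasWithoutLinq` and the values are made up.

```csharp
using System.Collections.Generic;

public static class LambdasWithoutLinq
{
    public static List<int> Run()
    {
        var values = new List<int> { 8, 3, 15, 4, 23 };

        // List<T>.RemoveAll takes a predicate, much like Where does,
        // but it modifies the list in place
        values.RemoveAll(v => v > 20);

        // List<T>.Sort accepts a comparison lambda (descending order here)
        values.Sort((a, b) => b.CompareTo(a));

        return values;   // 15, 8, 4, 3
    }
}
```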

LINQ Performance

So, at first glance (to me, at any rate), LINQ is amazing and results in clean, compact code that is easy to read and maintain. So why shouldn't I use it everywhere?

Well, unfortunately, in Unity (which usually uses Mono to compile and execute C#), LINQ isn't as performant as it could be. This is mostly true in older versions of Unity, as more recent versions have improved this significantly, closing the gap in performance between LINQ and non-LINQ code.

However, it is still important to keep this in mind when writing Unity code - in many cases, your code will execute faster if it doesn't use LINQ. We're talking very minor differences in performance here, usually nanoseconds, and occasionally milliseconds. A few nanoseconds in code that is executed infrequently, such as code triggered by events (e.g. responding to user input), will usually make no noticeable difference. In code that is executed very frequently, on the other hand, such as an Update() method, those nanoseconds start to add up and lower your application's overall framerate.
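One possible way to keep LINQ while staying out of the per-frame path is to run the query only when the underlying data changes and reuse the stored result in between. The sketch below is a plain C# stand-in, not Unity code: `EnemyTracker`, `Tick()` (playing the role of Update()), and the health values are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Plain C# stand-in for a MonoBehaviour; Tick() plays the role of Update()
public class EnemyTracker
{
    readonly List<int> enemyHealth = new List<int> { 100, 0, 35, 0, 80 };
    List<int> aliveCache;   // result of the last query, reused between frames
    bool dirty = true;      // set whenever the source data changes

    // Counts how often the query actually ran (for demonstration only)
    public int QueryRuns { get; private set; }

    public void Damage(int index, int amount)
    {
        enemyHealth[index] = Math.Max(0, enemyHealth[index] - amount);
        dirty = true;       // the cached result is now stale
    }

    // Called every frame: only re-runs the LINQ query when something changed
    public int Tick()
    {
        if (dirty)
        {
            aliveCache = enemyHealth.Where(h => h > 0).ToList();
            QueryRuns++;
            dirty = false;
        }
        return aliveCache.Count;
    }
}
```

Calling `Tick()` many times in a row runs the query once; it runs again only after `Damage()` marks the cache as stale. Whether this is worth the extra state depends on how often the data changes compared to how often it is read.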

As with general performance, it's also important to keep your target platform(s) in mind. If you are specifically targeting mobile devices, it's usually best to stick to non-LINQ code for the most part - part of this is because LINQ performs differently on different devices. In the past, for example, it performed particularly badly on iOS devices (although this no longer seems to be the case).

Part of the reason for the performance difference is that LINQ, by its nature, is very generic. The plus side of that is that it can work on just about anything - the downside, as with most generic code, is that it cannot be effectively optimized (internally) to be ideal for every situation, whereas code written for one particular scenario can be.

As with all optimization though - don't preemptively optimize. If LINQ results in better code, then use LINQ. If it later turns out that it is affecting your performance, then by all means replace it, but don't write code you'll struggle to read later on just on the off chance it might execute faster.
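If you do suspect LINQ is costing you, it is worth measuring before rewriting anything. The following is a rough sketch using `System.Diagnostics.Stopwatch`; the class name `QuickTiming`, the data size, and the iteration count are arbitrary assumptions, and inside Unity the built-in Profiler is likely a better tool than a hand-rolled timer.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

public static class QuickTiming
{
    // Hypothetical test data: the integers 0 to 99999
    static readonly int[] data = Enumerable.Range(0, 100000).ToArray();

    public static int CountWithLinq()
    {
        return data.Count(v => v % 3 == 0);
    }

    public static int CountWithLoop()
    {
        int count = 0;
        foreach (var v in data)
        {
            if (v % 3 == 0) count++;
        }
        return count;
    }

    // Runs the given function repeatedly and returns the elapsed milliseconds
    public static long Time(Func<int> work, int iterations)
    {
        work();   // warm-up call so one-time startup costs are not timed
        var watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) work();
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }
}
```

Comparing `QuickTiming.Time(QuickTiming.CountWithLinq, 100)` with `QuickTiming.Time(QuickTiming.CountWithLoop, 100)` on your actual target device gives a first impression of whether the difference matters for your case. The numbers will vary between machines and runs, so treat them as a guide only.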

Final word

LINQ is a different way of working with data in programming languages than many of us are used to. It probably isn't for everyone, but my discovery of it almost a decade ago completely changed the way I work with data - I think, for the better.

It took a lot of practice and experience to learn when and where to use it, and in particular to understand how it works internally (especially when working with data from databases - the Entity Framework in particular can sometimes create monster queries from LINQ statements if you aren't careful with it), but to my mind, it was very, very worth it.
