
Filtering data: the Where() method

One of the most basic (but also most powerful) operations you can perform on a set of data is to filter some of it out. We already saw a glimpse of what you can do with the Where() method in the LINQ introduction article, but in this article, we'll dig a bit deeper. We already discussed how many LINQ methods can use a lambda expression to perform their task, and the Where() method is one of them - it passes each item to your expression as the input, and your expression supplies the logic that decides whether the item is included in (return true) or excluded from (return false) the final result. Here's a basic example:

List<int> numbers = new List<int>()
{
    1, 2, 4, 8, 16, 32
};
var smallNumbers = numbers.Where(n => n < 10);
foreach (var n in smallNumbers)
    Console.WriteLine(n);

In this example, each number is checked against our expression, which returns true if the number is smaller than 10 and false if it's 10 or higher. As a result, we get a filtered view of the original list (an IEnumerable<int>) that includes only the numbers below 10, which we then write to the console: 1, 2, 4 and 8.
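To make the role of the expression more concrete, here is a minimal sketch of the same filtering written by hand. FilterByHand is a hypothetical helper made up for this illustration, not part of LINQ, and it differs from the real Where() in one important way: it builds the full result immediately instead of waiting until you loop over it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class WhereByHand
{
    // Hypothetical helper for illustration only: keeps every number
    // for which the predicate returns true.
    public static List<int> FilterByHand(List<int> source, Func<int, bool> predicate)
    {
        var result = new List<int>();
        foreach (int n in source)
        {
            if (predicate(n))
                result.Add(n);
        }
        return result;
    }

    static void Main()
    {
        var numbers = new List<int> { 1, 2, 4, 8, 16, 32 };

        var byHand = FilterByHand(numbers, n => n < 10);
        var withLinq = numbers.Where(n => n < 10);

        Console.WriteLine(string.Join(", ", byHand));   // 1, 2, 4, 8
        Console.WriteLine(string.Join(", ", withLinq)); // 1, 2, 4, 8
    }
}
```

Both lines print the same numbers, because the same lambda expression decides which items survive in each case.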

But the expression doesn't have to be as simple as that - we can easily add more requirements to it, just as we would in a regular if statement:

List<int> numbers = new List<int>()
{
    1, 2, 4, 8, 16, 32
};
var smallNumbers = numbers.Where(n => n > 1 && n != 4 && n < 10);
foreach (var n in smallNumbers)
    Console.WriteLine(n);

We specify that the number has to be greater than 1, but not the specific number 4, and smaller than 10. With our list, that leaves only 2 and 8.

You can of course also use method calls in your expression - as long as the final result is a boolean value, so that the Where() method knows whether you want the item in question included or not, you're good to go. Here's an example:

List<int> numbers = new List<int>()
{
    1, 2, 4, 7, 8, 16, 29, 32, 64, 128
};
List<int> excludedNumbers = new List<int>()
{
    7, 29
};
var validNumbers = numbers.Where(n => !excludedNumbers.Contains(n));
foreach (var n in validNumbers)
    Console.WriteLine(n);

In this example, we declare a second list of numbers - a sort of blacklist of numbers that we don't want included. In the Where() method, we call Contains() on the blacklist to decide whether a number can be included in the final result or not.
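If the logic grows beyond a short expression, one option is to move it into a regular method and pass the method name to Where() instead of a lambda. The sketch below assumes a helper called IsAllowed, a name invented for this example; any method that takes one item and returns a bool would work the same way.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class NamedPredicateExample
{
    static readonly List<int> ExcludedNumbers = new List<int> { 7, 29 };

    // Hypothetical helper: returns true when the number is NOT on the blacklist.
    public static bool IsAllowed(int n)
    {
        return !ExcludedNumbers.Contains(n);
    }

    public static List<int> GetValidNumbers()
    {
        var numbers = new List<int> { 1, 2, 4, 7, 8, 16, 29, 32, 64, 128 };
        // The method name is passed directly instead of a lambda expression.
        return numbers.Where(IsAllowed).ToList();
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", GetValidNumbers()));
        // 1, 2, 4, 8, 16, 32, 64, 128
    }
}
```

The result is the same as with the lambda version above; which one reads better is mostly a matter of taste and of how complex the condition is.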

And of course, it works for more complex objects than numbers, and it's still very easy to use. Just have a look at this example, where we use objects with user information instead of numbers, and use the Where() method to get the users whose names start with the letter "J" and who are 39 years old or younger:

using System;
using System.Collections.Generic;
using System.Linq;

namespace LinqWhere2
{
    class Program
    {
        static void Main(string[] args)
        {
            List<User> listOfUsers = new List<User>()
            {
                new User() { Name = "John Doe", Age = 42 },
                new User() { Name = "Jane Doe", Age = 34 },
                new User() { Name = "Joe Doe", Age = 8 },
                new User() { Name = "Another Doe", Age = 15 },
            };

            var filteredUsers = listOfUsers.Where(user => user.Name.StartsWith("J") && user.Age < 40);
            foreach (User user in filteredUsers)
                Console.WriteLine(user.Name + ": " + user.Age);
        }

        class User
        {
            public string Name { get; set; }
            public int Age { get; set; }
        }
    }
}

And just for comparison, here's what the Where operation would look like if we had used the query-based syntax instead of the method-based syntax:

// Method syntax
var filteredUsers = listOfUsers.Where(user => user.Name.StartsWith("J") && user.Age < 40);

// Query syntax
var filteredUsersQ = from user in listOfUsers where user.Name.StartsWith("J") && user.Age < 40 select user;
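The two forms are interchangeable, because the compiler translates a where clause in query syntax into a call to Where(). Here is a small sketch that runs both against the same users and compares the results; the method names WithMethodSyntax and WithQuerySyntax are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class User
{
    public string Name { get; set; }
    public int Age { get; set; }
}

static class SyntaxComparison
{
    public static IEnumerable<User> WithMethodSyntax(List<User> users)
    {
        return users.Where(user => user.Name.StartsWith("J") && user.Age < 40);
    }

    public static IEnumerable<User> WithQuerySyntax(List<User> users)
    {
        return from user in users
               where user.Name.StartsWith("J") && user.Age < 40
               select user;
    }

    static void Main()
    {
        var listOfUsers = new List<User>
        {
            new User { Name = "John Doe", Age = 42 },
            new User { Name = "Jane Doe", Age = 34 },
            new User { Name = "Joe Doe", Age = 8 },
            new User { Name = "Another Doe", Age = 15 },
        };

        // Both queries select the same User objects, in the same order.
        bool same = WithMethodSyntax(listOfUsers).SequenceEqual(WithQuerySyntax(listOfUsers));
        Console.WriteLine(same); // True

        foreach (User user in WithQuerySyntax(listOfUsers))
            Console.WriteLine(user.Name + ": " + user.Age);
        // Jane Doe: 34
        // Joe Doe: 8
    }
}
```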

Chaining multiple Where() methods

We discussed this briefly in the introduction to LINQ: The actual result of a LINQ expression is not computed until you actually need the data, e.g. when you loop over it or count it (in our examples, we loop over it). That also means that you can chain multiple Where() methods together, if you feel that's easier to read - in very complex expressions, it definitely can be! Here's a modified version of our previous example:

List<int> numbers = new List<int>()
{
    1, 2, 4, 8, 16, 32
};
var smallNumbers = numbers.Where(n => n > 1).Where(n => n != 4).Where(n => n < 10);
foreach (var n in smallNumbers)
    Console.WriteLine(n);

The result is exactly the same, and while the first version might not have been complex enough to justify the split into multiple Where() method calls, you will likely run into situations where it makes good sense to do so. I want to emphasize that this costs very little in terms of performance: the actual "where" operations are not carried out until the part where we loop over the result, and at that point the list is still traversed only once, with each item passed through the conditions in order and dropped as soon as one of them fails. For in-memory collections like the ones used here, the difference between one combined condition and several chained ones is usually negligible.
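You can observe the delayed execution yourself with a small experiment. The sketch below counts how many times the first condition is evaluated and adds an extra number to the list after the query has been defined but before the loop; the class name, the method name and the counter are assumptions made for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class DeferredExecutionDemo
{
    public static List<int> Run()
    {
        var numbers = new List<int> { 1, 2, 4, 8, 16, 32 };
        int checks = 0; // counts how often the first condition runs

        var smallNumbers = numbers
            .Where(n => { checks++; return n > 1; })
            .Where(n => n != 4)
            .Where(n => n < 10);

        // Nothing has been filtered yet, so the counter is still 0.
        Console.WriteLine("Checks before the loop: " + checks);

        // Added after the query was defined, but before it is executed.
        numbers.Add(3);

        var result = new List<int>();
        foreach (var n in smallNumbers)
            result.Add(n);

        // One check per item in the list, including the late addition.
        Console.WriteLine("Checks after the loop: " + checks);
        return result;
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", Run()));
    }
}
```

The first counter line reports 0 and the second reports 7, and the final output is 2, 8, 3. The number 3 is included even though it was added after the Where() calls, because the filtering only happens once the loop starts.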

Summary

With the Where() method, you can easily filter out unwanted items from your data source, to create a subset of the original data. Remember that what you get back is a new sequence - the original data source is left untouched unless you specifically overwrite the original variable.
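As a quick illustration of that last point, the following sketch filters a list twice: first into a separate variable, which leaves the original alone, and then by assigning the result back to the original variable. ToList() is used to turn the filtered sequence into an actual List<int>; the class and variable names are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class OriginalUntouchedDemo
{
    // Returns three counts: the filtered copy, the original list after
    // filtering into a copy, and the original variable after overwriting it.
    public static int[] Run()
    {
        var numbers = new List<int> { 1, 2, 4, 8, 16, 32 };

        // Filtering into a new variable does not change the source list.
        var smallNumbers = numbers.Where(n => n < 10).ToList();
        int countAfterFilter = numbers.Count;

        // Only an explicit assignment replaces the original data.
        numbers = numbers.Where(n => n < 10).ToList();

        return new[] { smallNumbers.Count, countAfterFilter, numbers.Count };
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", Run())); // 4, 6, 4
    }
}
```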
