Blog Barista: Anthony Wolf | May 20, 2020 | Development Practices | Brew time: 6 min
Thinking About Your Data Model
I often read code on forums or Stack Overflow from C# beginners and see them using FirstOrDefault every time they need a single item from an IEnumerable. If I ask them why they made this choice, the reply is typically something like “it always works” or “it gets the job done.” This is often not true. The problem with this logic is that it doesn’t consider the data model, and in many cases flat-out ignores it. LINQ provides several extension methods on IEnumerable<T> for retrieving one specific item from a collection, and weighing all of them forces you to think about your data model. Here are the basic options:
Single(): Attempts to get exactly one element from the sequence; throws an exception if there is not exactly one element in the sequence.
SingleOrDefault(): Attempts to get exactly one element from the sequence; returns the default value for the item’s type if the sequence is empty, and throws an exception if there is more than one element in the sequence.
First(): Attempts to get the first element in the sequence; throws an exception if the sequence is empty.
FirstOrDefault(): Attempts to get the first element in the sequence; returns the default value for the item’s type if the sequence is empty.
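To make the differences concrete, here is a minimal sketch that runs all four methods against an empty list and a two-element list. The Describe helper and the class name are hypothetical and exist only for this illustration; the four LINQ methods themselves signal failure with InvalidOperationException.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SingleItemDemo
{
    // Hypothetical helper: runs a query and reports its result, "null" for a
    // null result, or the exception type thrown by the LINQ method.
    public static string Describe(Func<string> query)
    {
        try
        {
            return query() ?? "null";
        }
        catch (InvalidOperationException)
        {
            return "InvalidOperationException";
        }
    }

    public static void PrintAll()
    {
        var empty = new List<string>();
        var many = new List<string> { "a", "b" };

        Console.WriteLine(Describe(() => empty.Single()));           // InvalidOperationException
        Console.WriteLine(Describe(() => empty.SingleOrDefault()));  // null
        Console.WriteLine(Describe(() => empty.First()));            // InvalidOperationException
        Console.WriteLine(Describe(() => empty.FirstOrDefault()));   // null

        Console.WriteLine(Describe(() => many.Single()));            // InvalidOperationException
        Console.WriteLine(Describe(() => many.SingleOrDefault()));   // InvalidOperationException
        Console.WriteLine(Describe(() => many.First()));             // a
        Console.WriteLine(Describe(() => many.FirstOrDefault()));    // a
    }
}
```

Note that only the Single variants react to the two-element list; both First variants quietly return "a".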
Regardless of whether the data model is something you as a developer are responsible for designing or have inherited from a legacy system, thinking about your data model is very important and can prevent bugs from hiding in plain sight. Consider the following:
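A minimal sketch of the kind of code in question follows. Only the CalculateTaxes method name and the HouseDetails collection property come from the discussion; the House class, the AssessedValue, Exemption and TaxRate properties, and the tax formula are assumptions made up for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical model: property names are assumptions for illustration.
public class HouseDetails
{
    public decimal AssessedValue { get; set; }
    public decimal Exemption { get; set; }
    public decimal TaxRate { get; set; }
}

public class House
{
    // Modeled as a collection even though the code below expects one element.
    public List<HouseDetails> HouseDetails { get; set; } = new List<HouseDetails>();
}

public static class TaxCalculator
{
    public static decimal CalculateTaxes(House house)
    {
        // Every lookup silently assumes exactly one element, but nothing
        // here states or enforces that assumption.
        decimal taxable = house.HouseDetails.FirstOrDefault().AssessedValue
                        - house.HouseDetails.FirstOrDefault().Exemption;

        return taxable * house.HouseDetails.FirstOrDefault().TaxRate;
    }
}
```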
The CalculateTaxes method above accesses the HouseDetails collection property several times to retrieve a HouseDetails object for the house. As a developer coming in after the fact to maintain this code, I would be left scratching my head with several questions. Suppose the initial assumption when developing this code was that HouseDetails would always contain exactly one element. If so, no variant of .First should ever be used to obtain that single instance, because .First does not convey the intentions and assumptions of the original developer. In addition, the OrDefault variant is not appropriate, because it ignores the assumption that HouseDetails always contains one element by allowing for the possibility that it could return null. If the assumption holds forever, we’re fine and will never need to look at this code again. If the assumption changes at any point in the future, however, the original intent of the code is lost, and we’re left making a judgment call about why we’re getting an exception.
The best-case scenario for the assumption failing is an empty collection, because dereferencing the null returned by FirstOrDefault would cause a null reference exception, and we would at least know that something has gone wrong. However, what happens if additional items have been added to the collection without our knowledge? The code now potentially contains a bug that may go unnoticed for a significant period of time, causing a few inaccurate calculations at best and massive amounts of inaccurate data being persisted at worst. In either case, the original intent of the developer has been lost because of a poor choice of single-item search method.
When deciding which of the above options to use on an IEnumerable, you should always be able to answer, or at least assume an answer to, the following questions about your data and your data model:
1. Is it logical for this collection to contain more than one value?
a. If the answer to this is no, and you have control over the data model, consider revising the data model so that this value is no longer represented as a collection.
b. If the answer to this is no, and you do not control the data model, you should use .Single(). This will hopefully reveal any bugs or inconsistencies with critical design assumptions as soon as possible, rather than just sweeping them under the rug.
c. If the answer to this is yes, and you need a single item, you’ll likely need to use .First() – I’ll elaborate more on this later.
2. Is it logical for this collection to be empty?
a. If the answer to this is yes, you will need to use the OrDefault variant of whichever option you chose in step #1.
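As a rough illustration of step 1b, the sketch below contrasts the two choices when a collection that is assumed to hold exactly one value stops doing so. The class and method names are hypothetical, and plain decimal values stand in for a richer model.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class AssumptionCheck
{
    // The "it always works" habit: never throws, whatever the collection holds.
    public static decimal Lenient(IEnumerable<decimal> assessedValues)
    {
        return assessedValues.FirstOrDefault();
    }

    // Step 1b: exactly one value is expected, so Single() enforces it and
    // throws InvalidOperationException when the assumption is violated.
    public static decimal Strict(IEnumerable<decimal> assessedValues)
    {
        return assessedValues.Single();
    }
}
```

Given the values 200000 and 350000, Lenient returns 200000 without complaint, while Strict throws. For an empty sequence of a value type such as decimal, Lenient returns 0 rather than null, so not even a null reference exception flags the problem.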
If You Use First(), You Need to Sort
If you applied step 1C in your decision-making process, now you need to think about how your collection is sorted. After all, if the order of your items is practically random, the result you get from using .First() will also be random. Most of the time, I’d consider getting a random item from a collection to be a rather useless operation. If you can guarantee that your IEnumerable is already sorted the way it needs to be, then you’re all set. Otherwise, you’ll need to use an OrderBy or OrderByDescending to make sure you’re getting the correct element. Even if it is already sorted, however, consider throwing in an OrderBy for the sake of readability; the next developer who needs to do maintenance on or related to this code will always be appreciative of the effort you put into readability.
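For example, if “the first item” really means “the most recent item”, an explicit sort says so. This is a minimal sketch; the Appraisal type and its properties are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type used only for this illustration.
public class Appraisal
{
    public DateTime AppraisedOn { get; set; }
    public decimal Value { get; set; }
}

public static class AppraisalQueries
{
    // The explicit sort documents which element "first" is meant to be.
    // First() still throws if the sequence is empty.
    public static Appraisal MostRecent(IEnumerable<Appraisal> appraisals)
    {
        return appraisals
            .OrderByDescending(a => a.AppraisedOn)
            .First();
    }
}
```

The result no longer depends on the order in which the appraisals happened to be stored.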
Don’t Use Where in This Scenario
Another pattern I see in C# beginner code is using a Where before a First or a Single. First and Single can optionally take a filtering lambda expression just like Where does, so instead of:
return myCollection.Where(x => x.Name == "Charlie Brown").SingleOrDefault();
you can instead write:
return myCollection.SingleOrDefault(x => x.Name == "Charlie Brown");
and you’ve written less code that is more readable and does the exact same thing. That doesn’t mean you should never use Where; I’m only saying it isn’t your best option when searching for an individual element. However, sometimes including a Where filter in addition to a single-search filter can improve readability, as in the following example:
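One hypothetical illustration of that point, with an invented Person type: the Where clause describes which population is being searched, and the SingleOrDefault predicate describes the one item being looked for.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical type used only for this illustration.
public class Person
{
    public string Name { get; set; }
    public bool IsActive { get; set; }
}

public static class PeopleQueries
{
    public static Person FindActiveByName(IEnumerable<Person> people, string name)
    {
        return people
            .Where(p => p.IsActive)                  // which people are in scope
            .SingleOrDefault(p => p.Name == name);   // the one person we want
    }
}
```

The same query could be written as a single predicate joined with &&, so splitting it this way is a readability choice rather than a behavioral one.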
If you find yourself needing to use one of these filtering overloads, it is time to revisit the decision-making process in a slightly different context.
The Decision for Filtered Single-Search
When searching a collection, you may have one or more values that uniquely identify the element you’re looking for, in which case you can and should use .Single(x => … ). If zero items is a reasonable result from your search filter, make sure to use the OrDefault variant of whichever extension method you choose.
1. Is it reasonable for there to be more than one result from this search?
a. If no, use .Single(x => …)
b. If yes, sort, then use .First(x => …)
2. Is it reasonable for there to be no matches for this search?
a. If yes, use the OrDefault variant of the extension method chosen in step 1.
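Here is a minimal sketch of both branches, using a hypothetical Employee type in which Id is assumed to be unique and Department is not.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type: Id is assumed unique, Department is not.
public class Employee
{
    public int Id { get; set; }
    public string Department { get; set; }
    public DateTime HiredOn { get; set; }
}

public static class EmployeeQueries
{
    // 1a and 2 "no": the key is unique and expected to exist, so anything
    // other than exactly one match throws InvalidOperationException.
    public static Employee GetById(IEnumerable<Employee> employees, int id)
    {
        return employees.Single(e => e.Id == id);
    }

    // 1b and 2a: several matches are normal and zero is acceptable,
    // so sort first, then take the first match or null.
    public static Employee LongestTenured(IEnumerable<Employee> employees, string department)
    {
        return employees
            .OrderBy(e => e.HiredOn)
            .FirstOrDefault(e => e.Department == department);
    }
}
```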
It is very easy to get into the bad habit of thinking “FirstOrDefault always works!”, but that habit can lead to missed bugs and missed opportunities to write better code that exposes design flaws early, before they become serious problems. Now that you know which single-search IEnumerable extension methods to use and when, you’re well on your way to code that is more readable and stable in this scenario. Please share this with someone else who needs to read it.