Inappropriate use of local variables in the Enumerable.Where() extension method

Before holidays I was solving problem with filtering entities. Users were really unhappy. When they want to filter entities by date range application showed not corresponding results, so I dig into that problem and was unpleasantly surprised by the bug I’ve found. Developer wanted to save one line of code and reused one variable like this(keep your eyes on the date variable)

// incoming parameters from UI
// if user do not filter by dates or filter only one end of the collection there will be used empty string
string maybeStartDate = "2011-01-01";
string maybeEndDate = "2011-12-31";

// in real world it can be any data source derived from IEnumerable
IEnumerable<DateTime> dates = new DateTime[] {
                new DateTime(2003, 1, 1),
                new DateTime(2010, 6, 15),
                new DateTime(2011, 3, 12),
                new DateTime(2012, 11, 26),
                new DateTime(2012, 12, 24)
            };

// unassigned local variable for passing with out modifier to DateTime.TryParse() method
DateTime date;

// if string is correctly parsed we want to limit lower end of selection.
// If empty string is passed we get fakse and we skip lower end limitation
if (DateTime.TryParse(maybeStartDate, out date)) {
    dates = dates.Where(dt => dt >= date);
}

// if string is correctly parsed we want to limit greater end of selection.
// If empty string is passed we get fakse and we skip greater end limitation
if (DateTime.TryParse(maybeEndDate, out date)) {
    dates = dates.Where(dt => dt <= date);
}

return dates;

Sometimes I feel that developer’s have weak knowledge about how collections and LINQ works together, because as I asked my mates they thought the code above will produce collection of one date(12th of March 2011), but that’s not right. In this case result will return empty collection, because there’s nothing between 31th of December 2011 and 31th of December 2011.

Unexpected result? No!

We know that Where extension method returns IEnumerable<TSource>, but what we usually didn’t is that Where method returns one of the following nested class derived from IEnumerable<TSource> based on the instantiated type.

  • private class Enumerable.WhereArrayIterator<TSource> for TSource[]
  • private class Enumerable.WhereListIterator<TSource> for List<TSource>
  • private class Enumerable.WhereEnumerableIterator<TSource> for IEnumerable<TSource>

All of these private types are derived from private abstract class Enumerable.Iterator<TSource> and defined in System.Linq.Enumerable type.

So, the problem here is that after we use Where extension method we didn’t enumerate the source collection and yield elements that satisfies predicate directly. Instead of it is created one of the above types which helds source collection and predictate. When we call Where extension method on one of this types the predictates are combined together and creates new one that is hold in the Iterator class.

In our case we’ve created WhereArrayIterator class we combine predicate td => td >= date and td => td <= date.

WhereArrayIterator is now holding after combining two predictates and we can simply write it like this:

DateTime date = DateTime.Parse("2011-12-31");
return dates.Where(td => td <= date && td >= date);

But why it is like that? Just because we’ve used lambda expression to create delegate that points to an anonymous method and anonymous method points to the local variable, not the current value in it. So, the problem is that we work with one local variable that was changed before we iterate throught the collection.

Solution(s)

Easier and better way to avoid this unexpected collection results is to simply declare separate variable for each parameter we are using in lambda expressions.

// incoming parameters from UI
// if user do not filter by dates or filter only one end of the collection there will be used empty string
string maybeStartDate = "2011-01-01";
string maybeEndDate = "2011-12-31";

// in real world it can be any data source derived from IEnumerable
IEnumerable<DateTime> dates = new DateTime[] {
                new DateTime(2003, 1, 1),
                new DateTime(2010, 6, 15),
                new DateTime(2011, 3, 12),
                new DateTime(2012, 11, 26),
                new DateTime(2012, 12, 24)
            };

// unassigned local variable for passing with out modifier to DateTime.TryParse() method
DateTime startDate;

// if string is correctly parsed we want to limit lower end of selection.
// If empty string is passed we get fakse and we skip lower end limitation
if (DateTime.TryParse(maybeStartDate, out date)) {
    dates = dates.Where(dt => dt >= startDate);
}

// unassigned local variable for passing with out modifier to DateTime.TryParse() method
DateTime endDate;

// if string is correctly parsed we want to limit greater end of selection.
// If empty string is passed we get fakse and we skip greater end limitation
if (DateTime.TryParse(maybeEndDate, out date)) {
    dates = dates.Where(dt => dt <= endDate);
}

return dates;

Second solution for that is to iterate the collection to apply predicate. So, you can simply call ToArray(), ToList() (or any other) method, but I don’t recommend this approach, because overhead is not what we want to achieve. You know, more you iterate, more it cost.

// incoming parameters from UI
// if user do not filter by dates or filter only one end of the collection there will be used empty string
string maybeStartDate = "2011-01-01";
string maybeEndDate = "2011-12-31";

// in real world it can be any data source derived from IEnumerable
IEnumerable<DateTime> dates = new DateTime[] {
                new DateTime(2003, 1, 1),
                new DateTime(2010, 6, 15),
                new DateTime(2011, 3, 12),
                new DateTime(2012, 11, 26),
                new DateTime(2012, 12, 24)
            };

// unassigned local variable for passing with out modifier to DateTime.TryParse() method
DateTime date;

// if string is correctly parsed we want to limit lower end of selection.
// If empty string is passed we get fakse and we skip lower end limitation
if (DateTime.TryParse(maybeStartDate, out date)) {
    dates = dates.Where(dt => dt >= date).ToArray();
}

// if string is correctly parsed we want to limit greater end of selection.
// If empty string is passed we get fakse and we skip greater end limitation
if (DateTime.TryParse(maybeEndDate, out date)) {
 dates = dates.Where(dt => dt <= date);
}

return dates;

Conclusion

Don’t try to save one line of code! Code became less readable and may cause unexpected results as it does in this case. Keep using things in the standard manner. Do not experiment when you don’t know how things really works. Oh, maybe I forgot one important thing. All written above apply for all types inherited from IEnumerable, so IQueryable types are also involved.

Posted on 13.1.2013, in Development Criminal Investigative Service and tagged , , , , , , . Bookmark the permalink. Leave a comment.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: