Utility of List.Sort() versus List.OrderBy() for a member of a custom container class

 

Questions


I’ve found myself running back through some old 3.5 framework legacy code, and found some points where there are a whole bunch of lists and dictionaries that must be updated in a synchronized fashion. I’ve determined that I can make this process infinitely easier to both utilize and understand by converging these into custom container classes of new custom classes. There are some points, however, where I came to concerns with organizing the contents of these new container classes by a specific inner property. For example, sorting by the ID number property of one class.

As the container classes are primarily based around a generic List object, my first instinct was to write the inner classes with IComparable, and write the CompareTo method that compares the properties. This way, I can just call items.Sort() when I want to invoke the sorting.

However, I’ve been thinking instead about using items = items.OrderBy(Func) instead. This way it is more flexible if I need to sort by any other property. Readability is better as well, since the property used for sorting will be listed in-line with the sort call rather than having to look up the IComparable code. The overall implementation feels cleaner as a result.

I don’t care for premature or micro optimization, but I like consistency. I find it best to stick with one kind of implementation for as many cases as it is appropriate, and use different implementations where it is necessary. Is it worth it to convert my code to use the LINQ OrderBy instead of using List.Sort? Is it a better practice to stick with the IComparable implementation for these custom containers? Are there any significant mechanical advantages offered by either path that I should be weighing the decision on? Or is their end-functionality equivalent to the point that it just becomes coder’s preference?

 

 

————————————————-

Answer

The major point here is that List<T>.Sort() does the sorting in place. If your list is exposed to external code, it will always represent the same object to this code. This is important if the list is kept in a field by code outside of the container class. If you’re sorting with OrderBy(), you’ll get a new enumeration each time, replacing the previous items. Any previously stored list will not represent the current state of your class.

Considering performance, OrderBy will have to iterate through the whole list to sort items. Then you will call ToList() to create the new list from this enumeration, iterating through the list a second time. Plus, since it’s an enumeration, List will use the doubling algorithm, increasing its size until every element can fit into it. In case of a large list, that could be quite a few allocations and memory copying. I would expect performance to be much worse than List<T>.Sort().

Edit: Small benchmark:

internal class Program {

    private static List<int> CreateList(int size) {

        // use the same seed so that every list has the same elements
        Random random = new Random(589134554);

        List<int> list = new List<int>(size);
        for (int i = 0; i < size; ++i)
            list.Add(random.Next());
        return list;
    }

    private static void Benchmark(int size, bool output = true) {
        List<int> list1 = CreateList(size);
        List<int> list2 = CreateList(size);

        Stopwatch stopwatch = Stopwatch.StartNew();
        list1.Sort();
        stopwatch.Stop();
        double elapsedSort = stopwatch.Elapsed.TotalMilliseconds;
        if (output)
            Console.WriteLine("List({0}).Sort(): {1}ms (100%)", size, elapsedSort);

        stopwatch.Restart();
        list2.OrderBy(i => i).ToList();
        stopwatch.Stop();
        double elapsedOrderBy = stopwatch.Elapsed.TotalMilliseconds;
        if (output)
            Console.WriteLine("List({0}).OrderBy(): {1}ms ({2:.00%})", size, elapsedOrderBy, elapsedOrderBy / elapsedSort);

    }

    internal static void Main() {

        // ensure linq library is loaded and initialized
        Benchmark(1000, false);

        Benchmark(10);
        Benchmark(100);
        Benchmark(1000);
        Benchmark(10000);
        Benchmark(100000);
        Benchmark(1000000);

        Console.ReadKey();
    }
}

Output (normalized to List.Sort):

List(10).Sort(): 0,0025ms (100%)
List(10).OrderBy(): 0,0157ms (628,00%)
List(100).Sort(): 0,0068ms (100%)
List(100).OrderBy(): 0,0294ms (432,35%)
List(1000).Sort(): 0,0758ms (100%)
List(1000).OrderBy(): 0,3107ms (409,89%)
List(10000).Sort(): 0,8969ms (100%)
List(10000).OrderBy(): 4,0751ms (454,35%)
List(100000).Sort(): 10,8541ms (100%)
List(100000).OrderBy(): 50,3497ms (463,88%)
List(1000000).Sort(): 124,1001ms (100%)
List(1000000).OrderBy(): 705,0707ms (568,15%)

c#,linq,list,sorting

[], [], [], []

Facebook Comments

Post a comment