Improving processing performance with Parallel.ForEach

 

Sep 20, 2019

By Rod McBride

The .NET Framework Task Parallel Library (TPL) can significantly increase processing performance by using all available cores on the host computer more efficiently. With the typical execution model, a task (which is a unit of work) executes sequentially on a single CPU core.  

However, for a recent long-running task, I wanted to use parallelization to distribute the work across multiple processors and improve the overall processing time, given that my box has four cores and eight logical processors (see Figure 1).

Figure 1. Logical processors
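
As a rough way to check the same figure from code, the sketch below reads Environment.ProcessorCount, which reports the logical processors available to the current process. It is written as top-level statements (C# 9 or later), and the variable name is only illustrative.

```csharp
using System;

// Number of logical processors visible to the current process
// (this would be 8 on the four-core machine described above).
int logicalProcessors = Environment.ProcessorCount;
Console.WriteLine($"Logical processors: {logicalProcessors}");
```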

In this case, the task was record deletion: over a million records had been added in error to a table in Dynamics 365 online and needed to be removed.

I used the standard Bulk Delete process to attempt the cleanup, but it took a while and ended up erroring out. So, I decided to use the CRM SDK and LINQPad to test a delete script (see Listing 1 below). The process involved was typical: retrieve the list of Ids to delete, then loop through the result set and delete the records. While the initial attempt worked, it was slow, with a batch of 50,000 records taking about an hour to delete.

Given that the deletion could be easily split into tasks that execute efficiently on their own, it was well suited for parallelism with the TPL. The TPL provides a basic form of structured parallelism via three static methods in the Parallel class:

  • Parallel.Invoke: Executes each of the provided actions, possibly in parallel.
  • Parallel.For: Executes a for loop in which iterations may run in parallel.
  • Parallel.ForEach: Executes a foreach loop in which iterations may run in parallel.
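
To make the three methods concrete, here is a minimal sketch that calls each one on toy data. It is not taken from the deletion script; the variable names and sample values are made up for illustration, and the code is written as top-level statements (C# 9 or later).

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Parallel.Invoke: run independent actions, possibly at the same time.
// Interlocked keeps the shared counter correct across threads.
int invoked = 0;
Parallel.Invoke(
    () => Interlocked.Increment(ref invoked),
    () => Interlocked.Increment(ref invoked));

// Parallel.For: index-based loop. Each iteration writes to its own slot,
// so no locking is needed.
var squares = new int[10];
Parallel.For(0, squares.Length, i => squares[i] = i * i);

// Parallel.ForEach: loop over any IEnumerable<T>. A thread-safe collection
// gathers the results because iterations may run on different threads.
var upper = new ConcurrentBag<string>();
Parallel.ForEach(new[] { "a", "b", "c" }, s => upper.Add(s.ToUpperInvariant()));

// prints: invoked=2, sum of squares=285, items=3
Console.WriteLine($"invoked={invoked}, sum of squares={squares.Sum()}, items={upper.Count}");
```

All three calls block until every action or iteration has finished, so the results can be read safely on the next line.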

I used the Parallel.ForEach method and had to make only minor changes to the original query, as shown in Figure 2 and in Listings 1 and 2.

Figure 2: foreach query change

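// Retrieve only the Ids of the records to delete.
var entities = history.Where(e => e.tls_User.Id == userId)
               .Select(e => new { e.Id })
               .ToList();

// Delete the records one at a time on a single thread.
foreach (var e in entities)
{
    Delete("history", e.Id);
}

Listing 1: Original query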

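
// Retrieve only the Ids of the records to delete.
var entities = history.Where(e => e.tls_User.Id == userId)
               .Select(e => new { e.Id })
               .ToList();

// Delete the records with iterations that may run in parallel.
Parallel.ForEach(entities, e =>
{
    Delete("history", e.Id);
});

Listing 2: Revised query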

After switching from the standard C# foreach to Parallel.ForEach and re-running the query, throughput improved almost tenfold. Instead of ~833 records per minute (~50K an hour), the revised query processed ~8K records per minute (~480K an hour).
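
The sketch below shows one way to compare the two loops locally and to cap how many iterations run at once. It does not call Dynamics 365: FakeDelete is a hypothetical stand-in for the Delete call that simply sleeps to mimic a service round trip, and the 20 ms latency, the 40 records and the cap of 8 are arbitrary assumptions. Capping concurrency with ParallelOptions.MaxDegreeOfParallelism may be worth considering when the remote service limits concurrent requests.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for Delete("history", id); it only simulates latency.
int deleted = 0;
void FakeDelete(Guid id)
{
    Thread.Sleep(20);                    // assumed round-trip time
    Interlocked.Increment(ref deleted);  // thread-safe counter
}

// Made-up record Ids in place of the query result.
List<Guid> ids = Enumerable.Range(0, 40).Select(_ => Guid.NewGuid()).ToList();

// Sequential baseline.
var sequential = Stopwatch.StartNew();
foreach (var id in ids) FakeDelete(id);
sequential.Stop();

// Parallel version, limited to at most 8 concurrent iterations.
var options = new ParallelOptions { MaxDegreeOfParallelism = 8 };
var parallel = Stopwatch.StartNew();
Parallel.ForEach(ids, options, id => FakeDelete(id));
parallel.Stop();

Console.WriteLine($"deleted={deleted}, sequential={sequential.ElapsedMilliseconds} ms, parallel={parallel.ElapsedMilliseconds} ms");
```

The timings depend on the machine and on how quickly the thread pool adds threads, so treat the numbers as indicative only.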

You can also parallelize the LINQ query itself by adding the AsParallel() method, and then parallelize the foreach by using the ForAll() method. Here’s a simple example:

"abcdef".AsParallel().Select(c => char.ToUpper(c)).ForAll(Console.Write);
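
One detail worth knowing about the one-liner above: ForAll runs its action on worker threads, so the letters may not print in source order. The following sketch, written as top-level statements, contrasts that with AsOrdered(), which preserves source order when the results are merged back. The variable name ordered is only illustrative.

```csharp
using System;
using System.Linq;

// ForAll: the output order can vary between runs.
"abcdef".AsParallel().Select(c => char.ToUpper(c)).ForAll(Console.Write);
Console.WriteLine();

// AsOrdered: results come back in source order.
string ordered = new string(
    "abcdef".AsParallel().AsOrdered().Select(c => char.ToUpper(c)).ToArray());
Console.WriteLine(ordered); // prints ABCDEF
```

For a job like the deletion, where order does not matter, the unordered form is usually sufficient.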

All I wanted to do was to parallelize a foreach, so I used its parallel version, Parallel.ForEach().

As you can see, parallelism significantly increased the performance of the deletion process, but that may not always be the case. It depends on several factors, such as the number of CPUs, the number of iterations involved, the type of parallelism (data, task or dataflow) and whether it’s an embarrassingly parallel problem.

However, programs that are properly designed to take advantage of parallelism can execute significantly faster than their sequential counterparts.

For more information, see the following links:

Parallel Programming in the .NET Framework

See the Potential Pitfalls in Data and Task Parallelism

Down and Dirty: .NET Task Parallel Library (Multithreading in a Multicore World)

