How To Handle Large Datasets in CRM

by Ben Campbell-Bradley
14 minute read

Recently, a client asked me to solve a problem they were having. They wanted to retrieve a set of notes records from a backend system, use that data to create records for a custom notes entity in CRM, and then display a list of those notes inside a custom HTML web resource. All of this had to happen in real time when requested by the user. The complication was that the backend system didn't have any kind of paging in place: I couldn't request a subset of records, display them, and fetch the next page when the user asked for it. I had to make a request to the backend system, get all the records, and then figure out how to get those records into CRM quickly. The user experience depended on it.

The data being fetched from the backend system contained a huge number of records, sometimes 10,000 or more.  I wrote an Azure Web App to handle the integration and used the ExecuteMultiple message to speed up creating the records in CRM.  This, however, turned out to be only half of the solution.  I was still seeing significant delays of up to a minute or more.  After digging into the logs and doing some research, I found that the biggest delay came from creating each note request and adding it to the request collection.  The second half of the solution would have to improve the performance of loading that request collection.

What was the answer? Multithreading! 

By using multithreading, we can leverage multiple CPU cores to increase speed.  We're essentially trading CPU load for speed: while the CPU load will increase, the time to perform the actions we need will decrease.

After I was finished, the time to load and display the notes list decreased from a minute to around 10 seconds.  Not too shabby for 10,000 records.   

I also realized that this pattern had another application: data migration.  I had another client at the time who needed entity images migrated from their on-premises instance of CRM to the cloud.  I was able to use Scribe for most of the data, but entity images were not supported by Scribe.  Using this same pattern, I wrote a console app to migrate all of the contact images from their on-premises instance to their online instance.

On to the code! 

I'm going to show an example of using this pattern to create contacts, but it can be applied to any type of request and any entity record type.

The way this pattern works is that it loads all the requests into the collection concurrently rather than sequentially.  Rather than creating and adding one request at a time for each contact, it creates multiple contact requests and adds them to the collection in parallel.  Once it's finished creating and adding the requests, it calls ExecuteMultiple with the collection to load the contacts into CRM.

As always when using multithreading, make sure you weigh the pros and cons.  Keep in mind, this solution is only appropriate for a large number of records.  Implementing it for a smaller number of records will actually cause worse performance, because the overhead of spinning up the parallel loop outweighs the gains.

To start, you'll need to include the following namespaces in addition to the usual Xrm.Sdk ones:

<!--CODE BEGINS-->

using System.Threading.Tasks; 

using System.Collections.Concurrent; 

<!--CODE ENDS-->

Then declare the variables: 

<!--CODE BEGINS-->

List<Entity> contactsToCreate = new List<Entity>(); 

ConcurrentBag<OrganizationRequest> createRequests = new ConcurrentBag<OrganizationRequest>();

<!--CODE ENDS-->

Something to note about the ConcurrentBag collection is that it is unordered: items come out in no guaranteed order, regardless of the order they were added.
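To see what that means in practice, here's a small standalone sketch (independent of the CRM SDK) showing items added from a parallel loop coming back in no particular order:

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

class BagOrderDemo
{
    static void Main()
    {
        var bag = new ConcurrentBag<int>();

        //Add 0..9 to the bag from multiple threads at once
        Parallel.ForEach(Enumerable.Range(0, 10), i => bag.Add(i));

        //All ten items are present, but the enumeration order
        //is not guaranteed to be 0, 1, 2, ...
        Console.WriteLine(string.Join(", ", bag));
    }
}
```

If the display order of your records matters (for example, notes sorted by date), sort them after retrieval rather than relying on the order the requests were added.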

Here's the rest of the code to load the request collection and execute it:

<!--CODE BEGINS-->

//Retrieve records from the backend system (or the source organization, in the case of migration) and load them into the contactsToCreate list 

//This is where the magic happens! 

Parallel.ForEach(contactsToCreate, (contact) => 

{ 

Entity newContact = new Entity("contact"); 

newContact["firstname"] = contact["firstname"]; 

//set other attributes 

CreateRequest create = new CreateRequest { Target = newContact }; 

createRequests.Add(create); 

}); 

if (createRequests.Count > 0) { 

var request = new ExecuteMultipleRequest() 

{ 

Settings = new ExecuteMultipleSettings() 

{ 

//Set ExecuteMultiple settings: continue past individual 
//failures, and skip returning responses to cut payload size 
ContinueOnError = true, 
ReturnResponses = false 

}, 

Requests = new OrganizationRequestCollection() 

}; 

request.Requests.AddRange(createRequests); 

//crmServiceClient is your connected CrmServiceClient instance 
ExecuteMultipleResponse response = (ExecuteMultipleResponse)crmServiceClient.Execute(request); 

} 

<!--CODE ENDS-->

And that's all there is to it!  The combination of ConcurrentBag and Parallel.ForEach is really the meat of this pattern.  It allows multiple entities and requests to be instantiated and added to the request collection at the same time.
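One caveat worth noting: the platform limits how many requests a single ExecuteMultiple call may contain (1,000 by default for Dynamics 365 online), so for datasets of 10,000+ records you'll need to send the collection in batches. Here's a sketch, assuming the `createRequests` bag from above and `crmServiceClient` as your connected service instance:

```csharp
//Send the requests in batches of up to 1,000, the default
//ExecuteMultiple batch limit for Dynamics 365 online
const int batchSize = 1000;
var allRequests = createRequests.ToList();

for (int i = 0; i < allRequests.Count; i += batchSize)
{
    var batch = new ExecuteMultipleRequest
    {
        Settings = new ExecuteMultipleSettings
        {
            ContinueOnError = true,
            ReturnResponses = false
        },
        Requests = new OrganizationRequestCollection()
    };

    //Take the next slice of up to batchSize requests
    foreach (var req in allRequests.Skip(i).Take(batchSize))
    {
        batch.Requests.Add(req);
    }

    var response = (ExecuteMultipleResponse)crmServiceClient.Execute(batch);
}
```

Batching also keeps any single request payload from growing so large that it hits message size or timeout limits.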

The nicest part of this pattern is that it has multiple applications.  Use it in your Azure WebJob, Function, or CRM plugin to integrate with another system.  Use it in a console app to migrate data quickly from one organization to another.  Anywhere you need to perform request operations (creates, updates, or deletes) on a large dataset, you can use this coding pattern to improve performance.
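For example, the same loop handles bulk updates by swapping CreateRequest for UpdateRequest. A sketch, where `contactsToUpdate` is a hypothetical list of existing contact records with their IDs populated:

```csharp
var updateRequests = new ConcurrentBag<OrganizationRequest>();

Parallel.ForEach(contactsToUpdate, (contact) =>
{
    //The entity must carry the Id of the existing record to update
    Entity updatedContact = new Entity("contact", contact.Id);
    updatedContact["jobtitle"] = contact["jobtitle"]; //set attributes to change

    updateRequests.Add(new UpdateRequest { Target = updatedContact });
});
```

From there, the ExecuteMultiple call is identical to the create example above.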

To learn more about our Microsoft consulting and implementation services, contact us today.