I have a question about threading data service calls within a foreach loop. Here it goes:
Say you need to request data from a service and then process that data. For this example, let’s say each data request takes 2 seconds and processing the data takes another second. If you had to process a list of 10000 elements, how would you do it?
The thing I tried was:
var d_tasks = new Dictionary<int, Task<DataResponse>>();
for (int i = 0; i < listOfElements.Count; i++)
{
    if (i == 0)
    {
        // create and run task for first element, save it for later use in d_tasks
    }
    if (i < listOfElements.Count - 1)
    {
        // create and run task for next element, save it for later use in d_tasks
    }
    if (i > 0)
    {
        // clean task data from previous element (clean d_tasks)
    }
    // block until the current element's data has arrived, then process it
    d_tasks[listOfElements[i]].Wait();
    DoWorkWithData(d_tasks[listOfElements[i]].Result);
}
This way I was able to reduce the time spent in the loop, since I could use the latency of the service call to process data.
So here is the final question: is this OK? Am I forgetting something? Is there an established pattern for this kind of situation?
Any help is appreciated.
Not sure I understand your design. You seem to be calling Wait on each and every task right after you start it. I would think you’d have two loops (one to create and start the tasks, another to go back over the results and process them). Does your single loop approach really increase parallelism?
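The two-loop idea could look roughly like the sketch below. RequestDataAsync and the body of DoWorkWithData are hypothetical stand-ins (the original post does not show them), and the short delay only simulates the 2-second service call. It also assumes the service tolerates all requests being in flight at once.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

public static class TwoLoopSketch
{
    // Hypothetical stand-in for the slow service request.
    public static async Task<string> RequestDataAsync(int element)
    {
        await Task.Delay(50); // simulated latency
        return "data-" + element;
    }

    // Hypothetical stand-in for DoWorkWithData.
    public static int DoWorkWithData(string data) => data.Length;

    public static List<int> Run(IReadOnlyList<int> listOfElements)
    {
        // Loop 1: start every request without waiting for any of them.
        var tasks = new List<Task<string>>();
        foreach (var element in listOfElements)
        {
            tasks.Add(RequestDataAsync(element));
        }

        // Loop 2: go back over the tasks in order and process each result.
        // Result blocks only until that particular task has completed.
        var results = new List<int>();
        foreach (var task in tasks)
        {
            results.Add(DoWorkWithData(task.Result));
        }
        return results;
    }
}
```

Blocking on Result is acceptable in a console program, but in UI or classic ASP.NET code it can deadlock, so awaiting the tasks is usually the safer choice there.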
Here’s another way to do it that takes advantage of the newish Parallel.ForEach. The nice thing here is we don’t need to maintain a list of tasks or results. The lock might not be needed, depending on whether DoWorkWithData is thread-safe. I’m guessing its purpose is to compile the 10,000 results into a common data structure, so it probably needs to be treated as a critical section.
var lockObject = new object();
Parallel.ForEach(listOfElements,
(element) => {
var result = DoExpensiveParallelWork(element);
lock(lockObject) { DoWorkWithData(result);}
});
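For anyone who wants to run this, here is a minimal self-contained version. The body of DoExpensiveParallelWork and the shared dictionary are assumptions made for illustration. The dictionary stands in for whatever common data structure DoWorkWithData writes to.

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelForEachSketch
{
    // Hypothetical stand-in for the expensive, blocking call.
    public static int DoExpensiveParallelWork(int element)
    {
        Thread.Sleep(10); // simulated latency
        return element * element;
    }

    public static Dictionary<int, int> Run(IEnumerable<int> listOfElements)
    {
        // Dictionary<TKey, TValue> is not safe for concurrent writes,
        // so every write happens inside the lock.
        var results = new Dictionary<int, int>();
        var lockObject = new object();

        Parallel.ForEach(listOfElements, element =>
        {
            var result = DoExpensiveParallelWork(element); // runs in parallel
            lock (lockObject) { results[element] = result; } // critical section
        });

        return results;
    }
}
```

Only the write to the shared dictionary is serialized. The expensive call itself stays outside the lock, so it still runs in parallel.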
You can use an async-await approach for requesting data from the service and Parallel.ForEach for processing the data.
Create an asynchronous method that sends the request to the service and returns the result asynchronously:
public async Task<DataResponse> LoadDataAsync(Element element)
{
// await service.GetResponceAsync....
}
var loadTasks = listOfElements.Select(element => LoadDataAsync(element)).ToList();
var responses = await Task.WhenAll(loadTasks);
// Now you can process all data in parallel way
Parallel.ForEach(responses, response => ProcessResponse(response));
The line listOfElements.Select(element => LoadDataAsync(element)).ToList() sends all the requests without waiting for any response, which makes the total response time for all requests close to the response time of a single request.
await Task.WhenAll(loadTasks); waits for all the responses, after which the results are processed in parallel.
For a large number of elements you can optimize this approach by processing each response as soon as it arrives. But then processing the data in parallel can be more difficult.
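One possible way to process each response as soon as it arrives is sketched below. The bodies of LoadDataAsync and ProcessResponse are hypothetical stand-ins, and the SemaphoreSlim throttle with its maxConcurrentRequests parameter is an addition of this sketch, not part of the answer above. It assumes you may not want all 10000 requests in flight at once.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class StreamingSketch
{
    // Hypothetical stand-in for the service request.
    public static async Task<int> LoadDataAsync(int element)
    {
        await Task.Delay(20); // simulated latency
        return element * 2;
    }

    // Hypothetical stand-in for the CPU-bound processing step.
    public static int ProcessResponse(int response) => response + 1;

    public static async Task<int[]> RunAsync(
        IEnumerable<int> listOfElements, int maxConcurrentRequests)
    {
        // Assumption of this sketch: cap the number of requests in flight.
        using (var throttle = new SemaphoreSlim(maxConcurrentRequests))
        {
            async Task<int> LoadAndProcessAsync(int element)
            {
                int response;
                await throttle.WaitAsync();
                try
                {
                    response = await LoadDataAsync(element);
                }
                finally
                {
                    throttle.Release(); // free the slot once the request is done
                }
                // Process this response on the thread pool right away,
                // without waiting for the other requests to finish.
                return await Task.Run(() => ProcessResponse(response));
            }

            var tasks = listOfElements
                .Select(element => LoadAndProcessAsync(element))
                .ToList();

            // Results come back in the same order as the input elements.
            return await Task.WhenAll(tasks);
        }
    }
}
```

Because each element has its own load-then-process task, processing overlaps with the requests that are still pending. If ProcessResponse writes to shared state, it still needs to be thread-safe.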