
[Not to be merged] POC for testing performance of road network disbursement in bulk using sequential approach #35597

Draft

Conversation


@ajeety4 ajeety4 commented Jan 9, 2025

Product Description

This PR will NOT BE MERGED. It is just for testing the performance of road network disbursement in bulk using a sequential approach.

The goal is to get a high-level review of the approach and have a discussion around it.

Technical Summary

Ticket
Spec

The road network disbursement algorithm uses the Mapbox Directions Matrix API to fetch the distance matrix. There are two main limitations of this API:

  • A maximum of 25 location coordinates (users + cases) allowed in a single API request
  • A maximum of 60 requests/minute

The current approach simply uses sequential chunking for each cluster:

  • Limit the maximum number of mobile workers to 24 for road network disbursement (24 because we need at least one case to calculate a distance)
  • Break cases down into chunks of size (25 - mobile worker count)
  • Fetch the distance matrix via the API for each chunk
  • Ensure that each chunk takes at least 1 sec to run (using time.sleep)
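The steps above can be sketched roughly as follows. This is only a sketch: `fetch_matrix` is a stand-in for the actual Mapbox Directions Matrix API call, and the names are illustrative rather than taken from the PR.

```python
import time

MAX_COORDS_PER_REQUEST = 25   # Mapbox Matrix API coordinate limit
MAX_REQUESTS_PER_MIN = 60     # Mapbox Matrix API rate limit

def fetch_distances_sequentially(workers, cases, fetch_matrix):
    """Fetch distances for one cluster, chunking cases so each request
    stays within the 25-coordinate limit and the overall rate stays
    under 60 requests/minute.

    `fetch_matrix(workers, case_chunk)` stands in for the real
    Mapbox Directions Matrix API call.
    """
    if len(workers) > MAX_COORDS_PER_REQUEST - 1:
        raise ValueError("Need at least one coordinate slot left for a case")
    chunk_size = MAX_COORDS_PER_REQUEST - len(workers)
    results = []
    for start in range(0, len(cases), chunk_size):
        began = time.monotonic()
        results.append(fetch_matrix(workers, cases[start:start + chunk_size]))
        # Ensure each iteration takes at least 1 sec so we never exceed
        # 60 requests/minute.
        elapsed = time.monotonic() - began
        if elapsed < 1:
            time.sleep(1 - elapsed)
    return results
```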

Additionally, a celery chain is not implemented at this point, based on some considerations. (To be added in a comment)

Feature Flag

Geospatial

@dimagimon dimagimon added the dependencies Pull requests that update a dependency file label Jan 9, 2025

ajeety4 commented Jan 9, 2025

Hey @kaapstorm @zandre-eng ,

I wanted to have a discussion and get your thoughts on the proposed approach of using child celery tasks (chain or chord) to ensure we don't exceed the Mapbox limit of 60 requests/minute. Note that this pertains only to the Road Network Disbursement algorithm, which uses the Mapbox Directions Matrix API.
During the POC implementation, I realized that, along with the above limit, the API also has a maximum of 25 location coordinates (users + cases) per request.

I had one concern about the celery approach (1st point below) and a couple of additional points to discuss (numbered for easier reference).

  1. High number of short-acting Celery tasks: As a maximum of only 25 coordinates is allowed in a single API request, we would need to split cases and mobile workers into chunks of 25. Since the intention of using a celery chain is to perform each API hit in sequence and ensure that each takes at least 1 sec to execute, this means we would have 1 task per API hit. (The same is true for a chord, which follows a parallel approach by making groups of 60 tasks and waiting out the remainder of a minute.)

    For 100k cases, this amounts to at least 4,000 tasks, and it increases to 40,000 for 1 million cases. Note that each celery task itself would be quick, running for about 1 sec. (The API usually returns a response within a sec.)

    I don't think it is worth creating such a high number of child celery tasks, considering the overhead of spawning a new task for such a short-acting job and the possibility of issues from celery (celery might not be 100 percent reliable at times).

    What do you think of the above concern?
    The alternative is to simply not create child tasks and follow sequential chunking as implemented currently in this PR.

  2. Handling more than 25 workers in a cluster: This is possibly out of scope for this ticket, but it is a good point to discuss now. We would also need to think about how to handle the case where there are more than 25 workers in a cluster.
    This may not happen if we select a relatively small cluster size and mobile workers are evenly distributed with a higher ratio of cases to mobile workers. However, this is not guaranteed, since the distribution can vary in practice, and we may choose a relatively larger cluster size for bulk disbursement.

    Suggestion: As we want to find distances across all workers and cases, we could update the chunking logic to ensure that it loops through cases for all mobile workers in the cluster. I imagine this would not be too difficult.

    Any thoughts on the above scenario and the suggested approach?

  3. Local Testing Results
    The cluster disbursement in my local testing is quite performant: 2.5 s for 300k locations with 30 clusters.
    Additionally, the Mapbox API call returned a response within 1 sec every time (tested over 400 calls). If we decide to use the celery approach for child tasks, I predict there would not be much difference between a chain (sequential) and a chord (parallel), as we would effectively be making 60 API calls per minute either way. This assumes the Mapbox API stays consistent in its performance as we scale; otherwise, the parallel approach would be faster.

    Another interesting thing to keep in mind: since we are limited to 60 calls/min and 25 locations per call, 100k locations would take 100k/(25*60) = ~67 mins.
    I think we should probably have some maximum limit on the total cases for the road disbursement algorithm.
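The ~67-minute back-of-envelope figure above can be captured in a small helper (names and defaults are illustrative, not code from this PR):

```python
import math

def estimated_minutes(n_locations, coords_per_call=25, calls_per_minute=60):
    """Rough lower bound on wall-clock time for bulk road network
    disbursement, given the Mapbox limits of 25 coordinates per call
    and 60 calls per minute."""
    calls = math.ceil(n_locations / coords_per_call)
    return calls / calls_per_minute

# estimated_minutes(100_000) -> ~67 minutes, matching the figure above.
```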

Looking forward to hearing your thoughts/suggestions.

@kaapstorm

Hi @ajeety4

High number of short acting Celery tasks

I agree with your concern here. I think my idea of using Celery chain or chord is over-complicating this, and we should rather use a simple loop.

Handling more than 25 workers in a cluster

About cluster size: The original intention is that a cluster size is 25 or less, because the cluster size is set by the matrix size.

If we want a complete matrix of more than 25, it will require a much higher number of API calls.

For example, imagine the Matrix API had a limit of 4 locations per request, and you wanted to build a matrix of 8 locations (a, b, c, d, e, f, g and h). Then you could put a, b, c, d in chunk 1; a, b, e, f in chunk 2; a, b, g, h in chunk 3; c, d, e, f in chunk 4; c, d, g, h in chunk 5; and finally e, f, g, h in chunk 6:

  a b c d   e f g h
a 1 1 1 1   2 2 3 3
b 1 1 1 1   2 2 3 3
c 1 1 1 1   4 4 5 5
d 1 1 1 1   4 4 5 5

e 2 2 4 4   6 6 6 6
f 2 2 4 4   6 6 6 6
g 3 3 5 5   6 6 6 6
h 3 3 5 5   6 6 6 6

So if we double* the number of locations in a cluster, we need 6 times the number of API calls.
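That tiling can be generated by splitting the locations into half-chunk-sized blocks and taking every pair of blocks. This is a sketch of the idea only (`full_matrix_chunks` is a hypothetical helper, not code from this PR):

```python
from itertools import combinations

def full_matrix_chunks(locations, chunk_size):
    """Cover the full N x N distance matrix with chunks of at most
    `chunk_size` locations, by pairing half-sized blocks.

    Mirrors the a..h example above: 8 locations with chunk_size 4
    give 4 blocks of 2, and C(4, 2) = 6 chunks.
    """
    half = chunk_size // 2
    blocks = [locations[i:i + half] for i in range(0, len(locations), half)]
    if len(blocks) == 1:
        return [blocks[0]]
    # Every pair of locations shares at least one chunk, because the
    # two blocks containing them appear together in some pairing.
    return [b1 + b2 for b1, b2 in combinations(blocks, 2)]
```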

Squeezing more locations into a chunk: One approach would be to detect when workers are very nearby, and instead of submitting the location of each of them, to submit their midpoint instead. The same approach could be used for cases. I think this would only be workable for a small value of "very nearby".
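A rough sketch of that midpoint idea (a hypothetical helper, using a naive coordinate tolerance for "very nearby" rather than true geodesic distance, and omitting the bookkeeping needed to map results back to individual workers):

```python
def merge_nearby(points, tolerance):
    """Greedily replace groups of (lat, lon) points that fall within
    `tolerance` degrees of each other with their midpoint, so more
    distinct locations fit into one Matrix API chunk."""
    merged = []  # list of (sum_lat, sum_lon, count) accumulators
    for lat, lon in points:
        for i, (slat, slon, n) in enumerate(merged):
            # Compare against the running midpoint of the group.
            if abs(slat / n - lat) <= tolerance and abs(slon / n - lon) <= tolerance:
                merged[i] = (slat + lat, slon + lon, n + 1)
                break
        else:
            merged.append((lat, lon, 1))
    return [(slat / n, slon / n) for slat, slon, n in merged]
```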

The cluster disbursement for my local testing is quite performant

YAY!


*double: Double an even number of locations. For a chunk size of 25 locations, we could build a matrix of up to 48 locations (24 per chunk) using 6 API calls.


ajeety4 commented Jan 10, 2025

Thanks @kaapstorm .

High number of short acting Celery tasks

I agree with your concern here. I think my idea of using Celery chain or chord is over-complicating this, and we should rather use a simple loop.

Sounds perfect. I will use the simple loop approach for the bulk testing.

About cluster size: The original intention is that a cluster size is 25 or less, because the cluster size is set by the matrix size.

Noted. This does make sense. We might need to use a variation of standard k-means to achieve a more balanced cluster size.
It would be interesting to see the performance results in the upcoming algorithm-testing work. (I will link this discussion in case it is beneficial.)

If we want a complete matrix of more than 25, it will require a much higher number of API calls.

Indeed. Just a small note on the example mentioned: for our API call, the x and y axes of the matrix would be mobile workers and cases respectively. I think the ratio of x to y would determine the extent to which the API call count increases. It will increase though.
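To illustrate how that worker-to-case ratio drives the call count under the current chunking (workers as one axis, cases chunked along the other), a small estimate could look like this (an illustrative helper, not code from this PR):

```python
import math

def api_calls_needed(n_workers, n_cases, max_coords=25):
    """Matrix API calls needed for one cluster when every request
    carries all workers plus a chunk of cases, within the
    25-coordinate limit."""
    cases_per_call = max_coords - n_workers
    if cases_per_call < 1:
        raise ValueError("Too many workers to fit a case in one request")
    return math.ceil(n_cases / cases_per_call)

# More workers per cluster leaves less room for cases per call,
# e.g. 5 workers need far fewer calls than 20 for the same 100 cases.
```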
