[Not to be merged] POC for testing performance of road network disbursement in bulk using sequential approach #35597
Hey @kaapstorm @zandre-eng, I'd like to get your thoughts on the proposed approach of using child Celery tasks (chain or chord) to ensure we don't exceed the Mapbox limit of 60 requests/minute. Note that this pertains only to the Road Network Disbursement algorithm, which uses the Mapbox Directions Matrix API. I have one concern about the Celery approach (1st point below) and a couple of additional points to discuss (numbered for easier reference).
Looking forward to hearing your thoughts/suggestions.
Hi @ajeety4
I agree with your concern here. I think my idea of using Celery chain or chord is over-complicating this, and we should rather use a simple loop.
About cluster size: The original intention is that a cluster has 25 locations or fewer, because the cluster size is set by the matrix size. If we want a complete matrix of more than 25 locations, it will require a much higher number of API calls. For example, imagine the Matrix API had a limit of 4 locations per request, and you wanted to build a matrix of 8 locations (a, b, c, d, e, f, g and h). You could put a, b, c, d in chunk 1; a, b, e, f in chunk 2; a, b, g, h in chunk 3; c, d, e, f in chunk 4; c, d, g, h in chunk 5; and finally e, f, g, h in chunk 6.
So if we double* the number of locations in a cluster, we need 6 times the number of API calls.
Squeezing more locations into a chunk: One approach would be to detect when workers are very nearby and, instead of submitting the location of each of them, submit their midpoint. The same approach could be used for cases. I think this would only be workable for a small value of "very nearby".
*double: Double an even number of locations. For a chunk size of 25 locations, we could build a matrix of up to 48 locations (24 per chunk) using 6 API calls.
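The chunking scheme in the example above can be sketched in a few lines of Python. This is only an illustration of the idea, not code from this PR: the function name `pairwise_chunks` and its parameters are hypothetical. It splits the locations into groups of half the chunk size and emits one chunk per pair of groups, so every pair of locations ends up together in at least one chunk.

```python
from itertools import combinations


def pairwise_chunks(locations, chunk_size):
    """Return chunks such that every pair of locations shares a chunk.

    Hypothetical helper: locations are split into groups of
    chunk_size // 2, and each pair of groups forms one chunk.
    """
    half = chunk_size // 2
    if half < 1:
        raise ValueError("chunk_size must be at least 2")
    groups = [locations[i:i + half] for i in range(0, len(locations), half)]
    if len(groups) <= 1:
        # Everything fits in a single request.
        return [list(group) for group in groups]
    return [list(a) + list(b) for a, b in combinations(groups, 2)]


# The 8-location example with a limit of 4 locations per request.
for number, chunk in enumerate(pairwise_chunks(list("abcdefgh"), 4), start=1):
    print(number, chunk)
```

With 4 groups there are 6 pairs of groups, which is where the factor of 6 comes from. The same call with 48 locations and a chunk size of 25 also yields 6 chunks of 24 locations each.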
Thanks @kaapstorm.
Sounds perfect. I will use the simple loop approach for the bulk testing.
Noted. This does make sense. We might need to use a variation of standard k-means to achieve more balanced cluster sizes.
Indeed. Just a small note on the example mentioned: for our API call, the x and y of the matrix would be mobile workers and cases respectively. I think the ratio of x to y would determine how much the number of API calls increases. It will increase, though.
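To make the point about the worker-to-case ratio concrete, here is a rough estimate of the request count for a workers-by-cases matrix. It is a sketch under stated assumptions, not code from this PR: it assumes each request may list workers as sources and cases as destinations, that the combined number of locations per request is capped at `max_coords` (25, per the discussion above), and that we are free to choose the split between workers and cases per request. The function name `matrix_request_count` is hypothetical.

```python
import math


def matrix_request_count(num_workers, num_cases, max_coords=25):
    """Estimate the fewest requests needed for a workers x cases matrix.

    Hypothetical helper: tries every split of max_coords between
    workers and cases per request and keeps the cheapest one.
    """
    if num_workers == 0 or num_cases == 0:
        return 0
    best = None
    for workers_per_request in range(1, max_coords):
        cases_per_request = max_coords - workers_per_request
        calls = (
            math.ceil(num_workers / workers_per_request)
            * math.ceil(num_cases / cases_per_request)
        )
        if best is None or calls < best:
            best = calls
    return best


# Same ratio of workers to cases, twice as many of each.
print(matrix_request_count(5, 20))
print(matrix_request_count(10, 40))
```

Because only worker-to-case distances are needed, the growth is milder than in the all-pairs example, but the count still rises once the cluster no longer fits in a single request.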
Product Description
This PR will NOT BE MERGED. It is just for testing the performance of road network disbursement in bulk using a sequential approach.
It is intended only for a high-level review of, and discussion around, the approach.
Technical Summary
Ticket
Spec
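The road network disbursement algorithm uses the Mapbox Directions Matrix API to fetch the distance matrix. There are two main limitations for this API, both raised in the conversation on this PR: a rate limit of 60 requests per minute, and a limit of 25 locations per request.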
The current approach simply processes the chunks of each cluster sequentially.
Additionally, a Celery chain is not implemented at this point, based on some considerations. (To be added in a comment.)
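As a rough illustration of the sequential approach (not the code in this PR), the loop below handles one chunk at a time and spaces the calls so the rate stays near the 60 requests/minute limit. The names `run_sequentially` and `fetch_matrix` are hypothetical; `fetch_matrix` stands in for whatever function actually calls the Matrix API for a single chunk. The clock and sleep functions are parameters only so the pacing can be tested without real waiting.

```python
import time


def run_sequentially(chunks, fetch_matrix, max_per_minute=60,
                     clock=time.monotonic, sleep=time.sleep):
    """Call fetch_matrix(chunk) for each chunk, one after another.

    Hypothetical helper: consecutive calls start at least
    60 / max_per_minute seconds apart.
    """
    min_interval = 60.0 / max_per_minute
    results = []
    last_call = None
    for chunk in chunks:
        if last_call is not None:
            # Wait out the remainder of the interval, if any.
            wait = min_interval - (clock() - last_call)
            if wait > 0:
                sleep(wait)
        last_call = clock()
        results.append(fetch_matrix(chunk))
    return results


# Usage with a stand-in for the real API call.
print(run_sequentially([["a", "b"], ["c", "d"]], lambda chunk: len(chunk),
                       max_per_minute=600))
```

Even spacing is the simplest way to pace the calls. It trades some throughput for predictability, since it never sends a burst even when the per-minute budget would allow one.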
Feature Flag
Geospatial