[Enhancement]: Use indexes for Mongo collections #8
Labels: approved, enhancement
What enhancement would you like to see?
When a collection has no index that supports a query, Mongo will perform a full COLLSCAN on the collection, scanning all documents to perform the query. This can very easily eat all available resources. When there are many documents, and many scans of those documents, Mongo can easily max out CPU usage and remain there. When working with very large documents (such as those which contain files), this can also easily max out memory usage.
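To make the cost difference concrete, here is a toy in-memory model of a full scan versus an indexed lookup. It is only a sketch: the documents are made up, the field names are borrowed from the indexes proposed in this issue, and a plain dict stands in for Mongo's real on-disk B-tree indexes.

```python
from collections import defaultdict

# Toy "collection": 10,000 made-up documents (not real BOSS data).
documents = [
    {"task_id": f"task{i % 100}", "boss_app_id": f"app{i % 10}", "name": f"file{i}"}
    for i in range(10_000)
]


def collection_scan(docs, task_id, boss_app_id):
    """Examine every document, as a query with no usable index must."""
    examined = 0
    matches = []
    for doc in docs:
        examined += 1
        if doc["task_id"] == task_id and doc["boss_app_id"] == boss_app_id:
            matches.append(doc)
    return matches, examined


def build_index(docs, fields):
    """Map each tuple of field values to the positions of the matching documents."""
    index = defaultdict(list)
    for position, doc in enumerate(docs):
        index[tuple(doc[field] for field in fields)].append(position)
    return index


def index_lookup(docs, index, key):
    """Examine only the documents the index points at."""
    positions = index.get(key, [])
    return [docs[p] for p in positions], len(positions)


scan_matches, scan_examined = collection_scan(documents, "task7", "app7")
index = build_index(documents, ("task_id", "boss_app_id"))
index_matches, index_examined = index_lookup(documents, index, ("task7", "app7"))
```

Both approaches return the same documents, but the scan touches every document in the collection while the lookup touches only the matches. On a real deployment, the query's explain output shows the same distinction.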
Any other details to share? (OPTIONAL)
Currently our BOSS server is doing a COLLSCAN over a 7 GB+ collection, which has 880,975 documents, over a million times per day. This is locking up all system resources.

I propose we add the following indexes:
Task
- task_id and boss_app_id, as that is the combination most often queried by

File
- task_id and boss_app_id, as that is the combination most often queried by
- name, as we sometimes query by task_id, boss_app_id, and name. This will use index intersection
- data_id, as we also often query by just this field

CECData
- creator_pid and game_id, as that is the combination most often queried by
- latest_data_id, as we also often query by just this field

CECSlot
- creator_pid and game_id, as that is the combination most often queried by

It should be noted that indexes do not come for free, nor does index intersection. Indexes are stored on disk by Mongo and will increase our storage usage. Index intersection also has some overhead compared to regular indexed queries, but it should be better than a full COLLSCAN. We CAN make multiple compound indexes using the same fields, but this creates duplicate indexes on disk, which again increases storage costs.
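As a rough sketch, the proposed indexes can be written down as plain key lists, in the (field, direction) shape that drivers such as PyMongo accept when creating an index. Two things here are assumptions rather than part of the proposal: each "X and Y" pair is read as one compound index, and every field is ascending. The helper functions are hypothetical and only illustrate the prefix rule for compound indexes; they are not Mongo's query planner.

```python
# Proposed indexes from this issue, as lists of (field, direction) pairs.
# Assumptions: each "X and Y" pair is one compound index, and every field is
# ascending (1); the issue does not state a direction.
PROPOSED_INDEXES = {
    "Task": [
        [("task_id", 1), ("boss_app_id", 1)],
    ],
    "File": [
        [("task_id", 1), ("boss_app_id", 1)],
        [("name", 1)],
        [("data_id", 1)],
    ],
    "CECData": [
        [("creator_pid", 1), ("game_id", 1)],
        [("latest_data_id", 1)],
    ],
    "CECSlot": [
        [("creator_pid", 1), ("game_id", 1)],
    ],
}


def usable_prefix(index_spec, query_fields):
    """Return the leading index fields covered by an equality query."""
    prefix = []
    for field, _direction in index_spec:
        if field not in query_fields:
            break  # a compound index is only usable from its first field onward
        prefix.append(field)
    return prefix


def candidate_indexes(collection, query_fields):
    """Return (index_fields, usable_prefix) for each proposed index the query could use."""
    candidates = []
    for spec in PROPOSED_INDEXES[collection]:
        prefix = usable_prefix(spec, query_fields)
        if prefix:
            candidates.append(([field for field, _ in spec], prefix))
    return candidates
```

For example, `candidate_indexes("File", {"task_id", "boss_app_id", "name"})` reports two candidates, the compound index and the single-field name index. That is the case where the proposal relies on index intersection instead of a third, overlapping compound index. A query on boss_app_id alone gets no candidate from the Task index, because boss_app_id is not a leading field.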