-
Notifications
You must be signed in to change notification settings - Fork 205
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature request: SortedDict Range Deletion #180
Comments
Here's another hacky way of doing it. Not sure if it is better in terms of performance. But still, an API would be nice.
|
Having a kind of range_delete makes sense to me. There’s a variety of optimizations that could be done inside the data structure too. Note that today you can use irange instead of bisect to avoid using the position index. |
Here's the best I can come up with today (without internal optimizations): $ python -m IPython
Python 3.9.1 (v3.9.1:1e5d33e9b9, Dec 7 2020, 12:10:52)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.29.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: import sortedcontainers
In [2]: class SortedDict(sortedcontainers.SortedDict):
...: def range_del(self, minimum=None, maximum=None, inclusive=(True, True)):
...: iterator = self.irange(minimum, maximum, inclusive)
...: keys = list(iterator)
...: for key in keys:
...: del self[key]
...:
In [3]: sd = SortedDict({'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5})
In [4]: sd
Out[4]: SortedDict({'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5})
In [5]: sd.range_del('b', 'd')
In [6]: sd
Out[6]: SortedDict({'a': 1, 'e': 5}) |
This is great in terms of simplicity. But I would still urge you to take a look at the hack implementation I wrote above. The main idea is that slice deletion for SortedList is already internally optimized to O(log n), which is used for keys, so why not just use it as it is and then take care of the dictionary deletion one by one. That way, the time complexity is O(k + log n) instead of O(k log n), which is inline with Sorted Maps in Java and C++ and other languages. |
The performance of the two will differ depending on the size of the dictionary and how much is deleted. But I generally agree that an optimized version would be a good addition to the API. The best improvement to your version would be to avoid building the positional index. If you like at the implementation of SortedList.irange, you can see how to do so. |
But how would I achieve slice deletion if I don't have the positional indices? |
Hello, Best regards |
Range key deletion feature is currently missing for
SortedDict
. If I want to delete multiple keys within an range, I have to do them one by one using a for loop which is very inefficient in terms of time complexity. See my example below.This is very inefficient both in terms of usage and in terms of time complexity. Time complexity for this one is O(k log n) where k is the range size. Other languages such as C++ and Java only needs an time complexity of O(k + log n) to accomplish the job with the API mentioned below.
In C++, you have the option of calling
erase(iterator first, iterator last)
for theirmap
data structure. Similarly, in Java, you can callsubMap(first, last).clear()
for theirTreeMap
data structure. Something similar should be available forSortedDict
in my opinion.The text was updated successfully, but these errors were encountered: