-
-
Notifications
You must be signed in to change notification settings - Fork 510
in memory cache verify password implementation #839
base: develop
Are you sure you want to change the base?
Conversation
It look that the CI errors does not come from my PR |
Couple things:
|
Hi jwag956,
I read (not deeply) around the web, that the slow computation of bcrypt is something good. Maybe it is to prevent brute force or another reason I don't know yet, but in our case, the check needs to be preformed once only (every TTL) as each next call is only an exact recomputation of what just have been computed. Let me know if I am wrong in any way.
|
I will try to do an implementation using LRU cache, I am interested in writing less code for the same thing :) |
Thanks - I did some tests as well as yes- bcrypt/verify_password is really slow. The only place we need to worry about brute force is for actual authentication i.e. when the user has supplied a plaintext password. For normal APIs where a token is passed looking at https://pythonhosted.org/itsdangerous/ I think that is pretty immune to brute-force attacks (assuming you have a good SECRET_KEY) I did observe that newer versions of itsdangerous provide for signing of the tokens - much like oauth2 - which could mean we just have to verify correct-signed token rather than token contents - which would be fast. But that is for another day. As with anything security related - always good to get a few sets of eyes on the changes. |
I just tried to do it using https://github.com/arnov/lru-ttl/blob/master/lruttl/lru.py. This library just not work (or I have a problem) The main problem to deal with in this problem is to clear cache while flask-security has no schedule system. The implementation I done takes care of cleaning values in cache each ttl duration (on a call to the cache system -> when auth stack is called). As there is no specific daemon setup / thread doing the clean, just feeding a cache , even with cache library requires some clean at one moment. There is no magic and LRUCache or cachetools will need some code to do the clean. So I think there is not too much code in my implementation in the end. |
So, what about merging ? |
I was thinking that you would use cachetools:TTLCache as described here: https://cachetools.readthedocs.io/en/latest/ |
Just to be clear - I am not trying to cause a huge amount more work, and I am not currently a maintainer/approver - just someone trying to help. I do think what you are trying to do is useful and clearly this has been an issue for many folks. |
I appreciate your help and your advices jwag956 :) |
I refactored the cache class code using cachetools. This is a cool library I didn't know. It works like charm now and has very fewer code. Thank you jwag956 ! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
- unit tests
- addition to configuration.rst
flask_security/cache.py
Outdated
|
||
def set_cache(self, token, user): | ||
"""When a password is checked, then result is put in cache.""" | ||
key_hash = self.get_hash(token, user) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I think this works - I would be more inclined to use user.id (which is guaranteed to be unique) as the key and store the token as the value - then you can just compare, and no need for md5 or anything..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok for that, I'll do it.
from cachetools import TTLCache | ||
|
||
|
||
class VerifyHashCache: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think to be 2.7 compatible this will need to be: VerifyHashCache(object):
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The drawbacks of not doing so is that some class related feature are not available with a class declared this way and some of these points are described here https://stackoverflow.com/questions/4015417/python-class-inherits-object
The cache object I made does not uses any of these feature, so this is not required, but I guess for the sake of python the inheritance should be written.
On the other hand 2020 is comming very fast :)
flask_security/core.py
Outdated
if cache is None: | ||
cache = VerifyHashCache() | ||
local_cache.verify_hash_cache = cache | ||
has_cache = cache.has_verify_hash_cache(token, user) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: this is fairly convoluted logic just to avoid replicating a line or 2 of code. And you aren't handing the 'not user' case.
has_cache = cache.has_verify_hash_cache(token, user) | |
if not user: | |
return _security.login_manager.anonymous_user() | |
if use_cache: | |
cache = getattr(local_cache, 'verify_hash_cache', None) | |
if cache is None: | |
cache = VerifyHashCache() | |
local_cache.verify_hash_cache = cache | |
if cache.has_verify_hash_cache(token, user): | |
return user | |
if verify_hashdata[1], user.password): | |
cache.set_cache(token, user) | |
return user | |
else: | |
if verify_hashdata[1], user.password): | |
return user | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I implented this at first glance, but did not change it until you notice it. It is now changed as you suggested and it is indeed clearer the way you suggested. Thank you for this too !
Hi, |
@tyacbovi I am not sure tu understand your proposal. |
If that "bad" user will keep on attempting to gain access, it will waste resources and time by running verify_hash many times so if that result was cached it could be partially mitigated |
9829796
to
08a20f8
Compare
I think all is ok now. |
@tyacbovi Ok , I understand your point now, but I still think this is not usefull to do this as the primary goal of this implementation is to make faster subsequent calls to authenticated flask routes when the first one is ok and set the cache accordingly. Functionally, a user that attempts a bad authentication call is not expected and looping over a bad authentication that lead to 401 response is not important to me to be fast as the logic of authentication is to try to send other credentials when one fail if the user really wants to be authenticated. If a user loops authentication it means for me that the authentication process client side is wrong or deliberately trying many combination for a bruteforce attack or so. And in this case, I DO want the check passord checking to be slow in order to prevent the attacker to perform a lot of queries quickly. Moreover, Is there any case I miss here ? |
No, I pretty much agree with your points and completely agree that bruteforce protection is not supposed to be done in the application level, thanks for the detailed response. Can not wait for this to be merged and released :) |
Hey maintainers, the feature is ok now, but CI crashes because of some other issues. It would be cool to merge it when possible 👍 |
Need some unit tests... please! you don't need to test if cachetools works - but that in fact you properly call it to initialize and properly handle positive and negative cases... mocks probably needed! |
641c6d9
to
93fb4ed
Compare
Ok, I did unit tests few days ago, but test folder is gitignored, So I missed to add my test file to the commit. This is now done. |
Being overly pedantic here - but could you add a couple tests that verify the code in core.py::_request_loader that verifies if new config variable that appropriate calls to new caching module are made, etc? Thanks! |
I lack time these days. Next week I should be able to do this :) . |
Thanks. I have a question for you - why are you not using Flask Sessions (which are built on Werkzueg sessions) - which are secure cookies. If you do - there is no extra bcrypt step - that only happens when using token authentication. Could you explain your basic architecture a bit? |
Hey @jwag956 ! I will try a short answer as I am not sure to understand all of your question. However, I am not sure to understand why / how Werkzeug sessions would improve my commit. More precisely, I don't mind what "there is no extra bcrypt step" means in this context. The architecture I implemented was first to solve the issue of long time computation of password verification for flask routes that are protected with the flask-security token validation system. Now, you takes care of the quality of my work, this is cool as I am rarely (qualitatively) mentored, so beside trying to make my work merged, I feel some sort of community entertainement ;) I share you this point of view because the way I contribute to open source project is often like I descibed above. This is time consuming, and in case my work looks interesting the community expects very soon a sort of very (very) high quality for the code I share. This is a good thing for projects I think, but this is really hard for me and surely for many other that want to contribute and may feel not strong enought (this is a feeling I have). Well, enough philosophy for now. Please keep in mind, my initial goal was simply to fix a performance issue with the simplest code I could produce (with my current competences). Finally, this was not a quick answer. |
And for the extra tests that would cover the whole feature, I will try to find some time as soon as possible to enhence my commit. |
Thanks for your comments. I agree that with all the configuration variables and possible ways to use flask-security there isn't a lot of documentation giving 'options'. Ok - let me see if I can describe sessions (which are an integral part of Flask and the underlying Werkzeug). If you are accessing your application from a UI then the best and most secure thing is to use cookies which define a session. From what I can tell - this is the default and you should see a cookie being returned when you log in through flask-security - that cookie is called 'session' by default. Your browser of course will send that cookie on every request to your backend, and if you guard your API endpoints with either @login_required or @auth_required('token', 'session') then Flask will take the cookie and valid the session and no expensive passwd hashing will be done. Thus if you use sessions (which I do in my app) then there is no performance issue and so your commit while useful may not actually be needed by you! This is much more secure than having your UI store the authentication_token and send it on every ajax request. Now - your situation might be completely different and this might not apply - that's why I was asking the question. I am not trying to dissuade you from continuing your work on the cache - just wanting to clarify your use case. |
Ok, you're right again. The fact is that I started develop an app using jwt for auth layer so I met the issue I try to solve there.
I guess if my commit is not merged, I will use my fork for my current project for now, and think different for the next one. Please don't be upset, all what you told me already made me grown :) Edit : And oh yes, I may have forgot to (re)mention that my initial intention was to make it simpler to use flask-security with a SPA and where using jwt makes it a lot simpler and faster to implement both client and serverside security layer using flask-security (on my point of view), and now even more with my sub project https://github.com/eregnier/Flask_AuthOOB that is just born, maybe ready to die following your advices but almost ready to use anyway. The initial issue I met is that I am bored to always reimplement auth mechanism for my apps using flask. |
So to be clear - today, Flask-Security has a real shortcoming when using tokens for auth due to the performance issue. A fix there is critical since there are many real use cases for token auth. While I think there are potentially other ways to fix this - a cache seems fine and I would be happy to merge it into my fork. For that to happen I would like to see (as I stated before) a few more unit tests for the code you added in _request_loader. As for the larger conversation - that's just exchanging ideas and concepts to help build better applications. I too started an SPA using tokens - I then realized that flask by default is already sending a cookie and that I didn't need to do ANYTHING in my code (I was grabbing the token, storing it in web browser etc). While we all have different levels of risk/security - it's also useful I think to help steer people to current best practices (another doc I hope to write). |
Missing unit test are comming. I am really busy these days. |
Hey @jwag956 , |
I have cherry-picked your PR into my fork - and made a few minor cosmetic changes. All tests pass - please look at: and I will merge it into my fork. |
Have to fix py_webauthn until they fix user_handle at 1.9. djlint now finds lots more issues - fix/ignore those. One was the missing </a> tags reported. closes pallets-eco#830
As bcrypt algorithm computations are expensives and long, I implemented a simple cache system allowing to bypass token validation on each requests when the same validation is already in cache.
The result is a combination of the token and the user passwords. Documentation in the code describes more in detail the implementation.
It's usage is optionnal when the USE_VERIFY_PASSWORD_CACHE is set.
I would like some maintainer of the repository validate the code before I go further by adding options in documentation and implementing unit tests. I would be pleased to do this if this current commit is accepted.