Remove RecordBase to speedup processing #57
Looks decent. Thanks for submitting. I need more time to review it more closely. What sort of performance gains do you observe?
The first commit gives about 10-15%, the second commit about 5%. The second commit moved the caching to the processor instance, so anyone who creates a new processor every time and has small FIT files may see some performance degradation. Anyway, maybe the scrubbing is too complicated and could be simplified? I don't know.
I've added |
Update: I previously ran the benchmarks on my Win32 Intel i5 2.5GHz machine. I've now tried it on our Linux x64 (Ubuntu Xenial) Intel Xeon 2.6GHz server, and it's 5x faster (all running Python 3.5.2). IMHO this cannot be caused just by the faster processor; there must be some performance problem with Python on Windows or on 32-bit architectures. Anyway, removing
Just about the commit: FitFileDataProcessor cache methods not just method names

    with FitFile(...) as ffile:
        pr = FitFileDataProcessor(ffile)
        for m in pr.get_messages(...):
            # m is processed

This way, we do not need any static caching; it's up to the user.
    return method

    scrubbed_method_name = scrub_method_name(method_name)
    try:
Can do method = getattr(self, scrubbed_method_name, None) to use None as the default if the attribute doesn't exist.
getattr is slow for large FIT files (e.g. 10-hour recordings). The purpose of this commit is to avoid it and speed up the processors.
I understand that. I'm suggesting using getattr(self, scrubbed_method_name, None) instead of the try/except for the initial caching.
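A minimal sketch of what this suggestion could look like, assuming a per-instance dict cache. The names CachingProcessor, get_cached_method, run_processor and SpeedProcessor are hypothetical and not taken from the PR, and scrub_method_name here is a simplified stand-in for the library's helper:

```python
import re


def scrub_method_name(method_name):
    # Simplified, hypothetical stand-in for the library's helper:
    # replace every non-word character so the name is a valid identifier.
    return re.sub(r"\W", "_", method_name)


class CachingProcessor:
    """Illustrative processor with a per-instance method cache."""

    def __init__(self):
        # Maps the raw method name to a bound method, or None if absent.
        self._method_cache = {}

    def get_cached_method(self, method_name):
        # Fast path: a single dict lookup for every call after the first.
        try:
            return self._method_cache[method_name]
        except KeyError:
            pass
        # Slow path, taken once per distinct name: getattr with a default
        # of None replaces a try/except AttributeError block.
        method = getattr(self, scrub_method_name(method_name), None)
        self._method_cache[method_name] = method
        return method

    def run_processor(self, method_name, data):
        method = self.get_cached_method(method_name)
        if method is not None:
            method(data)


class SpeedProcessor(CachingProcessor):
    def process_units_m_s(self, data):
        # Convert metres per second to kilometres per hour.
        data["value"] = round(data["value"] * 3.6, 1)
        data["units"] = "km/h"
```

With this shape, getattr runs only on the first lookup of each name, and a missing processor method is cached as None so it is not looked up again.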
        def __repr__(self):
            return '<FieldType: %s (%s)>' % (self.name, self.base_type)


    -class MessageType(RecordBase):
    +class MessageType():
Missing object?
Yep, missing the object, thanks.
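For context, a small illustration of why the explicit base matters, on the assumption that the concern is Python 2 compatibility: there, a class declared without object is an old-style class. The class names and the is_new_style helper below are made up for the example:

```python
class WithObject(object):
    """Explicit base: a new-style class on both Python 2 and Python 3."""


class WithoutObject():
    """No base: old-style on Python 2, same as the above on Python 3."""


def is_new_style(cls):
    # New-style classes are instances of `type`; Python 2 old-style
    # classes are instances of `types.ClassType` instead.
    return isinstance(cls, type)


print(is_new_style(WithObject), is_new_style(WithoutObject))
```

On Python 3 both checks are true, so the explicit object base only changes behaviour when the code also runs on Python 2.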
@pR0Ps thanks for the CR. Meanwhile, I've gained more insight into the problem with the processors. So I would drop the commit about the processors and open an issue for it, where we can discuss it in more depth.
If your solution to the processors is the one you suggested above, I wouldn't recommend it. Making things "up to the user" is just pushing the problem along. If most users are going to want the values out of the |
Fix #55
The second commit limits the data processor getattr.