Cleanup dataframes #1360
It still surprises me that it is better to iterate over the numpy arrays point by point, appending each value to lists that then go into a new dataframe, rather than just adding the numpy arrays to a new dataframe directly.
Handling the empty column is expensive because missing values need careful treatment. Otherwise the shots column may be accidentally cast to float, because numpy doesn't support nullable integers. This means we first need to create a 2D object-dtype ndarray, populate the values, and then convert it into a dataframe. Since the current `_lazy_add_rows` buffer assumes a row-wise data list, the arrays need to be converted into this form internally.
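To illustrate the dtype issue, here is a minimal standalone sketch. It is not the code from this PR and does not use `_lazy_add_rows`; the column name `xval` and the sample values are made up for the example. It only shows how a missing entry pushes an integer column to float, and how an object-dtype buffer avoids that.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the "shots" value is missing for the second row.
xvals = np.array([0.1, 0.2, 0.3])
shots = [100, None, 300]

# Naive construction: pandas falls back to float64 and stores NaN,
# so the integer shot counts silently become floats.
naive = pd.DataFrame({"xval": xvals, "shots": shots})

# Object-dtype buffer: each cell keeps its original Python value,
# including None for the missing entry.
buffer = np.empty((len(xvals), 2), dtype=object)
buffer[:, 0] = xvals
buffer[:, 1] = shots
careful = pd.DataFrame(buffer, columns=["xval", "shots"])

# Convert to the pandas nullable integer dtype once the values are in place.
careful["shots"] = careful["shots"].astype("Int64")

print(naive["shots"].dtype)    # float64
print(careful["shots"].dtype)  # Int64
```

The object-dtype step is what makes this slower than handing the numpy arrays to the dataframe directly, since every cell is stored as a generic Python object.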