Support to add new feature using python #69

pipdax · 2020-11-27T06:43:28Z

When I do EDA, I usually create new features to help me analyze the data.
For example, when some columns have missing data, I create a new column of 1s and 0s to flag whether each value is missing.
I don't want to close pandasgui, create the new column, and then show pandasgui again. That is cumbersome.
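
For readers unfamiliar with the kind of feature described above, here is a minimal pandas sketch of a 0/1 missing-value indicator. The data and column names are made up for illustration; nothing here is specific to pandasgui.

```python
import numpy as np
import pandas as pd

# Toy data; the column names are hypothetical.
df = pd.DataFrame({"age": [34, np.nan, 51], "city": ["Oslo", "Lima", None]})

# Collect the columns that contain at least one missing value.
cols_with_gaps = [col for col in df.columns if df[col].isna().any()]

# Add one 0/1 indicator column per affected column (1 = value is missing).
for col in cols_with_gaps:
    df[f"{col}_missing"] = df[col].isna().astype(int)
```

Each indicator has to be recomputed whenever the underlying column changes, which is why re-opening the GUI after every such step feels cumbersome.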

adamerose · 2020-11-28T04:15:07Z

So if I understand correctly, the problem is that you don't like having to call show repeatedly to re-open the GUI every time you change the original DataFrame? There was some discussion of having modifications you make in IPython also apply to the DataFrame in the GUI, but I decided against that and explain why here: #20 (comment)

Do you have any specific solution in mind? All I can think of is maybe adding a method to add or replace DataFrames in an existing GUI window, so you would do gui = show(df), then modify the df in IPython, then call gui.show(df), and it would overwrite it in the GUI.

JinchengWang · 2020-12-07T10:52:11Z

Not sure if this is a good idea but here is a rough sketch of a proposal:

Add an option in the GUI to add a new column by specifying the name of the new column and an expression that evaluates to the values of the new column (on the original dataframe).

Basically do something like this:

def add_new_column(self, new_column_name, new_column_expression):
    self.dataframe_original[new_column_name] = eval(new_column_expression, globals(), {self.name:self.dataframe_original})
    self.apply_filters_and_sorting()
    self.update()

Though I'm not sure if this still makes sense when filters and sorts are applied. Personally, this behavior still makes sense to me, but maybe there are people who would disagree, and expect the new column to only contain values in the filtered rows?
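
One way to picture the "compute on the original, then reapply filters" behavior is the standalone sketch below. It is an assumption-laden illustration, not pandasgui code: the function name add_new_column is borrowed from the proposal, the data is invented, and DataFrame.eval is swapped in for the built-in eval so that the expression goes through pandas' own expression parser.

```python
import pandas as pd


def add_new_column(original: pd.DataFrame, name: str, expression: str) -> pd.DataFrame:
    """Return a copy of `original` with column `name` computed from `expression`.

    Hypothetical helper: DataFrame.eval resolves column names inside the
    expression string, so "HP + Attack" refers to those two columns.
    """
    return original.assign(**{name: original.eval(expression)})


# Invented sample data standing in for the unfiltered DataFrame.
original = pd.DataFrame(
    {
        "Name": ["Bulbasaur", "Snorlax", "Chansey"],
        "HP": [45, 160, 250],
        "Attack": [49, 110, 5],
    }
)

# Compute the new column on every row first...
with_total = add_new_column(original, "Total", "HP + Attack")

# ...then reapply the active filter, so hidden rows still have a value
# if the filter is later removed.
filtered_view = with_total.query("HP > 100")
```

Under this ordering the new column exists for all rows, and the filtered view simply shows a subset of it. The alternative raised above (values only for filtered rows) would leave gaps once the filter is cleared.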

adamerose · 2020-12-08T16:09:30Z

Here's an API that might work: it just lets you modify your DataFrame as you normally would and then replace the one in the GUI with your result. I can't think of any limitations with this, and it seems easier to work with than the add_new_column proposal.

gui.replace("my_df_name", my_new_df)

So an example usage would be like this

from pandasgui import show
from pandasgui.datasets import pokemon

gui = show(pokemon)
gui.replace('pokemon', pokemon[pokemon.HP > 100])

Another thing I can do is use my scope-sniffing magic (the same thing that gets the DataFrame variable name into the GUI as a string) to find all DataFrames in your scope and replace the GUI instances with those; you just need to keep the name the same.

from pandasgui import show
from pandasgui.datasets import pokemon

gui = show(pokemon)
pokemon = pokemon[pokemon.HP > 100]
gui.update_all()  # This would find a variable in your scope named 'pokemon' that is a DataFrame, and then replace the one in the GUI with the same name

JinchengWang · 2020-12-09T03:25:36Z

I'm worried that if this is a recommended workflow, it would not be compatible with editing data in the GUI, since the changes made in the GUI are not synced with the original dataframe.

For example, if I run

gui = show(pokemon)

then change the HP of Bulbasaur to 200 in the GUI, I would expect Bulbasaur to show up after executing

pokemon = pokemon[pokemon.HP > 100]
gui.update_all()

adamerose · 2020-12-10T17:47:03Z

Yeah, you'll always need a method call to sync in either direction, because I don't want to automatically overwrite the original DataFrame, for the reasons in the thread I linked. So you have .get_dataframes() to get your GUI changes back into IPython, and my proposed .replace() and .update_all() to get your IPython changes back into an existing GUI.

Your example would look like this

gui = show(pokemon)
# then change the HP of Bulbasaur to 200 in the GUI
pokemon = gui.get_dataframes()['pokemon']
pokemon = pokemon[pokemon.HP > 100]
gui.update_all()

This is the least verbose API I can think of
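
To make the two sync directions concrete without launching a GUI, the sketch below uses a hypothetical stand-in class. GuiStandIn, its internals, and the sample data are assumptions for illustration only; the method names mirror those discussed in this thread, but this is not how pandasgui implements them, and replace and update_all are only proposals at this point.

```python
import inspect

import pandas as pd


class GuiStandIn:
    """Hypothetical stand-in for the GUI object; not pandasgui's implementation."""

    def __init__(self, **frames):
        # The stand-in keeps its own copies, so edits never touch the caller's objects.
        self._frames = {name: df.copy() for name, df in frames.items()}

    def get_dataframes(self):
        # GUI -> session: hand back copies of the current GUI state.
        return {name: df.copy() for name, df in self._frames.items()}

    def replace(self, name, df):
        # Session -> GUI: overwrite one named DataFrame.
        self._frames[name] = df.copy()

    def update_all(self):
        # Session -> GUI: look up same-named DataFrames in the caller's scope.
        caller = inspect.currentframe().f_back
        scope = {**caller.f_globals, **caller.f_locals}
        for name in list(self._frames):
            candidate = scope.get(name)
            if isinstance(candidate, pd.DataFrame):
                self.replace(name, candidate)


pokemon = pd.DataFrame({"Name": ["Bulbasaur", "Snorlax"], "HP": [45, 160]})
gui = GuiStandIn(pokemon=pokemon)

# Simulate a GUI edit: Bulbasaur's HP becomes 200, only inside the stand-in.
gui._frames["pokemon"].loc[0, "HP"] = 200

pokemon = gui.get_dataframes()["pokemon"]  # pull the GUI edit into the session
pokemon = pokemon[pokemon.HP > 100]        # modify as usual
gui.update_all()                           # push the result back by name

shown_names = gui.get_dataframes()["pokemon"]["Name"].tolist()
```

Because the edit is pulled back before filtering, Bulbasaur survives the HP > 100 filter. Skipping the get_dataframes() step would filter the stale session copy and drop it.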

JinchengWang · 2020-12-11T02:11:45Z

Just thought of another idea:

Provide an IPython magic command to wrap this together. For the example above, allow the user to instead do something like

gui = show(pokemon)
# then change the HP of Bulbasaur to 200 in the GUI
%pdgui pokemon = pokemon[pokemon.HP > 100]

This could also make the history-tracking better, since both GUI operations and magic commands can be recorded.

pipdax added the enhancement New feature or request label Nov 27, 2020
