Documentation on parameter passing #5
Comments
Hi @allmedia-nz, thanks for the questions, and apologies for the long delay in responding! I really struggle to track GitHub notifications through the notification noise I'm currently buried in ... I can see how the documentation is unclear, and I would definitely appreciate a PR to improve the wording and explanatory power of the examples!
What run_notebook(...) tries to do is 'imagine the notebook is a function defined with def, with keyword parameters given by the calls to receive_parameter that occur in the notebook ...' notebook_a.py
In which case the call to
No -- Notebook B won't inherit or know anything about the state of any values in Notebook A aside from the values passed to B by A (with a little bit of a caveat ...)
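The function analogy described above can be sketched in plain Python. This is only an illustration of the mental model, not NotebookScripter code: notebook_a, notebook_b, greeting and name are hypothetical names, and each "notebook" is represented by an ordinary function.

```python
# Hypothetical stand-ins: each "notebook" is modelled as a plain function whose
# keyword parameters play the role of the notebook's receive_parameter calls.

def notebook_b(greeting="hello", name="world"):
    # notebook_b only sees the values it was explicitly given (or its defaults).
    # Nothing defined inside notebook_a is visible here.
    return f"{greeting}, {name}"


def notebook_a():
    secret = "local to notebook_a"  # never visible inside notebook_b
    # Calling the "notebook" in a loop works like calling a function in a loop:
    # every call runs the body again with the keyword values passed this time.
    return [notebook_b(name=n) for n in ("foo", "bar")]


results = notebook_a()
print(results)  # ['hello, foo', 'hello, bar']
```

Under this model, a keyword argument is simply a named value handed to the called notebook, and any variable that is not passed this way stays private to the caller.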
Yes -- Notebook A should pass parameters to Notebook B. There is a small bit of extra behavior which significantly complicates explaining things ... but it's actually a very simple mechanism. NotebookScripter maintains a stack of the values passed into each call to run_notebook.
in notebook_a
in notebook_b
Outputs: "foo, bar"
If we were to put a breakpoint at the print statement and examine the NotebookScripter call stack, it would look like this:
(The things in [] above are the values of the NotebookScripter parameters passed into that execution frame.)
To try and clarify the behavior: the values passed to calls to run_notebook() are stored in a stack, and that stack of run_notebook parameters is searched in order to find the value of the parameter that should be used within a given execution scope. Another way of saying this is that the parameters passed to run_notebook are dynamically scoped (as opposed to the lexical scoping used for normal function parameters in Python) ...
I think the big confusion that hit you is that this mechanism applies ONLY to the parameters passed to calls to run_notebook(...), and not at all to any other values defined in the notebook scope ...
I found this behavior to be quite convenient when developing some machine learning models ... If you think of all the receive_parameter() calls as defining HYPER_PARAMETERS, then you can implement pipelines that internally use run_notebook() to execute different algorithms, while still being able to define 'experiments' in a 'flat' way, simply by providing values for any receive_parameter call at the top level ...
(Aside: the fact that this behavior is hard to explain in a straightforward way is perhaps an argument against this mechanism ... I originally made it possible to decide whether you want this behavior or not, but that was even harder to explain ... For my use cases at least, the dynamic scoping mechanism was sufficiently useful that I ended up deciding to just make it the only behavior ... (I could be persuaded differently ...))
The parameter values passed to run_notebook() calls need to be pickle serializable. This is to support the out-of-process run_notebook execution models, and I think it's better to maintain consistency and require this for in-process execution as well ... I believe pandas dataframes are pickle serializable ... If you have some problematic sample code, I could help take a look and identify whether there is a bug somewhere ...?
Would happily accept improvements to the documentation ...! I haven't been doing much Python programming lately, but I'm not intending to leave this out to die and would be happy to see it improve!
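One way to check up front whether a value meets the pickle requirement is to try a round trip with the standard library's pickle module. The helper below is a minimal sketch; is_picklable is a hypothetical name and not part of NotebookScripter.

```python
import pickle
import threading


def is_picklable(value):
    """Return True if value survives a pickle round trip, False otherwise."""
    try:
        pickle.loads(pickle.dumps(value))
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        # Typical failures: lambdas, locally defined functions, locks, open handles.
        return False


print(is_picklable({"rows": [1, 2, 3], "label": "ok"}))  # True
print(is_picklable(threading.Lock()))                    # False
```

The same check can be applied to a dataframe, or to any other object, before passing it as a keyword argument.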
This is a genius concept for Jupyter, and I think it revolutionizes the whole platform. However, I confess I am finding the question of the scope of subnotebooks and parameter passing a bit unclear, and I feel a tad more example detail would help.
As a matter of format, if the examples were Jupyter cells rather than console interactions, they would be easier to relate to.
Could you also please clarify the seeming contradictions:
"execs all the code cell's within Example.ipynb sequentially in the context of that module"
and
"Importantly - the notebook code is not imported as a python module - rather, all the code within the notebook is re-run on each call to run_notebook()"
What I find confusing is scope. If notebook A calls notebook B, does B inherit all variables from A as if it were a continuation of A (just an automatically loaded set of cells with input and output suppressed), or does it require parameters like a function? You hint at this by distinguishing run_notebook_in_process() from run_notebook(), but once again it's not abundantly clear.
Can Notebook B be used in a loop with changing input values?
In my tests I have been trying to pass a pandas dataframe. It's been very perplexing trying to work out whether to declare variables in A or in B so that they get recognised, and whether a keyword argument is a variable or something else.
The term "keyword arguments" in
"If desired, values can be injected into the notebook for use during notebook execution by passing keyword arguments"
is not clear. Are we talking about variable values? If so, are they limited to strings, or is there no restriction on type?
I am somewhat surprised by the relatively low level of following you have for this initiative given how useful it could be, and I suspect some of that may be because the documentation needs a bit more work.
Happy to help if that would be useful. I'm more of a writer than a programmer anyway.
kind regards
peter