-
Notifications
You must be signed in to change notification settings - Fork 1
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge branch 'develop' into chore/upgrade-module-programming-llm
- Loading branch information
Showing
30 changed files
with
982 additions
and
25 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,5 @@ | ||
Sphinx==6.2.1 | ||
sphinx-rtd-theme==1.2.0 | ||
sphinx-rtd-theme==2.0.0 | ||
sphinx-autobuild==2021.3.14 | ||
docutils==0.19 | ||
sphinxcontrib-bibtex==2.5.0 |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,5 @@ | ||
.. _run_docker: | ||
|
||
From Docker | ||
=========================================== | ||
|
||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,5 @@ | ||
.. _run_local: | ||
|
||
From the Command Line | ||
=========================================== | ||
|
||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,5 @@ | ||
.. _run_playground: | ||
|
||
Run the Playground | ||
=========================================== | ||
|
||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,5 @@ | ||
.. _setup_install: | ||
|
||
Python and Poetry Setup | ||
=========================================== | ||
|
||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,52 @@ | ||
.. _conduct_experiment_guide: | ||
|
||
============================= | ||
Conducting an Experiment | ||
============================= | ||
|
||
To conduct an experiment in the Athena Playground, follow these steps: | ||
|
||
1. **Define Experiment:** | ||
- Scroll to the Evaluation Mode section. | ||
- In "Define Experiments", choose execution modes, exercise types, and manage training and evaluation data. | ||
- Alternatively, import an experiment configuration using the "Import" button. | ||
- When done, press "Define Experiment". | ||
- Export the experiment configuration using the "Export" button for future reference. | ||
|
||
.. figure:: ../images/playground/evaluation_mode/define_experiment.png | ||
:width: 500px | ||
:alt: Define Experiment Interface of the Athena Playground | ||
|
||
Evaluation Mode: Define Experiment Interface of the Athena Playground | ||
|
||
2. **Configure Modules:** | ||
- Select and configure the modules you wish to include in your experiment. | ||
- Ensure each module is set up with appropriate parameters for effective comparison. | ||
- Import module configurations using the "Import" button, if needed. | ||
- Export the module configurations using the "Export" button for future reference. | ||
|
||
.. figure:: ../images/playground/evaluation_mode/configure_modules.png | ||
:width: 500px | ||
:alt: Configure Modules Interface of the Athena Playground | ||
|
||
Evaluation Mode: Configure Modules Interface of the Athena Playground | ||
|
||
3. **Conduct Experiment:** | ||
- Press "Start Experiment" to begin the experiment. | ||
- The steps performed include sending submissions, sending feedback for training submissions, generating feedback suggestions, and running automatic evaluations. | ||
- If training submissions are provided, you will need to manually continue the experiment by pressing "Continue". | ||
- If automatic evaluations is enabled, for instance LLM-as-a-judge for text exercises, you will also need to manually confirm it. | ||
- Export and import the experiment results as needed using the "Export" and "Import" buttons, respectively. | ||
|
||
.. figure:: ../images/playground/evaluation_mode/conduct_experiment_text.png | ||
:width: 500px | ||
:alt: Conduct Experiment Interface for a Text Exercise of the Athena Playground | ||
|
||
Evaluation Mode: Conduct Experiment Interface for a Text Exercise of the Athena Playground | ||
|
||
4. **Annotate Feedback Suggestions:** | ||
- Annotate the generated feedback suggestions with "Accept" or "Reject" as a tutor would. | ||
|
||
5. **Export Results:** | ||
- At the end of the experiment, or at any time during the experiment, export the results using the "Export" button. | ||
- Make sure that you also exported the experiment configuration and module configurations to have a complete record of the experiment. |
Oops, something went wrong.