diff --git a/docs/images/playground/expert_evaluation/define_experiment.png b/docs/images/playground/expert_evaluation/define_experiment.png
new file mode 100644
index 000000000..743abdb60
Binary files /dev/null and b/docs/images/playground/expert_evaluation/define_experiment.png differ
diff --git a/docs/images/playground/expert_evaluation/define_metrics.png b/docs/images/playground/expert_evaluation/define_metrics.png
new file mode 100644
index 000000000..d603e415f
Binary files /dev/null and b/docs/images/playground/expert_evaluation/define_metrics.png differ
diff --git a/docs/images/playground/expert_evaluation/researcher_view_location.png b/docs/images/playground/expert_evaluation/researcher_view_location.png
new file mode 100644
index 000000000..18e23c0d4
Binary files /dev/null and b/docs/images/playground/expert_evaluation/researcher_view_location.png differ
diff --git a/docs/images/playground/expert_evaluation/side-by-side-tool.png b/docs/images/playground/expert_evaluation/side-by-side-tool.png
new file mode 100644
index 000000000..1385dd7f6
Binary files /dev/null and b/docs/images/playground/expert_evaluation/side-by-side-tool.png differ
diff --git a/docs/images/playground/expert_evaluation/view_expert_evaluation_progress.png b/docs/images/playground/expert_evaluation/view_expert_evaluation_progress.png
new file mode 100644
index 000000000..247d01260
Binary files /dev/null and b/docs/images/playground/expert_evaluation/view_expert_evaluation_progress.png differ
diff --git a/docs/overview/playground.rst b/docs/overview/playground.rst
index bbebb6df9..671ed4ec9 100644
--- a/docs/overview/playground.rst
+++ b/docs/overview/playground.rst
@@ -214,4 +214,106 @@ For Programming Exercises
-
\ No newline at end of file
+
+Expert Evaluation
+-----------------
+**Expert Evaluation** is the process where a researcher enlists experts to assess the quality of feedback provided on student submissions.
+These experts evaluate how well the feedback aligns with the content of the submissions and with predefined metrics such as accuracy, tone, and adaptability.
+The goal is to gather structured and reliable assessments to improve feedback quality or validate feedback generation methods.
+
+The playground provides two key Expert Evaluation views:
+
+1. Researcher View: Enables researchers to configure the evaluation process, define metrics, and generate expert links.
+2. Expert View: Allows experts to review feedback and rate its quality based on the defined evaluation metrics.
+
+Researcher View
+^^^^^^^^^^^^^^^
+The Researcher View is accessible from the playground, below Evaluation Mode:
+
+.. figure:: ../images/playground/expert_evaluation/researcher_view_location.png
+   :width: 500px
+   :alt: Location of the Researcher View
+
+The researcher begins creating a new Expert Evaluation by choosing a name and uploading exercises with submissions and feedback.
+
+Next, the researcher defines custom metrics, such as actionability or accuracy, and adds a short and a long description for each.
+Based on these metrics, experts will compare the different feedback types.
+
+.. figure:: ../images/playground/expert_evaluation/define_metrics.png
+   :width: 500px
+   :alt: Defining metrics
+
+Afterwards, the researcher adds a link for each expert participating in the evaluation.
+Each link should then be shared with the corresponding expert.
+After finishing the configuration, the researcher can define the experiment and start the Expert Evaluation.
+
+.. figure:: ../images/playground/expert_evaluation/define_experiment.png
+   :width: 500px
+   :alt: Define experiment
+
+.. warning::
+   Once the evaluation has started, the exercises and the metrics can no longer be changed!
+   However, additional expert links can be created.
+
+Instead of uploading the exercises and defining the metrics separately, the researcher can also import an existing configuration at the top of the Researcher View.
+
+Once the evaluation has started and the experts have begun evaluating, the researcher can track each expert's progress by clicking the Update Progress button.
+Evaluation results can be exported at any time during the evaluation using the Download Results button.
+
+.. figure:: ../images/playground/expert_evaluation/view_expert_evaluation_progress.png
+   :width: 500px
+   :alt: View Expert Evaluation progress
+
+Expert View
+^^^^^^^^^^^
+The Expert View can be accessed through the generated expert links.
+The Side-by-Side tool is used for evaluation.
+
+.. figure:: ../images/playground/expert_evaluation/side-by-side-tool.png
+   :width: 500px
+   :alt: Side-by-Side tool
+
+The first time the expert opens the link, a welcome screen appears and the tutorial begins.
+The following steps are shown and briefly described:
+
+The expert first reads the exercise details to become familiar with the exercise.
+The details include the problem statement, grading instructions, and a sample solution.
+
+.. raw:: html
+
+
+
+After understanding the exercise, the expert reads through the submission and the corresponding feedback.
+
+.. raw:: html
+
+
+
+The expert then evaluates the feedback using a 5-point Likert scale based on the previously defined metrics.
+
+.. raw:: html
+
+
+
+If the meaning of a metric is unclear, a more detailed explanation can be accessed by clicking the info icon or the Metric Details button.
+
+.. raw:: html
+
+
+
+After evaluating all the different types of feedback, the expert can move on to the next submission and repeat the process.
+When ready to take a break, the expert clicks the Continue Later button, which saves their progress.
+
+.. raw:: html
+
+