Expert Evaluation
-----------------
**Expert Evaluation** is the process where a researcher enlists experts to assess the quality of feedback provided on student submissions.
These experts evaluate how well the feedback aligns with the content of the submissions and predefined metrics such as accuracy, tone, and adaptability.
The goal is to gather structured and reliable assessments to improve feedback quality or validate feedback generation methods.

The playground provides two key Expert Evaluation views:

1. Researcher View: Enables researchers to configure the evaluation process, define metrics, and generate expert links.
2. Expert View: Allows experts to review feedback and rate its quality based on the defined evaluation metrics.

Researcher View
^^^^^^^^^^^^^^^
The Researcher View is accessible from the playground, below the Evaluation Mode section:

.. figure:: ../images/playground/expert_evaluation/researcher_view_location.png
:width: 500px
:alt: Location of the Researcher View

The researcher begins a new Expert Evaluation by entering a name and uploading exercises together with their submissions and feedback.

Next, the researcher defines custom metrics such as actionability or accuracy and adds a short and a long description for each.
Based on these metrics, the experts will later compare the different feedback types.

.. figure:: ../images/playground/expert_evaluation/define_metrics.png
:width: 500px
:alt: Defining metrics
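
Conceptually, each metric is just a name plus the two descriptions. Below is a minimal sketch of such a record, assuming a simple data class; the field names are illustrative, not the playground's actual schema.

.. code-block:: python

    # Illustrative sketch only -- field names are hypothetical,
    # not the playground's actual schema.
    from dataclasses import dataclass

    @dataclass
    class Metric:
        title: str        # e.g. "Actionability"
        summary: str      # short description shown while rating
        description: str  # long description shown under "Metric Details"

    actionability = Metric(
        title="Actionability",
        summary="Can the student act on this feedback?",
        description=(
            "Feedback is actionable if it tells the student concretely "
            "what to change, rather than only stating what is wrong."
        ),
    )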

Afterwards, the researcher creates a link for each expert participating in the evaluation and shares it with the corresponding expert.
Once the configuration is complete, the researcher can define the experiment and start the Expert Evaluation.

.. figure:: ../images/playground/expert_evaluation/define_experiment.png
:width: 500px
:alt: Define experiment

.. warning::
Once the evaluation has started, the exercises and the metrics can no longer be changed!
However, additional expert links can be created.

Instead of uploading the exercises and defining the metrics separately, the researcher can also import an existing configuration at the top of the Researcher View.
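
Such a configuration bundles the evaluation name, the metrics, the uploaded exercises, and the expert links. The sketch below is a rough, hypothetical illustration of what it conceptually contains; it is not the playground's actual file format, and all keys are made up for illustration.

.. code-block:: python

    # Hypothetical illustration of what a configuration bundles --
    # not the playground's actual file format.
    config = {
        "name": "Athena Feedback Study",
        "metrics": [
            {"title": "Actionability", "summary": "...", "description": "..."},
            {"title": "Accuracy", "summary": "...", "description": "..."},
        ],
        "exercises": [
            # each exercise carries its submissions and, per submission,
            # the feedback variants the experts will compare
            {"id": 1, "submissions": [...], "feedbacks": [...]},
        ],
        "expertLinks": ["<one generated link per expert>"],
    }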

After the evaluation has started and the experts have begun their assessments, the researcher can track each expert's progress by clicking the Update Progress button.
Evaluation results can be exported at any time during the evaluation using the Download Results button.

.. figure:: ../images/playground/expert_evaluation/view_expert_evaluation_progress.png
:width: 500px
:alt: View Expert Evaluation progress
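
The downloaded results contain the experts' Likert ratings. As a minimal sketch of how a researcher might aggregate them after export, assuming the ratings have been parsed into records of expert, metric, feedback type, and value (the record layout is illustrative, not the export's actual schema):

.. code-block:: python

    # Minimal aggregation sketch -- the record layout is illustrative,
    # not the export's actual schema.
    from collections import defaultdict
    from statistics import mean

    ratings = [
        {"expert": "A", "metric": "Actionability", "feedback_type": "LLM", "value": 4},
        {"expert": "A", "metric": "Actionability", "feedback_type": "Tutor", "value": 5},
        {"expert": "B", "metric": "Accuracy", "feedback_type": "LLM", "value": 3},
    ]

    # Group the 1-5 ratings by (metric, feedback type) and report the mean.
    by_group = defaultdict(list)
    for r in ratings:
        by_group[(r["metric"], r["feedback_type"])].append(r["value"])

    for (metric, feedback_type), values in sorted(by_group.items()):
        print(f"{metric} / {feedback_type}: mean={mean(values):.2f} (n={len(values)})")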

Expert View
^^^^^^^^^^^
The Expert View is accessed through the generated expert links.
Evaluation takes place in the Side-by-Side tool.

.. figure:: ../images/playground/expert_evaluation/side-by-side-tool.png
:width: 500px
:alt: Side-by-Side tool

When opening the link for the first time, the expert is greeted by a welcome screen where a short tutorial begins.
The tutorial presents and briefly describes the following steps:

First, the expert reads the exercise details to become familiar with the exercise.
The details include the problem statement, grading instructions, and a sample solution.

.. raw:: html

<iframe src="../../playground/public/exercise_details.mp4" allowfullscreen="1" frameborder="0" width="600" height="350">
Read exercise details
</iframe>

After understanding the exercise, the expert reads through the submission and the corresponding feedback.

.. raw:: html

<iframe src="../../playground/public/read_submission.mp4" allowfullscreen="1" frameborder="0" width="600" height="350">
Read submission
</iframe>

The expert then evaluates the feedback using a 5-point Likert scale based on the previously defined metrics.

.. raw:: html

<iframe src="../../playground/public/evaluation_metrics.mp4" allowfullscreen="1" frameborder="0" width="600" height="350">
Evaluate metrics
</iframe>
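
On a 5-point Likert scale, each rating corresponds to an integer from 1 (worst) to 5 (best). A typical labeling is sketched below for illustration only; the playground's exact wording may differ.

.. code-block:: python

    # Typical 5-point Likert labels -- for illustration only;
    # the playground's exact wording may differ.
    LIKERT_LABELS = {
        1: "Strongly disagree",
        2: "Disagree",
        3: "Neutral",
        4: "Agree",
        5: "Strongly agree",
    }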

If the meaning of a metric is unclear, a more detailed explanation can be accessed by clicking the info icon or the Metric Details button.

.. raw:: html

<iframe src="../../playground/public/metrics_explanation.mp4" allowfullscreen="1" frameborder="0" width="600" height="350">
Read metrics explanation
</iframe>

After evaluating all the different types of feedback, the expert can move on to the next submission and repeat the process.
When ready to take a break, the expert clicks the Continue Later button, which saves their progress.

.. raw:: html

<iframe src="../../playground/public/continue_later.mp4" allowfullscreen="1" frameborder="0" width="600" height="350">
Continue later
</iframe>
