
docs: define overall report evaluator contract

jimmylee 2 weeks ago
Commit
680ff90760

+ 85 - 0
docs/superpowers/specs/2026-04-25-english-speaking-report-pipeline-design.md

@@ -128,6 +128,91 @@ These fields must be based on all expected student answers, all Azure scores, al
 
 Full OverallReport still requires all expected student answers to have completed Azure pronunciation scores. Text-only sentence feedback is useful for `DetailedReport`, but it is not enough to compute complete aggregate pronunciation scores.
 
+## OverallReport Evaluator Contract
+
+The OverallReport evaluator receives:
+
+```json
+{
+  "conversationHistory": [
+    {
+      "round": 1,
+      "role": "ai",
+      "content": "...",
+      "timestamp": "..."
+    },
+    {
+      "round": 1,
+      "role": "student",
+      "content": "...",
+      "timestamp": "...",
+      "pronunciation": {
+        "accuracy": 82,
+        "fluency": 76,
+        "completeness": 88,
+        "prosody": 70
+      },
+      "contentFeedback": {
+        "comment": "表达清楚,because 用得很好。",
+        "betterExpression": "My favorite animal is the panda because it is cute."
+      }
+    }
+  ],
+  "grade": "五年级",
+  "vocabulary": ["favorite", "adorable", "bamboo"],
+  "sentences": ["My favorite animal is ... because ..."]
+}
+```
+
+`pronunciation` and `contentFeedback` are included on student turns when available. A missing `contentFeedback` should not block OverallReport generation as long as Azure pronunciation scores are complete.
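
The input shape above could be typed roughly as follows. This is a minimal sketch and not part of the spec: the type names and the `hasCompletePronunciation` helper are hypothetical, and the helper only inspects student turns that are actually present in `conversationHistory`. It cannot tell whether an expected answer is missing altogether, and treating a history with no student turns as incomplete is an assumption.

```typescript
// Hypothetical types mirroring the example payload; names are illustrative only.
interface PronunciationScores {
  accuracy: number;
  fluency: number;
  completeness: number;
  prosody: number;
}

interface ContentFeedback {
  comment: string;
  betterExpression: string;
}

interface ConversationTurn {
  round: number;
  role: "ai" | "student";
  content: string;
  timestamp: string;
  // Both fields appear only on student turns, and only when available.
  pronunciation?: PronunciationScores;
  contentFeedback?: ContentFeedback;
}

interface OverallReportEvaluatorInput {
  conversationHistory: ConversationTurn[];
  grade: string;
  vocabulary: string[];
  sentences: string[];
}

// True when every student turn in the history carries pronunciation scores.
// contentFeedback is deliberately ignored: it must not block generation.
// Assumption: a history without any student turn counts as incomplete.
function hasCompletePronunciation(history: ConversationTurn[]): boolean {
  const studentTurns = history.filter((turn) => turn.role === "student");
  return (
    studentTurns.length > 0 &&
    studentTurns.every((turn) => turn.pronunciation !== undefined)
  );
}
```

A caller would still need the list of expected answers from elsewhere to enforce the "all expected student answers" requirement.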
+
+The prompt should ask the evaluator to:
+
+- analyze the full multi-turn AI/student dialogue
+- use the Azure pronunciation scores of every student turn
+- consider grade, target vocabulary, and target sentence patterns
+- produce specific, verifiable highlights and actionable suggestions
+- output valid JSON only
+
+The raw LLM output should be:
+
+```json
+{
+  "overall_evaluation": {
+    "evaluation": "你的聊天特别棒!你能说出自己上学、下雨、和朋友玩这些真实故事,很不错呀!"
+  },
+  "highlights": [
+    "积极主动参与对话,回应及时",
+    "能够灵活使用多种句型结构进行交流",
+    "发音清晰易懂,语调自然流畅"
+  ],
+  "suggestions": [
+    "尝试使用更多形容词和副词丰富表达细节,如将 good 替换为 fantastic",
+    "注意第三人称单数动词变化,确保主谓一致,如 He goes 而非 He go",
+    "增加连接词使用使句子更连贯,如 because、however、in addition"
+  ]
+}
+```
+
+For robustness, the parser may also accept `overall_evaluation.chinese` as an alias for `overall_evaluation.evaluation`, because sample prompts sometimes use `chinese`. The normalized backend object should use one stable shape.
+
+Normalized OverallReport fields:
+
+```json
+{
+  "aiComment": "你的聊天特别棒!你能说出自己上学、下雨、和朋友玩这些真实故事,很不错呀!",
+  "highlights": ["..."],
+  "improvements": ["..."]
+}
+```
+
+Mapping rules:
+
+- `overall_evaluation.evaluation` maps to frontend `aiComment`
+- `overall_evaluation.chinese` maps to frontend `aiComment` only if `evaluation` is absent
+- `highlights` maps to frontend `highlights`
+- `suggestions` maps to frontend `improvements`
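
The mapping rules above might be implemented along these lines. This is a sketch under stated assumptions, not the project's actual parser: `normalizeOverallReport`, `NormalizedOverallReport`, and `asStringArray` are hypothetical names, a non-string `evaluation` is treated as absent, non-string list entries are dropped, and a missing comment falls back to an empty string.

```typescript
// Hypothetical normalized shape, following the "Normalized OverallReport fields" example.
interface NormalizedOverallReport {
  aiComment: string;
  highlights: string[];
  improvements: string[];
}

// Keep only string entries; anything that is not an array becomes an empty list.
function asStringArray(value: unknown): string[] {
  return Array.isArray(value)
    ? value.filter((item): item is string => typeof item === "string")
    : [];
}

// Parse raw LLM output (expected to be JSON only) and apply the mapping rules.
function normalizeOverallReport(rawText: string): NormalizedOverallReport {
  const raw: unknown = JSON.parse(rawText); // throws on invalid JSON
  if (typeof raw !== "object" || raw === null || Array.isArray(raw)) {
    throw new Error("OverallReport output must be a JSON object");
  }
  const root = raw as Record<string, unknown>;

  const overallValue = root["overall_evaluation"];
  const overall: Record<string, unknown> =
    typeof overallValue === "object" && overallValue !== null
      ? (overallValue as Record<string, unknown>)
      : {};

  // `evaluation` wins; `chinese` is used only when `evaluation` is absent.
  const evaluation = overall["evaluation"];
  const chinese = overall["chinese"];
  const aiComment =
    typeof evaluation === "string"
      ? evaluation
      : typeof chinese === "string"
        ? chinese
        : "";

  return {
    aiComment,
    highlights: asStringArray(root["highlights"]),
    improvements: asStringArray(root["suggestions"]),
  };
}
```

Whether an empty `aiComment` should instead be reported as a parse failure is a design choice the spec leaves open.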
+
 ## Overall Generation Gate
 
 `/report` should decide whether Full OverallReport can be generated with an explicit gate: