|
|
@@ -0,0 +1,1289 @@
|
|
|
+# Per-Turn Content Feedback MVP Implementation Plan
|
|
|
+
|
|
|
+> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
|
|
|
+
|
|
|
+**Goal:** Alongside the existing four per-round Azure pronunciation scores, add an LLM content-feedback pipeline that produces `{highlights, corrections, suggestions}`, attaches it to each round's evaluation, and shows it only on the results page.
|
|
|
+
|
|
|
+**Architecture:** In each round's `/speak` background task, chain a single OpenAI JSON-mode call after the Azure PA completes; on failure, degrade to `content_feedback=null` without affecting the pronunciation scores; the `/report` response carries a `contentFeedback` field per round.
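+
+For orientation, a sketch of how one student round in the final `/report` payload carries the new field (illustrative values only; the field names match what Tasks 4-6 build):
+
+```python
+# Illustrative /report round entry; values are made up for the example.
+round_entry = {
+    "round": 2,
+    "role": "student",
+    "content": "I go to park yesterday",
+    "audioUrl": None,
+    "evaluation": {
+        "status": "completed",
+        "accuracyScore": 72,
+        "fluencyScore": 85,
+        "completenessScore": 90,
+        "prosodyScore": 60,
+        "wordAnalysis": [],
+        # Stays None/null when the LLM call fails; pronunciation scores survive.
+        "contentFeedback": {
+            "highlights": ["句子完整"],
+            "corrections": [
+                {
+                    "original": "I go to park yesterday",
+                    "corrected": "I went to the park yesterday",
+                    "explanation": "过去式应用 went,park 前加 the",
+                }
+            ],
+            "suggestions": ["可增加连接词"],
+        },
+    },
+}
+```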
|
|
|
+
|
|
|
+**Tech Stack:** Python 3.13 · FastAPI · SQLAlchemy 2.x async · MySQL · OpenAI SDK (via onehub base_url) · pytest · uv · Vue 3 · TypeScript
|
|
|
+
|
|
|
+**Repos:**
|
|
|
+- Backend: `/Users/buoy/Development/gitrepo/cococlass-english-speaking-api`
|
|
|
+- Frontend: `/Users/buoy/Development/gitrepo/PPT`
|
|
|
+
|
|
|
+**Spec:** `/Users/buoy/Development/gitrepo/PPT/doc/ContentEvaluationDesign.md`
|
|
|
+
|
|
|
+---
|
|
|
+
|
|
|
+## File Structure
|
|
|
+
|
|
|
+### Backend (cococlass-english-speaking-api)
|
|
|
+
|
|
|
+**Create:**
|
|
|
+- `app/service/speaking/content_evaluator.py` — single responsibility: send (the 4 pronunciation scores + the prior AI turn + the student transcript) to the LLM and get JSON feedback back
|
|
|
+- `tests/conftest.py` — pytest async + mock fixtures
|
|
|
+- `tests/service/__init__.py`
|
|
|
+- `tests/service/speaking/__init__.py`
|
|
|
+- `tests/service/speaking/test_content_evaluator.py` — unit tests for the evaluator
|
|
|
+- `tests/service/speaking/test_dialogue_service_content.py` — unit tests for the wiring logic
|
|
|
+- `migrations/001_add_content_feedback.sql` — incremental SQL for an existing DB
|
|
|
+
|
|
|
+**Modify:**
|
|
|
+- `init.sql` — add the same column to the CREATE TABLE statement used for a fresh DB
|
|
|
+- `app/models/dialogue.py` — add a `content_feedback` column to `PronunciationEvaluation`
|
|
|
+- `app/service/speaking/dialogue_service.py` — run the content evaluation after the success branch of `_evaluate_pronunciation`; have `get_report` return `contentFeedback`
|
|
|
+
|
|
|
+### Frontend (PPT)
|
|
|
+
|
|
|
+**Modify:**
|
|
|
+- `src/views/Editor/EnglishSpeaking/services/llmService.ts` — response transformation in `getReport` (map the backend's `rounds[i].evaluation.contentFeedback` onto `sentenceEvaluations[i].feedback`)
|
|
|
+
|
|
|
+No changes needed: `DetailedReport.vue` already renders the `sentence.feedback.{highlights, corrections, suggestions}` shape, and the `SentenceEvaluation.feedback` type in `englishSpeaking.ts` is already aligned.
|
|
|
+
|
|
|
+---
|
|
|
+
|
|
|
+## Task 1: [backend] Add the `content_feedback` column
|
|
|
+
|
|
|
+**Files:**
|
|
|
+- Modify: `cococlass-english-speaking-api/init.sql` (CREATE TABLE statement)
|
|
|
+- Create: `cococlass-english-speaking-api/migrations/001_add_content_feedback.sql`
|
|
|
+- Modify: `cococlass-english-speaking-api/app/models/dialogue.py` (SQLAlchemy model)
|
|
|
+
|
|
|
+- [ ] **Step 1: Update the `pronunciation_evaluation` CREATE TABLE statement in `init.sql`**
|
|
|
+
|
|
|
+In the `pronunciation_evaluation` table definition, insert a `content_feedback` column right after `word_analysis`:
|
|
|
+
|
|
|
+Open `cococlass-english-speaking-api/init.sql` and change:
|
|
|
+
|
|
|
+```sql
|
|
|
+ word_analysis JSON NULL,
|
|
|
+ error_message TEXT NULL,
|
|
|
+ created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
|
|
|
+ completed_at DATETIME NULL,
|
|
|
+```
|
|
|
+
|
|
|
+to:
|
|
|
+
|
|
|
+```sql
|
|
|
+ word_analysis JSON NULL,
|
|
|
+ content_feedback JSON NULL,
|
|
|
+ error_message TEXT NULL,
|
|
|
+ created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
|
|
|
+ completed_at DATETIME NULL,
|
|
|
+```
|
|
|
+
|
|
|
+- [ ] **Step 2: Create the `migrations/` directory and write the incremental SQL**
|
|
|
+
|
|
|
+```bash
|
|
|
+cd /Users/buoy/Development/gitrepo/cococlass-english-speaking-api
|
|
|
+mkdir -p migrations
|
|
|
+```
|
|
|
+
|
|
|
+Create `migrations/001_add_content_feedback.sql` with the following content:
|
|
|
+
|
|
|
+```sql
|
|
|
+-- Add content_feedback column to existing pronunciation_evaluation table.
|
|
|
+-- Apply once against an existing database (new DBs use updated init.sql).
|
|
|
+ALTER TABLE pronunciation_evaluation
|
|
|
+ ADD COLUMN content_feedback JSON NULL AFTER word_analysis;
|
|
|
+```
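+
+To apply it to an existing database, run something like `mysql -u <user> -p <database> < migrations/001_add_content_feedback.sql` (placeholder credentials and database name; use whatever the deployment already uses).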
|
|
|
+
|
|
|
+- [ ] **Step 3: Update the SQLAlchemy model**
|
|
|
+
|
|
|
+Open `cococlass-english-speaking-api/app/models/dialogue.py`.
|
|
|
+
|
|
|
+Locate:
|
|
|
+
|
|
|
+```python
|
|
|
+ word_analysis: Mapped[Optional[dict]] = mapped_column(JSON, nullable=True)
|
|
|
+ error_message: Mapped[Optional[str]] = mapped_column(Text, nullable=True)
|
|
|
+```
|
|
|
+
|
|
|
+and change it to (inserting `content_feedback` between the two lines):
|
|
|
+
|
|
|
+```python
|
|
|
+ word_analysis: Mapped[Optional[dict]] = mapped_column(JSON, nullable=True)
|
|
|
+ content_feedback: Mapped[Optional[dict]] = mapped_column(JSON, nullable=True)
|
|
|
+ error_message: Mapped[Optional[str]] = mapped_column(Text, nullable=True)
|
|
|
+```
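+
+Because the column is a SQLAlchemy `JSON` type, it serializes transparently: Task 4 can assign the feedback dict directly (`evaluation.content_feedback = {...}`) with no manual `json.dumps`.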
|
|
|
+
|
|
|
+- [ ] **Step 4: Commit**
|
|
|
+
|
|
|
+```bash
|
|
|
+cd /Users/buoy/Development/gitrepo/cococlass-english-speaking-api
|
|
|
+git add init.sql migrations/001_add_content_feedback.sql app/models/dialogue.py
|
|
|
+git commit -m "feat(db): 为 pronunciation_evaluation 增加 content_feedback 列"
|
|
|
+```
|
|
|
+
|
|
|
+---
|
|
|
+
|
|
|
+## Task 2: [backend] Set up the pytest directory skeleton + conftest
|
|
|
+
|
|
|
+The repo's `tests/` currently holds only an empty `__init__.py`. Establish a runnable unit-test foundation first.
|
|
|
+
|
|
|
+**Files:**
|
|
|
+- Create: `cococlass-english-speaking-api/tests/conftest.py`
|
|
|
+- Create: `cococlass-english-speaking-api/tests/service/__init__.py`
|
|
|
+- Create: `cococlass-english-speaking-api/tests/service/speaking/__init__.py`
|
|
|
+- Create: `cococlass-english-speaking-api/tests/service/speaking/test_smoke.py`
|
|
|
+
|
|
|
+- [ ] **Step 1: Create `tests/conftest.py`**
|
|
|
+
|
|
|
+```python
|
|
|
+"""Pytest global fixtures & asyncio config."""
|
|
|
+
|
|
|
+import pytest
|
|
|
+
|
|
|
+
|
|
|
+@pytest.fixture
|
|
|
+def anyio_backend() -> str:
|
|
|
+    """Force the asyncio backend for anyio-based tests (not trio).
+
+    Note: the tests added by this plan use pytest-asyncio's
+    @pytest.mark.asyncio marker instead; this fixture only matters if
+    anyio-style tests are added later.
+    """
|
|
|
+ return "asyncio"
|
|
|
+```
|
|
|
+
|
|
|
+- [ ] **Step 2: Create empty `__init__.py` files so pytest can discover the nested directories**
|
|
|
+
|
|
|
+```bash
|
|
|
+cd /Users/buoy/Development/gitrepo/cococlass-english-speaking-api
|
|
|
+mkdir -p tests/service/speaking
|
|
|
+touch tests/service/__init__.py tests/service/speaking/__init__.py
|
|
|
+```
|
|
|
+
|
|
|
+- [ ] **Step 3: Write a smoke test to confirm pytest runs at all**
|
|
|
+
|
|
|
+Create `tests/service/speaking/test_smoke.py`:
|
|
|
+
|
|
|
+```python
|
|
|
+def test_pytest_works() -> None:
|
|
|
+ assert 1 + 1 == 2
|
|
|
+```
|
|
|
+
|
|
|
+- [ ] **Step 4: Run the smoke test**
|
|
|
+
|
|
|
+```bash
|
|
|
+cd /Users/buoy/Development/gitrepo/cococlass-english-speaking-api
|
|
|
+uv run pytest tests/service/speaking/test_smoke.py -v
|
|
|
+```
|
|
|
+
|
|
|
+Expected: `1 passed`.
|
|
|
+
|
|
|
+If `uv run pytest` fails with "pytest: command not found", run `uv sync --group dev` to install the dev dependencies, then retry.
|
|
|
+
|
|
|
+- [ ] **Step 5: Commit**
|
|
|
+
|
|
|
+```bash
|
|
|
+cd /Users/buoy/Development/gitrepo/cococlass-english-speaking-api
|
|
|
+git add tests/
|
|
|
+git commit -m "chore(test): 搭建 pytest 目录骨架和 conftest"
|
|
|
+```
|
|
|
+
|
|
|
+---
|
|
|
+
|
|
|
+## Task 3: [backend] Write the `content_evaluator` module (TDD)
|
|
|
+
|
|
|
+**Files:**
|
|
|
+- Create: `cococlass-english-speaking-api/app/service/speaking/content_evaluator.py`
|
|
|
+- Create: `cococlass-english-speaking-api/tests/service/speaking/test_content_evaluator.py` (a new file in the directory scaffolded in the previous task)
|
|
|
+
|
|
|
+`ContentEvaluator` instantiates `AsyncOpenAI` directly (with `settings.ONEHUB_BASE_URL` + `settings.ONEHUB_API_KEY`, same as `OneHubLLM`) because it needs the `response_format` parameter, which the existing `LLMProvider.chat()` interface does not expose.
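+
+Concretely, the call shape that has to go through is the following (a self-contained sketch that mirrors the Step 3 implementation below; `settings.ONEHUB_MODEL` is the same setting used elsewhere in the repo):
+
+```python
+# response_format is the parameter that forces direct SDK use;
+# LLMProvider.chat() has no way to pass it through.
+from openai import AsyncOpenAI
+
+from app.config import settings
+
+client = AsyncOpenAI(base_url=settings.ONEHUB_BASE_URL, api_key=settings.ONEHUB_API_KEY)
+
+
+async def ask_json(messages: list[dict]) -> str:
+    """Minimal JSON-mode chat call."""
+    resp = await client.chat.completions.create(
+        model=settings.ONEHUB_MODEL,
+        messages=messages,
+        response_format={"type": "json_object"},
+        temperature=0,
+    )
+    return resp.choices[0].message.content or ""
+```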
|
|
|
+
|
|
|
+- [ ] **Step 1: Write a failing test for the evaluator (happy path)**
|
|
|
+
|
|
|
+Create `cococlass-english-speaking-api/tests/service/speaking/test_content_evaluator.py`:
|
|
|
+
|
|
|
+```python
|
|
|
+"""Unit tests for ContentEvaluator."""
|
|
|
+
|
|
|
+import json
|
|
|
+from unittest.mock import AsyncMock, MagicMock, patch
|
|
|
+
|
|
|
+import pytest
|
|
|
+
|
|
|
+from app.service.speaking.content_evaluator import ContentEvaluator
|
|
|
+
|
|
|
+
|
|
|
+def _mock_openai_response(content: str) -> MagicMock:
|
|
|
+ """Construct a fake AsyncOpenAI chat completion response."""
|
|
|
+ choice = MagicMock()
|
|
|
+ choice.message.content = content
|
|
|
+ resp = MagicMock()
|
|
|
+ resp.choices = [choice]
|
|
|
+ return resp
|
|
|
+
|
|
|
+
|
|
|
+@pytest.mark.asyncio
|
|
|
+async def test_evaluate_happy_path() -> None:
|
|
|
+ fake_json = json.dumps(
|
|
|
+ {
|
|
|
+ "highlights": ["发音清晰", "句子完整"],
|
|
|
+ "corrections": [
|
|
|
+ {
|
|
|
+ "original": "I go to park yesterday",
|
|
|
+ "corrected": "I went to the park yesterday",
|
|
|
+ "explanation": "过去式应用 went,park 前加 the",
|
|
|
+ }
|
|
|
+ ],
|
|
|
+ "suggestions": ["可增加连接词"],
|
|
|
+ }
|
|
|
+ )
|
|
|
+
|
|
|
+ with patch(
|
|
|
+ "app.service.speaking.content_evaluator.AsyncOpenAI"
|
|
|
+ ) as MockClient:
|
|
|
+ instance = MockClient.return_value
|
|
|
+ instance.chat.completions.create = AsyncMock(
|
|
|
+ return_value=_mock_openai_response(fake_json)
|
|
|
+ )
|
|
|
+
|
|
|
+ evaluator = ContentEvaluator()
|
|
|
+ result = await evaluator.evaluate(
|
|
|
+ transcript="I go to park yesterday",
|
|
|
+ prior_ai_turn="What did you do last weekend?",
|
|
|
+ pron_scores={"accuracy": 72, "fluency": 85, "completeness": 90, "prosody": 60},
|
|
|
+ )
|
|
|
+
|
|
|
+ assert result is not None
|
|
|
+ assert result["highlights"] == ["发音清晰", "句子完整"]
|
|
|
+ assert len(result["corrections"]) == 1
|
|
|
+ assert result["corrections"][0]["corrected"] == "I went to the park yesterday"
|
|
|
+ assert result["suggestions"] == ["可增加连接词"]
|
|
|
+```
|
|
|
+
|
|
|
+- [ ] **Step 2: Run it and confirm it fails (the module does not exist yet)**
|
|
|
+
|
|
|
+```bash
|
|
|
+cd /Users/buoy/Development/gitrepo/cococlass-english-speaking-api
|
|
|
+uv run pytest tests/service/speaking/test_content_evaluator.py -v
|
|
|
+```
|
|
|
+
|
|
|
+Expected: `ModuleNotFoundError: No module named 'app.service.speaking.content_evaluator'` or a similar import error.
|
|
|
+
|
|
|
+- [ ] **Step 3: Implement the minimal evaluator so the happy path passes**
|
|
|
+
|
|
|
+Create `cococlass-english-speaking-api/app/service/speaking/content_evaluator.py`:
|
|
|
+
|
|
|
+```python
|
|
|
+"""Per-turn content evaluation via LLM (JSON mode)."""
|
|
|
+
|
|
|
+import asyncio
|
|
|
+import json
|
|
|
+
|
|
|
+from openai import AsyncOpenAI
|
|
|
+
|
|
|
+from app.config import settings
|
|
|
+from app.logging import get_logger
|
|
|
+
|
|
|
+logger = get_logger(__name__)
|
|
|
+
|
|
|
+
|
|
|
+SYSTEM_PROMPT = """You are an English tutor evaluating a student's single spoken turn
|
|
|
+in an open dialogue. You receive:
|
|
|
+- Azure pronunciation scores (accuracy/fluency/completeness/prosody, 0-100)
|
|
|
+- The immediate prior AI turn (context)
|
|
|
+- The student's transcript
|
|
|
+
|
|
|
+Return JSON with exactly these keys:
|
|
|
+- highlights: 1-2 Chinese sentences praising specific strengths. Reference a
|
|
|
+ pronunciation dimension if that score is >= 85. <= 30 chars each.
|
|
|
+- corrections: array of grammar/word-choice fixes. Each item has keys:
|
|
|
+ original (EN), corrected (EN), explanation (ZH, <= 30 chars).
|
|
|
+- suggestions: 1-2 Chinese actionable improvements. Reference a pronunciation
|
|
|
+ dimension if that score is < 70. <= 30 chars each.
|
|
|
+
|
|
|
+Rules:
|
|
|
+- Empty arrays are valid. Do not invent errors to fill quota.
|
|
|
+- If the student only said a filler ("yes", "ok", "hmm"), return empty
|
|
|
+ corrections and suggestions plus one encouragement in highlights.
|
|
|
+- Never include raw score numbers in output text; describe qualitatively
|
|
|
+ ("发音准确度很高" not "accuracy 92").
|
|
|
+- Output MUST be a single JSON object with keys highlights, corrections, suggestions.
|
|
|
+"""
|
|
|
+
|
|
|
+
|
|
|
+class ContentEvaluator:
|
|
|
+ """Generates per-turn content feedback via LLM in JSON mode."""
|
|
|
+
|
|
|
+ def __init__(self, timeout_seconds: float = 10.0):
|
|
|
+ self.client = AsyncOpenAI(
|
|
|
+ base_url=settings.ONEHUB_BASE_URL,
|
|
|
+ api_key=settings.ONEHUB_API_KEY,
|
|
|
+ )
|
|
|
+ self.model = settings.ONEHUB_MODEL
|
|
|
+ self.timeout_seconds = timeout_seconds
|
|
|
+
|
|
|
+ async def evaluate(
|
|
|
+ self,
|
|
|
+ transcript: str,
|
|
|
+ prior_ai_turn: str,
|
|
|
+ pron_scores: dict,
|
|
|
+ ) -> dict | None:
|
|
|
+ """Return {highlights, corrections, suggestions} or None on failure."""
|
|
|
+ user_payload = json.dumps(
|
|
|
+ {
|
|
|
+ "pronunciation": pron_scores,
|
|
|
+ "ai_said": prior_ai_turn,
|
|
|
+ "student_said": transcript,
|
|
|
+ },
|
|
|
+ ensure_ascii=False,
|
|
|
+ )
|
|
|
+
|
|
|
+ try:
|
|
|
+ resp = await asyncio.wait_for(
|
|
|
+ self.client.chat.completions.create(
|
|
|
+ model=self.model,
|
|
|
+ messages=[
|
|
|
+ {"role": "system", "content": SYSTEM_PROMPT},
|
|
|
+ {"role": "user", "content": user_payload},
|
|
|
+ ],
|
|
|
+ response_format={"type": "json_object"},
|
|
|
+ temperature=0,
|
|
|
+ ),
|
|
|
+ timeout=self.timeout_seconds,
|
|
|
+ )
|
|
|
+ except asyncio.TimeoutError:
|
|
|
+ logger.warning("ContentEvaluator LLM timeout")
|
|
|
+ return None
|
|
|
+ except Exception as e:
|
|
|
+ logger.error(f"ContentEvaluator LLM error: {e}")
|
|
|
+ return None
|
|
|
+
|
|
|
+ raw = resp.choices[0].message.content or ""
|
|
|
+ try:
|
|
|
+ parsed = json.loads(raw)
|
|
|
+ except json.JSONDecodeError:
|
|
|
+ logger.warning(f"ContentEvaluator got non-JSON: {raw[:200]}")
|
|
|
+ return None
|
|
|
+
|
|
|
+ if not self._has_required_shape(parsed):
|
|
|
+ logger.warning(f"ContentEvaluator got invalid shape: {parsed}")
|
|
|
+ return None
|
|
|
+
|
|
|
+ return {
|
|
|
+ "highlights": parsed.get("highlights", []),
|
|
|
+ "corrections": parsed.get("corrections", []),
|
|
|
+ "suggestions": parsed.get("suggestions", []),
|
|
|
+ }
|
|
|
+
|
|
|
+ @staticmethod
|
|
|
+ def _has_required_shape(obj: object) -> bool:
|
|
|
+ if not isinstance(obj, dict):
|
|
|
+ return False
|
|
|
+ for key in ("highlights", "corrections", "suggestions"):
|
|
|
+ if key not in obj or not isinstance(obj[key], list):
|
|
|
+ return False
|
|
|
+ return True
|
|
|
+```
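+
+Optionally, before wiring the evaluator into the service, a throwaway script can sanity-check it against the real onehub endpoint (assumes valid `ONEHUB_*` settings are configured; the file name is hypothetical and the script is not part of the test suite):
+
+```python
+# scratch_check.py - one-off manual check; run with: uv run python scratch_check.py
+import asyncio
+
+from app.service.speaking.content_evaluator import ContentEvaluator
+
+
+async def main() -> None:
+    evaluator = ContentEvaluator()
+    feedback = await evaluator.evaluate(
+        transcript="I go to park yesterday",
+        prior_ai_turn="What did you do last weekend?",
+        pron_scores={"accuracy": 72, "fluency": 85, "completeness": 90, "prosody": 60},
+    )
+    # Prints a {highlights, corrections, suggestions} dict, or None on failure.
+    print(feedback)
+
+
+asyncio.run(main())
+```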
|
|
|
+
|
|
|
+- [ ] **Step 4: Run the happy-path test and confirm it passes**
|
|
|
+
|
|
|
+```bash
|
|
|
+cd /Users/buoy/Development/gitrepo/cococlass-english-speaking-api
|
|
|
+uv run pytest tests/service/speaking/test_content_evaluator.py::test_evaluate_happy_path -v
|
|
|
+```
|
|
|
+
|
|
|
+Expected: `1 passed`.
|
|
|
+
|
|
|
+If pytest-asyncio is missing (symptom: the async tests error out or are skipped with "async def functions are not natively supported"), append `"pytest-asyncio>=0.26.0"` to `[dependency-groups].dev` in `pyproject.toml`, and add at the top of `tests/conftest.py`:
|
|
|
+
|
|
|
+```python
|
|
|
+import pytest
|
|
|
+
|
|
|
+pytest_plugins = ["pytest_asyncio"]
|
|
|
+```
|
|
|
+
|
|
|
+(The `pytest_plugins` line is usually unnecessary, since pytest auto-discovers installed plugins unless plugin autoloading is disabled.) Then run `uv sync --group dev` and rerun.
|
|
|
+
|
|
|
+- [ ] **Step 5: Add a failure-branch test: JSON parsing fails**
|
|
|
+
|
|
|
+Append to `test_content_evaluator.py`:
|
|
|
+
|
|
|
+```python
|
|
|
+@pytest.mark.asyncio
|
|
|
+async def test_evaluate_returns_none_on_invalid_json() -> None:
|
|
|
+ with patch(
|
|
|
+ "app.service.speaking.content_evaluator.AsyncOpenAI"
|
|
|
+ ) as MockClient:
|
|
|
+ instance = MockClient.return_value
|
|
|
+ instance.chat.completions.create = AsyncMock(
|
|
|
+ return_value=_mock_openai_response("not a json")
|
|
|
+ )
|
|
|
+
|
|
|
+ evaluator = ContentEvaluator()
|
|
|
+ result = await evaluator.evaluate(
|
|
|
+ transcript="Hi",
|
|
|
+ prior_ai_turn="Hello",
|
|
|
+ pron_scores={"accuracy": 80, "fluency": 80, "completeness": 80, "prosody": 80},
|
|
|
+ )
|
|
|
+
|
|
|
+ assert result is None
|
|
|
+```
|
|
|
+
|
|
|
+- [ ] **Step 6: Add a failure-branch test: timeout**
|
|
|
+
|
|
|
+Append:
|
|
|
+
|
|
|
+```python
|
|
|
+@pytest.mark.asyncio
|
|
|
+async def test_evaluate_returns_none_on_timeout() -> None:
|
|
|
+ async def never_returns(**kwargs):
|
|
|
+ await asyncio.sleep(5)
|
|
|
+
|
|
|
+ with patch(
|
|
|
+ "app.service.speaking.content_evaluator.AsyncOpenAI"
|
|
|
+ ) as MockClient:
|
|
|
+ instance = MockClient.return_value
|
|
|
+ instance.chat.completions.create = never_returns
|
|
|
+
|
|
|
+ evaluator = ContentEvaluator(timeout_seconds=0.05)
|
|
|
+ result = await evaluator.evaluate(
|
|
|
+ transcript="Hi",
|
|
|
+ prior_ai_turn="Hello",
|
|
|
+ pron_scores={"accuracy": 80, "fluency": 80, "completeness": 80, "prosody": 80},
|
|
|
+ )
|
|
|
+
|
|
|
+ assert result is None
|
|
|
+```
|
|
|
+
|
|
|
+Also add `import asyncio` to the imports at the top of the file (if it is not there already).
|
|
|
+
|
|
|
+- [ ] **Step 7: Add a failure-branch test: invalid shape**
|
|
|
+
|
|
|
+Append:
|
|
|
+
|
|
|
+```python
|
|
|
+@pytest.mark.asyncio
|
|
|
+async def test_evaluate_returns_none_on_wrong_shape() -> None:
|
|
|
+ # LLM 返回 JSON 但少字段
|
|
|
+ bad = json.dumps({"highlights": ["ok"]})
|
|
|
+ with patch(
|
|
|
+ "app.service.speaking.content_evaluator.AsyncOpenAI"
|
|
|
+ ) as MockClient:
|
|
|
+ instance = MockClient.return_value
|
|
|
+ instance.chat.completions.create = AsyncMock(
|
|
|
+ return_value=_mock_openai_response(bad)
|
|
|
+ )
|
|
|
+
|
|
|
+ evaluator = ContentEvaluator()
|
|
|
+ result = await evaluator.evaluate(
|
|
|
+ transcript="Hi",
|
|
|
+ prior_ai_turn="Hello",
|
|
|
+ pron_scores={"accuracy": 80, "fluency": 80, "completeness": 80, "prosody": 80},
|
|
|
+ )
|
|
|
+
|
|
|
+ assert result is None
|
|
|
+```
|
|
|
+
|
|
|
+- [ ] **Step 8: Run all the evaluator tests**
|
|
|
+
|
|
|
+```bash
|
|
|
+cd /Users/buoy/Development/gitrepo/cococlass-english-speaking-api
|
|
|
+uv run pytest tests/service/speaking/test_content_evaluator.py -v
|
|
|
+```
|
|
|
+
|
|
|
+Expected: `4 passed`.
|
|
|
+
|
|
|
+- [ ] **Step 9: Commit**
|
|
|
+
|
|
|
+```bash
|
|
|
+cd /Users/buoy/Development/gitrepo/cococlass-english-speaking-api
|
|
|
+git add app/service/speaking/content_evaluator.py tests/service/speaking/test_content_evaluator.py
|
|
|
+# if pyproject.toml / conftest.py were changed
|
|
|
+git add pyproject.toml tests/conftest.py uv.lock 2>/dev/null || true
|
|
|
+git commit -m "feat(speaking): 新增 content_evaluator(LLM JSON 模式生成单轮评语)"
|
|
|
+```
|
|
|
+
|
|
|
+---
|
|
|
+
|
|
|
+## Task 4: [backend] Wire ContentEvaluator into `_evaluate_pronunciation` (TDD)
|
|
|
+
|
|
|
+This is the core integration point: run one content evaluation after Azure succeeds; skip it entirely when Azure fails; a content failure must not affect `status`.
|
|
|
+
|
|
|
+**Files:**
|
|
|
+- Modify: `cococlass-english-speaking-api/app/service/speaking/dialogue_service.py`
|
|
|
+- Create: `cococlass-english-speaking-api/tests/service/speaking/test_dialogue_service_content.py`
|
|
|
+
|
|
|
+Note: the original `_evaluate_pronunciation` already receives `self.assessor` via dependency injection. To make the content evaluator swappable in tests, attach it to `DialogueService` as a dependency too; because the new parameter defaults to `None` (falling back to a real `ContentEvaluator`), existing call sites keep working unchanged.
|
|
|
+
|
|
|
+- [ ] **Step 1: Test: Azure succeeds + content succeeds → both fields are written**
|
|
|
+
|
|
|
+Create `cococlass-english-speaking-api/tests/service/speaking/test_dialogue_service_content.py`:
|
|
|
+
|
|
|
+```python
|
|
|
+"""Integration-ish tests for content evaluation wired into DialogueService._evaluate_pronunciation."""
|
|
|
+
|
|
|
+from unittest.mock import AsyncMock, MagicMock
|
|
|
+
|
|
|
+import pytest
|
|
|
+
|
|
|
+from app.service.speaking.dialogue_service import DialogueService
|
|
|
+
|
|
|
+
|
|
|
+class _StubDB:
|
|
|
+ """Minimal stand-in for AsyncSession that supports get() + commit()."""
|
|
|
+
|
|
|
+ def __init__(self, evaluation):
|
|
|
+ self._evaluation = evaluation
|
|
|
+ self.commit = AsyncMock()
|
|
|
+
|
|
|
+ async def __aenter__(self):
|
|
|
+ return self
|
|
|
+
|
|
|
+ async def __aexit__(self, *args):
|
|
|
+ return False
|
|
|
+
|
|
|
+ async def get(self, _cls, _id):
|
|
|
+ return self._evaluation
|
|
|
+
|
|
|
+
|
|
|
+def _fake_evaluation() -> MagicMock:
|
|
|
+ ev = MagicMock()
|
|
|
+ ev.status = "pending"
|
|
|
+ ev.accuracy_score = None
|
|
|
+ ev.fluency_score = None
|
|
|
+ ev.completeness_score = None
|
|
|
+ ev.prosody_score = None
|
|
|
+ ev.word_analysis = None
|
|
|
+ ev.content_feedback = None
|
|
|
+ ev.completed_at = None
|
|
|
+ ev.error_message = None
|
|
|
+ return ev
|
|
|
+
|
|
|
+
|
|
|
+def _build_service(assessor, evaluator) -> DialogueService:
|
|
|
+ return DialogueService(
|
|
|
+ asr=MagicMock(),
|
|
|
+ llm=MagicMock(),
|
|
|
+ assessor=assessor,
|
|
|
+ storage=MagicMock(),
|
|
|
+ content_evaluator=evaluator,
|
|
|
+ )
|
|
|
+
|
|
|
+
|
|
|
+@pytest.mark.asyncio
|
|
|
+async def test_azure_success_then_content_success_writes_both(monkeypatch) -> None:
|
|
|
+ ev = _fake_evaluation()
|
|
|
+ stub_db = _StubDB(ev)
|
|
|
+    # Patch where it is looked up: _evaluate_pronunciation imports
+    # async_session from app.models.database at call time.
+    monkeypatch.setattr(
+        "app.models.database.async_session", lambda: stub_db
|
|
|
+ )
|
|
|
+
|
|
|
+ assessor = MagicMock()
|
|
|
+ assessor.assess = AsyncMock(
|
|
|
+ return_value={
|
|
|
+ "accuracy_score": 80,
|
|
|
+ "fluency_score": 85,
|
|
|
+ "completeness_score": 90,
|
|
|
+ "prosody_score": 75,
|
|
|
+ "word_analysis": [],
|
|
|
+ }
|
|
|
+ )
|
|
|
+ evaluator = MagicMock()
|
|
|
+ evaluator.evaluate = AsyncMock(
|
|
|
+ return_value={
|
|
|
+ "highlights": ["nice"],
|
|
|
+ "corrections": [],
|
|
|
+ "suggestions": [],
|
|
|
+ }
|
|
|
+ )
|
|
|
+
|
|
|
+ service = _build_service(assessor, evaluator)
|
|
|
+ await service._evaluate_pronunciation(
|
|
|
+ evaluation_id=1,
|
|
|
+ audio_bytes=b"",
|
|
|
+ reference_text="hi",
|
|
|
+ prior_ai_turn="hello",
|
|
|
+ )
|
|
|
+
|
|
|
+ assert ev.status == "completed"
|
|
|
+ assert ev.accuracy_score == 80
|
|
|
+ assert ev.content_feedback == {"highlights": ["nice"], "corrections": [], "suggestions": []}
|
|
|
+ evaluator.evaluate.assert_awaited_once()
|
|
|
+```
|
|
|
+
|
|
|
+- [ ] **Step 2: Test: Azure succeeds + content fails → `content_feedback` is None, status stays completed**
|
|
|
+
|
|
|
+Append:
|
|
|
+
|
|
|
+```python
|
|
|
+@pytest.mark.asyncio
|
|
|
+async def test_azure_success_content_failure_keeps_status_completed(monkeypatch) -> None:
|
|
|
+ ev = _fake_evaluation()
|
|
|
+ stub_db = _StubDB(ev)
|
|
|
+    monkeypatch.setattr(
+        "app.models.database.async_session", lambda: stub_db
|
|
|
+ )
|
|
|
+
|
|
|
+ assessor = MagicMock()
|
|
|
+ assessor.assess = AsyncMock(
|
|
|
+ return_value={
|
|
|
+ "accuracy_score": 80,
|
|
|
+ "fluency_score": 85,
|
|
|
+ "completeness_score": 90,
|
|
|
+ "prosody_score": 75,
|
|
|
+ "word_analysis": [],
|
|
|
+ }
|
|
|
+ )
|
|
|
+ evaluator = MagicMock()
|
|
|
+ evaluator.evaluate = AsyncMock(return_value=None) # LLM failed
|
|
|
+
|
|
|
+ service = _build_service(assessor, evaluator)
|
|
|
+ await service._evaluate_pronunciation(
|
|
|
+ evaluation_id=1,
|
|
|
+ audio_bytes=b"",
|
|
|
+ reference_text="hi",
|
|
|
+ prior_ai_turn="hello",
|
|
|
+ )
|
|
|
+
|
|
|
+ assert ev.status == "completed"
|
|
|
+ assert ev.accuracy_score == 80
|
|
|
+ assert ev.content_feedback is None
|
|
|
+```
|
|
|
+
|
|
|
+- [ ] **Step 3: Test: Azure fails → ContentEvaluator is never called**
|
|
|
+
|
|
|
+Append:
|
|
|
+
|
|
|
+```python
|
|
|
+@pytest.mark.asyncio
|
|
|
+async def test_azure_failure_skips_content_evaluator(monkeypatch) -> None:
|
|
|
+ ev = _fake_evaluation()
|
|
|
+ stub_db = _StubDB(ev)
|
|
|
+    monkeypatch.setattr(
+        "app.models.database.async_session", lambda: stub_db
|
|
|
+ )
|
|
|
+
|
|
|
+ assessor = MagicMock()
|
|
|
+ assessor.assess = AsyncMock(side_effect=RuntimeError("azure exploded"))
|
|
|
+ evaluator = MagicMock()
|
|
|
+ evaluator.evaluate = AsyncMock()
|
|
|
+
|
|
|
+ service = _build_service(assessor, evaluator)
|
|
|
+ await service._evaluate_pronunciation(
|
|
|
+ evaluation_id=1,
|
|
|
+ audio_bytes=b"",
|
|
|
+ reference_text="hi",
|
|
|
+ prior_ai_turn="hello",
|
|
|
+ )
|
|
|
+
|
|
|
+ assert ev.status == "failed"
|
|
|
+ assert ev.content_feedback is None
|
|
|
+ evaluator.evaluate.assert_not_awaited()
|
|
|
+```
|
|
|
+
|
|
|
+- [ ] **Step 4: Run the tests and confirm they fail**
|
|
|
+
|
|
|
+```bash
|
|
|
+cd /Users/buoy/Development/gitrepo/cococlass-english-speaking-api
|
|
|
+uv run pytest tests/service/speaking/test_dialogue_service_content.py -v
|
|
|
+```
|
|
|
+
|
|
|
+Expected: all 3 tests fail (`DialogueService.__init__` has no `content_evaluator` parameter yet, and `_evaluate_pronunciation` has no `prior_ai_turn` parameter).
|
|
|
+
|
|
|
+- [ ] **Step 5: Change `DialogueService.__init__` to accept a content_evaluator**
|
|
|
+
|
|
|
+Open `cococlass-english-speaking-api/app/service/speaking/dialogue_service.py`.
|
|
|
+
|
|
|
+Add to the imports at the top of the file:
|
|
|
+
|
|
|
+```python
|
|
|
+from app.service.speaking.content_evaluator import ContentEvaluator
|
|
|
+```
|
|
|
+
|
|
|
+Locate `__init__`:
|
|
|
+
|
|
|
+```python
|
|
|
+ def __init__(
|
|
|
+ self,
|
|
|
+ asr: ASRProvider,
|
|
|
+ llm: LLMProvider,
|
|
|
+ assessor: PronunciationAssessor,
|
|
|
+ storage: AudioStorage,
|
|
|
+ ):
|
|
|
+ self.asr = asr
|
|
|
+ self.llm = llm
|
|
|
+ self.assessor = assessor
|
|
|
+ self.storage = storage
|
|
|
+```
|
|
|
+
|
|
|
+and change it to:
|
|
|
+
|
|
|
+```python
|
|
|
+ def __init__(
|
|
|
+ self,
|
|
|
+ asr: ASRProvider,
|
|
|
+ llm: LLMProvider,
|
|
|
+ assessor: PronunciationAssessor,
|
|
|
+ storage: AudioStorage,
|
|
|
+ content_evaluator: ContentEvaluator | None = None,
|
|
|
+ ):
|
|
|
+ self.asr = asr
|
|
|
+ self.llm = llm
|
|
|
+ self.assessor = assessor
|
|
|
+ self.storage = storage
|
|
|
+ self.content_evaluator = content_evaluator or ContentEvaluator()
|
|
|
+```
|
|
|
+
|
|
|
+- [ ] **Step 6: Update the `_evaluate_pronunciation` signature and logic**
|
|
|
+
|
|
|
+Locate the existing implementation (around `dialogue_service.py:321`):
|
|
|
+
|
|
|
+```python
|
|
|
+ async def _evaluate_pronunciation(
|
|
|
+ self,
|
|
|
+ evaluation_id: int,
|
|
|
+ audio_bytes: bytes,
|
|
|
+ reference_text: str,
|
|
|
+ content_type: str = "audio/webm;codecs=opus",
|
|
|
+ ):
|
|
|
+ """后台静默发音评估"""
|
|
|
+ from app.models.database import async_session
|
|
|
+
|
|
|
+ async with async_session() as db:
|
|
|
+ evaluation = await db.get(PronunciationEvaluation, evaluation_id)
|
|
|
+ if not evaluation:
|
|
|
+ logger.error(f"Evaluation record not found: id={evaluation_id}")
|
|
|
+ return
|
|
|
+
|
|
|
+ try:
|
|
|
+ result = await self.assessor.assess(audio_bytes, reference_text, content_type)
|
|
|
+ logger.info(f"Pronunciation assessment done: eval={evaluation_id}, accuracy={result['accuracy_score']}")
|
|
|
+ evaluation.status = "completed"
|
|
|
+ evaluation.accuracy_score = result["accuracy_score"]
|
|
|
+ evaluation.fluency_score = result["fluency_score"]
|
|
|
+ evaluation.completeness_score = result["completeness_score"]
|
|
|
+ evaluation.prosody_score = result["prosody_score"]
|
|
|
+ evaluation.word_analysis = result.get("word_analysis")
|
|
|
+ evaluation.completed_at = datetime.now()
|
|
|
+ except Exception as e:
|
|
|
+ logger.error(f"Pronunciation assessment failed: eval={evaluation_id}, error={e}")
|
|
|
+ evaluation.status = "failed"
|
|
|
+ evaluation.error_message = str(e)
|
|
|
+
|
|
|
+ await db.commit()
|
|
|
+```
|
|
|
+
|
|
|
+and change it to:
|
|
|
+
|
|
|
+```python
|
|
|
+ async def _evaluate_pronunciation(
|
|
|
+ self,
|
|
|
+ evaluation_id: int,
|
|
|
+ audio_bytes: bytes,
|
|
|
+ reference_text: str,
|
|
|
+ prior_ai_turn: str = "",
|
|
|
+ content_type: str = "audio/webm;codecs=opus",
|
|
|
+ ):
|
|
|
+ """后台静默发音评估 + 内容评语"""
|
|
|
+ from app.models.database import async_session
|
|
|
+
|
|
|
+ async with async_session() as db:
|
|
|
+ evaluation = await db.get(PronunciationEvaluation, evaluation_id)
|
|
|
+ if not evaluation:
|
|
|
+ logger.error(f"Evaluation record not found: id={evaluation_id}")
|
|
|
+ return
|
|
|
+
|
|
|
+ try:
|
|
|
+ result = await self.assessor.assess(audio_bytes, reference_text, content_type)
|
|
|
+ logger.info(f"Pronunciation assessment done: eval={evaluation_id}, accuracy={result['accuracy_score']}")
|
|
|
+ evaluation.status = "completed"
|
|
|
+ evaluation.accuracy_score = result["accuracy_score"]
|
|
|
+ evaluation.fluency_score = result["fluency_score"]
|
|
|
+ evaluation.completeness_score = result["completeness_score"]
|
|
|
+ evaluation.prosody_score = result["prosody_score"]
|
|
|
+ evaluation.word_analysis = result.get("word_analysis")
|
|
|
+ evaluation.completed_at = datetime.now()
|
|
|
+
|
|
|
+ # Content evaluation: 仅在 Azure 成功时触发;失败不影响 status。
|
|
|
+ try:
|
|
|
+ content_feedback = await self.content_evaluator.evaluate(
|
|
|
+ transcript=reference_text,
|
|
|
+ prior_ai_turn=prior_ai_turn,
|
|
|
+ pron_scores={
|
|
|
+ "accuracy": result["accuracy_score"],
|
|
|
+ "fluency": result["fluency_score"],
|
|
|
+ "completeness": result["completeness_score"],
|
|
|
+ "prosody": result["prosody_score"],
|
|
|
+ },
|
|
|
+ )
|
|
|
+ evaluation.content_feedback = content_feedback
|
|
|
+ logger.info(
|
|
|
+ f"Content evaluation done: eval={evaluation_id}, "
|
|
|
+ f"has_feedback={content_feedback is not None}"
|
|
|
+ )
|
|
|
+ except Exception as e:
|
|
|
+ logger.error(f"Content evaluation error (soft-fail): eval={evaluation_id}, error={e}")
|
|
|
+ evaluation.content_feedback = None
|
|
|
+
|
|
|
+ except Exception as e:
|
|
|
+ logger.error(f"Pronunciation assessment failed: eval={evaluation_id}, error={e}")
|
|
|
+ evaluation.status = "failed"
|
|
|
+ evaluation.error_message = str(e)
|
|
|
+
|
|
|
+ await db.commit()
|
|
|
+```
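+
+Design note: the inner try/except means any unexpected bug in the content evaluator degrades to `content_feedback=None` instead of flipping the whole evaluation to `failed`; the evaluator itself already returns `None` for timeouts and malformed LLM output, so the outer `except` is effectively reached only by Azure failures.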
|
|
|
+
|
|
|
+- [ ] **Step 7: Pass `prior_ai_turn` into the `asyncio.create_task` call inside `speak()`**
|
|
|
+
|
|
|
+Locate the existing `create_task` call inside `speak()` (around `dialogue_service.py:189`):
|
|
|
+
|
|
|
+```python
|
|
|
+ asyncio.create_task(
|
|
|
+ self._evaluate_pronunciation(
|
|
|
+ evaluation_id=evaluation.id,
|
|
|
+ audio_bytes=audio_bytes,
|
|
|
+ reference_text=transcript,
|
|
|
+ content_type=content_type,
|
|
|
+ )
|
|
|
+ )
|
|
|
+```
|
|
|
+
|
|
|
+Before the `create_task` call, compute `prior_ai_turn`. Add this block just before the "⑩ 后台发音评估" section (it assumes `speak()` keeps the session's prior messages in a `history` list; adapt the variable name if it differs):
|
|
|
+
|
|
|
+```python
|
|
|
+ # 找到本轮 student 消息之前最近的一条 AI 消息作为 content 评估的上下文
|
|
|
+ prior_ai_turn = ""
|
|
|
+ for msg in reversed(history):
|
|
|
+ if msg.role == "ai":
|
|
|
+ prior_ai_turn = msg.content
|
|
|
+ break
|
|
|
+```
|
|
|
+
|
|
|
+Then change the `create_task` call to:
|
|
|
+
|
|
|
+```python
|
|
|
+ asyncio.create_task(
|
|
|
+ self._evaluate_pronunciation(
|
|
|
+ evaluation_id=evaluation.id,
|
|
|
+ audio_bytes=audio_bytes,
|
|
|
+ reference_text=transcript,
|
|
|
+ prior_ai_turn=prior_ai_turn,
|
|
|
+ content_type=content_type,
|
|
|
+ )
|
|
|
+ )
|
|
|
+```
|
|
|
+
|
|
|
+- [ ] **Step 8: Run the tests and confirm they all pass**
|
|
|
+
|
|
|
+```bash
|
|
|
+cd /Users/buoy/Development/gitrepo/cococlass-english-speaking-api
|
|
|
+uv run pytest tests/service/speaking/test_dialogue_service_content.py -v
|
|
|
+```
|
|
|
+
|
|
|
+Expected: `3 passed`.
|
|
|
+
|
|
|
+If you hit `ImportError: cannot import name 'ContentEvaluator'` from a partially initialized module (a circular import), move the `ContentEvaluator` import to the first line inside `DialogueService.__init__` (lazy import) instead of the top of the file.
|
|
|
+
|
|
|
+- [ ] **Step 9: Commit**
|
|
|
+
|
|
|
+```bash
|
|
|
+cd /Users/buoy/Development/gitrepo/cococlass-english-speaking-api
|
|
|
+git add app/service/speaking/dialogue_service.py tests/service/speaking/test_dialogue_service_content.py
|
|
|
+git commit -m "feat(speaking): 在 _evaluate_pronunciation 串联 content_evaluator"
|
|
|
+```
|
|
|
+
|
|
|
+---
|
|
|
+
|
|
|
+## Task 5: [backend] Return `contentFeedback` from `/report`
|
|
|
+
|
|
|
+**Files:**
|
|
|
+- Modify: `cococlass-english-speaking-api/app/service/speaking/dialogue_service.py` (the `get_report` method)
|
|
|
+- Create: `cococlass-english-speaking-api/tests/service/speaking/test_dialogue_service_report.py`
|
|
|
+
|
|
|
+- [ ] **Step 1: Test: when the evaluation carries `content_feedback`, the report entry does too**
|
|
|
+
|
|
|
+Create `cococlass-english-speaking-api/tests/service/speaking/test_dialogue_service_report.py`:
|
|
|
+
|
|
|
+```python
|
|
|
+"""Tests for get_report including content_feedback pass-through."""
|
|
|
+
|
|
|
+from unittest.mock import MagicMock
|
|
|
|
|
|
+
|
|
|
+
|
|
|
+def _stub_message(role: str, content: str, round_: int, evaluation=None):
|
|
|
+ msg = MagicMock()
|
|
|
+ msg.role = role
|
|
|
+ msg.content = content
|
|
|
+ msg.round = round_
|
|
|
+ msg.audio_url = None
|
|
|
+ msg.evaluation = evaluation
|
|
|
+ return msg
|
|
|
+
|
|
|
+
|
|
|
+def _stub_evaluation(content_feedback=None, status="completed"):
|
|
|
+ ev = MagicMock()
|
|
|
+ ev.status = status
|
|
|
+ ev.accuracy_score = 80
|
|
|
+ ev.fluency_score = 80
|
|
|
+ ev.completeness_score = 80
|
|
|
+ ev.prosody_score = 80
|
|
|
+ ev.word_analysis = None
|
|
|
+ ev.content_feedback = content_feedback
|
|
|
+ return ev
|
|
|
+
|
|
|
+
|
|
|
+def _build_report_entry(msg) -> dict:
|
|
|
+ """Replicates the entry construction in DialogueService.get_report.
|
|
|
+
|
|
|
+ We only exercise the dict-shaping step in isolation — the full get_report
|
|
|
+ path hits DB/LLM summary and is not needed for this contract check.
|
|
|
+ """
|
|
|
+ entry = {
|
|
|
+ "round": msg.round,
|
|
|
+ "role": msg.role,
|
|
|
+ "content": msg.content,
|
|
|
+ "audioUrl": msg.audio_url,
|
|
|
+ }
|
|
|
+ if msg.role == "student" and msg.evaluation:
|
|
|
+ ev = msg.evaluation
|
|
|
+ entry["evaluation"] = {
|
|
|
+ "status": ev.status,
|
|
|
+ "accuracyScore": ev.accuracy_score,
|
|
|
+ "fluencyScore": ev.fluency_score,
|
|
|
+ "completenessScore": ev.completeness_score,
|
|
|
+ "prosodyScore": ev.prosody_score,
|
|
|
+ "wordAnalysis": ev.word_analysis,
|
|
|
+ "contentFeedback": ev.content_feedback,
|
|
|
+ }
|
|
|
+ return entry
|
|
|
+
|
|
|
+
|
|
|
+def test_report_entry_includes_content_feedback_when_present() -> None:
|
|
|
+ feedback = {"highlights": ["good"], "corrections": [], "suggestions": []}
|
|
|
+ ev = _stub_evaluation(content_feedback=feedback)
|
|
|
+ msg = _stub_message("student", "hi", 1, evaluation=ev)
|
|
|
+
|
|
|
+ entry = _build_report_entry(msg)
|
|
|
+
|
|
|
+ assert entry["evaluation"]["contentFeedback"] == feedback
|
|
|
+
|
|
|
+
|
|
|
+def test_report_entry_content_feedback_is_null_when_absent() -> None:
|
|
|
+ ev = _stub_evaluation(content_feedback=None)
|
|
|
+ msg = _stub_message("student", "hi", 1, evaluation=ev)
|
|
|
+
|
|
|
+ entry = _build_report_entry(msg)
|
|
|
+
|
|
|
+ assert entry["evaluation"]["contentFeedback"] is None
|
|
|
+
|
|
|
+
|
|
|
+def test_ai_message_has_no_evaluation_key() -> None:
|
|
|
+ msg = _stub_message("ai", "hello", 1, evaluation=None)
|
|
|
+ entry = _build_report_entry(msg)
|
|
|
+ assert "evaluation" not in entry
|
|
|
+```
|
|
|
+
|
|
|
+What we test here is the entry-shaping contract (as a standalone helper). Inside the real `get_report` we modify the identical entry-construction block, keeping the two in sync.
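+
+If drift between the helper and the real code becomes a concern, a later refactor could extract the entry shaping into a module-level function that both `get_report` and the test import; for this MVP the duplicated block is acceptable.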
|
|
|
+
|
|
|
+- [ ] **Step 2: Run the tests; they should all pass (the standalone helper does not depend on code we have not modified yet)**
|
|
|
+
|
|
|
+```bash
|
|
|
+cd /Users/buoy/Development/gitrepo/cococlass-english-speaking-api
|
|
|
+uv run pytest tests/service/speaking/test_dialogue_service_report.py -v
|
|
|
+```
|
|
|
+
|
|
|
+Expected: `3 passed`. This step validates the contract; the next step applies it to the real code.
|
|
|
+
|
|
|
+- [ ] **Step 3: Update the entry-construction block in `get_report`**
|
|
|
+
|
|
|
+Open `cococlass-english-speaking-api/app/service/speaking/dialogue_service.py` and locate:
|
|
|
+
|
|
|
+```python
|
|
|
+ if msg.role == "student" and msg.evaluation:
|
|
|
+ ev = msg.evaluation
|
|
|
+ entry["evaluation"] = {
|
|
|
+ "status": ev.status,
|
|
|
+ "accuracyScore": ev.accuracy_score,
|
|
|
+ "fluencyScore": ev.fluency_score,
|
|
|
+ "completenessScore": ev.completeness_score,
|
|
|
+ "prosodyScore": ev.prosody_score,
|
|
|
+ "wordAnalysis": ev.word_analysis,
|
|
|
+ }
|
|
|
+```
|
|
|
+
|
|
|
+and change it to:
|
|
|
+
|
|
|
+```python
|
|
|
+ if msg.role == "student" and msg.evaluation:
|
|
|
+ ev = msg.evaluation
|
|
|
+ entry["evaluation"] = {
|
|
|
+ "status": ev.status,
|
|
|
+ "accuracyScore": ev.accuracy_score,
|
|
|
+ "fluencyScore": ev.fluency_score,
|
|
|
+ "completenessScore": ev.completeness_score,
|
|
|
+ "prosodyScore": ev.prosody_score,
|
|
|
+ "wordAnalysis": ev.word_analysis,
|
|
|
+ "contentFeedback": ev.content_feedback,
|
|
|
+ }
|
|
|
+```
|
|
|
+
|
|
|
+- [ ] **Step 4: Rerun the speaking test suite**
|
|
|
+
|
|
|
+```bash
|
|
|
+cd /Users/buoy/Development/gitrepo/cococlass-english-speaking-api
|
|
|
+uv run pytest tests/service/speaking/ -v
|
|
|
+```
|
|
|
+
|
|
|
+Expected: everything passes (1 smoke + 4 evaluator + 3 content + 3 report = 11 passed).
|
|
|
+
|
|
|
+- [ ] **Step 5: Commit**
|
|
|
+
|
|
|
+```bash
|
|
|
+cd /Users/buoy/Development/gitrepo/cococlass-english-speaking-api
|
|
|
+git add app/service/speaking/dialogue_service.py tests/service/speaking/test_dialogue_service_report.py
|
|
|
+git commit -m "feat(speaking): /report 返回每轮 contentFeedback"
|
|
|
+```
|
|
|
+
|
|
|
+---
|
|
|
+
|
|
|
+## Task 6: [frontend] Pass `contentFeedback` through to `sentence.feedback`
|
|
|
+
|
|
|
+`DetailedReport.vue` already renders the `sentence.feedback.{highlights, corrections, suggestions}` shape (`PPT/src/views/Editor/EnglishSpeaking/preview/DetailedReport.vue:94-116`), so the frontend only needs a field pass-through where the `getReport` response is converted into an `OverallEvaluation`.
|
|
|
+
|
|
|
+**Files:**
|
|
|
+- Modify: `PPT/src/views/Editor/EnglishSpeaking/services/llmService.ts`
|
|
|
+
|
|
|
+- [ ] **Step 1: Locate the backend→frontend shape conversion**
|
|
|
+
|
|
|
+Run:
|
|
|
+
|
|
|
+```bash
|
|
|
+grep -n "rounds\|sentenceEvaluations\|evaluation" /Users/buoy/Development/gitrepo/PPT/src/views/Editor/EnglishSpeaking/services/llmService.ts
|
|
|
+```
|
|
|
+
|
|
|
+The backend `/report` returns `{ sessionId, topic, status, rounds[], summary }`, while the frontend `DialogueReport` expects `{ evaluation: OverallEvaluation }` (with a `feedback` field on each item of `sentenceEvaluations[]`). The current `RealDialogueAPI.getReport()` (`llmService.ts:86-92`) simply does `return res.json()` with no shape conversion.
|
|
|
+
|
|
|
+This means one of two things: **either the frontend adapts the shape in some other layer today, or `DetailedReport.vue` gets its data from somewhere else**. Grep first to find the adaptation point:
|
|
|
+
|
|
|
+```bash
|
|
|
+grep -rn "sentenceEvaluations\|rounds" /Users/buoy/Development/gitrepo/PPT/src/views/Editor/EnglishSpeaking --include="*.ts" --include="*.vue" | head -30
|
|
|
+```
|
|
|
+
|
|
|
+- [ ] **Step 2: Choose a branch based on the Step 1 findings**
|
|
|
+
|
|
|
+**Branch A (ideal): a function like `mapReportToEvaluation(backendRes)` already exists**
|
|
|
+- In that function, add `feedback: round.evaluation?.contentFeedback ?? undefined` to each sentence
|
|
|
+- Continue with Step 3
|
|
|
+
|
|
|
+**Branch B (no conversion layer): `getReport`'s return value is handed to components as-is**
|
|
|
+- In `RealDialogueAPI.getReport()`, convert `rounds[]` into `OverallEvaluation.sentenceEvaluations[]`, where each `student` round emits a `SentenceEvaluation` carrying `feedback: r.evaluation?.contentFeedback ?? undefined`
|
|
|
+- Continue with Step 3
|
|
|
+
|
|
|
+**Branch C (the mock API already returns `{ evaluation: OverallEvaluation }` but the real backend is unadapted):** this is the most likely current state. In that case an explicit adapter must be written in `RealDialogueAPI.getReport()`; implement it as in Branch B.
|
|
|
+
|
|
|
+- [ ] **Step 3: Add an adapter in `RealDialogueAPI.getReport` (assuming Branch B/C)**
|
|
|
+
|
|
|
+Open `PPT/src/views/Editor/EnglishSpeaking/services/llmService.ts`.
|
|
|
+
|
|
|
+Change:
|
|
|
+
|
|
|
+```typescript
|
|
|
+ async getReport(sessionId: string): Promise<DialogueReport> {
|
|
|
+ const res = await fetch(`${API_BASE}/report?sessionId=${encodeURIComponent(sessionId)}`, {
|
|
|
+ credentials: 'include',
|
|
|
+ })
|
|
|
+ if (!res.ok) throw new Error(`getReport failed: ${res.status}`)
|
|
|
+ return res.json()
|
|
|
+ }
|
|
|
+```
|
|
|
+
|
|
|
+to:
|
|
|
+
|
|
|
+```typescript
|
|
|
+ async getReport(sessionId: string): Promise<DialogueReport> {
|
|
|
+ const res = await fetch(`${API_BASE}/report?sessionId=${encodeURIComponent(sessionId)}`, {
|
|
|
+ credentials: 'include',
|
|
|
+ })
|
|
|
+ if (!res.ok) throw new Error(`getReport failed: ${res.status}`)
|
|
|
+ const raw = await res.json() as BackendReportResponse
|
|
|
+ return adaptReport(raw)
|
|
|
+ }
|
|
|
+```
|
|
|
+
|
|
|
+**Before** the `RealDialogueAPI` class definition, add:
|
|
|
+
|
|
|
+```typescript
|
|
|
+interface BackendEvaluation {
|
|
|
+ status: 'pending' | 'completed' | 'failed'
|
|
|
+ accuracyScore: number | null
|
|
|
+ fluencyScore: number | null
|
|
|
+ completenessScore: number | null
|
|
|
+ prosodyScore: number | null
|
|
|
+ wordAnalysis: unknown
|
|
|
+ contentFeedback: {
|
|
|
+ highlights: string[]
|
|
|
+ corrections: { original: string; corrected: string; explanation: string }[]
|
|
|
+ suggestions: string[]
|
|
|
+ } | null
|
|
|
+}
|
|
|
+
|
|
|
+interface BackendRound {
|
|
|
+ round: number
|
|
|
+ role: 'ai' | 'student'
|
|
|
+ content: string
|
|
|
+ audioUrl: string | null
|
|
|
+ evaluation?: BackendEvaluation
|
|
|
+}
|
|
|
+
|
|
|
+interface BackendReportResponse {
|
|
|
+ sessionId: string
|
|
|
+ topic: string
|
|
|
+ status: 'evaluating' | 'ready'
|
|
|
+ rounds: BackendRound[]
|
|
|
+ summary: string | null
|
|
|
+}
|
|
|
+
|
|
|
+function adaptReport(raw: BackendReportResponse): DialogueReport {
|
|
|
+ const sentenceEvaluations: SentenceEvaluation[] = raw.rounds.map((r, idx) => ({
|
|
|
+ id: `${raw.sessionId}-${idx}`,
|
|
|
+ round: r.round,
|
|
|
+ role: r.role,
|
|
|
+ content: r.content,
|
|
|
+ audioUrl: r.audioUrl ?? undefined,
|
|
|
+ pronunciation: r.evaluation && r.role === 'student'
|
|
|
+ ? {
|
|
|
+ accuracy: r.evaluation.accuracyScore ?? 0,
|
|
|
+ fluency: r.evaluation.fluencyScore ?? 0,
|
|
|
+ // enspeak 原型用 intonation/stress 做 UI label;把 Azure 的 prosody/completeness 分别
|
|
|
+ // 映射到这两格(prosody → intonation 表示语调、completeness → stress 表示完整读出)。
|
|
|
+ // 这是一个 UI 贴合性决定,如未来 UI 统一改用 Azure 四维,再把 key 改回来。
|
|
|
+ intonation: r.evaluation.prosodyScore ?? 0,
|
|
|
+ stress: r.evaluation.completenessScore ?? 0,
|
|
|
+ }
|
|
|
+ : undefined,
|
|
|
+ feedback: r.evaluation?.contentFeedback ?? undefined,
|
|
|
+ }))
|
|
|
+
|
|
|
+ // overallScore 先用平均分作为 MVP 占位;其他字段留空/安全默认。
|
|
|
+ const studentEvals = sentenceEvaluations.filter(s => s.role === 'student' && s.pronunciation)
|
|
|
+ const avg = studentEvals.length > 0
|
|
|
+ ? Math.round(
|
|
|
+ studentEvals.reduce(
|
|
|
+ (sum, s) => sum + (s.pronunciation!.accuracy + s.pronunciation!.fluency + s.pronunciation!.intonation + s.pronunciation!.stress) / 4,
|
|
|
+ 0,
|
|
|
+ ) / studentEvals.length,
|
|
|
+ )
|
|
|
+ : 0
|
|
|
+
|
|
|
+ return {
|
|
|
+ evaluation: {
|
|
|
+ overallScore: avg,
|
|
|
+ scoreLevel: avg >= 85 ? 'excellent' : avg >= 70 ? 'good' : avg >= 60 ? 'fair' : 'needsWork',
|
|
|
+ percentile: 0,
|
|
|
+ dimensions: { fluency: 0, interaction: 0, vocabulary: 0, grammar: 0 },
|
|
|
+ aiComment: raw.summary ?? '',
|
|
|
+ highlights: [],
|
|
|
+ improvements: [],
|
|
|
+ nextChallenge: {},
|
|
|
+ statistics: {
|
|
|
+ totalRounds: Math.max(...sentenceEvaluations.map(s => s.round), 0),
|
|
|
+ averageScore: avg,
|
|
|
+ highestScore: 0,
|
|
|
+ highestRound: 0,
|
|
|
+ grammarErrors: 0,
|
|
|
+ excellentExpressions: 0,
|
|
|
+ totalDuration: 0,
|
|
|
+ },
|
|
|
+ sentenceEvaluations,
|
|
|
+ },
|
|
|
+ }
|
|
|
+}
|
|
|
+```
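+
+The adapter above assumes `DialogueReport` is essentially `{ evaluation: OverallEvaluation }` and that `SentenceEvaluation` exposes `id/round/role/content/audioUrl/pronunciation/feedback`; check the actual definitions in `englishSpeaking.ts` and trim or extend the mapped object to match. The only contract this task truly depends on is `sentenceEvaluations[i].feedback`.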
|
|
|
+
|
|
|
+Then add `SentenceEvaluation` to the imports at the top:
|
|
|
+
|
|
|
+```typescript
|
|
|
+import type {
|
|
|
+ DialogueAPI, DialogueReport, SessionConfig, SessionInfo, SSEEvent,
|
|
|
+ SentenceEvaluation,
|
|
|
+} from '@/types/englishSpeaking'
|
|
|
+```
|
|
|
+
|
|
|
+(If `SentenceEvaluation` is not exported from `englishSpeaking.ts`, open that file first and make sure `export interface SentenceEvaluation` carries the `export` keyword.)
|
|
|
+
|
|
|
+**Note**: if the Step 1 grep reveals an existing adapter function, **defer to that adaptation layer**: only append the `feedback` field there instead of creating a new adapter. In that case skip the whole `adaptReport` block above and add the one-line pass-through to the existing function.
|
|
|
+
|
|
|
+- [ ] **Step 4: Type-check**
|
|
|
+
|
|
|
+```bash
|
|
|
+cd /Users/buoy/Development/gitrepo/PPT
|
|
|
+npm run type-check
|
|
|
+```
|
|
|
+
|
|
|
+(Adjust accordingly if the project uses `pnpm` / `yarn`. If there is no `type-check` script, run `npx vue-tsc --noEmit`.)
|
|
|
+
|
|
|
+Expected: no type errors.
|
|
|
+
|
|
|
+- [ ] **Step 5: Manual smoke verification**
|
|
|
+
|
|
|
+1. Start the backend:
|
|
|
+ ```bash
|
|
|
+ cd /Users/buoy/Development/gitrepo/cococlass-english-speaking-api
|
|
|
+ uv run uvicorn app.main:app --reload
|
|
|
+ ```
|
|
|
+2. Start the frontend:
|
|
|
+ ```bash
|
|
|
+ cd /Users/buoy/Development/gitrepo/PPT
|
|
|
+ npm run dev
|
|
|
+ ```
|
|
|
+3. In the browser, open the EnglishSpeaking component and complete a round of dialogue.
|
|
|
+4. Open the results page (DetailedReport) and confirm each student sentence shows the three sections: highlights / corrections / suggestions.
|
|
|
+5. Also check the backend DB:
|
|
|
+ ```sql
|
|
|
+ SELECT round, status, accuracy_score, content_feedback
|
|
|
+ FROM pronunciation_evaluation
|
|
|
+ WHERE session_id = (SELECT id FROM dialogue_session ORDER BY id DESC LIMIT 1);
|
|
|
+ ```
|
|
|
+   Confirm that `content_feedback` has the `{highlights, corrections, suggestions}` structure (or is `NULL` if the LLM call failed).
|
|
|
+
|
|
|
+If any check fails, go back to the corresponding Task to track down the bug.
|
|
|
+
|
|
|
+- [ ] **Step 6: Commit**
|
|
|
+
|
|
|
+```bash
|
|
|
+cd /Users/buoy/Development/gitrepo/PPT
|
|
|
+git add src/views/Editor/EnglishSpeaking/services/llmService.ts
|
|
|
+git commit -m "feat(english-speaking): 结果页透传 contentFeedback 到 SentenceCard"
|
|
|
+```
|
|
|
+
|
|
|
+---
|
|
|
+
|
|
|
+## Task 7: Regression-check all tests and the existing flows
|
|
|
+
|
|
|
+- [ ] **Step 1: Run the full backend test suite**
|
|
|
+
|
|
|
+```bash
|
|
|
+cd /Users/buoy/Development/gitrepo/cococlass-english-speaking-api
|
|
|
+uv run pytest -v
|
|
|
+```
|
|
|
+
|
|
|
+Expected: all tests pass (including the 11 added by this plan: 1 smoke + 4 evaluator + 3 content-dispatch + 3 report).
|
|
|
+
|
|
|
+- [ ] **Step 2: Run the frontend type check**
|
|
|
+
|
|
|
+```bash
|
|
|
+cd /Users/buoy/Development/gitrepo/PPT
|
|
|
+npm run type-check
|
|
|
+```
|
|
|
+
|
|
|
+Expected: no type errors.
|
|
|
+
|
|
|
+- [ ] **Step 3: Record both repos' HEAD as the completion marker for this implementation**
|
|
|
+
|
|
|
+```bash
|
|
|
+echo "backend: $(git -C /Users/buoy/Development/gitrepo/cococlass-english-speaking-api rev-parse --short HEAD)"
|
|
|
+echo "frontend: $(git -C /Users/buoy/Development/gitrepo/PPT rev-parse --short HEAD)"
|
|
|
+```
|
|
|
+
|
|
|
+Paste the output into the "Completion Record" section at the bottom of this plan file.
|
|
|
+
|
|
|
+---
|
|
|
+
|
|
|
+## Completion Record (fill in during implementation)
|
|
|
+
|
|
|
+- Plan completion date: _____
|
|
|
+- Backend HEAD: _____
|
|
|
+- Frontend HEAD: _____
|
|
|
+- Deviations or extra notes: _____
|