2 weeks ago · 457af99003
--- a/docs/superpowers/specs/2026-04-25-english-speaking-report-pipeline-design.md
+++ b/docs/superpowers/specs/2026-04-25-english-speaking-report-pipeline-design.md
@@ -105,17 +105,15 @@ Both keys are required. Values may be empty strings only if the evaluator fails
 
				 
			
 
				 ## Overall Evaluation
			
 
				 
			
 
				-Overall evaluation is generated only when every expected student answer has a completed pronunciation evaluation.
			
 
				+Overall evaluation is generated only when the conversation has ended and every expected student answer has a completed pronunciation evaluation.
			
 
				 
			
 
				-The ready condition is:
			
 
				+The report API status should be one of:
			
 
				 
			
 
				-- The session has at least one student message.
			
 
				-- Every student message has a `pronunciation_evaluation` row.
			
 
				-- Every evaluation has `status = "completed"`.
			
 
				-
			
 
				-If any evaluation is still `pending`, `/report` returns `status: "evaluating"` and should not generate the overall report.
			
 
				+```text
			
 
				+evaluating | ready | failed | incomplete
			
 
				+```
			
 
				 
			
 
				-If any evaluation has `status = "failed"`, `/report` returns `status: "failed"` for the full report and should not generate the complete overall report. The frontend may still show the available detailed rows. A failed pronunciation evaluation can still include text-only `contentFeedback`.
			
 
				+If the conversation has not ended or any evaluation is still `pending`, `/report` returns `status: "evaluating"` and should not generate the overall report. A failed pronunciation evaluation can still include text-only `contentFeedback`, but it blocks Full OverallReport.
			
 
				 
			
 
				 When all sentence evaluations are completed, the backend generates and caches the overall report. Overall output owns:
			
 
				 
			
@@ -130,6 +128,111 @@ These fields must be based on all expected student answers, all Azure scores, al
 
				 
			
 
				 Full OverallReport still requires all expected student answers to have completed Azure pronunciation scores. Text-only sentence feedback is useful for `DetailedReport`, but it is not enough to compute complete aggregate pronunciation scores.
			
 
				 
			
 
				+## Overall Generation Gate
			
 
				+
			
 
				+`/report` should decide whether Full OverallReport can be generated with an explicit gate:
			
 
				+
			
 
				+```text
			
 
				+1. The conversation has ended.
			
 
				+2. At least one valid student answer exists.
			
 
				+3. Every saved student answer has a PronunciationEvaluation row.
			
 
				+4. Every PronunciationEvaluation is terminal.
			
 
				+5. Every PronunciationEvaluation has status = completed.
			
 
				+6. OverallReport has not already been generated and cached.
			
 
				+```
			
 
				+
			
 
				+The first implementation should use saved student messages as the expected answer set, not `total_rounds`. `total_rounds` is the configured target, but it may not equal the final number of valid answers: time mode can expire, the user can end practice manually, ASR can fail before a student message is created, and the final AI closing message must not create another expected student answer.
			
 
				+
			
 
				+Recommended status decisions:
			
 
				+
			
 
				+```text
			
 
				+evaluating:
			
 
				+  - conversation has not ended; or
			
 
				+  - a saved student answer has no evaluation row yet; or
			
 
				+  - at least one evaluation is still pending/running/retrying
			
 
				+
			
 
				+ready:
			
 
				+  - conversation has ended; and
			
 
				+  - at least one valid student answer exists; and
			
 
				+  - every saved student answer has completed Azure pronunciation scores; and
			
 
				+  - overall has been generated or can be generated now
			
 
				+
			
 
				+failed:
			
 
				+  - conversation has ended; and
			
 
				+  - at least one valid student answer exists; and
			
 
				+  - any evaluation is failed after retries
			
 
				+
			
 
				+incomplete:
			
 
				+  - conversation has ended; and
			
 
				+  - either no valid student answers exist; or
			
 
				+    the product later requires a minimum valid answer count and the session did not reach it
			
 
				+```
			
 
				+
			
 
				+If the product later requires exactly three valid answers for a completed practice, that requirement should be added as a product-level gate. It should not be inferred from `total_rounds` inside the report generator without an explicit requirement.
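
The status decisions above can be sketched as a single pure function. This is a minimal illustration, not the spec's implementation: `StudentAnswer`, `decide_report_status`, and `min_valid_answers` are hypothetical names. It also assumes that an in-progress evaluation takes precedence over a failed one when both are present, which the spec does not state.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Non-terminal evaluation states named in the status decisions.
IN_PROGRESS = {"pending", "running", "retrying"}


@dataclass
class StudentAnswer:
    # Hypothetical shape: None means no PronunciationEvaluation row exists yet.
    evaluation_status: Optional[str]


def decide_report_status(
    conversation_ended: bool,
    answers: Sequence[StudentAnswer],
    min_valid_answers: int = 1,
) -> str:
    """Return one of: evaluating | ready | failed | incomplete."""
    if not conversation_ended:
        return "evaluating"
    # At least one answer is always required, even if the product minimum is lower.
    if len(answers) < max(1, min_valid_answers):
        return "incomplete"
    statuses = [answer.evaluation_status for answer in answers]
    # Assumption: unfinished work is reported before failures.
    if any(s is None or s in IN_PROGRESS for s in statuses):
        return "evaluating"
    if any(s == "failed" for s in statuses):
        return "failed"
    if all(s == "completed" for s in statuses):
        return "ready"
    raise ValueError(f"unknown evaluation status in {statuses!r}")
```

Because the function only reads state, calling it on every `/report` poll has no side effects. A later product-level minimum (for example, three valid answers) would be passed in as `min_valid_answers` rather than derived from `total_rounds`.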
			
 
				+
			
 
				+## Bounded Retries
			
 
				+
			
 
				+Automatic retries must be bounded and owned by the worker performing that stage.
			
 
				+
			
 
				+`/report` must never create or retry sentence-evaluation work. It only reads sentence evaluation state and, when the Overall Generation Gate passes, generates or returns cached OverallReport data. This prevents frontend polling from starting duplicate sentence jobs.
			
 
				+
			
 
				+Each external call should have an explicit maximum attempt count:
			
 
				+
			
 
				+```text
			
 
				+Azure pronunciation scoring: max_attempts = 3
			
 
				+Sentence feedback LLM: max_attempts = 1 or 2
			
 
				+OverallReport LLM: max_attempts = 1 or 2
			
 
				+```
			
 
				+
			
 
				+Attempt exhaustion must always produce a terminal state:
			
 
				+
			
 
				+```text
			
 
				+Azure exhausted:
			
 
				+  evaluation.status = failed
			
 
				+  optional text-only contentFeedback may still be attempted
			
 
				+
			
 
				+Sentence feedback exhausted after Azure success:
			
 
				+  evaluation.status = completed
			
 
				+  contentFeedback = null
			
 
				+
			
 
				+Sentence feedback exhausted after Azure failure:
			
 
				+  evaluation.status = failed
			
 
				+  contentFeedback = null
			
 
				+
			
 
				+OverallReport exhausted:
			
 
				+  report status = failed
			
 
				+  no automatic retry on the next /report poll
			
 
				+```
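
The sentence-level terminal states above reduce to one rule: the evaluation status depends only on whether Azure scoring succeeded, and `contentFeedback` is whatever the feedback stage produced, or null. A minimal sketch follows, with `SentenceOutcome` and `resolve_sentence_outcome` as hypothetical names.

```python
from typing import NamedTuple, Optional


class SentenceOutcome(NamedTuple):
    status: str                      # "completed" or "failed", both terminal
    content_feedback: Optional[str]  # None when the feedback stage was exhausted


def resolve_sentence_outcome(
    azure_succeeded: bool, feedback: Optional[str]
) -> SentenceOutcome:
    # Azure success or failure alone decides the status;
    # feedback never upgrades a failed evaluation to completed.
    status = "completed" if azure_succeeded else "failed"
    return SentenceOutcome(status, feedback)
```

For example, `resolve_sentence_outcome(False, "Good vocabulary")` stays `failed` but keeps the text-only feedback, so it can still appear in `DetailedReport` while blocking the full OverallReport.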
			
 
				+
			
 
				+Once a stage reaches a terminal state, normal `/report` polling must not rerun it:
			
 
				+
			
 
				+```text
			
 
				+evaluation.status = completed
			
 
				+evaluation.status = failed
			
 
				+overall report exists
			
 
				+overall generation failed
			
 
				+```
			
 
				+
			
 
				+An explicit retry can be added later as a separate admin or user action, but it must not be implicit in report polling.
			
 
				+
			
 
				+If the first implementation does not add persistent retry counters or job rows, retries should still be bounded inside the single background task:
			
 
				+
			
 
				+```py
			
 
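				+import asyncio
			
 
				+
			
 
				+# run_stage, mark_success, log_attempt_failure, mark_terminal_failure and
			
 
				+# backoff_seconds are stage-specific hooks supplied by the worker.
			
 
				+async def run_with_bounded_retries(max_attempts):
			
 
				+    for attempt in range(1, max_attempts + 1):
			
 
				+        try:
			
 
				+            run_stage()
			
 
				+            mark_success()
			
 
				+            return
			
 
				+        except Exception as exc:
			
 
				+            log_attempt_failure(attempt, exc)
			
 
				+            if attempt == max_attempts:
			
 
				+                mark_terminal_failure()
			
 
				+                return
			
 
				+            await asyncio.sleep(backoff_seconds(attempt))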
			
 
				+```
			
 
				+
			
 
				+OverallReport lazy generation needs the same protection. If the backend attempts OverallReport generation inside `/report`, a failed generation must set a terminal failed state for that report response path or use an in-process guard so frontend polling does not repeatedly invoke the OverallReport LLM.
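
One possible shape for the in-process guard mentioned above is a small wrapper that records the terminal result per session, so repeated polls return the stored result instead of calling the LLM again. This is only a sketch under assumptions: `OverallReportGuard`, `get_or_generate`, and the `generate` callable are hypothetical, the state lives in memory (it does not survive restarts or span multiple worker processes), and no locking is shown for concurrent requests.

```python
from typing import Callable, Dict


class OverallReportGuard:
    """Remembers the terminal outcome of OverallReport generation per session."""

    def __init__(self, generate: Callable[[str], dict], max_attempts: int = 2):
        self._generate = generate          # hypothetical call into the OverallReport LLM
        self._max_attempts = max_attempts
        self._terminal: Dict[str, dict] = {}

    def get_or_generate(self, session_id: str) -> dict:
        # A terminal result (ready or failed) is never recomputed by polling.
        if session_id in self._terminal:
            return self._terminal[session_id]
        result = {"status": "failed", "overall": None}
        for _ in range(self._max_attempts):
            try:
                result = {"status": "ready", "overall": self._generate(session_id)}
                break
            except Exception:
                continue  # bounded: the loop ends after max_attempts
        self._terminal[session_id] = result
        return result
```

With this guard, a generator that always raises is invoked at most `max_attempts` times for a session, no matter how often the frontend polls `/report`. A persistent failed marker on the report row would give the same protection across restarts.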
			
 
				+
			
 
				 ## API Contract
			
 
				 
			
 
				 `GET /report?sessionId=...` remains the frontend entry point.
			
@@ -249,6 +352,7 @@ If `status !== "ready"`, the frontend should show a report-generating state and
 
				 
			
 
				 - `status === "ready"`
			
 
				 - `status === "failed"`
			
 
				+- `status === "incomplete"`
			
 
				 - a retry limit is reached
			
 
				 - the request fails with an unrecoverable error