@@ -105,17 +105,15 @@ Both keys are required. Values may be empty strings only if the evaluator fails
## Overall Evaluation
-Overall evaluation is generated only when every expected student answer has a completed pronunciation evaluation.
+Overall evaluation is generated only when the conversation has ended and every expected student answer has a completed pronunciation evaluation.
-The ready condition is:
+The report API status should be one of:
-- The session has at least one student message.
-- Every student message has a `pronunciation_evaluation` row.
-- Every evaluation has `status = "completed"`.
-
-If any evaluation is still `pending`, `/report` returns `status: "evaluating"` and should not generate the overall report.
+```text
+evaluating | ready | failed | incomplete
+```
-If any evaluation has `status = "failed"`, `/report` returns `status: "failed"` for the full report and should not generate the complete overall report. The frontend may still show the available detailed rows. A failed pronunciation evaluation can still include text-only `contentFeedback`.
+If the conversation has not ended or any evaluation is still `pending`, `/report` returns `status: "evaluating"` and should not generate the overall report. A failed pronunciation evaluation can still include text-only `contentFeedback`, but it blocks generation of the Full OverallReport.
When all sentence evaluations are completed, the backend generates and caches the overall report. Overall output owns:
@@ -130,6 +128,111 @@ These fields must be based on all expected student answers, all Azure scores, al
Full OverallReport still requires all expected student answers to have completed Azure pronunciation scores. Text-only sentence feedback is useful for `DetailedReport`, but it is not enough to compute complete aggregate pronunciation scores.
+## Overall Generation Gate
+
+`/report` should decide whether Full OverallReport can be generated with an explicit gate:
+
+```text
+1. The conversation has ended.
+2. At least one valid student answer exists.
+3. Every saved student answer has a PronunciationEvaluation row.
+4. Every PronunciationEvaluation is terminal.
+5. Every PronunciationEvaluation has status = completed.
+6. OverallReport has not already been generated and cached.
+```
+
+The first implementation should use saved student messages as the expected answer set, not `total_rounds`. `total_rounds` is the configured target, but it may not equal the final number of valid answers: time mode can expire, the user can end practice manually, ASR can fail before a student message is created, and the final AI closing message should not create another expected student answer.
+
+Recommended status decisions:
+
+```text
+evaluating:
+  - conversation has not ended; or
+  - a saved student answer has no evaluation row yet; or
+  - at least one evaluation is still pending/running/retrying
+
+ready:
+  - conversation has ended; and
+  - at least one valid student answer exists; and
+  - every saved student answer has completed Azure pronunciation scores; and
+  - overall has been generated or can be generated now
+
+failed:
+  - conversation has ended; and
+  - at least one valid student answer exists; and
+  - any evaluation is failed after retries
+
+incomplete:
+  - conversation has ended; and
+  - either no valid student answers exist, or the product later requires a minimum valid answer count and the session did not reach it
+```
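The status decisions above can be sketched as a single pure function. This is a minimal illustration, not the backend's actual code: `StudentAnswer`, `decide_report_status`, and the `TERMINAL` set are hypothetical names, and the sketch assumes that `failed` is reported only once every evaluation is terminal, so a session with one failed and one pending evaluation still reads as `evaluating`.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Assumption: "completed" and "failed" are the only terminal evaluation states.
TERMINAL = {"completed", "failed"}


@dataclass
class StudentAnswer:
    # None means no PronunciationEvaluation row exists yet for this answer.
    evaluation_status: Optional[str]


def decide_report_status(conversation_ended: bool,
                         answers: Sequence[StudentAnswer]) -> str:
    """Return evaluating | ready | failed | incomplete for one session."""
    if not conversation_ended:
        return "evaluating"
    if not answers:
        # Ended with no valid student answers: nothing to aggregate.
        return "incomplete"
    statuses = [a.evaluation_status for a in answers]
    if any(s not in TERMINAL for s in statuses):
        # Missing row, or pending/running/retrying work is still in flight.
        return "evaluating"
    if any(s == "failed" for s in statuses):
        return "failed"
    return "ready"
```

Because the function only reads state, calling it on every `/report` poll cannot start or retry sentence-evaluation work.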
+
+If the product later requires exactly three valid answers for a completed practice, that requirement should be added as a product-level gate. It should not be inferred from `total_rounds` inside the report generator without an explicit requirement.
+
+## Bounded Retries
+
+Automatic retries must be bounded and owned by the worker performing that stage.
+
+`/report` must never create or retry sentence-evaluation work. It only reads sentence evaluation state and, when the Overall Generation Gate passes, generates or returns cached OverallReport data. This prevents frontend polling from starting duplicate sentence jobs.
+
+Each external call should have an explicit maximum attempt count:
+
+```text
+Azure pronunciation scoring: max_attempts = 3
+Sentence feedback LLM: max_attempts = 1 or 2
+OverallReport LLM: max_attempts = 1 or 2
+```
+
+Attempt exhaustion must always produce a terminal state:
+
+```text
+Azure exhausted:
+  evaluation.status = failed
+  optional text-only contentFeedback may still be attempted
+
+Sentence feedback exhausted after Azure success:
+  evaluation.status = completed
+  contentFeedback = null
+
+Sentence feedback exhausted after Azure failure:
+  evaluation.status = failed
+  contentFeedback = null
+
+OverallReport exhausted:
+  report status = failed
+  no automatic retry on the next /report poll
+```
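Read together, the sentence-level cases above imply that `evaluation.status` follows the Azure outcome alone, while `contentFeedback` is simply whatever the feedback stage produced, or null when it was exhausted. A small sketch of that mapping, with `resolve_sentence_outcome` as a hypothetical helper name:

```python
from typing import Optional, Tuple


def resolve_sentence_outcome(azure_succeeded: bool,
                             feedback: Optional[str]) -> Tuple[str, Optional[str]]:
    """Map stage results to (evaluation.status, contentFeedback).

    `feedback` is None when the sentence feedback LLM exhausted its attempts.
    """
    # Status depends only on Azure scoring; feedback never upgrades or
    # downgrades it.
    status = "completed" if azure_succeeded else "failed"
    return status, feedback
```

Under this reading, text-only feedback produced after an Azure failure is kept for `DetailedReport`, but the evaluation still counts as failed for the overall gate.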
+
+Once a stage reaches a terminal state, normal `/report` polling must not rerun it:
+
+```text
+evaluation.status = completed
+evaluation.status = failed
+overall report exists
+overall generation failed
+```
+
+Future explicit retry can be added as a separate admin or user action, but it must not be implicit in report polling.
+
+If the first implementation does not add persistent retry counters or job rows, retries should still be bounded inside the single background task:
+
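+```py
+import asyncio
+
+# run_stage, mark_success, mark_terminal_failure, log_attempt_failure and
+# backoff_seconds are placeholders supplied by the worker that owns the stage.
+async def run_with_bounded_retries(max_attempts):
+    for attempt in range(1, max_attempts + 1):
+        try:
+            run_stage()
+            mark_success()
+            return
+        except Exception as exc:
+            log_attempt_failure(attempt, exc)
+            if attempt == max_attempts:
+                mark_terminal_failure()  # terminal: polling must not rerun it
+                return
+            await asyncio.sleep(backoff_seconds(attempt))
+```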
+
+OverallReport lazy generation needs the same protection. If the backend attempts OverallReport generation inside `/report`, a failed generation must set a terminal failed state for that report response path or use an in-process guard so frontend polling does not repeatedly invoke the OverallReport LLM.
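One way to picture the in-process guard is a cache that records terminal failures as well as successes. The sketch below is illustrative only: `_overall_cache` and `get_or_generate_overall` are hypothetical names, the `generate` callable stands in for the OverallReport LLM call, and a plain module-level dict is assumed. Such a dict is neither shared across workers nor preserved across restarts, and it is not thread-safe.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical in-process cache: session id -> ("ready", report) or ("failed", None).
_overall_cache: Dict[str, Tuple[str, Optional[object]]] = {}


def get_or_generate_overall(session_id: str,
                            generate: Callable[[], object],
                            max_attempts: int = 2) -> Tuple[str, Optional[object]]:
    """Return the cached outcome, generating at most once per session."""
    cached = _overall_cache.get(session_id)
    if cached is not None:
        # Terminal outcome already recorded: never call the LLM again here.
        return cached
    result: Tuple[str, Optional[object]] = ("failed", None)
    for _ in range(max_attempts):
        try:
            result = ("ready", generate())
            break
        except Exception:
            result = ("failed", None)
    # Store failures too, so repeated polls do not re-invoke generation.
    _overall_cache[session_id] = result
    return result
```

Clearing the cached entry would then be the job of a separate, explicit retry action rather than of report polling.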
+
## API Contract
`GET /report?sessionId=...` remains the frontend entry point.
@@ -249,6 +352,7 @@ If `status !== "ready"`, the frontend should show a report-generating state and
- `status === "ready"`
- `status === "failed"`
+- `status === "incomplete"`
- a retry limit is reached
- the request fails with an unrecoverable error