---
name: 'step-03-quality-evaluation'
description: 'Orchestrate adaptive quality dimension checks (agent-team, subagent, or sequential)'
nextStepFile: '{skill-root}/steps-c/step-03f-aggregate-scores.md'
---

# Step 3: Orchestrate Adaptive Quality Evaluation

## STEP GOAL

Select the execution mode deterministically, then evaluate the four quality dimensions below using agent-team, subagent, or sequential execution while preserving output contracts:

- Determinism
- Isolation
- Maintainability
- Performance

Coverage is intentionally excluded from this workflow and handled by `trace`.

## MANDATORY EXECUTION RULES

- 📖 Read the entire step file before acting
- ✅ Speak in `{communication_language}`
- ✅ Resolve execution mode from config (`tea_execution_mode`, `tea_capability_probe`)
- ✅ Apply fallback rules deterministically when requested mode is unsupported
- ✅ Wait for required worker steps to complete
- ❌ Do NOT skip capability checks when probing is enabled
- ❌ Do NOT proceed until required worker steps finish

---

## EXECUTION PROTOCOLS:

- 🎯 Follow the MANDATORY SEQUENCE exactly
- 💾 Wait for subagent outputs
- 📖 Load the next step only when instructed

## CONTEXT BOUNDARIES:

- Available context: test files from Step 2, knowledge fragments
- Focus: orchestration only (mode selection + worker dispatch)
- Limits: do not evaluate quality directly (delegate to worker steps)

---

## MANDATORY SEQUENCE

### 1. Prepare Execution Context

**Generate unique timestamp:**

```javascript
const timestamp = new Date().toISOString().replace(/[:.]/g, '-');
```
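
To illustrate what this produces: the ISO string contains `:` and `.`, which are awkward in file names, so both are replaced with `-`. The sketch below wraps the same expression in a hypothetical helper, `makeRunTimestamp` (not part of the workflow), so it can be called with a fixed date.

```javascript
// Hypothetical wrapper around the expression above; it accepts a Date so the
// result can be reproduced with a fixed input.
const makeRunTimestamp = (date = new Date()) => date.toISOString().replace(/[:.]/g, '-');

const exampleTimestamp = makeRunTimestamp(new Date('2024-01-02T03:04:05.678Z'));
console.log(exampleTimestamp); // 2024-01-02T03-04-05-678Z
console.log(`/tmp/tea-test-review-determinism-${exampleTimestamp}.json`);
```

Because the value includes milliseconds, two runs started at different instants get different output paths, which is what lets Step 3F read exactly the files written by this run.
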

**Prepare context for all subagents:**

```javascript
// Accepts real booleans as well as common string spellings ("false", "0", "off", "no", ...).
const parseBooleanFlag = (value, defaultValue = true) => {
  if (typeof value === 'string') {
    const normalized = value.trim().toLowerCase();
    if (['false', '0', 'off', 'no'].includes(normalized)) return false;
    if (['true', '1', 'on', 'yes'].includes(normalized)) return true;
  }
  if (value === undefined || value === null) return defaultValue;
  return Boolean(value);
};

const subagentContext = {
  test_files: testFiles, // placeholder name: the test files collected in Step 2
  knowledge_fragments_loaded: ['test-quality'],
  config: {
    execution_mode: config.tea_execution_mode || 'auto', // "auto" | "subagent" | "agent-team" | "sequential"
    capability_probe: parseBooleanFlag(config.tea_capability_probe, true), // supports booleans and "false"/"true" strings
  },
  timestamp,
};
```
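
The flag parsing has a few edge cases that are easy to miss, so here is a small self-contained sketch that repeats `parseBooleanFlag` from above and prints its result for a handful of sample values. The sample list is illustrative only.

```javascript
// Same logic as parseBooleanFlag above, repeated so this snippet runs on its own.
const parseBooleanFlag = (value, defaultValue = true) => {
  if (typeof value === 'string') {
    const normalized = value.trim().toLowerCase();
    if (['false', '0', 'off', 'no'].includes(normalized)) return false;
    if (['true', '1', 'on', 'yes'].includes(normalized)) return true;
  }
  if (value === undefined || value === null) return defaultValue;
  return Boolean(value);
};

// Illustrative inputs: strings, booleans, and missing values.
const samples = ['false', ' OFF ', '0', 'yes', true, false, undefined, null, 'maybe', ''];
for (const sample of samples) {
  console.log(JSON.stringify(sample), '=>', parseBooleanFlag(sample));
}
```

Note that an unrecognized non-empty string such as `'maybe'` falls through to `Boolean(value)` and is treated as `true`, while the empty string is treated as `false`. Only `undefined` and `null` pick up the default.
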

---

### 2. Resolve Execution Mode with Capability Probe

```javascript
// Normalize a free-form mode hint from the user (e.g. "Agent Team", "sub_agents").
const normalizeUserExecutionMode = (mode) => {
  if (typeof mode !== 'string') return null;
  const normalized = mode.trim().toLowerCase().replace(/[-_]/g, ' ').replace(/\s+/g, ' ');
  if (normalized === 'auto') return 'auto';
  if (normalized === 'sequential') return 'sequential';
  if (['subagent', 'sub agent', 'subagents', 'sub agents'].includes(normalized)) return 'subagent';
  if (['agent team', 'agent teams', 'agentteam'].includes(normalized)) return 'agent-team';
  return null;
};

// Config values must match one of the canonical mode names exactly.
const normalizeConfigExecutionMode = (mode) =>
  ['auto', 'sequential', 'subagent', 'agent-team'].includes(mode) ? mode : null;

// Explicit user instruction in the active run takes priority over config.
const explicitModeFromUser = normalizeUserExecutionMode(runtime.getExplicitExecutionModeHint?.() || null);
const requestedMode = explicitModeFromUser || normalizeConfigExecutionMode(subagentContext.config.execution_mode) || 'auto';
const probeEnabled = subagentContext.config.capability_probe;

// Capabilities stay false unless probing is enabled and the runtime confirms them.
const supports = {
  subagent: false,
  agentTeam: false,
};
if (probeEnabled) {
  supports.subagent = runtime.canLaunchSubagents?.() === true;
  supports.agentTeam = runtime.canLaunchAgentTeams?.() === true;
}

let resolvedMode = requestedMode;
if (requestedMode === 'auto') {
  // With probing disabled, `supports` is all false, so `auto` resolves to `sequential`.
  if (supports.agentTeam) resolvedMode = 'agent-team';
  else if (supports.subagent) resolvedMode = 'subagent';
  else resolvedMode = 'sequential';
} else if (probeEnabled && requestedMode === 'agent-team' && !supports.agentTeam) {
  resolvedMode = supports.subagent ? 'subagent' : 'sequential';
} else if (probeEnabled && requestedMode === 'subagent' && !supports.subagent) {
  resolvedMode = 'sequential';
}

subagentContext.execution = {
  requestedMode,
  resolvedMode,
  probeEnabled,
  supports,
};
```

Resolution precedence:

1. Explicit user request in this run (`agent team` => `agent-team`; `subagent` => `subagent`; `sequential`; `auto`)
2. `tea_execution_mode` from config
3. Runtime capability fallback (when probing is enabled)

If probing is disabled, honor the requested mode strictly. If that mode cannot be executed at runtime, fail with an explicit error instead of falling back silently.
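
The fallback chain and the strict path can be easier to check as a pure function. The sketch below is one possible restatement, not the workflow's own code: `resolveMode` and `assertExecutable` are hypothetical names, and `canExecute` stands in for whatever the runtime reports when the mode is actually launched.

```javascript
// Hypothetical pure restatement of the fallback rules described above.
const resolveMode = ({ requestedMode, probeEnabled, supports }) => {
  if (requestedMode === 'auto') {
    if (supports.agentTeam) return 'agent-team';
    if (supports.subagent) return 'subagent';
    return 'sequential';
  }
  if (!probeEnabled) return requestedMode; // strict: no silent fallback
  if (requestedMode === 'agent-team' && !supports.agentTeam) {
    return supports.subagent ? 'subagent' : 'sequential';
  }
  if (requestedMode === 'subagent' && !supports.subagent) return 'sequential';
  return requestedMode;
};

// Hypothetical guard for the strict path: raise an explicit error when the
// requested mode turns out not to be executable.
const assertExecutable = (mode, canExecute) => {
  if (!canExecute) {
    throw new Error(`Execution mode "${mode}" was requested with probing disabled but cannot be executed`);
  }
  return mode;
};

const none = { subagent: false, agentTeam: false };
console.log(resolveMode({ requestedMode: 'agent-team', probeEnabled: true, supports: none })); // sequential
console.log(resolveMode({ requestedMode: 'agent-team', probeEnabled: false, supports: none })); // agent-team
```

The two printed cases differ only in `probeEnabled`: with probing on, an unsupported `agent-team` request degrades to `sequential`; with probing off, the request is kept as-is and any failure has to surface as an error at launch time.
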

---

### 3. Dispatch 4 Quality Workers

**Subagent A: Determinism**

- File: `./step-03a-subagent-determinism.md`
- Output: `/tmp/tea-test-review-determinism-${timestamp}.json`
- Execution:
  - `agent-team` or `subagent`: launch non-blocking
  - `sequential`: run blocking and wait
- Status: Running... ⟳

**Subagent B: Isolation**

- File: `./step-03b-subagent-isolation.md`
- Output: `/tmp/tea-test-review-isolation-${timestamp}.json`
- Status: Running... ⟳

**Subagent C: Maintainability**

- File: `./step-03c-subagent-maintainability.md`
- Output: `/tmp/tea-test-review-maintainability-${timestamp}.json`
- Status: Running... ⟳

**Subagent D: Performance**

- File: `./step-03e-subagent-performance.md`
- Output: `/tmp/tea-test-review-performance-${timestamp}.json`
- Status: Running... ⟳

In `agent-team` and `subagent` modes, the runtime decides worker scheduling and concurrency.
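
As a rough sketch of the dispatch step, the worker list can be built from the dimension names and then run either one at a time or all at once. Everything here beyond the file and output paths is an assumption: `buildWorkers`, `dispatchWorkers`, and the injected `runWorker` callback are hypothetical, and `Promise.all` is only a stand-in for whatever scheduling the runtime actually applies.

```javascript
// Worker step files as listed above, keyed by quality dimension.
const WORKER_STEP_FILES = {
  determinism: './step-03a-subagent-determinism.md',
  isolation: './step-03b-subagent-isolation.md',
  maintainability: './step-03c-subagent-maintainability.md',
  performance: './step-03e-subagent-performance.md',
};

// Hypothetical helper: one descriptor per worker, with its expected output path.
const buildWorkers = (timestamp) =>
  Object.entries(WORKER_STEP_FILES).map(([dimension, stepFile]) => ({
    dimension,
    stepFile,
    output: `/tmp/tea-test-review-${dimension}-${timestamp}.json`,
  }));

// Hypothetical dispatcher. `runWorker` is an injected async callback that
// launches a single worker; it is an assumption, not a real runtime API.
const dispatchWorkers = async (workers, resolvedMode, runWorker) => {
  if (resolvedMode === 'sequential') {
    const results = [];
    for (const worker of workers) {
      results.push(await runWorker(worker)); // blocking: wait before starting the next one
    }
    return results;
  }
  // agent-team / subagent: start every worker without waiting in between.
  return Promise.all(workers.map((worker) => runWorker(worker)));
};

// Usage with a fake worker that only echoes its dimension.
const fakeRunWorker = async (worker) => ({ dimension: worker.dimension, output: worker.output });
dispatchWorkers(buildWorkers('2024-01-02T03-04-05-678Z'), 'sequential', fakeRunWorker).then((results) =>
  console.log(results.map((result) => result.dimension).join(', ')),
);
```

Both branches resolve to results in the same order as the worker list, so later steps do not need to know which mode was used.
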

---

### 4. Wait for Expected Worker Completion

**If `resolvedMode` is `agent-team` or `subagent`:**

```
⏳ Waiting for 4 quality subagents to complete...
✅ All 4 quality subagents completed successfully!
```

**If `resolvedMode` is `sequential`:**

```
✅ Sequential mode: each worker already completed during dispatch.
```

---

### 5. Verify All Outputs Exist

```javascript
const fs = require('node:fs');

// One expected output file per quality dimension, all sharing this run's timestamp.
const outputs = ['determinism', 'isolation', 'maintainability', 'performance'].map(
  (dim) => `/tmp/tea-test-review-${dim}-${timestamp}.json`,
);

outputs.forEach((output) => {
  if (!fs.existsSync(output)) {
    throw new Error(`Subagent output missing: ${output}`);
  }
});
```
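
The exit condition for this step also requires the output files to be valid JSON, which an existence check alone does not cover. A minimal sketch of a combined check follows; `verifyOutputs` is a hypothetical helper, and it makes no assumption about the schema inside each file.

```javascript
const fs = require('node:fs');

// Hypothetical helper: returns the parsed payloads keyed by path and throws
// on the first file that is missing or does not parse as JSON.
const verifyOutputs = (paths) => {
  const parsed = {};
  for (const path of paths) {
    if (!fs.existsSync(path)) {
      throw new Error(`Subagent output missing: ${path}`);
    }
    try {
      parsed[path] = JSON.parse(fs.readFileSync(path, 'utf8'));
    } catch (err) {
      throw new Error(`Subagent output is not valid JSON: ${path} (${err.message})`);
    }
  }
  return parsed;
};

// Usage (with `outputs` being the list of expected paths for this run):
//   const payloads = verifyOutputs(outputs);
```

Throwing on the first problem keeps the behavior aligned with the rule that the workflow must not proceed if any subagent failed.
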

---

### 6. Execution Report

```
🚀 Performance Report:
- Execution Mode: {resolvedMode}
- Total Elapsed: ~mode-dependent
- Parallel Gain: ~60-70% faster when mode is subagent/agent-team
```
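
If a measured elapsed time is available, the first two lines of the report can be filled in mechanically. The sketch below assumes the orchestrator records a start time before dispatch; `formatExecutionReport` is a hypothetical helper. It leaves out the parallel-gain line because that figure cannot be derived from a single run.

```javascript
// Hypothetical formatter for the measurable part of the report above.
const formatExecutionReport = ({ resolvedMode, elapsedMs }) =>
  [
    '🚀 Performance Report:',
    `- Execution Mode: ${resolvedMode}`,
    `- Total Elapsed: ${(elapsedMs / 1000).toFixed(1)}s`,
  ].join('\n');

const startedAt = Date.now();
// ... dispatch workers and wait for completion here ...
console.log(formatExecutionReport({ resolvedMode: 'subagent', elapsedMs: Date.now() - startedAt }));
```
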

---

### 7. Proceed to Aggregation

Pass the same `timestamp` value to Step 3F (do not regenerate it). Step 3F must read the exact temp files written in this step.

Load next step: `{nextStepFile}`

The aggregation step (3F) will:

- Read all 4 subagent outputs
- Calculate weighted overall score (0-100)
- Aggregate violations by severity
- Generate review report with top suggestions
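
To make the "weighted overall score" idea concrete, here is a minimal sketch of a weighted average over the four dimensions. The actual weights and the output schema are defined by Step 3F and are not shown in this file, so the equal weights and the per-dimension numeric scores used below are assumptions for illustration only.

```javascript
// ASSUMPTION: equal weights. The real weights are defined by Step 3F.
const ASSUMED_WEIGHTS = { determinism: 0.25, isolation: 0.25, maintainability: 0.25, performance: 0.25 };

// Weighted average of per-dimension scores (each assumed to be 0-100),
// normalized by the total weight and rounded to an integer.
const weightedOverallScore = (scores, weights = ASSUMED_WEIGHTS) => {
  const totalWeight = Object.values(weights).reduce((sum, weight) => sum + weight, 0);
  const weightedSum = Object.entries(weights).reduce(
    (sum, [dimension, weight]) => sum + weight * scores[dimension],
    0,
  );
  return Math.round(weightedSum / totalWeight);
};

// Illustrative scores, not real results.
console.log(weightedOverallScore({ determinism: 80, isolation: 90, maintainability: 70, performance: 60 })); // 75
```

Dividing by the total weight means the weights do not have to sum to 1, which keeps the result in the 0-100 range even if they are later adjusted.
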

---

## EXIT CONDITION

Proceed to Step 3F when:

- ✅ All 4 subagents completed successfully
- ✅ All output files exist and are valid JSON
- ✅ Execution metrics displayed

**Do NOT proceed if any subagent failed.**

---

## 🚨 SYSTEM SUCCESS METRICS

### ✅ SUCCESS:

- All 4 subagents launched and completed
- All required worker steps completed
- Output files generated and valid
- Fallback behavior respected configuration and capability probe rules

### ❌ FAILURE:

- One or more subagents failed
- Output files missing or invalid
- Unsupported requested mode with probing disabled

**Master Rule:** Deterministic mode selection + stable output contract. Use the best supported mode, then aggregate normally.