---
stepsCompleted: []
lastStep: ''
lastSaved: ''
workflowType: 'testarch-test-review'
inputDocuments: []
---

# Test Quality Review: {test_filename}

**Quality Score**: {score}/100 ({grade} - {assessment})
**Review Date**: {YYYY-MM-DD}
**Review Scope**: {single | directory | suite}
**Reviewer**: {user_name or TEA Agent}

---

Note: This review audits existing tests; it does not generate tests.
Coverage mapping and coverage gates are out of scope here. Use `trace` for coverage decisions.

## Executive Summary

**Overall Assessment**: {Excellent | Good | Acceptable | Needs Improvement | Critical Issues}

**Recommendation**: {Approve | Approve with Comments | Request Changes | Block}

### Key Strengths

✅ {strength_1}
✅ {strength_2}
✅ {strength_3}

### Key Weaknesses

❌ {weakness_1}
❌ {weakness_2}
❌ {weakness_3}

### Summary

{1-2 paragraph summary of overall test quality, highlighting major findings and recommendation rationale}

---

## Quality Criteria Assessment

| Criterion | Status | Violations | Notes |
| ------------------------------------ | ------------------------------- | ---------- | ------------ |
| BDD Format (Given-When-Then) | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count} | {brief_note} |
| Test IDs | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count} | {brief_note} |
| Priority Markers (P0/P1/P2/P3) | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count} | {brief_note} |
| Hard Waits (sleep, waitForTimeout) | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count} | {brief_note} |
| Determinism (no conditionals) | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count} | {brief_note} |
| Isolation (cleanup, no shared state) | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count} | {brief_note} |
| Fixture Patterns | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count} | {brief_note} |
| Data Factories | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count} | {brief_note} |
| Network-First Pattern | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count} | {brief_note} |
| Explicit Assertions | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count} | {brief_note} |
| Test Length (≤300 lines) | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {lines} | {brief_note} |
| Test Duration (≤1.5 min) | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {duration} | {brief_note} |
| Flakiness Patterns | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count} | {brief_note} |

**Total Violations**: {critical_count} Critical, {high_count} High, {medium_count} Medium, {low_count} Low

---

## Quality Score Breakdown

```
Starting Score:          100
Critical Violations:     -{critical_count} × 10 = -{critical_deduction}
High Violations:         -{high_count} × 5 = -{high_deduction}
Medium Violations:       -{medium_count} × 2 = -{medium_deduction}
Low Violations:          -{low_count} × 1 = -{low_deduction}

Bonus Points:
  Excellent BDD:         +{0|5}
  Comprehensive Fixtures: +{0|5}
  Data Factories:        +{0|5}
  Network-First:         +{0|5}
  Perfect Isolation:     +{0|5}
  All Test IDs:          +{0|5}
                         --------
Total Bonus:             +{bonus_total}

Final Score:             {final_score}/100
Grade:                   {grade}
```
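
The arithmetic in the breakdown above can be sketched as a small function. This is a minimal illustration, not the workflow's actual implementation: the type names, the function name, and the clamp to the 0-100 range are assumptions (the template only states the weights and the `/100` scale, not how out-of-range totals are handled). For example, 1 critical, 2 high, 3 medium, and 4 low violations with two bonuses gives 100 - 30 + 10 = 80.

```typescript
// Hypothetical shapes for illustration; these names are not defined by the workflow.
interface ViolationCounts {
  critical: number;
  high: number;
  medium: number;
  low: number;
}

interface BonusFlags {
  excellentBdd: boolean;
  comprehensiveFixtures: boolean;
  dataFactories: boolean;
  networkFirst: boolean;
  perfectIsolation: boolean;
  allTestIds: boolean;
}

// Weights taken from the breakdown: 10 / 5 / 2 / 1 per violation, +5 per bonus.
const BONUS_POINTS = 5;

function computeQualityScore(v: ViolationCounts, b: BonusFlags): number {
  const deductions = v.critical * 10 + v.high * 5 + v.medium * 2 + v.low * 1;

  const earnedBonuses = [
    b.excellentBdd,
    b.comprehensiveFixtures,
    b.dataFactories,
    b.networkFirst,
    b.perfectIsolation,
    b.allTestIds,
  ].filter((flag) => flag).length;

  const raw = 100 - deductions + earnedBonuses * BONUS_POINTS;

  // Assumption: clamp so the result always fits the "/100" scale.
  return Math.max(0, Math.min(100, raw));
}
```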

---

## Critical Issues (Must Fix)

{If no critical issues: "No critical issues detected. ✅"}

{For each critical issue:}

### {issue_number}. {Issue Title}

**Severity**: P0 (Critical)
**Location**: `{filename}:{line_number}`
**Criterion**: {criterion_name}
**Knowledge Base**: [{fragment_name}]({fragment_path})

**Issue Description**:
{Detailed explanation of what the problem is and why it's critical}

**Current Code**:

```typescript
// ❌ Bad (current implementation)
{
  code_snippet_showing_problem;
}
```

**Recommended Fix**:

```typescript
// ✅ Good (recommended approach)
{
  code_snippet_showing_solution;
}
```

**Why This Matters**:
{Explanation of impact - flakiness risk, maintainability, reliability}

**Related Violations**:
{If similar issue appears elsewhere, note line numbers}

---

## Recommendations (Should Fix)

{If no recommendations: "No additional recommendations. Test quality is excellent. ✅"}

{For each recommendation:}

### {rec_number}. {Recommendation Title}

**Severity**: {P1 (High) | P2 (Medium) | P3 (Low)}
**Location**: `{filename}:{line_number}`
**Criterion**: {criterion_name}
**Knowledge Base**: [{fragment_name}]({fragment_path})

**Issue Description**:
{Detailed explanation of what could be improved and why}

**Current Code**:

```typescript
// ⚠️ Could be improved (current implementation)
{
  code_snippet_showing_current_approach;
}
```

**Recommended Improvement**:

```typescript
// ✅ Better approach (recommended)
{
  code_snippet_showing_improvement;
}
```

**Benefits**:
{Explanation of benefits - maintainability, readability, reusability}

**Priority**:
{Why this is P1/P2/P3 - urgency and impact}

---

## Best Practices Found

{If good patterns found, highlight them}

{For each best practice:}

### {practice_number}. {Best Practice Title}

**Location**: `{filename}:{line_number}`
**Pattern**: {pattern_name}
**Knowledge Base**: [{fragment_name}]({fragment_path})

**Why This Is Good**:
{Explanation of why this pattern is excellent}

**Code Example**:

```typescript
// ✅ Excellent pattern demonstrated in this test
{
  code_snippet_showing_best_practice;
}
```

**Use as Reference**:
{Encourage using this pattern in other tests}

---

## Test File Analysis

### File Metadata

- **File Path**: `{relative_path_from_project_root}`
- **File Size**: {line_count} lines, {kb_size} KB
- **Test Framework**: {Playwright | Jest | Cypress | Vitest | Other}
- **Language**: {TypeScript | JavaScript}

### Test Structure

- **Describe Blocks**: {describe_count}
- **Test Cases (it/test)**: {test_count}
- **Average Test Length**: {avg_lines_per_test} lines per test
- **Fixtures Used**: {fixture_count} ({fixture_names})
- **Data Factories Used**: {factory_count} ({factory_names})

### Test Scope

- **Test IDs**: {test_id_list}
- **Priority Distribution**:
  - P0 (Critical): {p0_count} tests
  - P1 (High): {p1_count} tests
  - P2 (Medium): {p2_count} tests
  - P3 (Low): {p3_count} tests
  - Unknown: {unknown_count} tests

### Assertions Analysis

- **Total Assertions**: {assertion_count}
- **Assertions per Test**: {avg_assertions_per_test} (avg)
- **Assertion Types**: {assertion_types_used}

---

## Context and Integration

### Related Artifacts

{If story file found:}

- **Story File**: [{story_filename}]({story_path})

{If test-design found:}

- **Test Design**: [{test_design_filename}]({test_design_path})
- **Risk Assessment**: {risk_level}
- **Priority Framework**: P0-P3 applied

---

## Knowledge Base References

This review consulted the following knowledge base fragments:

- **[test-quality.md](../../../agents/bmad-tea/resources/knowledge/test-quality.md)** - Definition of Done for tests (no hard waits, <300 lines, <1.5 min, self-cleaning)
- **[fixture-architecture.md](../../../agents/bmad-tea/resources/knowledge/fixture-architecture.md)** - Pure function → Fixture → mergeTests pattern
- **[network-first.md](../../../agents/bmad-tea/resources/knowledge/network-first.md)** - Route intercept before navigate (race condition prevention)
- **[data-factories.md](../../../agents/bmad-tea/resources/knowledge/data-factories.md)** - Factory functions with overrides, API-first setup
- **[test-levels-framework.md](../../../agents/bmad-tea/resources/knowledge/test-levels-framework.md)** - E2E vs API vs Component vs Unit appropriateness
- **[component-tdd.md](../../../agents/bmad-tea/resources/knowledge/component-tdd.md)** - Red-Green-Refactor patterns
- **[selective-testing.md](../../../agents/bmad-tea/resources/knowledge/selective-testing.md)** - Duplicate coverage detection
- **[ci-burn-in.md](../../../agents/bmad-tea/resources/knowledge/ci-burn-in.md)** - Flakiness detection patterns (10-iteration loop)
- **[test-priorities-matrix.md](../../../agents/bmad-tea/resources/knowledge/test-priorities-matrix.md)** - P0/P1/P2/P3 classification framework

For coverage mapping, consult `trace` workflow outputs.

See [tea-index.csv](../../../agents/bmad-tea/resources/tea-index.csv) for complete knowledge base.

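
As a quick reminder of what "factory functions with overrides" usually means when citing the data-factories fragment in a finding, here is a minimal sketch. It shows the general pattern only; the `User` shape, the `createUser` name, and the `example.test` addresses are hypothetical, and the fragment itself may prescribe a different form. Each call yields a unique, valid default object, and the test overrides only the fields it cares about.

```typescript
// Hypothetical entity used only for illustration.
interface User {
  id: string;
  email: string;
  role: "admin" | "member";
}

// Module-level counter keeps generated values unique across calls,
// which helps tests avoid colliding on shared data.
let sequence = 0;

function createUser(overrides: Partial<User> = {}): User {
  sequence += 1;
  return {
    id: `user-${sequence}`,
    email: `user-${sequence}@example.test`,
    role: "member",
    // Caller-supplied fields win over the defaults above.
    ...overrides,
  };
}

// A test states only what matters to it; everything else stays default.
const admin = createUser({ role: "admin" });
```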

---

## Next Steps

### Immediate Actions (Before Merge)

1. **{action_1}** - {description}
   - Priority: {P0 | P1 | P2}
   - Owner: {team_or_person}
   - Estimated Effort: {time_estimate}

2. **{action_2}** - {description}
   - Priority: {P0 | P1 | P2}
   - Owner: {team_or_person}
   - Estimated Effort: {time_estimate}

### Follow-up Actions (Future PRs)

1. **{action_1}** - {description}
   - Priority: {P2 | P3}
   - Target: {next_milestone | backlog}

2. **{action_2}** - {description}
   - Priority: {P2 | P3}
   - Target: {next_milestone | backlog}

### Re-Review Needed?

{✅ No re-review needed - approve as-is}

{⚠️ Re-review after critical fixes - request changes, then re-review}

{❌ Major refactor required - block merge, pair programming recommended}

---

## Decision

**Recommendation**: {Approve | Approve with Comments | Request Changes | Block}

**Rationale**:
{1-2 paragraph explanation of recommendation based on findings}

**For Approve**:

> Test quality is excellent/good with {score}/100 score. {Minor issues noted can be addressed in follow-up PRs.} Tests are production-ready and follow best practices.

**For Approve with Comments**:

> Test quality is acceptable with {score}/100 score. {High-priority recommendations should be addressed but don't block merge.} Critical issues resolved, but improvements would enhance maintainability.

**For Request Changes**:

> Test quality needs improvement with {score}/100 score. {Critical issues must be fixed before merge.} {X} critical violations detected that pose flakiness/maintainability risks.

**For Block**:

> Test quality is insufficient with {score}/100 score. {Multiple critical issues make tests unsuitable for production.} Recommend pairing session with QA engineer to apply patterns from knowledge base.

---

## Appendix

### Violation Summary by Location

{Table of all violations sorted by line number:}

| Line | Severity | Criterion | Issue | Fix |
| ------ | ------------- | ----------- | ------------- | ----------- |
| {line} | {P0/P1/P2/P3} | {criterion} | {brief_issue} | {brief_fix} |
| {line} | {P0/P1/P2/P3} | {criterion} | {brief_issue} | {brief_fix} |

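
The rows of the table above could be produced along these lines. This is an illustrative sketch only: the `Violation` type and `renderViolationRows` function are hypothetical names, and the code assumes the cell text contains no `|` characters that would need escaping.

```typescript
type Severity = "P0" | "P1" | "P2" | "P3";

// Hypothetical record mirroring the five table columns.
interface Violation {
  line: number;
  severity: Severity;
  criterion: string;
  issue: string;
  fix: string;
}

function renderViolationRows(violations: Violation[]): string[] {
  return violations
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => a.line - b.line) // ascending by line number
    .map((v) => `| ${v.line} | ${v.severity} | ${v.criterion} | ${v.issue} | ${v.fix} |`);
}
```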
### Quality Trends

{If reviewing same file multiple times, show trend:}

| Review Date | Score | Grade | Critical Issues | Trend |
| ------------ | ------------- | --------- | --------------- | ----------- |
| {YYYY-MM-DD} | {score_1}/100 | {grade_1} | {count_1} | ⬆️ Improved |
| {YYYY-MM-DD} | {score_2}/100 | {grade_2} | {count_2} | ⬇️ Declined |
| {YYYY-MM-DD} | {score_3}/100 | {grade_3} | {count_3} | ➡️ Stable |

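
The template does not define how the Trend column is derived. One plausible reading, sketched below as an assumption, compares each review's score with the previous review's score; the constant and function names are hypothetical.

```typescript
// Labels copied from the Trend column of the table above.
const IMPROVED = "⬆️ Improved";
const DECLINED = "⬇️ Declined";
const STABLE = "➡️ Stable";

// Assumption: the trend reflects the score change since the previous review.
function trendFor(previousScore: number, currentScore: number): string {
  if (currentScore > previousScore) return IMPROVED;
  if (currentScore < previousScore) return DECLINED;
  return STABLE;
}
```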
### Related Reviews

{If reviewing multiple files in directory/suite:}

| File | Score | Grade | Critical | Status |
| -------- | ----------- | ------- | -------- | ------------------ |
| {file_1} | {score}/100 | {grade} | {count} | {Approved/Blocked} |
| {file_2} | {score}/100 | {grade} | {count} | {Approved/Blocked} |
| {file_3} | {score}/100 | {grade} | {count} | {Approved/Blocked} |

**Suite Average**: {avg_score}/100 ({avg_grade})

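
The suite average can be sketched as a plain mean of the per-file scores. Rounding to the nearest integer and returning `null` for an empty suite are assumptions made for this illustration; the template does not specify either, and `suiteAverage` is a hypothetical name. The mapping from average score to `{avg_grade}` is left out because the grade thresholds are not given here.

```typescript
// Returns the mean score rounded to the nearest integer,
// or null when there are no reviewed files to average.
function suiteAverage(scores: number[]): number | null {
  if (scores.length === 0) return null;
  const total = scores.reduce((sum, score) => sum + score, 0);
  return Math.round(total / scores.length);
}
```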

---

## Review Metadata

**Generated By**: BMad TEA Agent (Test Architect)
**Workflow**: testarch-test-review v4.0
**Review ID**: test-review-{filename}-{YYYYMMDD}
**Timestamp**: {YYYY-MM-DD HH:MM:SS}
**Version**: 1.0

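
The Review ID pattern above can be filled in as follows. This is a sketch under two assumptions that the template does not state: the date is taken in UTC, and the filename is inserted verbatim. `buildReviewId` is a hypothetical helper name.

```typescript
function buildReviewId(filename: string, date: Date): string {
  // toISOString() always reports UTC, e.g. "2024-01-05T00:00:00.000Z";
  // keep the date part and drop the dashes to get YYYYMMDD.
  const yyyymmdd = date.toISOString().slice(0, 10).replace(/-/g, "");
  return `test-review-${filename}-${yyyymmdd}`;
}
```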

---

## Feedback on This Review

If you have questions or feedback on this review:

1. Review patterns in knowledge base: `../../../agents/bmad-tea/resources/knowledge/`
2. Consult tea-index.csv for detailed guidance
3. Request clarification on specific violations
4. Pair with QA engineer to apply patterns

This review is guidance, not rigid rules. Context matters - if a pattern is justified, document it with a comment.