# Webhook Testing Risk Guidance

## Principle

Webhook integration points are high-risk boundaries: they are asynchronous side effects that cross service boundaries. A missing or malformed webhook means a downstream system never received a usable trigger. Default risk level: **P2 × I3** (medium probability × high impact = risk score 6) → must be covered by integration tests.

## When Webhook Tests Are Required

Webhook tests are **required** (not optional) when any of the following holds:

| Condition                                                          | Rationale                                                              |
| ------------------------------------------------------------------ | ---------------------------------------------------------------------- |
| Application publishes events to external subscribers               | External consumers depend on correct payload shape and delivery timing |
| Event-driven architecture (Kafka/SQS/event bus → webhook delivery) | The delivery pipeline is a risk boundary; delivery failures are silent |
| Payment, order, or notification side effects                       | Business-critical; a missed webhook is a missed transaction            |
| Integration with third-party services via webhooks                 | Breaking payload changes won't surface in unit or component tests      |
| Any async side effect that a consumer polls for or reacts to       | Polling tests (`recurse`) can mask webhook delivery failures entirely  |

## Risk Scoring

```
Risk = Probability × Impact

Probability factors (P1–P3):
  P1 (low):    Webhook system is mature, well-tested, no history of failures
  P2 (medium): Kafka pipeline, multiple consumers, new integrations
  P3 (high):   New delivery mechanism, external third-party webhooks, no retry logic

Impact factors (I1–I3):
  I1 (low):    Non-critical notifications (e.g. audit logs)
  I2 (medium): Feature-level side effects (e.g. search index updates)
  I3 (high):   Business-critical events (payments, orders, compliance)
```

Default webhook integrations: **P2 × I3 = 6** → High → must be tested.

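
The scoring rule above can be written as a small helper. This is only an illustrative sketch: `riskScore`, `requiresIntegrationTest`, and `HIGH_RISK_THRESHOLD` are hypothetical names, and the threshold of 6 is an assumption inferred from the default case being classed as High. The authoritative bands are in `probability-impact.md`.

```typescript
// Hypothetical helper, not part of any library: encodes the P×I scale above.
type Level = 1 | 2 | 3;

// Assumption: 6 is the lowest score treated as "High", because the text
// classifies the default P2 × I3 = 6 as High. Adjust to the real scale.
const HIGH_RISK_THRESHOLD = 6;

// Risk = Probability × Impact
function riskScore(probability: Level, impact: Level): number {
  return probability * impact;
}

// True when the score reaches the assumed "High" band.
function requiresIntegrationTest(probability: Level, impact: Level): boolean {
  return riskScore(probability, impact) >= HIGH_RISK_THRESHOLD;
}

// Default webhook integration: P2 × I3
console.log(riskScore(2, 3)); // 6
console.log(requiresIntegrationTest(2, 3)); // true
// Mature system (P1) with feature-level impact (I2)
console.log(requiresIntegrationTest(1, 2)); // false
```
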
## What a Complete Webhook Test Looks Like

A complete webhook test covers:

1. **Happy path**: Action fires → webhook arrives with correct payload
2. **Sequential events (drain pattern)**: The preceding event is drained before asserting on the next
3. **Parallel isolation**: Templates are scoped by entity ID, so workers don't cross-contaminate
4. **Timeout/error shape**: `WebhookTimeoutError` is exercised for negative-path coverage
5. **Cleanup verification**: The fixture auto-cleans; no webhooks leak after the test

**Minimal complete example** (from playwright-utils E2E suite):

```typescript
// Imports are omitted in the source excerpt: `test`, `expect`, `webhookTemplate`,
// and `generateMovieWithoutId` come from the suite's fixtures and helpers.

// Template factories scoped by ID for parallel safety
const movieCreated = (movieId: number) =>
  webhookTemplate<{ event: string; data: { id: number } }>('movie.created')
    .matchField('event', 'movie.created')
    .matchField('data.id', movieId)
    .withTimeout(15_000)
    .withInterval(500)
    .build();

const movieDeleted = (movieId: number) =>
  webhookTemplate<{ event: string; data: { id: number } }>('movie.deleted')
    .matchField('event', 'movie.deleted')
    .matchField('data.id', movieId)
    .withTimeout(15_000)
    .withInterval(500)
    .build();

test('movie deletion triggers a webhook with correct payload', async ({ authToken, addMovie, deleteMovie, webhookRegistry }) => {
  const movie = generateMovieWithoutId();
  const { body: createResponse } = await addMovie(authToken, movie);
  const movieId = createResponse.data.id;

  // Drain: consume the create webhook before testing the delete path
  await webhookRegistry.waitFor(movieCreated(movieId));

  await deleteMovie(authToken, movieId);

  // Assert on the delivered webhook itself, not on a status endpoint
  const webhook = await webhookRegistry.waitFor(movieDeleted(movieId));
  expect(webhook.body).toMatchObject({
    event: 'movie.deleted',
    data: { id: movieId, name: movie.name },
  });
});
```

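
To make the scoping and drain ideas in the example concrete, here is a self-contained sketch that models the mock server's journal as a plain array. It is not how playwright-utils implements `matchField` or `waitFor`; `getPath`, `matchFields`, and `takeFirst` are hypothetical helpers, and treating "drain" as "consume and remove the matched entry" is an assumption.

```typescript
// In-memory stand-in for a mock server's request journal (hypothetical).
interface ReceivedWebhook {
  id: string;
  body: unknown;
}

type Matcher = (body: unknown) => boolean;

// Resolve a dot path such as 'data.id' against an arbitrary JSON value.
function getPath(value: unknown, path: string): unknown {
  return path.split('.').reduce<unknown>((current, key) => {
    if (current !== null && typeof current === 'object') {
      return (current as Record<string, unknown>)[key];
    }
    return undefined;
  }, value);
}

// Build a matcher that requires every listed path to equal its expected value.
function matchFields(expected: Record<string, unknown>): Matcher {
  return (body) =>
    Object.entries(expected).every(([path, want]) => getPath(body, path) === want);
}

// Return the first matching webhook and remove it from the journal ("drain").
function takeFirst(journal: ReceivedWebhook[], matcher: Matcher): ReceivedWebhook | undefined {
  const index = journal.findIndex((hook) => matcher(hook.body));
  return index === -1 ? undefined : journal.splice(index, 1)[0];
}

const journal: ReceivedWebhook[] = [
  { id: 'a', body: { event: 'movie.created', data: { id: 1 } } },
  { id: 'b', body: { event: 'movie.created', data: { id: 2 } } }, // another worker's movie
  { id: 'c', body: { event: 'movie.deleted', data: { id: 1 } } },
];

// Scoping by entity ID picks this test's webhooks and leaves the other worker's alone.
const created = takeFirst(journal, matchFields({ event: 'movie.created', 'data.id': 1 }));
const deleted = takeFirst(journal, matchFields({ event: 'movie.deleted', 'data.id': 1 }));

console.log(created?.id, deleted?.id); // a c
console.log(journal.map((hook) => hook.id)); // [ 'b' ]
```

Without the `data.id` constraint, the first `movie.created` entry in the journal would match regardless of which worker produced it, which is the cross-contamination the template factories are meant to prevent.
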
## Common Failure Patterns

| Failure pattern                        | Root cause                                             | How the module addresses it                                                  |
| -------------------------------------- | ------------------------------------------------------ | ---------------------------------------------------------------------------- |
| Test passes but webhook never verified | Test asserted on status endpoint, not delivery         | `waitFor` forces assertion on actual webhook arrival                         |
| Flaky under `fullyParallel: true`      | `full-reset` cleanup deletes another worker's webhooks | `matched-only` strategy: only matched webhooks are deleted                   |
| Timeout gives no useful information    | No payload inspection on failure                       | `WebhookTimeoutError.receivedWebhooks` snapshot                              |
| Template matches wrong test's webhook  | Template not scoped by entity ID                       | Template factories accept ID parameter; `matchPredicate` for complex scoping |
| Test hangs at 30s default timeout      | Webhook not arriving; pipeline is slow                 | Use `withTimeout()` and `withInterval(500)` per template                     |
| Journal grows unbounded                | No cleanup strategy configured                         | Configure `cleanupStrategy` in `webhookConfig`; fixture auto-cleans          |
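
The `fullyParallel` row is easier to see with a toy model. The sketch below is a simplified illustration under stated assumptions, not the module's cleanup code: `JournalEntry` and `cleanup` are hypothetical, and a single shared array stands in for a mock server journal used by two workers.

```typescript
// Toy model of a journal shared by parallel workers (hypothetical types).
interface JournalEntry {
  id: string;
  entityId: number;
}

type CleanupStrategy = 'matched-only' | 'full-reset';

// Return the journal as it would look after one worker's cleanup runs.
function cleanup(
  journal: JournalEntry[],
  matchedIds: string[],
  strategy: CleanupStrategy,
): JournalEntry[] {
  if (strategy === 'full-reset') {
    return []; // wipes everything, including other workers' pending webhooks
  }
  const matched = new Set(matchedIds);
  return journal.filter((entry) => !matched.has(entry.id));
}

const shared: JournalEntry[] = [
  { id: 'w1-created', entityId: 1 }, // worker 1 has already matched this one
  { id: 'w2-created', entityId: 2 }, // worker 2 has not asserted on this yet
];

const afterFullReset = cleanup(shared, ['w1-created'], 'full-reset');
const afterMatchedOnly = cleanup(shared, ['w1-created'], 'matched-only');

const worker2CanStillMatch = (journal: JournalEntry[]) =>
  journal.some((entry) => entry.entityId === 2);

console.log(worker2CanStillMatch(afterFullReset)); // false: worker 2 would time out
console.log(worker2CanStillMatch(afterMatchedOnly)); // true
```
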

## Risk Mitigation Checklist (for TA assessment)

When a system uses webhooks, verify the test suite covers:

- [ ] Happy path for each event type that has an external subscriber
- [ ] Template factories scoped by entity ID (parallel-safe)
- [ ] Drain pattern applied to all sequential event assertions
- [ ] Cleanup strategy matches provider capability: `matched-only` for providers that support `deleteById` (e.g. WireMock); `full-reset` with serial execution or an isolated provider instance per worker for MockServer/Mockoon
- [ ] Timeout values appropriate for the delivery pipeline latency (Kafka pipelines need 15s+)
- [ ] `WebhookTimeoutError` imported and tested in negative path coverage
- [ ] Mock server (WireMock/MockServer/Mockoon) in Docker Compose / test infra

## Related Fragments

- `webhook-testing-fundamentals.md`: Why webhook tests are hard
- `webhook-module-setup.md`: Fixture wiring for each provider
- `webhook-template-matchers.md`: Template and matcher patterns
- `risk-governance.md`: Risk scoring framework
- `probability-impact.md`: P×I scale definitions
  87. - `probability-impact.md` — P×I scale definitions