Specialized AI for

Fawdy connects to your servers via SSH with a pre-approved command set. It digs through logs, processes, and stack traces to find what broke. Nothing gets written to disk.
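
As a rough illustration of what a pre-approved command set can look like, the sketch below checks a command string against a read-only allowlist before it would be sent over SSH. Fawdy's actual command set is not listed on this page, so the `ALLOWED` table, the `FORBIDDEN_CHARS` set, and the `is_allowed` helper are all hypothetical.

```python
import shlex

# Hypothetical allowlist: program name -> permitted subcommands (None = any arguments).
ALLOWED = {
    "tail": None,
    "grep": None,
    "cat": None,
    "ps": None,
    "df": None,
    "journalctl": None,
    "kubectl": {"get", "describe", "logs"},
}

# Shell metacharacters that could chain commands or redirect output to a file.
FORBIDDEN_CHARS = set(";|&><`$\n")


def is_allowed(command: str) -> bool:
    """Return True only if the command is on the allowlist and uses no shell tricks."""
    if any(ch in FORBIDDEN_CHARS for ch in command):
        return False
    try:
        argv = shlex.split(command)
    except ValueError:  # e.g. unbalanced quotes
        return False
    if not argv or argv[0] not in ALLOWED:
        return False
    subcommands = ALLOWED[argv[0]]
    if subcommands is None:
        return True
    return len(argv) > 1 and argv[1] in subcommands
```

Rejecting redirection characters outright is one simple way to back a "nothing gets written to disk" guarantee; a production gate would likely also validate paths and flags per command.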

Incident Alert
SEV-1
Triggered

p99 latency > 2s on web-prod-03

Service: API Gateway #38291
Assigned: SRE On-Call

Root Cause Analysis

Fawdy surfaces what it found so your team can act faster.

Investigation Findings

Correlated signals mapped to a causal chain
Primary trigger isolated from contributing factors
Blast radius across affected services and endpoints

Suggested Next Steps

Mitigation steps to contain the incident
Recommended areas for deeper investigation
Potential preventive measures to evaluate

Attached Evidence

Timeline
Metrics
Logs
Config Diff

Fawdy Investigates

$ fawdy investigate web-prod-03

* connecting to infrastructure

Logs - haproxy, catalog, k8s
Metrics - CPU, memory, latency
Config changes - last 24h
Deployment history
Correlating across services
Reconstructing timeline
Jira - for your engineer
Slack - for your team
PDF Export - for your customer
Confluence - for your wiki
Postmortem Doc - for stakeholders
Timeline Brief - for incident review

Works with the infrastructure you

From raw chaos to

Fawdy turns thousands of log lines into a structured analysis.

Raw incident logs

/var/log/haproxy/haproxy.log
14:22:31 INFO frontend http-in/srv1 200 1024 0/0/0/8/8 "GET /api/v2/products HTTP/1.1" - src=198.51.100.14
14:22:31 INFO backend user-api/user-pod-1 200 512 0/0/0/6/6 "GET /api/v2/users/me HTTP/1.1" - src=203.0.113.42
14:22:33 INFO backend search-api/search-pod-1 200 2048 0/0/0/14/14 "GET /api/v2/search?q=laptop&page=1 HTTP/1.1" - src=198.51.100.88
14:22:35 INFO backend catalog-api/catalog-pod-2 200 4096 0/0/0/11/11 "GET /api/v2/products?category=electronics HTTP/1.1" - src=10.0.1.45
14:22:36 INFO Health check for server user-api/user-pod-1 succeeded
14:22:36 INFO Health check for server user-api/user-pod-2 succeeded
14:22:37 INFO backend payment-api/payment-pod-1 200 256 0/0/0/22/22 "POST /api/v2/payments/verify HTTP/1.1" - src=10.0.2.18
14:22:38 INFO backend catalog-api/catalog-pod-1 200 8192 0/0/0/9/9 "GET /api/v2/products/8842 HTTP/1.1" - src=203.0.113.71
14:22:40 INFO backend order-api/order-pod-1 200 1536 0/0/0/18/18 "GET /api/v2/orders/usr-44210 HTTP/1.1" - src=198.51.100.14
14:22:41 INFO frontend http-in/srv1 304 0 0/0/0/2/2 "GET /static/js/bundle.min.js HTTP/1.1" - src=203.0.113.99
14:22:42 INFO backend catalog-api/catalog-pod-3 200 3072 0/0/0/7/7 "GET /api/v2/products?category=clothing&sort=price HTTP/1.1" - src=198.51.100.201
14:22:44 INFO backend user-api/user-pod-2 200 384 0/0/0/5/5 "GET /api/v2/users/preferences HTTP/1.1" - src=10.0.3.92
14:22:45 INFO Health check for server catalog-api/catalog-pod-1 succeeded
14:22:45 INFO Health check for server catalog-api/catalog-pod-2 succeeded
14:22:45 INFO Health check for server catalog-api/catalog-pod-3 succeeded
14:22:47 INFO backend search-api/search-pod-2 200 6144 0/0/0/31/31 "GET /api/v2/search?q=wireless+headphones&filters=brand:Sony HTTP/1.1" - src=203.0.113.15
14:22:48 INFO frontend http-in/srv1 301 0 0/0/0/1/1 "GET /products HTTP/1.1" - src=198.51.100.33
14:22:50 INFO backend catalog-api/catalog-pod-1 200 2048 0/0/0/12/12 "GET /api/v2/products/1192/reviews HTTP/1.1" - src=10.0.1.110
14:22:52 INFO backend payment-api/payment-pod-2 200 128 0/0/0/45/45 "POST /api/v2/payments/charge HTTP/1.1" - src=10.0.2.18
14:22:54 INFO backend order-api/order-pod-2 201 768 0/0/0/34/34 "POST /api/v2/orders HTTP/1.1" - src=198.51.100.77
14:22:56 INFO frontend http-in/srv1 200 512 0/0/0/3/3 "GET /api/v2/health HTTP/1.1" - src=10.0.0.1
14:22:58 INFO backend catalog-api/catalog-pod-2 200 1024 0/0/0/10/10 "GET /api/v2/inventory/sku/EL-9921 HTTP/1.1" - src=10.0.4.55
14:23:01 INFO backend user-api/user-pod-1 200 256 0/0/0/4/4 "PUT /api/v2/users/profile HTTP/1.1" - src=203.0.113.42
14:23:03 INFO backend catalog-api/catalog-pod-3 404 128 0/0/0/5/5 "GET /api/v2/products/99999 HTTP/1.1" - src=198.51.100.14
14:23:05 INFO frontend http-in/srv1 304 0 0/0/0/1/1 "GET /static/css/main.css HTTP/1.1" - src=203.0.113.99
14:23:07 INFO backend search-api/search-pod-1 200 4096 0/0/0/19/19 "POST /api/v2/search/suggest HTTP/1.1" - src=198.51.100.88
14:23:10 INFO backend catalog-api/catalog-pod-1 200 16384 0/0/0/15/15 "GET /api/v2/products?page=3&limit=50 HTTP/1.1" - src=10.0.1.45
14:23:12 INFO backend order-api/order-pod-1 200 2048 0/0/0/21/21 "GET /api/v2/orders/ORD-88412/status HTTP/1.1" - src=198.51.100.201
14:23:15 INFO Health check for server order-api/order-pod-1 succeeded
14:23:15 INFO Health check for server order-api/order-pod-2 succeeded
14:23:18 INFO backend payment-api/payment-pod-1 200 64 0/0/0/8/8 "GET /api/v2/payments/methods HTTP/1.1" - src=203.0.113.71
14:23:20 INFO backend catalog-api/catalog-pod-2 200 2048 0/0/0/9/9 "GET /api/v2/products/featured HTTP/1.1" - src=198.51.100.33
14:23:23 INFO backend user-api/user-pod-2 200 1024 0/0/0/7/7 "GET /api/v2/users/notifications HTTP/1.1" - src=10.0.3.92
14:23:26 INFO frontend http-in/srv1 200 8192 0/0/0/4/4 "GET /static/images/hero-banner.webp HTTP/1.1" - src=203.0.113.15
14:23:28 INFO backend catalog-api/catalog-pod-3 200 512 0/0/0/6/6 "GET /api/v2/products/3321/variants HTTP/1.1" - src=198.51.100.77
14:23:31 INFO backend search-api/search-pod-2 200 1024 0/0/0/22/22 "GET /api/v2/search?q=running+shoes&sort=relevance HTTP/1.1" - src=10.0.1.110
14:23:34 INFO backend order-api/order-pod-2 200 384 0/0/0/12/12 "GET /api/v2/cart HTTP/1.1" - src=198.51.100.14
14:23:37 INFO backend catalog-api/catalog-pod-1 200 4096 0/0/0/13/13 "GET /api/v2/products?brand=Nike HTTP/1.1" - src=203.0.113.42
14:23:40 INFO Health check for server payment-api/payment-pod-1 succeeded
14:23:40 INFO Health check for server payment-api/payment-pod-2 succeeded
14:23:42 INFO backend user-api/user-pod-1 200 128 0/0/0/3/3 "DELETE /api/v2/users/sessions/old HTTP/1.1" - src=10.0.2.18
14:23:45 INFO backend payment-api/payment-pod-2 200 256 0/0/0/38/38 "POST /api/v2/payments/refund HTTP/1.1" - src=10.0.4.55
14:23:48 INFO backend catalog-api/catalog-pod-2 200 3072 0/0/0/8/8 "GET /api/v2/products/deals HTTP/1.1" - src=198.51.100.88
14:23:51 INFO frontend http-in/srv1 304 0 0/0/0/1/1 "GET /static/fonts/inter-var.woff2 HTTP/1.1" - src=203.0.113.99
14:23:54 INFO backend search-api/search-pod-1 200 2048 0/0/0/16/16 "GET /api/v2/search/trending HTTP/1.1" - src=198.51.100.201
14:23:57 INFO backend catalog-api/catalog-pod-3 200 1536 0/0/0/11/11 "GET /api/v2/products/categories HTTP/1.1" - src=10.0.1.45
14:24:00 INFO backend order-api/order-pod-1 200 768 0/0/0/25/25 "POST /api/v2/cart/items HTTP/1.1" - src=203.0.113.71
14:24:03 INFO backend user-api/user-pod-2 200 512 0/0/0/6/6 "PATCH /api/v2/users/address HTTP/1.1" - src=198.51.100.33
14:24:06 INFO Health check for server search-api/search-pod-1 succeeded
14:24:06 INFO Health check for server search-api/search-pod-2 succeeded
14:24:09 INFO backend catalog-api/catalog-pod-1 200 6144 0/0/0/14/14 "GET /api/v2/products?category=home&page=2 HTTP/1.1" - src=198.51.100.77
14:24:12 INFO backend payment-api/payment-pod-1 200 192 0/0/0/29/29 "POST /api/v2/payments/validate-card HTTP/1.1" - src=10.0.2.18
14:24:15 INFO frontend http-in/srv1 200 384 0/0/0/2/2 "GET /api/v2/health HTTP/1.1" - src=10.0.0.1
14:24:18 INFO backend catalog-api/catalog-pod-2 200 1024 0/0/0/10/10 "GET /api/v2/products/7781/images HTTP/1.1" - src=203.0.113.15
14:24:21 INFO backend search-api/search-pod-2 200 3072 0/0/0/18/18 "POST /api/v2/search/filters HTTP/1.1" - src=10.0.3.92
14:24:25 INFO backend order-api/order-pod-2 200 512 0/0/0/15/15 "PUT /api/v2/cart/items/ITM-2291 HTTP/1.1" - src=198.51.100.14
14:24:28 INFO backend user-api/user-pod-1 200 768 0/0/0/5/5 "GET /api/v2/users/wishlist HTTP/1.1" - src=203.0.113.42
14:24:32 INFO backend catalog-api/catalog-pod-3 200 2048 0/0/0/9/9 "GET /api/v2/reviews?product_id=8842 HTTP/1.1" - src=198.51.100.88
14:24:36 INFO Health check for server catalog-api/catalog-pod-1 succeeded
14:24:36 INFO Health check for server catalog-api/catalog-pod-2 succeeded
14:24:36 INFO Health check for server catalog-api/catalog-pod-3 succeeded
14:24:40 INFO backend catalog-api/catalog-pod-1 200 4096 0/0/0/12/12 "GET /api/v2/products/new-arrivals HTTP/1.1" - src=10.0.1.110
14:24:44 INFO backend payment-api/payment-pod-2 200 128 0/0/0/41/41 "POST /api/v2/payments/webhook HTTP/1.1" - src=10.0.4.55
14:24:48 INFO backend order-api/order-pod-1 200 1024 0/0/0/19/19 "GET /api/v2/orders/ORD-88501/tracking HTTP/1.1" - src=198.51.100.201
14:24:52 INFO frontend http-in/srv1 304 0 0/0/0/1/1 "GET /static/images/logo.svg HTTP/1.1" - src=203.0.113.99
14:24:56 INFO backend user-api/user-pod-2 201 256 0/0/0/8/8 "POST /api/v2/users/register HTTP/1.1" - src=198.51.100.33
14:25:01 INFO backend catalog-api/catalog-pod-2 200 8192 0/0/0/11/11 "GET /api/v2/products?sale=true&limit=100 HTTP/1.1" - src=203.0.113.71
14:25:05 INFO backend search-api/search-pod-1 200 1536 0/0/0/20/20 "GET /api/v2/search?q=kitchen+appliances HTTP/1.1" - src=10.0.1.45
14:25:10 INFO backend catalog-api/catalog-pod-3 200 512 0/0/0/7/7 "GET /api/v2/inventory/check?skus=EL-9921,CL-4421 HTTP/1.1" - src=10.0.2.18
14:25:15 INFO backend order-api/order-pod-2 200 384 0/0/0/16/16 "DELETE /api/v2/cart/items/ITM-1102 HTTP/1.1" - src=198.51.100.77
14:25:20 INFO Health check for server user-api/user-pod-1 succeeded
14:25:20 INFO Health check for server user-api/user-pod-2 succeeded
14:25:25 INFO backend payment-api/payment-pod-1 200 64 0/0/0/11/11 "GET /api/v2/payments/status/PAY-71882 HTTP/1.1" - src=198.51.100.14
14:26:02 INFO backend catalog-api/catalog-pod-1 200 2048 0/0/0/10/10 "GET /api/v2/products/5512 HTTP/1.1" - src=203.0.113.42
14:26:18 INFO backend user-api/user-pod-1 200 384 0/0/0/4/4 "GET /api/v2/users/activity HTTP/1.1" - src=10.0.3.92
14:26:35 INFO backend search-api/search-pod-2 200 4096 0/0/0/23/23 "GET /api/v2/search?q=gaming+mouse&price_min=20&price_max=100 HTTP/1.1" - src=198.51.100.88
14:26:52 INFO backend catalog-api/catalog-pod-2 200 1024 0/0/0/8/8 "GET /api/v2/products/1192/related HTTP/1.1" - src=203.0.113.15
14:27:10 INFO frontend http-in/srv1 200 256 0/0/0/2/2 "GET /api/v2/health HTTP/1.1" - src=10.0.0.1
14:27:28 INFO backend order-api/order-pod-1 200 1536 0/0/0/22/22 "GET /api/v2/orders?user_id=usr-44210&status=shipped HTTP/1.1" - src=198.51.100.201
14:28:05 INFO backend catalog-api/catalog-pod-3 200 3072 0/0/0/13/13 "GET /api/v2/products?category=sports HTTP/1.1" - src=10.0.1.45
14:28:22 INFO backend payment-api/payment-pod-2 200 192 0/0/0/35/35 "POST /api/v2/payments/charge HTTP/1.1" - src=198.51.100.33
14:28:40 INFO Health check for server order-api/order-pod-1 succeeded
14:28:40 INFO Health check for server order-api/order-pod-2 succeeded
14:28:55 INFO backend user-api/user-pod-2 200 512 0/0/0/6/6 "POST /api/v2/users/login HTTP/1.1" - src=203.0.113.71
14:29:12 INFO backend catalog-api/catalog-pod-1 200 2048 0/0/0/9/9 "GET /api/v2/products/bestsellers HTTP/1.1" - src=198.51.100.77
14:29:30 INFO backend search-api/search-pod-1 200 1024 0/0/0/15/15 "GET /api/v2/search/autocomplete?q=sam HTTP/1.1" - src=10.0.1.110
14:29:48 INFO frontend http-in/srv1 304 0 0/0/0/1/1 "GET /static/js/vendor.chunk.js HTTP/1.1" - src=203.0.113.99
14:30:05 INFO backend catalog-api/catalog-pod-2 200 6144 0/0/0/11/11 "GET /api/v2/products?category=beauty&sort=rating HTTP/1.1" - src=198.51.100.14
14:30:22 INFO backend order-api/order-pod-2 201 768 0/0/0/28/28 "POST /api/v2/orders HTTP/1.1" - src=203.0.113.42
14:30:40 INFO Health check for server catalog-api/catalog-pod-1 succeeded
14:30:40 INFO Health check for server catalog-api/catalog-pod-2 succeeded
14:30:40 INFO Health check for server catalog-api/catalog-pod-3 succeeded
14:30:55 INFO backend user-api/user-pod-1 200 128 0/0/0/3/3 "GET /api/v2/users/me HTTP/1.1" - src=10.0.3.92
14:31:10 INFO backend payment-api/payment-pod-1 200 384 0/0/0/42/42 "POST /api/v2/payments/intent HTTP/1.1" - src=198.51.100.88
14:31:28 INFO backend catalog-api/catalog-pod-3 200 1024 0/0/0/10/10 "GET /api/v2/products/3321 HTTP/1.1" - src=203.0.113.15
14:31:45 INFO backend search-api/search-pod-2 200 2048 0/0/0/17/17 "POST /api/v2/search HTTP/1.1" - src=10.0.4.55
14:32:02 INFO backend catalog-api/catalog-pod-1 200 4096 0/0/0/14/14 "GET /api/v2/products?page=5&limit=20 HTTP/1.1" - src=198.51.100.201
14:32:20 INFO backend user-api/user-pod-2 200 256 0/0/0/5/5 "GET /api/v2/users/orders HTTP/1.1" - src=198.51.100.33
14:32:38 INFO backend order-api/order-pod-1 200 512 0/0/0/20/20 "GET /api/v2/cart/count HTTP/1.1" - src=203.0.113.71
14:32:55 INFO Health check for server payment-api/payment-pod-1 succeeded
14:32:55 INFO Health check for server payment-api/payment-pod-2 succeeded
14:33:12 INFO backend catalog-api/catalog-pod-2 200 1536 0/0/0/12/12 "GET /api/v2/products/7781 HTTP/1.1" - src=10.0.1.45
14:33:30 INFO backend search-api/search-pod-1 200 768 0/0/0/13/13 "GET /api/v2/search?q=bluetooth+speaker HTTP/1.1" - src=198.51.100.77
14:33:48 INFO frontend http-in/srv1 200 128 0/0/0/1/1 "GET /favicon.ico HTTP/1.1" - src=203.0.113.99
14:34:02 INFO backend catalog-api/catalog-pod-2 200 2048 0/0/0/52/52 "GET /api/v2/products?category=electronics HTTP/1.1" - src=198.51.100.14
14:34:05 INFO backend user-api/user-pod-1 200 384 0/0/0/5/5 "GET /api/v2/users/me HTTP/1.1" - src=203.0.113.42
14:34:08 INFO backend search-api/search-pod-2 200 1024 0/0/0/14/14 "GET /api/v2/search?q=monitor+4k HTTP/1.1" - src=10.0.1.110
14:34:12 INFO backend catalog-api/catalog-pod-1 200 4096 0/0/0/68/68 "GET /api/v2/products/featured HTTP/1.1" - src=198.51.100.88
14:34:15 INFO backend payment-api/payment-pod-1 200 192 0/0/0/25/25 "GET /api/v2/payments/methods HTTP/1.1" - src=10.0.2.18
14:34:18 INFO backend catalog-api/catalog-pod-3 200 3072 0/0/0/74/74 "GET /api/v2/products?brand=Apple HTTP/1.1" - src=203.0.113.15
14:34:22 INFO backend order-api/order-pod-1 200 768 0/0/0/18/18 "GET /api/v2/orders/ORD-88620 HTTP/1.1" - src=198.51.100.201
14:34:25 INFO Health check for server search-api/search-pod-1 succeeded
14:34:25 INFO Health check for server search-api/search-pod-2 succeeded
14:34:30 INFO backend catalog-api/catalog-pod-2 200 2048 0/0/0/112/112 "GET /api/v2/products/8842 HTTP/1.1" - src=198.51.100.33
14:34:34 INFO backend user-api/user-pod-2 200 512 0/0/0/4/4 "GET /api/v2/users/preferences HTTP/1.1" - src=10.0.3.92
14:34:38 INFO backend catalog-api/catalog-pod-1 200 1024 0/0/0/148/148 "GET /api/v2/products/5512/variants HTTP/1.1" - src=203.0.113.71
14:34:42 INFO frontend http-in/srv1 304 0 0/0/0/1/1 "GET /static/js/bundle.min.js HTTP/1.1" - src=203.0.113.99
14:34:48 INFO backend catalog-api/catalog-pod-3 200 6144 0/0/0/203/203 "GET /api/v2/products?category=clothing&page=1 HTTP/1.1" - src=10.0.1.45
14:34:52 INFO backend search-api/search-pod-1 200 2048 0/0/0/16/16 "GET /api/v2/search/trending HTTP/1.1" - src=198.51.100.77
14:34:56 INFO backend catalog-api/catalog-pod-2 200 4096 0/0/0/245/245 "GET /api/v2/products?sale=true HTTP/1.1" - src=198.51.100.14
14:35:01 INFO backend order-api/order-pod-2 200 384 0/0/0/14/14 "GET /api/v2/cart HTTP/1.1" - src=203.0.113.42
14:35:05 WARN backend catalog-api/catalog-pod-1 200 2048 0/0/0/312/312 "GET /api/v2/products/new-arrivals HTTP/1.1" - src=198.51.100.88 [slow response]
14:35:10 INFO backend user-api/user-pod-1 200 256 0/0/0/6/6 "GET /api/v2/users/notifications HTTP/1.1" - src=10.0.4.55
14:35:15 WARN backend catalog-api/catalog-pod-3 200 1024 0/0/0/438/438 "GET /api/v2/products/categories HTTP/1.1" - src=203.0.113.15 [slow response]
14:35:20 INFO backend payment-api/payment-pod-2 200 128 0/0/0/32/32 "POST /api/v2/payments/verify HTTP/1.1" - src=10.0.2.18
14:35:25 WARN backend catalog-api/catalog-pod-2 200 8192 0/0/0/521/521 "GET /api/v2/products?page=1&limit=50 HTTP/1.1" - src=198.51.100.201 [slow response]
14:35:30 INFO Health check for server catalog-api/catalog-pod-1 succeeded
14:35:30 INFO Health check for server catalog-api/catalog-pod-2 succeeded
14:35:30 INFO Health check for server catalog-api/catalog-pod-3 succeeded
14:35:35 INFO backend search-api/search-pod-2 200 3072 0/0/0/19/19 "GET /api/v2/search?q=desk+lamp HTTP/1.1" - src=198.51.100.33
14:35:42 WARN backend catalog-api/catalog-pod-1 200 2048 0/0/0/814/814 "GET /api/v2/products/bestsellers HTTP/1.1" - src=203.0.113.71 [slow response]
14:35:48 INFO backend order-api/order-pod-1 200 1536 0/0/0/21/21 "GET /api/v2/orders?status=pending HTTP/1.1" - src=10.0.1.110
14:35:54 WARN backend catalog-api/catalog-pod-3 200 4096 0/0/0/887/887 "GET /api/v2/products?category=electronics&sort=price HTTP/1.1" - src=198.51.100.77 [slow response]
14:36:00 INFO backend user-api/user-pod-2 200 384 0/0/0/5/5 "GET /api/v2/users/wishlist HTTP/1.1" - src=203.0.113.42
14:36:08 WARN backend catalog-api/catalog-pod-2 200 1024 0/0/1/1102/1103 "GET /api/v2/products/3321/reviews HTTP/1.1" - src=198.51.100.14 [slow response]
14:36:12 INFO frontend http-in/srv1 304 0 0/0/0/1/1 "GET /static/css/main.css HTTP/1.1" - src=203.0.113.99
14:36:18 WARN backend catalog-api/catalog-pod-1 200 2048 0/0/1/1284/1285 "GET /api/v2/inventory/sku/EL-9921 HTTP/1.1" - src=10.0.1.45 [slow response]
14:36:22 INFO backend search-api/search-pod-1 200 1024 0/0/0/15/15 "GET /api/v2/search/autocomplete?q=hea HTTP/1.1" - src=198.51.100.88
14:36:28 INFO backend payment-api/payment-pod-1 200 256 0/0/0/28/28 "POST /api/v2/payments/intent HTTP/1.1" - src=10.0.4.55
14:36:35 WARN backend catalog-api/catalog-pod-3 200 6144 0/0/2/1589/1591 "GET /api/v2/products?brand=Samsung HTTP/1.1" - src=203.0.113.15 [slow response]
14:36:42 INFO backend order-api/order-pod-2 200 512 0/0/0/16/16 "POST /api/v2/cart/items HTTP/1.1" - src=198.51.100.201
14:36:50 WARN backend catalog-api/catalog-pod-2 200 3072 0/0/3/1812/1815 "GET /api/v2/products/deals HTTP/1.1" - src=198.51.100.33 [slow response]
14:36:55 INFO backend user-api/user-pod-1 200 128 0/0/0/4/4 "POST /api/v2/users/login HTTP/1.1" - src=203.0.113.71
14:37:02 WARN backend catalog-api/catalog-pod-1 200 2048 0/0/4/1998/2002 "GET /api/v2/products/8842 HTTP/1.1" - src=198.51.100.77 [slow response]
14:37:08 INFO Health check for server user-api/user-pod-1 succeeded
14:37:08 INFO Health check for server user-api/user-pod-2 succeeded
14:37:12 INFO backend search-api/search-pod-2 200 2048 0/0/0/18/18 "GET /api/v2/search?q=protein+powder HTTP/1.1" - src=10.0.3.92
14:37:20 WARN backend catalog-api/catalog-pod-3 200 1024 0/0/8/2814/2822 "GET /api/v2/products/featured HTTP/1.1" - src=10.0.1.110 [very slow response]
14:37:28 WARN backend catalog-api/catalog-pod-2 200 4096 0/0/12/3102/3114 "GET /api/v2/products?category=home HTTP/1.1" - src=198.51.100.14 [very slow response]
14:37:32 INFO backend order-api/order-pod-1 200 768 0/0/0/19/19 "GET /api/v2/orders/ORD-88710 HTTP/1.1" - src=203.0.113.42
14:37:38 INFO backend payment-api/payment-pod-2 200 192 0/0/0/36/36 "POST /api/v2/payments/charge HTTP/1.1" - src=10.0.2.18
14:37:45 WARN backend catalog-api/catalog-pod-1 200 2048 0/0/18/3891/3909 "GET /api/v2/products?page=2&limit=20 HTTP/1.1" - src=198.51.100.88 [very slow response]
14:37:52 INFO backend user-api/user-pod-2 200 512 0/0/0/7/7 "GET /api/v2/users/me HTTP/1.1" - src=198.51.100.201
14:38:00 ERROR backend catalog-api/catalog-pod-3 504 0 0/0/30001/-1/30001 "GET /api/v2/products/5512 HTTP/1.1" - src=203.0.113.15 [request timeout]
14:38:05 INFO backend search-api/search-pod-1 200 1536 0/0/0/21/21 "POST /api/v2/search HTTP/1.1" - src=198.51.100.33
14:38:10 WARN backend catalog-api/catalog-pod-2 200 1024 0/0/22/4201/4223 "GET /api/v2/products/categories HTTP/1.1" - src=203.0.113.71 [very slow response]
14:38:15 INFO frontend http-in/srv1 200 384 0/0/0/2/2 "GET /api/v2/health HTTP/1.1" - src=10.0.0.1
14:38:22 ERROR backend catalog-api/catalog-pod-1 504 0 0/0/30001/-1/30001 "GET /api/v2/products/new-arrivals HTTP/1.1" - src=198.51.100.77 [request timeout]
14:38:28 INFO backend order-api/order-pod-2 200 384 0/0/0/15/15 "GET /api/v2/cart HTTP/1.1" - src=10.0.1.45
14:38:35 ERROR backend catalog-api/catalog-pod-3 504 0 0/0/30001/-1/30001 "GET /api/v2/products?brand=Nike HTTP/1.1" - src=198.51.100.14 [request timeout]
14:38:40 INFO backend user-api/user-pod-1 200 256 0/0/0/5/5 "PATCH /api/v2/users/address HTTP/1.1" - src=203.0.113.42
14:38:45 WARN Health check for server catalog-api/catalog-pod-1 succeeded but took 4200ms
14:38:45 WARN Health check for server catalog-api/catalog-pod-2 succeeded but took 3800ms
14:38:45 WARN Health check for server catalog-api/catalog-pod-3 succeeded but took 5100ms
14:38:52 ERROR backend catalog-api/catalog-pod-2 504 0 0/0/30001/-1/30001 "GET /api/v2/products/bestsellers HTTP/1.1" - src=10.0.1.110 [request timeout]
14:38:58 INFO backend payment-api/payment-pod-1 200 128 0/0/0/30/30 "POST /api/v2/payments/validate-card HTTP/1.1" - src=10.0.4.55
14:39:05 ERROR backend catalog-api/catalog-pod-1 504 0 0/0/30001/-1/30001 "GET /api/v2/products?sale=true&limit=100 HTTP/1.1" - src=198.51.100.88 [request timeout]
14:39:10 INFO backend search-api/search-pod-2 200 2048 0/0/0/17/17 "GET /api/v2/search?q=yoga+mat HTTP/1.1" - src=198.51.100.201
14:39:18 ERROR backend catalog-api/catalog-pod-3 504 0 0/0/30001/-1/30001 "GET /api/v2/inventory/check?skus=SP-1120 HTTP/1.1" - src=203.0.113.15 [request timeout]
14:39:25 INFO backend order-api/order-pod-1 200 1024 0/0/0/22/22 "GET /api/v2/orders?user_id=usr-55102 HTTP/1.1" - src=198.51.100.33
14:39:32 ERROR backend catalog-api/catalog-pod-2 504 0 0/0/30001/-1/30001 "GET /api/v2/products/7781/images HTTP/1.1" - src=203.0.113.71 [request timeout]
14:39:40 INFO backend user-api/user-pod-2 200 384 0/0/0/6/6 "GET /api/v2/users/activity HTTP/1.1" - src=10.0.3.92
14:39:48 ERROR backend catalog-api/catalog-pod-1 504 0 0/0/30001/-1/30001 "GET /api/v2/reviews?product_id=3321 HTTP/1.1" - src=198.51.100.77 [request timeout]
14:40:01 ERROR Health check for server catalog-api/catalog-pod-2 failed, reason: Layer7 timeout (30s)
14:40:01 ERROR Health check for server catalog-api/catalog-pod-3 failed, reason: Layer7 timeout (30s)
14:40:05 ERROR backend catalog-api/catalog-pod-1 504 0 0/0/30001/-1/30001 "GET /api/v2/products?category=electronics HTTP/1.1" - src=198.51.100.14 [request timeout]
14:40:08 INFO backend search-api/search-pod-1 200 1024 0/0/0/14/14 "GET /api/v2/search?q=coffee+maker HTTP/1.1" - src=10.0.1.45
14:40:12 ERROR Health check for server catalog-api/catalog-pod-1 failed, reason: Layer7 timeout (30s)
14:40:15 WARN Server catalog-api/catalog-pod-2 is DOWN, reason: Layer7 timeout, check duration: 30004ms, status: 0/2 UP
14:40:15 WARN Server catalog-api/catalog-pod-3 is DOWN, reason: Layer7 timeout, check duration: 30001ms, status: 0/2 UP
14:40:20 ERROR backend catalog-api/catalog-pod-1 504 0 0/0/30001/-1/30001 "GET /api/v2/products/featured HTTP/1.1" - src=203.0.113.42 [request timeout]
14:40:22 INFO backend user-api/user-pod-1 200 512 0/0/0/4/4 "GET /api/v2/users/notifications HTTP/1.1" - src=198.51.100.201
14:40:25 INFO backend payment-api/payment-pod-2 200 64 0/0/0/25/25 "GET /api/v2/payments/status/PAY-72001 HTTP/1.1" - src=10.0.2.18
14:40:30 WARN backend order-api/order-pod-1 200 768 0/0/0/892/892 "POST /api/v2/orders HTTP/1.1" - src=198.51.100.88 [slow response - upstream catalog timeout?]
14:40:35 ERROR backend catalog-api/catalog-pod-1 504 0 0/0/30001/-1/30001 "GET /api/v2/products/8842 HTTP/1.1" - src=198.51.100.33 [request timeout]
14:40:38 INFO frontend http-in/srv1 304 0 0/0/0/1/1 "GET /static/images/hero-banner.webp HTTP/1.1" - src=203.0.113.99
14:40:42 WARN Server catalog-api/catalog-pod-1 is DOWN, reason: Layer7 timeout, check duration: 30002ms, status: 0/3 UP
14:40:42 ERROR backend catalog-api has no server available!
14:40:48 ERROR frontend http-in 503 0 0/0/0/-1/0 "GET /api/v2/products?category=clothing HTTP/1.1" - src=203.0.113.15 [backend catalog-api has no server available]
14:40:52 INFO backend search-api/search-pod-2 200 2048 0/0/0/20/20 "GET /api/v2/search?q=headphones HTTP/1.1" - src=198.51.100.77
14:40:55 ERROR frontend http-in 503 0 0/0/0/-1/0 "GET /api/v2/products/5512 HTTP/1.1" - src=10.0.1.110 [backend catalog-api has no server available]
14:41:00 WARN backend order-api/order-pod-2 500 256 0/0/0/1205/1205 "POST /api/v2/orders HTTP/1.1" - src=198.51.100.14 [internal error - upstream dependency failure]
14:41:03 INFO backend user-api/user-pod-2 200 256 0/0/0/5/5 "GET /api/v2/users/me HTTP/1.1" - src=10.0.3.92
14:41:06 ERROR frontend http-in 503 0 0/0/0/-1/0 "GET /api/v2/products/deals HTTP/1.1" - src=198.51.100.201 [backend catalog-api has no server available]
14:41:10 WARN backend order-api/order-pod-1 500 256 0/0/0/2104/2104 "POST /api/v2/cart/items HTTP/1.1" - src=203.0.113.71 [internal error - catalog lookup failed]
14:41:14 ERROR frontend http-in 503 0 0/0/0/-1/0 "GET /api/v2/products?brand=Apple HTTP/1.1" - src=198.51.100.88 [backend catalog-api has no server available]
14:41:18 INFO backend payment-api/payment-pod-1 200 384 0/0/0/31/31 "POST /api/v2/payments/webhook HTTP/1.1" - src=10.0.4.55
14:41:22 WARN backend order-api/order-pod-2 504 0 0/0/30001/-1/30001 "GET /api/v2/orders/ORD-88801 HTTP/1.1" - src=198.51.100.33 [timeout - blocked on catalog dependency]
14:41:26 ERROR frontend http-in 503 0 0/0/0/-1/0 "GET /api/v2/products/new-arrivals HTTP/1.1" - src=203.0.113.42 [backend catalog-api has no server available]
14:41:30 INFO backend search-api/search-pod-1 200 3072 0/0/0/16/16 "GET /api/v2/search?q=winter+jacket HTTP/1.1" - src=10.0.1.45
14:41:34 ERROR Health check for server catalog-api/catalog-pod-1 failed, reason: Layer7 timeout (30s)
14:41:34 ERROR Health check for server catalog-api/catalog-pod-2 failed, reason: Layer7 timeout (30s)
14:41:34 ERROR Health check for server catalog-api/catalog-pod-3 failed, reason: Layer7 timeout (30s)
14:41:38 WARN backend order-api/order-pod-1 500 128 0/0/0/3201/3201 "POST /api/v2/orders HTTP/1.1" - src=198.51.100.77 [internal error - catalog dependency timeout]
14:41:42 ERROR frontend http-in 503 0 0/0/0/-1/0 "GET /api/v2/inventory/sku/CL-4421 HTTP/1.1" - src=203.0.113.15 [backend catalog-api has no server available]
14:41:46 INFO backend user-api/user-pod-1 200 128 0/0/0/3/3 "DELETE /api/v2/users/sessions/expired HTTP/1.1" - src=198.51.100.14
14:41:50 WARN Health check for server order-api/order-pod-1 failed (timeout)
14:41:54 ERROR frontend http-in 503 0 0/0/0/-1/0 "GET /api/v2/products?page=1 HTTP/1.1" - src=10.0.1.110 [backend catalog-api has no server available]
14:41:58 WARN backend order-api/order-pod-2 504 0 0/0/30001/-1/30001 "POST /api/v2/cart/items HTTP/1.1" - src=198.51.100.201 [timeout]
14:42:02 WARN backend order-api/order-pod-1 503 0 0/0/0/102/102 "POST /api/v2/orders HTTP/1.1" - src=198.51.100.88 [retried 2 times, all failed]
14:42:06 INFO backend payment-api/payment-pod-2 200 192 0/0/0/28/28 "POST /api/v2/payments/verify HTTP/1.1" - src=10.0.2.18
14:42:10 ERROR frontend http-in 503 0 0/0/0/-1/0 "GET /api/v2/products/bestsellers HTTP/1.1" - src=203.0.113.42 [backend catalog-api has no server available]
14:42:14 WARN Health check for server order-api/order-pod-2 failed (timeout)
14:42:18 WARN Server order-api/order-pod-1 is DOWN, reason: Layer7 timeout, check duration: 30003ms, status: 0/1 UP
14:42:22 INFO backend search-api/search-pod-2 200 1536 0/0/0/19/19 "GET /api/v2/search?q=backpack HTTP/1.1" - src=198.51.100.33
14:42:26 ERROR frontend http-in 503 0 0/0/0/-1/0 "GET /api/v2/products/8842/reviews HTTP/1.1" - src=203.0.113.71 [backend catalog-api has no server available]
14:42:30 ERROR backend order-api/order-pod-2 502 0 0/0/0/-1/0 "GET /api/v2/orders/ORD-88850 HTTP/1.1" - src=198.51.100.14 [connection refused]
14:42:35 WARN Server order-api/order-pod-2 is DOWN, reason: Layer7 timeout, check duration: 30005ms, status: 0/2 UP
14:42:35 ERROR backend order-api has no server available!
14:42:40 INFO backend user-api/user-pod-2 200 384 0/0/0/6/6 "GET /api/v2/users/preferences HTTP/1.1" - src=10.0.3.92
14:42:44 ERROR frontend http-in 503 0 0/0/0/-1/0 "POST /api/v2/orders HTTP/1.1" - src=198.51.100.77 [backend order-api has no server available]
14:42:48 ERROR frontend http-in 503 0 0/0/0/-1/0 "GET /api/v2/products?category=sports HTTP/1.1" - src=10.0.1.45 [backend catalog-api has no server available]
14:42:52 INFO Health check for server payment-api/payment-pod-1 succeeded
14:42:52 INFO Health check for server payment-api/payment-pod-2 succeeded
14:42:56 ERROR frontend http-in 503 0 0/0/0/-1/0 "GET /api/v2/cart HTTP/1.1" - src=203.0.113.15 [backend order-api has no server available]
14:43:00 INFO backend search-api/search-pod-1 200 2048 0/0/0/22/22 "GET /api/v2/search?q=tablet HTTP/1.1" - src=198.51.100.88
14:43:04 ERROR frontend http-in 503 0 0/0/0/-1/0 "POST /api/v2/cart/items HTTP/1.1" - src=198.51.100.201 [backend order-api has no server available]
14:43:08 ERROR Health check for server catalog-api/catalog-pod-1 failed, reason: Layer7 timeout (30s)
14:43:08 ERROR Health check for server catalog-api/catalog-pod-2 failed, reason: Layer7 timeout (30s)
14:43:08 ERROR Health check for server catalog-api/catalog-pod-3 failed, reason: Layer7 timeout (30s)
14:43:12 INFO backend user-api/user-pod-1 200 256 0/0/0/4/4 "GET /api/v2/users/me HTTP/1.1" - src=203.0.113.42
14:43:16 ERROR Health check for server order-api/order-pod-1 failed, reason: Layer7 timeout (30s)
14:43:16 ERROR Health check for server order-api/order-pod-2 failed, reason: Layer7 timeout (30s)
14:43:20 ERROR frontend http-in 503 0 0/0/0/-1/0 "GET /api/v2/products/categories HTTP/1.1" - src=198.51.100.33 [backend catalog-api has no server available]
14:43:25 INFO backend payment-api/payment-pod-1 200 128 0/0/0/33/33 "POST /api/v2/payments/charge HTTP/1.1" - src=10.0.4.55
14:44:01 ERROR frontend http-in 503 0 0/0/0/-1/0 "GET /api/v2/products HTTP/1.1" - src=198.51.100.14 [backend catalog-api has no server available]
14:44:04 ERROR frontend http-in 503 0 0/0/0/-1/0 "POST /api/v2/orders HTTP/1.1" - src=203.0.113.71 [backend order-api has no server available]
14:44:07 INFO backend user-api/user-pod-2 200 512 0/0/0/5/5 "POST /api/v2/users/login HTTP/1.1" - src=10.0.3.92
14:44:10 ERROR frontend http-in 503 0 0/0/0/-1/0 "GET /api/v2/products/featured HTTP/1.1" - src=198.51.100.88 [backend catalog-api has no server available]
14:44:13 INFO backend search-api/search-pod-2 200 1024 0/0/0/15/15 "GET /api/v2/search?q=water+bottle HTTP/1.1" - src=10.0.1.45
14:44:16 ERROR frontend http-in 503 0 0/0/0/-1/0 "GET /api/v2/cart HTTP/1.1" - src=198.51.100.33 [backend order-api has no server available]
14:44:19 ERROR frontend http-in 503 0 0/0/0/-1/0 "GET /api/v2/products?sale=true HTTP/1.1" - src=203.0.113.15 [backend catalog-api has no server available]
14:44:22 INFO backend payment-api/payment-pod-2 200 256 0/0/0/29/29 "POST /api/v2/payments/intent HTTP/1.1" - src=10.0.2.18
14:44:25 ERROR frontend http-in 503 0 0/0/0/-1/0 "GET /api/v2/orders?user_id=usr-44210 HTTP/1.1" - src=198.51.100.201 [backend order-api has no server available]
14:44:28 INFO Health check for server user-api/user-pod-1 succeeded
14:44:28 INFO Health check for server user-api/user-pod-2 succeeded
14:44:31 ERROR frontend http-in 503 0 0/0/0/-1/0 "GET /api/v2/products/8842 HTTP/1.1" - src=198.51.100.77 [backend catalog-api has no server available]
14:44:34 INFO backend search-api/search-pod-1 200 2048 0/0/0/18/18 "GET /api/v2/search?q=sneakers HTTP/1.1" - src=203.0.113.42
14:44:37 ERROR frontend http-in 503 0 0/0/0/-1/0 "POST /api/v2/cart/items HTTP/1.1" - src=10.0.1.110 [backend order-api has no server available]
14:44:40 ERROR Health check for server catalog-api/catalog-pod-1 failed, reason: Layer7 timeout (30s)
14:44:40 ERROR Health check for server catalog-api/catalog-pod-2 failed, reason: Layer7 timeout (30s)
14:44:40 ERROR Health check for server catalog-api/catalog-pod-3 failed, reason: Layer7 timeout (30s)
14:44:43 INFO backend user-api/user-pod-1 200 384 0/0/0/6/6 "GET /api/v2/users/wishlist HTTP/1.1" - src=198.51.100.14
14:44:46 ERROR frontend http-in 503 0 0/0/0/-1/0 "GET /api/v2/products?brand=Samsung HTTP/1.1" - src=203.0.113.99 [backend catalog-api has no server available]
14:44:50 ERROR Health check for server order-api/order-pod-1 failed, reason: Layer7 timeout (30s)
  256. 25614:44:50ERRORHealth check for server order-api/order-pod-2 failed, reason: Layer7 timeout (30s)
  257. 25714:44:55INFObackend payment-api/payment-pod-1 200 192 0/0/0/27/27 "POST /api/v2/payments/refund HTTP/1.1" - src=10.0.4.55
  258. 25814:44:58ERRORfrontend http-in 503 0 0/0/0/-1/0 "GET /api/v2/products/deals HTTP/1.1" - src=198.51.100.88 [backend catalog-api has no server available]
  259. 25914:45:02ERRORfrontend http-in 503 0 0/0/0/-1/0 "GET /api/v2/orders/ORD-88901 HTTP/1.1" - src=198.51.100.201 [backend order-api has no server available]
  260. 26014:45:06INFObackend search-api/search-pod-2 200 1536 0/0/0/21/21 "POST /api/v2/search/suggest HTTP/1.1" - src=203.0.113.71
  261. 26114:45:10INFOHealth check for server search-api/search-pod-1 succeeded
  262. 26214:45:10INFOHealth check for server search-api/search-pod-2 succeeded
  263. 26314:45:14ERRORfrontend http-in 503 0 0/0/0/-1/0 "GET /api/v2/products?category=home HTTP/1.1" - src=10.0.1.45 [backend catalog-api has no server available]
  264. 26414:45:18INFObackend user-api/user-pod-2 200 128 0/0/0/4/4 "GET /api/v2/users/notifications HTTP/1.1" - src=198.51.100.33
  265. 26514:45:22ERRORfrontend http-in 503 0 0/0/0/-1/0 "PUT /api/v2/cart/items/ITM-2291 HTTP/1.1" - src=203.0.113.15 [backend order-api has no server available]
  266. 26614:45:30ERRORfrontend http-in 503 0 0/0/0/-1/0 "GET /api/v2/products/categories HTTP/1.1" - src=198.51.100.77 [backend catalog-api has no server available]
  267. 26714:45:38INFOfrontend http-in/srv1 200 384 0/0/0/2/2 "GET /api/v2/health HTTP/1.1" - src=10.0.0.1
  268. 26814:45:45ERRORfrontend http-in 503 0 0/0/0/-1/0 "GET /api/v2/products/3321 HTTP/1.1" - src=10.0.1.110 [backend catalog-api has no server available]
  269. 26914:45:52INFObackend payment-api/payment-pod-2 200 64 0/0/0/24/24 "GET /api/v2/payments/methods HTTP/1.1" - src=203.0.113.42
  270. 27014:46:00ERRORfrontend http-in 503 0 0/0/0/-1/0 "GET /api/v2/inventory/sku/EL-9921 HTTP/1.1" - src=198.51.100.14 [backend catalog-api has no server available]
  271. 27114:46:08ERRORHealth check for server catalog-api/catalog-pod-1 failed, reason: Layer7 timeout (30s)
  272. 27214:46:08ERRORHealth check for server catalog-api/catalog-pod-2 failed, reason: Layer7 timeout (30s)
  273. 27314:46:08ERRORHealth check for server catalog-api/catalog-pod-3 failed, reason: Layer7 timeout (30s)
  274. 27414:46:15INFObackend search-api/search-pod-1 200 2048 0/0/0/17/17 "GET /api/v2/search?q=air+fryer HTTP/1.1" - src=198.51.100.88
  275. 27514:46:22ERRORfrontend http-in 503 0 0/0/0/-1/0 "GET /api/v2/products?page=1&limit=20 HTTP/1.1" - src=198.51.100.201 [backend catalog-api has no server available]
  276. 27614:46:30ERRORHealth check for server order-api/order-pod-1 failed, reason: Layer7 timeout (30s)
  277. 27714:46:30ERRORHealth check for server order-api/order-pod-2 failed, reason: Layer7 timeout (30s)
  278. 27814:46:38INFObackend user-api/user-pod-1 200 256 0/0/0/5/5 "GET /api/v2/users/orders HTTP/1.1" - src=10.0.3.92
  279. 27914:46:45ERRORfrontend http-in 503 0 0/0/0/-1/0 "POST /api/v2/orders HTTP/1.1" - src=203.0.113.71 [backend order-api has no server available]
  280. 28014:46:52ERRORfrontend http-in 503 0 0/0/0/-1/0 "GET /api/v2/products/bestsellers HTTP/1.1" - src=198.51.100.33 [backend catalog-api has no server available]
  281. 28114:47:00INFObackend payment-api/payment-pod-1 200 128 0/0/0/30/30 "POST /api/v2/payments/validate-card HTTP/1.1" - src=10.0.2.18
  282. 28214:47:08ERRORfrontend http-in 503 0 0/0/0/-1/0 "GET /api/v2/products?brand=Nike HTTP/1.1" - src=10.0.1.45 [backend catalog-api has no server available]
  283. 28314:47:15INFObackend search-api/search-pod-2 200 1024 0/0/0/19/19 "GET /api/v2/search/trending HTTP/1.1" - src=198.51.100.77
  284. 28414:47:22INFOHealth check for server payment-api/payment-pod-1 succeeded
  285. 28514:47:22INFOHealth check for server payment-api/payment-pod-2 succeeded
  286. 28614:47:30ERRORfrontend http-in 503 0 0/0/0/-1/0 "GET /api/v2/products/new-arrivals HTTP/1.1" - src=203.0.113.15 [backend catalog-api has no server available]
  287. 28714:47:38ERRORfrontend http-in 503 0 0/0/0/-1/0 "GET /api/v2/cart HTTP/1.1" - src=198.51.100.14 [backend order-api has no server available]
  288. 28814:47:45INFObackend user-api/user-pod-2 200 384 0/0/0/6/6 "PATCH /api/v2/users/address HTTP/1.1" - src=203.0.113.42
  289. 28914:48:02ERRORHealth check for server catalog-api/catalog-pod-1 failed, reason: Layer7 timeout (30s)
  290. 29014:48:02INFOHealth check for server catalog-api/catalog-pod-2 succeeded (recovered)
  291. 29114:48:02ERRORHealth check for server catalog-api/catalog-pod-3 failed, reason: Layer7 timeout (30s)
  292. 29214:48:05INFOServer catalog-api/catalog-pod-2 is UP, reason: Layer7 check passed, status: 1/3 UP
  293. 29314:48:10WARNbackend catalog-api/catalog-pod-2 200 2048 0/0/0/1842/1842 "GET /api/v2/products?category=electronics HTTP/1.1" - src=198.51.100.88 [slow response - recovering]
  294. 29414:48:14INFObackend search-api/search-pod-1 200 1536 0/0/0/15/15 "GET /api/v2/search?q=phone+case HTTP/1.1" - src=198.51.100.201
  295. 29514:48:18INFObackend user-api/user-pod-1 200 512 0/0/0/4/4 "GET /api/v2/users/me HTTP/1.1" - src=10.0.3.92
  296. 29614:48:22INFOHealth check for server catalog-api/catalog-pod-1 succeeded (recovered)
  297. 29714:48:22INFOServer catalog-api/catalog-pod-1 is UP, reason: Layer7 check passed, status: 2/3 UP
  298. 29814:48:28WARNbackend catalog-api/catalog-pod-1 200 4096 0/0/0/982/982 "GET /api/v2/products/featured HTTP/1.1" - src=198.51.100.33 [slow response - recovering]
  299. 29914:48:32INFObackend payment-api/payment-pod-2 200 192 0/0/0/26/26 "POST /api/v2/payments/charge HTTP/1.1" - src=10.0.4.55
  300. 30014:48:38INFOHealth check for server order-api/order-pod-2 succeeded (recovered)
  301. 30114:48:38INFOServer order-api/order-pod-2 is UP, reason: Layer7 check passed, status: 1/2 UP
  302. 30214:48:42INFOHealth check for server catalog-api/catalog-pod-3 succeeded (recovered)
  303. 30314:48:42INFOServer catalog-api/catalog-pod-3 is UP, reason: Layer7 check passed, status: 3/3 UP
  304. 30414:48:48WARNbackend catalog-api/catalog-pod-2 200 1024 0/0/0/445/445 "GET /api/v2/products/8842 HTTP/1.1" - src=203.0.113.71 [response improving]
  305. 30514:48:52INFObackend order-api/order-pod-2 200 768 0/0/0/312/312 "GET /api/v2/cart HTTP/1.1" - src=198.51.100.14
  306. 30614:48:58INFOHealth check for server order-api/order-pod-1 succeeded (recovered)
  307. 30714:48:58INFOServer order-api/order-pod-1 is UP, reason: Layer7 check passed, status: 2/2 UP
  308. 30814:49:05INFObackend catalog-api/catalog-pod-3 200 3072 0/0/0/198/198 "GET /api/v2/products?category=clothing HTTP/1.1" - src=203.0.113.15
  309. 30914:49:12INFObackend catalog-api/catalog-pod-1 200 2048 0/0/0/145/145 "GET /api/v2/products/new-arrivals HTTP/1.1" - src=198.51.100.77
  310. 31014:49:18INFObackend order-api/order-pod-1 200 512 0/0/0/88/88 "POST /api/v2/cart/items HTTP/1.1" - src=10.0.1.110
  311. 31114:49:25INFObackend user-api/user-pod-2 200 256 0/0/0/5/5 "GET /api/v2/users/activity HTTP/1.1" - src=203.0.113.42
  312. 31214:49:32INFObackend catalog-api/catalog-pod-2 200 4096 0/0/0/82/82 "GET /api/v2/products?brand=Apple HTTP/1.1" - src=198.51.100.88
  313. 31314:49:40INFObackend search-api/search-pod-2 200 2048 0/0/0/16/16 "GET /api/v2/search?q=laptop+stand HTTP/1.1" - src=10.0.1.45
  314. 31414:49:48INFOHealth check for server catalog-api/catalog-pod-1 succeeded
  315. 31514:49:48INFOHealth check for server catalog-api/catalog-pod-2 succeeded
  316. 31614:49:48INFOHealth check for server catalog-api/catalog-pod-3 succeeded
  317. 31714:50:02INFObackend catalog-api/catalog-pod-3 200 1536 0/0/0/34/34 "GET /api/v2/products/bestsellers HTTP/1.1" - src=198.51.100.201
  318. 31814:50:15INFObackend order-api/order-pod-2 201 768 0/0/0/28/28 "POST /api/v2/orders HTTP/1.1" - src=198.51.100.33
  319. 31914:50:28INFObackend catalog-api/catalog-pod-1 200 2048 0/0/0/15/15 "GET /api/v2/products/deals HTTP/1.1" - src=203.0.113.71
  320. 32014:50:42INFObackend payment-api/payment-pod-1 200 384 0/0/0/31/31 "POST /api/v2/payments/charge HTTP/1.1" - src=10.0.2.18
  321. 32114:50:55INFOHealth check for server order-api/order-pod-1 succeeded
  322. 32214:50:55INFOHealth check for server order-api/order-pod-2 succeeded
  323. 32314:51:10INFObackend catalog-api/catalog-pod-2 200 6144 0/0/0/12/12 "GET /api/v2/products?page=1&limit=50 HTTP/1.1" - src=198.51.100.14
  324. 32414:51:25INFObackend user-api/user-pod-1 200 128 0/0/0/3/3 "GET /api/v2/users/me HTTP/1.1" - src=10.0.3.92
  325. 32514:51:40INFObackend order-api/order-pod-1 200 1024 0/0/0/22/22 "GET /api/v2/orders/ORD-89001/status HTTP/1.1" - src=203.0.113.15
  326. 32614:51:55INFObackend catalog-api/catalog-pod-3 200 2048 0/0/0/11/11 "GET /api/v2/products?category=beauty HTTP/1.1" - src=198.51.100.77
  327. 32714:52:10INFObackend search-api/search-pod-1 200 1024 0/0/0/14/14 "GET /api/v2/search?q=wireless+earbuds HTTP/1.1" - src=10.0.1.110
  328. 32814:52:30INFOHealth check for server user-api/user-pod-1 succeeded
  329. 32914:52:30INFOHealth check for server user-api/user-pod-2 succeeded
  330. 33014:52:50INFObackend catalog-api/catalog-pod-1 200 4096 0/0/0/10/10 "GET /api/v2/products/featured HTTP/1.1" - src=198.51.100.88
  331. 33114:53:10INFObackend order-api/order-pod-2 200 384 0/0/0/15/15 "GET /api/v2/cart HTTP/1.1" - src=203.0.113.42
  332. 33214:53:30INFOHealth check for server catalog-api/catalog-pod-1 succeeded
  333. 33314:53:30INFOHealth check for server catalog-api/catalog-pod-2 succeeded
  334. 33414:53:30INFOHealth check for server catalog-api/catalog-pod-3 succeeded
  335. 33514:53:50INFObackend payment-api/payment-pod-2 200 256 0/0/0/27/27 "POST /api/v2/payments/verify HTTP/1.1" - src=10.0.4.55
  336. 33614:54:10INFObackend catalog-api/catalog-pod-2 200 1024 0/0/0/9/9 "GET /api/v2/products/7781 HTTP/1.1" - src=198.51.100.201
  337. 33714:54:30INFObackend user-api/user-pod-2 200 512 0/0/0/6/6 "GET /api/v2/users/preferences HTTP/1.1" - src=198.51.100.33
  338. 33814:54:50INFObackend catalog-api/catalog-pod-3 200 3072 0/0/0/13/13 "GET /api/v2/products?category=sports&sort=rating HTTP/1.1" - src=203.0.113.71
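
As a rough illustration of the kind of correlation these lines allow, the sketch below tallies 503 responses per backend by matching the bracketed "has no server available" note. It is a minimal example, not a description of how Fawdy works internally; the function name count_unavailable and the trimmed sample lines are assumptions made for the example.

```python
import re
from collections import Counter

# Matches the bracketed note naming the backend that had no healthy server.
NO_SERVER = re.compile(r"\[backend (\S+) has no server available\]")

def count_unavailable(lines):
    """Count 'no server available' entries per backend (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = NO_SERVER.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# Trimmed excerpts in the shape of the log lines shown above.
sample = [
    'ERROR frontend http-in 503 0 0/0/0/-1/0 "GET /api/v2/cart HTTP/1.1" - src=203.0.113.15 [backend order-api has no server available]',
    'INFO backend search-api/search-pod-1 200 2048 0/0/0/22/22 "GET /api/v2/search?q=tablet HTTP/1.1" - src=198.51.100.88',
    'ERROR frontend http-in 503 0 0/0/0/-1/0 "GET /api/v2/products HTTP/1.1" - src=198.51.100.14 [backend catalog-api has no server available]',
    'ERROR frontend http-in 503 0 0/0/0/-1/0 "POST /api/v2/orders HTTP/1.1" - src=203.0.113.71 [backend order-api has no server available]',
]

for backend, hits in count_unavailable(sample).most_common():
    print(backend, hits)
```

Matching on the bracketed note rather than on column positions keeps the sketch independent of how the timestamp and level fields are laid out.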

Incident report

SEV-1 · INC-2026-0212-003 · 2026-02-12

Cascading Service Failure — Catalog Cache OOM

33 min duration · 4 services affected · ~34k failed requests

Detection

Automated page via PagerDuty at 14:40 UTC, when catalog-service p99 latency exceeded the 2s SLO threshold. The alert fired 18 minutes after the deploy because no cache-size or GC-frequency monitors existed.

Summary

catalog-service v2.14.0 was deployed with its cache TTL set to Integer.MAX_VALUE (effectively infinite). The unbounded cache consumed the full 4GB heap, triggering a G1GC death spiral (Full GC pauses >10s). order-service then failed as its calls to catalog timed out and exhausted its own thread pool. API gateway retries amplified load 3x. Kubernetes restarts caused immediate re-OOM via eager cache reload, producing a CrashLoopBackOff loop. Resolved by rollback to v2.13.8.

Timeline

  1. 14:22 catalog-service v2.14.0 deployed (rolling update, 3 pods)
  2. 14:34 GC pauses climbing; cache >1.5GB, no evictions
  3. 14:38 Catalog p99 >500ms; gateway starts retrying
  4. 14:40 Full GC triggered; pauses >2s. PagerDuty alert fires
  5. 14:41 order-service 80% timeout rate; thread pool saturated (200/200)
  6. 14:42 Gateway circuit breaker opens for catalog-service
  7. 14:43 catalog-service OOMKilled (4198Mi used / 4096Mi limit)
  8. 14:44 Restarted pod eagerly reloads cache → re-OOM in <60s
  9. 14:44 order-service OOMKilled (heap exhausted by queued requests)
  10. 14:45 CrashLoopBackOff on catalog-service (10s → 20s → 40s backoff)
  11. 14:45 HAProxy marks catalog-api + order-api backends DOWN
  12. 14:48 On-call identifies cache TTL misconfiguration; initiates rollback
  13. 14:51 Rolled-back pods healthy (TTL=3600s, maxSize=50k)
  14. 14:53 order-service recovered; circuit breakers closed; backends UP
  15. 14:55 All services nominal; stable memory + latency confirmed
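
The 10s → 20s → 40s backoff in the timeline is a doubling delay. A minimal sketch of that arithmetic follows; the 300s cap is an assumption taken from Kubernetes' documented default of five minutes, not something stated in this report, and backoff_delays is a hypothetical helper name.

```python
def backoff_delays(restarts, base_s=10, cap_s=300):
    """Delay before each restart: doubles from base_s, never above cap_s.

    cap_s=300 is assumed from Kubernetes' documented default, not from the report.
    """
    return [min(base_s * 2 ** i, cap_s) for i in range(restarts)]

print(backoff_delays(3))  # [10, 20, 40]
print(backoff_delays(6))  # [10, 20, 40, 80, 160, 300]
```

Because every restart reproduced the OOM within a minute, each longer wait only delayed the next crash rather than giving the pod time to recover.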

Root Cause

catalog-service v2.14.0 set ProductCacheManager TTL to Integer.MAX_VALUE instead of 3600s. Cache grew unbounded, filling the 4GB heap. G1GC could not reclaim live cache references → Full GC pauses escalated to 10s+. On restart, eager cache reload from DB reproduced the OOM immediately, preventing self-healing.
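
To make the failure mode concrete, here is a minimal sketch of a cache bounded by both a TTL and an entry cap, using the TTL=3600s and maxSize=50,000 values from the rolled-back config. It is an illustration in Python, not the ProductCacheManager implementation; the class name BoundedTTLCache and its methods are hypothetical.

```python
import time
from collections import OrderedDict

class BoundedTTLCache:
    """LRU cache with a per-entry TTL and a hard entry cap (illustrative only)."""

    def __init__(self, ttl_s=3600, max_size=50_000, clock=time.monotonic):
        self.ttl_s = ttl_s
        self.max_size = max_size
        self._clock = clock
        self._entries = OrderedDict()  # key -> (expires_at, value)

    def get(self, key):
        item = self._entries.get(key)
        if item is None:
            return None
        expires_at, value = item
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: drop it so memory can be reclaimed
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        self._entries[key] = (self._clock() + self.ttl_s, value)
        self._entries.move_to_end(key)
        while len(self._entries) > self.max_size:
            self._entries.popitem(last=False)  # evict the least recently used entry

    def __len__(self):
        return len(self._entries)

# Small cap so the eviction is visible: five inserts, only three survive.
cache = BoundedTTLCache(ttl_s=3600, max_size=3)
for sku in ["a", "b", "c", "d", "e"]:
    cache.put(sku, {"sku": sku})
print(len(cache))      # 3
print(cache.get("a"))  # None
```

With an effectively infinite TTL and no working size cap, neither the expiry branch nor the eviction loop would ever free an entry, which matches the unbounded growth described above.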

Contributing Factors

  • No cache-size or eviction-rate monitoring — 18-minute detection gap
  • Gateway retry policy (3x) amplified load on an already-stalled service
  • Eager cache warm-up on startup made pod restarts reproduce the OOM instantly
  • No integration test validating cache config bounds
  • K8s memory request (2Gi) far below limit (4Gi) — delayed OOMKill, extended GC spiral

Impact

  • Full catalog + order API outage: 14:42–14:53 UTC (11 min)
  • ~34,000 failed requests across ~8,400 users
  • Checkout, cart, and product browsing unavailable
  • HPA scaled to 8 catalog replicas — all OOMed (wasted cluster capacity)
  • Payment + user services operational but functionally degraded

Mitigation

  1. Identified unbounded cache via kubectl top + heap dump
  2. Rolled back to v2.13.8 (kubectl rollout undo)
  3. Verified TTL=3600s, maxSize=50,000 in rolled-back config
  4. Monitored GC + heap 30 min post-recovery

Action Items

Action | Owner | Pri | Status
Add Prometheus alert: G1GC Full GC >2/min or pause >1s | platform | P0 | In Progress
Add cache-size alert: >70% of maxSize | platform | P0 | In Progress
Integration test: cache TTL within bounds (<86400s) | catalog | P0 | Open
Replace eager cache warm-up with lazy loading | catalog | P1 | Open
Switch gateway retries to request hedging | platform | P1 | Open
Set K8s memory request = limit to fail fast | platform | P1 | Open
Enable -XX:+HeapDumpOnOutOfMemoryError → S3 | platform | P2 | Open
Require config review for cache parameter changes | eng | P2 | Open
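
The cache-bounds action item could take roughly the following shape. This is a sketch under stated assumptions: validate_cache_config is a hypothetical helper, the 86400s upper bound comes from the action item, and the requirement that maxSize be positive is an added assumption rather than something the report specifies.

```python
MAX_TTL_S = 86_400  # upper bound from the action item (<86400s)
INTEGER_MAX_VALUE = 2**31 - 1  # value of Java's Integer.MAX_VALUE

def validate_cache_config(ttl_s, max_size):
    """Return a list of problems; an empty list means the config is in bounds."""
    problems = []
    if not 0 < ttl_s < MAX_TTL_S:
        problems.append(f"ttl_s={ttl_s} outside (0, {MAX_TTL_S})")
    if max_size <= 0:  # assumption: a non-positive cap means "unbounded"
        problems.append(f"max_size={max_size} must be positive")
    return problems

print(validate_cache_config(3600, 50_000))               # []
print(validate_cache_config(INTEGER_MAX_VALUE, 50_000))  # one TTL problem
```

Run against the v2.14.0 value, a check like this would have failed before the deploy instead of 18 minutes after it.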

Faster answers when things break

Stop spending hours in SSH sessions. Get to root cause in minutes.

Read-only by design

Fawdy never writes, never modifies, never restarts. It observes your system through standard read commands. Nothing changes. Audit-safe by default.

Plain English reports

Root cause, contributing factors, and a recommended fix. Structured and readable, not a wall of raw terminal output.

No agent required

Standard SSH. No daemon to install, no package to manage, no ports to open. Works with your existing access controls and jump hosts.

Legacy friendly

Built for the servers running your most critical workloads. The ones that predate your Kubernetes cluster. If it has SSH and a shell, Fawdy can investigate it.