Incident Summary
On July 27, 2023, between 9:17 AM and 10:49 AM EST, approximately one-third of incoming API requests to our platform failed with a "504 Gateway Timeout" error. The failures primarily affected attempts to sign into the app and the dashboard.
Impact
During this period, a subset of users had difficulty signing into the Jump Desktop app and the web dashboard and received a "504 Gateway Timeout" error message.
Users who were already signed into Jump Desktop applications did not experience these disruptions. Additionally, the incident did not affect existing connections or the creation of new connections from already signed-in users.
Root Cause
A malfunctioning API server remained in the pool of operational servers instead of being removed automatically as expected. The server was partially healthy: it responded correctly to some requests while failing others. Our monitoring system did not detect this partially responsive state, so manual intervention was needed to resolve the issue.
Steps We're Taking
We are investigating why the API server entered a partially responsive state. In parallel, we are improving our server health checks to account for partially responsive servers.
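To illustrate the kind of health check described above, here is a minimal sketch in Python. It is not our production implementation. The class name `PartialHealthCheck`, the probe names, and the `window` and `max_failures` parameters are all hypothetical and chosen only for illustration. The idea is to probe several request types separately and treat the server as unhealthy if any one of them fails repeatedly, rather than relying on a single check that a partially responsive server can still pass.

```python
from collections import deque
from typing import Callable, Dict


class PartialHealthCheck:
    """Hypothetical sketch: track recent results per probe.

    A server counts as healthy only if every probe stays within its
    failure budget over a sliding window of recent checks.
    """

    def __init__(self, probes: Dict[str, Callable[[], bool]],
                 window: int = 5, max_failures: int = 1) -> None:
        self.probes = probes
        self.max_failures = max_failures
        # One bounded history of pass/fail results per probe.
        self.history = {name: deque(maxlen=window) for name in probes}

    def run_once(self) -> None:
        """Run every probe once and record whether it passed."""
        for name, probe in self.probes.items():
            try:
                ok = bool(probe())
            except Exception:
                # A probe that raises is recorded as a failure.
                ok = False
            self.history[name].append(ok)

    def is_healthy(self) -> bool:
        """Return False if any single probe exceeded its failure budget."""
        for results in self.history.values():
            failures = sum(1 for ok in results if not ok)
            if failures > self.max_failures:
                return False
        return True


if __name__ == "__main__":
    # Assumed scenario: a basic ping succeeds while sign-in requests fail.
    check = PartialHealthCheck(
        {"ping": lambda: True, "sign_in": lambda: False},
        window=5,
        max_failures=1,
    )
    for _ in range(3):
        check.run_once()
    print("healthy" if check.is_healthy() else "unhealthy")
```

In this sketch, a check that looked only at the `ping` probe would keep the server in the pool, while the per-probe history flags it once the `sign_in` probe fails more than `max_failures` times within the window.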
We sincerely apologize for the inconvenience caused.