Summary
Between Friday 9th and Saturday 10th June, several Enable Trading programs channels hosted in the UK Live environment experienced general slowness and performance issues. Examples included slow load times, slow program line saves, and long waits for reports to run. Although these processes appeared to be slow, they did not in fact complete during the incident. Some users also found that any actions they undertook failed instantly.
Cause
The root cause of this problem lay within Microsoft’s Azure servers, on which Enable is hosted. Database locking was occurring, which prevented tasks requested in the user interface from being completed and stored in the database. As the problem was with the server itself, Microsoft were required to fix it.
Preventative measures
Going forward, we will improve both the transparency and frequency of our communication during any severe incidents or outages on the platform. This will make it clear that Enable are aware of the issue and are working as a priority to resolve it. Our status page will play a much more prominent role in our communication and handling of incidents.
To improve the speed at which we identify incidents, we will extend our monitoring and tracking. These targeted improvements will plug the monitoring gaps that our teams identified whilst investigating the root cause of this incident. This will allow us to identify the specific cause of the issue sooner, should this happen again.