Admin and checkout services downtime
Incident Report for Subbly
Postmortem

Unfortunately, we experienced some downtime with the admin and checkout services tonight. Websites were not affected.

Here’s what happened:

We jumped into action right away to investigate, which meant our status updates lagged behind.

After about 20 minutes, we isolated the issue to one of our caching server clusters running out of memory.

One of our report-generating scripts was running multiple instances simultaneously and, because it cached far more than it needed to, it used up all the spare memory. That eventually took the cache layer offline, and the rest of the application with it. Ultimately, it came down to poor planning.
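For readers who run similar scheduled jobs, here is a minimal sketch of one common way to stop a script from running several instances at once: an atomically created lock file. It is an illustration only, not the patch we deployed. The names `LOCK_PATH`, `single_instance`, `generate_report`, and `run_report_job` are hypothetical.

```python
import os
import tempfile
from contextlib import contextmanager

# Hypothetical lock location, chosen for illustration only.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "report_job.lock")


class AlreadyRunning(RuntimeError):
    """Raised when another instance already holds the lock."""


@contextmanager
def single_instance(lock_path=LOCK_PATH):
    # O_CREAT | O_EXCL makes creation atomic: the call fails if the file exists.
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise AlreadyRunning(f"lock held: {lock_path}") from None
    try:
        # Record the owner's PID to help with debugging a stale lock.
        os.write(fd, str(os.getpid()).encode())
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)


def generate_report():
    # Placeholder for the real report work.
    return "report done"


def run_report_job(lock_path=LOCK_PATH):
    # Skip this run instead of piling a second instance on top of the first.
    try:
        with single_instance(lock_path):
            return generate_report()
    except AlreadyRunning:
        return "skipped: another instance is running"
```

One caveat with this approach: if the process is killed before it can remove the lock file, the stale lock blocks later runs until someone clears it, so a production version would also need a timeout or a check that the recorded PID is still alive.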

To avoid risking data loss, we first had to free up some cache records manually, which then allowed us to clear the unnecessary data.

The problem has been resolved, the offending script has been permanently patched, and we have learned a good lesson along with it.

We sincerely apologise for the awful timing of this incident. We know your trust in us is essential and that you depend on us. We will continue to fight for 100% uptime and reliability in order to earn your trust.

If you have any follow-up questions or concerns, please don’t hesitate to reach out to us.

Sincerely,

Stefan Pretty, CEO of Subbly

Posted Nov 27, 2021 - 01:36 UTC

Resolved
Service is back online and fully operational. The issue has been permanently patched. We sincerely apologise for the inconvenience.
Posted Nov 27, 2021 - 01:35 UTC
Identified
We have identified the issue and are working on deploying the fix.
Posted Nov 27, 2021 - 01:04 UTC
Investigating
Hey everyone! We're experiencing downtime with the Admin & Checkout services and are working as fast as we can to restore normal service.

Will keep you posted.
Posted Nov 27, 2021 - 00:43 UTC
This incident affected: Core Subbly Services (Admin, Checkout, Billing Engine).