Past Incidents

Saturday 30th December 2023

No incidents reported

Friday 29th December 2023

Cellar [NORTH] Partial Cellar requests timeout

Between 16:58 UTC and 17:03 UTC, some requests to the Cellar service in the North region timed out. The faulty component has been decommissioned, and we will investigate further to understand the source of the timeouts. The service is currently up and running.

EDIT 2023-12-30 00:51 UTC: The problem has been identified and resolved. The component is back in the pool and is working as expected. This incident is now over.

Thursday 28th December 2023

Access Logs [Metrics] Elevated queries error rate

We are seeing an elevated error rate for metrics queries due to the underlying storage system. The problem has been identified and we are working toward its resolution. This can impact some Grafana dashboards or API queries.

EDIT 09:44 UTC: The issue is not fully resolved yet but we are seeing improvements. We continue working on the issue.

EDIT 11:04 UTC: Queries have been working since 10:15 UTC. We continue monitoring to ensure everything is working as intended.

EDIT 15:43 UTC: Everything is back to normal. This incident is now over.

Wednesday 27th December 2023

No incidents reported

Tuesday 26th December 2023

No incidents reported

Monday 25th December 2023

Infrastructure [RBX] Unreachable hypervisor

A hypervisor is unreachable. We are investigating.

EDIT 03:17 UTC: No database is affected on this hypervisor, and applications have been redeployed.

EDIT 03:30 UTC: The hypervisor has been rebooted and everything is back to normal.

Sunday 24th December 2023

No incidents reported