Past Incidents

Sunday 28th January 2024

No incidents reported

Saturday 27th January 2024

No incidents reported

Friday 26th January 2024

[Heptapod Cloud] Security update

Our Heptapod Cloud service will be updated today at 15:00 UTC+1 to apply the latest GitLab security patches described in https://about.gitlab.com/releases/2024/01/25/critical-security-release-gitlab-16-8-1-released/. Expected downtime is less than 1 minute.

EDIT 15:34 UTC+1: Patches were applied and services were restarted. The maintenance is now over.

Thursday 25th January 2024

Metrics [Metrics] query latency

We have enabled a new parameter designed to improve the reliability of the cluster. Some queries may fail as a result. We are monitoring the situation.

Wednesday 24th January 2024

No incidents reported

Tuesday 23rd January 2024

Metrics [Metrics] Request timeouts

We are currently observing request timeouts on the Metrics cluster. The issue has been identified and we are working on a resolution. No data loss is expected. Various graphs (Grafana, console, etc.) might fail to load or might render with errors.

Edit Tue Jan 23 17:59:56 2024 UTC: A faulty configuration was applied to a node to investigate a memory leak. The configuration backfired on the whole cluster, making it unhealthy. The configuration has been rolled back. The storage layer is currently in healing mode. To speed up the recovery, queries have been disabled.

Edit Tue Jan 23 19:51:21 2024 UTC: The cluster is now healthy and catching up on lag, which should take a few hours. Queries will be re-enabled once the lag has been absorbed.

Edit Wed Jan 24 00:04:59 2024 UTC: Data lag is now OK. We are still reloading the metrics metadata, so queries are still unavailable. They should be back in a few hours.

Edit Wed Jan 24 01:54:22 2024 UTC: Metadata lag is now OK and queries are back online.

[Accesslog] Not available

We are encountering problems with the delivery of access logs. We are investigating.

Edit Thu Jan 25 11:00:00 2024 UTC: The platform is now OK. We are ingesting the accumulated lag.

Edit Thu Jan 25 16:54:00 2024 UTC: Lag has been ingested. Access logs may still be unreachable for some applications.

Reverse Proxies [Scaleway] Load balancer instability

We are detecting a higher number of errors than usual on the load balancers serving the Scaleway zone. We are investigating.

Monday 22nd January 2024

No incidents reported