Some systems are experiencing issues

Past Incidents

Wednesday 17th April 2024

Metrics Metrics: Lag in query results

Metrics query results are lagging slightly. We have identified the underlying issue and deployed a preliminary fix, and we are monitoring the result. Grafana dashboards and results obtained from the Metrics API might be missing some recent values until this is resolved.

EDIT 2024-04-18 10:53 UTC+2: The issue is still present. We have been forced to sample incoming data until we figure out the underlying issue.

EDIT 2024-04-18 12:07 UTC+2: Our storage layer has been stabilized, but we are still sampling incoming data. Queries should be working properly.

EDIT 2024-04-18 21:13 UTC+2: The situation has improved and sampling of incoming data has been disabled. We continue to monitor the system, but queries should now return the correct data without lag.

EDIT 2024-04-18 23:37 UTC+2: This incident is now over.

Infrastructure PAR: Hypervisor unreachable

A hypervisor in the Paris region was unreachable and rebooted. We are looking into it and making sure it restarts all of its services.

EDIT 15:40 UTC+2: All services have been up again since ~15:30 UTC+2. We continue to monitor the situation. If you still have issues, please contact our support.

Tuesday 16th April 2024

No incidents reported

Monday 15th April 2024

No incidents reported

Sunday 14th April 2024

No incidents reported

Saturday 13th April 2024

Cellar Cellar on Paris is experiencing trouble

Ceph (the software we are running Cellar on) is rebalancing some shards due to a change in its storage capacity. Some requests might fail while the rebalancing is in progress.

EDIT: After a few alerts, everything has been running smoothly.

Friday 12th April 2024

No incidents reported

Thursday 11th April 2024

Metrics [GLOBAL] Metrics query unavailable

Metrics queries are currently unavailable because some indexing shards are offline. We are working to bring them back up as quickly as possible. There is no impact on the ingestion pipeline or the storage layer.

EDIT 11:00 UTC: Queries are available again.