Our monitoring has detected failure on the storage layer of metrics and access logs.
We have found that a storage node has lost several disk.
We have remove faulty disks and restarted the storage node.
EDIT 16:00 UTC : The storage layer is restarted and we are consuming the ingestion lag