add incident 028: periodically restart prometheus on eae-adp-jump01

master
Gregor Michels 2022-12-23 01:28:30 +01:00
parent 3e2fc42c19
commit 9506e94dad
1 changed files with 17 additions and 0 deletions

@@ -1211,3 +1211,20 @@ eae-adp-jump01# rcctl start prometheus
eae-adp-jump01# syspatch
eae-adp-jump01# reboot
```
028 2022.11.29 02:00 | periodically restart prometheus
------------------------------------------------------
**problem**:
`prometheus` crashed regularly on `eae-adp-jump01`.
It seems that `OpenBSD` is missing some file-handle functionality, which causes `prometheus` to crash.
This [GitHub issue](https://github.com/prometheus/prometheus/issues/8799) (filed for an older `OpenBSD` release) describes the same problem.
**solution**:
Until I have time to set up a new Linux machine somewhere to take over the monitoring, restart `prometheus` every two hours:
```
eae-adp-jump01# crontab -e
[...]
0 */2 * * * rcctl restart prometheus
```
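
A possible refinement, not what was deployed here: instead of restarting unconditionally, a small watchdog could restart `prometheus` only when `rcctl check` reports it as down. The sketch below assumes that `rcctl check` actually fails after one of these crashes, which has not been verified on `eae-adp-jump01`. The function name `restart_if_down` and the `RCCTL` override variable are made up for illustration.

```shell
#!/bin/sh
# Hypothetical watchdog sketch: restart a service only if its rc.d check fails.
# RCCTL may be overridden for dry runs; it defaults to the real rcctl(8).
RCCTL="${RCCTL:-rcctl}"

restart_if_down() {
    svc="$1"
    # "rcctl check" exits 0 when the daemon is running, non-zero otherwise.
    if $RCCTL check "$svc" >/dev/null 2>&1; then
        echo "$svc is running"
    else
        echo "$svc is down, restarting"
        $RCCTL restart "$svc"
    fi
}
```

Saved as a script that ends with `restart_if_down prometheus`, this could be run from root's crontab at a shorter interval than the two-hour restart. The unconditional restart above remains the simpler option and also covers a hung process that still passes the check.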