Compare commits

2 Commits

Author SHA1 Message Date
Gregor Michels 9506e94dad add incident 028: peridically restart prometheus on eae-adp-jump01 2022-12-23 01:28:30 +01:00
Gregor Michels 3e2fc42c19 incident 027: remembered that I also sysupgraded eae-adp-jump01
Fixes: 34e4fbf000
2022-12-23 01:27:00 +01:00
1 changed file with 19 additions and 1 deletion

@@ -1184,7 +1184,8 @@ After installing a prometheus stack onto `eae-adp-jump01` (`8389a18`) the `/var/
Limiting the size of the TSDB did not resolve this issue (maybe I've misconfigured the limit).
**solution**:
attach 20GB block device onto vm and mount it as `/var/prometheus`:
* `sysupgrade` to `OpenBSD 7.2`
* attach 20GB block device onto vm and mount it as `/var/prometheus`:
```
eae-adp-jump01# rcctl stop prometheus
eae-adp-jump01# rm -r /var/prometheus/*
@@ -1210,3 +1211,20 @@ eae-adp-jump01# rcctl start prometheus
eae-adp-jump01# syspatch
eae-adp-jump01# reboot
```
028 2022.11.29 02:00 | periodically restart prometheus
------------------------------------------------------
**problem**:
`prometheus` crashed regularly on `eae-adp-jump01`.
It seems like `OpenBSD` is missing some functionality on file handles, which causes `prometheus` to crash.
Here is a [GitHub issue](https://github.com/prometheus/prometheus/issues/8799) (for an older `OpenBSD` release) that describes the same problem.
**solution**:
Until I have time to set up a new Linux machine somewhere to take over the monitoring, restart `prometheus` regularly:
```
eae-adp-jump01# crontab -e
[...]
0 */2 * * * rcctl restart prometheus
```
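An alternative to the unconditional restart above would be to restart only when the service check fails. The following is a minimal sketch, not what is deployed on `eae-adp-jump01`: the function name `restart_if_down` and the `RCCTL` override variable are hypothetical, and it assumes the crash actually terminates the process so that `rcctl check` reports a failure.

```shell
#!/bin/sh
# Hypothetical watchdog sketch: restart a service only if its check fails.
# RCCTL can be overridden for dry runs; this variable is an assumption made
# for illustration and is not part of the original setup.
RCCTL="${RCCTL:-rcctl}"

restart_if_down() {
    svc="$1"
    # `rcctl check` exits 0 when the service is running
    if "$RCCTL" check "$svc" >/dev/null 2>&1; then
        echo "$svc is running"
    else
        echo "$svc is down, restarting"
        "$RCCTL" restart "$svc" >/dev/null 2>&1
    fi
}
```

Saved as a script that ends with `restart_if_down prometheus`, it could be called from cron at a shorter interval than the two-hour restart. If `prometheus` hangs instead of exiting, the check would still succeed, so the periodic restart remains the safer workaround.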