add incident 029: ans create a service for the offloader vm
parent fb901524ca
commit ec0cfc908a
@@ -1235,3 +1235,31 @@ eae-adp-jump01# crontab -e
[...]
0 */2 * * * rcctl restart prometheus
```


029 2022.11.29 03:00 (ANS) | (maintenance) automagically start offloader
------------------------------------------------------------------------

---

_This log entry was added long after the actual work was done.
Please take it with a grain of salt._

---

**problem**:
ANS washes its traffic via an FFLPZ/FFDD offloader vm.
Until now there was only a script that had to be run manually to start the offloader vm.
After a reboot, the offloader vm would not start automagically.

**solution**:
Implement a service that starts the offloader vm on boot.

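For illustration only: a minimal sketch of what such a service could look like, assuming `ffl-ans-gw-core01` runs OpenWrt with procd. This is not the deployed script. The file name `/etc/init.d/offloader` and the path in `OFFLOADER_START` are hypothetical placeholders for the existing manual start script.

```sh
#!/bin/sh /etc/rc.common
# Hypothetical /etc/init.d/offloader -- a sketch, NOT the script deployed on ffl-ans-gw-core01.

START=99        # start late in the boot sequence
USE_PROCD=1     # let procd supervise the instance

# ASSUMPTION: location of the existing manual start script (placeholder path).
OFFLOADER_START="/root/start-offloader.sh"

start_service() {
	procd_open_instance
	procd_set_param command "$OFFLOADER_START"
	procd_set_param stdout 1   # forward stdout to the system log
	procd_set_param stderr 1   # forward stderr to the system log
	procd_close_instance
}
```

With a file like this in place, `/etc/init.d/offloader enable` would register it for boot and `/etc/init.d/offloader start` would run it once by hand. No respawn is configured here, on the assumption that the start script launches the vm and then exits.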
**impact**:
After validating the script on another OpenWrt machine, I tested it in production.
This caused the following downtimes:
* `offloader` down from 02:50 to 03:05 -- service interruption for the public wifi
* `ffl-ans-gw-core01` down from 02:53 to 02:55 -- service interruption for everybody

**disclaimer**:
The script is deployed manually on `ffl-ans-gw-core01` and is therefore not part of this repo at the moment.
