From 1171e76cd75c8030bde4425e838cd583f23b6973 Mon Sep 17 00:00:00 2001
From: Gregor Michels
Date: Wed, 8 Mar 2023 00:29:05 +0100
Subject: [PATCH] incidents: add 035 - 040

was sitting on them a long time...
---
 documentation/INCIDENTS.md | 82 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 82 insertions(+)

diff --git a/documentation/INCIDENTS.md b/documentation/INCIDENTS.md
index 09b33f2..5dd2552 100644
--- a/documentation/INCIDENTS.md
+++ b/documentation/INCIDENTS.md
@@ -1392,3 +1392,85 @@ someone accidentially unplugged the power for the network core in the facility m
 * `ap-c495` additional wifi restart at 05:10
 * `ap-ac7c` additional wifi restart at 05:10
 * `ap-0b99` additional wifi restart at 05:10
+
+
+035 2023.01.16 04:30 - 12:15 (ANS) | uplink broken
+--------------------------------------------------
+
+facility management rebooted the gigacube
+
+
+036 2023.01.24 02:15 (RGS) | (maintenance) increase tx power of aps
+-------------------------------------------------------------------
+
+```
+RUNNING HANDLER [reload wireless] *********************************************************************************************************************
+Tuesday 24 January 2023 02:16:45 +0100 (0:00:31.967) 0:02:06.789 *******
+```
+
+see `191b7f2` for details
+
+
+037 2023.01.29 23:10 - 2023.01.30 17:00 (ADP) | unstable ethernet link to tent-3
+--------------------------------------------------------------------------------
+
+**impact**:
+very unstable uplink for ap in `tent-3`
+
+**hotfix**:
+shut down ap via poe to move clients onto another access point (there really is no other ap in this tent though :()
+
+**problem**:
+someone butchered the ethernet cables (from the network core) by squeezing and bending them through cable guides.
+
+**fix**:
+Tried "unbending" them and the link came back!
+
+
+038 2023.02.01 04:00 (ADP) | move to different mullvad account
+--------------------------------------------------------------
+
+old pubkey: `Sqz0LEJVmgNlq6ZgmR9YqUu3EcJzFw0bJNixGUV9Nl8=`
+
+```
+RUNNING HANDLER [reload network] **********************************************************************************************************************
+Wednesday 01 February 2023 03:59:45 +0100 (0:00:03.789) 0:01:22.895 ****
+changed: [gw-core01]
+```
+
+see commit `68ee430` for details
+
+
+039 2023.02.07 (ADP) | unstable ethernet link in tent-3 (again)
+---------------------------------------------------------------
+
+**introduction**:
+the uplink for `tent-3` went flaky again
+
+**problem**:
+the cables took irreparable damage from mishandling (see `incident 037` for details)
+
+**fix**:
+* install new access switch into `tent-2` (`sw-access04`: `220bb14`)
+* migrate uplink for `tent-3` from the `core` onto `sw-access04`
+
+
+040 2023.02.28 08:00 (ADP) | dns issues
+---------------------------------------
+
+**introduction**:
+Someone on site called and notified me that "the internet is not working".
+
+**problem**:
+`gw-core01` stopped serving dns queries:
+```
+root@gw-core01:~# logread | grep max
+Tue Feb 28 08:44:16 2023 daemon.warn dnsmasq[1]: Maximum number of concurrent DNS queries reached (max: 150)
+```
+
+**fix**:
+* increased `maxdnsqueries`
+* increased `dnscache`
+* changed upstream dns to `9.9.9.9` (quad9) and `1.1.1.1` (cloudflare)
+
+see `a236643` for details