incidents: close "wifi issues in tent 5"

Gregor Michels 2022-07-02 23:58:41 +02:00
parent 356edc6318
commit 2379820ab9
1 changed files with 52 additions and 128 deletions

@@ -19,13 +19,61 @@ Tape back the protective cap of the power strip and reinsert the power supply
No internet access for 2 hours
2022.06.30 12:00 - ??:?? | wifi issues in Tent 5
------------------------------------------------
2022.06.30 12:00 - 2022.07.01 19:00 | wifi issues in tent 5
-----------------------------------------------------------
**issue**:
### issue
A resident reported slow internet speeds. He resides in tent 5. I do not have more information.
But it seems that the ap in tent 5 `ap-ac7c` is very slow and hangs/freezes a lot via ssh.
While trying to check logs for the ap I noticed that `ap-ac7c` is very slow and hangs/freezes a lot via ssh.
Rebooting did not solve the problem.
### cause
Unknown
I've checked the ap the next day in person. I tested the ap with a different lan cable on a different switch port.
The issues I noticed the night before were not reproducible.
But I did notice that the short patchcable (connecting the ap to the switch) had some light rust on it.
### solution
_01.07.2022 ~ 03:00_ (shortterm):
After noticing the issue myself I tried rebooting the ap.
Unfortunately that did not solve the problem.
To spare the clients from connecting to a bonkers ap I disabled poe for the switch port to take the ap offline:
```
root@sw-access02:~# uci show poe | grep lan2
poe.@port[1].name='lan2'
root@sw-access02:~# uci set poe.@port[1].enable=0
root@sw-access02:~# uci commit poe
root@sw-access02:~# /etc/init.d/poe restart
```
_01.07.2022 ~ 19:00_ (longterm):
I could not reproduce the issue in person. To be on the safe side I replaced the short patchcable (connecting the ap to the switch) and ap:
`ap-ac7c -> ap-1a38`.
Afterwards I re-enabled poe on the corresponding switch port.
### impact
* `2022.06.30 12:00 - 2022.07.01 03:30`: (probably) unreliable wifi for clients connected to `ap-ac7c`
* `2022.07.01 03:30 - 2022.07.01 18:30`: bad signal strength to clients in and around tent 5
### notes
While disabling poe on the port connecting `ap-ac7c` I restarted the `poe` service.
That resulted in all ports shortly dropping power.
Therefore I also accidentally rebooted `ap-2bbf`.
Next time I'll just reload the service (shame on me).
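
A minimal sketch of how the port could be taken offline next time without restarting the service. It assumes the `poe` init script supports `reload` and that a reload applies the changed config without power-cycling the other ports (not verified here). The helper names `poe_section_for` and `poe_disable_port` are hypothetical:

```shell
# Hypothetical helper: read `uci show poe` output on stdin and print the
# uci section (e.g. "poe.@port[1]") whose name matches the given port.
poe_section_for() {
    sed -n "s/^\(poe\.@port\[[0-9]*]\)\.name='$1'\$/\1/p"
}

# Hypothetical helper: disable poe on one port and reload (not restart)
# the service. Assumption: reload leaves the other ports powered.
poe_disable_port() {
    section=$(uci show poe | poe_section_for "$1")
    [ -n "$section" ] || { echo "no poe port named $1" >&2; return 1; }
    uci set "$section.enable=0"
    uci commit poe
    /etc/init.d/poe reload
}
```

On the switch this would be called as `poe_disable_port lan2`. Looking the section up by name avoids hardcoding the `@port[1]` index.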
### logs
This was my test to show that ssh was slow and froze a lot on `ap-ac7c`.
good ap:
```
@@ -59,127 +107,3 @@ user 0m0.081s
sys 0m0.015s
user@freifunk-admin:~$
```
A reboot (2022.07.01 03:09:10 - 03:10:29) did not solve the sluggish ssh connections.
**solution**:
shortterm: poweroff ap so clients can hopefully roam to a non-broken ap (done)
```
root@sw-access02:~# uci show poe | grep lan2
poe.@port[1].name='lan2'
root@sw-access02:~# uci set poe.@port[1].enable=0
root@sw-access02:~# uci commit poe
root@sw-access02:~# /etc/init.d/poe restart
root@sw-access02:~# ubus call poe info
{
"firmware": "v22.4",
"mcu": "ST Micro ST32F100 Microcontroller",
"budget": 77.000000,
"consumption": 0.000000,
"ports": {
"lan1": {
"priority": 0,
"mode": "PoE+",
"status": "Searching"
},
"lan3": {
"priority": 0,
"mode": "PoE+",
"status": "Searching"
},
"lan4": {
"priority": 0,
"mode": "PoE+",
"status": "Searching"
},
"lan5": {
"priority": 0,
"mode": "PoE+",
"status": "Searching"
},
"lan6": {
"priority": 0,
"mode": "PoE+",
"status": "Searching"
},
"lan7": {
"priority": 0,
"mode": "PoE+",
"status": "Disabled"
},
"lan8": {
"priority": 0,
"mode": "PoE+",
"status": "Disabled"
}
}
}
root@sw-access02:~#
```
longterm: replace the broken ap
**impact**:
**2022.06.30 12:00 - 2022.07.01 03:30**:
(very) slow wifi for clients connected to `ap-ac7c`
**2022.07.01 03:30 - ??**:
bad signal strength for clients in tent 5
**notes**:
While restarting `poe` on `sw-access02` all poe ports dropped power for a few seconds....
Therefore `ap-2bbf` also rebooted (whopsi):
```
user@freifunk-admin:~$ ssh sw-access02 ubus call poe info
{
"firmware": "v22.4",
"mcu": "ST Micro ST32F100 Microcontroller",
"budget": 77.000000,
"consumption": 4.600000,
"ports": {
"lan1": {
"priority": 0,
"mode": "PoE+",
"status": "Searching"
},
"lan3": {
"priority": 0,
"mode": "PoE+",
"status": "Delivering power",
"consumption": 4.600000
},
"lan4": {
"priority": 0,
"mode": "PoE+",
"status": "Searching"
},
"lan5": {
"priority": 0,
"mode": "PoE+",
"status": "Searching"
},
"lan6": {
"priority": 0,
"mode": "PoE+",
"status": "Searching"
},
"lan7": {
"priority": 0,
"mode": "PoE+",
"status": "Searching"
},
"lan8": {
"priority": 0,
"mode": "PoE+",
"status": "Searching"
}
}
}
user@freifunk-admin:~$ ssh ap-2bbf uptime
01:34:18 up 6 min, load average: 0.02, 0.07, 0.04
user@freifunk-admin:~$
```
Next time I'll only `reload` the `poe` service