Commit Graph

129 Commits

Author SHA1 Message Date
Tim Wilkinson 056b60bb4d
Use wifi assoc list when looking for unresponsive nodes. (#881)
The arp cache keeps wifi entries long past them being associated with
the node, so now use wifi assoc list to find nodes, and the arp cache
to get their IPs.
2023-06-24 23:37:48 -07:00
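As a rough illustration of the approach this commit message describes, the Python sketch below takes station MACs from an association list and consults an ARP table only to look up their IPs. The function name `associated_ips` and the input shapes are assumptions made for this example; the actual change is in the node's own services and is not reproduced here.

```python
def associated_ips(assoc_macs, arp_table):
    """Return {mac: ip} for stations that are currently associated.

    assoc_macs: iterable of MAC strings from the wifi association list.
    arp_table:  iterable of (ip, mac) pairs from the ARP cache.
    Both shapes are hypothetical, chosen for illustration only.
    """
    # Normalize case so MACs from the two sources compare equal.
    mac_to_ip = {mac.lower(): ip for ip, mac in arp_table}
    result = {}
    for mac in assoc_macs:
        ip = mac_to_ip.get(mac.lower())
        # Stale ARP entries are ignored because only associated MACs are looked up.
        if ip is not None:
            result[mac.lower()] = ip
    return result
```

An ARP entry whose MAC is no longer in the association list never appears in the result, which is the point of driving the lookup from the association list.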
Tim Wilkinson 270d7fd5f1
Typo (#876) 2023-06-21 13:57:21 -07:00
Tim Wilkinson 7650b627e8
Minor wifi monitor improvements for better metrics reporting (#875) 2023-06-20 20:51:18 -07:00
Tim Wilkinson 8c4d9edd99
Merge all the station monitoring and mitigation into a single service. (#874)
This is an attempt to unify all the station monitoring and make it work
better as one. We're trying to square a circle here somewhat, taking
steps to kick nodes when problems are detected, but not kicking them too quickly
or often in case we're misidentifying issues.
We've seen these issues manifest themselves with nodes messing up VoIP services
as well as resets causing nodes to get into unrecoverable states when there
were no real problems in the first place.
This will probably need to evolve before the next release, but it would be good
to get some mileage on the new code.
2023-06-20 01:27:23 -07:00
Tim Wilkinson 52c7286a4c
Remove another coverage test which causes problems. (#871)
Coverage is handled by modifying firmware state, and the driver stores
the value the first time it is set. When we reset, this state might be lost,
so it will be reloaded from the firmware. We set the coverage back to 0
so the reloaded value will be the default again.
We also remove a check which can fail incorrectly.
2023-06-12 23:29:38 -07:00
Tim Wilkinson 8f6e943237
Avoid fatal error if mac disappears across a radio reset (#868) 2023-06-08 21:27:37 -07:00
Tim Wilkinson b64aa0c988
Monitor bug fixes (#867) 2023-06-07 22:42:42 -07:00
Tim Wilkinson a61dfcdafe
Alternate ath9k and ath10k radio reset for deaf nodes (#857)
* A scan, especially if we have to do both active and passive, essentially mutes
the radio to AREDN traffic for 10-20 seconds, which isn't good. If the radio is completely
deaf then it doesn't matter, but particularly on the 9K radios we do this when
things are looking a bit dodgy, though not deaf. 
* Provide hook to reset ath9k from userspace.  This hook is attributed to:
Linus Lüssing <ll@simonwunderlich.de>
* Use /sys reset hooks rather than iw scan
2023-06-01 17:06:39 -07:00
Tim Wilkinson 4e621baf0b
Support switching mesh radio on multi-radio devices (#847) 2023-05-25 21:27:59 -07:00
Tim Wilkinson e5a0b43480 Fix occasional nil error 2023-04-11 00:38:35 -07:00
Tim Wilkinson 922949abc0
Eliminate false network rejoins using LQM information (#781)
* Use LQM information to filter out neighbors we don't care about.
These can cause false rejoin events and degrade the network.
* Only use active station monitor with LQM info.
2023-04-10 10:21:30 -07:00
Tim Wilkinson 211006b47c
Resolve unresponsive node problems with Mikrotik AC devices. (#776)
* Resolve unresponsive node problems with Mikrotik AC devices.
Mikrotik AC devices get into a state where they won't communicate with
non-AC devices .. sometimes. Leaving and rejoining the network resets
everything. We monitor for this situation and rejoin the network when detected
to resolve the issue.
* Make reporting less chatty
2023-04-02 01:29:46 -07:00
Tim Wilkinson 59ed665f3d
General station monitor logging service. (#767)
* General station monitor service.
It turns out this station bug is not limited to the ath10k driver, so
make this monitor service wifi generic.
(I've now seen this at both ends of the Mikrotik AC <-> Rocket pair)
* New logs
* Just monitor for now
2023-03-30 11:36:31 -07:00
Tim Wilkinson 933e411a10 Force badly associated stations to reassociate.
There appears to be a bug in the ath10k firmware for Mikrotik devices (maybe others)
where a station will associate but only broadcast traffic will be passed - unicast traffic
will fail. This code detects this situation and forces the device to reassociate which
fixes the problem.
2023-03-28 18:41:28 -07:00
Tim Wilkinson 05d247d15f
Fix rule checking for existing drop rules. (#719) 2023-02-17 21:07:39 -08:00
Tim Wilkinson 32e02de328
Fix fccid beacon (#717) 2023-02-17 21:06:04 -08:00
Tim Wilkinson 61fa802f80
Fix monitors not detecting non-mesh mode (#716) 2023-02-17 21:02:21 -08:00
Tim Wilkinson 9ae6e13ee0 Force dnsmasq to update itself if no network changes for > 60secs
On small networks there are not a lot of OLSR name changes. While
dnsmasq watches for changes and updates itself, it will sometimes miss
them. On busy networks this doesn't matter as the next change will catch
it up. But on smaller networks (esp. test networks) a missed change can
stop name resolution working for some time. So now, if no changes are
detected for > 60 seconds, we force dnsmasq to reload its tables.
2023-02-15 20:21:17 -08:00
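The idle-reload behaviour described in this commit message can be sketched as a small watchdog. This is a minimal Python sketch under stated assumptions: the class name `ReloadWatchdog`, the `reload_fn` callback, and the injectable `clock` are all hypothetical, and how the real service detects changes or tells dnsmasq to reload is not shown in the log.

```python
import time


class ReloadWatchdog:
    """Call reload_fn when no change has been seen for longer than idle_limit seconds."""

    def __init__(self, reload_fn, idle_limit=60.0, clock=time.monotonic):
        self.reload_fn = reload_fn      # hypothetical hook that forces a reload
        self.idle_limit = idle_limit    # the "> 60 seconds" from the commit message
        self.clock = clock              # injectable so the logic can be tested
        self.last_event = clock()

    def note_change(self):
        # A name change was observed, so restart the idle timer.
        self.last_event = self.clock()

    def tick(self):
        # Called periodically; forces a reload if we have been idle too long.
        now = self.clock()
        if now - self.last_event > self.idle_limit:
            self.reload_fn()
            self.last_event = now       # avoid reloading again on every tick
            return True
        return False
```

On a busy network `note_change` keeps resetting the timer and `tick` never fires; on a quiet one the reload happens roughly once per idle period.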
Tim Wilkinson dd00c7b1c3 Fix the bandwidth reporting for ath10k devices 2023-02-15 20:14:53 -08:00
Tim Wilkinson 214a93367a
Fix AC coverage calculation in driver. (#710)
For some reason, there was code in the driver to block the setting of
the coverage when a previous setting wasn't a particular value.
It's unclear what this was trying to achieve or prevent, but it stopped AC
devices operating efficiently (by a factor of 10x or more).
2023-02-12 15:50:49 -06:00
Tim Wilkinson 6834271946
Reworked ARP cache (#707) 2023-02-11 13:45:04 -06:00
Tim Wilkinson 571dbf6251
Disable RTS by default in ath10k devices 02/11/2023 (#706) 2023-02-11 13:44:10 -06:00
Tim Wilkinson c70a23f7a8
Improve LQM distance management 02/11/2023 (#705) 2023-02-11 13:43:36 -06:00
Tim Wilkinson 062ffb3521
Normalize the case of the macs and node names (#700) 2023-02-11 13:42:03 -06:00
Tim Wilkinson 863d098554 Filter even earlier 2023-02-03 09:39:28 -10:00
Tim Wilkinson eefcc888dc Filter out non-routable ARP entries which confuse LQM 2023-02-03 09:39:28 -10:00
Tim Wilkinson 33684d22d2
Gather statistics about RF links (#684) 2023-01-29 21:21:58 -06:00
Tim Wilkinson 8817b70b52 Remove firewall counters except for specific ports 2023-01-24 23:16:42 -08:00
Tim Wilkinson 701b2afa3c Refresh LQM's hostname periodically 2023-01-23 11:30:36 -08:00
Tim Wilkinson aa76c06b6a
Ignore non-routable when calculating hidden nodes (#665)
* Exclude neighbor's neighbors which are non-routable.
If a neighbor node's neighbor is non-routable, then no traffic will
flow from it, so it's not hidden

* Use routable flag for exposed node detection
2023-01-20 21:39:54 -06:00
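The filtering rule in this commit message can be illustrated with a short Python sketch. It assumes the usual definition of a hidden node (a neighbor's neighbor that this node cannot hear directly); the function name `hidden_nodes` and the `neighbor_links` structure are hypothetical and are not taken from the project's code.

```python
def hidden_nodes(me, my_neighbors, neighbor_links):
    """Return the sorted names of nodes that are hidden from `me`.

    my_neighbors:   names of nodes we hear directly.
    neighbor_links: {neighbor: [(name, routable), ...]} as reported by each
                    neighbor (hypothetical shape for this example).
    """
    heard = set(my_neighbors) | {me}
    hidden = set()
    for neighbor in my_neighbors:
        for name, routable in neighbor_links.get(neighbor, []):
            # Non-routable links carry no traffic, so they cannot collide
            # with ours and are not counted as hidden.
            if routable and name not in heard:
                hidden.add(name)
    return sorted(hidden)
```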
Tim Wilkinson 78b8578a06
Automatically enable RTS when hidden nodes detected (#659) 2023-01-19 13:11:30 -06:00
Tim Wilkinson fdeda7d0cc
New network configuration code (#650)
* Migrate wired network building into node-setup
* Rework network setup
* Fixes for various alt wireless modes
* Commit to new vlan model
2023-01-14 23:07:22 -08:00
Tim Wilkinson 21c3b80e59 A node with a single RF link can't have exposed nodes 2023-01-14 22:54:00 -08:00
Tim Wilkinson b26017c2d1 Rework DTD blocking detection 2023-01-14 21:22:07 -08:00
Tim Wilkinson e0498ca856 Handle missing ip and more general RF/DTD identification 2023-01-13 14:08:36 -08:00
Tim Wilkinson 252b1dc8b7
Exposed node detection (#644) 2023-01-12 19:58:27 -08:00
Tim Wilkinson d2ac62e775 Bug fixes + distance information 2023-01-12 14:54:35 -08:00
Tim Wilkinson 72cc6c8a06
Hidden node detection (and optional mitigation) (#635)
* Enable RTS/CTS when we detect hidden nodes
* Only change rts setting when we need to
* RTS advanced config option
* Include neighbors' blocked neighbors (they still transmit)
* Bump default RTS threshold
* Report list of hidden nodes rather than yes/no
* Canonical hostnames
* When we enable RTS, enable it for all traffic by default
* Show hidden neighbors in display
* Default RTS threshold to -1 (always off)
2023-01-12 10:31:28 -08:00
Tim Wilkinson fb6060cf3a Fix idle tunnel quality check
When a tunnel is idle, binding to the tun* device fails; so remove it.
As we have a direct tunnel route in the routing table (not OLSR table 30)
created by vtun, we will still correctly route the quality testing traffic.
2023-01-07 20:32:42 -08:00
Tim Wilkinson b082f56fee Remove LQM first run code
This was used during the transition to using LQM and is no longer needed.
2023-01-07 07:43:23 -08:00
Tim Wilkinson fce9629249 Switch from wget to curl for better control of timeouts 2023-01-04 22:43:14 -08:00
Tim Wilkinson bea7fb6723
Fix tunnel quality measurement (#617) 2023-01-04 15:16:13 -06:00
Tim Wilkinson bc77ff8b5b
Enable ac neg channels (#615) 2023-01-03 21:25:22 -08:00
Tim Wilkinson 0992c62755
Terminate monitors when nothing to monitor (#577) 2022-12-22 23:35:01 -06:00
Tim Wilkinson 6950479bf1
Update AREDN to OpenWRT 22.03.2 (Major Upgrade) (#574)
* Update to Openwrt 21.02 and add support for the CPE710 v1
Update scripts to change references to ifname to device due to a change in Openwrt naming
reverse-wpad-basic-wolfssl and disable SSL on Curl

NOTE: The compile host must have python3-distutils installed for gpsd to build

* aredn: initial working upgrade to openwrt 21.02.1

* aredn: update 1 to working upgrade to openwrt 21.02.1

* aredn: add cpe710v1 to build config

* Andrew's patches

* Remove duplicates + display perl

* Temp disable wifi extension patch

* ifname/ports support

* Add spectrum patch back in

* Generic function to extract interfaces

* New api to get wifi ifname

* Disables jails

* Style link

* aredn: partial upgrade to openwrt 22.0.3.0

added AC device images and partial migration to 22.0.3.0
firewall upgrade pending

* aredn:  update mesh-release and revert config.mk

* Unused

* NFT firewall rewrite

* Common-ize configs

* Fix network layout for hap2

* Use local packages dev (new firewall rules)

* Add HAP2

* Add pause after network restart to let bridge reinitialize

* Various lua fixes for new lua version

* Tweak config

* Re-fix networking (lost patch change)

* Add new radio names

* Tolerate missing wifi

* Fix hap-lite switch setup

* More devices

* New radio id

* Build Rocket 5AC lite

* Remove need for luci.sys

* Remove need for luci.sys

* Explicitly name wlan interfaces

* Handle different compatibility versioning

* Update networking for switches

* ipref version bump

* Extra flag for curl

* Better compat_version fix

* Remove wolfssl

* Fix dns server

* Fix device name

* Unused

* Remove things we don't need

* Remove unused packages

* Generic macaddr overrides

* Fix uci commit

* Fix luci.template.parser to avoid luci.http loading the real thing

* Rocket-M build

* Add search-domain dhcp option

* Turn off ipv6

* No IPV6 in dnsmasq

* Override mac addresses if devices are all the same

* Working from master (for now)

* Put back hostap

* Disable old ethmac fixup

* Tweak configs

* Move back to v22.03.2
Leave ipq4019 builds to master

* Need IPV6 to compile nft firewall

* Rocket-M fixes

* Before we start

* WIP

* Working snapshot

* Cleaned patches

* Merged patch

* Single patch to support HAP2

* Fix typo

* Add nanostation-m

* 5/10MHz patch

* 5+10MHz patch for ath10k-ct driver

* Extend 2GHz channel check to include -4 to -1

* Add chanbw setup for ath10k (like ath9k)

* Added TP-Link CPE710 v1

* Override firmwares

* Missing patch

* Dropbear config like 3.22.8.0

* Add Ubiquiti Rocket 5AC Lite

* Fix c6

* Update

* Need more scan channels

* Remove IPV6

* Improve mac fixups

* Put back missing nft app

* IPv6 removed so don't have to disable it

* Fix rocket-m flash bug

* Fix nanostation-m

* Nanobridge is tiny

* Fix wifi order for ar750

* Rocket M5 XW support

* New rates

* Fix firewall4 so we don't need IPv6

* Allow channel width to be restricted

* Move channel list into library

* Fix naming

* Mechanism to block specific channels on specific radios

* Refresh buttons

* routerboard-sxt-5nd

* CPE605 v1.0

* Improve rocket m xw

* tpink

* Update patch

* Update to remove disable

* Remove BW restrictions on cpe710

* Restrict to what has been tested

* Remove test BW restrictions

* sxtsq-5-ac

* Update

* Update

* powerbeam-m5-300 support

* Fix

* Fix hap2

* Tidy unused patches

* Remove limit

* Add ubnt_bullet-m-ar7241

* Added ubnt_nanobeam-ac-gen2

* Fix typo

* Tolerate missing dtd ip

* Explicitly fix hap2 mac addresses

* Fix some broken patches

* Hap2 won't work at 5MHz

* Ubiquiti LiteBeam 5AC Gen2

* Fix compat_version for sxt 5ac

* Update patch

* Unused

* Fix lan configuration for some devices

* Rolling average of noise level

* Unused

* Split out the ath10k rssi monitor (its very simple at the moment)

* Ignore .DS_Store

* Reboot if ethernet doesn't come up (but only once!)

* reboot returns - add exit

* Add some logging info

* Fix ]

* Check all possible ethernet bridges

* Improve mac fixing

* Remove HostAP on small memory devices

* Reduce dropbear footprint

* Add setsid

* Kill hostap when upgrading to save memory

* Different way to detect hostapd unavailable

* New build steps

* Improve manager logging

* Fix name conflict for the two monitors

* Try to improve test mesh name resolve problem

* Migrate tiny to generic (tiny doesn't work properly)

* Typo

* Another attempt to fix macs for Mikrotik

* Protect against missing trackers

* Fix wpad for ipq40xx

* Remove old tunnel check code

* Enable ZRAM swap to aid low memory devices

* ath10k noise can sometimes be out of range - protect against that

* Updated with current devices and status

* Update firmware which has been tested

* Updated with more builds

* More binary/README

* Fix css error

* Start noise at sensible base level

* Unfix the css so it looks how it used to.

* Save as much memory as we can on lowmem nodes

* Hide some options on low memory devices

* Add "eol" to 32MB devices

* Restart network rather than reboot node if it seems to be broken

* Fixes

* Revert network reset

* Fix ar750 networking

* Continue to trim tiny configs

* More devices

* Dump IW output messages

* Fix Rocket 5AC intermittent ethernet issue

* Ethernet fix for PowerBeam 5AC 500

* More tiny size reduction

* More support data

* Fixed POE and USB power features

* Add Ubiquiti NanoBeam AC (gen1)

* NanoStation (not NanoBeam)

* Add mii-tool package

* Device updates

* Bump update time to 5 minutes

* Fix ethernet negotiation for rocket-5ac and nanobeam

* Fix iplookup

* Config changes based on call feedback

* Radio listing fixes

* Update with more untested builds

* Fallback TxMbps extracted from iw station dump

* Fix tunnel detection for low memory nodes

* Remove unused feed packages

* snapshot build

* Update stability info

* Add powerbeam-5ac-500

* Typo

* Add missing 3.22.1.0

* Add MikroTik LHG 5 AC

* Fix permissions

* Fix permissions

* AirGrids take Bullet builds

* Mikrotik AC3

* Improve supportdata structure a little to make it easier to find things

* Restore WAN VLAN overrides

* Fix vlan regex for hap2 and hap3

* Support old and new style poe controls

* hap-ac3 is version 1.1

* Handle typo in some openwrt config files

* Fix HAP AC3 install

* Update hap ac3 status

* Support user overrides for network ports (non-swconfig devices)

* LHG 5AC support

* Remove -nand

* Remove non-working platform.sh change

* tunnel weight override

* Omit LinkQualityMult when value is 1

* Add mANTBox 19s and 15s

* Support ath79 mikrotik devices which require ath10k in the initramfs

Co-authored-by: apcameron <apcameron@softhome.net>
Co-authored-by: Joe AE6XE <ae6xe@arrl.net>
Co-authored-by: Joe Ayers <joe@arrl.net>
2022-12-22 14:22:49 -06:00
Tim Wilkinson 41b5040102
Improve xlink integration (#545) 2022-11-14 22:45:58 -06:00
Tim Wilkinson 580bbc79fe
Fix for when dtd distance hasn't been found (#549) 2022-11-14 21:45:08 -06:00
Tim Wilkinson 277610bf27
Fix new mac extraction code in LQM was breaking for tunnels (#525) 2022-10-14 15:29:40 -05:00
Tim Wilkinson 93ad1f5ee7
Strip out as many dependencies from Lua Manager as possible to save memory (#522) 2022-10-13 12:07:36 -05:00
Tim Wilkinson c341bba378
Switch to more active wifi reset (#508) 2022-09-20 18:29:03 -05:00
Tim Wilkinson 5efd0276fe Add a wifi scan trigger for when the nodes detected becomes zero 2022-09-10 12:39:44 -07:00
Tim Wilkinson 238d0fcd70
Stop node's LQM neighbors including itself (#502) 2022-09-09 08:50:39 -05:00
Tim Wilkinson 6ba17b8e5a
Snapshot hostnames after updates so we have a consistent copy to display (#488) 2022-09-06 09:58:18 -05:00
Tim Wilkinson dd590a6102
Handle dtd bridge device (#431) 2022-07-13 16:19:56 -05:00
Tim Wilkinson 7887497cb3 Allow auto-distance to be overridden when LQM cannot determine the
distance to other nodes
2022-06-27 15:29:39 -07:00
Tim Wilkinson f8d71b6552 Never block short DtD links regardless of quality.
Ignore invalid mac from arp table when building lookup table.
 This avoids a problem where a mac can be in the table twice,
 once valid and once invalid with an old ip address.
2022-06-22 11:57:29 -07:00
Tim Wilkinson bdb46624f0 DtD links have to be close by 2022-06-16 20:34:39 -07:00
Tim Wilkinson 6b1ec622aa DtD links have to be close by 2022-06-16 20:34:39 -07:00
Tim Wilkinson 28f25cf951 Allow user to force certain macs to be accepted 2022-06-16 20:34:39 -07:00
Tim Wilkinson fb2ec36bb6 LQM2 2022-06-16 20:34:39 -07:00
Tim Wilkinson b86213a66f
LQM fixes 6 (#379) 2022-05-31 21:54:02 -05:00
Tim Wilkinson ba94a86ce3 Fix empty initial lqm status.
Limit distance between DtD nodes which are considered at the same site.
Some network setups use non-ham networks to connect nodes over DtD links.
These should not be considered the same site, so we limit how far apart DtDed
nodes can be when optimizing.
2022-05-26 23:32:37 -07:00
Tim Wilkinson 2f96f2bc7a Really old sysinfo.json files don't have link_info 2022-05-25 21:55:27 -07:00
Tim Wilkinson 53632d322d
LQM fixes 4 (#370)
* Tidy LQM status
Remove TX Estimate which was duplicating information on the mesh page and
confusing folk.
Sort by name to stop the display jumping around.

* Split out ping and tx qualities and use average of both.

* Improve keeping re-discovered nodes in pending

* Remove .local.mesh from hostname (they're there sometimes)

* Identify why poor quality traffic is blocked
2022-05-24 10:35:36 -05:00
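The "use average of both" step listed in this commit can be written out as a one-line calculation. The sketch below is only an illustration: the function name `combined_quality` and the 0-100 percentage scale are assumptions, not details confirmed by the log.

```python
def combined_quality(ping_quality, tx_quality):
    """Average two link-quality figures, each assumed to be a 0-100 percentage."""
    for value in (ping_quality, tx_quality):
        if not 0 <= value <= 100:
            raise ValueError("quality must be between 0 and 100")
    # Keeping the two measurements separate until this point lets each be
    # reported on its own as well as in combination.
    return (ping_quality + tx_quality) / 2
```

For example, a link with perfect ping quality but only 50% transmit quality would be reported as 75% overall.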
Tim Wilkinson 988c7f251b
Turn LQM off when not enabled!! (#369) 2022-05-23 07:44:37 -05:00
Tim Wilkinson b680d2019e
LQM fixes 3 (#366) 2022-05-22 21:06:02 -05:00
Tim Wilkinson 1ceb7b2140
LQM fixes 2 (#365) 2022-05-20 21:23:57 -05:00
Tim Wilkinson a8b7f8a216
LQM improvements (#364) 2022-05-20 08:10:01 -05:00
Tim Wilkinson b23ab5ee8a
Link Quality Management (#360)
* Link Quality Management experiment (built in)

* Protect LQM pages

* Omit "empty" mac addresses

* Integrate LQM v0.2
Includes proposed UI if this were built-in.
When LQM is enabled (advanced settings) the usual distance inputs are
replaced with "min snr" and "max distance" inputs which are the major
ones you might tweak, as well as a link to the LQM status page.
Other controls are now available (so protected) in advanced settings.

* Improve LQM updating

* Use running snr averages

* Merge app changes

* AREDN-ize the UI

* Improve status language

* Improved DtD detection

* Improve quality reporting

* Link Quality category

* Enable by default

* Better integration

* Link => Neighbor

* Formatting

* Make sure initial page is populated without extra fetch

* Handle empty lqm.info

* Update with latest experiment algorithm changes

* Validate LQM settings before applying them

* Algorithm updates

* Improve quality reporting

* %% -> %

* Default max distance now 50 miles

* Get actual noise if radio will provide it

* low_snr => min_snr

* Don't print node description if we don't have one

* Remove properties duplicated from setup page

* Localize max distance. Miles in GB and US, Kilometers everywhere else.

* Ping link quality testing

* UDP 'ping' for quality check

* Change Active Settings title

* Expand ping test

* Improve messaging

* Add a ping penalty for neighbors which cannot be contacted in a timely manner.

* Remove user_blocks config option. No one needs to use this anymore.

* Localize distances on lqm page

* Improve status reporting

* First run emergency node setup.
When a node first runs LQM, if the default settings fail to connect to
a node we will now adjust them so that at least one node is viable.

* Restore blocking of mac addresses

* LQM now off by default
fixed #47
2022-05-18 12:49:00 -05:00
Tim Wilkinson 7aff95711e
Improve rss_monitor startup (#346) 04/27/2022
And shut down if we have no wifi to monitor
2022-04-27 10:38:54 -05:00
Tim Wilkinson 0632a63853
Missing year (in one place) when updating snr log (#341) 2022-04-23 20:37:26 -05:00
Tim Wilkinson 2e4b51105c Handle nil links from olsrd 2022-04-19 04:34:41 -07:00
Tim Wilkinson b26476f5e2
Fix displaying "Previous neighbors" with empty hostnames (#325)
* Don't display previous neighbors with empty hostnames
* Use IP address when name missing
* Fix bug where missing names became ever growing string of whitespace
2022-04-04 22:16:35 -04:00
Tim Wilkinson a322c4a113
Add periodic tasks in the style of cron hourly, daily and weekly scripts (#317) 2022-03-24 21:52:32 -05:00
Tim Wilkinson 93e8d0a53d Cron is only running to poll AREDN messages, so kill it.
And move polling into the Lua Manager
2022-03-09 19:18:19 -08:00
Tim Wilkinson 861db07ad6
Move multi ant detection into main run method (#229)
to avoid startup failures.
2022-02-24 11:06:29 -06:00
Tim Wilkinson 2d3e9a86b3 Fix typo when checking for multiple antennas
This check can fail if the rssi manager is started early but the daemon will
be restarted.
2022-02-23 15:40:32 -08:00
Tim Wilkinson 1ddedbb1e6
Link led fixes (#212) 2022-01-28 20:19:34 -06:00
Tim Wilkinson fa6c2da4fe
Lua Services (#189)
* Lua Services

* Support multiple antenna chains

* Improved led detection

* Fix logging

* Add manager.log files to support tool
2022-01-17 18:54:44 -06:00