This is an attempt to unify all the station monitoring and make it work
better as one. We're trying to square a circle here somewhat: we want to take
steps to kick nodes when problems are detected, but not kick them too quickly
or too often in case we're misidentifying issues.
We've seen these issues manifest themselves with nodes disrupting VoIP services,
as well as resets causing nodes to get into unrecoverable states when there
were no real problems in the first place.
This will probably need to evolve before the next release, but it would be good
to get some mileage on the new code.
Coverage is handled by modifying firmware state, and the driver stores
the value the first time it is set. When we reset, this state might be lost,
so it will be reloaded from the firmware. We set the coverage back to 0
so the reloaded value will be the default again.
We also remove a check which can fail incorrectly.
* A scan, especially if we have to do both active and passive, essentially mutes
the radio to AREDN traffic for 10-20 seconds, which isn't good. If the radio is completely
deaf then it doesn't matter, but particularly on the 9K radios we do this when
things are looking a bit dodgy, though not deaf.
* Provide hook to reset ath9k from userspace. This hook is attributed to:
Linus Lüssing <ll@simonwunderlich.de>
* Use /sys reset hooks rather than iw scan
* Make admin and user bar menus pluggable
* Realign header block to stop it moving around
* Remove ref
* Use modular nav to disable ineligible options during initial install
* Don't offer tunnel menu options when no tunnel daemon is installed.
This is for low-memory devices
* Simplify
* Improve messaging when running ram image
* Disable rather than hide vpn menu items on tiny memory devices
* Move menu navs
* Use LQM information to filter out neighbors we don't care about.
These can cause false rejoin events and degrade the network.
* Only use active station monitor with LQM info.
* Resolve unresponsive node problems with Mikrotik AC devices.
Mikrotik AC devices get into a state where they won't communicate with
non-AC devices .. sometimes. Leaving and rejoining the network resets
everything. We monitor for this situation and rejoin the network when detected
to resolve the issue.
* Make reporting less chatty
* General station monitor service.
It turns out this station bug is not limited to the ath10k driver, so
make this monitor service wifi generic.
(I've now seen this at both ends of the Mikrotik AC <-> Rocket pair)
* New logs
* Just monitor for now
There appears to be a bug in the ath10k firmware for Mikrotik devices (maybe others)
where a station will associate but only broadcast traffic will be passed - unicast traffic
will fail. This code detects this situation and forces the device to reassociate which
fixes the problem.
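As a rough illustration of the detection described above, here is a minimal sketch. It assumes we already have, per poll, a flag for whether a station is associated, whether broadcast traffic was heard from it, and whether a unicast probe was answered. The `StationSample` type, the `stuck_stations` helper and the three-in-a-row threshold are all hypothetical and not taken from the actual monitor.

```python
from dataclasses import dataclass


@dataclass
class StationSample:
    mac: str
    associated: bool
    broadcast_seen: bool  # assumption: e.g. broadcasts heard from this station
    unicast_ok: bool      # assumption: e.g. a unicast probe was answered


def stuck_stations(samples, history, threshold=3):
    """Return MACs that were associated and broadcasting but failed
    unicast for `threshold` polls in a row. `history` maps mac -> count
    of consecutive bad polls and is updated in place."""
    stuck = []
    for s in samples:
        if s.associated and s.broadcast_seen and not s.unicast_ok:
            history[s.mac] = history.get(s.mac, 0) + 1
        else:
            history[s.mac] = 0
        if history[s.mac] >= threshold:
            stuck.append(s.mac)
            history[s.mac] = 0  # start counting again after reporting
    return stuck
```

Requiring several bad polls in a row is one way to avoid forcing a reassociation on a single dropped probe.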
* Track validation state of hosts and services. Only remove a host/service if it fails multiple times in a row.
* Let new addresses/services be valid for a while regardless
* Initially unknown addresses will be valid for a while
* Reset validation state when services updated
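The validation tracking above could look something like the following sketch. The class name, the failure limit and the grace period are hypothetical values chosen for illustration; the real code may track this quite differently.

```python
import time


class ValidationTracker:
    """Keep a host/service until it fails several checks in a row."""

    def __init__(self, max_failures=3, grace_seconds=600, clock=time.monotonic):
        self.max_failures = max_failures    # assumption: not from the source
        self.grace_seconds = grace_seconds  # assumption: not from the source
        self._clock = clock
        self._state = {}

    def update(self, key, check_passed):
        """Record one check result; return True if `key` should be kept."""
        now = self._clock()
        entry = self._state.setdefault(key, {"first_seen": now, "failures": 0})
        if check_passed:
            entry["failures"] = 0
            return True
        if now - entry["first_seen"] < self.grace_seconds:
            return True  # newly seen entries stay valid for a while regardless
        entry["failures"] += 1
        return entry["failures"] < self.max_failures

    def reset(self):
        """Forget everything, e.g. when the service list is updated."""
        self._state.clear()
```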
On small networks there are not a lot of OLSR name changes. While
dnsmasq watches for changes and updates itself, it will sometimes miss
them. On busy networks this doesn't matter as the next change will catch
it up. But on smaller networks (esp. test networks) a missed change can
stop name resolution working for some time. So now, if no changes are
detected for > 60 seconds, we force dnsmasq to reload its tables.
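A minimal sketch of the "no changes for more than 60 seconds, so force a reload" logic described above. The `ReloadWatchdog` class is hypothetical, and the `reload` callback is left to the caller since the text above doesn't say how the reload is triggered.

```python
import time


class ReloadWatchdog:
    """Call `reload` when no change has been noted for `quiet_seconds`."""

    def __init__(self, reload, quiet_seconds=60, clock=time.monotonic):
        self._reload = reload
        self._quiet = quiet_seconds
        self._clock = clock
        self._last = clock()

    def note_change(self):
        """Call whenever a name change is actually detected."""
        self._last = self._clock()

    def tick(self):
        """Call periodically; returns True if a reload was forced."""
        now = self._clock()
        if now - self._last > self._quiet:
            self._reload()
            self._last = now  # restart the quiet period after reloading
            return True
        return False
```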
For some reason, there was code in the driver to block the setting of
the coverage when a previous setting wasn't a particular value.
It's unclear what this was trying to achieve or prevent, but it stopped AC
devices operating efficiently (by a factor of 10x or more).
* Exclude neighbor's neighbors which are non-routable.
If a neighbor node's neighbor is non-routable, then no traffic will
flow from it, so it's not hidden
* Use routable flag for exposed node detection
* Enable RTS/CTS when we detect hidden nodes
* Only change rts setting when we need to
* RTS advanced config option
* Include neighbor's blocked neighbors (they still transmit)
* Bump default RTS threshold
* Report list of hidden nodes rather than yes/no
* Canonical hostnames
* When we enable RTS, enable it for all traffic by default
* Show hidden neighbors in display
* Default RTS threshold to -1 (always off)
The connect timeout did not include DNS lookups, and if DNS is broken this can hang forever. Add
a maximum timeout so this call will eventually terminate regardless.
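The general idea is an overall deadline wrapped around the whole call, not just the connect phase. This is only a generic sketch of that idea; the note above doesn't say which tool or option was actually used, and the `call_with_deadline` helper is hypothetical.

```python
import threading


def call_with_deadline(fn, deadline_seconds, *args):
    """Run fn(*args) in a daemon thread and give up after the deadline,
    no matter which phase (DNS, connect, transfer) is stuck."""
    result = {}

    def worker():
        try:
            result["value"] = fn(*args)
        except Exception as exc:  # hand the failure back to the caller
            result["error"] = exc

    t = threading.Thread(target=worker, daemon=True)
    t.start()
    t.join(deadline_seconds)
    if t.is_alive():
        raise TimeoutError(f"no result within {deadline_seconds}s")
    if "error" in result:
        raise result["error"]
    return result["value"]
```

Note the worker thread is abandoned, not killed, when the deadline passes; that is fine for a sketch but worth keeping in mind.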
When a tunnel is idle, binding to the tun* device fails; so remove it.
As we have a direct tunnel route in the routing table (not OLSR table 30)
created by vtun, we will still correctly route the quality testing traffic.
* Update to Openwrt 21.02 and add support for the CPE710 v1
Update scripts to change references to ifname to device due to a change in Openwrt naming
reverse-wpad-basic-wolfssl and disable SSL on Curl
NOTE: The compile host must have python3-distutils installed for gpsd to build
* aredn: initial working upgrade to openwrt 21.02.1
* aredn: update 1 to working upgrade to openwrt 21.02.1
* aredn: add cpe710v1 to build config
* Andrew's patches
* Remove duplicates + display perl
* Temp disable wifi extension patch
* ifname/ports support
* Add spectrum patch back in
* Generic function to extract interfaces
* New api to get wifi ifname
* Disables jails
* Style link
* aredn: partial upgrade to openwrt 22.03.0
added AC device images and partial migration to 22.03.0
firewall upgrade pending
* aredn: update mesh-release and revert config.mk
* Unused
* NFT firewall rewrite
* Common-ize configs
* Fix network layout for hap2
* Use local packages dev (new firewall rules)
* Add HAP2
* Add pause after network restart to let bridge reinitialize
* Various lua fixes for new lua version
* Tweak config
* Re-fix networking (lost patch change)
* Add new radio names
* Tolerate missing wifi
* Fix hap-lite switch setup
* More devices
* New radio id
* Build Rocket 5AC lite
* Remove need for luci.sys
* Remove need for luci.sys
* Explicitly name wlan interfaces
* Handle different compatibility versioning
* Update networking for switches
* iperf version bump
* Extra flag for curl
* Better compat_version fix
* Remove wolfssl
* Fix dns server
* Fix device name
* Unused
* Remove things we don't need
* Remove unused packages
* Generic macaddr overrides
* Fix uci commit
* Fix luci.template.parser to avoid luci.http loading the real thing
* Rocket-M build
* Add search-domain dhcp option
* Turn off IPv6
* No IPV6 in dnsmasq
* Override mac addresses if devices all the same
* Working from master (for now)
* Put back hostap
* Disable old ethmac fixup
* Tweak configs
* Move back to v22.03.2
Leave ipq4019 builds to master
* Need IPV6 to compile nft firewall
* Rocket-M fixes
* Before we start
* WIP
* Working snapshot
* Cleaned patches
* Merged patch
* Single patch to support HAP2
* Fix typo
* Add nanostation-m
* 5/10MHz patch
* 5+10MHz patch for ath10k-ct driver
* Extend 2GHz channel check to include -4 to -1
* Add chanbw setup for ath10k (like ath9k)
* Added TP-Link CPE710 v1
* Override firmwares
* Missing patch
* Dropbear config like 3.22.8.0
* Add Ubiquiti Rocket 5AC Lite
* Fix c6
* Update
* Need more scan channels
* Remove IPV6
* Improve mac fixups
* Put back missing nft app
* IPv6 removed so don't have to disable it
* Fix rocket-m flash bug
* Fix nanostation-m
* Nanobridge is tiny
* Fix wifi order for ar750
* Rocket M5 XW support
* New rates
* Fix firewall4 so we don't need IPv6
* Allow channel width to be restricted
* Move channel list into library
* Fix naming
* Mechanism to block specific channels on specific radios
* Refresh buttons
* routerboard-sxt-5nd
* CPE605 v1.0
* Improve rocket m xw
* tpink
* Update patch
* Update to remove disable
* Remove BW restrictions on cpe710
* Restrict to what has been tested
* Remove test BW restrictions
* sxtsq-5-ac
* Update
* Update
* powerbeam-m5-300 support
* Fix
* Fix hap2
* Tidy unused patches
* Remove limit
* Add ubnt_bullet-m-ar7241
* Added ubnt_nanobeam-ac-gen2
* Fix typo
* Tolerate missing dtd ip
* Explicitly fix hap2 mac addresses
* Fix some broken patches
* Hap2 won't work at 5MHz
* Ubiquiti LiteBeam 5AC Gen2
* Fix compat_version for sxt 5ac
* Update patch
* Unused
* Fix lan configuration for some devices
* Rolling average of noise level
* Unused
* Split out the ath10k rssi monitor (it's very simple at the moment)
* Ignore .DS_Store
* Reboot if ethernet doesn't come up (but only once!)
* reboot returns - add exit
* Add some logging info
* Fix ]
* Check all possible ethernet bridges
* Improve mac fixing
* Remove HostAP on small memory devices
* Reduce dropbear footprint
* Add setsid
* Kill hostap when upgrading to save memory
* Different way to detect hostapd unavailable
* New build steps
* Improve manager logging
* Fix name conflict for the two monitors
* Try to improve test mesh name resolve problem
* Migrate tiny to generic (tiny doesn't work properly)
* Typo
* Another attempt to fix macs for Mikrotik
* Protect against missing trackers
* Fix wpad for ipq40xx
* Remove old tunnel check code
* Enable ZRAM swap to aid low memory devices
* ath10k noise can sometimes be out of range - protect against that
* Updated with current devices and status
* Update firmware which has been tested
* Updated with more builds
* More binary/README
* Fix css error
* Start noise at sensible base level
* Unfix the css so it looks how it used to.
* Save as much memory as we can on lowmem nodes
* Hide some options on low memory devices
* Add "eol" to 32MB devices
* Restart network rather than reboot node if it seems to be broken
* Fixes
* Revert network reset
* Fix ar750 networking
* Continue to trim tiny configs
* More devices
* Dump IW output messages
* Fix Rocket 5AC intermittent ethernet issue
* Ethernet fix for PowerBeam 5AC 500
* More tiny size reduction
* More support data
* Fixed POE and USB power features
* Add Ubiquiti NanoBeam AC (gen1)
* NanoStation (not NanoBeam)
* Add mii-tool package
* Device updates
* Bump update time to 5 minutes
* Fix ethernet negotiation for rocket-5ac and nanobeam
* Fix iplookup
* Config changes based on call feedback
* Radio listing fixes
* Update with more untested builds
* Fallback TxMbps extracted from iw station dump
* Fix tunnel detection for low memory nodes
* Remove unused feed packages
* snapshot build
* Update stability info
* Add powerbeam-5ac-500
* Typo
* Add missing 3.22.1.0
* Add MikroTik LHG 5 AC
* Fix permissions
* Fix permissions
* AirGrids take Bullet builds
* Mikrotik AC3
* Improve supportdata structure a little to make it easier to find things
* Restore WAN VLAN overrides
* Fix vlan regex for hap2 and hap3
* Support old and new style poe controls
* hap-ac3 is version 1.1
* Handle typo in some openwrt config files
* Fix HAP AC3 install
* Update hap ac3 status
* Support user overrides for network ports (non-swconfig devices)
* LHG 5AC support
* Remove -nand
* Remove non-working platform.sh change
* tunnel weight override
* Omit LinkQualityMult when value is 1
* Add mANTBox 19s and 15s
* Support ath79 mikrotik devices which require ath10k in the initramfs
Co-authored-by: apcameron <apcameron@softhome.net>
Co-authored-by: Joe AE6XE <ae6xe@arrl.net>
Co-authored-by: Joe Ayers <joe@arrl.net>
* Fix for port ranges
Fix port range validation.
* Update CONTRIBUTORS
added myself
* Update files/usr/lib/lua/aredn/utils.lua
Reverting to whitespace protection plus escaping hyphen.
Co-authored-by: Tim Wilkinson <tim.wilkinson@me.com>
* Update ports
added %s* in front of the port range input in case a whitespace has been inserted.
Co-authored-by: Tim Wilkinson <tim.wilkinson@me.com>
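For illustration, here is a small sketch of port range validation that tolerates stray whitespace, in the spirit of the fix above. It uses a Python regex rather than the Lua pattern mentioned in the commit, and the `parse_port_range` name and the exact rules (1 to 65535, low not above high) are assumptions.

```python
import re

# Optional whitespace, a port, then optionally "-" and a second port.
_PORT_RANGE = re.compile(r"^\s*(\d{1,5})(?:\s*-\s*(\d{1,5}))?\s*$")


def parse_port_range(text):
    """Return (low, high) for "80" or "8080-8090", or None if invalid."""
    m = _PORT_RANGE.match(text)
    if not m:
        return None
    low = int(m.group(1))
    high = int(m.group(2) or m.group(1))  # single port means low == high
    if not (1 <= low <= high <= 65535):
        return None
    return (low, high)
```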
Ignore invalid mac from arp table when building lookup table.
This avoids a problem where a mac can be in the table twice,
once valid and once invalid with an old ip address.
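A sketch of the filtering, assuming the table is read in the Linux /proc/net/arp text layout and that "invalid" means the entry is not marked complete (flag bit 0x2) or has an all-zero mac. Both of those are assumptions about what the real code checks, and `build_mac_lookup` is a hypothetical name.

```python
ATF_COM = 0x2  # Linux ARP flag: entry is complete


def build_mac_lookup(arp_text):
    """Map mac -> ip, skipping incomplete or empty ARP entries."""
    lookup = {}
    for line in arp_text.splitlines()[1:]:  # first line is the header
        fields = line.split()
        if len(fields) < 6:
            continue
        ip, flags, mac = fields[0], fields[2], fields[3]
        if mac == "00:00:00:00:00:00":
            continue
        if not int(flags, 16) & ATF_COM:
            continue  # stale entry, possibly holding an old ip address
        lookup[mac.lower()] = ip
    return lookup
```

Without the flag check, whichever of the two entries comes last in the table would win, which may be the stale one.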
Limit distance between DtD nodes which are considered to be at the same site.
Some network setups use non-ham networks to connect nodes over DtD links.
These should not be considered the same site, so we limit how far apart DtDed
nodes can be when optimizing.
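A sketch of such a distance check using the haversine formula. The 50 meter limit and the `same_site` helper are hypothetical; the commit doesn't state the actual limit or how distance is computed.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in kilometers."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def same_site(a, b, max_km=0.05):
    """Treat two DtD-linked nodes as one site only if they are close.
    `a` and `b` are (lat, lon) tuples; max_km is an assumed limit."""
    return haversine_km(a[0], a[1], b[0], b[1]) <= max_km
```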
* Tidy LQM status
Remove TX Estimate which was duplicating information on the mesh page and
confusing folk.
Sort by name to stop the display jumping around.
* Split out ping and tx qualities and use average of both.
* Improve keeping re-discovered nodes in pending
* Remove .local.mesh from hostname (they're there sometimes)
* Identify why poor quality traffic is blocked
* Link Quality Management experiment (built in)
* Protect LQM pages
* Omit "empty" mac addresses
* Integrate LQM v0.2
Includes proposed UI if this were built-in.
When LQM is enabled (advanced settings) the usual distance inputs are
replaced with "min snr" and "max distance" inputs which are the major
ones you might tweak, as well as a link to the LQM status page.
Other controls are now available (so protected) in advanced settings.
* Improve LQM updating
* Use running snr averages
* Merge app changes
* AREDN-ize the UI
* Improve status language
* Improved DtD detection
* Improve quality reporting
* Link Quality category
* Enable by default
* Better integration
* Link => Neighbor
* Formatting
* Make sure initial page is populated without extra fetch
* Handle empty lqm.info
* Update with latest experiment algorithm changes
* Validate LQM settings before applying them
* Algorithm updates
* Improve quality reporting
* %% -> %
* Default max distance now 50 miles
* Get actual noise if radio will provide it
* low_snr => min_snr
* Don't print node description if we don't have one
* Remove properties duplicated from setup page
* Localize max distance. Miles in GB and US, Kilometers everywhere else.
* Ping link quality testing
* UDP 'ping' for quality check
* Change Active Settings title
* Expand ping test
* Improve messaging
* Add a ping penalty for neighbors which cannot be contacted in a timely manner.
* Remove user_blocks config option. No one needs to use this anymore.
* Localize distances on lqm page
* Improve status reporting
* First run emergency node setup.
When a node first runs LQM, if the default settings fail to connect to
a node we will now adjust them so that at least one node is viable.
* Restore blocking of mac addresses
* LQM now off by default
fixed#47
If wifi is disabled, we will be using a "fake" device for the meshrf. However, this requires that the
underlying physical device is attached, and this might not be the case on devices which present
multiple ethernets (e.g. eth0 and eth1). Detect this and add an extra Hna4 config to OLSR to allow it to
keep using the wifi_ip even when no physical ethernet is attached.
* Don't display previous neighbors with empty hostnames
* Use IP address when name missing
* Fix bug where missing names became ever growing string of whitespace
* Improve the firmware upgrade process
The old firmware upgrade process attempted to free up RAM by reusing
the 'upgrade_kill_prep' script which is later used by '/sbin/sysupgrade'.
Unfortunately this doesn't work as intended. While the script will go about
killing various services, 'procd' just goes and starts them up again using
quite a bit more memory in the process. Instead this script just kills
the various daemons 'no questions asked' and then runs the associated
'/etc/init.d/xxx stop' script to instruct 'procd' not to start them up again.
This gets us to the place the original script was trying to go.
+ A syntax fix in '007' patch (need spaces around the [ .. ])
* Inline the style for the firmware page to avoid sleep before flash
* Minor reliability improvements
* Clear away services even earlier
* Final bits of perl replaced by lua
* Use iwinfo during first boot (api I was using fails this early)
* Retry getting phy device (it can fail as the node is booting up)
* Lua vpn server and client pages
* Lua vpn server and client pages
* Fix reporting of daemon restart errors
* Lua olsrd-config
* Fix reversed client/server ip assignments
* Fix patterns for finding active tunnels