nebula

Commit Graph

Author	SHA1	Message	Date
Wade Simmons	fe16ea566d	firewall reject packets: cleanup error cases (#957)	2023-11-13 12:43:51 -06:00
Nate Brown	a44e1b8b05	Clean up a hostinfo to reduce memory usage (#955)	2023-11-02 16:53:59 -05:00
Nate Brown	5a131b2975	Combine ca, cert, and key handling (#952)	2023-08-14 21:32:40 -05:00
Nate Brown	a10baeee92	Pull hostmap and pending hostmap apart, remove unused functions (#843)	2023-07-24 12:37:52 -05:00
Nate Brown	03e4a7f988	Rehandshaking (#838) Co-authored-by: Brad Higgins <brad@defined.net> Co-authored-by: Wade Simmons <wadey@slack-corp.com>	2023-05-04 15:16:37 -05:00
brad-defined	9b03053191	update EncReader and EncWriter interface function args to have concrete types (#844) * Update LightHouseHandlerFunc to remove EncWriter param. * Move EncWriter to interface * EncReader, too	2023-04-07 14:28:37 -04:00
Nate Brown	ee8e1348e9	Use connection manager to drive NAT maintenance (#835) Co-authored-by: brad-defined <77982333+brad-defined@users.noreply.github.com>	2023-03-31 15:45:05 -05:00
Nate Brown	1a6c657451	Normalize logs (#837)	2023-03-30 15:07:31 -05:00
brad-defined	2801fb2286	Fix relay (#827) Co-authored-by: Nate Brown <nbrown.us@gmail.com>	2023-03-30 11:09:20 -05:00
Wade Simmons	6e0ae4f9a3	firewall: add option to send REJECT replies (#738) * firewall: add option to send REJECT replies This change allows you to configure the firewall to send REJECT packets when a packet is denied. firewall: # Action to take when a packet is not allowed by the firewall rules. # Can be one of: # `drop` (default): silently drop the packet. # `reject`: send a reject reply. # - For TCP, this will be a RST "Connection Reset" packet. # - For other protocols, this will be an ICMP port unreachable packet. outbound_action: drop inbound_action: drop These packets are only sent to established tunnels, and only on the overlay network (currently IPv4 only). $ ping -c1 192.168.100.3 PING 192.168.100.3 (192.168.100.3) 56(84) bytes of data. From 192.168.100.3 icmp_seq=2 Destination Port Unreachable --- 192.168.100.3 ping statistics --- 2 packets transmitted, 0 received, +1 errors, 100% packet loss, time 31ms $ nc -nzv 192.168.100.3 22 (UNKNOWN) [192.168.100.3] 22 (?) : Connection refused This change also modifies the smoke test to capture tcpdump pcaps from both the inside and outside to inspect what is going on over the wire. It also now does TCP and UDP packet tests using the Nmap version of ncat. * calculate seq and ack the same way as the kernel The logic is a bit confusing, so we copy it straight from how the kernel does iptables `--reject-with tcp-reset`: - https://github.com/torvalds/linux/blob/v5.19/net/ipv4/netfilter/nf_reject_ipv4.c#L193-L221 * cleanup	2023-03-13 15:08:40 -04:00
Nate Brown	92cc32f844	Remove handshake race avoidance (#820) Co-authored-by: Wade Simmons <wadey@slack-corp.com>	2023-03-13 12:35:14 -05:00
Nate Brown	a06977bbd5	Track connections by local index id instead of vpn ip (#807)	2023-02-13 14:41:05 -06:00
Caleb Jasik	12dbbd3dd3	Fix typos found by https://github.com/crate-ci/typos (#735)	2022-12-19 11:28:27 -06:00
Nate Brown	4c0ae3df5e	Refuse to process double encrypted packets (#741)	2022-09-19 12:47:48 -05:00
Wade Simmons	7b9287709c	add listen.send_recv_error config option (#670) By default, Nebula replies to packets it has no tunnel for with a `recv_error` packet. This packet helps speed up re-connection in the case that Nebula on either side did not shut down cleanly. This response can be abused as a way to discover if Nebula is running on a host though. This option lets you configure if you want to send `recv_error` packets always, never, or only to private network remotes. valid values: always, never, private This setting is reloadable with SIGHUP.	2022-06-27 12:37:54 -04:00
brad-defined	1a7c575011	Relay (#678) Co-authored-by: Wade Simmons <wsimmons@slack-corp.com>	2022-06-21 13:35:23 -05:00
Wade Simmons	45d1d2b6c6	Update dependencies - 2022-04 (#664) Updated github.com/kardianos/service https://github.com/kardianos/service/compare/v1.2.0...v1.2.1 Updated github.com/miekg/dns https://github.com/miekg/dns/compare/v1.1.43...v1.1.48 Updated github.com/prometheus/client_golang https://github.com/prometheus/client_golang/compare/v1.11.0...v1.12.1 Updated github.com/prometheus/common https://github.com/prometheus/common/compare/v0.32.1...v0.33.0 Updated github.com/stretchr/testify https://github.com/stretchr/testify/compare/v1.7.0...v1.7.1 Updated golang.org/x/crypto `5770296d90...ae2d96664a` Updated golang.org/x/net `69e39bad7d...749bd193bc` Updated golang.org/x/sys `7861aae155...289d7a0edf` Updated golang.zx2c4.com/wireguard/windows v0.5.1...v0.5.3 Updated google.golang.org/protobuf v1.27.1...v1.28.0	2022-04-18 12:12:25 -04:00
Nate Brown	312a01dc09	Lighthouse reload support (#649) Co-authored-by: John Maguire <contact@johnmaguire.me>	2022-03-14 12:35:13 -05:00
Wade Simmons	949ec78653	don't set ConnectionState to nil (#590) * don't set ConnectionState to nil We might have packets processing in another thread, so we can't safely just set this to nil. Since we removed it from the hostmaps, the next packets to process should start the handshake over again. I believe this comment is outdated or incorrect, since the next handshake will start over with a new HostInfo, I don't think there is any way a counter reuse could happen: > We must null the connectionstate or a counter reuse may happen Here is a panic we saw that I think is related: panic: runtime error: invalid memory address or nil pointer dereference [signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x93a037] goroutine 59 [running, locked to thread]: github.com/slackhq/nebula.(Firewall).Drop(...) github.com/slackhq/nebula/firewall.go:380 github.com/slackhq/nebula.(Interface).consumeInsidePacket(...) github.com/slackhq/nebula/inside.go:59 github.com/slackhq/nebula.(Interface).listenIn(...) github.com/slackhq/nebula/interface.go:233 created by github.com/slackhq/nebula.(Interface).run github.com/slackhq/nebula/interface.go:191 * use closeTunnel	2021-12-06 14:09:05 -05:00
Nate Brown	bcabcfdaca	Rework some things into packages (#489)	2021-11-03 20:54:04 -05:00
Wade Simmons	ea2c186a77	remote_allow_ranges: allow inside CIDR specific remote_allow_lists (#540) This allows you to configure remote allow lists specific to different subnets of the inside CIDR. Example: remote_allow_ranges: 10.42.42.0/24: 192.168.0.0/16: true This would only allow hosts with a VPN IP in the 10.42.42.0/24 range to have private IPs (and thus not connect over public IPs). The PR also refactors AllowList into RemoteAllowList and LocalAllowList to make it clearer which methods are allowed on which allow list.	2021-10-19 10:54:30 -04:00
Andrii Chubatiuk	d13f4b5948	fixed recv_errors spoofing condition (#482) Hi @nbrownus Fixed a small bug that was introduced in df7c7ee#diff-5d05d02296a1953fd5fbcb3f4ab486bc5f7c34b14c3bdedb068008ec8ff5beb4 having problems due to it	2021-06-03 13:04:04 -04:00
Nathan Brown	df7c7eec4a	Get out faster on nil udpAddr (#449)	2021-04-26 20:21:47 -05:00
Nathan Brown	6f37280e8e	Fully close tunnels when CloseAllTunnels is called (#448)	2021-04-26 10:42:24 -05:00
Nathan Brown	710df6a876	Refactor remotes and handshaking to give every address a fair shot (#437)	2021-04-14 13:50:09 -05:00
Nathan Brown	75f7bda0a4	Lighthouse performance pass (#418)	2021-03-31 17:32:02 -05:00
Nathan Brown	883e09a392	Don't use a global ca pool (#426)	2021-03-29 12:10:19 -05:00
Nathan Brown	3ea7e1b75f	Don't use a global logger (#423)	2021-03-26 09:46:30 -05:00
Nathan Brown	7073d204a8	IPv6 support for outside (udp) (#369)	2021-03-18 20:37:24 -05:00
Wade Simmons	6c55d67f18	Refactor handshake_ix (#401) There are some subtle race conditions with the previous handshake_ix implementation, mostly around collisions with localIndexId. This change refactors it so that we have a "commit" phase during the handshake where we grab the lock for the hostmap and ensure that we have a unique local index before storing it. We also now avoid using the pending hostmap at all for receiving stage1 packets, since we have everything we need to just store the completed handshake. Co-authored-by: Nate Brown <nbrown.us@gmail.com> Co-authored-by: Ryan Huber <rhuber@gmail.com> Co-authored-by: forfuncsake <drussell@slack-corp.com>	2021-03-12 14:16:25 -05:00
Wade Simmons	64d8035d09	fix race in getOrHandshake (#400) We missed this race with #396 (and I think this is also the crash in issue #226). We need to lock a little higher in the getOrHandshake method, before we reset hostinfo.ConnectionInfo. Previously, two routines could enter this section and confuse the handshake process. This could result in the other side sending a recv_error that also has a race with setting hostinfo.ConnectionInfo back to nil. So we make sure to grab the lock in handleRecvError as well. Neither of these code paths is in the hot path (handling packets between two hosts over an active tunnel) so there should be no performance concerns.	2021-03-09 09:27:02 -05:00
Wade Simmons	2a4beb41b9	Routine-local conntrack cache (#391) Previously, every packet we see gets a lock on the conntrack table and updates it. When running with multiple routines, this can cause heavy lock contention and limit our ability for the threads to run independently. This change caches reads from the conntrack table for a very short period of time to reduce this lock contention. This cache will currently default to disabled unless you are running with multiple routines, in which case the default cache delay will be 1 second. This means that entries in the conntrack table may be up to 1 second out of date and remain in a routine local cache for up to 1 second longer than the global table. Instead of calling time.Now() for every packet, this cache system relies on a tick thread that updates the current cache "version" each tick. Every packet we check if the cache version is out of date, and reset the cache if so.	2021-03-01 19:52:17 -05:00
Wade Simmons	1bae5b2550	more validation in pending hostmap deletes (#344) We are currently seeing some cases where we are not deleting entries correctly from the pending hostmap. I believe this is a case of an inbound timer tick firing and deleting the Hosts map entry for a newer handshake attempt than intended, thus leaving the old Indexes entry orphaned. This change adds some extra checking when deleting from the Indexes and Hosts maps to ensure we clean everything up correctly.	2021-03-01 12:40:46 -05:00
Tim Rots	e7e6a23cde	fix a few typos (#302)	2021-03-01 11:14:34 -05:00
Wade Simmons	27d9a67dda	Proper multiqueue support for tun devices (#382) This change is for Linux only. Previously, when running with multiple tun.routines, we would only have one file descriptor. This change instead sets IFF_MULTI_QUEUE and opens a file descriptor for each routine. This allows us to process with multiple threads while preventing out of order packet reception issues. To attempt to distribute the flows across the queues, we try to write to the tun/UDP queue that corresponds with the one we read from. So if we read a packet from tun queue "2", we will write the outgoing encrypted packet to UDP queue "2". Because of the nature of how multi queue works with flows, a given host tunnel will be sticky to a given routine (so if you try to performance benchmark by only using one tunnel between two hosts, you are only going to be using a max of one thread for each direction). Because this system works much better when we can correlate flows between the tun and udp routines, we are deprecating the undocumented "tun.routines" and "listen.routines" parameters and introducing a new "routines" parameter that sets the value for both. If you use the old undocumented parameters, the max of the values will be used and a warning logged. Co-authored-by: Nate Brown <nbrown.us@gmail.com>	2021-02-25 15:01:14 -05:00
Wade Simmons	ee7c27093c	add HostMap.RemoteIndexes (#329) This change adds an index based on HostInfo.remoteIndexId. This allows us to use HostMap.QueryReverseIndex without having to loop over all entries in the map (this can be a bottleneck under high traffic lighthouses). Without this patch, a high traffic lighthouse server receiving recv_error packets and lots of handshakes, cpu pprof trace can look like this: flat flat% sum% cum cum% 2000ms 32.26% 32.26% 3040ms 49.03% github.com/slackhq/nebula.(*HostMap).QueryReverseIndex 870ms 14.03% 46.29% 1060ms 17.10% runtime.mapiternext Which shows 50% of total cpu time is being spent in QueryReverseIndex.	2020-11-23 14:51:16 -05:00
Wade Simmons	2e7ca027a4	Lighthouse handler optimizations (#320) We noticed that the number of memory allocations LightHouse.HandleRequest creates for each call can seriously impact performance for high traffic lighthouses. This PR introduces a benchmark in the first commit and then optimizes memory usage by creating a LightHouseHandler struct. This struct allows us to re-use memory between each lighthouse request (one instance per UDP listener go-routine).	2020-11-23 14:50:01 -05:00
Wade Simmons	aba42f9fa6	enforce the use of goimports (#248) * enforce the use of goimports Instead of enforcing `gofmt`, enforce `goimports`, which also asserts a separate section for non-builtin packages. * run `goimports` everywhere * exclude generated .pb.go files	2020-06-30 18:53:30 -04:00
Wade Simmons	b37a91cfbc	add meta packet statistics (#230) This change adds more metrics around "meta" (non "message" type packets). For lighthouse packets, we also record statistics around the specific lighthouse meta type. We don't keep statistics for the "message" type so that we don't slow down the fast path (and you can just look at metrics on the tun interface to find that information).	2020-06-26 13:45:48 -04:00
Patrick Bogen	ecf0e5a9f6	drop packets even if we aren't going to emit Debug logs about it (#239) * drop packets even if we aren't going to emit Debug logs about it * smallify change	2020-06-10 16:55:49 -05:00
Patrick Bogen	363c836422	log the reason for fw drops (#220) * log the reason for fw drops * only prepare log if we will end up sending it	2020-04-10 10:57:21 -07:00
Wade Simmons	4f6313ebd3	fix config name for {remote,local}_allow_list (#219) This config option should be snake_case, not camelCase.	2020-04-08 16:20:12 -04:00
Wade Simmons	0a474e757b	Add lighthouse.{remoteAllowList,localAllowList} (#217) These settings make it possible to blacklist / whitelist IP addresses that are used for remote connections. `lighthouse.remoteAllowList` filters which remote IPs are allowed when fetching from the lighthouse (or, if you are the lighthouse, which IPs you store and forward to querying hosts). By default, any remote IPs are allowed. You can provide CIDRs here with `true` to allow and `false` to deny. The most specific CIDR rule applies to each remote. If all rules are "allow", the default will be "deny", and vice-versa. If both "allow" and "deny" rules are present, then you MUST set a rule for "0.0.0.0/0" as the default. lighthouse: remoteAllowList: # Example to block IPs from this subnet from being used for remote IPs. "172.16.0.0/12": false # A more complicated example, allow public IPs but only private IPs from a specific subnet "0.0.0.0/0": true "10.0.0.0/8": false "10.42.42.0/24": true `lighthouse.localAllowList` has the same logic as above, but it applies to the local addresses we advertise to the lighthouse. Additionally, you can specify an `interfaces` map of regular expressions to match against interface names. The regexp must match the entire name. All interface rules must be either true or false (and the default rule will be the inverse). CIDR rules are matched after interface name rules. Default is all local IP addresses. lighthouse: localAllowList: # Example to blacklist docker interfaces. interfaces: 'docker.*': false # Example to only advertise IPs in this subnet to the lighthouse. "10.0.0.0/8": true	2020-04-08 15:36:43 -04:00
Wade Simmons	b4f2f7ce4e	log `certName` alongside `vpnIp` (#200) This change adds a new helper, `(*HostInfo).logger()`, that starts a new logrus.Entry with `vpnIp` and `certName`. We don't use the helper inside of handshake_ix though since the certificate has not been attached to the HostInfo yet. Fixes: #84	2020-04-06 11:34:00 -07:00
Ryan Huber	9333a8e3b7	subnet support	2019-12-12 16:34:17 +00:00
Slack Security Team	f22b4b584d	Public Release	2019-11-19 17:00:20 +00:00

46 Commits