Commit Graph

348 Commits

Author SHA1 Message Date
Wade Simmons 6c55d67f18
Refactor handshake_ix (#401)
There are some subtle race conditions with the previous handshake_ix implementation, mostly around collisions with localIndexId. This change refactors it so that we have a "commit" phase during the handshake where we grab the lock for the hostmap and ensure that we have a unique local index before storing it. We also now avoid using the pending hostmap at all for receiving stage1 packets, since we have everything we need to just store the completed handshake.

Co-authored-by: Nate Brown <nbrown.us@gmail.com>
Co-authored-by: Ryan Huber <rhuber@gmail.com>
Co-authored-by: forfuncsake <drussell@slack-corp.com>
2021-03-12 14:16:25 -05:00
Wade Simmons 64d8035d09
fix race in getOrHandshake (#400)
We missed this race with #396 (and I think this is also the crash in
issue #226). We need to lock a little higher in the getOrHandshake
method, before we reset hostinfo.ConnectionInfo. Previously, two
routines could enter this section and confuse the handshake process.

This could result in the other side sending a recv_error that also has
a race with setting hostinfo.ConnectionInfo back to nil. So we make sure
to grab the lock in handleRecvError as well.

Neither of these code paths are in the hot path (handling packets
between two hosts over an active tunnel) so there should be no
performance concerns.
2021-03-09 09:27:02 -05:00
Ryan Huber 73a5ed90b2
Do not allow someone to run a nebula lighthouse with an ephemeral port (#399)
* Do not allow someone to run a nebula lighthouse with an ephemeral port

* derp - we discover the port so we have to check the config setting

* No context needed for this error

* gofmt yourself

* Revert "gofmt yourself"

This reverts commit c01423498e.

* Revert "No context needed for this error"

This reverts commit 6792af6846.

* snip snap snip snap
2021-03-08 12:42:06 -08:00
Wade Simmons d604270966
Fix most known data races (#396)
This change fixes all of the known data races that `make smoke-docker-race` finds, except for one.

Most of these races are around the handshake phase for a hostinfo, so we add a RWLock to the hostinfo and Lock during each of the handshake stages.

Some of the other races are around consistently using `atomic` around the `messageCounter` field. To make this harder to mess up, I have renamed the field to `atomicMessageCounter` (I also removed the unnecessary extra pointer deference as we can just point directly to the struct field).

The last remaining data race is around reading `ConnectionInfo.ready`, which is a boolean that is only written to once when the handshake has finished. Due to it being in the hot path for packets and the rare case that this could actually be an issue, holding off on fixing that one for now.

here is the results of `make smoke-docker-race`:

before:

    lighthouse1: Found 2 data race(s)
    host2:       Found 36 data race(s)
    host3:       Found 17 data race(s)
    host4:       Found 31 data race(s)

after:

    host2: Found 1 data race(s)
    host4: Found 1 data race(s)

Fixes: #147
Fixes: #226
Fixes: #283
Fixes: #316
2021-03-05 21:18:33 -05:00
Nathan Brown 29c5f31f90
Add a check in the makefile to ensure a minimum version of go is installed (#383) 2021-03-02 13:29:05 -06:00
Nathan Brown b6234abfb3
Add a way to trigger punch backs via lighthouse (#394) 2021-03-01 19:06:01 -06:00
Wade Simmons 2a4beb41b9
Routine-local conntrack cache (#391)
Previously, every packet we see gets a lock on the conntrack table and updates it. When running with multiple routines, this can cause heavy lock contention and limit our ability for the threads to run independently. This change caches reads from the conntrack table for a very short period of time to reduce this lock contention. This cache will currently default to disabled unless you are running with multiple routines, in which case the default cache delay will be 1 second. This means that entries in the conntrack table may be up to 1 second out of date and remain in a routine local cache for up to 1 second longer than the global table.

Instead of calling time.Now() for every packet, this cache system relies on a tick thread that updates the current cache "version" each tick. Every packet we check if the cache version is out of date, and reset the cache if so.
2021-03-01 19:52:17 -05:00
Wade Simmons d232ccbfab
add metrics for the udp sockets using SO_MEMINFO (#390)
Retrieve the current socket stats using SO_MEMINFO and report them as
metrics gauges. If SO_MEMINFO isn't supported, we don't report these metrics.
2021-03-01 19:51:33 -05:00
Nathan Brown ecfb40f29c
Fix osx for mq changes, this does not implement mq on osx (#395) 2021-03-01 16:57:05 -05:00
Wade Simmons 1bae5b2550
more validation in pending hostmap deletes (#344)
We are currently seeing some cases where we are not deleting entries
correctly from the pending hostmap. I believe this is a case of
an inbound timer tick firing and deleting the Hosts map entry for
a newer handshake attempt than intended, thus leaving the old Indexes
entry orphaned. This change adds some extra checking when deleteing from
the Indexes and Hosts maps to ensure we clean everything up correctly.
2021-03-01 12:40:46 -05:00
Wade Simmons 73081d99bc
add `make smoke-docker` (#287)
This makes it easier to use the docker container smoke test that
GitHub actions runs. There is also `make smoke-docker-race` that runs the
smoke test with `-race` enabled.
2021-03-01 11:15:15 -05:00
Tim Rots e7e6a23cde
fix a few typos (#302) 2021-03-01 11:14:34 -05:00
Wade Simmons a0583ebdca
tun_disabled: reply to ICMP Echo Request (#342)
This change allows a server running with `tun.disabled: true` (usually
a lighthouse) to still reply to ICMP EchoRequest packets. This allows
you to "ping" the lighthouse Nebula IP as a quick check to make sure the
tunnel is up, even when running with tun.disabled.

This is still gated by allowing `icmp` packets in the inbound firewall
rules.
2021-03-01 11:09:41 -05:00
Wade Simmons 27d9a67dda
Proper multiqueue support for tun devices (#382)
This change is for Linux only.

Previously, when running with multiple tun.routines, we would only have one file descriptor. This change instead sets IFF_MULTI_QUEUE and opens a file descriptor for each routine. This allows us to process with multiple threads while preventing out of order packet reception issues.

To attempt to distribute the flows across the queues, we try to write to the tun/UDP queue that corresponds with the one we read from. So if we read a packet from tun queue "2", we will write the outgoing encrypted packet to UDP queue "2". Because of the nature of how multi queue works with flows, a given host tunnel will be sticky to a given routine (so if you try to performance benchmark by only using one tunnel between two hosts, you are only going to be using a max of one thread for each direction).

Because this system works much better when we can correlate flows between the tun and udp routines, we are deprecating the undocumented "tun.routines" and "listen.routines" parameters and introducing a new "routines" parameter that sets the value for both. If you use the old undocumented parameters, the max of the values will be used and a warning logged.

Co-authored-by: Nate Brown <nbrown.us@gmail.com>
2021-02-25 15:01:14 -05:00
John Maguire 2bce222550
List possible cipher options in example config (#385) 2021-02-19 21:46:42 -06:00
Wade Simmons 3dd1108099
Go 1.16 and darwin-arm64 (#381)
This commit switches to Go 1.16 and adds a release binary for darwin-arm64.

Fixes: #343
2021-02-17 13:11:57 -05:00
Nathan Brown d4b81f9b8d
Add QR code support to `nebula-cert` (#297) 2021-02-11 18:53:25 -06:00
brad-defined 454bc8a6bb
Check certificate banner during nebula-cert print (#373) 2021-02-05 14:52:32 -06:00
Wade Simmons ce9ad37431
fix regression with LightHouseHandler and punchBack (#346)
The change introduced by #320 incorrectly re-uses the output buffer for
sending punchBack packets. Since we are currently spawning a new
goroutine for each send here, we need to allocate a new buffer each
time. We can come back and optimize this in the future, but for now we
should fix the regression.
2020-11-25 17:49:26 -05:00
Wade Simmons ee7c27093c
add HostMap.RemoteIndexes (#329)
This change adds an index based on HostInfo.remoteIndexId. This allows
us to use HostMap.QueryReverseIndex without having to loop over all
entries in the map (this can be a bottleneck under high traffic
lighthouses).

Without this patch, a high traffic lighthouse server receiving recv_error
packets and lots of handshakes, cpu pprof trace can look like this:

      flat  flat%   sum%        cum   cum%
    2000ms 32.26% 32.26%     3040ms 49.03%  github.com/slackhq/nebula.(*HostMap).QueryReverseIndex
     870ms 14.03% 46.29%     1060ms 17.10%  runtime.mapiternext

Which shows 50% of total cpu time is being spent in QueryReverseIndex.
2020-11-23 14:51:16 -05:00
Wade Simmons 2e7ca027a4
Lighthouse handler optimizations (#320)
We noticed that the number of memory allocations LightHouse.HandleRequest creates for each call can seriously impact performance for high traffic lighthouses. This PR introduces a benchmark in the first commit and then optimizes memory usage by creating a LightHouseHandler struct. This struct allows us to re-use memory between each lighthouse request (one instance per UDP listener go-routine).
2020-11-23 14:50:01 -05:00
mhp 672ce1f0a8
Move slice allocations in connection manager monitor loop (#340)
* Move slice allocations in connection manager monitor loop

* move further out

Co-authored-by: Miran Park <mpark@slack-corp.com>
2020-11-19 15:44:05 -08:00
Wade Simmons 384b1166ea
fix panic in UnmarshalNebulaCertificate (#339)
This fixes a panic in UnmarshalNebulaCertificate when unmarshaling
a payload with Details set to nil.

Fixes: #332
2020-11-19 08:44:54 -05:00
Wade Simmons 0389596f66
don't mark handshake packets as "lost" (#331)
Packet 1 is always a stage 1 handshake and packet 2 is always stage 2.
Normal packets don't start flowing until the message counter is 3 or
higher.

Currently we only receive either packet 1 or 2 depending on if
we are the initiator or responder for the handshake, so we end up
marking one of these as "lost". We should mark these packets as "seen"
when we are the one sending them, since we don't expect to see them from
the other side.
2020-11-16 14:03:08 -05:00
Ryan Huber 43a3988afc
i don't think this is used at all anymore (#323) 2020-10-29 21:43:50 -04:00
Brian Kelly 5c23676a0f
Added line to systemd config template to start Nebula before sshd (#317)
During shutdown, this will keep Nebula alive until after sshd is finished. This cleanly terminates ssh clients accessing a server over a Nebula tunnel.
2020-10-29 21:43:02 -04:00
Nathan Brown f6d0b4b893
Update README for supported platforms (#312) 2020-10-12 13:11:32 -05:00
Ryan Huber 0d6b55e495
Bring in the new version of kardianos/service and output logfiles on osx (#303)
* this brings in the new version of kardianos/service which properly
outputs logs from launchd services

* add go sum

* is it really this easy?

* Update CHANGELOG.md
2020-09-24 15:34:08 -07:00
Wade Simmons c71c84882e
v1.3.0 (#268)
Update the CHANGELOG for Nebula v1.3.0

Co-authored-by: forfuncsake <drussell@slack-corp.com>
2020-09-22 12:21:12 -04:00
Darren Hoo 0010db46e4
Fix a data race on message counter (#284)
3. ==================
WARNING: DATA RACE
Write at 0x00c00030e020 by goroutine 17:
  sync/atomic.AddInt64()
      runtime/race_amd64.s:276 +0xb
  github.com/slackhq/nebula.(*Interface).sendNoMetrics()
      github.com/slackhq/nebula/inside.go:226 +0x9c
  github.com/slackhq/nebula.(*Interface).send()
      github.com/slackhq/nebula/inside.go:214 +0x149
  github.com/slackhq/nebula.(*Interface).readOutsidePackets()
      github.com/slackhq/nebula/outside.go:94 +0x1213
  github.com/slackhq/nebula.(*udpConn).ListenOut()
      github.com/slackhq/nebula/udp_generic.go:109 +0x3b5
  github.com/slackhq/nebula.(*Interface).listenOut()
      github.com/slackhq/nebula/interface.go:147 +0x15e

Previous read at 0x00c00030e020 by goroutine 18:
  github.com/slackhq/nebula.(*Interface).consumeInsidePacket()
      github.com/slackhq/nebula/inside.go:58 +0x892
  github.com/slackhq/nebula.(*Interface).listenIn()
      github.com/slackhq/nebula/interface.go:164 +0x178
2020-09-21 21:41:46 -04:00
Nathan Brown 68e3e84fdc
More like a library (#279) 2020-09-18 09:20:09 -05:00
Brian Luong 6238f1550b
Handle panic when invalid IP entered in sshd (#296) 2020-09-18 10:10:25 -04:00
forfuncsake 50b04413c7
Block nebula ssh server from listening on port 22 (#266)
Port 22 is blocked as a safety mechanism. In a case where nebula is
started before sshd, a system may be rendered unreachable if nebula
is holding the system ssh port and there is no other connectivity.

This commit enforces the restriction, which could previously be worked
around by listening on an IPv6 address, e.g.  "[::]:22".
2020-09-15 09:57:32 -04:00
CzBiX ef498a31da
Add disable_timestamp option (#288) 2020-09-09 07:42:11 -04:00
forfuncsake 2e5a477a50
Align linux UDP performance optimizations with configuration (#275)
* Remove unused (*udpConn).Read method

* Align linux UDP performance optimizations with configuration

While attempting to run nebula on an older Synology NAS, it became
apparent that some of the performance optimizations effectively
block support for older kernels. The recvmmsg syscall was added in
linux kernel 2.6.33, and the Synology DS212j (among other models)
is pinned to 2.6.32.12.

Similarly, SO_REUSEPORT was added to the kernel in the 3.9 cycle.
While this option has been backported into some older trees, it
is also missing from the Synology kernel.

This commit allows nebula to be run on linux devices with older
kernels if the config options are set up with a single listener
and a UDP batch size of 1.
2020-08-13 08:24:05 +10:00
Wade Simmons 32fe9bfe75
Use Go 1.15 (#277)
Update all CI checks and release process to use the latest patch version
of go1.15.
2020-08-12 16:16:21 -04:00
forfuncsake 9b8b3c478b
Support startup without a tun device (#269)
This commit adds support for Nebula to be started without creating
a tun device. A node started in this mode still has a full "control
plane", but no effective "data plane". Its use is suited to a
lighthouse that has no need to partake in the mesh VPN.

Consequently, creation of the tun device is the only reason nebula
neesd to be started with elevated privileged, so this example
lighthouse can also be run as a non-root user.
2020-08-10 09:15:55 -04:00
Michael Hardy 7b3f23d9a1
Start nebula after the network is up (#270) 2020-08-07 11:33:48 -05:00
forfuncsake 25964b54f6
Use inclusive terminology for cert blocking (#272) 2020-08-06 11:17:47 +10:00
Wade Simmons ac557f381b
drop unroutable packets (#267)
Currently, if a packet arrives on the tun device with a destination that
is not a routable Nebula IP, `queryUnsafeRoute` converts that IP to
0.0.0.0 and we store that packet and try to look up that IP with the
lighthouse. This doesn't make any sense to do, if we get a packet that
is unroutable we should just drop it.

Note, we have a few configurable options like `drop_local_broadcast`
and `drop_multicast` which do this for a few specific types, but since
no packets like this will send correctly I think we should just drop
anything that is unroutable.
2020-08-04 22:59:04 -04:00
Wade Simmons a54f3fc681
fix fast handshake trigger for static hosts (#265)
We are currently triggering a fast handshake for static hosts right
inside HandshakeManager.AddVpnIP, but this can actually trigger before
we have generated the handshake packet to use. Instead, we should be
triggering right after we call ixHandshakeStage0 in getOrHandshake
(which generates the handshake packet)
2020-08-02 20:59:50 -04:00
Alan Lam 5545cff6ef
log remote certificate fingerprint on handshakes (#262) 2020-07-31 18:54:51 -04:00
Wade Simmons f3a6d8d990
Preserve conntrack table during firewall rules reload (SIGHUP) (#233)
Currently, we drop the conntrack table when firewall rules change during a SIGHUP reload. This means responses to inflight HTTP requests can be dropped, among other issues. This change copies the conntrack table over to the new firewall (it holds the conntrack mutex lock during this process, to be safe).

This change also records which firewall rules hash each conntrack entry used, so that we can re-verify the rules after the new firewall has been loaded.
2020-07-31 18:53:36 -04:00
forfuncsake 9b06748506
Make Interface.Inside an interface type (#252)
This commit updates the Interface.Inside type to be a new interface
type instead of a *Tun. This will allow for an inside interface
that does not use a tun device, such as a single-binary client that
can run without elevated privileges.
2020-07-28 08:53:16 -04:00
Wade Simmons 4756c9613d
trigger handshakes when lighthouse reply arrives (#246)
Currently, we wait until the next timer tick to act on the lighthouse's
reply to our HostQuery. This means we can easily add hundreds of
milliseconds of unnecessary delay to the handshake. To fix this, we
can introduce a channel to trigger an outbound handshake without waiting
for the next timer tick.

A few samples of cold ping time between two hosts that require a
lighthouse lookup:

    before (v1.2.0):

    time=156 ms
    time=252 ms
    time=12.6 ms
    time=301 ms
    time=352 ms
    time=49.4 ms
    time=150 ms
    time=13.5 ms
    time=8.24 ms
    time=161 ms
    time=355 ms

    after:

    time=3.53 ms
    time=3.14 ms
    time=3.08 ms
    time=3.92 ms
    time=7.78 ms
    time=3.59 ms
    time=3.07 ms
    time=3.22 ms
    time=3.12 ms
    time=3.08 ms
    time=8.04 ms

I recommend reviewing this PR by looking at each commit individually, as
some refactoring was required that makes the diff a bit confusing when
combined together.
2020-07-22 10:35:10 -04:00
Nathan Brown 4645e6034b
Fix up the tun for android (#249) 2020-07-01 10:20:52 -05:00
Wade Simmons aba42f9fa6
enforce the use of goimports (#248)
* enforce the use of goimports

Instead of enforcing `gofmt`, enforce `goimports`, which also asserts
a separate section for non-builtin packages.

* run `goimports` everywhere

* exclude generated .pb.go files
2020-06-30 18:53:30 -04:00
Nathan Brown 41578ca971
Be more like a library to support mobile (#247) 2020-06-30 13:48:58 -05:00
Wade Simmons 1ea8847085
linux: set advmss correctly when route MTU is used (#245)
If different mtus are specified for different routes, we should set
advmss on each route because Linux does a poor job of selecting the
default (from ip-route(8)):

    advmss NUMBER (Linux 2.3.15+ only)
           the MSS ('Maximal Segment Size') to advertise to these destinations when estab‐
           lishing TCP connections. If it is not given, Linux uses a default value calcu‐
           lated from the first hop device MTU.  (If the path to these destination is asym‐
           metric, this guess may be wrong.)

Note that the default value is calculated from the first hop *device
MTU*, not the *route MTU*. In practice this is usually ok as long as the
other side of the tunnel has the mtu configured exactly the same, but we
should probably just set advmss correctly on these routes.
2020-06-26 13:47:21 -04:00
Wade Simmons 55858c64cc
smoke test: test firewall inbound / outbound (#240)
Test that basic inbound / outbound firewall rules work during the smoke
test. This change sets an inbound firewall rule on host3, and a new
host4 with outbound firewall rules. It also tests that conntrack allows
packets once the connection has been established.
2020-06-26 13:46:51 -04:00