Commit Graph

17 Commits

Author SHA1 Message Date
Nathan Brown 3ea7e1b75f
Don't use a global logger (#423) 2021-03-26 09:46:30 -05:00
Nathan Brown 7a9f9dbded
Don't craft buffers if we don't need them (#416) 2021-03-22 18:25:06 -05:00
Nathan Brown 7073d204a8
IPv6 support for outside (udp) (#369) 2021-03-18 20:37:24 -05:00
Wade Simmons 6c55d67f18
Refactor handshake_ix (#401)
There are some subtle race conditions with the previous handshake_ix implementation, mostly around collisions with localIndexId. This change refactors it so that we have a "commit" phase during the handshake where we grab the lock for the hostmap and ensure that we have a unique local index before storing it. We also now avoid using the pending hostmap at all for receiving stage1 packets, since we have everything we need to just store the completed handshake.

Co-authored-by: Nate Brown <nbrown.us@gmail.com>
Co-authored-by: Ryan Huber <rhuber@gmail.com>
Co-authored-by: forfuncsake <drussell@slack-corp.com>
2021-03-12 14:16:25 -05:00
Wade Simmons d604270966
Fix most known data races (#396)
This change fixes all of the known data races that `make smoke-docker-race` finds, except for one.

Most of these races are around the handshake phase for a hostinfo, so we add a RWLock to the hostinfo and Lock during each of the handshake stages.

Some of the other races are around consistently using `atomic` around the `messageCounter` field. To make this harder to mess up, I have renamed the field to `atomicMessageCounter` (I also removed the unnecessary extra pointer deference as we can just point directly to the struct field).

The last remaining data race is around reading `ConnectionInfo.ready`, which is a boolean that is only written to once when the handshake has finished. Due to it being in the hot path for packets and the rare case that this could actually be an issue, holding off on fixing that one for now.

here is the results of `make smoke-docker-race`:

before:

    lighthouse1: Found 2 data race(s)
    host2:       Found 36 data race(s)
    host3:       Found 17 data race(s)
    host4:       Found 31 data race(s)

after:

    host2: Found 1 data race(s)
    host4: Found 1 data race(s)

Fixes: #147
Fixes: #226
Fixes: #283
Fixes: #316
2021-03-05 21:18:33 -05:00
Nathan Brown b6234abfb3
Add a way to trigger punch backs via lighthouse (#394) 2021-03-01 19:06:01 -06:00
Wade Simmons 1bae5b2550
more validation in pending hostmap deletes (#344)
We are currently seeing some cases where we are not deleting entries
correctly from the pending hostmap. I believe this is a case of
an inbound timer tick firing and deleting the Hosts map entry for
a newer handshake attempt than intended, thus leaving the old Indexes
entry orphaned. This change adds some extra checking when deleteing from
the Indexes and Hosts maps to ensure we clean everything up correctly.
2021-03-01 12:40:46 -05:00
Tim Rots e7e6a23cde
fix a few typos (#302) 2021-03-01 11:14:34 -05:00
Wade Simmons ee7c27093c
add HostMap.RemoteIndexes (#329)
This change adds an index based on HostInfo.remoteIndexId. This allows
us to use HostMap.QueryReverseIndex without having to loop over all
entries in the map (this can be a bottleneck under high traffic
lighthouses).

Without this patch, a high traffic lighthouse server receiving recv_error
packets and lots of handshakes, cpu pprof trace can look like this:

      flat  flat%   sum%        cum   cum%
    2000ms 32.26% 32.26%     3040ms 49.03%  github.com/slackhq/nebula.(*HostMap).QueryReverseIndex
     870ms 14.03% 46.29%     1060ms 17.10%  runtime.mapiternext

Which shows 50% of total cpu time is being spent in QueryReverseIndex.
2020-11-23 14:51:16 -05:00
Ryan Huber 43a3988afc
i don't think this is used at all anymore (#323) 2020-10-29 21:43:50 -04:00
Wade Simmons b37a91cfbc
add meta packet statistics (#230)
This change add more metrics around "meta" (non "message" type packets).
For lighthouse packets, we also record statistics around the specific
lighthouse meta type.

We don't keep statistics for the "message" type so that we don't slow
down the fast path (and you can just look at metrics on the tun
interface to find that information).
2020-06-26 13:45:48 -04:00
Wade Simmons 0a474e757b
Add lighthouse.{remoteAllowList,localAllowList} (#217)
These settings make it possible to blacklist / whitelist IP addresses
that are used for remote connections.

`lighthouse.remoteAllowList` filters which remote IPs are allow when
fetching from the lighthouse (or, if you are the lighthouse, which IPs
you store and forward to querying hosts). By default, any remote IPs are
allowed. You can provide CIDRs here with `true` to allow and `false` to
deny. The most specific CIDR rule applies to each remote.  If all rules
are "allow", the default will be "deny", and vice-versa. If both "allow"
and "deny" rules are present, then you MUST set a rule for "0.0.0.0/0"
as the default.

    lighthouse:
      remoteAllowList:
        # Example to block IPs from this subnet from being used for remote IPs.
        "172.16.0.0/12": false

        # A more complicated example, allow public IPs but only private IPs from a specific subnet
        "0.0.0.0/0": true
        "10.0.0.0/8": false
        "10.42.42.0/24": true

`lighthouse.localAllowList` has the same logic as above, but it applies
to the local addresses we advertise to the lighthouse. Additionally, you
can specify an `interfaces` map of regular expressions to match against
interface names. The regexp must match the entire name. All interface
rules must be either true or false (and the default rule will be the
inverse). CIDR rules are matched after interface name rules.

Default is all local IP addresses.

    lighthouse:
      localAllowList:
        # Example to blacklist docker interfaces.
        interfaces:
          'docker.*': false

        # Example to only advertise IPs in this subnet to the lighthouse.
        "10.0.0.0/8": true
2020-04-08 15:36:43 -04:00
Wade Simmons b4f2f7ce4e
log `certName` alongside `vpnIp` (#200)
This change adds a new helper, `(*HostInfo).logger()`, that starts a new
logrus.Entry with `vpnIp` and `certName`. We don't use the helper inside
of handshake_ix though since the certificate has not been attached to
the HostInfo yet.

Fixes: #84
2020-04-06 11:34:00 -07:00
Wade Simmons add1b21777
only create a CIDRTree for each host if necessary (#198)
A CIDRTree can be expensive to create, so only do it if we need
it. If the remote host only has one IP address and no subnets, just do
an exact IP match instead.

Fixes: #171
2020-03-02 16:21:33 -05:00
Ryan Huber ad7079d370 make this a warning, even though i believe it is fundamentally an error
(in judgement)
2019-12-13 21:55:01 +00:00
Ryan Huber 9333a8e3b7 subnet support 2019-12-12 16:34:17 +00:00
Slack Security Team f22b4b584d Public Release 2019-11-19 17:00:20 +00:00