Commit Graph

1135 Commits

Author SHA1 Message Date
Dirk Klimpel 6edefef602
Add some type hints to datastore (#12717) 2022-05-17 15:29:06 +01:00
Sean Quah 6ee61b9052
Complain if a federation endpoint has the `@cancellable` flag (#12705)
`BaseFederationServlet` wraps its endpoints in a bunch of async code
that has not been vetted for compatibility with cancellation.
Fail CI if a `@cancellable` flag is applied to a federation endpoint.

Signed-off-by: Sean Quah <seanq@element.io>
2022-05-11 14:52:26 +01:00
Val Lorentz bf0c3ca20a
Fix inconsistent spelling of 'M_UNRECOGNIZED'. (#12665) 2022-05-09 20:29:07 +00:00
DeepBlueV7.X a377a43386
Support MSC3266 room summaries over federation (#11507)
Signed-off-by: Nicolas Werner <nicolas.werner@hotmail.de>
2022-05-05 15:25:00 +01:00
Richard van der Hoff d66d68f917
Add extra debug logging to federation sender (#12614)
... in order to debug some problems we've been having with certain events not
being sent when expected.
2022-05-03 16:32:40 +01:00
Richard van der Hoff db2edf5a65
Exclude OOB memberships from the federation sender (#12570)
As the comment says, there is no need to process such events, and indeed we
need to avoid doing so.

Fixes #12509.
2022-05-03 12:47:56 +00:00
David Robertson 6463244375
Remove unused `# type: ignore`s (#12531)
Over time we've begun to use newer versions of mypy, typeshed, stub
packages---and of course we've improved our own annotations. This makes
some type ignore comments no longer necessary. I have removed them.

There was one exception: a module that imports `select.epoll`. The
ignore is redundant on Linux, but I've kept it ignored for those of us
who work on the source tree using not-Linux. (#11771)

I'm more interested in the config line which enforces this. I want
unused ignores to be reported, because I think it's useful feedback when
annotating to know when you've fixed a problem you had to previously
ignore.

* Installing extras before typechecking

Lacking an easy way to install all extras generically, let's bite the bullet and
make install the hand-maintained `all` extra before typechecking.

Now that https://github.com/matrix-org/backend-meta/pull/6 is merged to
the release/v1 branch.
2022-04-27 14:03:44 +01:00
Jan Christian Grünhage a1f87f57ff
Implement MSC3383: include destination in X-Matrix auth header (#11398)
Co-authored-by: Jan Christian Grünhage <jan.christian@gruenhage.xyz>
Co-authored-by: Marcus Hoffmann <bubu@bubu1.eu>
2022-04-19 16:23:53 +01:00
Richard van der Hoff b121a3ad2b
Back out implementation of MSC2314 (#12474)
MSC2314 has now been closed, so we're backing out its implementation, which
originally happened in #6176.

Unfortunately it's not a direct revert, as that PR mixed in a bunch of
unrelated changes to tests etc.
2022-04-19 11:17:29 +00:00
Patrick Cloke 4bdbebccb9
Remove the unstable event field for `/send_join` per MSC3083. (#12395)
This was missed when initially stabilising room version 8 and was
left in as a compatibility shim. Most homeservers have upgraded
to a version which expects the proper field name, and the failure
mode is reasonable (a user on an older server may have to attempt
joining the room twice with an obscure error message the first time).
2022-04-12 11:27:45 -04:00
David Robertson 95a038c106
Unify HTTP query parameter type hints (#12415)
* Pull out query param types to `synapse.http.types`
* Use QueryParams everywhere
* Simplify `encode_query_args`
* Add annotation which would have caught #12410
2022-04-08 13:06:51 +01:00
Erik Johnston 36af768c13
Fix fetching public rooms over federation (#12410)
Broke by #12364
2022-04-07 14:18:02 +00:00
Sean Quah 800ba87cc8
Refactor and convert `Linearizer` to async (#12357)
Refactor and convert `Linearizer` to async. This makes a `Linearizer`
cancellation bug easier to fix.

Also refactor to use an async context manager, which eliminates an
unlikely footgun where code that doesn't immediately use the context
manager could forget to release the lock.

Signed-off-by: Sean Quah <seanq@element.io>
2022-04-05 15:43:52 +01:00
reivilibre 42d8710f38
Fix a spec compliance issue where requests to the `/publicRooms` federation API would specify `limit` as a string. (#12364) 2022-04-05 12:45:36 +01:00
Richard van der Hoff 38adf14998
Enhance logging for inbound federation events (#12301)
It is currently rather hard to see which rooms are causing inbound federation
traffic. Add the room id to the logs.
2022-03-25 14:44:57 +00:00
Richard van der Hoff afa17f0eab
Return a 404 from `/state` for an outlier (#12087)
* Replace `get_state_for_pdu` with  `get_state_ids_for_pdu` and `get_events_as_list`.
* Return a 404 from `/state` and `/state_ids` for an outlier
2022-03-21 11:23:32 +00:00
Patrick Cloke 54f674f7a9
Deprecate the groups/communities endpoints and add an experimental configuration flag. (#12200) 2022-03-12 13:23:37 -05:00
Patrick Cloke 3e4af36bc8
Rename get_tcp_replication to get_replication_command_handler. (#12192)
Since the object it returns is a ReplicationCommandHandler.

This is clean-up from adding support to Redis where the command handler
was added as an additional layer of abstraction from the TCP protocol.
2022-03-10 13:01:56 +00:00
Erik Johnston 423cca9efe
Spread out sending device lists to remote hosts (#12132) 2022-03-04 11:48:15 +00:00
Patrick Cloke 1103c5fe8a
Check if instances are lists, not sequences. (#12128)
As a str is a sequence, the checks were not granular
enough and would allow lists or strings, when only
lists were valid.
2022-03-02 13:18:51 +00:00
Patrick Cloke 7754af24ab
Remove the unstable `/spaces` endpoint. (#12073)
...and various code supporting it.

The /spaces endpoint was from an old version of MSC2946 and included
both a Client-Server and Server-Server API. Note that the unstable
/hierarchy endpoint (from the final version of MSC2946) is not yet
removed.
2022-02-28 18:33:00 +00:00
David Robertson 5565f454e1
Actually fix bad debug logging rejecting device list & signing key transactions (#12098) 2022-02-28 14:10:36 +00:00
Patrick Cloke 9e83521af8
Properly failover for unknown endpoints from Conduit/Dendrite. (#12077)
Before this fix, a legitimate 404 from a federation endpoint (e.g. due
to an unknown room) would be treated as an unknown endpoint. This
could cause unnecessary federation traffic.
2022-02-28 07:52:44 -05:00
Richard van der Hoff e24ff8ebe3
Remove `HomeServer.get_datastore()` (#12031)
The presence of this method was confusing, and mostly present for backwards
compatibility. Let's get rid of it.

Part of #11733
2022-02-23 11:04:02 +00:00
Brendan Abolivier 250104d357
Implement account status endpoints (MSC3720) (#12001)
See matrix-org/matrix-doc#3720

Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
2022-02-22 15:10:10 +00:00
Richard van der Hoff 7273011f60
Faster joins: Support for calling `/federation/v1/state` (#12013)
This is an endpoint that we have server-side support for, but no client-side support. It's going to be useful for resyncing partial-stated rooms, so let's introduce it.
2022-02-22 12:17:10 +00:00
Richard van der Hoff 3070af4809
remote join processing: get create event from state, not auth_chain (#12039)
A follow-up to #12005, in which I apparently missed that there are a bunch of other places that assume the create event is in the auth chain.
2022-02-21 19:27:35 +00:00
Richard van der Hoff a85dde3445
Minor typing fixes (#12034)
These started failing in
https://github.com/matrix-org/synapse/pull/12031... I'm a bit mystified by how
they ever worked.
2022-02-21 18:37:04 +00:00
Richard van der Hoff da0e9f8efd
Faster joins: parse msc3706 fields in send_join response (#12011)
Part of my work on #11249: add code to handle the new fields added in MSC3706.
2022-02-17 16:11:59 +00:00
David Robertson 4ae956c8bb
Use version string helper from matrix-common (#11979)
* Require latest matrix-common
* Use the common function
2022-02-14 13:12:22 +00:00
Richard van der Hoff 63c46349c4
Implement MSC3706: partial state in `/send_join` response (#11967)
* Make `get_auth_chain_ids` return a Set

It has a set internally, and a set is often useful where it gets used, so let's
avoid converting to an intermediate list.

* Minor refactors in `on_send_join_request`

A little bit of non-functional groundwork

* Implement MSC3706: partial state in /send_join response
2022-02-12 10:44:16 +00:00
Richard van der Hoff 964f5b9324
Improve opentracing for federation requests (#11870)
The idea here is to set the parent span for incoming federation requests to the
*outgoing* span on the other end. That means that you can see (most of) the
full end-to-end flow when you have a process that includes federation requests.

However, in order not to lose information, we still want a link to the
`incoming-federation-request` span from the servlet, so we have to create
another span to do exactly that.
2022-02-03 12:29:16 +00:00
David Robertson dd7f825118
Fix losing incoming EDUs if debug logging enabled (#11890)
* Fix losing incoming EDUs if debug logging enabled

Fixes #11889. Homeservers should only be affected if the
`synapse.8631_debug` logger was enabled for DEBUG mode.

I am not sure if this merits a bugfix release: I think the logging can
be disabled in config if anyone is affected? But it is still pretty bad.
2022-02-02 16:25:17 +00:00
Dirk Klimpel 0d6cfea9b8
Add admin API to reset connection timeouts for remote server (#11639)
* Fix get federation status of destination if no error occured
2022-01-25 12:06:29 +00:00
David Robertson f160fe18e3
Debug for device lists updates (#11760)
Debug for #8631.

I'm having a hard time tracking down what's going wrong in that issue.
In the reported example, I could see server A sending federation traffic
to server B and all was well. Yet B reports out-of-sync device updates
from A.

I couldn't see what was _in_ the events being sent from A to B. So I
have added some crude logging to track

- when we have updates to send to a remote HS
- the edus we actually accumulate to send
- when a federation transaction includes a device list update edu
- when such an EDU is received

This is a bit of a sledgehammer.
2022-01-20 13:38:44 +00:00
Sean Quah af13a3be29
Fix a bug that corrupted the cache of federated space hierarchies (#11775)
`FederationClient.get_room_hierarchy()` caches its return values, so
refactor the code to avoid modifying the returned room summary.
2022-01-20 11:03:42 +00:00
Richard van der Hoff 251b5567ec
Remove `log_function` and its uses (#11761)
I've never found this terribly useful. I think it was added in the early days
of Synapse, without much thought as to what would actually be useful to log,
and has just been cargo-culted ever since.

Rather, it tends to clutter up debug logs with useless information.
2022-01-18 13:06:04 +00:00
Patrick Cloke 10a88ba91c
Use auto_attribs/native type hints for attrs classes. (#11692) 2022-01-13 13:49:28 +00:00
Shay 70ce9aea71
Strip unauthorized fields from `unsigned` object in events received over federation (#11530)
* add some tests to verify we are stripping unauthorized fields out of unsigned

* add function to strip unauthorized fields from the unsigned object of event

* newsfragment

* update newsfragment number

* add check to on_send_membership_event

* refactor tests

* fix lint error

* slightly refactor tests and add some comments

* slight refactor

* refactor tests

* fix import error

* slight refactor

* remove unsigned filtration code from synapse/handlers/federation_event.py

* lint

* move unsigned filtering code to event base

* refactor tests

* update newsfragment

* requested changes

* remove unused retun values
2022-01-06 09:09:30 -08:00
Richard van der Hoff 0fb3dd0830
Refactor the way we set `outlier` (#11634)
* `_auth_and_persist_outliers`: mark persisted events as outliers

Mark any events that get persisted via `_auth_and_persist_outliers` as, well,
outliers.

Currently this will be a no-op as everything will already be flagged as an
outlier, but I'm going to change that.

* `process_remote_join`: stop flagging as outlier

The events are now flagged as outliers later on, by `_auth_and_persist_outliers`.

* `send_join`: remove `outlier=True`

The events created here are returned in the result of `send_join` to
`FederationHandler.do_invite_join`. From there they are passed into
`FederationEventHandler.process_remote_join`, which passes them to
`_auth_and_persist_outliers`... which sets the `outlier` flag.

* `get_event_auth`: remove `outlier=True`

stop flagging the events returned by `get_event_auth` as outliers. This method
is only called by `_get_remote_auth_chain_for_event`, which passes the results
into `_auth_and_persist_outliers`, which will flag them as outliers.

* `_get_remote_auth_chain_for_event`: remove `outlier=True`

we pass all the events into `_auth_and_persist_outliers`, which will now flag
the events as outliers.

* `_check_sigs_and_hash_and_fetch`: remove unused `outlier` parameter

This param is now never set to True, so we can remove it.

* `_check_sigs_and_hash_and_fetch_one`: remove unused `outlier` param

This is no longer set anywhere, so we can remove it.

* `get_pdu`: remove unused `outlier` parameter

... and chase it down into `get_pdu_from_destination_raw`.

* `event_from_pdu_json`: remove redundant `outlier` param

This is never set to `True`, so can be removed.

* changelog

* update docstring
2022-01-05 12:26:11 +00:00
reivilibre 84bfe47b01
Re-apply: Move glob_to_regex and re_word_boundary to matrix-python-common #11505 (#11687)
Co-authored-by: Sean Quah <seanq@element.io>
2022-01-05 11:41:49 +00:00
Richard van der Hoff 878aa55293
`FederationClient.backfill`: stop flagging events as outliers (#11632)
Events returned by `backfill` should not be flagged as outliers.

Fixes:

```
AssertionError: null
  File "synapse/handlers/federation.py", line 313, in try_backfill
    dom, room_id, limit=100, extremities=extremities
  File "synapse/handlers/federation_event.py", line 517, in backfill
    await self._process_pulled_events(dest, events, backfilled=True)
  File "synapse/handlers/federation_event.py", line 642, in _process_pulled_events
    await self._process_pulled_event(origin, ev, backfilled=backfilled)
  File "synapse/handlers/federation_event.py", line 669, in _process_pulled_event
    assert not event.internal_metadata.is_outlier()
```

See https://sentry.matrix.org/sentry/synapse-matrixorg/issues/231992

Fixes #8894.
2022-01-04 16:31:32 +00:00
Patrick Cloke cbd82d0b2d
Convert all namedtuples to attrs. (#11665)
To improve type hints throughout the code.
2021-12-30 18:47:12 +00:00
Richard van der Hoff 60fa4935b5
Improve opentracing for incoming HTTP requests (#11618)
* remove `start_active_span_from_request`

Instead, pull out a separate function, `span_context_from_request`, to extract
the parent span, which we can then pass into `start_active_span` as
normal. This seems to be clearer all round.

* Remove redundant tags from `incoming-federation-request`

These are all wrapped up inside a parent span generated in AsyncResource, so
there's no point duplicating all the tags that are set there.

* Leave request spans open until the request completes

It may take some time for the response to be encoded into JSON, and that JSON
to be streamed back to the client, and really we want that inside the top-level
span, so let's hand responsibility for closure to the SynapseRequest.

* opentracing logs for HTTP request events

* changelog
2021-12-20 17:45:03 +00:00
Sean Quah 0147b3de20
Add missing type hints to `synapse.logging.context` (#11556) 2021-12-14 17:35:28 +00:00
Sean Quah 088d748f2c
Revert "Move `glob_to_regex` and `re_word_boundary` to `matrix-python-common` (#11505) (#11527)
This reverts commit a77c369897.
2021-12-07 13:51:11 +00:00
Sean Quah a77c369897
Move `glob_to_regex` and `re_word_boundary` to `matrix-python-common` (#11505) 2021-12-06 11:36:08 +00:00
Patrick Cloke d2279f471b
Add most of the missing type hints to `synapse.federation`. (#11483)
This skips a few methods which are difficult to type.
2021-12-02 16:18:10 +00:00
Eric Eastwood a6f1a3abec
Add MSC3030 experimental client and federation API endpoints to get the closest event to a given timestamp (#9445)
MSC3030: https://github.com/matrix-org/matrix-doc/pull/3030

Client API endpoint. This will also go and fetch from the federation API endpoint if unable to find an event locally or we found an extremity with possibly a closer event we don't know about.
```
GET /_matrix/client/unstable/org.matrix.msc3030/rooms/<roomID>/timestamp_to_event?ts=<timestamp>&dir=<direction>
{
    "event_id": ...
    "origin_server_ts": ...
}
```

Federation API endpoint:
```
GET /_matrix/federation/unstable/org.matrix.msc3030/timestamp_to_event/<roomID>?ts=<timestamp>&dir=<direction>
{
    "event_id": ...
    "origin_server_ts": ...
}
```

Co-authored-by: Erik Johnston <erik@matrix.org>
2021-12-02 01:02:20 -06:00
Patrick Cloke a4521ce0a8
Support the stable /hierarchy endpoint from MSC2946 (#11329)
This also makes additional updates where the implementation
had drifted from the approved MSC.

Unstable endpoints will be removed at a later data.
2021-11-29 14:32:20 -05:00
Patrick Cloke 9d1971a5c4
Return the stable `event` field from `/send_join` per MSC3083. (#11413)
This does not remove the unstable field and still parses both.
Handling of the unstable field will need to be removed in the
future.
2021-11-29 15:43:20 +00:00
Eric Eastwood f1d5c2f269
Split out federated PDU retrieval into a non-cached version (#11242)
Context: https://github.com/matrix-org/synapse/pull/11114/files#r741643968
2021-11-09 15:07:57 -06:00
Erik Johnston 98c8fc6ce8
Handle federation inbound instances being killed more gracefully (#11262)
* Make lock better handle process being killed

If the process gets killed and restarted (so that it didn't have a
chance to drop its locks gracefully) then there may still be locks in
the DB that are for the same instance that haven't yet timed out but are
safe to delete.

We handle this case by a) checking if the current instance already has
taken out the lock, and b) if not then ignoring locks that are for the
same instance.

* Periodically check for old staged events

This is to protect against other instances dying and their locks timing
out.
2021-11-08 09:54:47 +00:00
Nick Barrett af54167516
Enable passing typing stream writers as a list. (#11237)
This makes the typing stream writer config match the other stream writers
that only currently support a single worker.
2021-11-03 14:25:47 +00:00
Shay e81fa92648
Add `use_float=true` to ijson calls in Synapse (#11217)
* add use_float=true to ijson calls

* lints

* add changelog

* Update changelog.d/11217.bugfix

Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>

Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
2021-11-01 09:28:04 -07:00
reivilibre 75ca0a6168
Annotate `log_function` decorator (#10943)
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
2021-10-27 17:27:23 +01:00
Sean Quah 2b82ec425f
Add type hints for most `HomeServer` parameters (#11095) 2021-10-22 18:15:41 +01:00
Patrick Cloke d1bf5f7c9d
Strip "join_authorised_via_users_server" from join events which do not need it. (#10933)
This fixes a "Event not signed by authorising server" error when
transition room member from join -> join, e.g. when updating a
display name or avatar URL for restricted rooms.
2021-09-30 11:13:59 -04:00
Richard van der Hoff 176aa55fd5
add event id to logcontext when handling incoming PDUs (#10936) 2021-09-29 11:59:43 +01:00
Patrick Cloke 94b620a5ed
Use direct references for configuration variables (part 6). (#10916) 2021-09-29 06:44:15 -04:00
Richard van der Hoff 85551b7a85
Factor out common code for persisting fetched auth events (#10896)
* Factor more stuff out of `_get_events_and_persist`

It turns out that the event-sorting algorithm in `_get_events_and_persist` is
also useful in other circumstances. Here we move the current
`_auth_and_persist_fetched_events` to `_auth_and_persist_fetched_events_inner`,
and then factor the sorting part out to `_auth_and_persist_fetched_events`.

* `_get_remote_auth_chain_for_event`: remove redundant `outlier` assignment

`get_event_auth` returns events with the outlier flag already set, so this is
redundant (though we need to update a test where `get_event_auth` is mocked).

* `_get_remote_auth_chain_for_event`: move existing-event tests earlier

Move a couple of tests outside the loop. This is a bit inefficient for now, but
a future commit will make it better. It should be functionally identical.

* `_get_remote_auth_chain_for_event`: use `_auth_and_persist_fetched_events`

We can use the same codepath for persisting the events fetched as part of an
auth chain as for those fetched individually by `_get_events_and_persist` for
building the state at a backwards extremity.

* `_get_remote_auth_chain_for_event`: use a dict for efficiency

`_auth_and_persist_fetched_events` sorts the events itself, so we no longer
need to care about maintaining the ordering from `get_event_auth` (and no
longer need to sort by depth in `get_event_auth`).

That means that we can use a map, making it easier to filter out events we
already have, etc.

* changelog

* `_auth_and_persist_fetched_events`: improve docstring
2021-09-24 11:56:33 +01:00
Patrick Cloke 47854c71e9
Use direct references for configuration variables (part 4). (#10893) 2021-09-23 12:03:01 -04:00
Andrew Morgan aa2c027792
Remove unnecessary parentheses around tuples returned from methods (#10889) 2021-09-23 11:59:07 +01:00
Patrick Cloke 8c7a531e27
Use direct references for some configuration variables (part 2) (#10812) 2021-09-15 08:34:52 -04:00
Patrick Cloke 01c88a09cd
Use direct references for some configuration variables (#10798)
Instead of proxying through the magic getter of the RootConfig
object. This should be more performant (and is more explicit).
2021-09-13 13:07:12 -04:00
reivilibre 524b8ead77
Add types to synapse.util. (#10601) 2021-09-10 17:03:18 +01:00
Richard van der Hoff 1800aabfc2
Split `FederationHandler` in half (#10692)
The idea here is to take anything to do with incoming events and move it out to a separate handler, as a way of making FederationHandler smaller.
2021-08-26 21:41:44 +01:00
Patrick Cloke 5548fe0978
Cache the result of fetching the room hierarchy over federation. (#10647) 2021-08-26 07:16:53 -04:00
Patrick Cloke 31dac7ffee
Do not include stack traces for known exceptions when trying multiple federation destinations. (#10662) 2021-08-23 08:00:25 -04:00
Richard van der Hoff e81d62009e
Split `on_receive_pdu` in half (#10640)
Here we split on_receive_pdu into two functions (on_receive_pdu and process_pulled_event), rather than having both cases in the same method. There's a tiny bit of overlap, but not that much.
2021-08-19 17:05:12 +00:00
Patrick Cloke c4cf0c0473
Attempt to pull from the legacy spaces summary API over federation. (#10583)
If the new /hierarchy API does not exist on all destinations,
fallback to querying the /spaces API and translating the results.

This is a backwards compatibility hack since not all of the
federated homeservers will update at the same time.
2021-08-17 08:19:12 -04:00
Patrick Cloke 5af83efe8d
Validate the max_rooms_per_space parameter to ensure it is non-negative. (#10611) 2021-08-16 12:01:30 -04:00
Michael Telatynski 0ace38b7b3
Experimental support for MSC3266 Room Summary API. (#10394) 2021-08-16 14:49:12 +00:00
Patrick Cloke 87b62f8bb2
Split `synapse.federation.transport.server` into multiple files. (#10590) 2021-08-16 10:14:31 -04:00
Richard van der Hoff 2d9ca4ca77
Clean up some logging in the federation event handler (#10591)
* Include outlier status in `str(event)`

In places where we log event objects, knowing whether or not you're dealing
with an outlier is super useful.

* Remove duplicated logging in get_missing_events

When we process events received from get_missing_events, we log them twice
(once in `_get_missing_events_for_pdu`, and once in `on_receive_pdu`). Reduce
the duplication by removing the logging in `on_receive_pdu`, and ensuring the
call sites do sensible logging.

* log in `on_receive_pdu` when we already have the event

* Log which prev_events we are missing

* changelog
2021-08-16 13:19:02 +01:00
Patrick Cloke 7de445161f
Support federation in the new spaces summary API (MSC2946). (#10569) 2021-08-16 08:06:17 -04:00
Patrick Cloke c12b5577f2
Fix a harmless exception when the staged events queue is empty. (#10592) 2021-08-13 11:49:06 +00:00
Patrick Cloke 1de26b3467
Convert Transaction and Edu object to attrs (#10542)
Instead of wrapping the JSON into an object, this creates concrete
instances for Transaction and Edu. This allows for improved type
hints and simplified code.
2021-08-06 09:39:59 -04:00
Erik Johnston 60f0534b6e
Fix exceptions in logs when failing to get remote room list (#10541) 2021-08-06 14:05:41 +01:00
Patrick Cloke 3b354faad0
Refactoring before implementing the updated spaces summary. (#10527)
This should have no user-visible changes, but refactors some pieces of
the SpaceSummaryHandler before adding support for the updated
MSC2946.
2021-08-05 12:39:17 +00:00
Erik Johnston 01d45fe964
Prune inbound federation queues if they get too long (#10390) 2021-08-02 13:37:25 +00:00
Patrick Cloke 3a541a7daa
Improve failover logic for MSC3083 restricted rooms. (#10447)
If the federation client receives an M_UNABLE_TO_AUTHORISE_JOIN or
M_UNABLE_TO_GRANT_JOIN response it will attempt another server
before giving up completely.
2021-07-29 11:50:14 +00:00
Patrick Cloke 228decfce1
Update the MSC3083 support to verify if joins are from an authorized server. (#10254) 2021-07-26 12:17:00 -04:00
Patrick Cloke 4fb92d93ea
Add type hints to synapse.federation.transport.client. (#10408) 2021-07-26 11:53:09 -04:00
Patrick Cloke 590cc4e888
Add type hints to additional servlet functions (#10437)
Improves type hints for:

* parse_{boolean,integer}
* parse_{boolean,integer}_from_args
* parse_json_{value,object}_from_request

And fixes any incorrect calls that resulted from unknown types.
2021-07-21 18:12:22 +00:00
Patrick Cloke d427f64724
Do not include signatures/hashes in make_{join,leave,knock} responses. (#10404)
These signatures would end up invalid since the joining/leaving/knocking
server would modify the response before calling send_{join,leave,knock}.
2021-07-16 10:36:38 -04:00
Erik Johnston ac5c221208
Stagger send presence to remotes (#10398)
This is to help with performance, where trying to connect to thousands
of hosts at once can consume a lot of CPU (due to TLS etc).

Co-authored-by: Brendan Abolivier <babolivier@matrix.org>
2021-07-15 11:52:56 +01:00
Jonathan de Jong bf72d10dbf
Use inline type hints in various other places (in `synapse/`) (#10380) 2021-07-15 11:02:43 +01:00
Patrick Cloke 30b56f6925
Add type hints to get_domain_from_id and get_localpart_from_id. (#10385) 2021-07-13 12:08:47 -04:00
Erik Johnston 1579fdd54a
Ensure we always drop the federation inbound lock (#10336) 2021-07-09 10:16:54 +01:00
Erik Johnston c65067d673
Handle old staged inbound events (#10303)
We might have events in the staging area if the service was restarted while there were unhandled events in the staging area.

Fixes #10295
2021-07-06 13:02:37 +01:00
Patrick Cloke 8d609435c0
Move methods involving event authentication to EventAuthHandler. (#10268)
Instead of mixing them with user authentication methods.
2021-07-01 14:25:37 -04:00
Erik Johnston 329ef5c715
Fix the inbound PDU metric (#10279)
This broke in #10272
2021-06-30 12:07:16 +01:00
Richard van der Hoff 785bceef72 Merge branch 'release-v1.37' into develop 2021-06-29 20:25:47 +01:00
Erik Johnston c54db67d0e
Handle inbound events from federation asynchronously (#10272)
Fixes #9490

This will break a couple of SyTest that are expecting failures to be added to the response of a federation /send, which obviously doesn't happen now that things are asynchronous.

Two drawbacks:

    Currently there is no logic to handle any events left in the staging area after restart, and so they'll only be handled on the next incoming event in that room. That can be fixed separately.
    We now only process one event per room at a time. This can be fixed up further down the line.
2021-06-29 19:55:22 +01:00
Richard van der Hoff a0ed0f363e
Soft-fail spammy events received over federation (#10263) 2021-06-29 11:08:06 +01:00
Patrick Cloke 0555d7b0dc
Add additional types to the federation transport server. (#10213) 2021-06-28 07:36:41 -04:00
Richard van der Hoff 6e8fb42be7
Improve validation for `send_{join,leave,knock}` (#10225)
The idea here is to stop people sending things that aren't joins/leaves/knocks through these endpoints: previously you could send anything you liked through them. I wasn't able to find any security holes from doing so, but it doesn't sound like a good thing.
2021-06-24 15:30:49 +01:00
Richard van der Hoff 91fa9cca99
Expose opentracing trace id in response headers (#10199)
Fixes: #9480
2021-06-18 11:43:22 +01:00
Patrick Cloke 9e5ab6dd58
Remove the experimental flag for knocking and use stable prefixes / endpoints. (#10167)
* Room version 7 for knocking.
* Stable prefixes and endpoints (both client and federation) for knocking.
* Removes the experimental configuration flag.
2021-06-15 07:45:14 -04:00
Sorunome d936371b69
Implement knock feature (#6739)
This PR aims to implement the knock feature as proposed in https://github.com/matrix-org/matrix-doc/pull/2403

Signed-off-by: Sorunome mail@sorunome.de
Signed-off-by: Andrew Morgan andrewm@element.io
2021-06-09 19:39:51 +01:00
Patrick Cloke c7f3fb2745
Add type hints to the federation server transport. (#10080) 2021-06-08 11:19:25 -04:00
Erik Johnston c842c581ed
When joining a remote room limit the number of events we concurrently check signatures/hashes for (#10117)
If we do hundreds of thousands at once the memory overhead can easily reach 500+ MB.
2021-06-08 11:07:46 +01:00
Erik Johnston fc3d2dc269
Rewrite the KeyRing (#10035) 2021-06-02 16:37:59 +01:00
Andrew Morgan 3ff6fe2851 Merge branch 'master' into develop 2021-06-01 13:47:27 +01:00
Erik Johnston 84cf3e47a0
Allow response of `/send_join` to be larger. (#10093)
Fixes #10087.
2021-05-28 16:28:01 +01:00
Richard van der Hoff ed53bf314f
Set opentracing priority before setting other tags (#10092)
... because tags on spans which aren't being sampled get thrown away.
2021-05-28 16:14:08 +01:00
Erik Johnston 8e132fe64e Synapse 1.35.0rc2 (2021-05-27)
==============================
 
 Bugfixes
 --------
 
 - Fix a bug introduced in v1.35.0rc1 when calling the spaces summary API via a GET request. ([\#10079](https://github.com/matrix-org/synapse/issues/10079))
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEBTGR3/RnAzBGUif3pULk7RsPrAkFAmCvpZMQHGVyaWtAbWF0
 cml4Lm9yZwAKCRClQuTtGw+sCdPwCACQIlIWd6eIoXLUc+wDcrd+k5xL376EdYah
 x7ABswiYSm+9C4xr58gJD3xc6eiD2PCIWdZN0rsQDLIOfSXW6x1lyKD+Ds0HySok
 MaVpsoxbb9o/Zf9qtXF2bLSArZUQwfoNaA45NLgNzfUIijf1e+bd2wNEgHlRSoMz
 m10GggOFU0Ds/CCYZpxZw/eXDbLWL7eaHR30vw/jQ1cEsV+S4ucnUmLHFCF7YCyI
 Np80pnywH5cYKerecldFWenL4YZJswVbx+AW9e3lBzq5jOrRZkmLkaHg10mCR2f6
 CV03ie65Ce+7x5UU6v6nHA0DYUTQGIjlJBtyCN3tFglUduQ8Gpu0
 =Mfsf
 -----END PGP SIGNATURE-----

Merge tag 'v1.35.0rc2' into develop

Synapse 1.35.0rc2 (2021-05-27)
==============================

Bugfixes
--------

- Fix a bug introduced in v1.35.0rc1 when calling the spaces summary API via a GET request. ([\#10079](https://github.com/matrix-org/synapse/issues/10079))
2021-05-27 14:59:46 +01:00
Patrick Cloke 8e15c92c2f
Pass the origin when calculating the spaces summary over GET. (#10079)
Fixes a bug due to conflicting PRs which were merged. (One added a new caller to
a method, the other added a new parameter to the same method.)
2021-05-27 08:52:28 -04:00
Patrick Cloke f42e4c4eb9
Remove the experimental spaces enabled flag. (#10063)
In lieu of just always enabling the unstable spaces endpoint and
unstable room version.
2021-05-26 14:35:16 -04:00
Erik Johnston 3e831f24ff
Don't hammer the database for destination retry timings every ~5mins (#10036) 2021-05-21 17:57:08 +01:00
Erik Johnston 1c6a19002c
Add `Keyring.verify_events_for_server` and reduce memory usage (#10018)
Also add support for giving a callback to generate the JSON object to
verify. This should reduce memory usage, as we no longer have the event
in memory in dict form (which has a large memory footprint) for extend
periods of time.
2021-05-20 16:25:11 +01:00
Erik Johnston 64887f06fc
Use ijson to parse the response to `/send_join`, reducing memory usage. (#9958)
Instead of parsing the full response to `/send_join` into Python objects (which can be huge for large rooms) and *then* parsing that into events, we instead use ijson to stream parse the response directly into `EventBase` objects.
2021-05-20 16:11:48 +01:00
Patrick Cloke 551d2c3f4b
Allow a user who could join a restricted room to see it in spaces summary. (#9922)
This finishes up the experimental implementation of MSC3083 by showing
the restricted rooms in the spaces summary (from MSC2946).
2021-05-20 11:10:36 -04:00
Patrick Cloke f4833e0c06
Support fetching the spaces summary via GET over federation. (#9947)
Per changes in MSC2946, the C-S and S-S APIs for spaces summary
should use GET requests.

Until this is stable, the POST endpoints still exist.

This does not switch federation requests to use the GET version yet
since it is newly added and already deployed servers might not support
it. When switching to the stable endpoint we should switch to GET
requests.
2021-05-11 12:21:43 -04:00
Richard van der Hoff b378d98c8f
Add debug logging for issue #9533 (#9959)
Hopefully this will help us track down where to-device messages are getting
lost/delayed.
2021-05-11 11:04:03 +01:00
Richard van der Hoff 7967b36efe
Fix `m.room_key_request` to-device messages (#9961)
fixes #9960
2021-05-11 11:02:56 +01:00
Andrew Morgan 4e0fd35bc9 Revert "Experimental Federation Speedup (#9702)"
This reverts commit 05e8c70c05.
2021-04-28 11:38:33 +01:00
Patrick Cloke 1350b053da
Pass errors back to the client when trying multiple federation destinations. (#9868)
This ensures that something like an auth error (403) will be
returned to the requester instead of attempting to try more
servers, which will likely result in the same error, and then
passing back a generic 400 error.
2021-04-27 07:30:34 -04:00
Richard van der Hoff 294c675033
Remove `synapse.types.Collection` (#9856)
This is no longer required, since we have dropped support for Python 3.5.
2021-04-22 16:43:50 +01:00
Erik Johnston db70435de7
Fix bug where we sent remote presence states to remote servers (#9850) 2021-04-20 13:37:54 +01:00
Jonathan de Jong 495b214f4f
Fix (final) Bugbear violations (#9838) 2021-04-20 11:50:49 +01:00
Erik Johnston 2b7dd21655
Don't send normal presence updates over federation replication stream (#9828) 2021-04-19 10:50:49 +01:00
Richard van der Hoff 5a153772c1
remove `HomeServer.get_config` (#9815)
Every single time I want to access the config object, I have to remember
whether or not we use `get_config`. Let's just get rid of it.
2021-04-14 19:09:08 +01:00
Jonathan de Jong 05e8c70c05
Experimental Federation Speedup (#9702)
This basically speeds up federation by "squeezing" each individual dual database call (to destinations and destination_rooms), which previously happened per every event, into one call for an entire batch (100 max).

Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>
2021-04-14 17:19:02 +01:00
Jonathan de Jong 4b965c862d
Remove redundant "coding: utf-8" lines (#9786)
Part of #9744

Removes all redundant `# -*- coding: utf-8 -*-` lines from files, as python 3 automatically reads source code as utf-8 now.

`Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>`
2021-04-14 15:34:27 +01:00
Richard van der Hoff f946450184
Fix duplicate logging of exceptions in transaction processing (#9780)
There's no point logging this twice.
2021-04-09 18:12:15 +01:00
Jonathan de Jong 2ca4e349e9
Bugbear: Add Mutable Parameter fixes (#9682)
Part of #9366

Adds in fixes for B006 and B008, both relating to mutable parameter lint errors.

Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>
2021-04-08 22:38:54 +01:00
Erik Johnston 3a569fb200 Fix sharded federation sender sometimes using 100% CPU.
We pull all destinations requiring catchup from the DB in batches.
However, if all those destinations get filtered out (due to the
federation sender being sharded), then the `last_processed` destination
doesn't get updated, and we keep requesting the same set repeatedly.
2021-04-08 17:34:07 +01:00
Andrew Morgan 04819239ba
Add a Synapse Module for configuring presence update routing (#9491)
At the moment, if you'd like to share presence between local or remote users, those users must be sharing a room together. This isn't always the most convenient or useful situation though.

This PR adds a module to Synapse that will allow deployments to set up extra logic on where presence updates should be routed. The module must implement two methods, `get_users_for_states` and `get_interested_users`. These methods are given presence updates or user IDs and must return information that Synapse will use to grant passing presence updates around.

A method is additionally added to `ModuleApi` which allows triggering a set of users to receive the current, online presence information for all users they are considered interested in. This is the equivalent of that user receiving presence information during an initial sync. 

The goal of this module is to be fairly generic and useful for a variety of applications, with hard requirements being:

* Sending state for a specific set or all known users to a defined set of local and remote users.
* The ability to trigger an initial sync for specific users, so they receive all current state.
2021-04-06 14:38:30 +01:00
Patrick Cloke 44bb881096
Add type hints to expiring cache. (#9730) 2021-04-06 08:58:18 -04:00
Patrick Cloke d959d28730
Add type hints to the federation handler and server. (#9743) 2021-04-06 07:21:57 -04:00
Erik Johnston 33548f37aa
Improve tracing for to device messages (#9686) 2021-04-01 17:08:21 +01:00
Erik Johnston 963f4309fe
Make RateLimiter class check for ratelimit overrides (#9711)
This should fix a class of bug where we forget to check if e.g. the appservice shouldn't be ratelimited.

We also check the `ratelimit_override` table to check if the user has ratelimiting disabled. That table is really only meant to override the event sender ratelimiting, so we don't use any values from it (as they might not make sense for different rate limits), but we do infer that if ratelimiting is disabled for the user we should disabled all ratelimits.

Fixes #9663
2021-03-30 12:06:09 +01:00
Patrick Cloke da75d2ea1f
Add type hints for the federation sender. (#9681)
Includes an abstract base class which both the FederationSender
and the FederationRemoteSendQueue must implement.
2021-03-29 11:43:20 -04:00
Erik Johnston c602ba8336
Fixed undefined variable error in catchup (#9664)
Broke in #9640

Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
2021-03-24 16:12:47 +00:00
Richard van der Hoff c73cc2c2ad
Spaces summary: call out to other servers (#9653)
When we hit an unknown room in the space tree, see if there are other servers that we might be able to poll to get the data.

Fixes: #9447
2021-03-24 12:45:39 +00:00
Richard van der Hoff 4ecba9bd5c
Federation API for Space summary (#9652)
Builds on the work done in #9643 to add a federation API for space summaries.

There's a bit of refactoring of the existing client-server code first, to avoid too much duplication.
2021-03-23 11:51:12 +00:00
Patrick Cloke b7748d3c00
Import HomeServer from the proper module. (#9665) 2021-03-23 07:12:48 -04:00
Erik Johnston dd71eb0f8a
Make federation catchup send last event from any server. (#9640)
Currently federation catchup will send the last *local* event that we
failed to send to the remote. This can cause issues for large rooms
where lots of servers have sent events while the remote server was down,
as when it comes back up again it'll be flooded with events from various
points in the DAG.

Instead, let's make it so that all the servers send the most recent
events, even if its not theirs. The remote should deduplicate the
events, so there shouldn't be much overhead in doing this.
Alternatively, the servers could only send local events if they were
also extremities and hope that the other server will send the event
over, but that is a bit risky.
2021-03-18 15:52:26 +00:00
Erik Johnston 026503fa3b
Don't go into federation catch up mode so easily (#9561)
Federation catch up mode is very inefficient if the number of events
that the remote server has missed is small, since handling gaps can be
very expensive, c.f. #9492.

Instead of going into catch up mode whenever we see an error, we instead
do so only if we've backed off from trying the remote for more than an
hour (the assumption being that in such a case it is more than a
transient failure).
2021-03-15 14:42:40 +00:00
Patrick Cloke 55da8df078
Fix additional type hints from Twisted 21.2.0. (#9591) 2021-03-12 11:37:57 -05:00
Richard van der Hoff 1e67bff833
Reject concurrent transactions (#9597)
If more transactions arrive from an origin while we're still processing the
first one, reject them.

Hopefully a quick fix to https://github.com/matrix-org/synapse/issues/9489
2021-03-12 15:14:55 +00:00
Richard van der Hoff 2b328d7e02
Improve logging when processing incoming transactions (#9596)
Put the room id in the logcontext, to make it easier to understand what's going on.
2021-03-12 15:08:03 +00:00
Patrick Cloke 2a99cc6524
Use the chain cover index in get_auth_chain_ids. (#9576)
This uses a simplified version of get_chain_cover_difference to calculate
auth chain of events.
2021-03-10 09:57:59 -05:00
Patrick Cloke 7fdc6cefb3
Fix additional type hints. (#9543)
Type hint fixes due to Twisted 21.2.0 adding type hints.
2021-03-09 07:41:32 -05:00
Jonathan de Jong d6196efafc
Add ResponseCache tests. (#9458) 2021-03-08 14:00:07 -05:00
Richard van der Hoff 8a4b3738f3
Replace `last_*_pdu_age` metrics with timestamps (#9540)
Following the advice at
https://prometheus.io/docs/practices/instrumentation/#timestamps-not-time-since,
it's preferable to export unix timestamps, not ages.

There doesn't seem to be any particular naming convention for timestamp
metrics.
2021-03-04 16:40:18 +00:00
Patrick Cloke fc8b3d8809
Ratelimit cross-user key sharing requests. (#8957) 2021-02-19 13:20:34 -05:00
Andrew Morgan 8bcfc2eaad
Be smarter about which hosts to send presence to when processing room joins (#9402)
This PR attempts to eliminate unnecessary presence sending work when your local server joins a room, or when a remote server joins a room your server is participating in by processing state deltas in chunks rather than individually.

---

When your server joins a room for the first time, it requests the historical state as well. This chunk of new state is passed to the presence handler which, after filtering that state down to only membership joins, will send presence updates to homeservers for each join processed.

It turns out that we were being a bit naive and processing each event individually, and sending out presence updates for every one of those joins. Even if many different joins were users on the same server (hello IRC bridges), we'd send presence to that same homeserver for every remote user join we saw.

This PR attempts to deduplicate all of that by processing the entire batch of state deltas at once, instead of only doing each join individually. We process the joins and note down which servers need which presence:

* If it was a local user join, send that user's latest presence to all servers in the room
* If it was a remote user join, send the presence for all local users in the room to that homeserver

We deduplicate by inserting all of those pending updates into a dictionary of the form:

```
{
  server_name1: {presence_update1, ...},
  server_name2: {presence_update1, presence_update2, ...}
}
```

Only after building this dict do we then start sending out presence updates.
2021-02-19 11:37:29 +00:00