Commit Graph

268 Commits

Author SHA1 Message Date
Sean Quah 3ad817bfe5
Fix federated joins when the first server in the list is not in the room (#15074)
Previously we would give up upon receiving a 404 from the first server,
instead of trying the rest of the servers in the list.

Signed-off-by: Sean Quah <seanq@matrix.org>
2023-02-15 13:59:06 +00:00
Sean Quah 0a686d1d13
Faster joins: Refactor handling of servers in room (#14954)
Ensure that the list of servers in a partial state room always contains
the server we joined off.

Also refactor `get_partial_state_servers_at_join` to return `None` when
the given room is no longer partial stated, to explicitly indicate when
the room has partial state. Otherwise it's not clear whether an empty
list means that the room has full state, or the room is partial stated,
but the server we joined off told us that there are no servers in the
room.

Signed-off-by: Sean Quah <seanq@matrix.org>
2023-02-03 15:39:59 +00:00
Patrick Cloke 1182ae5063
Add helper to parse an enum from query args & use it. (#14956)
The `parse_enum` helper pulls an enum value from the query string
(by delegating down to the parse_string helper with values generated
from the enum).

This is used to pull out "f" and "b" in most places and then we thread
the resulting Direction enum throughout more code.
2023-02-01 21:35:24 +00:00
David Robertson 796a4b7482
Prefer `type(x) is int` to `isinstance(x, int)` (#14945)
* Perfer `type(x) is int` to `isinstance(x, int)`

This covered all additional instances I could see where `x` was
user-controlled.
The remaining cases are

```
$ rg -s 'isinstance.*[^_]int'
tests/replication/_base.py
576:        if isinstance(obj, int):

synapse/util/caches/stream_change_cache.py
136:        assert isinstance(stream_pos, int)
214:        assert isinstance(stream_pos, int)
246:        assert isinstance(stream_pos, int)
267:        assert isinstance(stream_pos, int)

synapse/replication/tcp/external_cache.py
133:        if isinstance(result, int):

synapse/metrics/__init__.py
100:        if isinstance(calls, (int, float)):

synapse/handlers/appservice.py
262:        assert isinstance(new_token, int)

synapse/config/_util.py
62:        if isinstance(p, int):
```

which cover metrics, logic related to `jsonschema`, and replication and
data streams. AFAICS these are all internal to Synapse

* Changelog
2023-01-31 10:33:07 +00:00
Sean Quah d329a566df
Faster joins: Fix incompatibility with restricted joins (#14882)
* Avoid clearing out forward extremities when doing a second remote join

When joining a restricted room where the local homeserver does not have
a user able to issue invites, we perform a second remote join. We want
to avoid clearing out forward extremities in this case because the
forward extremities we have are up to date and clearing out forward
extremities creates a window in which the room can get bricked if
Synapse crashes.

Signed-off-by: Sean Quah <seanq@matrix.org>

* Do a full join when doing a second remote join into a full state room

We cannot persist a partial state join event into a joined full state
room, so we perform a full state join for such rooms instead. As a
future optimization, we could always perform a partial state join and
compute or retrieve the full state ourselves if necessary.

Signed-off-by: Sean Quah <seanq@matrix.org>

* Add lock around partial state flag for rooms

Signed-off-by: Sean Quah <seanq@matrix.org>

* Preserve partial state info when doing a second partial state join

Signed-off-by: Sean Quah <seanq@matrix.org>

* Add newsfile

* Add a TODO(faster_joins) marker

Signed-off-by: Sean Quah <seanq@matrix.org>
2023-01-22 19:19:31 +00:00
Sean Quah db5145a31d
Add parameter to control whether we do a partial state join (#14843)
When the local homeserver is already joined to a room and wants to
perform another remote join, we may find it useful to do a non-partial
state join if we already have the full state for the room.

Signed-off-by: Sean Quah <seanq@matrix.org>
2023-01-16 23:15:17 +00:00
David Robertson 85a7a201fa
Also use stable name in SendJoinResponse struct (#14841)
* Also use stable name in SendJoinResponse struct

follow-up to #14832

* Changelog

* Fix a rename I missed

* Run black

* Update synapse/federation/federation_client.py

Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>

Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
2023-01-16 12:40:25 +00:00
Patrick Cloke 9b6224577e
Failover on proper error responses. (#14620)
When querying a remote server handle a 404/405 with an
errcode of M_UNRECOGNIZED as an unimplemented endpoint.
2022-12-06 07:23:03 -05:00
Eric Eastwood 8f10c8b054
Move MSC3030 `/timestamp_to_event` endpoint to stable v1 location (#14471)
Fix https://github.com/matrix-org/synapse/issues/14390

 - Client API: `/_matrix/client/unstable/org.matrix.msc3030/rooms/<roomID>/timestamp_to_event?ts=<timestamp>&dir=<direction>` -> `/_matrix/client/v1/rooms/<roomID>/timestamp_to_event?ts=<timestamp>&dir=<direction>`
 - Federation API: `/_matrix/federation/unstable/org.matrix.msc3030/timestamp_to_event/<roomID>?ts=<timestamp>&dir=<direction>` -> `/_matrix/federation/v1/timestamp_to_event/<roomID>?ts=<timestamp>&dir=<direction>`

Complement test changes: https://github.com/matrix-org/complement/pull/559
2022-11-28 15:54:18 -06:00
David Robertson d4fac8a3e2
Fix typo in #13320 which could cause log spam (#14347) 2022-11-01 19:20:35 +00:00
Eric Eastwood 40fa8294e3
Refactor MSC3030 `/timestamp_to_event` to move away from our snowflake pull from `destination` pattern (#14096)
1. `federation_client.timestamp_to_event(...)` now handles all `destination` looping and uses our generic `_try_destination_list(...)` helper.
 2. Consistently handling `NotRetryingDestination` and `FederationDeniedError` across `get_pdu` , backfill, and the generic `_try_destination_list` which is used for many places we use this pattern.
 3. `get_pdu(...)` now returns `PulledPduInfo` so we know which `destination` we ended up pulling the PDU from
2022-10-26 16:10:55 -05:00
Andrew Morgan 9c23442ac9
Correct field name for stripped state events when knocking. `knock_state_events` -> `knock_room_state` (#14102) 2022-10-12 14:37:20 +01:00
Eric Eastwood 70a4317692
Track when the pulled event signature fails (#13815)
Because we're doing the recording in `_check_sigs_and_hash_for_pulled_events_and_fetch` (previously named `_check_sigs_and_hash_and_fetch`), this means we will track signature failures for `backfill`, `get_room_state`, `get_event_auth`, and `get_missing_events` (all pulled event scenarios). And we also record signature failures from `get_pdu`.

Part of https://github.com/matrix-org/synapse/issues/13700

Part of https://github.com/matrix-org/synapse/issues/13676 and https://github.com/matrix-org/synapse/issues/13356

This PR will be especially important for https://github.com/matrix-org/synapse/pull/13816 so we can avoid the costly `_get_state_ids_after_missing_prev_event` down the line when `/messages` calls backfill.
2022-10-03 14:53:29 -05:00
Denis c802ef1411
Don't include redundant prev_state in new events (#13791) 2022-09-20 09:44:38 +01:00
reivilibre c2fe48a6ff
Rename the `EventFormatVersions` enum values so that they line up with room version numbers. (#13706) 2022-09-07 11:08:20 +01:00
Eric Eastwood 7af07f9716
Instrument `_check_sigs_and_hash_and_fetch` to trace time spent in child concurrent calls (#13588)
Instrument `_check_sigs_and_hash_and_fetch` to trace time spent in child concurrent calls because I've see `_check_sigs_and_hash_and_fetch` take [10.41s to process 100 events](https://github.com/matrix-org/synapse/issues/13587)

Fix https://github.com/matrix-org/synapse/issues/13587

Part of https://github.com/matrix-org/synapse/issues/13356
2022-08-23 21:53:37 -05:00
Eric Eastwood 0a4efbc1dd
Instrument the federation/backfill part of `/messages` (#13489)
Instrument the federation/backfill part of `/messages` so it's easier to follow what's going on in Jaeger when viewing a trace.

Split out from https://github.com/matrix-org/synapse/pull/13440

Follow-up from https://github.com/matrix-org/synapse/pull/13368

Part of https://github.com/matrix-org/synapse/issues/13356
2022-08-16 12:39:40 -05:00
Eric Eastwood 92d21faf12
Instrument `/messages` for understandable traces in Jaeger (#13368)
In Jaeger:

 - Before: huge list of uncategorized database calls
 - After: nice and collapsible into units of work
2022-08-03 10:57:38 -05:00
reivilibre 39be5bc550
Make minor clarifications to the error messages given when we fail to join a room via any server. (#13160) 2022-07-27 10:37:50 +00:00
Eric Eastwood 4f3082d6bf
Fix `get_pdu` asking every remote destination even after it finds an event (#13346) 2022-07-27 10:40:04 +01:00
Eric Eastwood 0f971ca68e
Update `get_pdu` to return the original, pristine `EventBase` (#13320)
Update `get_pdu` to return the untouched, pristine `EventBase` as it was originally seen over federation (no metadata added). Previously, we returned the same `event` reference that we stored in the cache which downstream code modified in place and added metadata like setting it as an `outlier`  and essentially poisoned our cache. Now we always return a copy of the `event` so the original can stay pristine in our cache and re-used for the next cache call.

Split out from https://github.com/matrix-org/synapse/pull/13205

As discussed at:

 - https://github.com/matrix-org/synapse/pull/13205#discussion_r918365746
 - https://github.com/matrix-org/synapse/pull/13205#discussion_r918366125

Related to https://github.com/matrix-org/synapse/issues/12584. This PR doesn't fix that issue because it hits [`get_event` which exists from the local database before it tries to `get_pdu`](7864f33e28/synapse/federation/federation_client.py (L581-L594)).
2022-07-20 15:58:51 -05:00
Patrick Cloke a6895dd576
Add type annotations to `trace` decorator. (#13328)
Functions that are decorated with `trace` are now properly typed
and the type hints for them are fixed.
2022-07-19 14:14:30 -04:00
Patrick Cloke 81608490e3
Stop depending on `room_id` to be returned for children state in the hierarchy response. (#12991)
The `room_id` field was removed from MSC2946 before
it was accepted. It was initially kept for backwards compatibility
and should be removed now that the stable form of the API
is used.

This change only stops Synapse from validating that it is returned,
a future PR will remove returning it as part of the response.
2022-06-10 07:15:51 -04:00
Richard van der Hoff f0aec0abef
Improve logging when signature checks fail (#12925)
* Raise a dedicated `InvalidEventSignatureError` from `_check_sigs_on_pdu`

* Downgrade logging about redactions to DEBUG

this can be very spammy during a room join, and it's not very useful.

* Raise `InvalidEventSignatureError` from `_check_sigs_and_hash`

... and, more importantly, move the logging out to the callers.

* changelog
2022-05-31 23:32:56 +01:00
Sean Quah 2fba1076c5
Faster room joins: Try other destinations when resyncing the state of a partial-state room (#12812)
Signed-off-by: Sean Quah <seanq@matrix.org>
2022-05-31 15:50:29 +01:00
Val Lorentz bf0c3ca20a
Fix inconsistent spelling of 'M_UNRECOGNIZED'. (#12665) 2022-05-09 20:29:07 +00:00
DeepBlueV7.X a377a43386
Support MSC3266 room summaries over federation (#11507)
Signed-off-by: Nicolas Werner <nicolas.werner@hotmail.de>
2022-05-05 15:25:00 +01:00
David Robertson 95a038c106
Unify HTTP query parameter type hints (#12415)
* Pull out query param types to `synapse.http.types`
* Use QueryParams everywhere
* Simplify `encode_query_args`
* Add annotation which would have caught #12410
2022-04-08 13:06:51 +01:00
Patrick Cloke 1103c5fe8a
Check if instances are lists, not sequences. (#12128)
As a str is a sequence, the checks were not granular
enough and would allow lists or strings, when only
lists were valid.
2022-03-02 13:18:51 +00:00
Patrick Cloke 7754af24ab
Remove the unstable `/spaces` endpoint. (#12073)
...and various code supporting it.

The /spaces endpoint was from an old version of MSC2946 and included
both a Client-Server and Server-Server API. Note that the unstable
/hierarchy endpoint (from the final version of MSC2946) is not yet
removed.
2022-02-28 18:33:00 +00:00
Patrick Cloke 9e83521af8
Properly failover for unknown endpoints from Conduit/Dendrite. (#12077)
Before this fix, a legitimate 404 from a federation endpoint (e.g. due
to an unknown room) would be treated as an unknown endpoint. This
could cause unnecessary federation traffic.
2022-02-28 07:52:44 -05:00
Brendan Abolivier 250104d357
Implement account status endpoints (MSC3720) (#12001)
See matrix-org/matrix-doc#3720

Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
2022-02-22 15:10:10 +00:00
Richard van der Hoff 7273011f60
Faster joins: Support for calling `/federation/v1/state` (#12013)
This is an endpoint that we have server-side support for, but no client-side support. It's going to be useful for resyncing partial-stated rooms, so let's introduce it.
2022-02-22 12:17:10 +00:00
Richard van der Hoff 3070af4809
remote join processing: get create event from state, not auth_chain (#12039)
A follow-up to #12005, in which I apparently missed that there are a bunch of other places that assume the create event is in the auth chain.
2022-02-21 19:27:35 +00:00
Richard van der Hoff da0e9f8efd
Faster joins: parse msc3706 fields in send_join response (#12011)
Part of my work on #11249: add code to handle the new fields added in MSC3706.
2022-02-17 16:11:59 +00:00
Sean Quah af13a3be29
Fix a bug that corrupted the cache of federated space hierarchies (#11775)
`FederationClient.get_room_hierarchy()` caches its return values, so
refactor the code to avoid modifying the returned room summary.
2022-01-20 11:03:42 +00:00
Richard van der Hoff 251b5567ec
Remove `log_function` and its uses (#11761)
I've never found this terribly useful. I think it was added in the early days
of Synapse, without much thought as to what would actually be useful to log,
and has just been cargo-culted ever since.

Rather, it tends to clutter up debug logs with useless information.
2022-01-18 13:06:04 +00:00
Richard van der Hoff 0fb3dd0830
Refactor the way we set `outlier` (#11634)
* `_auth_and_persist_outliers`: mark persisted events as outliers

Mark any events that get persisted via `_auth_and_persist_outliers` as, well,
outliers.

Currently this will be a no-op as everything will already be flagged as an
outlier, but I'm going to change that.

* `process_remote_join`: stop flagging as outlier

The events are now flagged as outliers later on, by `_auth_and_persist_outliers`.

* `send_join`: remove `outlier=True`

The events created here are returned in the result of `send_join` to
`FederationHandler.do_invite_join`. From there they are passed into
`FederationEventHandler.process_remote_join`, which passes them to
`_auth_and_persist_outliers`... which sets the `outlier` flag.

* `get_event_auth`: remove `outlier=True`

stop flagging the events returned by `get_event_auth` as outliers. This method
is only called by `_get_remote_auth_chain_for_event`, which passes the results
into `_auth_and_persist_outliers`, which will flag them as outliers.

* `_get_remote_auth_chain_for_event`: remove `outlier=True`

we pass all the events into `_auth_and_persist_outliers`, which will now flag
the events as outliers.

* `_check_sigs_and_hash_and_fetch`: remove unused `outlier` parameter

This param is now never set to True, so we can remove it.

* `_check_sigs_and_hash_and_fetch_one`: remove unused `outlier` param

This is no longer set anywhere, so we can remove it.

* `get_pdu`: remove unused `outlier` parameter

... and chase it down into `get_pdu_from_destination_raw`.

* `event_from_pdu_json`: remove redundant `outlier` param

This is never set to `True`, so can be removed.

* changelog

* update docstring
2022-01-05 12:26:11 +00:00
Richard van der Hoff 878aa55293
`FederationClient.backfill`: stop flagging events as outliers (#11632)
Events returned by `backfill` should not be flagged as outliers.

Fixes:

```
AssertionError: null
  File "synapse/handlers/federation.py", line 313, in try_backfill
    dom, room_id, limit=100, extremities=extremities
  File "synapse/handlers/federation_event.py", line 517, in backfill
    await self._process_pulled_events(dest, events, backfilled=True)
  File "synapse/handlers/federation_event.py", line 642, in _process_pulled_events
    await self._process_pulled_event(origin, ev, backfilled=backfilled)
  File "synapse/handlers/federation_event.py", line 669, in _process_pulled_event
    assert not event.internal_metadata.is_outlier()
```

See https://sentry.matrix.org/sentry/synapse-matrixorg/issues/231992

Fixes #8894.
2022-01-04 16:31:32 +00:00
Patrick Cloke d2279f471b
Add most of the missing type hints to `synapse.federation`. (#11483)
This skips a few methods which are difficult to type.
2021-12-02 16:18:10 +00:00
Eric Eastwood a6f1a3abec
Add MSC3030 experimental client and federation API endpoints to get the closest event to a given timestamp (#9445)
MSC3030: https://github.com/matrix-org/matrix-doc/pull/3030

Client API endpoint. This will also go and fetch from the federation API endpoint if unable to find an event locally or we found an extremity with possibly a closer event we don't know about.
```
GET /_matrix/client/unstable/org.matrix.msc3030/rooms/<roomID>/timestamp_to_event?ts=<timestamp>&dir=<direction>
{
    "event_id": ...
    "origin_server_ts": ...
}
```

Federation API endpoint:
```
GET /_matrix/federation/unstable/org.matrix.msc3030/timestamp_to_event/<roomID>?ts=<timestamp>&dir=<direction>
{
    "event_id": ...
    "origin_server_ts": ...
}
```

Co-authored-by: Erik Johnston <erik@matrix.org>
2021-12-02 01:02:20 -06:00
Patrick Cloke a4521ce0a8
Support the stable /hierarchy endpoint from MSC2946 (#11329)
This also makes additional updates where the implementation
had drifted from the approved MSC.

Unstable endpoints will be removed at a later data.
2021-11-29 14:32:20 -05:00
Eric Eastwood f1d5c2f269
Split out federated PDU retrieval into a non-cached version (#11242)
Context: https://github.com/matrix-org/synapse/pull/11114/files#r741643968
2021-11-09 15:07:57 -06:00
reivilibre 75ca0a6168
Annotate `log_function` decorator (#10943)
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
2021-10-27 17:27:23 +01:00
Patrick Cloke d1bf5f7c9d
Strip "join_authorised_via_users_server" from join events which do not need it. (#10933)
This fixes a "Event not signed by authorising server" error when
transition room member from join -> join, e.g. when updating a
display name or avatar URL for restricted rooms.
2021-09-30 11:13:59 -04:00
Richard van der Hoff 85551b7a85
Factor out common code for persisting fetched auth events (#10896)
* Factor more stuff out of `_get_events_and_persist`

It turns out that the event-sorting algorithm in `_get_events_and_persist` is
also useful in other circumstances. Here we move the current
`_auth_and_persist_fetched_events` to `_auth_and_persist_fetched_events_inner`,
and then factor the sorting part out to `_auth_and_persist_fetched_events`.

* `_get_remote_auth_chain_for_event`: remove redundant `outlier` assignment

`get_event_auth` returns events with the outlier flag already set, so this is
redundant (though we need to update a test where `get_event_auth` is mocked).

* `_get_remote_auth_chain_for_event`: move existing-event tests earlier

Move a couple of tests outside the loop. This is a bit inefficient for now, but
a future commit will make it better. It should be functionally identical.

* `_get_remote_auth_chain_for_event`: use `_auth_and_persist_fetched_events`

We can use the same codepath for persisting the events fetched as part of an
auth chain as for those fetched individually by `_get_events_and_persist` for
building the state at a backwards extremity.

* `_get_remote_auth_chain_for_event`: use a dict for efficiency

`_auth_and_persist_fetched_events` sorts the events itself, so we no longer
need to care about maintaining the ordering from `get_event_auth` (and no
longer need to sort by depth in `get_event_auth`).

That means that we can use a map, making it easier to filter out events we
already have, etc.

* changelog

* `_auth_and_persist_fetched_events`: improve docstring
2021-09-24 11:56:33 +01:00
Patrick Cloke 5548fe0978
Cache the result of fetching the room hierarchy over federation. (#10647) 2021-08-26 07:16:53 -04:00
Patrick Cloke 31dac7ffee
Do not include stack traces for known exceptions when trying multiple federation destinations. (#10662) 2021-08-23 08:00:25 -04:00
Patrick Cloke c4cf0c0473
Attempt to pull from the legacy spaces summary API over federation. (#10583)
If the new /hierarchy API does not exist on all destinations,
fallback to querying the /spaces API and translating the results.

This is a backwards compatibility hack since not all of the
federated homeservers will update at the same time.
2021-08-17 08:19:12 -04:00
Patrick Cloke 7de445161f
Support federation in the new spaces summary API (MSC2946). (#10569) 2021-08-16 08:06:17 -04:00