Commit Graph

15367 Commits

Author SHA1 Message Date
Erik Johnston 79d2e2e79c
Speed up membership queries for users with forgotten rooms (#15385) 2023-04-04 14:11:34 +01:00
Sean Quah 89a71e7390
Fix a rare bug where initial /syncs would fail (#15383)
This change fixes a rare bug where initial /syncs would fail with a
`KeyError` under the following circumstances:
 1. A user fast joins a remote room.
 2. The user is kicked from the room before the room's full state has
    been synced.
 3. A second local user fast joins the room.
 4. Events are backfilled into the room with a higher topological
    ordering than the original user's leave. They are assigned a
    negative stream ordering. It's not clear how backfill happened here,
    since it is expected to be equivalent to syncing the full state.
 5. The second local user leaves the room before the room's full state
    has been synced. The homeserver does not complete the sync.
 6. The original user performs an initial /sync with lazy_load_members
    enabled.
     * Because they were kicked from the room, the room is included in
       the /sync response even though the include_leave option is not
       specified.
     * To populate the room's timeline, `_load_filtered_recents` /
       `get_recent_events_for_room` fetches events with a lower stream
       ordering than the leave event and picks the ones with the highest
       topological orderings (which are most recent). This captures the
       backfilled events after the leave, since they have a negative
       stream ordering. These events are filtered out of the timeline,
       since the user was not in the room at the time and cannot view
       them. The sync code ends up with an empty timeline for the room
       that notably does not include the user's leave event.
       This seems buggy, but at least we don't disclose events the user
       isn't allowed to see.
     * Normally, `compute_state_delta` would fetch the state at the
       start and end of the room's timeline to generate the sync
       response. Since the timeline is empty, it fetches the state at
       `min(now, last event in the room)`, which corresponds with the
       second user's leave. The state during the entirety of the second
       user's membership does not include the membership for the first
       user because of partial state.
       This part is also questionable, since we are fetching state from
       outside the bounds of the user's membership.
     * `compute_state_delta` then tries and fails to find the user's
       membership in the auth events of timeline events. Because there
       is no timeline event whose auth events are expected to contain
       the user's membership, a `KeyError` is raised.

Also contains a drive-by fix for a separate unlikely race condition.

Signed-off-by: Sean Quah <seanq@matrix.org>
2023-04-04 13:10:25 +01:00
Patrick Cloke cf2f2934ad
Call appservices on modern paths, falling back to legacy paths. (#15317)
This uses the specced /_matrix/app/v1/... paths instead of the
"legacy" paths. If the homeserver receives an error it will retry
using the legacy path.
2023-04-03 13:20:32 -04:00
Jason Little 56efa9b167
Experimental Unix socket support (#15353)
* Add IReactorUNIX to ISynapseReactor type hint.

* Create listen_unix().

Two options, 'path' to the file and 'mode' of permissions(not umask, recommend 666 as default as
nginx/other reverse proxies write to it and it's setup as user www-data)

For the moment, leave the option to always create a PID lockfile turned on by default

* Create UnixListenerConfig and wire it up.

Rename ListenerConfig to TCPListenerConfig, then Union them together into ListenerConfig.
This spidered around a bit, but I think I got it all. Metrics and manhole have been placed
behind a conditional in case of accidental putting them onto a unix socket.

Use new helpers to get if a listener is configured for TLS, and to help create a site tag
for logging.

There are 2 TODO things in parse_listener_def() to finish up at a later point.

* Refactor SynapseRequest to handle logging correctly when using a unix socket.

This prevents an exception when an IP address can not be retrieved for a request.

* Make the 'Synapse now listening on Unix socket' log line a little prettier.

* No silent failures on generic workers when trying to use a unix socket with metrics or manhole.

* Inline variables in app/_base.py

* Update docstring for listen_unix() to remove reference to a hardcoded permission of 0o666 and add a few comments saying where the default IS declared.

* Disallow both a unix socket and a ip/port combo on the same listener resource

* Linting

* Changelog

* review: simplify how listen_unix returns(and get rid of a type: ignore)

* review: fix typo from ConfigError in app/homeserver.py

* review: roll conditional for http_options.tag into get_site_tag() helper(and add docstring)

* review: enhance the conditionals for checking if a port or path is valid, remove a TODO line

* review: Try updating comment in get_client_ip_if_available to clarify what is being retrieved and why

* Pretty up how 'Synapse now listening on Unix Socket' looks by decoding the byte string.

* review: In parse_listener_def(), raise ConfigError if neither socket_path nor port is declared(and fix a typo)
2023-04-03 10:27:51 +01:00
Jason Robinson 157092d97a
Fix copyright year in SSO footer template (#15358) 2023-03-31 18:20:40 +01:00
Erik Johnston 6204c3663e
Revert pruning of old devices (#15360)
* Revert "Fix registering a device on an account with lots of devices (#15348)"

This reverts commit f0d8f66eaa.

* Revert "Delete stale non-e2e devices for users, take 3 (#15183)"

This reverts commit 78cdb72cd6.
2023-03-31 13:51:51 +01:00
Olivier Wilkinson (reivilibre) 72d2ceaa9a Revert "Set thread_id column to non-null for event_push_{actions,actions_staging,summary} (#15350)"
This reverts commit 2a234b788e.

See #15359 for context.
2023-03-31 12:10:10 +01:00
Patrick Cloke 2a234b788e
Set thread_id column to non-null for event_push_{actions,actions_staging,summary} (#15350)
Clean-up from adding the thread_id column, which was initially
null but backfilled with values. It is desirable to require it to now
be non-null.

In addition to altering this column to be non-null, we clean up
obsolete background jobs, indexes, and just-in-time updating
code.
2023-03-30 15:11:31 -04:00
Mathieu Velten 6f68e32bfb
to_device updates could be dropped when consuming the replication stream (#15349)
Co-authored-by: reivilibre <oliverw@matrix.org>
2023-03-30 19:41:14 +02:00
Erik Johnston 91c3f32673
Speed up SQLite unit test CI (#15334)
Tests now take 40% of the time.
2023-03-30 16:21:12 +01:00
Patrick Cloke ae4acda1bb
Implement MSC3984 to proxy /keys/query requests to appservices. (#15321)
If enabled, for users which are exclusively owned by an application
service then the appservice will be queried for devices in addition
to any information stored in the Synapse database.
2023-03-30 08:39:38 -04:00
Sean Quah d9f694932c
Fix spinloop during partial state sync when a prev event is in backoff (#15351)
Previously, we would spin in a tight loop until
`update_state_for_partial_state_event` stopped raising
`FederationPullAttemptBackoffError`s. Replace the spinloop with a wait
until the backoff period has expired.

Signed-off-by: Sean Quah <seanq@matrix.org>
2023-03-30 13:36:41 +01:00
Warren Bailey a3bad89d57
Add the ability to enable/disable registrations when in the OIDC flow (#14978)
Signed-off-by: Warren Bailey <warren@warrenbailey.net>
2023-03-30 11:09:41 +00:00
Mathieu Velten 9228ae633f
Add some clarification to the doc/comments regarding TCP replication (#15354) 2023-03-30 12:51:35 +02:00
Cyberes 9d641d88b7
Fix missing app variable in mail subject for password resets (#15352)
* Update mailer.py

Fix `KeyError: 'app'`

* Create 15352.bugfix

Signed-off-by: Cyberes <cyberes@evulid.cc>

---------

Signed-off-by: Cyberes <cyberes@evulid.cc>
2023-03-30 11:44:53 +01:00
Erik Johnston f0d8f66eaa
Fix registering a device on an account with lots of devices (#15348)
Fixes up #15183
2023-03-29 13:37:06 +00:00
Erik Johnston 5350b5d04d
Revert "Reintroduce membership tables event stream ordering (#15128)" (#15347)
This reverts commit e6af49fbea.
2023-03-29 13:24:28 +01:00
Erik Johnston 78cdb72cd6
Delete stale non-e2e devices for users, take 3 (#15183)
This should help reduce the number of devices e.g. simple bots the repeatedly login rack up.

We only delete non-e2e devices as they should be safe to delete, whereas if we delete e2e devices for a user we may accidentally break their ability to receive e2e keys for a message.
2023-03-29 12:07:14 +01:00
DeepBlueV7.X 753d1d9cde
Fix joining rooms you have been unbanned from (#15323)
* Fix joining rooms you have been unbanned from

Since forever synapse did not allow you to join a room after you have
been unbanned from it over federation. This was not actually because of
the unban event not federating. Synapse simply used outdated state to
validate the join transition. This skips the validation if we are not in
the room and for that reason won't have the current room state.

Fixes #1563

Signed-off-by: Nicolas Werner <nicolas.werner@hotmail.de>

* Add changelog

Signed-off-by: Nicolas Werner <nicolas.werner@hotmail.de>

* Update changelog.d/15323.bugfix

---------

Signed-off-by: Nicolas Werner <nicolas.werner@hotmail.de>
2023-03-29 08:37:27 +00:00
Patrick Cloke 5282ba1e2b
Implement MSC3983 to proxy /keys/claim queries to appservices. (#15314)
Experimental support for MSC3983 is behind a configuration flag.
If enabled, for users which are exclusively owned by an application
service then the appservice will be queried for one-time keys *if*
there are none uploaded to Synapse.
2023-03-28 18:26:27 +00:00
dependabot[bot] bd4d958aaf
Bump ruff from 0.0.252 to 0.0.259 (#15328)
* Bump ruff from 0.0.252 to 0.0.259

Bumps [ruff](https://github.com/charliermarsh/ruff) from 0.0.252 to 0.0.259.
- [Release notes](https://github.com/charliermarsh/ruff/releases)
- [Changelog](https://github.com/charliermarsh/ruff/blob/main/BREAKING_CHANGES.md)
- [Commits](https://github.com/charliermarsh/ruff/compare/v0.0.252...v0.0.259)

---
updated-dependencies:
- dependency-name: ruff
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* Fix new warnings

* Mypy

* Newsfile

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Erik Johnston <erik@matrix.org>
2023-03-28 09:46:47 +01:00
Erik Johnston 96f163d932
Prune old typing notifications (#15332)
Rather than keeping them around forever in memory, slowing things down.

Fixes #11750.
2023-03-27 14:32:36 +01:00
Dirk Klimpel 4fc85e5a92
Load `/password_policy` endpoint on workers. (#15331) 2023-03-27 07:37:17 -04:00
reivilibre d5324ee111
Add developer documentation for the Federation Sender and add a documentation mechanism using Sphinx. (#15265)
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
2023-03-24 16:41:10 +00:00
reivilibre 5f7c908280
As an optimisation, use `TRUNCATE` on Postgres when clearing the user directory tables. (#15316) 2023-03-24 15:31:12 +00:00
Quentin Gliech 5b70f240cf
Make cleaning up pushers depend on the device_id instead of the token_id (#15280)
This makes it so that we rely on the `device_id` to delete pushers on logout,
instead of relying on the `access_token_id`. This ensures we're not removing
pushers on token refresh, and prepares for a world without access token IDs
(also known as the OIDC).

This actually runs the `set_device_id_for_pushers` background update, which
was forgotten in #13831.

Note that for backwards compatibility it still deletes pushers based on the
`access_token` until the background update finishes.
2023-03-24 11:09:39 -04:00
Patrick Cloke 68a6717312
Reject mentions on the C-S API which are invalid. (#15311)
Invalid mentions data received over the Client-Server API should
be rejected with a 400 error. This will hopefully stop clients from
sending invalid data, although does not help with data received
over federation.
2023-03-24 08:31:14 -04:00
Nick Mills-Barrett e6af49fbea
Reintroduce membership tables event stream ordering (#15128)
* Add `event_stream_ordering` column to membership state tables

Specifically this adds the column to `current_state_events`,
`local_current_membership` and `room_memberships`. Each of these tables
is regularly joined with the `events` table to get the stream ordering
and denormalising this into each table will yield significant query
performance improvements once used.

* Make denormalised `event_stream_ordering` columns foreign keys
* Add comment in schema file explaining new denormalised columns
* Add triggers to enforce consistency of `event_stream_ordering` columns
* Re-order purge room tables to account for foreign keys
* Bump schema version to 75

Co-authored-by: David Robertson <david.m.robertson1@gmail.com>
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
2023-03-24 11:44:01 +00:00
reivilibre 98fd558382
Add a primitive helper script for listing worker endpoints. (#15243)
Co-authored-by: Patrick Cloke <patrickc@matrix.org>
2023-03-23 12:11:14 +00:00
David Robertson 3b0083c92a
Use immutabledict instead of frozendict (#15113)
Additionally:

* Consistently use `freeze()` in test

---------

Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
Co-authored-by: 6543 <6543@obermui.de>
2023-03-22 17:15:34 +00:00
Shay 7f02fafa28
Add a check to SQLite port DB script to ensure that the sqlite database passed to the script exists before trying to port from it (#15306) 2023-03-22 08:36:42 -07:00
David Robertson 1bc9985eb7
Have replication clients remove _INT_STREAM_POS (#15309)
* Have replication clients remove _INT_STREAM_POS

Suppose worker A makes an internal http request from worker B. B may
make changes that A later learns about over replication. We want A's
request to block until it has seen those changes—mainly to ensure A's
caches are invalidated promptly. This helps provide read-after-write
consistency, eliminating entire categories of races and test flakes.

To implement this, B includes a top-level field `_INT_STREAM_POS` in its
response JSON. Roughly speaking, the field's value tells A what to wait
for. But we weren't removing that internal field before A's request
completed!

Introduced in https://github.com/matrix-org/synapse/pull/14820.
Fixes #15308.

* Changelog
2023-03-22 12:53:55 +00:00
Shay 72f3f23c4d
Change the parameter `immediate` of `send_device_messages` to default to `True` (#15297) 2023-03-21 17:59:55 -07:00
Patrick Cloke 1bc4feb6c9
Apply & bundle edits for non-message events. (#15295) 2023-03-21 14:19:54 -04:00
Shay 96bcc5d902
Revert "check sqlite database file exists before porting/#14692" (#15301) 2023-03-21 10:49:25 -07:00
Andrew Morgan ec9224bf9a
Make `POST /_matrix/client/v3/rooms/{roomId}/report/{eventId}` endpoint return 404 if event exists, but the user lacks access (#15300) 2023-03-21 13:24:03 +00:00
Andrew Morgan b6aef59334
Make `EventHandler.get_event` return `None` when the requested event is not found (#15298) 2023-03-21 13:23:47 +00:00
Erik Johnston 827f198177
Fix error when sending message into deleted room. (#15235)
When a room is deleted in Synapse we remove the event forward
extremities in the room, so if (say a bot) tries to send a message into
the room we error out due to not being able to calculate prev events for
the new event *before* we check if the sender is in the room.

Fixes #8094
2023-03-21 09:13:43 +00:00
Patrick Cloke a5fb382a29
Separate HTTP preview code and URL previewer. (#15269)
Separates REST layer code from the actual URL previewing.
2023-03-20 14:32:26 -04:00
Shay 5ab7146e19
Add Synapse-Trace-Id to access-control-expose-headers header (#14974) 2023-03-20 11:14:05 -07:00
Patrick Cloke 25006acc17
Add /versions flag for MSC3952. (#15293) 2023-03-20 11:47:21 -04:00
Jason Little 3d70cc393f
Load `/register/available` endpoint on workers (#15268) 2023-03-17 09:50:31 -04:00
Patrick Cloke afb216c202
Remove no-op send_command for Redis replication. (#15274)
With Redis commands do not need to be re-issued by the main
process (they fan-out to all processes at once) and thus it is no
longer necessary to worry about them reflecting recursively forever.
2023-03-16 11:13:30 -04:00
Tulir Asokan b0a0fb5c97
Implement MSC2659: application service ping endpoint (#15249)
Signed-off-by: Tulir Asokan <tulir@maunium.net>
2023-03-16 15:00:03 +01:00
reivilibre 1f5473465d
Refresh remote profiles that have been marked as stale, in order to fill the user directory. [rei:userdirpriv] (#14756)
* Scaffolding for background process to refresh profiles

* Add scaffolding for background process to refresh profiles for a given server

* Implement the code to select servers to refresh from

* Ensure we don't build up multiple looping calls

* Make `get_profile` able to respect backoffs

* Add logic for refreshing users

* When backing off, schedule a refresh when the backoff is over

* Wake up the background processes when we receive an interesting state event

* Add tests

* Newsfile

Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>

* Add comment about 1<<62

---------

Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>
2023-03-16 11:44:11 +00:00
Andrew Morgan 4953cd71df
Move Account Validity callbacks to a dedicated file (#15237) 2023-03-16 10:35:31 +00:00
reivilibre f54f877f27
Preparatory work to fix the user directory assuming that any remote membership state events represent a profile change. [rei:userdirpriv] (#14755)
* Remove special-case method for new memberships only, use more generic method

* Only collect profiles from state events in public rooms

* Add a table to track stale remote user profiles

* Add store methods to set and delete rows in this new table

* Mark remote profiles as stale when a member state event comes in to a private room

* Newsfile

Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>

* Simplify by removing Optionality of `event_id`

* Replace names and avatars with None if they're set to dodgy things

I think this makes more sense anyway.

* Move schema delta to 74 (I missed the boat?)

* Turns out these can be None after all

---------

Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>
2023-03-16 09:55:19 +00:00
Patrick Cloke 3bf973edc7
Remove unused class: DirectTcpReplicationClientFactory. (#15272) 2023-03-15 15:42:20 -04:00
reivilibre 63d87c08c8
Add schema comments about the `destinations` and `destination_rooms` tables. (#15247) 2023-03-15 09:25:58 +00:00
reivilibre d0fe417f5c
Remove unused store method `_set_destination_retry_timings_emulated`. (#15266) 2023-03-14 17:32:46 +00:00