Commit Graph

15610 Commits

Author SHA1 Message Date
Sean Quah 3b837d856c
Revert "Reduce the size of the HTTP connection pool for non-pushers" (#15530)
#15514 introduced a regression where Synapse would encounter
`PartialDownloadError`s when fetching OpenID metadata for certain
providers on startup. Due to #8088, this prevents Synapse from starting
entirely.

Revert the change while we decide what to do about the regression.
2023-05-03 13:09:20 +01:00
Patrick Cloke a7b3e9ce65
Set thread_id column to non-null for event_push_{actions,actions_staging,summary} (#15437)
Updates the database schema to require a thread_id (by adding a
constraint that the column is non-null) for event_push_actions,
event_push_actions_staging, and event_push_actions_summary.

For PostgreSQL we add the constraint as NOT VALID, then
VALIDATE the constraint a background job to avoid locking
the table during an upgrade.

For SQLite we simply rebuild the table & copy the data.
2023-05-03 07:49:03 -04:00
Sean Quah 04e79e6a18
Add config option to forget rooms automatically when users leave them (#15224)
This is largely based off the stats and user directory updater code.

Signed-off-by: Sean Quah <seanq@matrix.org>
2023-05-03 12:27:33 +01:00
Shay 0e8aa2a1b2
Remove references to supporting per-user flag for msc2654 (#15522) 2023-05-02 14:21:36 -07:00
Erik Johnston 4de271a7fc
Allow adding random delay to push (#15516)
This is to discourage timing based profiling on the push gateways.
2023-05-02 16:45:44 +00:00
Patrick Cloke 6aca4e7cb8
Reduce the size of the HTTP connection pool for non-pushers. (#15514)
Pushers tend to make many connections to the same HTTP host
(e.g. a new event comes in, causes events to be pushed, and then
the homeserver connects to the same host many times). Due to this
the per-host HTTP connection pool size was increased, but this does
not make sense for other SimpleHttpClients.

Add a parameter for the connection pool and override it for pushers
(making a separate SimpleHttpClient for pushers with the increased
configuration).

This returns the HTTP connection pool settings to the default Twisted
ones for non-pusher HTTP clients.
2023-05-02 09:29:40 -04:00
Patrick Cloke 07b1c70d6b
Initial implementation of MSC3981: recursive relations API (#15315)
Adds an optional keyword argument to the /relations API which
will recurse a limited number of event relationships.

This will cause the API to return not just the events related to the
parent event, but also events related to those related to the parent
event, etc.

This is disabled by default behind an experimental configuration
flag and is currently implemented using prefixed parameters.
2023-05-02 07:59:55 -04:00
Shay 89f6fb0d5a
Add an admin API endpoint to support per-user feature flags (#15344) 2023-04-28 11:33:45 -07:00
Patrick Cloke 57aeeb308b
Add support for claiming multiple OTKs at once. (#15468)
MSC3983 provides a way to request multiple OTKs at once from appservices,
this extends this concept to the Client-Server API.

Note that this will likely be spit out into a separate MSC, but is currently part of
MSC3983.
2023-04-27 12:57:46 -04:00
Patrick Cloke 6efa674004
Add type hints to schema deltas (#15497)
Cleans-up the schema delta files:

* Removes no-op functions.
* Adds missing type hints to function parameters.
* Fixes any issues with type hints.

This also renames one (very old) schema delta to avoid a conflict
that mypy complains about.
2023-04-27 12:44:53 +00:00
Patrick Cloke a346b43837
Check databases/__init__ and main/cache with mypy. (#15496) 2023-04-27 07:59:14 -04:00
mcalinghee 486c059479
Disable push rule evaluation for rooms excluded from sync (#15361)
* no push for excluded room from sync

* add changelog
Signed-off-by: Maghen Calinghee <maghen.calinghee@beta.gouv.fr>

* correct changelog
2023-04-27 11:32:02 +01:00
Shay 301b4156d5
Add column `full_user_id` to tables `profiles` and `user_filters`. (#15458) 2023-04-26 16:03:26 -07:00
Mathieu Velten 247e6a8a78
Add a module API to send an HTTP push notification (#15387)
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
2023-04-26 21:10:51 +02:00
Erik Johnston 9900f7c231
Add admin endpoint to query room sizes (#15482) 2023-04-26 16:00:11 +00:00
Patrick Cloke 8e9739449d
Add unstable /keys/claim endpoint which always returns fallback keys. (#15462)
It can be useful to always return the fallback key when attempting to
claim keys. This adds an unstable endpoint for `/keys/claim` which
always returns fallback keys in addition to one-time-keys.

The fallback key(s) are not marked as "used" unless there are no
corresponding OTKs.

This is currently defined in MSC3983 (although likely to be split out
to a separate MSC). The endpoint shape may change or be requested
differently (i.e. a keyword parameter on the current endpoint), but the
core logic should be reasonable.
2023-04-25 13:30:41 -04:00
Nick Mills-Barrett c55293c230
Re re introduce membership tables event stream ordering (#15356) 2023-04-25 09:44:29 +01:00
Quentin Gliech 8b3a502996
Experimental support for MSC3970: per-device transaction IDs (#15318) 2023-04-25 09:37:09 +01:00
Patrick Cloke ea5c3ede4f
Finish type hints for federation client HTTP code. (#15465) 2023-04-24 13:12:06 -04:00
Alok Kumar Singh 197fbb123b
Remove legacy code of single user device resync api (#15418)
* Removed single-user resync usage and updated it to use multi-user counterpart

Signed-off-by: Alok Kumar Singh alokaks601@gmail.com
2023-04-21 12:06:39 +01:00
Patrick Cloke 5e024a0645
Modify StoreKeyFetcher to read from server_keys_json. (#15417)
Before this change:

* `PerspectivesKeyFetcher` and `ServerKeyFetcher` write to `server_keys_json`.
* `PerspectivesKeyFetcher` also writes to `server_signature_keys`.
* `StoreKeyFetcher` reads from `server_signature_keys`.

After this change:

* `PerspectivesKeyFetcher` and `ServerKeyFetcher` write to `server_keys_json`.
* `PerspectivesKeyFetcher` also writes to `server_signature_keys`.
* `StoreKeyFetcher` reads from `server_keys_json`.

This results in `StoreKeyFetcher` now using the results from `ServerKeyFetcher`
in addition to those from `PerspectivesKeyFetcher`, i.e. keys which are directly
fetched from a server will now be pulled from the database instead of refetched.

An additional minor change is included to avoid creating a `PerspectivesKeyFetcher`
(and checking it) if no `trusted_key_servers` are configured.

The overall impact of this should be better usage of cached results:

* If a server has no trusted key servers configured then it should reduce how often keys
  are fetched.
* if a server's trusted key server does not have a requested server's keys cached then it
  should reduce how often keys are directly fetched.
2023-04-20 12:30:32 -04:00
Andrew Morgan aec639e3e3
Move Spam Checker callbacks to a dedicated file (#15453) 2023-04-18 00:57:40 +00:00
Jason Little e12d788bb7
Switch `InstanceLocationConfig` to a pydantic `BaseModel` (#15431)
* Switch InstanceLocationConfig to a pydantic BaseModel, apply Strict* types and add a few helper methods(that will make more sense in follow up work).

Co-authored-by: David Robertson <davidr@element.io>
2023-04-17 23:53:43 +00:00
Jason Little c9326140dc
Refactor `SimpleHttpClient` to pull out reusable methods (#15427)
Pulls out some methods to `BaseHttpClient` to eventually be
reused in other contexts.
2023-04-14 20:46:04 +00:00
David Robertson 8a47d6e3a6
More precise type for LoggingTransaction.execute (#15432)
* More precise type for LoggingTransaction.execute
* Add an annotation for stream_ordering_month_ago

This would have spotted the error that was fixed in "Add comma missing from #15382. (#15429)"
2023-04-14 18:04:49 +00:00
Dirk Klimpel 24b61f32ff
Disable directory listing for `StaticResource` (#15438) 2023-04-14 13:49:47 -04:00
Dirk Klimpel e4a25d022c
Load `/capabilities` endpoint on workers (#15436) 2023-04-14 12:26:07 -04:00
Erik Johnston b5192355f6
User directory background update speedup (#15435)
c.f. #15264

The two changes are:
1. Add indexes so that the select / deletes don't do sequential scans
2. Don't repeatedly call `SELECT count(*)` each iteration, as that's slow
2023-04-14 16:10:32 +01:00
Mathieu Velten dabbb94faf
Delete pushers after calling on_logged_out module hook on device delete (#15410) 2023-04-14 14:12:37 +02:00
Dirk Klimpel 4af0aec54d
Load `/directory/room/{roomAlias}` endpoint on workers (#15333)
* Enable `directory`

* move to worker store

* newsfile

* disable `ClientDirectoryListServer` and `ClientAppserviceDirectoryListServer` for workers
2023-04-14 10:24:06 +01:00
Patrick Cloke d751f65e71
Remove registration fallback code. (#15405)
The registration fallback is broken and unspecced. This removes it
since there is no plan to spec it.

Note that this does not modify the login fallback code.
2023-04-13 11:36:29 -04:00
reivilibre edae20f926
Improve robustness when handling a perspective key response by deduplicating received server keys. (#15423)
* Change `store_server_verify_keys` to take a `Mapping[(str, str), FKR]`

This is because we already can't handle duplicate keys — leads to cardinality violation

* Newsfile

Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>

---------

Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>
2023-04-13 15:35:03 +01:00
reivilibre 38272be037
Add comma missing from #15382. (#15429)
* Add missing comma

* Newsfile

Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>

---------

Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>
2023-04-13 15:06:25 +01:00
Patrick Cloke 2503126d52
Implement MSC2174: move redacts to a content property. (#15395)
This moves `redacts` from being a top-level property to
a `content` property in a new room version.

MSC2176 (which was previously implemented) states to not
`redact` this property.
2023-04-13 13:47:07 +00:00
Dirk Klimpel c9723a1c1f
Only load the SSO redirect servlet if SSO is enabled. (#15421) 2023-04-13 13:08:00 +00:00
Dirk Klimpel be36600327
Disable loading `RefreshTokenServlet` on workers (#15428) 2023-04-13 13:28:55 +02:00
Will Hunt 253e86a72e
Throw if the appservice config list is the wrong type (#15425)
* raise a ConfigError on an invalid app_service_config_files

* changelog

* Move config check to read_config

* Add test

* Ensure list also contains strings
2023-04-12 11:28:46 +00:00
Patrick Cloke d07d255830
Implement MSC2175: remove the creator field from create events. (#15394) 2023-04-06 16:26:28 -04:00
Erik Johnston 485b9fdefb
Don't keep old stream_ordering_to_exterm around (#15382) 2023-04-06 16:42:39 +00:00
Patrick Cloke 72b43bec8b Merge remote-tracking branch 'origin/release-v1.81' into develop 2023-04-06 11:44:26 -04:00
Patrick Cloke 83649b891d
Implement MSC3989 to redact the origin field. (#15393)
This will be done in a future room version, for now an unstable
room version is added which redacts the origin field.
2023-04-05 14:42:46 -04:00
Quentin Gliech 6eb3edec47
Fix the 'set_device_id_for_pushers_txn' background update. (#15391)
Refer to the correct field from the response when updating
the background update progress.
2023-04-05 07:49:15 -04:00
Shay 6b23d74ad1
Delete server-side backup keys when deactivating an account. (#15181) 2023-04-04 20:16:08 +00:00
Erik Johnston 79d2e2e79c
Speed up membership queries for users with forgotten rooms (#15385) 2023-04-04 14:11:34 +01:00
Sean Quah 89a71e7390
Fix a rare bug where initial /syncs would fail (#15383)
This change fixes a rare bug where initial /syncs would fail with a
`KeyError` under the following circumstances:
 1. A user fast joins a remote room.
 2. The user is kicked from the room before the room's full state has
    been synced.
 3. A second local user fast joins the room.
 4. Events are backfilled into the room with a higher topological
    ordering than the original user's leave. They are assigned a
    negative stream ordering. It's not clear how backfill happened here,
    since it is expected to be equivalent to syncing the full state.
 5. The second local user leaves the room before the room's full state
    has been synced. The homeserver does not complete the sync.
 6. The original user performs an initial /sync with lazy_load_members
    enabled.
     * Because they were kicked from the room, the room is included in
       the /sync response even though the include_leave option is not
       specified.
     * To populate the room's timeline, `_load_filtered_recents` /
       `get_recent_events_for_room` fetches events with a lower stream
       ordering than the leave event and picks the ones with the highest
       topological orderings (which are most recent). This captures the
       backfilled events after the leave, since they have a negative
       stream ordering. These events are filtered out of the timeline,
       since the user was not in the room at the time and cannot view
       them. The sync code ends up with an empty timeline for the room
       that notably does not include the user's leave event.
       This seems buggy, but at least we don't disclose events the user
       isn't allowed to see.
     * Normally, `compute_state_delta` would fetch the state at the
       start and end of the room's timeline to generate the sync
       response. Since the timeline is empty, it fetches the state at
       `min(now, last event in the room)`, which corresponds with the
       second user's leave. The state during the entirety of the second
       user's membership does not include the membership for the first
       user because of partial state.
       This part is also questionable, since we are fetching state from
       outside the bounds of the user's membership.
     * `compute_state_delta` then tries and fails to find the user's
       membership in the auth events of timeline events. Because there
       is no timeline event whose auth events are expected to contain
       the user's membership, a `KeyError` is raised.

Also contains a drive-by fix for a separate unlikely race condition.

Signed-off-by: Sean Quah <seanq@matrix.org>
2023-04-04 13:10:25 +01:00
Patrick Cloke cf2f2934ad
Call appservices on modern paths, falling back to legacy paths. (#15317)
This uses the specced /_matrix/app/v1/... paths instead of the
"legacy" paths. If the homeserver receives an error it will retry
using the legacy path.
2023-04-03 13:20:32 -04:00
Jason Little 56efa9b167
Experimental Unix socket support (#15353)
* Add IReactorUNIX to ISynapseReactor type hint.

* Create listen_unix().

Two options, 'path' to the file and 'mode' of permissions(not umask, recommend 666 as default as
nginx/other reverse proxies write to it and it's setup as user www-data)

For the moment, leave the option to always create a PID lockfile turned on by default

* Create UnixListenerConfig and wire it up.

Rename ListenerConfig to TCPListenerConfig, then Union them together into ListenerConfig.
This spidered around a bit, but I think I got it all. Metrics and manhole have been placed
behind a conditional in case of accidental putting them onto a unix socket.

Use new helpers to get if a listener is configured for TLS, and to help create a site tag
for logging.

There are 2 TODO things in parse_listener_def() to finish up at a later point.

* Refactor SynapseRequest to handle logging correctly when using a unix socket.

This prevents an exception when an IP address can not be retrieved for a request.

* Make the 'Synapse now listening on Unix socket' log line a little prettier.

* No silent failures on generic workers when trying to use a unix socket with metrics or manhole.

* Inline variables in app/_base.py

* Update docstring for listen_unix() to remove reference to a hardcoded permission of 0o666 and add a few comments saying where the default IS declared.

* Disallow both a unix socket and a ip/port combo on the same listener resource

* Linting

* Changelog

* review: simplify how listen_unix returns(and get rid of a type: ignore)

* review: fix typo from ConfigError in app/homeserver.py

* review: roll conditional for http_options.tag into get_site_tag() helper(and add docstring)

* review: enhance the conditionals for checking if a port or path is valid, remove a TODO line

* review: Try updating comment in get_client_ip_if_available to clarify what is being retrieved and why

* Pretty up how 'Synapse now listening on Unix Socket' looks by decoding the byte string.

* review: In parse_listener_def(), raise ConfigError if neither socket_path nor port is declared(and fix a typo)
2023-04-03 10:27:51 +01:00
Jason Robinson 157092d97a
Fix copyright year in SSO footer template (#15358) 2023-03-31 18:20:40 +01:00
Erik Johnston 6204c3663e
Revert pruning of old devices (#15360)
* Revert "Fix registering a device on an account with lots of devices (#15348)"

This reverts commit f0d8f66eaa.

* Revert "Delete stale non-e2e devices for users, take 3 (#15183)"

This reverts commit 78cdb72cd6.
2023-03-31 13:51:51 +01:00
Olivier Wilkinson (reivilibre) 72d2ceaa9a Revert "Set thread_id column to non-null for event_push_{actions,actions_staging,summary} (#15350)"
This reverts commit 2a234b788e.

See #15359 for context.
2023-03-31 12:10:10 +01:00
Patrick Cloke 2a234b788e
Set thread_id column to non-null for event_push_{actions,actions_staging,summary} (#15350)
Clean-up from adding the thread_id column, which was initially
null but backfilled with values. It is desirable to require it to now
be non-null.

In addition to altering this column to be non-null, we clean up
obsolete background jobs, indexes, and just-in-time updating
code.
2023-03-30 15:11:31 -04:00
Mathieu Velten 6f68e32bfb
to_device updates could be dropped when consuming the replication stream (#15349)
Co-authored-by: reivilibre <oliverw@matrix.org>
2023-03-30 19:41:14 +02:00
Erik Johnston 91c3f32673
Speed up SQLite unit test CI (#15334)
Tests now take 40% of the time.
2023-03-30 16:21:12 +01:00
Patrick Cloke ae4acda1bb
Implement MSC3984 to proxy /keys/query requests to appservices. (#15321)
If enabled, for users which are exclusively owned by an application
service then the appservice will be queried for devices in addition
to any information stored in the Synapse database.
2023-03-30 08:39:38 -04:00
Sean Quah d9f694932c
Fix spinloop during partial state sync when a prev event is in backoff (#15351)
Previously, we would spin in a tight loop until
`update_state_for_partial_state_event` stopped raising
`FederationPullAttemptBackoffError`s. Replace the spinloop with a wait
until the backoff period has expired.

Signed-off-by: Sean Quah <seanq@matrix.org>
2023-03-30 13:36:41 +01:00
Warren Bailey a3bad89d57
Add the ability to enable/disable registrations when in the OIDC flow (#14978)
Signed-off-by: Warren Bailey <warren@warrenbailey.net>
2023-03-30 11:09:41 +00:00
Mathieu Velten 9228ae633f
Add some clarification to the doc/comments regarding TCP replication (#15354) 2023-03-30 12:51:35 +02:00
Cyberes 9d641d88b7
Fix missing app variable in mail subject for password resets (#15352)
* Update mailer.py

Fix `KeyError: 'app'`

* Create 15352.bugfix

Signed-off-by: Cyberes <cyberes@evulid.cc>

---------

Signed-off-by: Cyberes <cyberes@evulid.cc>
2023-03-30 11:44:53 +01:00
Erik Johnston f0d8f66eaa
Fix registering a device on an account with lots of devices (#15348)
Fixes up #15183
2023-03-29 13:37:06 +00:00
Erik Johnston 5350b5d04d
Revert "Reintroduce membership tables event stream ordering (#15128)" (#15347)
This reverts commit e6af49fbea.
2023-03-29 13:24:28 +01:00
Erik Johnston 78cdb72cd6
Delete stale non-e2e devices for users, take 3 (#15183)
This should help reduce the number of devices e.g. simple bots the repeatedly login rack up.

We only delete non-e2e devices as they should be safe to delete, whereas if we delete e2e devices for a user we may accidentally break their ability to receive e2e keys for a message.
2023-03-29 12:07:14 +01:00
DeepBlueV7.X 753d1d9cde
Fix joining rooms you have been unbanned from (#15323)
* Fix joining rooms you have been unbanned from

Since forever synapse did not allow you to join a room after you have
been unbanned from it over federation. This was not actually because of
the unban event not federating. Synapse simply used outdated state to
validate the join transition. This skips the validation if we are not in
the room and for that reason won't have the current room state.

Fixes #1563

Signed-off-by: Nicolas Werner <nicolas.werner@hotmail.de>

* Add changelog

Signed-off-by: Nicolas Werner <nicolas.werner@hotmail.de>

* Update changelog.d/15323.bugfix

---------

Signed-off-by: Nicolas Werner <nicolas.werner@hotmail.de>
2023-03-29 08:37:27 +00:00
Patrick Cloke 5282ba1e2b
Implement MSC3983 to proxy /keys/claim queries to appservices. (#15314)
Experimental support for MSC3983 is behind a configuration flag.
If enabled, for users which are exclusively owned by an application
service then the appservice will be queried for one-time keys *if*
there are none uploaded to Synapse.
2023-03-28 18:26:27 +00:00
dependabot[bot] bd4d958aaf
Bump ruff from 0.0.252 to 0.0.259 (#15328)
* Bump ruff from 0.0.252 to 0.0.259

Bumps [ruff](https://github.com/charliermarsh/ruff) from 0.0.252 to 0.0.259.
- [Release notes](https://github.com/charliermarsh/ruff/releases)
- [Changelog](https://github.com/charliermarsh/ruff/blob/main/BREAKING_CHANGES.md)
- [Commits](https://github.com/charliermarsh/ruff/compare/v0.0.252...v0.0.259)

---
updated-dependencies:
- dependency-name: ruff
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* Fix new warnings

* Mypy

* Newsfile

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Erik Johnston <erik@matrix.org>
2023-03-28 09:46:47 +01:00
Erik Johnston 96f163d932
Prune old typing notifications (#15332)
Rather than keeping them around forever in memory, slowing things down.

Fixes #11750.
2023-03-27 14:32:36 +01:00
Dirk Klimpel 4fc85e5a92
Load `/password_policy` endpoint on workers. (#15331) 2023-03-27 07:37:17 -04:00
reivilibre d5324ee111
Add developer documentation for the Federation Sender and add a documentation mechanism using Sphinx. (#15265)
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
2023-03-24 16:41:10 +00:00
reivilibre 5f7c908280
As an optimisation, use `TRUNCATE` on Postgres when clearing the user directory tables. (#15316) 2023-03-24 15:31:12 +00:00
Quentin Gliech 5b70f240cf
Make cleaning up pushers depend on the device_id instead of the token_id (#15280)
This makes it so that we rely on the `device_id` to delete pushers on logout,
instead of relying on the `access_token_id`. This ensures we're not removing
pushers on token refresh, and prepares for a world without access token IDs
(also known as the OIDC).

This actually runs the `set_device_id_for_pushers` background update, which
was forgotten in #13831.

Note that for backwards compatibility it still deletes pushers based on the
`access_token` until the background update finishes.
2023-03-24 11:09:39 -04:00
Patrick Cloke 68a6717312
Reject mentions on the C-S API which are invalid. (#15311)
Invalid mentions data received over the Client-Server API should
be rejected with a 400 error. This will hopefully stop clients from
sending invalid data, although does not help with data received
over federation.
2023-03-24 08:31:14 -04:00
Nick Mills-Barrett e6af49fbea
Reintroduce membership tables event stream ordering (#15128)
* Add `event_stream_ordering` column to membership state tables

Specifically this adds the column to `current_state_events`,
`local_current_membership` and `room_memberships`. Each of these tables
is regularly joined with the `events` table to get the stream ordering
and denormalising this into each table will yield significant query
performance improvements once used.

* Make denormalised `event_stream_ordering` columns foreign keys
* Add comment in schema file explaining new denormalised columns
* Add triggers to enforce consistency of `event_stream_ordering` columns
* Re-order purge room tables to account for foreign keys
* Bump schema version to 75

Co-authored-by: David Robertson <david.m.robertson1@gmail.com>
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
2023-03-24 11:44:01 +00:00
reivilibre 98fd558382
Add a primitive helper script for listing worker endpoints. (#15243)
Co-authored-by: Patrick Cloke <patrickc@matrix.org>
2023-03-23 12:11:14 +00:00
David Robertson 3b0083c92a
Use immutabledict instead of frozendict (#15113)
Additionally:

* Consistently use `freeze()` in test

---------

Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
Co-authored-by: 6543 <6543@obermui.de>
2023-03-22 17:15:34 +00:00
Shay 7f02fafa28
Add a check to SQLite port DB script to ensure that the sqlite database passed to the script exists before trying to port from it (#15306) 2023-03-22 08:36:42 -07:00
David Robertson 1bc9985eb7
Have replication clients remove _INT_STREAM_POS (#15309)
* Have replication clients remove _INT_STREAM_POS

Suppose worker A makes an internal http request from worker B. B may
make changes that A later learns about over replication. We want A's
request to block until it has seen those changes—mainly to ensure A's
caches are invalidated promptly. This helps provide read-after-write
consistency, eliminating entire categories of races and test flakes.

To implement this, B includes a top-level field `_INT_STREAM_POS` in its
response JSON. Roughly speaking, the field's value tells A what to wait
for. But we weren't removing that internal field before A's request
completed!

Introduced in https://github.com/matrix-org/synapse/pull/14820.
Fixes #15308.

* Changelog
2023-03-22 12:53:55 +00:00
Shay 72f3f23c4d
Change the parameter `immediate` of `send_device_messages` to default to `True` (#15297) 2023-03-21 17:59:55 -07:00
Patrick Cloke 1bc4feb6c9
Apply & bundle edits for non-message events. (#15295) 2023-03-21 14:19:54 -04:00
Shay 96bcc5d902
Revert "check sqlite database file exists before porting/#14692" (#15301) 2023-03-21 10:49:25 -07:00
Andrew Morgan ec9224bf9a
Make `POST /_matrix/client/v3/rooms/{roomId}/report/{eventId}` endpoint return 404 if event exists, but the user lacks access (#15300) 2023-03-21 13:24:03 +00:00
Andrew Morgan b6aef59334
Make `EventHandler.get_event` return `None` when the requested event is not found (#15298) 2023-03-21 13:23:47 +00:00
Erik Johnston 827f198177
Fix error when sending message into deleted room. (#15235)
When a room is deleted in Synapse we remove the event forward
extremities in the room, so if (say a bot) tries to send a message into
the room we error out due to not being able to calculate prev events for
the new event *before* we check if the sender is in the room.

Fixes #8094
2023-03-21 09:13:43 +00:00
Patrick Cloke a5fb382a29
Separate HTTP preview code and URL previewer. (#15269)
Separates REST layer code from the actual URL previewing.
2023-03-20 14:32:26 -04:00
Shay 5ab7146e19
Add Synapse-Trace-Id to access-control-expose-headers header (#14974) 2023-03-20 11:14:05 -07:00
Patrick Cloke 25006acc17
Add /versions flag for MSC3952. (#15293) 2023-03-20 11:47:21 -04:00
Jason Little 3d70cc393f
Load `/register/available` endpoint on workers (#15268) 2023-03-17 09:50:31 -04:00
Patrick Cloke afb216c202
Remove no-op send_command for Redis replication. (#15274)
With Redis commands do not need to be re-issued by the main
process (they fan-out to all processes at once) and thus it is no
longer necessary to worry about them reflecting recursively forever.
2023-03-16 11:13:30 -04:00
Tulir Asokan b0a0fb5c97
Implement MSC2659: application service ping endpoint (#15249)
Signed-off-by: Tulir Asokan <tulir@maunium.net>
2023-03-16 15:00:03 +01:00
reivilibre 1f5473465d
Refresh remote profiles that have been marked as stale, in order to fill the user directory. [rei:userdirpriv] (#14756)
* Scaffolding for background process to refresh profiles

* Add scaffolding for background process to refresh profiles for a given server

* Implement the code to select servers to refresh from

* Ensure we don't build up multiple looping calls

* Make `get_profile` able to respect backoffs

* Add logic for refreshing users

* When backing off, schedule a refresh when the backoff is over

* Wake up the background processes when we receive an interesting state event

* Add tests

* Newsfile

Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>

* Add comment about 1<<62

---------

Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>
2023-03-16 11:44:11 +00:00
Andrew Morgan 4953cd71df
Move Account Validity callbacks to a dedicated file (#15237) 2023-03-16 10:35:31 +00:00
reivilibre f54f877f27
Preparatory work to fix the user directory assuming that any remote membership state events represent a profile change. [rei:userdirpriv] (#14755)
* Remove special-case method for new memberships only, use more generic method

* Only collect profiles from state events in public rooms

* Add a table to track stale remote user profiles

* Add store methods to set and delete rows in this new table

* Mark remote profiles as stale when a member state event comes in to a private room

* Newsfile

Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>

* Simplify by removing Optionality of `event_id`

* Replace names and avatars with None if they're set to dodgy things

I think this makes more sense anyway.

* Move schema delta to 74 (I missed the boat?)

* Turns out these can be None after all

---------

Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>
2023-03-16 09:55:19 +00:00
Patrick Cloke 3bf973edc7
Remove unused class: DirectTcpReplicationClientFactory. (#15272) 2023-03-15 15:42:20 -04:00
reivilibre 63d87c08c8
Add schema comments about the `destinations` and `destination_rooms` tables. (#15247) 2023-03-15 09:25:58 +00:00
reivilibre d0fe417f5c
Remove unused store method `_set_destination_retry_timings_emulated`. (#15266) 2023-03-14 17:32:46 +00:00
Patrick Cloke e7b559d2ca
Avoid unneeded work if auto-join rooms aren't configured. (#15262)
It is not necessary to reach out to the database to check some
parameters if the auto-join rooms are not configured, or (in some cases)
if auto-create rooms is not configured.
2023-03-14 08:18:49 -04:00
David Robertson a1c9869394
Merge branch 'release-v1.79' into develop 2023-03-13 18:35:21 +00:00
David Robertson c071cd5a0e
Ensure fed-sender catchup does not block for full state (#15248)
* Reproduce bad scenario in test
* Avoid catchup optimisation for partial state rooms
2023-03-13 12:31:19 +00:00
David Robertson 4bb26c95a9
Refactor `filter_events_for_server` (#15240)
* Tweak docstring and type hint

* Flip logic and provide better name

* Separate decision from action

* Track a set of strings, not EventBases

* Require explicit boolean options from callers

* Add explicit option for partial state rooms

* Changelog

* Rename param
2023-03-10 15:31:25 +00:00
Andrew Morgan e157c63f68
Fix missing conditional for registering `on_remove_user_third_party_identifier` module api callbacks (#15227 2023-03-10 10:35:18 +00:00
David Robertson ce54477f6f
Give PyCharm some help with `@cache_in_self` (#15238)
* Give PyCharm some help with `@cache_in_self`

* Changelog

* Fix import for old python versions
2023-03-09 19:12:09 +00:00
Sean Quah caf43c3d7c
Faster joins: Fix spurious errors on incremental sync (#15232)
When pushing events in partial state rooms down incremental /sync, we
try to find the `m.room.member` state event for their senders by digging
through their auth events, so that we can present the membership to the
client. Events usually have a membership event in their auth events,
with the exception of the `m.room.create` event and a user's first join
into the room.

When implementing #13477, we took the case of a user's first join into
account, but forgot to handle the `m.room.create` case. This change
fixes that.

Signed-off-by: Sean Quah <seanq@matrix.org>
2023-03-09 14:18:39 +00:00
Patrick Cloke 3d060eae6c
Add missing type hints to `synapse.storage.database`. (#15230) 2023-03-09 07:10:09 -05:00
Patrick Cloke e7c3832ba6
Pull in netaddr type hints. (#15231)
And fix any issues from having those type hints.
2023-03-09 07:09:49 -05:00
Shay be4ea209e8
Add topic and name events to group of events that are batch persisted when creating a room. (#15229) 2023-03-08 19:27:20 -08:00
Patrick Cloke 88efc75bab
Include the room ID in more purge room log lines. (#15222) 2023-03-08 20:08:56 +00:00
Shay a368d30c1c
More speedups/fixes to creating batched events (#15195) 2023-03-07 13:54:39 -08:00
Patrick Cloke 20ed8c926b
Stabilize support for MSC3873: disambuguated event push keys. (#15190)
This removes the experimental configuration option and
always escapes the push rule condition keys.

Also escapes any (experimental) push rule condition keys
in the base rules which contain dot in a field name.
2023-03-07 11:27:57 -05:00
Quentin Gliech 47bc84dd53
Pass the Requester down to the HttpTransactionCache. (#15200) 2023-03-07 16:05:22 +00:00
Patrick Cloke 820f02b70b
Stabilize support for MSC3966: event_property_contains push condition. (#15187)
This removes the configuration flag & updates the identifiers to
use the stable version.
2023-03-07 10:06:02 -05:00
Erik Johnston c69aae94cd
Split up txn for fetching device keys (#15215)
We look up keys in batches, but we should do that outside of the
transaction to avoid starving the database pool.
2023-03-07 08:51:34 +00:00
Quentin Gliech 41f127e068
Pass the requester during event serialization. (#15174)
This allows Synapse to properly include the transaction ID in the
unsigned data of events.
2023-03-06 16:08:39 +00:00
Patrick Cloke 05e0a4089a
Stop applying edits to event contents (MSC3925). (#15193)
Enables MSC3925 support by default, which:

* Includes the full edit event in the bundled aggregations of an
  edited event.
* Stops modifying the original event's content to return the new
  content from the edit event.

This is a backwards-incompatible change that is considered to be
"correct" by the spec.
2023-03-06 09:43:01 -05:00
Patrick Cloke fd9cadcf53
Stabilize support for MSC3758: event_property_is push condition (#15185)
This removes the configuration flag & updates the identifiers to
use the stable version.
2023-03-06 08:38:01 -05:00
Patrick Cloke 02f74f3a99
Combine AbstractStreamIdTracker and AbstractStreamIdGenerator. (#15192)
AbstractStreamIdTracker (now) has only a single sub-class: AbstractStreamIdGenerator,
combine them to simplify some code and remove any direct references to
AbstractStreamIdTracker.
2023-03-03 08:13:37 -05:00
Quentin Gliech 848f7e3d5f
Remove unspecced and buggy `PUT` method on the unstable `/rooms/<room_id>/batch_send` endpoint. (#15199) 2023-03-03 12:22:49 +00:00
Andrew Morgan 15e975f68f
Experimental MSC3890 Implementation: Fix deleting account data when using an account data writer worker (#14869) 2023-03-03 10:51:57 +00:00
Andrew Morgan 1eea662780
Add a `get_next_txn` method to `StreamIdGenerator` to match `MultiWriterIdGenerator` (#15191 2023-03-02 18:27:00 +00:00
Dirk Klimpel ecbe0ddbe7
Add support for knocking to workers. (#15133) 2023-03-02 12:59:53 -05:00
Quentin Gliech c8665dd25d
Remove the unspecced and bugged PUT /knock/{roomIdOrAlias} endpoint (#15189) 2023-03-02 17:16:54 +00:00
Patrick Cloke 8ef324ea6f
Update intentional mentions (MSC3952) to depend on `exact_event_property_contains` (MSC3966). (#15051)
This replaces the specific `is_user_mention` push rule condition
used in MSC3952 with the generic `exact_event_property_contains`
push rule condition from MSC3966.
2023-03-02 08:30:51 -05:00
Patrick Cloke 33a85cf08c
Fix conflicting URLs for dehydrated devices. (#15180) 2023-03-02 07:24:29 -05:00
Dirk Klimpel 65f10afb64
Move event_reports to `RoomWorkerStore` (#15165) 2023-03-02 10:38:46 +00:00
Hugh Nimmo-Smith 916b8061d2
Implementation of MSC3967: Don't require UIA for initial upload of cross signing keys (#15077) 2023-03-02 10:34:59 +00:00
Richard van der Hoff 2b78981736
Remove support for aggregating reactions (#15172)
It turns out that no clients rely on server-side aggregation of `m.annotation`
relationships: it's just not very useful as currently implemented.

It's also non-trivial to calculate.

I want to remove it from MSC2677, so to keep the implementation in line, let's
remove it here.
2023-02-28 18:49:28 +00:00
H. Shay b2fd03d075 Merge branch 'master' into develop 2023-02-28 10:14:20 -08:00
reivilibre d62cd940cb
Fix a long-standing bug where an initial sync would not respond to changes to the list of ignored users if there was an initial sync cached. (#15163) 2023-02-28 17:11:26 +00:00
reivilibre 682d31c702
Allow use of the `/filter` Client-Server APIs on workers. (#15134) 2023-02-28 16:37:19 +00:00
Patrick Cloke c369d82df0
Add missing type hints to InsecureInterceptableContextFactory. (#15164) 2023-02-28 10:17:55 -05:00
Dirk Klimpel 93f7955eba
Admin API endpoint to delete a reported event (#15116)
* Admin api to delete event report

* lint +  tests

* newsfile

* Apply suggestions from code review

Co-authored-by: David Robertson <david.m.robertson1@gmail.com>

* revert changes - move to WorkerStore

* update unit test

* Note that timestamp is in millseconds

---------

Co-authored-by: David Robertson <david.m.robertson1@gmail.com>
2023-02-28 12:09:10 +00:00
Travis Ralston 189a878a35
Remove dangling reference to being a reference implementation (#15167)
* Remove dangling reference to being a reference implementation

* Create 15167.misc
2023-02-27 20:08:18 +00:00
Andrew Morgan b40657314e
Add module API callbacks for adding and deleting local 3PID associations (#15044 2023-02-27 14:19:19 +00:00
Patrick Cloke 4fc8875876
Refactor media modules. (#15146)
* Removes the `v1` directory from `test.rest.media.v1`.
* Moves the non-REST code from `synapse.rest.media.v1` to `synapse.media`.
* Flatten the `v1` directory from `synapse.rest.media`,  but leave compatiblity
  with 3rd party media repositories and spam checkers.
2023-02-27 08:26:05 -05:00
Andrew Morgan 3f2ef205e2
Small fixes to `MatrixFederationHttpClient` docstrings (#15148) 2023-02-27 13:03:22 +00:00
Shay 1c95ddd09b
Batch up storing state groups when creating new room (#14918) 2023-02-24 13:15:29 -08:00
Erik Johnston b2357a898c
Fix bug where 5s delays would occasionally happen. (#15150)
This only affects deployments using workers.
2023-02-24 14:39:50 +00:00
Sean Quah 335f52d595
Improve handling of non-ASCII characters in user directory search (#15143)
* Fix a long-standing bug where non-ASCII characters in search terms,
  including accented letters, would not match characters in a different
  case.
* Fix a long-standing bug where search terms using combining accents
  would not match display names using precomposed accents and vice
  versa.

To fully take effect, the user directory must be rebuilt after this
change.

Fixes #14630.

Signed-off-by: Sean Quah <seanq@matrix.org>
2023-02-24 13:39:45 +00:00
Patrick Cloke 682151a464
Do not fail completely if oEmbed autodiscovery fails. (#15092)
Previously if an autodiscovered oEmbed request failed (e.g. the
oEmbed endpoint is down or does not exist) then the entire URL
preview would fail. Instead we now return everything we can, even
if this additional request fails.
2023-02-23 16:08:53 -05:00
Patrick Cloke f8a584ed02
Stop parsing the unspecced type parameter on thumbnail requests. (#15137)
Ideally we would replace this with parsing of the Accept header
or something else, but for now just make Synapse spec compliant
by ignoring the unspecced parameter.

It does not seem that this is ever sent by a client, and even if it is
there's a reasonable fallback.
2023-02-23 16:07:46 -05:00
Patrick Cloke ec79870f14
Fix a typo in MSC3873 config option. (#15138)
Previously the experimental configuration option referred to the wrong
MSC number.
2023-02-23 16:06:42 -05:00
Dirk Klimpel a068ad7dd4
Add information on uploaded media to user export command. (#15107) 2023-02-23 13:14:17 -05:00
dependabot[bot] 9bb2eac719
Bump black from 22.12.0 to 23.1.0 (#15103) 2023-02-22 15:29:09 -05:00
Patrick Cloke 4ed08ff72e
Tighten the default rate limit of creating new devices. (#15135) 2023-02-22 14:37:18 -05:00
Dirk Klimpel 6def779a1a
Use `json.dump` in `FileExfiltrationWriter` (#15095)
To directly write to the open file, instead of writing to an
in-memory string first.
2023-02-22 14:29:39 -05:00
David Robertson 647ff3ef65
Remove unused `room_alias` field from `/createRoom` response (#15093)
* Change `create_room` return type

* Don't return room alias from /createRoom

* Update other callsites

* Fix up mypy complaints

It looks like new_room_user_id is None iff new_room_id is None. It's a
shame we haven't expressed this in a way that mypy can understand.

* Changelog
2023-02-22 11:07:28 +00:00
reivilibre addd12f16d
Tweak logging for when a worker waits for its view of a replication stream to catch up. (#15120)Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
* Improve logging messages for the 'wait for repl stream' read-after-write consistency feature

* Newsfile

Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>

* Update synapse/replication/tcp/client.py

Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>

---------

Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>
Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
2023-02-21 12:26:00 +00:00
David Robertson e26d7d5ae7
Teach portdb about `un_partial_stated_event_stream` (#15108)
* Sort BOOLEAN_COLUMNS and APPEND_ONLY_TABLES

So I can see if a given table is present in logarithmic time, rather
than linear.

* Teach portdb about `un_partial_stated_event_streams`

* Comments comments comments

* Changelog
2023-02-20 13:35:24 +00:00
realtyem 490a3675bd
Allow health listener resource to load (#15096)
* Allow health listener resource to load.

* changelog

* Update changelog.d/15096.bugfix
2023-02-20 12:23:00 +00:00
reivilibre 1cbc3f197c
Fix a bug introduced in Synapse v1.74.0 where searching with colons when using ICU for search term tokenisation would fail with an error. (#15079)
Co-authored-by: David Robertson <davidr@element.io>
2023-02-20 12:00:18 +00:00
Dirk Klimpel 61bfcd669a
Add account data to export command (#14969)
* Add account data to to export command

* newsfile

* remove not needed function

* update newsfile

* adopt #14973
2023-02-17 13:54:55 +00:00
Sean Quah 4f4f27e57f
Mitigate a race where /make_join could 403 for restricted rooms (#15080)
Previously, when creating a join event in /make_join, we would decide
whether to include additional fields to satisfy restricted room checks
based on the current state of the room. Then, when building the event,
we would capture the forward extremities of the room to use as prev
events.

This is subject to race conditions. For example, when leaving and
rejoining a room, the following sequence of events leads to a misleading
403 response:
1. /make_join reads the current state of the room and sees that the user
   is still in the room. It decides to omit the field required for
   restricted room joins.
2. The leave event is persisted and the room's forward extremities are
   updated.
3. /make_join builds the event, using the post-leave forward extremities.
   The event then fails the restricted room checks.

To mitigate the race, we move the read of the forward extremities closer
to the read of the current state. Ideally, we would compute the state
based off the chosen prev events, but that can involve state resolution,
which is expensive.

Signed-off-by: Sean Quah <seanq@matrix.org>
2023-02-17 09:40:32 +00:00
David Robertson ffc2ee521d
Use mypy 1.0 (#15052)
* Update mypy and mypy-zope
* Remove unused ignores

These used to suppress

```
synapse/storage/engines/__init__.py:28: error: "__new__" must return a
class instance (got "NoReturn")  [misc]
```

and

```
synapse/http/matrixfederationclient.py:1270: error: "BaseException" has no attribute "reasons"  [attr-defined]
```

(note that we check `hasattr(e, "reasons")` above)

* Avoid empty body warnings, sometimes by marking methods as abstract

E.g.

```
tests/handlers/test_register.py:58: error: Missing return statement  [empty-body]
tests/handlers/test_register.py:108: error: Missing return statement  [empty-body]
```

* Suppress false positive about `JaegerConfig`

Complaint was

```
synapse/logging/opentracing.py:450: error: Function "Type[Config]" could always be true in boolean context  [truthy-function]
```

* Fix not calling `is_state()`

Oops!

```
tests/rest/client/test_third_party_rules.py:428: error: Function "Callable[[], bool]" could always be true in boolean context  [truthy-function]
```

* Suppress false positives from ParamSpecs

````
synapse/logging/opentracing.py:971: error: Argument 2 to "_custom_sync_async_decorator" has incompatible type "Callable[[Arg(Callable[P, R], 'func'), **P], _GeneratorContextManager[None]]"; expected "Callable[[Callable[P, R], **P], _GeneratorContextManager[None]]"  [arg-type]
synapse/logging/opentracing.py:1017: error: Argument 2 to "_custom_sync_async_decorator" has incompatible type "Callable[[Arg(Callable[P, R], 'func'), **P], _GeneratorContextManager[None]]"; expected "Callable[[Callable[P, R], **P], _GeneratorContextManager[None]]"  [arg-type]
````

* Drive-by improvement to `wrapping_logic` annotation

* Workaround false "unreachable" positives

See https://github.com/Shoobx/mypy-zope/issues/91

```
tests/http/test_proxyagent.py:626: error: Statement is unreachable  [unreachable]
tests/http/test_proxyagent.py:762: error: Statement is unreachable  [unreachable]
tests/http/test_proxyagent.py:826: error: Statement is unreachable  [unreachable]
tests/http/test_proxyagent.py:838: error: Statement is unreachable  [unreachable]
tests/http/test_proxyagent.py:845: error: Statement is unreachable  [unreachable]
tests/http/federation/test_matrix_federation_agent.py:151: error: Statement is unreachable  [unreachable]
tests/http/federation/test_matrix_federation_agent.py:452: error: Statement is unreachable  [unreachable]
tests/logging/test_remote_handler.py:60: error: Statement is unreachable  [unreachable]
tests/logging/test_remote_handler.py:93: error: Statement is unreachable  [unreachable]
tests/logging/test_remote_handler.py:127: error: Statement is unreachable  [unreachable]
tests/logging/test_remote_handler.py:152: error: Statement is unreachable  [unreachable]
```

* Changelog

* Tweak DBAPI2 Protocol to be accepted by mypy 1.0

Some extra context in:
- https://github.com/matrix-org/python-canonicaljson/pull/57
- https://github.com/python/mypy/issues/6002
- https://mypy.readthedocs.io/en/latest/common_issues.html#covariant-subtyping-of-mutable-protocol-members-is-rejected

* Pull in updated canonicaljson lib

so the protocol check just works

* Improve comments in opentracing

I tried to workaround the ignores but found it too much trouble.

I think the corresponding issue is
https://github.com/python/mypy/issues/12909. The mypy repo has a PR
claiming to fix this (https://github.com/python/mypy/pull/14677) which
might mean this gets resolved soon?

* Better annotation for INTERACTIVE_AUTH_CHECKERS

* Drive-by AUTH_TYPE annotation, to remove an ignore
2023-02-16 16:09:11 +00:00
Patrick Cloke 979f237b28
Update intentional mentions (MSC3952) to depend on `exact_event_match` (MSC3758). (#15037)
This replaces the specific `is_room_mention` push rule condition
used in MSC3952 with the generic `exact_event_match` push rule
condition from MSC3758.

No functionality changes due to this.
2023-02-16 09:51:22 -05:00
Sean Quah 3ad817bfe5
Fix federated joins when the first server in the list is not in the room (#15074)
Previously we would give up upon receiving a 404 from the first server,
instead of trying the rest of the servers in the list.

Signed-off-by: Sean Quah <seanq@matrix.org>
2023-02-15 13:59:06 +00:00
999lakhisidhu 27a3a72a50
Support for selecting the Redis logical database. (#15034)
Note that this is only used for key-value store (cached values)
and not for the pub/sub replication used by Synapse.
2023-02-15 07:39:31 -05:00
Richard van der Hoff 5febf88b6c
Update the error code for duplicate annotation (#15075) 2023-02-15 11:47:57 +00:00
David Robertson 06ba71083e
Fix order of partial state tables when purging (#15068)
* Fix order of partial state tables when purging

`partial_state_rooms` has an FK on `events` pointing to the join event we
get from `/send_join`, so we must delete from that table before deleting
from `events`.

**NB:** It would be nice to cancel any resync processes for the room
being purged. We do not do this at present. To do so reliably we'd need
an internal HTTP "replication" endpoint, because the worker doing the
resync process may be different to that handling the purge request.

The first time the resync process tries to write data after the deletion
it will fail because we have deleted necessary data e.g. auth
events. AFAICS it will not retry the resync, so the only downside to
not cancelling the resync is a scary-looking traceback.

(This is presumably extremely race-sensitive.)

* Changelog

* admist(?) -> between

* Warn about a race

* Fix typo, thanks Sean

Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>

---------

Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
2023-02-14 23:42:29 +00:00
Patrick Cloke 119e0795a5
Implement MSC3966: Add a push rule condition to search for a value in an array. (#15045)
The `exact_event_property_contains` condition can be used to
search for a value inside of an array.
2023-02-14 14:02:19 -05:00
reivilibre e9b1ff9f31
Prevent clients from reporting nonexistent events. (#13779) 2023-02-14 15:50:59 +00:00
Sean Quah 463c19ac36
Faster joins: Omit device list updates from partial state rooms in /sync (#15069)
...when lazy loading of members is not enabled. It's weird to notify
a client that another user's device list has changed when the client
doesn't think that they share a room.

Note that when a room is un-partial stated, device list updates are
emitted for every member in that room over /sync.

Signed-off-by: Sean Quah <seanq@matrix.org>
2023-02-14 12:32:19 +00:00
Erik Johnston cb262713b7
Fix clashing DB txn name (#15070)
* Fix clashing DB txn name

* Newsfile
2023-02-14 11:20:25 +00:00
Erik Johnston f09db5c991
Skip calculating unread push actions in `/sync` when `enable_push` is false. (#14980) 2023-02-14 11:10:29 +00:00
Harishankar Kumar db2b105d69
Change collection[str] to StrCollection in event_auth code (#14929)
Signed-off-by: Harishankar Kumar <hari01584@gmail.com>
2023-02-14 09:37:08 +00:00
reivilibre 3d7aead5d6
Tweak comment on `_is_local_room_accessible` as part of room visibility in `/hierarchy` to clarify the condition for a room being visible. (#14834) 2023-02-13 16:30:58 +00:00
Andrew Morgan bdccfd2477
Refactor arguments of `try_unbind_threepid(_with_id_server)` from dict to separate args (#15053) 2023-02-13 12:12:48 +00:00
David Robertson c10e131250
Apply logging from hotfixes branch to develop (#15054)
* Apply logging from hotfixes branch to develop

Part of #4826.

Originally added in #11882.

* Changelog
2023-02-13 11:49:20 +00:00
Mathieu Velten 6cddf24e36
Faster joins: don't stall when a user joins during a fast join (#14606)
Fixes #12801.
Complement tests are at
https://github.com/matrix-org/complement/pull/567.

Avoid blocking on full state when handling a subsequent join into a
partial state room.

Also always perform a remote join into partial state rooms, since we do
not know whether the joining user has been banned and want to avoid
leaking history to banned users.

Signed-off-by: Mathieu Velten <mathieuv@matrix.org>
Co-authored-by: Sean Quah <seanq@matrix.org>
Co-authored-by: David Robertson <davidr@element.io>
2023-02-10 23:31:05 +00:00
Sean Quah d0c713cc85
Return read-only collections from `@cached` methods (#13755)
It's important that collections returned from `@cached` methods are not
modified, otherwise future retrievals from the cache will return the
modified collection.

This applies to the return values from `@cached` methods and the values
inside the dictionaries returned by `@cachedList` methods. It's not
necessary for the dictionaries returned by `@cachedList` methods
themselves to be read-only.

Signed-off-by: Sean Quah <seanq@matrix.org>
Co-authored-by: David Robertson <davidr@element.io>
2023-02-10 23:29:00 +00:00
Patrick Cloke 14be78d492
Support for MSC3758: exact_event_match push condition (#14964)
This specifies to search for an exact value match, instead of
string globbing. It only works across non-compound JSON values
(null, boolean, integer, and strings).
2023-02-10 12:37:07 -05:00
Patrick Cloke cf5233b783
Avoid fetching unused account data in sync. (#14973)
The per-room account data is no longer unconditionally
fetched, even if all rooms will be filtered out.

Global account data will not be fetched if it will all be
filtered out.
2023-02-10 14:22:16 +00:00
David Robertson d793fcd241
Merge branch 'release-v1.77' into develop 2023-02-10 13:43:18 +00:00
Sean Quah b95407908d
Avoid mutating cached values in `_generate_sync_entry_for_account_data` (#15047) 2023-02-10 08:11:20 -05:00
Patrick Cloke a481fb9f98
Refactor get_user_devices_from_cache to avoid mutating cached values. (#15040)
The previous version of the code could mutate a cached value,
but only if the input requested all devices of a user *and* a specific
device.

To avoid this nonsensical situation we no longer fetch a specific
device ID if all of a user's devices are returned.
2023-02-10 08:09:47 -05:00
Erik Johnston fd296b7343
Fix exception on start up about device lists (#15041)
Fixes #15010.
2023-02-10 09:52:35 +00:00
David Robertson a5a799722d
Tag federation request spans with the worker name (#15042)
* Systematically include worker name as process info

* Changelog

* don't bother with inner setdefault
2023-02-09 22:33:39 +00:00
Shay 03bccd542b
Add a class UnpersistedEventContext to allow for the batching up of storing state groups (#14675)
* add class UnpersistedEventContext

* modify create new client event to create unpersistedeventcontexts

* persist event contexts after creation

* fix tests to persist unpersisted event contexts

* cleanup

* misc lints + cleanup

* changelog + fix comments

* lints

* fix batch insertion?

* reduce redundant calculation

* add unpersisted event classes

* rework compute_event_context, split into function that returns unpersisted event context and then persists it

* use calculate_context_info to create unpersisted event contexts

* update typing

* $%#^&*

* black

* fix comments and consolidate classes, use attr.s for class

* requested changes

* lint

* requested changes

* requested changes

* refactor to be stupidly explicit

* clearer renaming and flow

* make partial state non-optional

* update docstrings

---------

Co-authored-by: Erik Johnston <erik@matrix.org>
2023-02-09 13:05:02 -08:00
Andrew Morgan c1d2ce2901
Do not always start a db txn on Postgres (#14840) 2023-02-09 19:57:01 +00:00
Patrick Cloke d22c1c862c
Respond correctly to unknown methods on known endpoints (#14605)
Respond with a 405 error if a request is received on a known endpoint,
but to an unknown method, per MSC3743.
2023-02-09 13:04:24 -05:00
Patrick Cloke 8a6e043488
Avoid mutating cached room aliases. (#15038)
This might cause incorrect data in other callers which
are not expecting the canonical alias to be added into
the response.
2023-02-09 15:56:02 +00:00
David Robertson cd2484dc2e
Bump schema version (#15036)
* Bump schema version

This should have been included in
f10caa73ee (and #14979).

* Changelog
2023-02-09 15:28:26 +00:00
Patrick Cloke 733531ee3e
Add final type hint to synapse.server. (#15035) 2023-02-09 09:49:04 -05:00
Shay 55e4d27b36
Limit concurrent event creation for a room to avoid state resolution when sending bursts of events to a local room (#14977) 2023-02-08 11:25:11 -08:00
Patrick Cloke c951fbedcb
MSC3873: Escape keys when flattening dicts. (#15004)
This disambiguates keys which attempt to match fields
with a dot in them (e.g. m.relates_to).

Disabled by default behind an experimental configuration flag.
2023-02-08 13:09:41 -05:00
Erik Johnston c78c67c5a9
Fix bug in replication where response is cached (#15024) 2023-02-08 16:41:55 +00:00
David Robertson dccae64083
Merge branch 'release-v1.77' into develop 2023-02-08 12:45:46 +00:00
David Robertson f10caa73ee
Disambiguate `get_ex_outlier_stream_rows` query
A backwards-compatible piece of #14979 that's safe to land now.
2023-02-07 15:33:33 +00:00
David Robertson 9cd7610f86
Revert "Add `event_stream_ordering` column to membership state tables (#14979)"
This reverts commit 5fdc12f482.
2023-02-07 15:26:55 +00:00
David Robertson 2dff93099b
Typecheck tests.rest.media.v1.test_media_storage (#15008)
* Fix MediaStorage type hint

* Typecheck tests.rest.media.v1.test_media_storage

* Changelog

* Remove assert and make the comment succinct

* Fix syntax for olddeps
2023-02-07 15:24:44 +00:00
Patrick Cloke 5b55c32d61
Add tests for using _flatten_dict with an event. (#15002) 2023-02-07 06:56:09 -05:00
David Robertson d0fed7a37b
Properly typecheck types.http (#14988)
* Tweak http types in Synapse

AFACIS these are correct, and they make mypy happier on tests.http.

* Type hints for test_proxyagent

* type hints for test_srv_resolver

* test_matrix_federation_agent

* tests.http.server._base

* tests.http.__init__

* tests.http.test_additional_resource

* tests.http.test_client

* tests.http.test_endpoint

* tests.http.test_matrixfederationclient

* tests.http.test_servlet

* tests.http.test_simple_client

* tests.http.test_site

* One fixup in tests.server

* Untyped defs

* Changelog

* Fixup syntax for Python 3.7

* Fix olddeps syntax

* Use a twisted IPv4 addr for dummy_address

* Fix typo, thanks Sean

Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>

* Remove redundant `Optional`

---------

Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
2023-02-07 00:20:04 +00:00
Nick Mills-Barrett 5fdc12f482
Add `event_stream_ordering` column to membership state tables (#14979)
This adds an `event_stream_ordering` column to `current_state_events`,
`local_current_membership` and `room_memberships`. Each of these tables
is regularly joined with the `events` table to get the stream ordering
and denormalising this into each table will yield significant query
performance improvements once used. Includes a background job to
populate these values from the `events` table.

Same idea as https://github.com/matrix-org/synapse/pull/13703.

Signed off by Nick @ Beeper (@fizzadar).
2023-02-07 00:10:54 +00:00
David Robertson e8269ed391
Type hints for tests.appservice (#14990)
* Accept a Sequence of events in synapse.appservice

This avoids some casts/ignores in the tests I'm about to fixup. It seems
that `List[Mock]` is not a subtype of `List[EventBase]`, but
`Sequence[Mock]` is a subtype of `Sequence[EventBase]`. So presumably
`Mock` is considered a subtype of anything, much like `Any`.

* make tests.appservice.test_scheduler pass mypy

* Extra hints in tests.appservice.test_scheduler

* Extra hints in tests.appservice.test_api

* Extra hints in tests.appservice.test_appservice

* Disallow untyped defs

* Changelog
2023-02-06 12:49:06 +00:00
David Robertson b3bf58a8a5
Only notify the target of a membership event (#14971)
* Only notify the target of a membership event

Naughty, but should be a big speedup in large rooms
2023-02-06 11:29:51 +00:00
David Robertson 6e6edea6c1
Properly typecheck tests.api (#14983) 2023-02-03 20:03:23 +00:00
Patrick Cloke b2d97bac09
Implement MSC3958: suppress notifications from edits (#14960)
Co-authored-by: Brad Murray <brad@beeper.com>
Co-authored-by: Nick Barrett <nick@beeper.com>

Copy the suppress_edits push rule from Beeper to implement MSC3958.

9415a1284b/rust/src/push/base_rules.rs (L98-L114)
2023-02-03 14:31:14 -05:00
Patrick Cloke f0cae26d58
Add a docstring & tests for _flatten_dict. (#14981) 2023-02-03 16:48:13 +00:00
Patrick Cloke 52700a0bcf
Support the backwards compatibility features in MSC3952. (#14958)
If the feature is enabled and the event has a `m.mentions` property,
skip processing of the legacy mentions rules.
2023-02-03 16:28:20 +00:00
Sean Quah 0a686d1d13
Faster joins: Refactor handling of servers in room (#14954)
Ensure that the list of servers in a partial state room always contains
the server we joined off.

Also refactor `get_partial_state_servers_at_join` to return `None` when
the given room is no longer partial stated, to explicitly indicate when
the room has partial state. Otherwise it's not clear whether an empty
list means that the room has full state, or the room is partial stated,
but the server we joined off told us that there are no servers in the
room.

Signed-off-by: Sean Quah <seanq@matrix.org>
2023-02-03 15:39:59 +00:00
Patrick Cloke 8e9fc28c6a
Reload the pyo3-log config when the Python logging config changes. (#14976)
Since pyo3-log is initialized very early in the Python start-up
it caches the state of the loggers before they're fully initialized
(and thus are essentially disabled). Whenever we reload the
logging configuration we now also tell pyo3-log to discard
any cached logging configuration it has; it will refetch the
current logging configuration from Python at the next point
it logs.

This fixes Rust log lines not appearing in the homeserver logs.
2023-02-03 08:27:31 -05:00
Patrick Cloke da05b70af5
Skip unused calculations in sync handler. (#14908)
If a sync request does not need to calculate per-room entries &
is not generating presence & is not generating device list data
(e.g. during initial sync) avoid the expensive calculation of room
specific data.

This is a micro-optimisation for clients syncing simply to receive
to-device information.
2023-02-02 13:45:12 -05:00
Patrick Cloke f36da501be
Do not calculate presence or ephemeral events when they are filtered out (#14970)
This expands the previous optimisation from being only for initial
sync to being for all sync requests.

It also inverts some of the logic to be inclusive instead of exclusive.
2023-02-02 11:58:20 -05:00
David Robertson 2186ebed6c
Fetch fewer events when getting hosts in room (#14962) 2023-02-02 16:49:14 +00:00
realtyem 58214dbb9b
Allow enabling the asyncio reactor in complement (#14858)
Signed-off-by: Jason Little realtyem@gmail.com
2023-02-01 23:42:45 +00:00
Patrick Cloke 1182ae5063
Add helper to parse an enum from query args & use it. (#14956)
The `parse_enum` helper pulls an enum value from the query string
(by delegating down to the parse_string helper with values generated
from the enum).

This is used to pull out "f" and "b" in most places and then we thread
the resulting Direction enum throughout more code.
2023-02-01 21:35:24 +00:00
Patrick Cloke 230a831c73
Attempt to delete more duplicate rows in receipts_linearized table. (#14915)
The previous assumption was that the stream_id column was unique
(for a room ID, receipt type, user ID tuple), but this turned out to be
incorrect.

Now find the max stream ID, then map this back to a database-specific
row identifier and delete other rows which match the (room ID, receipt type,
user ID) tuple, but *not* the row ID.
2023-02-01 15:45:10 -05:00
Dirk Klimpel bf82b56bab
Add more user information to export-data command. (#14894)
* The user's profile information.
* The user's devices.
* The user's connections / IP address information.
2023-02-01 15:45:19 +00:00
David Robertson 3b8574b4f2
Tag /send_join responses to detect faster joins (#14950)
* Tag /send_join responses to detect faster joins

* Changelog

* Define a proper SynapseTag

* isort
2023-01-31 12:43:20 +00:00
Sean Quah 805b641fb6
Fix "Re-starting finished log context" spam when creating events (#14947)
`run_in_background` calls re-use the current logging context. When they
are not awaited, they can complete after the current logging context has
been marked as finished, which leads to log spam. Use
`run_as_background_process` instead.

Fixes one of the instances of #13090.

Signed-off-by: Sean Quah <seanq@matrix.org>
2023-01-31 11:31:52 +00:00
Sean Quah 6d14fdc271
Make sqlite database migrations transactional again, part two (#14926)
#14910 fixed the regression introduced by #13873 where sqlite database
migrations would no longer run inside a transaction. However, it
committed the transaction before Synapse updated its bookkeeping of
which migrations have been run, which means that migrations may be run
again after they have completed successfully.

Leave the transaction open at the end of `executescript`, to restore the
old, correct behaviour. Also make the PostgreSQL behaviour consistent
with SQLite.

Fixes #14909.

Signed-off-by: Sean Quah <seanq@matrix.org>
2023-01-31 11:03:55 +00:00
David Robertson a134e626e4
Reject boolean power levels (#14944)
* Better test for bad values in power levels events

The previous test only checked that Synapse didn't raise an exception,
but didn't check that we had correctly interpreted the value of the
dodgy power level.

It also conflated two things: bad room notification levels, and bad user
levels. There _is_ logic for converting the latter to integers, but we
should test it separately.

* Check we ignore types that don't convert to int

* Handle `None` values in `notifications.room`

* Changelog

* Also test that bad values are rejected by event auth

* Docstring

* linter scripttttttttt

* Test boolean values in PL content

* Reject boolean power levels

* Changelog
2023-01-31 10:57:02 +00:00
David Robertson 796a4b7482
Prefer `type(x) is int` to `isinstance(x, int)` (#14945)
* Perfer `type(x) is int` to `isinstance(x, int)`

This covered all additional instances I could see where `x` was
user-controlled.
The remaining cases are

```
$ rg -s 'isinstance.*[^_]int'
tests/replication/_base.py
576:        if isinstance(obj, int):

synapse/util/caches/stream_change_cache.py
136:        assert isinstance(stream_pos, int)
214:        assert isinstance(stream_pos, int)
246:        assert isinstance(stream_pos, int)
267:        assert isinstance(stream_pos, int)

synapse/replication/tcp/external_cache.py
133:        if isinstance(result, int):

synapse/metrics/__init__.py
100:        if isinstance(calls, (int, float)):

synapse/handlers/appservice.py
262:        assert isinstance(new_token, int)

synapse/config/_util.py
62:        if isinstance(p, int):
```

which cover metrics, logic related to `jsonschema`, and replication and
data streams. AFAICS these are all internal to Synapse

* Changelog
2023-01-31 10:33:07 +00:00
David Robertson 510d4b06e7
Handle malformed values of `notification.room` in power level events (#14942)
* Better test for bad values in power levels events

The previous test only checked that Synapse didn't raise an exception,
but didn't check that we had correctly interpreted the value of the
dodgy power level.

It also conflated two things: bad room notification levels, and bad user
levels. There _is_ logic for converting the latter to integers, but we
should test it separately.

* Check we ignore types that don't convert to int

* Handle `None` values in `notifications.room`

* Changelog

* Also test that bad values are rejected by event auth

* Docstring

* linter scripttttttttt
2023-01-30 21:29:30 +00:00
Patrick Cloke 2a51f3ec36
Implement MSC3952: Intentional mentions (#14823)
MSC3952 defines push rules which searches for mentions in a list of
Matrix IDs in the event body, instead of searching the entire event
body for display name / local part.

This is implemented behind an experimental configuration flag and
does not yet implement the backwards compatibility pieces of the MSC.
2023-01-27 10:16:21 -05:00
David Robertson faecc6c083
Merge branch 'release-v1.76' into develop 2023-01-27 13:01:18 +00:00
Patrick Cloke 265735db9d
Use an enum for direction. (#14927)
For better type safety we  use an enum instead of strings to
configure direction (backwards or forwards).
2023-01-27 07:27:55 -05:00
Patrick Cloke fc35e0673f
Add missing type hints in tests (#14879)
* FIx-up type hints in tests.logging.
* Add missing type hints to test_transactions.
2023-01-26 14:45:24 -05:00
Patrick Cloke 345576bc34
Fix paginating /relations with a live token (#14866)
The `/relations` endpoint was not properly handle "live tokens"
(i.e sync tokens), to do this properly we abstract the code that
`/messages` has and re-use it.
2023-01-26 13:24:15 -05:00
Patrick Cloke ba79fb4a61
Use StrCollection in place of Collection[str] in (most) handlers code. (#14922)
Due to the increased safety of StrCollection over Collection[str]
and Sequence[str].
2023-01-26 12:31:58 -05:00
Patrick Cloke 8a05d5de21
Batch look-ups to see if rooms are partial stated. (#14917)
* Batch look-ups to see if rooms are partial stated.

* Fix issues found in linting.

* Fix typo.

* Apply suggestions from code review

Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>

* Clarify comments.

Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>

* Also improve the cache size while we're at it

* is_partial_state_rooms -> is_partial_state_room_batched

* Run `black`

* Improve annotation for `simple_select_many_batch`

* Fix is_partial_state_room_batched impl

* Okay, _actually_ fix impl

* Update description.

* Update synapse/storage/databases/main/room.py

Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>

* Run black.

Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
Co-authored-by: David Robertson <davidr@element.io>
2023-01-26 17:15:36 +00:00
Sean Quah cf66d712c6
Fix initialization of `_device_list_id_gen` (#14914)
On startup, the `_device_list_id_gen` stream id generator is initialized
using the maximum stream id seen in a list of tables. When we started
populating the `device_list_remote_pending` table in #13913, we forgot
to add it to the aforementioned list of tables, so the stream id
generator can hand out old stream ids after a restart. The end result is
that Synapse can fail to handle device list update EDUs after a restart
when a partial state join is in progress.

Add the `device_list_remote_pending` table to the list of tables to
consider when initializing the `_device_list_id_gen` stream id generator.

Signed-off-by: Sean Quah <seanq@matrix.org>
2023-01-26 10:38:49 +00:00
Patrick Cloke 7e8d455280
Fix a bug in the send_local_online_presence_to module API (#14880)
Destination was being used incorrectly (a single destination instead
of a list of destinations was being passed).

This also updates some of the types in the area to not use Collection[str],
which is a footgun.
2023-01-25 21:34:37 +00:00
Patrick Cloke 3c3ba31507
Add missing type hints for tests.events. (#14904) 2023-01-25 15:14:03 -05:00
David Robertson 8e37ece015
Bump the client-side timeout for /state (#14912)
* Bump the client-side timeout for /state

to allow faster joins resyncs the chance to complete for large rooms.
We have seen this fair poorly (~90s for Matrix HQ's /state) in testing,
causing the resync to advance to another HS who hasn't seen our join yet.

* Changelog

* Milliseconds!!!!
2023-01-25 16:11:06 +00:00
Sean Quah a63d4cc9e9
Make sqlite database migrations transactional again (#14910)
#13873 introduced a regression which causes sqlite database migrations
to no longer run inside a transaction. Wrap them in a transaction again,
to avoid database corruption when migrations are interrupted.

Fixes #14909.

Signed-off-by: Sean Quah <seanq@matrix.org>
2023-01-25 13:38:53 +00:00
David Robertson 4607be0b7b
Request partial joins by default (#14905)
* Request partial joins by default

This is a little sloppy, but we are trying to gain confidence in faster
joins in the upcoming RC.

Admins can still opt out by adding the following to their Synapse
config:

```yaml
experimental:
    faster_joins: false
```

We may revert this change before the release proper, depending on how
testing in the wild goes.

* Changelog

* Try to fix the backfill test failures

* Upgrade notes

* Postgres compat?
2023-01-24 15:28:20 +00:00
David Robertson 80d44060c9
Faster joins: omit partial rooms from eager syncs until the resync completes (#14870)
* Allow `AbstractSet` in `StrCollection`

Or else frozensets are excluded. This will be useful in an upcoming
commit where I plan to change a function that accepts `List[str]` to
accept `StrCollection` instead.

* `rooms_to_exclude` -> `rooms_to_exclude_globally`

I am about to make use of this exclusion mechanism to exclude rooms for
a specific user and a specific sync. This rename helps to clarify the
distinction between the global config and the rooms to exclude for a
specific sync.

* Better function names for internal sync methods

* Track a list of excluded rooms on SyncResultBuilder

I plan to feed a list of partially stated rooms for this sync to ignore

* Exclude partial state rooms during eager sync

using the mechanism established in the previous commit

* Track un-partial-state stream in sync tokens

So that we can work out which rooms have become fully-stated during a
given sync period.

* Fix mutation of `@cached` return value

This was fouling up a complement test added alongside this PR.
Excluding a room would mean the set of forgotten rooms in the cache
would be extended. This means that room could be erroneously considered
forgotten in the future.

Introduced in #12310, Synapse 1.57.0. I don't think this had any
user-visible side effects (until now).

* SyncResultBuilder: track rooms to force as newly joined

Similar plan as before. We've omitted rooms from certain sync responses;
now we establish the mechanism to reintroduce them into future syncs.

* Read new field, to present rooms as newly joined

* Force un-partial-stated rooms to be newly-joined

for eager incremental syncs only, provided they're still fully stated

* Notify user stream listeners to wake up long polling syncs

* Changelog

* Typo fix

Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>

* Unnecessary list cast

Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>

* Rephrase comment

Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>

* Another comment

Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>

* Fixup merge(?)

* Poke notifier when receiving un-partial-stated msg over replication

* Fixup merge whoops

Thanks MV :)

Co-authored-by: Mathieu Velen <mathieuv@matrix.org>

Co-authored-by: Mathieu Velten <mathieuv@matrix.org>
Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
2023-01-23 15:44:39 +00:00
Patrick Cloke 82d3efa312
Skip processing stats for broken rooms. (#14873)
* Skip processing stats for broken rooms.

* Newsfragment

* Use a custom exception.
2023-01-23 11:36:20 +00:00
Sean Quah 2ec9c58496
Faster joins: Update room stats and the user directory on workers when finishing join (#14874)
* Faster joins: Update room stats and user directory on workers when done

When finishing a partial state join to a room, we update the current
state of the room without persisting additional events. Workers receive
notice of the current state update over replication, but neglect to wake
the room stats and user directory updaters, which then get incidentally
triggered the next time an event is persisted or an unrelated event
persister sends out a stream position update.

We wake the room stats and user directory updaters at the appropriate
time in this commit.

Part of #12814 and #12815.

Signed-off-by: Sean Quah <seanq@matrix.org>

* fixup comment

Signed-off-by: Sean Quah <seanq@matrix.org>
2023-01-23 10:31:36 +00:00
reivilibre 22cc93afe3
Enable Faster Remote Room Joins against worker-mode Synapse. (#14752)
* Enable Complement tests for Faster Remote Room Joins on worker-mode

* (dangerous) Add an override to allow Complement to use FRRJ under workers

* Newsfile

Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>

* Fix race where we didn't send out replication notification

* MORE HACKS

* Fix get_un_partial_stated_rooms_token to take instance_name

* Fix bad merge

* Remove warning

* Correctly advance un_partial_stated_room_stream

* Fix merge

* Add another notify_replication

* Fixups

* Create a separate ReplicationNotifier

* Fix test

* Fix portdb

* Create a separate ReplicationNotifier

* Fix test

* Fix portdb

* Fix presence test

* Newsfile

* Apply suggestions from code review

* Update changelog.d/14752.misc

Co-authored-by: Erik Johnston <erik@matrix.org>

* lint

Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>
Co-authored-by: Erik Johnston <erik@matrix.org>
2023-01-22 21:10:11 +00:00
Sean Quah d329a566df
Faster joins: Fix incompatibility with restricted joins (#14882)
* Avoid clearing out forward extremities when doing a second remote join

When joining a restricted room where the local homeserver does not have
a user able to issue invites, we perform a second remote join. We want
to avoid clearing out forward extremities in this case because the
forward extremities we have are up to date and clearing out forward
extremities creates a window in which the room can get bricked if
Synapse crashes.

Signed-off-by: Sean Quah <seanq@matrix.org>

* Do a full join when doing a second remote join into a full state room

We cannot persist a partial state join event into a joined full state
room, so we perform a full state join for such rooms instead. As a
future optimization, we could always perform a partial state join and
compute or retrieve the full state ourselves if necessary.

Signed-off-by: Sean Quah <seanq@matrix.org>

* Add lock around partial state flag for rooms

Signed-off-by: Sean Quah <seanq@matrix.org>

* Preserve partial state info when doing a second partial state join

Signed-off-by: Sean Quah <seanq@matrix.org>

* Add newsfile

* Add a TODO(faster_joins) marker

Signed-off-by: Sean Quah <seanq@matrix.org>
2023-01-22 19:19:31 +00:00
Erik Johnston 0ec12a3753
Reduce max time we wait for stream positions (#14881)
Now that we wait for stream positions whenever we do a HTTP replication
hit, we need to be less brutal in the case where we do timeout (as we
have bugs around this).
2023-01-20 21:04:33 +00:00
Erik Johnston 65d0386693
Always notify replication when a stream advances (#14877)
This ensures that all other workers are told about stream updates in a timely manner, without having to remember to manually poke replication.
2023-01-20 18:02:18 +00:00
Sean Quah cdea7c11d0
Faster joins: Avoid starting duplicate partial state syncs (#14844)
Currently, we will try to start a new partial state sync every time we
perform a remote join, which is undesirable if there is already one
running for a given room.

We intend to perform remote joins whenever additional local users wish
to join a partial state room, so let's ensure that we do not start more
than one concurrent partial state sync for any given room.

------------------------------------------------------------------------

There is a race condition where the homeserver leaves a room and later
rejoins while the partial state sync from the previous membership is
still running. There is no guarantee that the previous partial state
sync will process the latest join, so we restart it if needed.

Signed-off-by: Sean Quah <seanq@matrix.org>
2023-01-20 12:06:19 +00:00
Erik Johnston cdf2707678
Fix bug in wait for stream position (#14872)
This caused some requests to fail.

This caused some requests to fail.

This really only started causing issues due to #14856
2023-01-19 22:19:56 +00:00
Andrew Morgan a7b54ca8d8
Implement MSC3930: polls push rules (#14787) 2023-01-19 12:47:10 +00:00
Erik Johnston 9187fd940e
Wait for streams to catch up when processing HTTP replication. (#14820)
This should hopefully mitigate a class of races where data gets out of
sync due a HTTP replication request racing with the replication streams.
2023-01-18 19:35:29 +00:00
Catalan Lover e8f2bf5c40
Change default room version to 10. Implements MSC3904 (#14111)
* Change Documentation to have v10 as default room version

* Change Default Room version to 10

* Add changelog entry for default room version swap

* Add changelog entry for v10 default room version in docs

* Clarify doc changelog entry

Co-authored-by: David Robertson <david.m.robertson1@gmail.com>

* Improve Documentation changes.

Co-authored-by: David Robertson <david.m.robertson1@gmail.com>

* Update Changelog entry to have correct format

Co-authored-by: David Robertson <david.m.robertson1@gmail.com>

* Update Spec Version to 1.5

* Only need 1 changelog.

* Fix test.

* Update "Changed in" line

Co-authored-by: David Robertson <david.m.robertson1@gmail.com>
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
Co-authored-by: Patrick Cloke <patrickc@matrix.org>
2023-01-18 18:59:48 +00:00
Patrick Cloke 4d6b1d3c47
Properly check for frozendicts in event auth code. (#14864)
Check for for an instance of a mapping instead of a dict.

This only affects room version 10 when frozen events are enabled.
2023-01-18 09:27:57 -05:00
David Robertson 5b3af1c7d0
Stabilise serving partial join responses (#14839)
Serving partial join responses is no longer experimental. They will only be served under the stable identifier if the the undocumented config flag experimental.msc3706_enabled is set to true.

Synapse continues to request a partial join only if the undocumented config flag experimental.faster_joins is set to true; this setting remains present and unaffected.
2023-01-17 12:44:15 +00:00
Erik Johnston 316590d1ea
Fix bug in `wait_for_stream_position` (#14856)
We were incorrectly checking if the *local* token had been advanced, rather than the token for the remote instance.

In practice, I don't think this has caused any bugs due to where we use `wait_for_stream_position`, as critically we don't use it on instances that also write to the given streams (and so the local token will lag behind all remote tokens).
2023-01-17 09:58:22 +00:00
Erik Johnston 2b084c5b71
Merge device list replication streams (#14833) 2023-01-17 09:29:58 +00:00
Sean Quah db5145a31d
Add parameter to control whether we do a partial state join (#14843)
When the local homeserver is already joined to a room and wants to
perform another remote join, we may find it useful to do a non-partial
state join if we already have the full state for the room.

Signed-off-by: Sean Quah <seanq@matrix.org>
2023-01-16 23:15:17 +00:00
Erik Johnston 4db3331bb9
Add an early return when handling no-op presence updates. (#14855)
This stops us from incrementing the presence stream position for no-op updates.
2023-01-16 14:20:12 +00:00
Sean Quah a302d3ecf7
Remove unnecessary reactor reference from `_PerHostRatelimiter` (#14842)
Fix up #14812 to avoid introducing a reference to the reactor.

Signed-off-by: Sean Quah <seanq@matrix.org>
2023-01-16 13:16:19 +00:00
David Robertson 85a7a201fa
Also use stable name in SendJoinResponse struct (#14841)
* Also use stable name in SendJoinResponse struct

follow-up to #14832

* Changelog

* Fix a rename I missed

* Run black

* Update synapse/federation/federation_client.py

Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>

Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
2023-01-16 12:40:25 +00:00
Andrew Morgan 54cd90ea60
Implement MSC3890: Remotely silence local notifications (#14775) 2023-01-13 19:32:10 +00:00
David Robertson 52ae80dd1a
Use stable identifiers for faster joins (#14832)
* Use new query param when requesting a partial join

* Read new query param when serving partial join

* Provide new field names when serving partial joins

* Read new field names from partial join response

* Changelog
2023-01-13 17:58:53 +00:00
Erik Johnston 73ff493dfb
Merge account data streams (#14826) 2023-01-13 14:57:43 +00:00
Dirk Klimpel 8d5325ec0c
Drop unused table `presence` (#14825) 2023-01-13 14:17:03 +00:00
Andrew Morgan 3a125625e7
Add some clarifying comments and refactor a portion of the `Keyring` class for readability (#14804) 2023-01-13 12:37:28 +00:00
Sean Quah 772e8c2385
Fix stack overflow in `_PerHostRatelimiter` due to synchronous requests (#14812)
When there are many synchronous requests waiting on a
`_PerHostRatelimiter`, each request will be started recursively just
after the previous request has completed. Under the right conditions,
this leads to stack exhaustion.

A common way for requests to become synchronous is when the remote
client disconnects early, because the homeserver is overloaded and slow
to respond.

Avoid stack exhaustion under these conditions by deferring subsequent
requests until the next reactor tick.

Fixes #14480.

Signed-off-by: Sean Quah <seanq@matrix.org>
2023-01-13 00:16:21 +00:00
Richard van der Hoff 0f061f39f0 Merge remote-tracking branch 'origin/release-v1.75' into develop 2023-01-12 16:45:23 +00:00