Commit Graph

8457 Commits

Author SHA1 Message Date
Erik Johnston 786de8570b
Speed up fetching partial-state rooms on sliding sync (#17666)
Instead of having a large cache of `room_id -> bool` about whether a
room is partially stated, replace with a "fetch rooms the user is which
are partially-stated". This is a lot faster as the set of partially
stated rooms at any point across the whole server is small, and so such
a query is fast.

The main issue with the bulk cache lookup is the CPU time looking all
the rooms up in the cache.
2024-09-06 11:12:54 +01:00
Erik Johnston d5accec2e5
Speed up sliding sync by avoiding copies (#17670)
We ended up spending ~10% CPU creating a new dictionary and
`_RoomMembershipForUser`, so let's avoid creating new dicts and copying
by returning `newly_joined`, `newly_left` and `is_dm` as sets directly.

---------

Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
2024-09-06 11:12:29 +01:00
Johannes Marbach de3363ef58
Stabilise MSC4156: `server_name` -> `via` (#17650) 2024-09-05 17:07:39 +01:00
Erik Johnston b09bcf16d9
Fix background update to handle invalid events (#17641)
Follow-up to #17634, https://github.com/element-hq/synapse/pull/17631
and https://github.com/element-hq/synapse/pull/17632 to fix-up
https://github.com/element-hq/synapse/pull/17512
2024-09-05 14:15:04 +01:00
Eric Eastwood b054690c8c
Sliding Sync: Prevent duplicate tags being added to traces (#17655)
Prevent duplicate tags being added to traces.

Noticed because we see these warnings in Jaeger:

<img width="462" alt="Screenshot 2024-09-03 at 2 34 05 PM"
src="https://github.com/user-attachments/assets/6fac12ed-0074-435b-9451-eccde7e7012a">
2024-09-05 10:05:01 +01:00
Erik Johnston dce38f3faf
Fix sliding sync on workers (#17649)
Broke in #17630

---------

Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
2024-09-04 10:52:46 +01:00
Erik Johnston 391c4f870b Merge remote-tracking branch 'origin/release-v1.114' into develop 2024-09-02 20:58:49 +01:00
Erik Johnston ac27c9e46a 1.114.0 2024-09-02 15:14:57 +01:00
Erik Johnston f729ef08c9
Enable sliding sync support by default (#17648)
Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
2024-09-02 15:09:04 +01:00
Quentin Gliech 7d52ce7d4b
Format files with Ruff (#17643)
I thought ruff check would also format, but it doesn't.

This runs ruff format in CI and dev scripts. The first commit is just a
run of `ruff format .` in the root directory.
2024-09-02 12:39:04 +01:00
Erik Johnston 709b7363fe
Sliding sync: use new DB tables (#17630)
Based on https://github.com/element-hq/synapse/pull/17629

Utilizing the new sliding sync tables added in
https://github.com/element-hq/synapse/pull/17512 for fast acquisition of
rooms for the user and filtering/sorting.

---------

Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
2024-09-01 11:25:39 +01:00
Erik Johnston 560b43ac02
Sliding Sync: Split up `get_room_membership_for_user_at_to_token` (#17629)
This is to make it easier to reuse the logic when adding support for the
new tables

---------

Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
2024-09-01 10:52:03 +01:00
Erik Johnston d52c17ce01
Sliding sync: various fixes to background update (#17636)
Follows on from #17512, other fixes include: #17633, #17634, #17635
2024-09-01 10:18:45 +01:00
Erik Johnston d6125c583d 1.114.0rc3 2024-08-30 16:38:08 +01:00
Erik Johnston da58e55a0b Fix starting non-media repos (#17626)
Regressed in #17543.

The `max_download_size` config is not available on workers that don't
load the media repo.

Besides, we should honour the max_size param that was passed into the
function.
2024-08-30 16:37:11 +01:00
Erik Johnston 7b75922020 1.114.0rc2 2024-08-30 15:35:18 +01:00
Quentin Gliech 26c1330764 Replace isort and black with ruff (#17620)
Ruff now has decent parity with black and isort, so this is going to just save us a bunch of time
2024-08-30 15:32:43 +01:00
Quentin Gliech 48303fcbcc MSC3861: load the issuer and account management URLs from OIDC discovery (#17407)
This will help mitigating any discrepancies between the issuer
configured and the one returned by the OIDC provider.

This also removes the need for configuring the `account_management_url`
explicitely, as it will now be loaded from the OIDC discovery, as per
MSC2965.

Because we may now fetch stuff for the .well-known/matrix/client
endpoint, this also transforms the client well-known resource to be
asynchronous.
2024-08-30 15:31:51 +01:00
Michael Telatynski 53a3783750 Use custom stage UIA error for MAS cross-signing reset (#17509)
Rather than 501 M_UNRECOGNISED

Client side implementation at
https://github.com/matrix-org/matrix-react-sdk/pull/12892/
2024-08-30 15:31:51 +01:00
Erik Johnston b913aaa788 Sliding sync: Store the per-connection state in the database. (#17599)
Based on #17600

---------

Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
2024-08-30 15:31:05 +01:00
Erik Johnston dab88a7b1f Sliding Sync: Make `PerConnectionState` immutable (#17600)
This is so that we can cache it.

We also move the sliding sync types to
`synapse/types/handlers/sliding_sync.py`. This is mainly in-prep for

The only change in behaviour is that
`RoomSyncConfig.combine_sync_config(..)` now returns a new room sync
config rather than mutating in-place.

Reviewable commit-by-commit.

---------

Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
2024-08-30 15:29:07 +01:00
Quentin Gliech ca69d0f571
MSC3861: load the issuer and account management URLs from OIDC discovery (#17407)
This will help mitigating any discrepancies between the issuer
configured and the one returned by the OIDC provider.

This also removes the need for configuring the `account_management_url`
explicitely, as it will now be loaded from the OIDC discovery, as per
MSC2965.

Because we may now fetch stuff for the .well-known/matrix/client
endpoint, this also transforms the client well-known resource to be
asynchronous.
2024-08-30 14:04:08 +00:00
Michael Telatynski 02ebcf7725
Use custom stage UIA error for MAS cross-signing reset (#17509)
Rather than 501 M_UNRECOGNISED

Client side implementation at
https://github.com/matrix-org/matrix-react-sdk/pull/12892/
2024-08-30 14:52:57 +02:00
Quentin Gliech cdd5979129
Replace isort and black with ruff (#17620)
Ruff now has decent parity with black and isort, so this is going to just save us a bunch of time
2024-08-30 10:07:46 +02:00
Erik Johnston 89801e04ca
Sliding sync: Ignore tables with no create event in current state (#17633) 2024-08-30 08:54:14 +01:00
Erik Johnston 7098d47f29
Sliding sync: Fix bg update again (v3) (#17634)
Follow-up to https://github.com/element-hq/synapse/pull/17631 and
https://github.com/element-hq/synapse/pull/17632 to fix-up
https://github.com/element-hq/synapse/pull/17599

---------

Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
2024-08-30 08:54:07 +01:00
Eric Eastwood 26f81fb5be
Sliding Sync: Fix outlier re-persisting causing problems with sliding sync tables (#17635)
Fix outlier re-persisting causing problems with sliding sync tables

Follow-up to https://github.com/element-hq/synapse/pull/17512

When running on `matrix.org`, we discovered that a remote invite is
first persisted as an `outlier` and then re-persisted again where it is
de-outliered. The first the time, the `outlier` is persisted with one
`stream_ordering` but when persisted again and de-outliered, it is
assigned a different `stream_ordering` that won't end up being used.
Since we call `_calculate_sliding_sync_table_changes()` before
`_update_outliers_txn()` which fixes this discrepancy (always use the
`stream_ordering` from the first time it was persisted), we're working
with an unreliable `stream_ordering` value that will possibly be unused
and not make it into the `events` table.
2024-08-30 08:53:57 +01:00
Erik Johnston d844afdc29
Fix background update for sliding sync (find previous membership) (#17632)
This reverts commit
ab414f2ab8.

Introduced in https://github.com/element-hq/synapse/pull/17512
2024-08-29 19:16:39 +01:00
Erik Johnston bb80894391
Fix background update for sliding sync (#17631)
This reverts commit ab414f2ab8.

Introduced in https://github.com/element-hq/synapse/pull/17599
2024-08-29 16:58:53 +01:00
Erik Johnston e43c2b023e
Sliding sync: Store the per-connection state in the database. (#17599)
Based on #17600

---------

Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
2024-08-29 16:26:58 +01:00
Erik Johnston 2999a14aed
Sliding Sync: Make `PerConnectionState` immutable (#17600)
This is so that we can cache it.

We also move the sliding sync types to
`synapse/types/handlers/sliding_sync.py`. This is mainly in-prep for
#17599 to avoid circular imports.

The only change in behaviour is that
`RoomSyncConfig.combine_sync_config(..)` now returns a new room sync
config rather than mutating in-place.

Reviewable commit-by-commit.

---------

Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
2024-08-29 16:22:57 +01:00
Eric Eastwood 1a6b718f8c
Sliding Sync: Pre-populate room data for quick filtering/sorting (#17512)
Pre-populate room data for quick filtering/sorting in the Sliding Sync
API

Spawning from
https://github.com/element-hq/synapse/pull/17450#discussion_r1697335578

This PR is acting as the Synapse version `N+1` step in the gradual
migration being tracked by
https://github.com/element-hq/synapse/issues/17623

Adding two new database tables:

- `sliding_sync_joined_rooms`: A table for storing room meta data that
the local server is still participating in. The info here can be shared
across all `Membership.JOIN`. Keyed on `(room_id)` and updated when the
relevant room current state changes or a new event is sent in the room.
- `sliding_sync_membership_snapshots`: A table for storing a snapshot of
room meta data at the time of the local user's membership. Keyed on
`(room_id, user_id)` and only updated when a user's membership in a room
changes.

Also adds background updates to populate these tables with all of the
existing data.


We want to have the guarantee that if a row exists in the sliding sync
tables, we are able to rely on it (accurate data). And if a row doesn't
exist, we use a fallback to get the same info until the background
updates fill in the rows or a new event comes in triggering it to be
fully inserted. This means we need a couple extra things in place until
we bump `SCHEMA_COMPAT_VERSION` and run the foreground update in the
`N+2` part of the gradual migration. For context on why we can't rely on
the tables without these things see [1].

1. On start-up, block until we clear out any rows for the rooms that
have had events since the max-`stream_ordering` of the
`sliding_sync_joined_rooms` table (compare to max-`stream_ordering` of
the `events` table). For `sliding_sync_membership_snapshots`, we can
compare to the max-`stream_ordering` of `local_current_membership`
- This accounts for when someone downgrades their Synapse version and
then upgrades it again. This will ensure that we don't have any
stale/out-of-date data in the
`sliding_sync_joined_rooms`/`sliding_sync_membership_snapshots` tables
since any new events sent in rooms would have also needed to be written
to the sliding sync tables. For example a new event needs to bump
`event_stream_ordering` in `sliding_sync_joined_rooms` table or some
state in the room changing (like the room name). Or another example of
someone's membership changing in a room affecting
`sliding_sync_membership_snapshots`.
1. Add another background update that will catch-up with any rows that
were just deleted from the sliding sync tables (based on the activity in
the `events`/`local_current_membership`). The rooms that need
recalculating are added to the
`sliding_sync_joined_rooms_to_recalculate` table.
1. Making sure rows are fully inserted. Instead of partially inserting,
we need to check if the row already exists and fully insert all data if
not.

All of this extra functionality can be removed once the
`SCHEMA_COMPAT_VERSION` is bumped with support for the new sliding sync
tables so people can no longer downgrade (the `N+2` part of the gradual
migration).


<details>
<summary><sup>[1]</sup></summary>

For `sliding_sync_joined_rooms`, since we partially insert rows as state
comes in, we can't rely on the existence of the row for a given
`room_id`. We can't even rely on looking at whether the background
update has finished. There could still be partial rows from when someone
reverted their Synapse version after the background update finished, had
some state changes (or new rooms), then upgraded again and more state
changes happen leaving a partial row.

For `sliding_sync_membership_snapshots`, we insert items as a whole
except for the `forgotten` column ~~so we can rely on rows existing and
just need to always use a fallback for the `forgotten` data. We can't
use the `forgotten` column in the table for the same reasons above about
`sliding_sync_joined_rooms`.~~ We could have an out-of-date membership
from when someone reverted their Synapse version. (same problems as
outlined for `sliding_sync_joined_rooms` above)

Discussed in an [internal
meeting](https://docs.google.com/document/d/1MnuvPkaCkT_wviSQZ6YKBjiWciCBFMd-7hxyCO-OCbQ/edit#bookmark=id.dz5x6ef4mxz7)

</details>


### TODO

 - [x] Update `stream_ordering`/`bump_stamp`
 - [x] Handle remote invites
 - [x] Handle state resets
- [x] Consider adding `sender` so we can filter `LEAVE` memberships and
distinguish from kicks.
     - [x] We should add it to be able to tell leaves from kicks 
- [x] Consider adding `tombstone` state to help address
https://github.com/element-hq/synapse/issues/17540
     - [x] We should add it `tombstone_successor_room_id`
- [x] Consider adding `forgotten` status to avoid extra
lookup/table-join on `room_memberships`
    - [x] We should add it
- [x] Background update to fill in values for all joined rooms and
non-join membership
 - [x] Clean-up tables when room is deleted
 - [ ] Make sure tables are useful to our use case
- First explored in
https://github.com/element-hq/synapse/compare/erikj/ss_use_new_tables
- Also explored in
76b5a576eb
 - [x] Plan for how can we use this with a fallback
     - See plan discussed above in main area of the issue description
- Discussed in an [internal
meeting](https://docs.google.com/document/d/1MnuvPkaCkT_wviSQZ6YKBjiWciCBFMd-7hxyCO-OCbQ/edit#bookmark=id.dz5x6ef4mxz7)
 - [x] Plan for how we can rely on this new table without a fallback
- Synapse version `N+1`: (this PR) Bump `SCHEMA_VERSION` to `87`. Add
new tables and background update to backfill all rows. Since this is a
new table, we don't have to add any `NOT VALID` constraints and validate
them when the background update completes. Read from new tables with a
fallback in cases where the rows aren't filled in yet.
- Synapse version `N+2`: Bump `SCHEMA_VERSION` to `88` and bump
`SCHEMA_COMPAT_VERSION` to `87` because we don't want people to
downgrade and miss writes while they are on an older version. Add a
foreground update to finish off the backfill so we can read from new
tables without the fallback. Application code can now rely on the new
tables being populated.
- Discussed in an [internal
meeting](https://docs.google.com/document/d/1MnuvPkaCkT_wviSQZ6YKBjiWciCBFMd-7hxyCO-OCbQ/edit#bookmark=id.hh7shg4cxdhj)




### Dev notes

```
SYNAPSE_TEST_LOG_LEVEL=INFO poetry run trial tests.storage.test_events.SlidingSyncPrePopulatedTablesTestCase

SYNAPSE_POSTGRES=1 SYNAPSE_POSTGRES_USER=postgres SYNAPSE_TEST_LOG_LEVEL=INFO poetry run trial tests.storage.test_events.SlidingSyncPrePopulatedTablesTestCase
```

```
SYNAPSE_TEST_LOG_LEVEL=INFO poetry run trial tests.handlers.test_sliding_sync.FilterRoomsTestCase
```

Reference:

- [Development docs on background updates and worked examples of gradual
migrations

](1dfa59b238/docs/development/database_schema.md (background-updates))
- A real example of a gradual migration:
https://github.com/matrix-org/synapse/pull/15649#discussion_r1213779514
- Adding `rooms.creator` field that needed a background update to
backfill data, https://github.com/matrix-org/synapse/pull/10697
- Adding `rooms.room_version` that needed a background update to
backfill data, https://github.com/matrix-org/synapse/pull/6729
- Adding `room_stats_state.room_type` that needed a background update to
backfill data, https://github.com/matrix-org/synapse/pull/13031
- Tables from MSC2716: `insertion_events`, `insertion_event_edges`,
`insertion_event_extremities`, `batch_events`
- `current_state_events` updated in
`synapse/storage/databases/main/events.py`

---

```
persist_event (adds to queue)
_persist_event_batch
_persist_events_and_state_updates (assigns `stream_ordering` to events)
_persist_events_txn
	_store_event_txn
        _update_metadata_tables_txn
            _store_room_members_txn
	_update_current_state_txn
```

---

> Concatenated Indexes [...] (also known as multi-column, composite or
combined index)
>
> [...] key consists of multiple columns.
> 
> We can take advantage of the fact that the first index column is
always usable for searching
>
> *--
https://use-the-index-luke.com/sql/where-clause/the-equals-operator/concatenated-keys*

---

Dealing with `portdb` (`synapse/_scripts/synapse_port_db.py`),
https://github.com/element-hq/synapse/pull/17512#discussion_r1725998219

---

<details>
<summary>SQL queries:</summary>

Both of these are equivalent and work in SQLite and Postgres

Options 1:
```sql
WITH data_table (room_id, user_id, membership_event_id, membership, event_stream_ordering, {", ".join(insert_keys)}) AS (
    VALUES (
        ?, ?, ?,
        (SELECT membership FROM room_memberships WHERE event_id = ?),
        (SELECT stream_ordering FROM events WHERE event_id = ?),
        {", ".join("?" for _ in insert_values)}
    )
)
INSERT INTO sliding_sync_non_join_memberships
    (room_id, user_id, membership_event_id, membership, event_stream_ordering, {", ".join(insert_keys)})
SELECT * FROM data_table
WHERE membership != ?
ON CONFLICT (room_id, user_id)
DO UPDATE SET
    membership_event_id = EXCLUDED.membership_event_id,
    membership = EXCLUDED.membership,
    event_stream_ordering = EXCLUDED.event_stream_ordering,
    {", ".join(f"{key} = EXCLUDED.{key}" for key in insert_keys)}
```

Option 2:
```sql
INSERT INTO sliding_sync_non_join_memberships
    (room_id, user_id, membership_event_id, membership, event_stream_ordering, {", ".join(insert_keys)})
SELECT 
    column1 as room_id,
    column2 as user_id,
    column3 as membership_event_id,
    column4 as membership,
    column5 as event_stream_ordering,
    {", ".join("column" + str(i) for i in range(6, 6 + len(insert_keys)))}
FROM (
    VALUES (
        ?, ?, ?,
        (SELECT membership FROM room_memberships WHERE event_id = ?),
        (SELECT stream_ordering FROM events WHERE event_id = ?),
        {", ".join("?" for _ in insert_values)}
    )
) as v
WHERE membership != ?
ON CONFLICT (room_id, user_id)
DO UPDATE SET
    membership_event_id = EXCLUDED.membership_event_id,
    membership = EXCLUDED.membership,
    event_stream_ordering = EXCLUDED.event_stream_ordering,
    {", ".join(f"{key} = EXCLUDED.{key}" for key in insert_keys)}
```

If we don't need the `membership` condition, we could use:

```sql
INSERT INTO sliding_sync_non_join_memberships
    (room_id, membership_event_id, user_id, membership, event_stream_ordering, {", ".join(insert_keys)})
VALUES (
    ?, ?, ?,
    (SELECT membership FROM room_memberships WHERE event_id = ?),
    (SELECT stream_ordering FROM events WHERE event_id = ?),
    {", ".join("?" for _ in insert_values)}
)
ON CONFLICT (room_id, user_id)
DO UPDATE SET
    membership_event_id = EXCLUDED.membership_event_id,
    membership = EXCLUDED.membership,
    event_stream_ordering = EXCLUDED.event_stream_ordering,
    {", ".join(f"{key} = EXCLUDED.{key}" for key in insert_keys)}
```

</details>

### Pull Request Checklist

<!-- Please read
https://element-hq.github.io/synapse/latest/development/contributing_guide.html
before submitting your pull request -->

* [x] Pull request is based on the develop branch
* [x] Pull request includes a [changelog
file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog).
The entry should:
- Be a short description of your change which makes sense to users.
"Fixed a bug that prevented receiving messages from other servers."
instead of "Moved X method from `EventStore` to `EventWorkerStore`.".
  - Use markdown where necessary, mostly for `code blocks`.
  - End with either a period (.) or an exclamation mark (!).
  - Start with a capital letter.
- Feel free to credit yourself, by adding a sentence "Contributed by
@github_username." or "Contributed by [Your Name]." to the end of the
entry.
* [x] [Code
style](https://element-hq.github.io/synapse/latest/code_style.html) is
correct
(run the
[linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters))

---------

Co-authored-by: Erik Johnston <erik@matrix.org>
2024-08-29 16:09:51 +01:00
Gordan Trevis 594cd5f9fd
Fix Internal Server Error for Non-Local Users in Room Actions (#17607) 2024-08-29 14:34:29 +00:00
Erik Johnston b21134de3b
Fix starting non-media repos (#17626)
Regressed in #17543.

The `max_download_size` config is not available on workers that don't
load the media repo.

Besides, we should honour the max_size param that was passed into the
function.
2024-08-29 12:26:17 +00:00
meise a8f29c9913
docs: fix typo in saml2_config example (#17594) 2024-08-29 10:39:16 +00:00
Dirk Klimpel 9eed8cd878
fix listener docs - admin api only on main process (#17590) 2024-08-29 10:33:14 +00:00
Erik Johnston 8678516e79
Sliding sync: Always send your own receipts down (#17617)
When returning receipts in sliding sync for initial rooms we should
always include our own receipts in the room (even if they don't match
any timeline events).

Reviewable commit-by-commit.

---------

Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
2024-08-29 10:09:40 +01:00
Till 573c6d7e69
Use `max_upload_size` as the limit when following the `Location` header (#17543)
Otherwise we use the `expected_size` from the initial federation
request, which might be far too low.

### Pull Request Checklist

<!-- Please read
https://element-hq.github.io/synapse/latest/development/contributing_guide.html
before submitting your pull request -->

* [x] Pull request is based on the develop branch
* [x] Pull request includes a [changelog
file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog).
The entry should:
- Be a short description of your change which makes sense to users.
"Fixed a bug that prevented receiving messages from other servers."
instead of "Moved X method from `EventStore` to `EventWorkerStore`.".
  - Use markdown where necessary, mostly for `code blocks`.
  - End with either a period (.) or an exclamation mark (!).
  - Start with a capital letter.
- Feel free to credit yourself, by adding a sentence "Contributed by
@github_username." or "Contributed by [Your Name]." to the end of the
entry.
* [x] [Code
style](https://element-hq.github.io/synapse/latest/code_style.html) is
correct
(run the
[linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters))

---------

Co-authored-by: Erik Johnston <erikj@element.io>
2024-08-29 09:25:10 +02:00
Erik Johnston 689641b903
Sliding sync: factor out room list logic (#17622)
Move calculating of the room lists out of the core handler. This should
make it easier to switch things around to start using the tables in
#17512.

This is just moving code between files and methods.

Reviewable commit-by-commit
2024-08-28 18:42:19 +01:00
Krishan e75a23a63d
Fix hierarchy returning 403 when room is accessible through federation (#17194) 2024-08-28 15:45:49 +01:00
Shay e563e4bdf3
Fix content length on federation `/thumbnail` responses (#17532) 2024-08-28 11:29:12 +01:00
eyJhb 8da16e55fe
hash_password accepts stdin now (#17608)
`hash_password` now actually accepts password from stdin. The `getpass`
reads from TTY, and does NOT accept stdin in any way.

The manpage has been updated to reflect that.
2024-08-27 18:51:43 +01:00
Erik Johnston defd4aca67
Speed up fetching latest stream positions via cache (#17606)
The idea is to engineer it so that the vast majority of the rooms can
stay in the cache, so we can just ignore them.
2024-08-27 11:03:56 +00:00
Erik Johnston b4d95409fb
Fix @tag_args for non-methods (#17604)
The decorator assumed we were always wrapping function methods
2024-08-27 11:47:28 +01:00
Erik Johnston 92b38c1afd
Sliding sync: Split up handler into its own module (#17595)
That file was getting long.

The changes are non functional, and simply split things up into:
- the main class
- the connection store
- the extensions
- the types
2024-08-20 18:30:23 +00:00
Quentin Gliech 7c9684b5dc
1.114.0rc1 2024-08-20 14:57:22 +02:00
Erik Johnston f1e8d2d15a
Sliding Sync: Speed up getting receipts for initial rooms (#17592)
Let's only pull out the events we care about. Note that the index isn't
necessary here, as postgres is happy to scan the set of rooms for the
events.
2024-08-20 12:57:34 +01:00
Erik Johnston 10428046e4
Add metrics for sliding sync processing time (#17593)
This should let us see how quickly we actually process things in
practice.
2024-08-20 11:36:49 +00:00
Erik Johnston 6eb98a4f1c
Sliding Sync: Handle timeline limit changes (take 2) (#17579)
This supersedes #17503, given the per-connection state is being heavily
rewritten it felt easier to recreate the PR on top of that work.

This correctly handles the case of timeline limits going up and down.

This does not handle changes in `required_state`, but that can be done
as a separate PR.

Based on #17575.

---------

Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
2024-08-20 10:31:25 +01:00
Erik Johnston 950ba844f7
Sliding Sync: Batch up fetching receipts (#17589)
This is to make initial sliding sync a bit faster
2024-08-20 10:13:26 +01:00
Erik Johnston 8b8d74d12f
Sliding sync: Correctly track which read receipts we have or have not sent down. (#17575)
Add connection tracking to the receipts extension.

Based on #17574

---------

Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
2024-08-19 21:16:07 +01:00
Erik Johnston 261e746281
Sliding sync: Add classes for per-connection state (#17574)
This is some prep work ahead of correctly tracking receipts, where we
will also want to track the room status in terms of last receipt we had
sent down.

Essentially, we add two classes `PerConnectionState` and a mutable
version, and then operate on those.

---------

Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
2024-08-19 20:09:41 +01:00
Erik Johnston 993644ded0
Fix zero length media handling (#17570)
Results in:

```
AssertionError: null
  File "synapse/http/server.py", line 332, in _async_render_wrapper
    callback_return = await self._async_render(request)
  File "synapse/http/server.py", line 544, in _async_render
    callback_return = await raw_callback_return
  File "synapse/federation/transport/server/_base.py", line 369, in new_func
    response = await func(
  File "synapse/federation/transport/server/federation.py", line 826, in on_GET
    await self.media_repo.get_local_media(
  File "synapse/media/media_repository.py", line 473, in get_local_media
    await respond_with_multipart_responder(
  File "synapse/media/_base.py", line 353, in respond_with_multipart_responder
    assert content_length is not None
```
2024-08-19 15:06:44 +01:00
Erik Johnston a5d25bb623
Test github token before running release script (#17562)
This stops people from getting half way through a step and it failing
due to the github token having expired (this happens to me every damn
time).
2024-08-19 14:15:36 +01:00
Erik Johnston f162c92f2a
Speed up `/keys/changes` (#17548)
Follow on from #17537.

This is just adding a batched lookup function (you might want to hide
whitespace in the diff).
2024-08-16 16:04:02 +01:00
Erik Johnston 9ce489be5e
Add a flag to /versions about SSS support (#17571)
So that clients can check for support. Note that if the feature is only
enabled for some users, the `/versions` request must be authenticated to
pick up that SSS is enabled for the user
2024-08-16 08:54:57 +01:00
Andrew Morgan fae75b0376
Register the media threadpool with our metrics (#17566) 2024-08-14 15:11:22 +01:00
Tulir Asokan f77bfbfa30
Fix fetching signing keys when `old_verify_keys` is omitted (#17568)
`old_verify_keys` isn't marked as required in
https://spec.matrix.org/v1.11/server-server-api/#get_matrixkeyv2server
and there's no functional difference between an empty object and
omitting the object, so I don't think there's any reason synapse should
explode when the field is omitted.
2024-08-14 14:13:56 +01:00
Erik Johnston 1892ba5f67
Fix 'Producer was not unregistered' error (#17569)
Follows on from #17567
2024-08-14 13:46:22 +01:00
Erik Johnston a51daffba5
Reduce concurrent thread usage in media (#17567)
Follow on from #17558

Basically, we want to reduce the number of threads we want to use at a
time, i.e. reduce the number of threads that are paused/blocked. We do
this by returning from the thread when the consumer pauses the producer,
rather than pausing in the thread.

---------

Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
2024-08-14 12:41:53 +01:00
Shay b05b2e14bb
Handle lower-case http headers in `_Mulitpart_Parser_Protocol` (#17545) 2024-08-14 09:49:01 +01:00
Eric Eastwood a308d99f30
Sliding Sync: Exclude partially stated rooms if we must await full state (#17538)
Previously, we just had very basic partial room exclusion based on
whether we were lazy-loading room members. Now with this PR, we added
`must_await_full_state(...)` with rules to check if we have a we're only
requesting `required_state` which is completely satisfied even with
partial state.

Partially-stated rooms should have all state events except for remote
membership events so if we require a remote membership event anywhere,
then we need to return `True`.
2024-08-13 12:27:42 -05:00
Erik Johnston a9fc1fd112
Use a larger, dedicated threadpool for media sending (#17564) 2024-08-13 17:59:47 +01:00
Andrew Morgan 6a11bdf01d
Add a utility function for generating fake event IDs (#17557) 2024-08-13 16:55:05 +00:00
Patrick Cloke 8fea190a1f
Add missing docstrings related to profile methods. (#17559) 2024-08-13 17:04:35 +01:00
Erik Johnston aaa3c36420
Remove logging in multipart (#17563)
This is really spurious and causes a lot of spam. I don't think there is
a use for it even at DEBUG level.
2024-08-13 15:56:18 +01:00
Erik Johnston 3e7eb45eb1
Fixup media logcontexts (#17561)
Regression from #17558
2024-08-13 15:00:57 +01:00
Erik Johnston 9f9ec92526
Speed up responding to media requests (#17558)
We do this by reading from a threadpool, rather than blocking the main
thread.

This is broadly what we do in the [S3 storage
provider](https://github.com/matrix-org/synapse-s3-storage-provider/blob/main/s3_storage_provider.py#L234)
2024-08-13 14:06:17 +01:00
Andrew Morgan e1f5f0fbb8
Bump setuptools from 67.6.0 to 72.1.0 (#17542) 2024-08-12 14:58:01 +01:00
Erik Johnston 70b0e38603
Fix performance of device lists in `/key/changes` and sliding sync (#17537)
We do this by reusing the code from sync v2.

Reviewable commit-by-commit. The function `get_user_ids_changed` has
been rewritten entirely, so I would recommend not looking at the diff.
2024-08-09 11:59:44 +01:00
devonh f31360e34b
Start handlers for new media endpoints when media resource configured (#17483)
This is in response to issue #17473. 
Not all the necessary handlers to deal with media requests are started
now when configuring synapse to use a media worker as per the [example
config](https://element-hq.github.io/synapse/latest/workers.html#synapseappmedia_repository).
The new media endpoints introduced with authenticated media fall under
the `client` & `federation` handlers in synapse.
This PR starts up handlers for the new media endpoints if a worker has
been configured with only the `media` resource type.

### Pull Request Checklist

<!-- Please read
https://element-hq.github.io/synapse/latest/development/contributing_guide.html
before submitting your pull request -->

* [X] Pull request is based on the develop branch
* [x] Pull request includes a [changelog
file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog).
The entry should:
- Be a short description of your change which makes sense to users.
"Fixed a bug that prevented receiving messages from other servers."
instead of "Moved X method from `EventStore` to `EventWorkerStore`.".
  - Use markdown where necessary, mostly for `code blocks`.
  - End with either a period (.) or an exclamation mark (!).
  - Start with a capital letter.
- Feel free to credit yourself, by adding a sentence "Contributed by
@github_username." or "Contributed by [Your Name]." to the end of the
entry.
* [X] [Code
style](https://element-hq.github.io/synapse/latest/code_style.html) is
correct
(run the
[linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters))

---------

Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
2024-08-08 14:35:46 +00:00
Andrew Morgan 3ad38b644d
Replace deprecated `HTTPAdapter.get_connection` method with `get_connection_with_tls_context` (#17536) 2024-08-08 14:59:37 +01:00
Erik Johnston 44ac2aa3b6
SSS: Implement PREVIOUSLY room tracking (#17535)
Implement tracking of rooms that have had updates that have not been
sent down to clients.

Simplified Sliding Sync (SSS)
2024-08-08 10:44:17 +01:00
Eric Eastwood 11db575218
Sliding Sync: Use `stream_ordering` based timeline pagination for incremental sync (#17510)
Use `stream_ordering` based `timeline` pagination for incremental
`/sync` in Sliding Sync. Previously, we were always using a
`topological_ordering` but we should only be using that for historical
scenarios (initial `/sync`, newly joined, or haven't sent the room down
the connection before).

This is slightly different than what the [spec
suggests](https://spec.matrix.org/v1.10/client-server-api/#syncing)

> Events are ordered in this API according to the arrival time of the
event on the homeserver. This can conflict with other APIs which order
events based on their partial ordering in the event graph. This can
result in duplicate events being received (once per distinct API
called). Clients SHOULD de-duplicate events based on the event ID when
this happens.

But we've had a [discussion below in this
PR](https://github.com/element-hq/synapse/pull/17510#discussion_r1699105569)
and this matches what Sync v2 already does and seems like it makes
sense. Created a spec issue
https://github.com/matrix-org/matrix-spec/issues/1917 to clarify this.

Related issues:

 - https://github.com/matrix-org/matrix-spec/issues/1917
 - https://github.com/matrix-org/matrix-spec/issues/852
 - https://github.com/matrix-org/matrix-spec-proposals/pull/4033
2024-08-07 11:27:50 -05:00
Erik Johnston ceb3686dcd
Fixup sliding sync comment (#17531)
c.f.
https://github.com/element-hq/synapse/pull/17529#discussion_r1705780925
2024-08-07 10:32:36 +01:00
Eric Eastwood 1dfa59b238
Sliding Sync: Add more tracing (#17514)
Spawning from looking at a couple traces and wanting a little more info.

Follow-up to github.com/element-hq/synapse/pull/17501

The changes in this PR allow you to find slow Sliding Sync traces ignoring the
`wait_for_events` time. In Jaeger, you can now filter for the `current_sync_for_user`
operation with `RESULT.result=true` indicating that it actually returned non-empty results.

If you want to find traces for your own user, you can use
`RESULT.result=true ARG.sync_config.user="@madlittlemods:matrix.org"`
2024-08-06 11:43:43 -05:00
Andrew Morgan bef6568537 Merge branch 'release-v1.113' into develop 2024-08-06 14:19:12 +01:00
Andrew Morgan 244a255065
Clarify `auto_accept_invites.worker_to_run_on` config docs (#17515) 2024-08-06 13:26:51 +01:00
Andrew Morgan 932cb0a928 1.113.0rc1 2024-08-06 12:24:47 +01:00
Erik Johnston c270355349
SS: Reset connection if token is unrecognized (#17529)
This triggers the client to start a new sliding sync connection. If we
don't do this and the client asks for the full range of rooms, we end up
sending down all rooms and their state from scratch (which can be very
slow)

This causes things like
https://github.com/element-hq/element-x-ios/issues/3115 after we restart
the server

---------

Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
2024-08-06 10:39:11 +01:00
Eric Eastwood e3db7b2d81
Sliding Sync: Easier to understand timeline assertions in tests (#17511)
Added `_assertTimelineEqual(...)` because I got fed up trying to
understand the crazy diffs from the standard
`self.assertEqual(...)`/`self.assertListEqual(...)`

Before:
```
[FAIL]
Traceback (most recent call last):
  File "/home/eric/Documents/github/element/synapse/tests/rest/client/sliding_sync/test_rooms_timeline.py", line 103, in test_rooms_limited_initial_sync
    self.assertListEqual(
  File "/usr/lib/python3.12/unittest/case.py", line 1091, in assertListEqual
    self.assertSequenceEqual(list1, list2, msg, seq_type=list)
  File "/usr/lib/python3.12/unittest/case.py", line 1073, in assertSequenceEqual
    self.fail(msg)
twisted.trial.unittest.FailTest: Lists differ: ['$4QcmnzhdazSnDYcYSZCS_6-MWSzM_dN3RC7TRvW0w[95 chars]isM'] != ['$8N1XJ7e-3K_wxAanLVD3v8KQ96_B5Xj4huGkgy4N4[95 chars]nnU']

First differing element 0:
'$4QcmnzhdazSnDYcYSZCS_6-MWSzM_dN3RC7TRvW0wWA'
'$8N1XJ7e-3K_wxAanLVD3v8KQ96_B5Xj4huGkgy4N4-E'

- ['$4QcmnzhdazSnDYcYSZCS_6-MWSzM_dN3RC7TRvW0wWA',
-  '$8N1XJ7e-3K_wxAanLVD3v8KQ96_B5Xj4huGkgy4N4-E',
? ^

+ ['$8N1XJ7e-3K_wxAanLVD3v8KQ96_B5Xj4huGkgy4N4-E',
? ^

-  '$q4PRxQ_pBZkQI1keYuZPTtExQ23DqpUI3-Lxwfj_isM']
+  '$4QcmnzhdazSnDYcYSZCS_6-MWSzM_dN3RC7TRvW0wWA',
+  '$j3Xj-t2F1wH9kUHsI8X5yqS7hkdSyN2owaArfvk8nnU']
```

After:

```
[FAIL]
Traceback (most recent call last):
  File "/home/eric/Documents/github/element/synapse/tests/rest/client/sliding_sync/test_rooms_timeline.py", line 178, in test_rooms_limited_initial_sync
    self._assertTimelineEqual(
  File "/home/eric/Documents/github/element/synapse/tests/rest/client/sliding_sync/test_rooms_timeline.py", line 110, in _assertTimelineEqual
    self._assertListEqual(
  File "/home/eric/Documents/github/element/synapse/tests/rest/client/sliding_sync/test_rooms_timeline.py", line 79, in _assertListEqual
    self.fail(f"{diff_message}\n{message}")
twisted.trial.unittest.FailTest: Items must
Expected items to be in actual ('?' = missing expected items):
 [
   (10, master) $w-BoqW1PQQFU4TzVJW5OIelugxh0mY12wrfw6mbC6D4 (m.room.message) activity4
   (11, master) $sSidTZf1EOQmCVDU4mrH_1-bopMQhwcDUO2IhoemR6M (m.room.message) activity5
?  (12, master) $bgOcc3D-2QSkbk4aBxKVyOOQJGs7ZuncRJwG3cEANZg (m.room.member, @user1:test) join
 ]
Actual ('+' = found expected items):
 [
+  (11, master) $sSidTZf1EOQmCVDU4mrH_1-bopMQhwcDUO2IhoemR6M (m.room.message) activity5
+  (10, master) $w-BoqW1PQQFU4TzVJW5OIelugxh0mY12wrfw6mbC6D4 (m.room.message) activity4
   (9, master) $FmCNyc11YeFwiJ4an7_q6H0LCCjQOKd6UCr5VKeXXUw (m.room.message, None) activity3
 ]
```
2024-08-05 13:20:15 -05:00
Eric Eastwood 2b620e0a15
Sliding Sync: Add typing notification extension (MSC3961) (#17505)
[MSC3961](https://github.com/matrix-org/matrix-spec-proposals/pull/3961): Sliding Sync Extension: Typing Notifications

Based on
[MSC3575](https://github.com/matrix-org/matrix-spec-proposals/pull/3575):
Sliding Sync
2024-07-31 13:20:23 -05:00
Eric Eastwood 39731bb205
Sliding Sync: Split and move tests (#17504)
Split and move Sliding Sync tests so we have some more sane test file
sizes
2024-07-31 12:20:46 -05:00
Eric Eastwood 1d6186265a
Sliding Sync: Fix `limited` response description (make accurate) (#17507) 2024-07-31 11:47:26 -05:00
Eric Eastwood 46de0ee16b
Sliding Sync: Update filters to be robust against remote invite rooms (#17450)
Update `filters.is_encrypted` and `filters.types`/`filters.not_types` to
be robust when dealing with remote invite rooms in Sliding Sync.

Part of
[MSC3575](https://github.com/matrix-org/matrix-spec-proposals/pull/3575):
Sliding Sync

Follow-up to https://github.com/element-hq/synapse/pull/17434

We now take into account current state, fallback to stripped state
for invite/knock rooms, then historical state. If we can't determine
the info needed to filter a room (either from state or stripped state),
it is filtered out.
2024-07-30 13:20:29 -05:00
Eric Eastwood b221f0b84b
Sliding Sync: Add receipts extension (MSC3960) (#17489)
[MSC3960](https://github.com/matrix-org/matrix-spec-proposals/pull/3960): Receipts extension

Based on
[MSC3575](https://github.com/matrix-org/matrix-spec-proposals/pull/3575):
Sliding Sync
2024-07-30 12:49:55 -05:00
Erik Johnston 62ae56a4ac
Add some more opentracing to sliding sync (#17501)
This will make it easier to see what it is doing in jaeger.
2024-07-30 10:54:11 +01:00
Richard van der Hoff 808dab0699
Fix `failures` property in `/keys/query` (#17499)
Fixes: https://github.com/element-hq/synapse/issues/17498
Fixes: https://github.com/element-hq/element-web/issues/27867
2024-07-30 09:51:24 +01:00
Erik Johnston 34306be5aa
Only send rooms with updates down sliding sync (#17479)
Rather than always including all rooms in range.

Also adds a pre-filter to rooms that checks the stream change cache to
see if anything might have happened.

Based on #17447

---------

Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
2024-07-30 09:30:44 +01:00
Erik Johnston be4a16ff44
Sliding Sync: Track whether we have sent rooms down to clients (#17447)
The basic idea is that we introduce a new token for a sliding sync
connection, which stores the mapping of room to room "status" (i.e. have
we sent the room down?). This token allows us to handle duplicate
requests properly. In future it can be used to store more
"per-connection" information safely.

In future this should be migrated into the DB, so its important that we
try to reduce the number of syncs where we need to update the
per-connection information. In this PoC this only happens when we: a)
send down a set of room for the first time, or b) we have previously
sent down a room and there are updates but we are not sending the room
down the sync (due to not falling in a list range)

Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
2024-07-29 22:45:48 +01:00
Eric Eastwood 568051c0f0
Refactor Sliding Sync tests to better utilize the `SlidingSyncBase.do_sync(...)` (pt. 2) (#17482)
`SlidingSyncBase.do_sync()` for tests was first introduced in
https://github.com/element-hq/synapse/pull/17452

Part 1: https://github.com/element-hq/synapse/pull/17481
2024-07-25 11:01:47 -05:00
Eric Eastwood ebbabfe782
Refactor Sliding Sync tests to better utilize the `SlidingSyncBase` (pt. 1) (#17481)
`SlidingSyncBase` for tests was first introduced in
https://github.com/element-hq/synapse/pull/17452

Part 2: https://github.com/element-hq/synapse/pull/17482
2024-07-25 10:43:35 -05:00
YLong Shi 69ac4b6a6e
Update config_documentation - Change example of msisdn in allowed_local_3pids (#17476)
Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
2024-07-25 11:07:44 +00:00
Eric Eastwood 729026e604
Sliding Sync: Add Account Data extension (MSC3959) (#17477)
Extensions based on
[MSC3575](https://github.com/matrix-org/matrix-spec-proposals/pull/3575):
Sliding Sync
2024-07-24 17:10:38 -05:00
Erik Johnston bdf37ad4c4
Sliding Sync: ensure bump stamp ignores backfilled events (#17478)
Backfill events have a negative stream ordering, and so its not useful
to use to compare with other (positive) stream orderings.

Plus, the Rust SDK currently assumes `bump_stamp` is positive.
2024-07-24 15:21:56 +01:00
Erik Johnston 8bbc98e66d
Use a new token format for sliding sync (#17452)
This is in preparation for adding per-connection state.

---------

Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
2024-07-24 11:47:25 +01:00
Devon Hudson 48c1307911
1.112.0rc1 2024-07-23 09:01:43 -06:00
Erik Johnston d225b6b3eb
Speed up SS room sorting (#17468)
We do this by bulk fetching the latest stream ordering.

---------

Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
2024-07-23 14:03:14 +01:00
reivilibre 1daae43f3a
Reduce volume of 'Waiting for current token' logs, which were introduced in v1.109.0. (#17428)
Introduced in: #17215

This caused us a minor bit of grief as the volume of logs produced was
much higher than normal

---------

Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org>
2024-07-23 11:51:34 +01:00
Michael Hollister a9ee832e48
Fixed presence results not returning offline users on initial sync (#17231)
This is to address an issue in which `m.presence` results on initial
sync are not returning entries of users who are currently offline.

The original behaviour was from
https://github.com/element-hq/synapse/issues/1535

This change is useful for applications that use the
presence system for tracking user profile information/updates (e.g.
https://github.com/element-hq/synapse/pull/16992 or for profile status
messages).

This is gated behind a new configuration option to avoid performance
impact for applications that don't need this, as a pragmatic solution
for now.
2024-07-23 09:59:24 +00:00