synapse/tests/storage/databases
Eric Eastwood adda2a4613
Sliding Sync: Slight optimization when fetching state for the room (`get_events_as_list(...)`) (#17718)
Spawning from @kegsay [pointing
out](https://matrix.to/#/!cnVVNLKqgUzNTOFQkz:matrix.org/$ExOO7J8uPUQSyH-9Uxc_QCa8jlXX9uK4VRtkSC0EI3o?via=element.io&via=matrix.org&via=jki.re)
that the Sliding Sync endpoint doesn't handle a large room with a lot of
state well on initial sync (requesting all state via `required_state: [
["*","*"] ]`) (it just takes forever).

After investigating further, the slow part is just
`get_events_as_list(...)` fetching all of the current state ID's out for
the room (which can be 100k+ events for rooms with a lot of membership).
This is just a slow thing in Synapse in general and the same thing
happens in Sync v2 or the `/state` endpoint.


---

The only idea I had to improve things was to use `batch_iter` to only
try fetching a fixed amount at a time instead of working with large
maps, lists, and sets. This doesn't seem to have much effect though.

There is already a `batch_iter(event_ids, 200)` in
`_fetch_event_rows(...)` for when we actually have to touch the database
and that's inside a queue to deduplicate work.

I did notice one slight optimization to use `get_events_as_list(...)`
directly instead of `get_events(...)`. `get_events(...)` just turns the
result from `get_events_as_list(...)` into a dict and since we're just
iterating over the events, we don't need the dict/map.
2024-10-14 13:47:35 +01:00
..
main Sliding Sync: Slight optimization when fetching state for the room (`get_events_as_list(...)`) (#17718) 2024-10-14 13:47:35 +01:00
__init__.py Correctly mention previous copyright (#16820) 2024-01-23 11:26:48 +00:00