Fix archive not responding in big rooms because homeserver times out on `/context` request

Add `filter={"lazy_load_members":true}` so that `/context` responds without timing out by returning just the state for the sender of the included event. Otherwise, the homeserver returns all state in the room at that point in time which in big rooms, can be 100k member events that we don't care about anyway. Synapse seems to timeout at about the ~5k state event mark.


## Dev notes

Without `filter={"lazy_load_members":true}`, Synapse can only handle ~4k member events (probably state in general) before `/context` times out before it ever responds in the 180 second window. I'm only looking at the member count as a rough proxy but the number of member events + state will be more.


Loads | Member count | Alias | Matrix.to | API request
--- | --- | --- | --- | ---
 | 39k | `#matrix:matrix.org` | [🔗 ](https://matrix.to/#/!OGEhHVWSdvArJzumhm:matrix.org/$xo-tESRNP1Vg1RxLXIfmMeO6dA6-u9XuE2lv6toeKcw?via=matrix.org) | `https://matrix-client.matrix.org/_matrix/client/r0/rooms/!OGEhHVWSdvArJzumhm:matrix.org/context/$xo-tESRNP1Vg1RxLXIfmMeO6dA6-u9XuE2lv6toeKcw?limit=0`
 | 31k |  `#openwisp_general:gitter.im` | [🔗 ](https://matrix.to/#/!RBzfoBeqYcCwLAAenz:gitter.im/$YFjmbmH0NRPmfVsPyxhM0jcK4RFR_CdCigtHSgCTLSc?via=matrix.org) |`https://matrix-client.matrix.org/_matrix/client/r0/rooms/!RBzfoBeqYcCwLAAenz:gitter.im/context/$YFjmbmH0NRPmfVsPyxhM0jcK4RFR_CdCigtHSgCTLSc?limit=0`
 | 16k | `#raspberrypi:matrix.org` | [🔗 ](https://matrix.to/#/!wOlkWNmgkAZFxbTaqj:matrix.org/$WMh_QauoyW6cjFDuUeNgsnUSgDn7C2XlrUnlijhdmdk?via=matrix.org) | `https://matrix-client.matrix.org/_matrix/client/r0/rooms/!wOlkWNmgkAZFxbTaqj:matrix.org/context/$WMh_QauoyW6cjFDuUeNgsnUSgDn7C2XlrUnlijhdmdk?limit=0`
Loads with warm cache | 7.2k | `#element-android:matrix.org` | [🔗 ](https://matrix.to/#/!AZozoWghOYSIAfaZjJ:matrix.org/$Qmb8vmaD91PIM6ROfh-2jApPoJgH2Q7NnjiwRdiEPZE?via=matrix.org) | `https://matrix-client.matrix.org/_matrix/client/r0/rooms/!AZozoWghOYSIAfaZjJ:matrix.org/context/$Qmb8vmaD91PIM6ROfh-2jApPoJgH2Q7NnjiwRdiEPZE?limit=0`
  | 4k | `#nim-science:envs.net` | [🔗 ](https://matrix.to/#/!IpFtPSbgfrZrVcVyti:envs.net/$bDRop4yvjFOl3HMouAyx4mtau-JaAxvBUJ-MqftAm7E?via=matrix.org) | `https://matrix-client.matrix.org/_matrix/client/r0/rooms/!IpFtPSbgfrZrVcVyti:envs.net/context/$bDRop4yvjFOl3HMouAyx4mtau-JaAxvBUJ-MqftAm7E?limit=0`
  | 2.3k | `#nim-webdev:matrix.org` | [🔗 ](https://matrix.to/#/!EoyccMaVGwdqyfKMAL:matrix.org/$VdRyNb1F-gLdDP9nDYK7xAh_LcE822fEfDYWz5re8dE?via=matrix.org) | `https://matrix-client.matrix.org/_matrix/client/r0/rooms/!EoyccMaVGwdqyfKMAL:matrix.org/context/$VdRyNb1F-gLdDP9nDYK7xAh_LcE822fEfDYWz5re8dE?limit=0`
This commit is contained in:
Eric Eastwood 2022-06-29 12:37:40 +02:00 committed by GitHub
parent 724c562d6f
commit 57174db6e0
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
1 changed files with 14 additions and 2 deletions

View File

@ -47,16 +47,28 @@ async function fetchEventsFromTimestampBackwards(accessToken, roomId, ts, limit)
assert(eventIdForTimestamp);
//console.log('eventIdForTimestamp', eventIdForTimestamp);
// We only use this endpoint to get a pagination we can use with `/messages`.
//
// We add `limit=0` here because we want to grab
//
// Add `filter={"lazy_load_members":true}` so that this endpoint responds
// without timing out. Otherwise, the homeserver returns all state in the room
// at that point in time which in big rooms, can be 100k member events that we
// don't care about anyway. Synapse seems to timeout at about the ~5k state
// event mark.
const contextEndpoint = urlJoin(
matrixServerUrl,
`_matrix/client/r0/rooms/${roomId}/context/${eventIdForTimestamp}?limit=0`
`_matrix/client/r0/rooms/${roomId}/context/${eventIdForTimestamp}?limit=0&filter={"lazy_load_members":true}`
);
const contextResData = await fetchEndpointAsJson(contextEndpoint, {
accessToken,
});
//console.log('contextResData', contextResData);
// Add filter={"lazy_load_members":true,"include_redundant_members":true} to get member state events included
// TODO: Do we need `"include_redundant_members":true` here?
//
// Add `filter={"lazy_load_members":true,"include_redundant_members":true}` to
// get member state events included
const messagesEndpoint = urlJoin(
matrixServerUrl,
`_matrix/client/r0/rooms/${roomId}/messages?dir=b&from=${contextResData.end}&limit=${limit}&filter={"lazy_load_members":true,"include_redundant_members":true}`