44 Commits

Author SHA1 Message Date
Patrick Cloke ae5b997cfa
Fix comments related to replication. (#16428) 2023-10-06 07:25:44 -04:00
Patrick Cloke 4e302b30b6
Add __slots__ to replication commands. (#16429)
To slightly reduce the amount of memory each command takes.
2023-10-05 07:38:55 -04:00
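The memory saving that `__slots__` aims for (commit 4e302b30b6) can be illustrated with a minimal sketch. The class and attribute names below are hypothetical and are not taken from `synapse.replication`; the point is only that a slotted class does not allocate a per-instance `__dict__`.

```python
class CommandWithDict:
    """Hypothetical command class: attributes live in a per-instance __dict__."""

    def __init__(self, instance_name, token):
        self.instance_name = instance_name
        self.token = token


class CommandWithSlots:
    """Hypothetical command class: attributes live in fixed slots, no __dict__."""

    __slots__ = ("instance_name", "token")

    def __init__(self, instance_name, token):
        self.instance_name = instance_name
        self.token = token


plain = CommandWithDict("master", 1)
slotted = CommandWithSlots("master", 1)

# The slotted instance carries no attribute dictionary of its own.
plain_has_dict = hasattr(plain, "__dict__")
slotted_has_dict = hasattr(slotted, "__dict__")
```

One trade-off of this design is that a slotted instance rejects attributes that are not listed in `__slots__`, so every field has to be declared up front.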
Patrick Cloke e9235d92f2
Track currently syncing users by device for presence (#16172)
Refactoring to use both the user ID & the device ID when tracking
the currently syncing users in the presence handler.

This is done both locally and over replication. Note that the device
ID is discarded but will be used in a future change.
2023-08-29 11:44:07 -04:00
Mathieu Velten 501da8ecd8
Task scheduler: add replication notify for new task to launch ASAP (#16184) 2023-08-28 14:03:51 +00:00
Erik Johnston ae55cc1e6b
Add ability to wait for locks and add locks to purge history / room deletion (#15791)
c.f. #13476
2023-07-31 10:58:03 +01:00
reivilibre 39dee30f01
Send `USER_IP` commands on a different Redis channel, in order to reduce traffic to workers that do not process these commands. (#12809) 2022-05-20 15:28:23 +01:00
reivilibre f871222880
Move `update_client_ip` background job from the main process to the background worker. (#12251) 2022-04-01 13:08:55 +01:00
Patrick Cloke d0e78af35e
Add missing type hints to synapse.replication. (#11938) 2022-02-08 11:03:08 -05:00
Jonathan de Jong bf72d10dbf
Use inline type hints in various other places (in `synapse/`) (#10380) 2021-07-15 11:02:43 +01:00
Jonathan de Jong 4b965c862d
Remove redundant "coding: utf-8" lines (#9786)
Part of #9744

Removes all redundant `# -*- coding: utf-8 -*-` lines from files, as Python 3 reads source code as UTF-8 by default.

`Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>`
2021-04-14 15:34:27 +01:00
Patrick Cloke da75d2ea1f
Add type hints for the federation sender. (#9681)
Includes an abstract base class which both the FederationSender
and the FederationRemoteSendQueue must implement.
2021-03-29 11:43:20 -04:00
Erik Johnston 66f4949e7f
Fix deleting pushers when using sharded pushers. (#9465) 2021-02-22 21:14:42 +00:00
Eric Eastwood 0a00b7ff14
Update black, and run auto formatting over the codebase (#9381)
- Update black version to the latest
- Run black auto formatting over the codebase
    - Run autoformatting according to [`docs/code_style.md`](80d6dc9783/docs/code_style.md)
- Update `code_style.md` docs around installing black to use the correct version
2021-02-16 22:32:34 +00:00
Erik Johnston 8de3703d21
Make event persisters periodically announce position over replication. (#8499)
Currently background processes that stream the events stream use the "minimum persisted position" (i.e. `get_current_token()`) rather than the vector clock style tokens. This is broadly fine as it doesn't matter if the background processes lag a small amount. However, in extreme cases (i.e. SyTests) where we only write to one event persister the background processes will never make progress.

This PR changes it so that the `MultiWriterIDGenerator` keeps the current position of a given instance as up to date as possible (i.e. using the latest token it sees if it's not in the process of persisting anything), and then periodically announces that over replication. This then allows the "minimum persisted position" to advance, albeit with a small lag.
2020-10-12 15:51:41 +01:00
Patrick Cloke eebf52be06
Be stricter about JSON that is accepted by Synapse (#8106) 2020-08-19 07:26:03 -04:00
David Vo 4dd27e6d11
Reduce unnecessary whitespace in JSON. (#7372) 2020-08-07 08:02:55 -04:00
Erik Johnston f299441cc6
Add ability to shard the federation sender (#7798) 2020-07-10 18:26:36 +01:00
Patrick Cloke 38e1fac886
Fix some spelling mistakes / typos. (#7811) 2020-07-09 09:52:58 -04:00
Patrick Cloke e7efd8f827
Do not use simplejson in Synapse. (#7800) 2020-07-08 07:15:08 -04:00
Patrick Cloke 7d2532be36
Discard RDATA from already seen positions. (#7648) 2020-06-15 08:44:54 -04:00
Erik Johnston d7983b63a6
Support any process writing to cache invalidation stream. (#7436) 2020-05-07 13:51:08 +01:00
Erik Johnston 37f6823f5b
Add instance name to RDATA/POSITION commands (#7364)
This is primarily for allowing us to send those commands from workers, but for now simply allows us to ignore echoed RDATA/POSITION commands that we sent (we get echoes of sent commands when using redis). Currently we log a WARNING on the master process every time we receive an echoed RDATA.
2020-04-29 16:23:08 +01:00
Richard van der Hoff 71a1abb8a1
Stop the master relaying USER_SYNC for other workers (#7318)
Long story short: if we're handling presence on the current worker, we shouldn't be sending USER_SYNC commands over replication.

In an attempt to figure out what is going on here, I ended up refactoring some bits of the presencehandler code, so the first 4 commits here are non-functional refactors to move this code slightly closer to sanity. (There's still plenty to do here :/). Suggest reviewing individual commits.

Fixes (I hope) #7257.
2020-04-22 22:39:04 +01:00
Richard van der Hoff 82d8b1dd1f
Another go at fixing one-word commands (#7326)
I messed this up last time I tried (#7239 / e13c6c7).
2020-04-22 14:34:31 +01:00
Erik Johnston 51f7eaf908
Add ability to run replication protocol over redis. (#7040)
This is configured via the `redis` config options.
2020-04-22 13:07:41 +01:00
Richard van der Hoff c3e4b4edb2 Fix warnings about not calling superclass constructor
Separate `SimpleCommand` from `Command`, so that things which don't want to use
the `data` property don't have to, and thus fix the warnings PyCharm was giving
me about not calling `__init__` in the base class.
2020-04-07 17:40:22 +01:00
Richard van der Hoff 6a519a0ca0 Remove vestigial references to SYNC replication command
We've ripped pretty much all of this out: let's remove the remains.
2020-04-07 17:40:07 +01:00
Erik Johnston 4f21c33be3
Remove usage of "conn_id" for presence. (#7128)
* Remove `conn_id` usage for UserSyncCommand.

Each tcp replication connection is assigned a "conn_id", which is used
to give an ID to a remotely connected worker. In a redis world, there
will no longer be a one to one mapping between connection and instance,
so instead we need to replace such usages with an ID generated by the
remote instances and included in the replication commands.

This really only affects UserSyncCommand.

* Add CLEAR_USER_SYNCS command that is sent on shutdown.

This should help with the case where a synchrotron gets restarted
gracefully, rather than relying on the 5 minute timeout.
2020-03-30 16:37:24 +01:00
Erik Johnston 4cff617df1
Move catchup of replication streams to worker. (#7024)
This changes the replication protocol so that the server does not send down `RDATA` for rows that happened before the client connected. Instead, the server will send a `POSITION` and clients then query the database (or master out of band) to get up to date.
2020-03-25 14:54:01 +00:00
Erik Johnston a8a50f5b57
Wake up transaction queue when remote server comes back online (#6706)
This will be used to retry outbound transactions to a remote server if
we think it might have come back up.
2020-01-17 10:27:19 +00:00
Erik Johnston e8b68a4e4b
Fixup synapse.replication to pass mypy checks (#6667) 2020-01-14 14:08:06 +00:00
Amber Brown 32e7c9e7f2
Run Black. (#5482) 2019-06-20 19:32:02 +10:00
Erik Johnston 313987187e Fix tight loop over connecting to replication server
If the client failed to process incoming commands during the initial set
up of the replication connection it would immediately disconnect and
reconnect, resulting in a tight loop.

This can happen, for example, when subscribing to a stream that has a
row that is too long in the backlog.

The fix here is to not consider the connection successfully set up until
the client has successfully subscribed and caught up with the streams.
This ensures that the retry logic timers aren't reset until then,
meaning that if an error does happen during start up the client will
continue backing off before retrying again.
2019-02-26 15:05:41 +00:00
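The retry behaviour described in 313987187e can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the actual Synapse code: the class name, method names, and delay values are all hypothetical. The key idea it shows is that the backoff delay is reset only once the client has caught up, not as soon as the connection opens.

```python
class ReconnectBackoff:
    """Hypothetical helper tracking how long to wait before reconnecting."""

    def __init__(self, initial_delay=1.0, factor=2.0, max_delay=30.0):
        self.initial_delay = initial_delay
        self.factor = factor
        self.max_delay = max_delay
        self.delay = initial_delay

    def on_connection_lost(self):
        """Return the delay before the next attempt, then grow it for the one after."""
        delay = self.delay
        self.delay = min(self.delay * self.factor, self.max_delay)
        return delay

    def on_caught_up(self):
        """Reset the delay; call this only after subscribing and catching up."""
        self.delay = self.initial_delay


backoff = ReconnectBackoff()

# Three failures during start-up: the delay keeps growing because
# on_caught_up() was never reached.
startup_delays = [backoff.on_connection_lost() for _ in range(3)]

# Once the client has caught up with the streams, the delay is reset.
backoff.on_caught_up()
delay_after_success = backoff.on_connection_lost()
```

If the reset happened when the connection was merely established, a client that failed during start-up would always retry after the initial delay, which is the tight loop the commit message describes.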
Richard van der Hoff 0e8d78f6aa Logcontexts for replication command handlers
Run the handlers for replication commands as background processes. This should
improve the visibility in our metrics, and reduce the number of "running db
transaction from sentinel context" warnings.

Ideally it means converting the things that fire off deferreds into the night
into things that actually return a Deferred when they are done. I've made a bit
of a stab at this, but it will probably be leaky.
2018-08-17 00:43:43 +01:00
Amber Brown 6350bf925e
Attempt to be more performant on PyPy (#3462) 2018-06-28 14:49:57 +01:00
Richard van der Hoff 3ee4ad09eb Fix json encoding bug in replication
json encoders have an encode method, not a dumps method.
2018-04-03 15:09:48 +01:00
Richard van der Hoff 05630758f2 Use static JSONEncoders
Using json.dumps with custom options requires us to create a new JSONEncoder on
each call. It's more efficient to create one upfront and reuse it.
2018-03-29 23:13:33 +01:00
Erik Johnston 9aa5a0af51 Explicitly use simplejson 2018-03-20 09:58:13 +00:00
Erik Johnston 610accbb7f Fix replication after switch to simplejson
Turns out that simplejson serialises namedtuples as dictionaries rather
than tuples by default.
2018-03-19 16:12:48 +00:00
Erik Johnston 926ba76e23 Replace ujson with simplejson 2018-03-15 23:43:31 +00:00
Erik Johnston 27f26e48b7 Serialize user ip command as json 2017-06-27 16:25:38 +01:00
Erik Johnston 78cefd78d6 Make workers report to master for user ip updates 2017-06-27 14:58:10 +01:00
Erik Johnston 36d2b66f90 Add a timestamp to USER_SYNC command
This timestamp is used to indicate when the user last sync'd
2017-03-31 15:42:22 +01:00
Erik Johnston 7450693435 Initial TCP protocol implementation
This defines the low level TCP replication protocol
2017-03-30 12:54:46 +01:00