Commit Graph

2970 Commits

Author SHA1 Message Date
Erik Johnston 4ba55559ac
Fix specifying cache factors via env vars with * in name. (#7580)
This mostly applise to `*stateGroupCache*` and co.

Broke in #6391.
2020-05-27 13:17:01 +01:00
Erik Johnston eefc6b3a0d
Don't apply cache factor to event cache. (#7578)
This is already correctly done when we instansiate the cache, but wasn't
when it got reloaded (which always happens at least once on startup).
2020-05-27 12:04:37 +01:00
Erik Johnston 9bac5d62b3
Ensure ReplicationStreamer is always started when replication enabled. (#7579)
Fixes #7566.
2020-05-27 11:44:19 +01:00
Brendan Abolivier 98483890ee Merge branch 'develop' of github.com:matrix-org/synapse into develop 2020-05-26 20:30:41 +02:00
Patrick Cloke ef884f6d04
Convert identity handler to async/await. (#7561) 2020-05-26 13:46:22 -04:00
Brendan Abolivier 3b19c17247 1.14.0 2020-05-26 16:45:37 +02:00
Richard van der Hoff edd9a7214c
Replace device_27_unique_idx bg update with a fg one (#7562)
The bg update never managed to complete, because it kept being interrupted by
transactions which want to take a lock.

Just doing it in the foreground isn't that bad, and is a good deal simpler.
2020-05-26 11:43:17 +01:00
Richard van der Hoff 04729b86f8
Fix incorrect exception handling in KeyUploadServlet.on_POST (#7563)
Introduced in #7556
2020-05-26 11:42:22 +01:00
Richard van der Hoff 00db90f409
Fix recording of federation stream token (#7564)
A couple of changes of significance:

 * remove the `_last_ack < federation_position` condition, so that
   updates will still be correctly processed after restart

 * Correctly wire up send_federation_ack to the right class.
2020-05-26 11:41:38 +01:00
Richard van der Hoff d14c4d6b6d
Simplify reap_monthly_active_users (#7558)
we can use `make_in_list_sql_clause` rather than doing our own half-baked
equivalent, which has the benefit of working just fine with empty lists.

(This has quite a lot of tests, so I think it's pretty safe)
2020-05-23 01:20:10 +01:00
Richard van der Hoff f4269694ce
Optimise some references to hs.config (#7546)
These are surprisingly expensive, and we only really need to do them at startup.
2020-05-22 21:47:07 +01:00
Erik Johnston 2901f54359
Fix missing CORS headers on OPTION responses (#7560)
Broke in #7534.
2020-05-22 17:42:39 +01:00
Erik Johnston e5c67d04db
Add option to move event persistence off master (#7517) 2020-05-22 16:11:35 +01:00
Patrick Cloke 4429764c9f
Return 200 OK for all OPTIONS requests (#7534) 2020-05-22 09:30:07 -04:00
Erik Johnston 1531b214fc
Add ability to wait for replication streams (#7542)
The idea here is that if an instance persists an event via the replication HTTP API it can return before we receive that event over replication, which can lead to races where code assumes that persisting an event immediately updates various caches (e.g. current state of the room).

Most of Synapse doesn't hit such races, so we don't do the waiting automagically, instead we do so where necessary to avoid unnecessary delays. We may decide to change our minds here if it turns out there are a lot of subtle races going on.

People probably want to look at this commit by commit.
2020-05-22 14:21:54 +01:00
Erik Johnston 06a02bc1ce
Convert sending mail to async/await. (#7557)
Mainly because sometimes the email push code raises exceptions where the
stack traces have gotten lost, which is hopefully fixed by this.
2020-05-22 13:41:11 +01:00
Patrick Cloke 66f2ebc22f
Use a non-empty RelayState for user interactive auth with SAML. (#7552) 2020-05-22 07:17:30 -04:00
Erik Johnston 710d958c64
On upgrade room only send canonical alias once. (#7547)
Instead of doing a complicated dance of deleting and moving aliases one
by one, which sends a canonical alias update into the old room for each
one, lets do it all in one go.

This also changes the function to move *all* local alias events to the new
room, however that happens later on anyway.
2020-05-22 11:41:41 +01:00
Erik Johnston 547e4dd83e
Fix exception reporting due to HTTP request errors. (#7556)
These are business as usual errors, rather than stuff we want to log at
error.
2020-05-22 11:39:20 +01:00
Ivan Shapovalov ac481a738e
synapse.metrics: implement detailed memory usage reporting on PyPy (#7536)
PyPy's gc.get_stats() returns an object containing detailed allocator statistics
which could be beneficial to collect as metrics.

Signed-off-by: Ivan Shapovalov <intelfx@intelfx.name>
2020-05-22 11:08:41 +01:00
Richard van der Hoff 8c75da916c
Refresh apt cache when building dh_virtualenv docker image (#7555)
When we tried to build debs for 1.13.0, the build failed because docker used a
base docker image which had a stale apt cache.

Fixes: #7540
2020-05-22 10:17:47 +01:00
Richard van der Hoff a0f99f81b3
Fix stacktrace mangling in `patch_inline_callbacks` (#7554)
`Failure()` is more cunning than `Failure(e)`.
2020-05-22 10:17:36 +01:00
Richard van der Hoff d84bdfe599
mypy for synapse.http.site (#7553) 2020-05-22 10:12:17 +01:00
Richard van der Hoff 66a564c859
Fix some DETECTED VIOLATIONS in the config file (#7550)
consistency ftw
2020-05-22 10:11:50 +01:00
Brendan Abolivier d1ae1015ec
Retry to sync out of sync device lists (#7453)
When a call to `user_device_resync` fails, we don't currently mark the remote user's device list as out of sync, nor do we retry to sync it.

https://github.com/matrix-org/synapse/pull/6776 introduced some code infrastructure to mark device lists as stale/out of sync.

This commit uses that code infrastructure to mark device lists as out of sync if processing an incoming device list update makes the device handler realise that the device list is out of sync, but we can't resync right now.

It also adds a looping call to retry all failed resync every 30s. This shouldn't cause too much spam in the logs as this commit also removes the "Failed to handle device list update for..." warning logs when catching `NotRetryingDestination`.

Fixes #7418
2020-05-21 17:41:12 +02:00
Richard van der Hoff 0bbbd10513
Stub out GET presence requests in the frontend proxy (#7545)
We don't really make any promises about returning accurate presence data when
presence is disabled, so we may as well just return a static response, rather
than making the master handle a request.
2020-05-21 14:36:46 +01:00
David Vo d74cdc1a42
Ensure worker config exists in systemd service (#7528) 2020-05-21 13:47:23 +01:00
Erik Johnston f6f92845f8
Fix bug in persist events when dealing with non member types. (#7548)
`_is_server_still_joined` will throw if it is given state updates with non-user ID state keys with local user leaves. This is actually rarely a problem since local leaves almost always get persisted by themselves.

(I discovered this on a branch that was otherwise broken, so I haven't seen this in the wild)
2020-05-21 13:20:10 +01:00
Patrick Cloke b2b8699070
Remove Ubuntu Cosmic and Disco which are both EOL. (#7539) 2020-05-20 10:08:46 -04:00
Patrick Cloke 9dc6f3075a
Hash passwords earlier in the password reset process (#7538)
This now matches the logic of the registration process as modified in
56db0b1365 / #7523.
2020-05-20 09:48:03 -04:00
Richard van der Hoff 4fa74c7606
Minor clarifications to the TURN docs (#7533) 2020-05-20 11:04:34 +01:00
Romain Bouyé a57863d2b4
synctl warns when no process is stopped and avoids start (#6598)
* If an error occurs when stopping a process synctl now logs a warning.
* During a restart, synctl will avoid attempting to start Synapse if an error
  occurs during stopping Synapse.
2020-05-19 08:47:45 -04:00
Aaron Raimist 250f3eb991
Omit displayname or avatar_url if they aren't set instead of returning null (#7497)
Per https://github.com/matrix-org/matrix-doc/issues/1436#issuecomment-410089470 they should be omitted instead of returning null or "". They aren't marked as required in the spec.

Fixes https://github.com/matrix-org/synapse/issues/7333

Signed-off-by: Aaron Raimist <aaron@raim.ist>
2020-05-19 10:31:25 +01:00
Erik Johnston 51055c8c44
Allow ReplicationRestResource to be added to workers (#7515)
This allows workers to talk to each other over HTTP replication.
2020-05-18 12:24:48 +01:00
Richard van der Hoff 4d1afb1dfe
Merge pull request #7519 from matrix-org/rav/kill_py2_code
Kill off some old python 2 code
2020-05-18 10:45:30 +01:00
Richard van der Hoff 164f50f5f2
fix mypy for tests/replication (#7518) 2020-05-18 10:43:05 +01:00
Patrick Cloke c29915bd05
Add type hints to room member handlers (#7513) 2020-05-15 15:05:25 -04:00
Richard van der Hoff ab57353de3 changelog 2020-05-15 19:37:41 +01:00
Richard van der Hoff 6c1f7c722f
Fix limit logic for AccountDataStream (#7384)
Make sure that the AccountDataStream presents complete updates, in the right
order.

This is much the same fix as #7337 and #7358, but applied to a different stream.
2020-05-15 19:03:25 +01:00
Patrick Cloke a3cf36f76e
Support UI Authentication for OpenID Connect accounts (#7457) 2020-05-15 12:26:02 -04:00
Erik Johnston 03aff4c75e
Add a worker store for search insertion. (#7516)
This is required as both event persistence and the background update needs access to this function. It should be perfectly safe for two workers to write to that table at the same time.
2020-05-15 17:22:47 +01:00
Andrew Morgan 16090a077f
Prevent 0-member/null room_version rooms from appearing in group room queries (#7465) 2020-05-15 17:17:42 +01:00
Erik Johnston 1f36ff69e8
Move event stream handling out of slave store. (#7491)
This allows us to have the logic on both master and workers, which is necessary to move event persistence off master.

We also combine the instantiation of ID generators from DataStore and slave stores to the base worker stores. This allows us to select which process writes events independently of the master/worker splits.
2020-05-15 16:43:59 +01:00
Patrick Cloke 5355421295
Add type hints to event_auth code. (#7505) 2020-05-15 11:19:43 -04:00
Andrew Morgan 86614e251f
Fix a small typo in the arguments of simple_update in update_remote_profile_cache (#7511) 2020-05-15 16:17:12 +01:00
Richard van der Hoff 24d9151a08
Formatting for reverse-proxy docs (#7514)
also a small clarification to nginx
2020-05-15 15:13:39 +01:00
Jeff Peeler 572b444dab
Add Caddy 2 example (#7463)
The specific headers that are passed using this new configuration format
are Host and X-Forwarded-For, which should be all that's required.

Note that for production another matcher should be added in the first
section to properly handle the base_url lookup:
reverse_proxy /.well-known/matrix/* http://localhost:8008

Signed-off-by: Jeff Peeler <jpeeler@gmail.com>
2020-05-15 14:36:01 +01:00
Patrick Cloke e9f3de0bab
Update the room member handler to use async/await. (#7507) 2020-05-15 09:32:13 -04:00
Patrick Cloke 08bc80ef09
Implement room version 6 (MSC2240). (#7506) 2020-05-15 09:30:10 -04:00
Andrew Morgan 02d97fc3ba
Ignore incoming presence updates when presence is disabled (#7508) 2020-05-15 11:44:00 +01:00
Patrick Cloke 56b66db78a
Strictly enforce canonicaljson requirements in a new room version (#7381) 2020-05-14 13:24:01 -04:00
Patrick Cloke fef3ff5cc4
Enforce MSC2209: auth rules for notifications in power level event (#7502)
In a new room version, the "notifications" key of power level events are
subject to restricted auth rules.
2020-05-14 12:38:17 -04:00
Andrew Morgan 5611644519
Workaround for failure to wrap reason in Failure (#7473) 2020-05-14 17:07:24 +01:00
Richard van der Hoff eafd103fc7
Fix b'GET' in prometheus metrics (#7503) 2020-05-14 17:01:34 +01:00
Andrew Morgan 225c165087
Allow expired accounts to logout (#7443) 2020-05-14 16:32:49 +01:00
Erik Johnston 4734a7bbe4
Move EventStream handling into default ReplicationDataHandler (#7493)
This is so that the logic can happen on both master and workers when we move event persistence out.
2020-05-14 14:01:39 +01:00
Erik Johnston 1de36407d1
Add `instance_map` config and route replication calls (#7495) 2020-05-14 14:00:58 +01:00
Erik Johnston 1124111a12
Allow censoring of events to happen on workers. (#7492)
This is safe as we can now write to cache invalidation stream on workers, and is required for when we move event persistence off master.
2020-05-13 17:15:40 +01:00
Paul Tötterman 46cb2550bb
Fix copypasted comment (#7477)
Signed-off-by: Paul Tötterman <paul.totterman@iki.fi>
2020-05-13 16:55:43 +01:00
Erik Johnston 18c1e52d82
Clean up replication unit tests. (#7490) 2020-05-13 16:01:47 +01:00
Erik Johnston 782e4e64df
Shuffle persist event data store functions. (#7440)
The aim here is to get to a stage where we have a `PersistEventStore` that holds all the write methods used during event persistence, so that we can take that class out of the `DataStore` mixin and instansiate it separately. This will allow us to instansiate it on processes other than master, while also ensuring it is only available on processes that are configured to write to events stream.

This is a bit of an architectural change, where we end up with multiple classes per data store (rather than one per data store we have now). We end up having:

1. Storage classes that provide high level APIs that can talk to multiple data stores.
2. Data store modules that consist of classes that must point at the same database instance.
3. Classes in a data store that can be instantiated on processes depending on config.
2020-05-13 13:38:22 +01:00
Erik Johnston 7ee24c5674
Have all instances correctly respond to REPLICATE command. (#7475)
Before all streams were only written to from master, so only master needed to respond to `REPLICATE` commands.

Before all instances wrote to the cache invalidation stream, but didn't respond to `REPLICATE`. This was a bug, which could lead to missed rows from cache invalidation stream if an instance is restarted, however all the caches would be empty in that case so it wasn't a problem.
2020-05-13 10:27:02 +01:00
Erik Johnston 8ca79613e6
Fix Redis reconnection logic (#7482)
Proactively send out `POSITION` commands (as if we had just received a `REPLICATE`) when we connect to Redis. This is important as other instances won't notice we've connected to issue a `REPLICATE` command (unlike for direct TCP connections). This is only currently an issue if master process reconnects without restarting (if it restarts then it won't have written anything and so other instances probably won't have missed anything).
2020-05-13 09:57:15 +01:00
Patrick Cloke 51fb0fc2e5
Update documentation about SSO mapping providers (#7458) 2020-05-12 10:51:07 -04:00
Erik Johnston 1a1da60ad2
Fix new flake8 errors (#7470) 2020-05-12 11:20:48 +01:00
Patrick Cloke 8c8858e124
Convert federation handler to async/await. (#7459) 2020-05-11 15:12:46 -04:00
Patrick Cloke be309d99cf
Convert search code to async/await. (#7460) 2020-05-11 15:12:39 -04:00
Amber Brown 7cb8b4bc67
Allow configuration of Synapse's cache without using synctl or environment variables (#6391) 2020-05-11 18:45:23 +01:00
Andrew Morgan a8580c5f19
Remove unused store method get_hosts_in_room (#7448) 2020-05-11 16:55:57 +01:00
Andrew Morgan 5cf758cdd6 Merge branch 'release-v1.13.0' into develop
* release-v1.13.0:
  Don't UPGRADE database rows
  RST indenting
  Put rollback instructions in upgrade notes
  Fix changelog typo
  Oh yeah, RST
  Absolute URL it is then
  Fix upgrade notes link
  Provide summary of upgrade issues in changelog. Fix )
  Move next version notes from changelog to upgrade notes
  Changelog fixes
  1.13.0rc1
  Documentation on setting up redis (#7446)
  Rework UI Auth session validation for registration (#7455)
  Fix errors from malformed log line (#7454)
  Drop support for redis.dbid (#7450)
2020-05-11 16:46:33 +01:00
Andrew Morgan 20ffaa7209 1.13.0rc1 2020-05-11 14:54:38 +01:00
Neil Johnson 85155654c5
Documentation on setting up redis (#7446) 2020-05-11 13:21:15 +01:00
Patrick Cloke 0ad6d28b0d
Rework UI Auth session validation for registration (#7455)
Be less strict about validation of UI authentication sessions during
registration to match client expecations.
2020-05-08 16:08:58 -04:00
Andrew Morgan 67feea8044
Extend spam checker to allow for multiple modules (#7435) 2020-05-08 19:25:48 +01:00
Quentin Gliech 616af44137
Implement OpenID Connect-based login (#7256) 2020-05-08 08:30:40 -04:00
Manuel Stahl a4a5ec4096
Add room details admin endpoint (#7317) 2020-05-07 15:33:07 -04:00
Richard van der Hoff aa5aa6f96a
Fix errors from malformed log line (#7454) 2020-05-07 19:51:38 +01:00
Richard van der Hoff da9b2db3af
Drop support for redis.dbid (#7450)
Since we only use pubsub, the dbid is irrelevant.
2020-05-07 16:46:15 +01:00
Brendan Abolivier 5bb26b7c4f Merge branch 'release-v1.13.0' into develop 2020-05-07 17:31:19 +02:00
Patrick Cloke 9e0384dd3f
Fixes typo (bellow -> below) (#7449) 2020-05-07 09:31:06 -04:00
Patrick Cloke 22246919e3
Add more type hints to SAML handler. (#7445) 2020-05-07 09:30:45 -04:00
Erik Johnston d7983b63a6
Support any process writing to cache invalidation stream. (#7436) 2020-05-07 13:51:08 +01:00
Brendan Abolivier 2929ce29d6
Merge pull request #7398 from Starbix/alpine-3.11
Update docker runtime image to Alpine v3.11
2020-05-07 11:56:56 +02:00
Brendan Abolivier d9b8d27494
Add a configuration setting for the dummy event threshold (#7422)
Add dummy_events_threshold which allows configuring the number of forward extremities a room needs for Synapse to send forward extremities in it.
2020-05-07 10:35:23 +01:00
Patrick Cloke d7c2df2fa3
Improve per-block CPU and DB usage metrics (#7426) 2020-05-06 16:43:39 -04:00
Andrew Morgan 4162c39dcf
Port group attestation renewal slow down from matrix-org-hotfixes (#7442) 2020-05-06 20:21:38 +01:00
Richard van der Hoff e053c86a96
Make redis go faster with hiredis (#7439)
For the record, the reason we need this is as follows:

each RDATA command comes down the redis pipe as a subscription message. txredisapi as written needs at least three reactor ticks to read each subscription message from the tcp buffer. Hence, once the process gets loaded, it starts getting behind, and eventually redis knifes the connection. it then takes ages for the master to work its way through the backlog, before it reconnects again, during which any commands from any workers are dropped.
2020-05-06 17:36:46 +01:00
Richard van der Hoff 62ee862119 Merge branch 'release-v1.13.0' into develop 2020-05-06 15:56:03 +01:00
Andrew Morgan aee9130a83
Stop Auth methods from polling the config on every req. (#7420) 2020-05-06 15:54:58 +01:00
Richard van der Hoff fa0b2bd28d
Merge pull request #7428 from matrix-org/rav/cross_signing_keys_cache
Make get_e2e_cross_signing_key delegate to get_e2e_cross_signing_keys_bulk
2020-05-06 12:00:01 +01:00
Richard van der Hoff 16b67c404d Make get_e2e_cross_signing_key delegate to get_e2e_cross_signing_keys_bulk
... mostly because the latter has a cache.
2020-05-06 11:59:19 +01:00
Richard van der Hoff 2e0c46ca07 Merge branch 'release-v1.13.0' into develop 2020-05-06 11:58:31 +01:00
Richard van der Hoff 30a19daa02 Merge branch 'develop' into rav/upsert_for_device_list 2020-05-06 11:43:11 +01:00
Richard van der Hoff e48361545d use an upsert to update device_lists_outbound_last_success 2020-05-06 11:41:23 +01:00
Erik Johnston b26f3e582c
Merge pull request #7423 from matrix-org/erikj/faster_device_lists_fetch
Speed up fetching device lists changes in sync.
2020-05-06 11:14:13 +01:00
Richard van der Hoff a8c17da245 Merge branch 'release-v1.13.0' into rav/fix_dropped_messages 2020-05-05 23:01:12 +01:00
Richard van der Hoff 1242267316 Merge branch 'release-v1.13.0' into rav/fix_dropped_messages 2020-05-05 22:38:44 +01:00
Richard van der Hoff 7bf788ac73 changelog 2020-05-05 22:38:16 +01:00
Brendan Abolivier 5b8023dc7f
Move logs about discarded RDATA to debug (#7421) 2020-05-05 21:07:33 +02:00
Richard van der Hoff 13dd458b8d Merge branch 'release-v1.13.0' into erikj/faster_device_lists_fetch 2020-05-05 18:14:00 +01:00