synapse-old

Commit Graph

Author	SHA1	Message	Date
Patrick Cloke	c619253db8	Stop sub-classing object (#8249 )	2020-09-04 06:54:56 -04:00
Erik Johnston	208e1d3eb3	Fix typing for `@cached` wrapped functions (#8240 ) This requires adding a mypy plugin to fiddle with the type signatures a bit.	2020-09-03 15:38:32 +01:00
Patrick Cloke	d2ac767de2	Convert ReadWriteLock to async/await. (#8202 )	2020-08-28 16:47:11 -04:00
Patrick Cloke	eebf52be06	Be stricter about JSON that is accepted by Synapse (#8106 )	2020-08-19 07:26:03 -04:00
Patrick Cloke	d294f0e7e1	Remove the unused inlineCallbacks code-paths in the caching code (#8119 )	2020-08-19 07:09:07 -04:00
Andrew Morgan	5cf7c12995	Remove : from allowed client_secret chars (#8101 ) Closes: https://github.com/matrix-org/synapse/issues/6766 Equivalent Sydent PR: https://github.com/matrix-org/sydent/pull/309 I believe it's now time to remove the extra allowed `:` from `client_secret` parameters.	2020-08-18 14:14:27 +01:00
Erik Johnston	9d1e4942ab	Fix typing for notifier (#8064 )	2020-08-12 14:03:08 +01:00
Patrick Cloke	4e874ed593	Remove unnecessary maybeDeferred calls (#8044 )	2020-08-07 09:44:48 -04:00
David Vo	4dd27e6d11	Reduce unnecessary whitespace in JSON. (#7372 )	2020-08-07 08:02:55 -04:00
Patrick Cloke	fe6cfc80ec	Convert some util functions to async (#8035 )	2020-08-06 08:39:35 -04:00
Richard van der Hoff	0a86850ba3	Stop the parent process flushing the logs on exit (#8012 ) This solves the problem that the first few lines are logged twice on matrix.org. Hopefully the comments explain it.	2020-08-05 09:35:17 +01:00
Richard van der Hoff	916cf2d439	re-implement daemonize (#8011 ) This has long been something I've wanted to do. Basically the `Daemonize` code is both too flexible and not flexible enough, in that it offers a bunch of features that we don't use (changing UID, closing FDs in the child, logging to syslog) and doesn't offer a bunch that we could do with (redirecting stdout/err to a file instead of /dev/null; having the parent not exit until the child is running). As a first step, I've lifted the Daemonize code and removed the bits we don't use. This should be a non-functional change. Fixing everything else will come later.	2020-08-04 10:03:41 +01:00
Karthikeyan Singaravelan	a7b06a81f0	Fix deprecation warning: import ABC from collections.abc (#7892 )	2020-07-20 13:33:04 -04:00
Patrick Cloke	6b3ac3b8cd	Convert device handler to async/await (#7871 )	2020-07-17 07:09:25 -04:00
Patrick Cloke	38e1fac886	Fix some spelling mistakes / typos. (#7811 )	2020-07-09 09:52:58 -04:00
Dirk Klimpel	21a212f8e5	Fix inconsistent handling of upper and lower cases of email addresses. (#7021 ) fixes #7016	2020-07-03 14:03:13 +01:00
Patrick Cloke	231252516c	Fix "argument of type 'ObservableDeferred' is not iterable" error (#7708 )	2020-06-16 12:01:18 -04:00
Dagfinn Ilmari Mannsåker	a3f11567d9	Replace all remaining six usage with native Python 3 equivalents (#7704 )	2020-06-16 08:51:47 -04:00
Patrick Cloke	bd6dc17221	Replace iteritems/itervalues/iterkeys with native versions. (#7692 )	2020-06-15 07:03:36 -04:00
Andrew Morgan	f4e6495b5d	Performance improvements and refactor of Ratelimiter (#7595 ) While working on https://github.com/matrix-org/synapse/issues/5665 I found myself digging into the `Ratelimiter` class and seeing that it was both: * Rather undocumented, and * causing a lot of config checks This PR attempts to refactor and comment the `Ratelimiter` class, as well as encourage config file accesses to only be done at instantiation. Best to be reviewed commit-by-commit.	2020-06-05 10:47:20 +01:00
Erik Johnston	35c308731d	Speed up processing of federation stream RDATA rows. Instead of storing and sending an ACK for every single row we send synchronously, we instead do it asynchronously while batching up updates.	2020-05-27 19:34:07 +01:00
Erik Johnston	eefc6b3a0d	Don't apply cache factor to event cache. (#7578) This is already correctly done when we instantiate the cache, but wasn't when it got reloaded (which always happens at least once on startup).	2020-05-27 12:04:37 +01:00
Richard van der Hoff	a0f99f81b3	Fix stacktrace mangling in `patch_inline_callbacks` (#7554 ) `Failure()` is more cunning than `Failure(e)`.	2020-05-22 10:17:36 +01:00
Richard van der Hoff	d4676910c9	remove miscellaneous PY2 code	2020-05-15 19:37:41 +01:00
Richard van der Hoff	65902e08c3	remove to_ascii this is a no-op on python 3.	2020-05-15 19:12:03 +01:00
Richard van der Hoff	08fa96f030	Remove `exception_to_unicode` this is a no-op on python 3.	2020-05-15 19:07:24 +01:00
Patrick Cloke	56b66db78a	Strictly enforce canonicaljson requirements in a new room version (#7381 )	2020-05-14 13:24:01 -04:00
Amber Brown	7cb8b4bc67	Allow configuration of Synapse's cache without using synctl or environment variables (#6391 )	2020-05-11 18:45:23 +01:00
Erik Johnston	f9073893af	Speed up fetching device lists changes in sync. Currently we copy `users_who_share_room` needlessly about three times, which is expensive when the set is large (which it can easily be).	2020-05-05 17:40:29 +01:00
Richard van der Hoff	13683a3a22	Extend StreamChangeCache to support multiple entities per stream ID (#7303 ) First some background: StreamChangeCache is used to keep track of what "entities" have changed since a given stream ID. So for example, we might use it to keep track of when the last to-device message for a given user was received [1], and hence whether we need to pull any to-device messages from the database on a sync [2]. Now, it turns out that StreamChangeCache didn't support more than one thing being changed at a given stream_id (this was part of the problem with #7206). However, it's entirely valid to send to-device messages to more than one user at a time. As it turns out, this did in fact work, because some methods of StreamChangeCache coped ok with having multiple things changing on the same stream ID, and it seems we never actually use the methods which don't work on the stream change caches where we allow multiple changes at the same stream ID. But that feels horribly fragile, hence: let's update StreamChangeCache to properly support this, and add some typing and some more tests while we're at it. [1]: https://github.com/matrix-org/synapse/blob/release-v1.12.3/synapse/storage/data_stores/main/deviceinbox.py#L301 [2]: https://github.com/matrix-org/synapse/blob/release-v1.12.3/synapse/storage/data_stores/main/deviceinbox.py#L47-L51	2020-04-22 13:45:40 +01:00
Richard van der Hoff	0f8f02bc39	On catchup, process each row with its own stream id (#7286 ) Other parts of the code (such as the StreamChangeCache) assume that there will not be multiple changes with the same stream id. This code was introduced in #7024, and I hope this fixes #7206.	2020-04-20 11:43:29 +01:00
Richard van der Hoff	7966a1cde9	Rewrite prune_old_outbound_device_pokes for efficiency (#7159 ) make sure we clear out all but one update for the user	2020-03-30 19:06:52 +01:00
Richard van der Hoff	39230d2171	Clean up some LoggingContext stuff (#7120) * Pull Sentinel out of LoggingContext ... and drop a few unnecessary references to it * Factor out LoggingContext.current_context move `current_context` and `set_context` out to top-level functions. Mostly this means that I can more easily trace what's actually referring to LoggingContext, but I think it's generally neater. * move copy-to-parent into `stop` this really just makes `start` and `stop` more symmetric. It also means that it behaves correctly if you manually `set_log_context` rather than using the context manager. * Replace `LoggingContext.alive` with `finished` Turn `alive` into `finished` and make it a bit better defined.	2020-03-24 14:45:33 +00:00
Patrick Cloke	509e381afa	Clarify list/set/dict/tuple comprehensions and enforce via flake8 (#6957 ) Ensure good comprehension hygiene using flake8-comprehensions.	2020-02-21 07:15:07 -05:00
Erik Johnston	ed630ea17c	Reduce amount of logging at INFO level. (#6862) A lot of the things we log at INFO are now a bit superfluous, so let's make them DEBUG logs to reduce the amount we log by default. Co-Authored-By: Brendan Abolivier <babolivier@matrix.org> Co-authored-by: Brendan Abolivier <github@brendanabolivier.com>	2020-02-06 13:31:05 +00:00
Erik Johnston	ae5b3104f0	Fix stacktraces when using ObservableDeferred and async/await (#6836 )	2020-02-03 17:10:54 +00:00
Andrew Morgan	9f7aaf90b5	Validate client_secret parameter (#6767 )	2020-01-24 14:28:40 +00:00
Richard van der Hoff	acc7820574	Log saml assertions rather than the whole response ... since the whole response is huge. We even need to break up the assertions, since kibana otherwise truncates them.	2020-01-16 22:26:34 +00:00
Richard van der Hoff	14d8f342d5	move batch_iter to a separate module	2020-01-16 22:25:32 +00:00
Richard van der Hoff	01243b98e1	Handle `config` not being set for synapse plugin modules Some modules don't need any config, so having to define a `config` property just to keep the loader happy is a bit annoying.	2020-01-12 21:34:36 +00:00
Richard van der Hoff	bc7de87650	Persist auth/state events at backwards extremities when we fetch them (#6526 ) The main point here is to make sure that the state returned by _get_state_in_room has been authed before we try to use it as state in the room.	2019-12-16 12:26:28 +00:00
Hubert Chathi	cb2db17994	look up cross-signing keys from the DB in bulk (#6486 )	2019-12-12 12:03:28 -05:00
Erik Johnston	f166a8d1f5	Remove SnapshotCache in favour of ResponseCache	2019-12-09 13:42:49 +00:00
Richard van der Hoff	18660a34d8	Fix inaccurate per-block metrics (#6491) `Measure` incorrectly assumed that it was the only thing being done by the parent `LoggingContext`. For instance, during a "renew group attestations" operation, hundreds of `outbound_request` calls could take place in parallel, all using the same `LoggingContext`. This would mean that any resources used during any of those calls would be reported against all of them, producing wildly inaccurate results. Instead, we now give each `Measure` block its own `LoggingContext` (using the parent `LoggingContext` mechanism to ensure that the log lines look correct and that the metrics are ultimately propagated to the top level for reporting against requests/background tasks).	2019-12-09 11:55:30 +00:00
Erik Johnston	8437e2383e	Port SyncHandler to async/await	2019-12-05 17:58:25 +00:00
Andrew Morgan	bc29a19731	Replace instance variations of homeserver with correct case/spacing	2019-11-12 13:08:12 +00:00
V02460	affcc2cc36	Fix LruCache callback deduplication (#6213 )	2019-11-07 09:43:51 +00:00
Andrew Morgan	54fef094b3	Remove usage of deprecated logger.warn method from codebase (#6271 ) Replace every instance of `logger.warn` with `logger.warning` as the former is deprecated.	2019-10-31 10:23:24 +00:00
Erik Johnston	6e677403b7	Clarify docstring	2019-10-30 11:52:04 +00:00
Erik Johnston	326b3dace7	Make ObservableDeferred.observe() always return deferred. This makes it easier to use in an async/await world. Also fixes a bug where cache descriptors would occasionally return a raw value rather than a deferred.	2019-10-30 11:35:46 +00:00
Andrew Morgan	b39ca49db1	Handle FileNotFound error in checking git repository version (#6284 )	2019-10-30 11:00:15 +00:00
Erik Johnston	09a135b039	Make concurrently_execute work with async/await	2019-10-29 15:02:23 +00:00
Erik Johnston	e6c7e239ef	Update docstring	2019-10-29 11:48:30 +00:00
Erik Johnston	d0d8a22c13	Quick fix to ensure cache descriptors always return deferreds	2019-10-28 13:33:04 +00:00
Erik Johnston	3c2d6c708c	Add maybe_awaitable and fix __init__ bugs	2019-10-11 15:26:09 +01:00
Erik Johnston	fe1c1e6c28	Fixup comments Co-Authored-By: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>	2019-10-10 13:17:19 +01:00
Erik Johnston	59e0ed8306	Fix py3.5	2019-10-10 12:47:07 +01:00
Erik Johnston	c349e3ebaf	Fix py3.5	2019-10-10 12:29:38 +01:00
Erik Johnston	f735aeec65	sort	2019-10-10 12:20:29 +01:00
Erik Johnston	941edad583	Appease mypy	2019-10-10 12:15:17 +01:00
Erik Johnston	791a8c559b	Add comments	2019-10-10 11:53:57 +01:00
Erik Johnston	ec0596f2ab	Log correct context	2019-10-10 11:11:38 +01:00
Erik Johnston	3e4272961a	Test for sentinel commit	2019-10-10 10:58:32 +01:00
Erik Johnston	1d6dd1c294	Move patch_inline_callbacks into synapse/	2019-10-10 10:53:06 +01:00
Richard van der Hoff	66537e10ce	add some metrics on the federation sender (#6160 )	2019-10-03 17:47:20 +01:00
Amber Brown	864f144543	Fix up some typechecking (#6150 ) * type checking fixes * changelog	2019-10-02 05:29:01 -07:00
Erik Johnston	f44f1d2e83	Fix errors storing large retry intervals. We have set the max retry interval to a value larger than a postgres or sqlite int can hold, which caused exceptions when updating the destinations table. To fix postgres we need to change the column to a bigint, and for sqlite we lower the max interval to 2**62 (which is still incredibly long).	2019-10-02 10:36:27 +01:00
Richard van der Hoff	284e1cb027	Merge branch 'develop' into rav/fix_attribute_mapping	2019-09-19 20:32:25 +01:00
Richard van der Hoff	b74606ea22	Fix a bug with saml attribute maps. Fixes a bug where the default attribute maps were prioritised over user-specified ones, resulting in incorrect mappings. The problem is that if you call SPConfig.load() multiple times, it adds new attribute mappers to a list. So by calling it with the default config first, and then the user-specified config, we would always get the default mappers before the user-specified mappers. To solve this, let's merge the config dicts first, and then pass them to SPConfig.	2019-09-19 20:32:14 +01:00
Richard van der Hoff	1e19ce00bf	Add 'failure_ts' column to 'destinations' table (#6016 ) Track the time that a server started failing at, for general analysis purposes.	2019-09-17 11:41:54 +01:00
Richard van der Hoff	3d882a7ba5	Remove the cap on federation retry interval. (#6026 ) Essentially the intention here is to end up blacklisting servers which never respond to federation requests. Fixes https://github.com/matrix-org/synapse/issues/5113.	2019-09-12 13:00:13 +01:00
Richard van der Hoff	0388beafe4	Fix bug in calculating the federation retry backoff period (#6025 ) This was intended to introduce an element of jitter; instead it gave you a 30/60 chance of resetting to zero.	2019-09-12 12:59:43 +01:00
Andrew Morgan	9fc71dc5ee	Use the v2 Identity Service API for lookups (MSC2134 + MSC2140) (#5976 ) This is a redo of https://github.com/matrix-org/synapse/pull/5897 but with `id_access_token` accepted. Implements [MSC2134](https://github.com/matrix-org/matrix-doc/pull/2134) plus Identity Service v2 authentication ala [MSC2140](https://github.com/matrix-org/matrix-doc/pull/2140). Identity lookup-related functions were also moved from `RoomMemberHandler` to `IdentityHandler`.	2019-09-11 16:02:42 +01:00
Richard van der Hoff	7902bf1e1d	Clean up some code in the retry logic (#6017 ) * remove some unused code * make things which were constants into constants for efficiency and clarity	2019-09-11 15:14:56 +01:00
Andrew Morgan	3057095a5d	Revert "Use the v2 lookup API for 3PID invites (#5897 )" (#5937 ) This reverts commit `71fc04069a`. This broke 3PID invites as #5892 was required for it to work correctly.	2019-08-30 12:00:20 +01:00
Andrew Morgan	71fc04069a	Use the v2 lookup API for 3PID invites (#5897 ) Fixes https://github.com/matrix-org/synapse/issues/5861 Adds support for the v2 lookup API as defined in [MSC2134](https://github.com/matrix-org/matrix-doc/pull/2134). Currently this is only used for 3PID invites. Sytest PR: https://github.com/matrix-org/sytest/pull/679	2019-08-28 14:59:26 +02:00
Erik Johnston	17e1e80726	Retry well-known lookup before expiry. This gives a bit of a grace period where we can attempt to refetch a remote `well-known`, while still using the cached result if that fails. Hopefully this will make the well-known resolution a bit more tolerant of failures, rather than it immediately treating failures as "no result" and caching that for an hour.	2019-08-13 16:20:38 +01:00
Brendan Abolivier	244953be3f	Add kwargs and doc	2019-07-29 10:03:14 +02:00
Brendan Abolivier	08352d44f8	Add ability to pass arguments to looping calls	2019-07-29 09:54:37 +02:00
Richard van der Hoff	618bd1ee76	Fix some error cases in the caching layer. (#5749 ) There was some inconsistent behaviour in the caching layer around how exceptions were handled - particularly synchronously-thrown ones. This seems to be most easily handled by pushing the creation of ObservableDeferreds down from CacheDescriptor to the Cache.	2019-07-25 15:59:45 +01:00
Richard van der Hoff	418635e68a	Add a prometheus metric for active cache lookups. (#5750 ) * Add a prometheus metric for active cache lookups. * changelog	2019-07-24 11:33:13 +01:00
Amber Brown	4806651744	Replace returnValue with return (#5736 )	2019-07-23 23:00:55 +10:00
Erik Johnston	5ea773c505	Cache get_version_string. The version of a module isn't going to change over the lifetime of the process (assuming no funky hot reloading is going on, which it isn't), so let's just cache the result to avoid spawning lots of git subprocesses. Fixes #5672.	2019-07-22 13:15:08 +01:00
Richard van der Hoff	9481707a52	Fixes to the federation rate limiter (#5621 ) - Put the default window_size back to 1000ms (broken by #5181) - Make the `rc_federation` config actually do something - fix an off-by-one error in the 'concurrent' limit - Avoid creating an unused `_PerHostRatelimiter` object for every single incoming request	2019-07-05 11:10:19 +01:00
Amber Brown	1ee268d33d	Improve the backwards compatibility re-exports of synapse.logging.context (#5617 ) * Improve the backwards compatibility re-exports of synapse.logging.context. * reexport logformatter too	2019-07-05 02:32:02 +10:00
Amber Brown	463b072b12	Move logging utilities out of the side drawer of util/ and into logging/ (#5606 )	2019-07-04 00:07:04 +10:00
Richard van der Hoff	cb8d568cf9	Fix 'utime went backwards' errors on daemonization. (#5609 ) * Fix 'utime went backwards' errors on daemonization. Fixes #5608 * remove spurious debug	2019-07-03 22:40:45 +10:00
Richard van der Hoff	91753cae59	Fix a number of "Starting txn from sentinel context" warnings (#5605 ) Fixes #5602, #5603	2019-07-03 09:31:27 +01:00
Amber Brown	0ee9076ffe	Fix media repo breaking (#5593 )	2019-07-02 19:01:28 +01:00
Andrew Morgan	ef8c62758c	Prevent multiple upgrades on the same room at once (#5051 ) Closes #4583 Does slightly less than #5045, which prevented a room from being upgraded multiple times, one after another. This PR still allows that, but just prevents two from happening at the same time. Mostly just to mitigate the fact that servers are slow and it can take a moment for the room upgrade to actually complete. We don't want people sending another request to upgrade the room when really they just thought the first didn't go through.	2019-06-25 14:19:21 +01:00
Richard van der Hoff	dc94773e60	Avoid raising exceptions in metrics Sentry will catch the errors if they happen, so that should be good enough, and won't make things explode if we hit the error condition.	2019-06-24 10:01:16 +01:00
Richard van der Hoff	5097aee740	Merge branch 'develop' into rav/cleanup_metrics	2019-06-24 10:00:13 +01:00
Amber Brown	32e7c9e7f2	Run Black. (#5482 )	2019-06-20 19:32:02 +10:00
Richard van der Hoff	fe641df770	Sanity-checking for metrics updates Check that our clocks go forward.	2019-06-19 21:18:38 +01:00
Richard van der Hoff	aa530e6800	Call RetryLimiter correctly (#5340 ) Fixes a regression introduced in #5335.	2019-06-04 22:02:53 +01:00
Richard van der Hoff	dce6e9e0c1	Avoid rapidly backing-off a server if we ignore the retry interval	2019-06-03 23:58:42 +01:00
Richard van der Hoff	3dcf2feba8	Improve logging for logcontext leaks. (#5288 )	2019-05-29 19:27:50 +01:00
Amber Brown	f1e5b41388	Make all the rate limiting options more consistent (#5181 )	2019-05-15 12:06:04 -05:00
Erik Johnston	0aba6c8251	Merge pull request #5183 from matrix-org/erikj/async_serialize_event Allow client event serialization to be async	2019-05-15 10:36:30 +01:00
Erik Johnston	8ed2f182f7	Update docstring with correct return type Co-Authored-By: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>	2019-05-15 09:52:52 +01:00
Richard van der Hoff	daa2fb6317	comment about user_joined_room	2019-05-14 18:53:09 +01:00
Erik Johnston	b54b03f9e1	Allow client event serialization to be async	2019-05-14 11:58:01 +01:00
Richard van der Hoff	836d3adcce	Merge branch 'master' into develop	2019-05-03 19:25:01 +01:00
Richard van der Hoff	247dc1bd0b	Use SystemRandom for token generation	2019-05-03 13:02:55 +01:00
Andrew Morgan	caa76e6021	Remove periods from copyright headers (#5046 )	2019-04-11 17:08:13 +01:00
Richard van der Hoff	329688c161	Fix disappearing exceptions in manhole. (#5035 ) Avoid sending syntax errors from the manhole to sentry.	2019-04-10 07:23:48 +01:00
Richard van der Hoff	bc5f6e1797	Add a caching layer to .well-known responses (#4516 )	2019-01-30 10:55:25 +00:00
Richard van der Hoff	457fbfaf22	Merge pull request #4486 from xperimental/workaround-4216 Implement workaround for login error.	2019-01-30 07:06:11 +00:00
Robert Jacob	2a7f0b8953	Implement workaround for login error. Signed-off-by: Robert Jacob <xperimental@solidproject.de>	2019-01-30 01:06:39 +01:00
Amber Brown	f815bd7feb	Make linearizer more quiet (#4507 )	2019-01-29 11:05:31 +00:00
Richard van der Hoff	676cf2ee26	Fix incorrect logcontexts after a Deferred was cancelled (#4407 )	2019-01-17 14:00:23 +00:00
Richard van der Hoff	ecc23188f4	Fix UnicodeDecodeError when postgres is not configured in english (#4253 ) This is a bit of a half-assed effort at fixing https://github.com/matrix-org/synapse/issues/4252. Fundamentally the right answer is to drop support for Python 2.	2018-12-04 11:55:52 +01:00
Erik Johnston	b94a43d5b5	Merge branch 'develop' of github.com:matrix-org/synapse into erikj/alias_disallow_list	2018-10-25 15:25:31 +01:00
Richard van der Hoff	5c445114d3	Correctly account for cpu usage by background threads (#4074 ) Wrap calls to deferToThread() in a thing which uses a child logcontext to attribute CPU usage to the right request. While we're in the area, remove the logcontext_tracer stuff, which is never used, and afaik doesn't work. Fixes #4064	2018-10-23 13:12:32 +01:00
Amber Brown	e1728dfcbe	Make scripts/ and scripts-dev/ pass pyflakes (and the rest of the codebase on py3) (#4068 )	2018-10-20 11:16:55 +11:00
Amber Brown	e404ba9aac	Fix manhole on py3 (pt 2) (#4067 )	2018-10-19 22:26:00 +11:00
Erik Johnston	9fafdfa97d	Anchor returned regex to start and end of string	2018-10-19 10:22:45 +01:00
Erik Johnston	084046456e	Add config option to control alias creation	2018-10-19 10:22:45 +01:00
Amber Brown	a36b0ec195	make a bytestring	2018-10-19 09:24:00 +11:00
Erik Johnston	6982320572	Remove unnecessary extra function call layer	2018-10-08 14:06:19 +01:00
Erik Johnston	8a1817f0d2	Use errback pattern and catch async failures	2018-10-08 13:29:47 +01:00
Erik Johnston	f7199e8734	Log looping call exceptions If a looping call function errors, then it kills the loop entirely. Currently it throws away the exception logs, so we should make it actually log them. Fixes #3929	2018-10-05 11:24:12 +01:00
Erik Johnston	4f3e3ac192	Correctly match 'dict.pop' api	2018-10-01 12:25:27 +01:00
Erik Johnston	8ea887856c	Don't update eviction metrics on explicit removal	2018-10-01 12:00:58 +01:00
Richard van der Hoff	9c8cec5dab	Merge remote-tracking branch 'origin/develop' into erikj/destination_retry_cache	2018-09-28 10:51:09 +01:00
Richard van der Hoff	4a15a3e4d5	Include eventid in log lines when processing incoming federation transactions (#3959 ) when processing incoming transactions, it can be hard to see what's going on, because we process a bunch of stuff in parallel, and because we may end up recursively working our way through a chain of three or four events. This commit creates a way to use logcontexts to add the relevant event ids to the log lines.	2018-09-27 11:25:34 +01:00
Richard van der Hoff	5b4028fa78	Merge branch 'rav/fix_expiring_cache_len' into erikj/destination_retry_cache	2018-09-26 12:55:53 +01:00
Richard van der Hoff	7ee94fc1ba	Log which cache is throwing exceptions	2018-09-26 12:43:08 +01:00
Erik Johnston	3baf6e1667	Fix ExpiringCache.__len__ to be accurate It used to try and produce an estimate, which was sometimes negative. This caused metrics to be sad, so let's always just calculate it from scratch. (This appears to have been a longstanding bug, but one which has been made more of a problem by #3932 and #3933). (This was originally done by Erik as part of #3933. I'm cherry-picking it because really it's a fix in its own right)	2018-09-26 12:32:29 +01:00
Erik Johnston	19dc676d1a	Fix ExpiringCache.__len__ to be accurate It used to try and produce an estimate, which was sometimes negative. This caused metrics to be sad, so let's always just calculate it from scratch.	2018-09-21 16:25:42 +01:00
Erik Johnston	fdd1a62e8d	Add a five minute cache to get_destination_retry_timings Hopefully helps with #3931	2018-09-21 14:56:12 +01:00
Erik Johnston	79eded1ae4	Make ExpiringCache slightly more performant	2018-09-21 14:52:21 +01:00
Erik Johnston	8601c24287	Fix some instances of ExpiringCache not expiring cache items ExpiringCache required that `start()` be called before it would actually start expiring entries. A number of places didn't do that. This PR removes `start` from ExpiringCache, and automatically starts the background reaping process on creation instead.	2018-09-21 14:19:46 +01:00
Richard van der Hoff	642199570c	Improve the logging when handling a federation transaction (#3904 ) Let's try to rationalise the logging that happens when we are processing an incoming transaction, to make it easier to figure out what is going wrong when they take ages. In particular: - make everything start with a [room_id event_id] prefix - make sure we log a warning when catching exceptions rather than just turning them into other, more cryptic, exceptions.	2018-09-19 17:28:18 +01:00
Erik Johnston	9407bcf37a	Replace custom DeferredTimeoutError with defer.TimeoutError	2018-09-19 11:07:29 +01:00
Erik Johnston	6c48aa0256	Run canceller first to allow it to generate correct error	2018-09-19 11:07:27 +01:00
Erik Johnston	a334e1cace	Update to use new timeout function everywhere. The existing deferred timeout helper function (and the one into twisted) suffer from a bug when a deferred's canceller throws an exception, #3842. The new helper function doesn't suffer from this problem.	2018-09-19 10:39:40 +01:00
Erik Johnston	24efb2a70d	Fix timeout function Turns out deferred.cancel sometimes throws, so we do that last to ensure that we always do resolve the new deferred.	2018-09-15 11:38:39 +01:00
Erik Johnston	fcfe7a850d	Add an awful secondary timeout to fix wedged requests This is an attempt to mitigate #3842 by adding yet-another-timeout	2018-09-14 19:23:07 +01:00
Erik Johnston	0a81038ea0	Add in flight real time metrics for Measure blocks	2018-09-14 15:08:37 +01:00
Erik Johnston	9e05c8d309	Change the manhole SSH key to have more bits Newer versions of openssh client refuse to connect to the old key due to its length.	2018-09-11 10:42:10 +01:00
Richard van der Hoff	be6527325a	Fix exceptions when a connection is closed before we read the headers This fixes bugs introduced in #3700, by making sure that we behave sanely when an incoming connection is closed before the headers are read.	2018-08-20 18:21:10 +01:00
Richard van der Hoff	55e6bdf287	Robustness fix for logcontext filter Make the logcontext filter not explode if it somehow ends up with a logcontext of None, since that infinite-loops the whole logging system.	2018-08-20 18:20:07 +01:00
Amber Brown	324525f40c	Port over enough to get some sytests running on Python 3 (#3668 )	2018-08-20 23:54:49 +10:00
Richard van der Hoff	c31793a784	Merge branch 'rav/fix_linearizer_cancellation' into develop	2018-08-10 14:57:27 +01:00
Amber Brown	b37c472419	Rename async to async_helpers because `async` is a keyword on Python 3.7 (#3678 )	2018-08-10 23:50:21 +10:00
Richard van der Hoff	638d35ef08	Fix linearizer cancellation on twisted < 18.7 Turns out that cancellation of inlineDeferreds didn't really work properly until Twisted 18.7. This commit refactors Linearizer.queue to avoid inlineCallbacks.	2018-08-10 10:59:09 +01:00
Amber Brown	da7785147d	Python 3: Convert some unicode/bytes uses (#3569 )	2018-08-02 00:54:06 +10:00
Richard van der Hoff	a8cbce0ced	fix invalidation	2018-07-27 16:17:17 +01:00
Richard van der Hoff	f102c05856	Rewrite cache list decorator Because it was complicated and annoyed me. I suspect this will be more efficient too.	2018-07-27 13:47:04 +01:00
Richard van der Hoff	03751a6420	Fix some looping_call calls which were broken in #3604 It turns out that looping_call does check the deferred returned by its callback, and (at least in the case of client_ips), we were relying on this, and I broke it in #3604. Update run_as_background_process to return the deferred, and make sure we return it to clock.looping_call.	2018-07-26 11:48:08 +01:00
Richard van der Hoff	3d6df84658	Test and fix support for cancellation in Linearizer	2018-07-20 13:59:55 +01:00
Richard van der Hoff	7c712f95bb	Combine Limiter and Linearizer Linearizer was effectively a Limiter with max_count=1, so rather than maintaining two sets of code, let's combine them.	2018-07-20 13:11:43 +01:00
Richard van der Hoff	8462c26485	Improvements to the Limiter * give them names, to improve logging * use a deque rather than a list for efficiency	2018-07-20 12:50:27 +01:00
Richard van der Hoff	d7275eecf3	Add a sleep to the Limiter to fix stack overflows. Fixes #3570	2018-07-20 12:37:12 +01:00
Amber Brown	95ccb6e2ec	Don't spew errors because we can't save metrics (#3563 )	2018-07-19 20:58:18 +10:00
Richard van der Hoff	8c69b735e3	Make Distributor run its processes as a background process This is more involved than it might otherwise be, because the current implementation just drops its logcontexts and runs everything in the sentinel context. It turns out that we aren't actually using a bunch of the functionality here (notably suppress_failures and the fact that Distributor.fire returns a deferred), so the easiest way to fix this is actually by simplifying a bunch of code.	2018-07-18 20:55:05 +01:00
Richard van der Hoff	667fba68f3	Run things as background processes This fixes #3518, and ensures that we get useful logs and metrics for lots of things that happen in the background. (There are certainly more things that happen in the background; these are just the common ones I've found running a single-process synapse locally).	2018-07-18 20:55:05 +01:00
Erik Johnston	b2aa05a8d6	Use efficient .intersection	2018-07-17 11:07:04 +01:00
Erik Johnston	547b1355d3	Fix perf regression in PR #3530 The get_entities_changed function was changed to return all changed entities since the given stream position, rather than only those changed from a given list of entities. This resulted in the function incorrectly returning large numbers of entities that, for example, caused large increases in database usage.	2018-07-17 10:27:51 +01:00
Amber Brown	3fe0938b76	Merge pull request #3530 from matrix-org/erikj/stream_cache Don't return unknown entities in get_entities_changed	2018-07-17 13:44:46 +10:00
Richard van der Hoff	33b40d0a25	Make FederationRateLimiter queue requests properly popitem removes the most recent item by default [1]. We want the oldest. Fixes #3524 [1]: https://docs.python.org/2/library/collections.html#collections.OrderedDict.popitem	2018-07-13 16:19:40 +01:00
Erik Johnston	77b692e65d	Don't return unknown entities in get_entities_changed The stream cache keeps track of all entities that have changed since a particular stream position, so get_entities_changed does not need to return unknown entities when given a larger stream position. This makes it consistent with the behaviour of has_entity_changed.	2018-07-13 15:26:10 +01:00
Richard van der Hoff	fa5c2bc082	Reduce set building in get_entities_changed This line shows up as about 5% of cpu time on a synchrotron: not_known_entities = set(entities) - set(self._entity_to_key) Presumably the problem here is that _entity_to_key can be largeish, and building a set for its keys every time this function is called is slow. Here we rewrite the logic to avoid building so many sets.	2018-07-12 11:37:44 +01:00
Richard van der Hoff	c3c29aa196	Attempt to include db threads in cpu usage stats (#3496 ) Let's try to include time spent in the DB threads in the per-request/block cpu usage metrics.	2018-07-10 16:12:36 +01:00
Richard van der Hoff	55370331da	Refactor logcontext resource usage tracking (#3501 ) Factor out the resource usage tracking out to a separate object, which can be passed around and copied independently of the logcontext itself.	2018-07-10 13:56:07 +01:00
Amber Brown	49af402019	run isort	2018-07-09 16:09:20 +10:00
Amber Brown	6350bf925e	Attempt to be more performant on PyPy (#3462 )	2018-06-28 14:49:57 +01:00
Amber Brown	72d2143ea8	Revert "Revert "Try to not use as much CPU in the StreamChangeCache"" (#3454 )	2018-06-28 11:04:18 +01:00
Matthew Hodgson	8057489b26	Revert "Try to not use as much CPU in the StreamChangeCache"	2018-06-26 18:09:01 +01:00
Amber Brown	1202508067	fixes	2018-06-26 17:29:01 +01:00
Amber Brown	bd3d329c88	fixes	2018-06-26 17:28:12 +01:00
Amber Brown	abfe4b2957	try and make loading items from the cache faster	2018-06-26 17:25:34 +01:00
Amber Brown	07cad26d65	Remove all global reactor imports & pass it around explicitly (#3424 )	2018-06-25 14:08:28 +01:00
Richard van der Hoff	43e02c409d	Disable partial state group caching for wildcard lookups When _get_state_for_groups is given a wildcard filter, just do a complete lookup. Hopefully this will give us the best of both worlds by not filling up the ram if we only need one or two keys, but also making the cache still work for the federation reader usecase.	2018-06-22 11:52:07 +01:00
Richard van der Hoff	70e6501913	Merge pull request #3419 from matrix-org/rav/events_per_request Log number of events fetched from DB	2018-06-22 11:17:56 +01:00
Richard van der Hoff	0495fe0035	Indirect evt_count updates via method call so that we can stub it for the sentinel and not have a billion failing UTs	2018-06-22 10:42:28 +01:00
Amber Brown	77ac14b960	Pass around the reactor explicitly (#3385 )	2018-06-22 09:37:10 +01:00
Richard van der Hoff	b088aafcae	Log number of events fetched from DB When we finish processing a request, log the number of events we fetched from the database to handle it. [I'm trying to figure out which requests are responsible for large amounts of event cache churn. It may turn out to be more helpful to add counts to the prometheus per-request/block metrics, but that is an extension to this code anyway.]	2018-06-21 06:15:03 +01:00
Amber Brown	a61738b316	Remove run_on_reactor (#3395 )	2018-06-14 18:27:37 +10:00
Amber Brown	f7869f8f8b	Port to sortedcontainers (with tests!) (#3332 )	2018-06-06 00:13:57 +10:00
Erik Johnston	042eedfa2b	Add hacky cache factor override system	2018-06-04 15:39:28 +01:00
Amber Brown	c936a52a9e	Consistently use six's iteritems and wrap lazy keys/values in list() if they're not meant to be lazy (#3307 )	2018-05-31 19:03:47 +10:00
Amber Brown	debff7ae09	Merge pull request #3281 from NotAFile/py3-six-isinstance remaining isinstance fixes	2018-05-30 12:44:46 +10:00
Adrian Tschira	7873cde526	pep8	2018-05-29 17:35:55 +02:00
Amber Brown	57ad76fa4a	fix up tests	2018-05-28 19:51:53 +10:00
Amber Brown	3ef5cd74a6	update to more consistently use seconds in any metrics or logging	2018-05-28 19:39:27 +10:00
Amber Brown	357c74a50f	add comment about why unreg	2018-05-28 19:14:41 +10:00
Amber Brown	754826a830	Merge remote-tracking branch 'origin/develop' into 3218-official-prom	2018-05-28 18:57:23 +10:00
Adrian Tschira	4ee4450d66	fix recursion error	2018-05-24 21:44:10 +02:00
Adrian Tschira	dd068ca979	remaining isinstance fixes Signed-off-by: Adrian Tschira <nota@notafile.com>	2018-05-24 20:55:08 +02:00
Amber Brown	36501068d8	Merge pull request #3247 from NotAFile/py3-misc Misc Python3 fixes	2018-05-24 12:58:37 -05:00
Amber Brown	2aff6eab6d	Merge pull request #3245 from NotAFile/batch-iter Add batch_iter to utils	2018-05-24 12:54:12 -05:00
Amber Brown	53cc2cde1f	cleanup	2018-05-22 17:32:57 -05:00
Amber Brown	071206304d	cleanup pep8 errors	2018-05-22 16:54:22 -05:00
Amber Brown	85ba83eb51	fixes	2018-05-22 16:28:23 -05:00
Amber Brown	a8990fa2ec	Merge remote-tracking branch 'origin/develop' into 3218-official-prom	2018-05-22 10:50:26 -05:00
Erik Johnston	7948ecf234	Comment	2018-05-22 11:39:43 +01:00
Erik Johnston	020377a550	Fix logcontext resource usage tracking	2018-05-22 11:16:07 +01:00
Amber Brown	df9f72d9e5	replacing portions	2018-05-21 19:47:37 -05:00
Adrian Tschira	45b55e23d3	Add batch_iter to utils There's a frequent idiom I noticed where an iterable is split up into a number of chunks/batches. Unfortunately that method does not work with iterators like dict.keys() in python3. This implementation works with iterators. Signed-off-by: Adrian Tschira <nota@notafile.com>	2018-05-19 17:48:30 +02:00
Adrian Tschira	73cbdef5f7	fix py3 intern and remove unnecessary py3 encode Signed-off-by: Adrian Tschira <nota@notafile.com>	2018-05-19 17:35:31 +02:00
Richard van der Hoff	093d8c415a	Merge remote-tracking branch 'origin/develop' into rav/warn_on_logcontext_fail	2018-05-03 14:59:29 +01:00
Richard van der Hoff	a7fe62f0cb	Fix logcontext leaks in rate limiter	2018-05-03 12:31:59 +01:00
Richard van der Hoff	415c6b672e	Merge branch 'develop' into rav/more_logcontext_leaks	2018-05-02 16:16:01 +01:00
Richard van der Hoff	f22e7cda2c	Fix a class of logcontext leaks So, it turns out that if you have a first `Deferred` `D1`, you can add a callback which returns another `Deferred` `D2`, and `D2` must then complete before any further callbacks on `D1` will execute (and later callbacks on `D1` get the result of `D2` rather than `D2` itself). So, `D1` might have `called=True` (as in, it has started running its callbacks), but any new callbacks added to `D1` won't get run until `D2` completes - so if you `yield D1` in an `inlineCallbacks` function, your `yield` will 'block'. In conclusion: some of our assumptions in `logcontext` were invalid. We need to make sure that we don't optimise out the logcontext juggling when this situation happens. Fortunately, it is easy to detect by checking `D1.paused`.	2018-05-02 11:58:00 +01:00
Richard van der Hoff	e482f8cd85	Fix incorrect reference to StringIO This was introduced in `4f2f5171`	2018-05-02 09:12:26 +01:00
Richard van der Hoff	fdb6849b81	Merge pull request #3144 from matrix-org/rav/run_in_background_exception_handling Trap exceptions thrown within run_in_background	2018-04-30 10:23:02 +01:00
Richard van der Hoff	db75c86e84	Merge branch 'develop' into py3-xrange-1	2018-04-30 01:02:25 +01:00
Richard van der Hoff	049b0b5af2	Merge pull request #3154 from NotAFile/py3-stringio Replace stringIO imports with six	2018-04-30 00:59:04 +01:00
Richard van der Hoff	dbf6f28d64	Merge pull request #3155 from NotAFile/py3-bytes-1 more bytes strings	2018-04-30 00:38:21 +01:00
Richard van der Hoff	aab2e4da60	Merge pull request #3140 from matrix-org/rav/use_run_in_background Use run_in_background in preference to preserve_fn	2018-04-30 00:34:28 +01:00
Adrian Tschira	e9143b6593	more bytes strings Signed-off-by: Adrian Tschira <nota@notafile.com>	2018-04-29 00:13:57 +02:00
Adrian Tschira	d82b6ea9e6	Move more xrange to six plus a bonus next() Signed-off-by: Adrian Tschira <nota@notafile.com>	2018-04-28 13:57:00 +02:00
Adrian Tschira	4f2f5171b7	replace stringIO imports	2018-04-28 13:46:23 +02:00
Richard van der Hoff	fc149b4eeb	Merge remote-tracking branch 'origin/develop' into rav/use_run_in_background	2018-04-27 14:31:23 +01:00
Richard van der Hoff	6146332387	Merge remote-tracking branch 'origin/develop' into rav/deferred_timeout	2018-04-27 14:18:00 +01:00
Richard van der Hoff	2a13af23bc	Use run_in_background in preference to preserve_fn While I was going through uses of preserve_fn for other PRs, I converted places which only use the wrapped function once to use run_in_background, to avoid creating the function object.	2018-04-27 12:55:51 +01:00
Richard van der Hoff	9d2c1b8429	Backport deferred.addTimeout Twisted 16.0 doesn't have addTimeout, so let's backport it.	2018-04-27 12:52:30 +01:00
Richard van der Hoff	13843f771e	Trap exceptions thrown within run_in_background Turn any exceptions that get thrown synchronously within run_in_background into Failures instead.	2018-04-27 12:17:13 +01:00
Richard van der Hoff	9255a6cb17	Improve exception handling for background processes There were a bunch of places where we fire off a process to happen in the background, but don't have any exception handling on it - instead relying on the unhandled error being logged when the relevant deferred gets garbage-collected. This is unsatisfactory for a number of reasons: - logging on garbage collection is best-effort and may happen some time after the error, if at all - it can be hard to figure out where the error actually happened. - it is logged as a scary CRITICAL error which (a) I always forget to grep for and (b) it's not really CRITICAL if a background process we don't care about fails. So this is an attempt to add exception handling to everything we fire off into the background.	2018-04-27 11:07:40 +01:00
Richard van der Hoff	1ea904b9f0	Use deferred.addTimeout instead of time_bound_deferred This doesn't feel like a wheel we need to reinvent.	2018-04-23 00:53:18 +01:00
Richard van der Hoff	8dc4a6144b	Merge pull request #3107 from NotAFile/py3-bool-nonzero add __bool__ alias to __nonzero__ methods	2018-04-20 15:43:39 +01:00
Richard van der Hoff	c09a6daf09	Merge pull request #3110 from NotAFile/py3-six-queue Replace Queue with six.moves.queue	2018-04-20 15:35:00 +01:00
Richard van der Hoff	11a67b7c9d	Merge pull request #3093 from matrix-org/rav/response_cache_wrap Refactor ResponseCache usage	2018-04-20 11:31:17 +01:00
Adrian Tschira	878995e660	Replace Queue with six.moves.queue and a six.range change which I missed the last time Signed-off-by: Adrian Tschira <nota@notafile.com>	2018-04-16 00:46:21 +02:00
Adrian Tschira	f63ff73c7f	add __bool__ alias to __nonzero__ methods Signed-off-by: Adrian Tschira <nota@notafile.com>	2018-04-15 20:40:47 +02:00
Richard van der Hoff	d3347ad485	Revert "Use sortedcontainers instead of blist" This reverts commit `9fbe70a7dc`. It turns out that sortedcontainers.SortedDict is not an exact match for blist.sorteddict; in particular, `popitem()` removes things from the opposite end of the dict. This is trivial to fix, but I want to add some unit tests, and potentially some more thought about it, before we do so.	2018-04-13 11:16:43 +01:00
Richard van der Hoff	60f6014bb7	ResponseCache: fix handling of completed results Turns out that ObservableDeferred.observe doesn't return a deferred if the result is already completed. Fix handling and improve documentation.	2018-04-13 07:32:29 +01:00
Richard van der Hoff	b78395b7fe	Refactor ResponseCache usage Adds a `.wrap` method to ResponseCache which wraps up the boilerplate of a (get, set) pair, and then use it throughout the codebase. This will be largely non-functional, but does include the following functional changes: * federation_server.on_context_state_request: drops use of _server_linearizer which looked redundant and could cause incorrect cache misses by yielding between the get and the set. * RoomListHandler.get_remote_public_room_list(): fixes logcontext leaks * the wrap function includes some logging. I'm hoping this won't be too noisy on production.	2018-04-12 13:02:15 +01:00
Richard van der Hoff	d5c74b9f6c	Merge pull request #3092 from matrix-org/rav/response_cache_metrics Add metrics for ResponseCache	2018-04-12 12:59:36 +01:00
Richard van der Hoff	261124396e	Merge pull request #3059 from matrix-org/rav/doc_response_cache Document the behaviour of ResponseCache	2018-04-12 11:22:30 +01:00
Richard van der Hoff	b3384232a0	Add metrics for ResponseCache	2018-04-10 23:14:47 +01:00
Vincent Breitmoser	9fbe70a7dc	Use sortedcontainers instead of blist This commit drop-in replaces blist with SortedContainers. They are written in pure python so work with pypy, but perform as well as native implementations, at least in a couple of benchmarks: http://www.grantjenks.com/docs/sortedcontainers/performance.html	2018-04-10 11:29:51 +02:00
Richard van der Hoff	13decdbf96	Revert "Merge pull request #3066 from matrix-org/rav/remove_redundant_metrics" We aren't ready to release this yet, so I'm reverting it for now. This reverts commit `d1679a4ed7`, reversing changes made to `e089100c62`.	2018-04-09 12:59:12 +01:00
Richard van der Hoff	3449da3bc7	Merge pull request #3068 from matrix-org/rav/fix_cache_invalidation Improve database cache performance	2018-04-05 17:21:44 +01:00
Richard van der Hoff	01afc563c3	Fix overzealous cache invalidation Fixes an issue where a cache invalidation would invalidate all pending entries, rather than just the entry that we intended to invalidate.	2018-04-05 16:24:04 +01:00
Richard van der Hoff	518f6de088	Remove redundant metrics which were deprecated in 0.27.0.	2018-04-04 19:46:28 +01:00
Richard van der Hoff	a9a74101a4	Document the behaviour of ResponseCache it looks like everything that uses ResponseCache expects to have to `make_deferred_yieldable` its results. It's debatable whether that is the best approach, but let's document it for now to avoid further confusion.	2018-04-04 09:06:22 +01:00
Richard van der Hoff	05630758f2	Use static JSONEncoders using json.dumps with custom options requires us to create a new JSONEncoder on each call. It's more efficient to create one upfront and reuse it.	2018-03-29 23:13:33 +01:00
Matthew Hodgson	8cbbfaefc1	404 correctly on missing paths via NoResource fixes https://github.com/matrix-org/synapse/issues/2043 and https://github.com/matrix-org/synapse/issues/2029	2018-03-23 10:32:50 +00:00
Erik Johnston	9a0d783c11	Add comments	2018-03-19 11:35:53 +00:00
Richard van der Hoff	5a6e54264d	Make 'unexpected logging context' into warnings I think we've now fixed enough of these that the rest can be logged at warning.	2018-03-15 18:40:38 +00:00
Erik Johnston	7c7706f42b	Fix bug where state cache used lots of memory The state cache bases its size on the sum of the size of entries. The size of the entry is calculated once on insertion, so it is important that the size of entries does not change. The DictionaryCache modified the entries size, which caused the state cache to incorrectly think it was smaller than it actually was.	2018-03-15 15:46:54 +00:00
Richard van der Hoff	20f40348d4	Factor run_in_background out from preserve_fn It annoys me that we create temporary function objects when there's really no need for it. Let's factor the gubbins out of preserve_fn and start using it.	2018-03-08 11:50:11 +00:00
Richard van der Hoff	3a75de923b	Rewrite make_deferred_yieldable avoiding inlineCallbacks ... because (a) it's actually simpler (b) it might be marginally more performant?	2018-03-01 12:40:05 +00:00
Richard van der Hoff	bc496df192	report metrics on number of cache evictions	2018-02-05 15:34:01 +00:00
Matthew Hodgson	ab9f844aaf	Add federation_domain_whitelist option (#2820 ) Add federation_domain_whitelist gives a way to restrict which domains your HS is allowed to federate with. useful mainly for gracefully preventing a private but internet-connected HS from trying to federate to the wider public Matrix network	2018-01-22 19:11:18 +01:00
Matthew Hodgson	d84f65255e	Merge pull request #2813 from matrix-org/matthew/registrations_require_3pid add registrations_require_3pid and allow_local_3pids	2018-01-22 13:57:22 +00:00
Matthew Hodgson	8fe253f19b	fix PR nitpicking	2018-01-19 18:23:45 +00:00
Matthew Hodgson	447f4f0d5f	rewrite based on PR feedback: * [ ] split config options into allowed_local_3pids and registrations_require_3pid * [ ] simplify and comment logic for picking registration flows * [ ] fix docstring and move check_3pid_allowed into a new util module * [ ] use check_3pid_allowed everywhere @erikjohnston PTAL	2018-01-19 15:33:55 +00:00
Erik Johnston	b6dc7044a9	Merge pull request #2804 from matrix-org/erikj/file_consumer Add decent impl of a FileConsumer	2018-01-18 16:31:33 +00:00
Richard van der Hoff	d57765fc8a	Fix bugs in block metrics ... which I introduced in #2785	2018-01-18 12:24:42 +00:00
Erik Johnston	be0dfcd4a2	Do logcontexts correctly	2018-01-18 11:57:57 +00:00
Erik Johnston	1432f7ccd5	Move test stuff to tests	2018-01-18 11:57:57 +00:00
Erik Johnston	2f18a2647b	Make all fields private	2018-01-18 11:57:54 +00:00
Erik Johnston	dc519602ac	Ensure registerProducer isn't called twice	2018-01-18 11:07:17 +00:00
Erik Johnston	17b54389fe	Fix _notify_empty typo	2018-01-18 11:05:34 +00:00
Erik Johnston	28b338ed9b	Move definition of paused_producer to __init__	2018-01-18 11:04:41 +00:00
Erik Johnston	a177325b49	Fix comments	2018-01-18 11:02:43 +00:00
Erik Johnston	bc67e7d260	Add decent impl of a FileConsumer Twisted core doesn't have a general purpose one, so we need to write one ourselves. Features: - All writing happens in background thread - Supports both push and pull producers - Push producers get paused if the consumer falls behind	2018-01-17 16:43:03 +00:00
Richard van der Hoff	3d12d97415	Track DB scheduling delay per-request For each request, track the amount of time spent waiting for a db connection. This entails adding it to the LoggingContext and we may as well add metrics for it while we are passing.	2018-01-16 17:23:32 +00:00
Richard van der Hoff	6324b65f08	Track db txn time in millisecs ... to reduce the amount of floating-point foo we do.	2018-01-16 15:53:18 +00:00
Richard van der Hoff	44a498418c	Optimise LoggingContext creation and copying It turns out that the only thing we use the __dict__ of LoggingContext for is `request`, and given we create lots of LoggingContexts and then copy them every time we do a db transaction or log line, using the __dict__ seems a bit redundant. Let's try to optimise things by making the request attribute explicit.	2018-01-16 15:49:42 +00:00
Richard van der Hoff	39f4e29d01	Reorganise request and block metrics In order to circumvent the number of duplicate foo:count metrics increasing without bounds, it's time for a rearrangement. The following are all deprecated, and replaced with synapse_util_metrics_block_count: synapse_util_metrics_block_timer:count synapse_util_metrics_block_ru_utime:count synapse_util_metrics_block_ru_stime:count synapse_util_metrics_block_db_txn_count:count synapse_util_metrics_block_db_txn_duration:count The following are all deprecated, and replaced with synapse_http_server_response_count: synapse_http_server_requests synapse_http_server_response_time:count synapse_http_server_response_ru_utime:count synapse_http_server_response_ru_stime:count synapse_http_server_response_db_txn_count:count synapse_http_server_response_db_txn_duration:count The following are renamed (the old metrics are kept for now, but deprecated): synapse_util_metrics_block_timer:total -> synapse_util_metrics_block_time_seconds synapse_util_metrics_block_ru_utime:total -> synapse_util_metrics_block_ru_utime_seconds synapse_util_metrics_block_ru_stime:total -> synapse_util_metrics_block_ru_stime_seconds synapse_util_metrics_block_db_txn_count:total -> synapse_util_metrics_block_db_txn_count synapse_util_metrics_block_db_txn_duration:total -> synapse_util_metrics_block_db_txn_duration_seconds synapse_http_server_response_time:total -> synapse_http_server_response_time_seconds synapse_http_server_response_ru_utime:total -> synapse_http_server_response_ru_utime_seconds synapse_http_server_response_ru_stime:total -> synapse_http_server_response_ru_stime_seconds synapse_http_server_response_db_txn_count:total -> synapse_http_server_response_db_txn_count synapse_http_server_response_db_txn_duration:total -> synapse_http_server_response_db_txn_duration_seconds	2018-01-15 17:09:44 +00:00
Richard van der Hoff	b2cd6accf5	Remove __PreservingContextDeferred too	2017-11-14 23:00:10 +00:00
Richard van der Hoff	7e6fa29cb5	Remove preserve_context_over_{fn, deferred} Both of these functions are known to leak logcontexts. Replace the remaining calls to them and kill them off.	2017-11-14 11:22:42 +00:00
Richard van der Hoff	bf993db11c	Logging and logcontext fixes for Limiter Add some logging to the Limiter in a similar spirit to the Linearizer, to help debug issues. Also fix a logcontext leak. Also refactor slightly to avoid throwing exceptions.	2017-11-07 00:48:57 +00:00
Richard van der Hoff	0be99858f3	fix vars named `l` E741 says "do not use variables named ‘l’, ‘O’, or ‘I’".	2017-10-23 15:56:38 +01:00
Richard van der Hoff	eaaabc6c4f	replace 'except:' with 'except Exception:' what could possibly go wrong	2017-10-23 15:52:32 +01:00
Richard van der Hoff	2e9f5ea31a	Fix logcontext handling for persist_events * don't use preserve_context_over_deferred, which is known broken. * remove a redundant preserve_fn. * add/improve some comments	2017-10-17 10:59:30 +01:00
Richard van der Hoff	cc794d60e7	Merge pull request #2532 from matrix-org/rav/fix_linearizer Fix stackoverflow and logcontexts from linearizer	2017-10-11 17:29:32 +01:00
Richard van der Hoff	f30c4ed2bc	logformatter: fix AttributeError make sure we have the relevant fields before we try to log them.	2017-10-11 17:26:17 +01:00
Richard van der Hoff	4fad8efbfb	Fix stackoverflow and logcontexts from linearizer 1. make it not blow out the stack when there are more than 50 things waiting for a lock. Fixes https://github.com/matrix-org/synapse/issues/2505. 2. Make it not mess up the log contexts.	2017-10-11 15:05:05 +01:00
Richard van der Hoff	3cc852d339	Fancy logformatter to format exceptions better This is a bit of an experimental change at this point; the idea is to see if it helps us track down where our stack overflows are coming from by logging the stack when the exception was caught and turned into a Failure. (We'll also need `edf2704420`). If we deploy this, we'll be able to enable it via the log config yaml.	2017-10-09 17:44:42 +01:00
Richard van der Hoff	148428ce76	Fix logcontext handling for concurrently_execute Avoid preserve_context_over_deferred, which is broken.	2017-10-06 22:24:28 +01:00
David Baker	8ad5f34908	pep8	2017-09-26 19:21:41 +01:00
David Baker	9fd086e506	unnecessary parens	2017-09-26 17:59:46 +01:00
David Baker	0b03a97708	Add module_loader.py	2017-09-26 17:56:41 +01:00
Erik Johnston	495f075b41	Increase default cache factor size.	2017-07-04 09:58:32 +01:00
Erik Johnston	b5e8d529e6	Define CACHE_SIZE_FACTOR once	2017-07-04 09:56:44 +01:00
Erik Johnston	c72058bcc6	Use an ExpiringCache for storing registration sessions This is because pruning them was a significant performance drain on matrix.org	2017-06-29 14:08:37 +01:00
Erik Johnston	efc2b7db95	Rewrite conditional	2017-06-09 13:35:15 +01:00
Erik Johnston	eed59dcc1e	Fix has_any_entity_changed Occasionally has_any_entity_changed would throw the error: "Set changed size during iteration" when taking the max of the `sorteddict`. While it's uncertain how that happens, it's quite inefficient to iterate over the entire dict anyway so we change to using the more traditional `bisect_*` functions.	2017-06-09 11:44:01 +01:00
Erik Johnston	304880d185	Add stream change cache	2017-05-31 15:46:36 +01:00
Erik Johnston	bd7bb5df71	Pull out if statement from for loop	2017-05-22 15:12:19 +01:00
Erik Johnston	e3417a06e2	Update list cache to handle one arg case We update the normal cache descriptors to handle caches with a single argument specially so that the key wasn't a 1-tuple. We need to update the cache list to be aware of this.	2017-05-22 15:04:42 +01:00
Erik Johnston	bbfe4e996c	Make get_state_groups_from_groups faster. Most of the time was spent copying a dict to filter out sentinel values that indicated that keys did not exist in the dict. The sentinel values were added to ensure that we cached the non-existence of keys. By updating DictionaryCache to keep track of which keys were known to not exist itself we can remove a dictionary copy.	2017-05-17 15:12:15 +01:00
Erik Johnston	ffad4fe35b	Don't update event cache hit ratio from get_joined_users Otherwise the hit ratio of plain get_events gets completely skewed by calls to get_joined_users* functions.	2017-05-08 16:06:17 +01:00
Erik Johnston	d2d8ed4884	Optimise caches with single key	2017-05-04 14:18:46 +01:00
Richard van der Hoff	2e996271fe	Instantiate DeferredTimedOutError correctly Call `super` correctly, so that we correctly initialise the `errcode` field. Fixes https://github.com/matrix-org/synapse/issues/2179.	2017-05-02 13:26:17 +01:00
Erik Johnston	d9aa645f86	Reduce size of joined_user cache The _get_joined_users_from_context cache stores a mapping from user_id to avatar_url and display_name. Instead of storing those in a dict, store them in a namedtuple as that uses much less memory. We also try converting the string to ascii to further reduce the size.	2017-04-25 14:38:51 +01:00
Erik Johnston	efab1dadde	Remove DEBUG_CACHES	2017-04-25 10:54:09 +01:00
Erik Johnston	119cb9bbcf	Reduce cache size by not storing deferreds Currently the cache descriptors store deferreds rather than raw values, this is a simple way of triggering only one database hit and sharing the result if two callers attempt to get the same value. However, there are a few caches that simply store a mapping from string to string (or int). These caches can have a large number of entries, under the assumption that each entry is small. However, the size of a deferred (specifically the size of ObservableDeferred) is significantly larger than that of the raw value, 2kb vs 32b. This PR therefore changes the cache descriptors to store the raw values rather than the deferreds. As a side effect cached storage functions now either return a deferred or the actual value, as the cached list descriptor already does. This is fine as we always end up just yield'ing on the returned value eventually, which handles that case correctly.	2017-04-25 10:23:11 +01:00
Erik Johnston	d134d0935e	Only intern ascii strings	2017-04-24 14:07:48 +01:00
Richard van der Hoff	e2eebf1696	Fix fixme in preserve_fn `preserve_fn` is no longer used as a decorator anywhere, so we can safely fix a fixme therein.	2017-04-03 15:38:02 +01:00
Erik Johnston	4d17add8de	Remove unused instance variable	2017-03-31 09:38:27 +01:00
Erik Johnston	5b5b171f3e	Docs	2017-03-30 17:05:53 +01:00
Erik Johnston	b282fe7170	Revert log context change	2017-03-30 17:03:59 +01:00
Erik Johnston	6194a64ae9	Doc new instance variables	2017-03-30 14:19:10 +01:00
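
Note on 33b40d0a25 ("Make FederationRateLimiter queue requests properly"): that message points out that OrderedDict.popitem removes the most recent item by default, whereas a queue wants the oldest. The snippet below is only a minimal sketch of that standard-library behaviour, not the Synapse code itself; the `queue` variable and the `req-*` keys are made up for illustration.

```python
from collections import OrderedDict

# Hypothetical queue of pending requests, keyed by an illustrative request id.
queue = OrderedDict()
for request_id in ("req-1", "req-2", "req-3"):
    queue[request_id] = "payload for " + request_id

# Default behaviour: popitem() is LIFO and removes the most recently added entry.
newest = queue.popitem()

# popitem(last=False) is FIFO and removes the oldest remaining entry,
# which is what a fair request queue would want.
oldest = queue.popitem(last=False)

print(newest[0], oldest[0], list(queue))
```

Passing last=False is the usual way to get first-in, first-out ordering from an OrderedDict without switching to a different container.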

852 Commits