Commit Graph

220 Commits

Author SHA1 Message Date
Matthew Hodgson 9b334b3f97 WIP experiment in lazyloading room members 2018-03-11 20:01:41 +00:00
Erik Johnston 106906a65e Don't serialize current state over replication 2018-02-15 13:53:18 +00:00
Erik Johnston fd1601c596 Fix state group storage bug in workers
We needed to move `_count_state_group_hops_txn` to the
StateGroupWorkerStore.
2018-02-15 11:04:32 +00:00
Erik Johnston 3d33eef6fc
Store state groups separately from events (#2784)
* Split state group persist into seperate storage func

* Add per database engine code for state group id gen

* Move store_state_group to StateReadStore

This allows other workers to use it, and so resolve state.

* Hook up store_state_group

* Fix tests

* Rename _store_mult_state_groups_txn

* Rename StateGroupReadStore

* Remove redundant _have_persisted_state_group_txn

* Update comments

* Comment compute_event_context

* Set start val for state_group_id_seq

... otherwise we try to recreate old state groups

* Update comments

* Don't store state for outliers

* Update comment

* Update docstring as state groups are ints
2018-02-06 14:31:24 +00:00
Richard van der Hoff 35a4b63240 Pull out bits of StateStore to a mixin
... so that we don't need to secretly gut-wrench it for use in the slaved
stores. I haven't done the other stores yet, but we should. I'm tired of the
workers breaking every time we tweak the stores because I forgot to gut-wrench
the right method.

fixes https://github.com/matrix-org/synapse/issues/2655.
2017-11-14 11:43:58 +00:00
Richard van der Hoff 4dd1bfa8c1 Revert "Revert "move _state_group_cache to statestore""
We're going to fix this properly on this branch, so that the _state_group_cache
can end up in StateGroupReadStore.

This reverts commit ab335edb02.
2017-11-14 11:43:58 +00:00
Richard van der Hoff 6cfee09be9 Make __init__ consitstent across Store heirarchy
Add db_conn parameters to the `__init__` methods of the *Store classes, so that
they are all consistent, which makes the multiple inheritance work correctly
(and so that we can later extract mixins which can be used in the slavedstores)
2017-11-13 10:46:07 +00:00
Erik Johnston ab335edb02 Revert "move _state_group_cache to statestore"
This reverts commit f5cf3638e9.
2017-11-13 10:05:33 +00:00
Richard van der Hoff f5cf3638e9 move _state_group_cache to statestore
this is internal to statestore, so let's keep it there.
2017-11-07 16:43:00 +00:00
Erik Johnston 5946aa0877 Prefill forward extrems and event to state groups 2017-06-29 15:38:48 +01:00
Erik Johnston 8060974344 Fix replication 2017-06-09 16:40:52 +01:00
Erik Johnston b0d975e216 Comments 2017-06-09 16:25:42 +01:00
Erik Johnston e54d7d536e Cache state deltas 2017-06-09 16:24:00 +01:00
Erik Johnston ea11ee09f3 Ensure we don't use unpersisted state group as prev group 2017-06-08 11:59:57 +01:00
Erik Johnston 6ba21bf2b8 Comments 2017-06-07 11:08:36 +01:00
Erik Johnston dfbda5e025 Faster cache for get_joined_hosts 2017-05-25 17:24:44 +01:00
Erik Johnston bbfe4e996c Make get_state_groups_from_groups faster.
Most of the time was spent copying a dict to filter out sentinel values
that indicated that keys did not exist in the dict. The sentinel values
were added to ensure that we cached the non-existence of keys.

By updating DictionaryCache to keep track of which keys were known to
not exist itself we can remove a dictionary copy.
2017-05-17 15:12:15 +01:00
Erik Johnston 608b5a6317 Take a copy before prefilling, as it may be a frozendict 2017-05-16 12:55:29 +01:00
Erik Johnston e4435b014e Update comment 2017-05-15 15:11:30 +01:00
Erik Johnston 871605f4e2 Comments 2017-05-15 15:11:30 +01:00
Erik Johnston bfbc907cec Prefill state caches 2017-05-15 15:11:13 +01:00
Erik Johnston 587f07543f Revert "Prefill state caches" 2017-05-04 15:07:27 +01:00
Erik Johnston 2c2dcf81d0 Update comment 2017-05-03 10:00:29 +01:00
Erik Johnston 1827057acc Comments 2017-05-03 09:56:05 +01:00
Erik Johnston a2c89a225c Prefill state caches 2017-05-02 10:40:31 +01:00
Erik Johnston f053a1409e Make state caches cache in ascii 2017-04-25 17:22:55 +01:00
Erik Johnston 22f3d3ae76 Reduce _get_state_group_for_event cache size 2017-04-25 11:43:03 +01:00
Erik Johnston e4f3431116 Remove unused cache 2017-04-24 13:27:38 +01:00
Erik Johnston 2a3e822f44 Comment 2017-04-07 13:47:04 +01:00
Erik Johnston d72667fcce Speed up get_current_state_ids
Using _simple_select_list is fairly expensive for functions that return
a lot of rows and/or get called a lot. (This is because it carefully
constructs a list of dicts).

get_current_state_ids gets called a lot on startup and e.g. when the IRC
bridge decided to send tonnes of joins/leaves (as it invalidates the
cache). We therefore replace it with a custon txn function that builds
up the final result dict without building up and intermediate
representation.
2017-04-07 10:10:49 +01:00
Erik Johnston 3ce8d59176 Increase cache size for _get_state_group_for_event 2017-03-29 14:31:46 +01:00
Erik Johnston d58b1ffe94 Replace some calls to cursor_to_dict
cursor_to_dict can be surprisinglh expensive for large result sets, so lets
only call it when we need to.
2017-03-24 11:07:02 +00:00
Erik Johnston e71940aa64 Use iter(items|values) 2017-03-24 10:57:02 +00:00
Erik Johnston 00957d1aa4 User Cursor.__iter__ instead of fetchall
This prevents unnecessary construction of lists
2017-03-23 17:53:49 +00:00
Richard van der Hoff 0c01f829ae Avoid resetting state on rejected events
When we get a rejected event, give it the same state_group as its prev_event,
rather than no state_group at all.

This should fix https://github.com/matrix-org/synapse/issues/1935.
2017-03-17 15:06:08 +00:00
Erik Johnston 8ffbe43ba1 Get current state by using current_state_events table 2017-03-10 17:39:35 +00:00
Richard van der Hoff fc2f29c1d0 Fix bugs in the /keys/changes api
* `get_forward_extremeties_for_room` takes a numeric `stream_ordering`. We were
  passing a `RoomStreamToken`, which meant that it returned the *current*
  extremities, rather than those corresponding to the `from_token`. However:
* `get_state_ids_for_events` required a second ('types') parameter; this meant
  that a `TypeError` was thrown and we ended up acting as though there was *no*
  prev state.
* `get_state_ids_for_events` actually returns a map from event_id to state
  dictionary - just looking up the state keys in it again meant that we acted
  as though there was no prev state. We now check if each member's state has
  changed since *any* of the extremities.

Also add/fix some comments.
2017-02-14 13:59:50 +00:00
Erik Johnston 21b7375778 Add an index to make membership queries faster 2017-01-31 15:15:57 +00:00
Erik Johnston a55fa2047f Insert delta of current_state_events to be more efficient 2017-01-20 17:10:18 +00:00
Erik Johnston 897f8752da Up cache max entries for state 2017-01-16 15:08:17 +00:00
Erik Johnston 01521299c7 Increase cache size limit 2017-01-16 11:56:51 +00:00
Erik Johnston 2fae34bd2c Optionally measure size of cache by sum of length of values 2017-01-13 17:46:17 +00:00
Matthew Hodgson 8cfc0165e9 fix annoying typos 2017-01-05 13:39:43 +00:00
Erik Johnston 587d8ac60f Correctly intern keys in state cache 2016-11-08 11:53:25 +00:00
Erik Johnston 4974147aa3 Remove duplication 2016-09-27 09:27:54 +01:00
Erik Johnston 13122e5e24 Remove unused variable 2016-09-27 09:21:51 +01:00
Erik Johnston cf3e1cc200 Fix perf of fetching state in SQLite 2016-09-26 17:16:24 +01:00
Erik Johnston 00f51493f5 Fix reindex 2016-09-14 10:18:30 +01:00
Erik Johnston d5ae1f1291 Ensure we don't mutate state cache entries 2016-09-14 10:03:48 +01:00
Erik Johnston 03a98aff3c Create new index concurrently 2016-09-12 14:27:01 +01:00
Erik Johnston 897d57bc58 Change state fetch query for postgres to be faster
It turns out that postgres doesn't like doing a list of OR's and is
about 1000x slower, so we just issue a query for each specific type
seperately.
2016-09-12 10:05:07 +01:00
Erik Johnston 5beda10bbd Reindex state_groups_state after pruning 2016-09-08 16:18:01 +01:00
Erik Johnston b568ca309c Temporarily disable sequential scans for state fetching 2016-09-08 09:38:54 +01:00
Erik Johnston 513188aa56 Comment 2016-09-07 14:53:23 +01:00
Erik Johnston fadb01551a Add appopriate framing clause 2016-09-07 14:39:01 +01:00
Erik Johnston d25c20ccbe Use windowing function to make use of index 2016-09-07 14:22:22 +01:00
Erik Johnston 0595413c0f Scale the batch size so that we're not bitten by the minimum 2016-09-05 15:49:57 +01:00
Erik Johnston a7032abb2e Correctly handle reindexing state groups that already have an edge 2016-09-05 15:07:23 +01:00
Erik Johnston 70332a12dd Take value in a better way 2016-09-05 14:57:14 +01:00
Erik Johnston 373654c635 Comment about sqlite and WITH RECURSIVE 2016-09-05 14:50:36 +01:00
Erik Johnston 628e65721b Add comments 2016-09-05 10:41:27 +01:00
Erik Johnston a99e933550 Add upgrade script that will slowly prune state_groups_state entries 2016-09-05 10:05:36 +01:00
Erik Johnston 598317927c Limit the length of state chains 2016-09-02 10:41:38 +01:00
Erik Johnston 9e25443db8 Move to storing state_groups_state as deltas 2016-09-01 14:31:26 +01:00
Erik Johnston 0cfd6c3161 Use state_groups table to test existence 2016-08-31 16:25:57 +01:00
Erik Johnston c10cb581c6 Correctly handle the difference between prev and current state 2016-08-31 14:26:22 +01:00
Erik Johnston 1bb8ec296d Generate state group ids in state layer 2016-08-31 10:09:46 +01:00
Erik Johnston 5dc2a702cf Make _state_groups_id_gen a normal IdGenerator 2016-08-30 16:55:11 +01:00
Erik Johnston 778fa85f47 Make sync not pull out full state 2016-08-25 18:59:44 +01:00
Erik Johnston a3dc1e9cbe Replace context.current_state with context.current_state_ids 2016-08-25 17:32:22 +01:00
Erik Johnston 17f4f14df7 Pull out event ids rather than full events for state 2016-08-25 13:42:44 +01:00
Erik Johnston ba214a5e32 Remove lru option 2016-08-19 14:17:11 +01:00
Erik Johnston 61c7edfd34 Add cache to _get_state_groups_from_groups 2016-04-19 17:22:03 +01:00
Mark Haines 87f2dec8d4 Make the cache objects be per instance rather than being global 2016-04-06 13:08:05 +01:00
Mark Haines 89e6839a48 Merge pull request #686 from matrix-org/markjh/doc_strings
Use google style doc strings.
2016-04-01 16:20:09 +01:00
Mark Haines 2a37467fa1 Use google style doc strings.
pycharm supports them so there is no need to use the other format.

Might as well convert the existing strings to reduce the risk of
people accidentally cargo culting the wrong doc string format.
2016-04-01 16:12:07 +01:00
Mark Haines e36bfbab38 Use a stream id generator for backfilled ids 2016-04-01 13:29:05 +01:00
Mark Haines 31a9eceda5 Add a replication stream for state groups 2016-03-30 16:01:58 +01:00
Mark Haines 1e25f62ee6 Use a stream id generator to assign state group ids 2016-03-30 12:55:02 +01:00
Erik Johnston 2f0180b09e Don't bother interning keys that are already interned 2016-03-23 16:29:46 +00:00
Erik Johnston acdfef7b14 Intern all the things 2016-03-23 16:25:54 +00:00
Erik Johnston 75daede92f String intern 2016-03-23 14:53:53 +00:00
Erik Johnston 8b0dfc9fc4 Don't cache events in get_current_state_for_key 2016-03-23 11:42:17 +00:00
Erik Johnston 2c86187a1b Don't cache events in _state_group_cache
Instead, simply cache the event ids, relying on the event cache to cache
the actual events.

The problem was that while the state groups cache was limited in the
number of groups it could hold, each individual group could consist of
thousands of events.
2016-03-22 12:00:09 +00:00
Mark Haines 54172924c8 Load the current id in the IdGenerator constructor
Rather than loading them lazily. This allows us to remove all
the yield statements and spurious arguments for the get_next
methods.

It also allows us to replace all instances of get_next_txn with
get_next since get_next no longer needs to access the db.
2016-03-01 14:32:56 +00:00
Erik Johnston 5189bfdef4 Batch fetch _get_state_groups_from_groups 2016-02-10 13:24:42 +00:00
Erik Johnston 24f00a6c33 Use _simple_select_many for _get_state_group_for_events 2016-02-10 12:57:50 +00:00
Matthew Hodgson 6c28ac260c copyrights 2016-01-07 04:26:29 +00:00
Richard van der Hoff fddedd51d9 Fix a few race conditions in the state calculation
Be a bit more careful about how we calculate the state to be returned by
/sync. In a few places, it was possible for /sync to return slightly later
state than that represented by the next_batch token and the timeline. In
particular, the following cases were susceptible:

* On a full state sync, for an active room
* During a per-room incremental sync with a timeline gap
* When the user has just joined a room. (Refactor check_joined_room to make it
  less magical)

Also, use store.get_state_for_events() (and thus the existing stategroups) to
calculate the state corresponding to a particular sync position, rather than
state_handler.compute_event_context(), which recalculates from first principles
(and tends to miss some state).

Merged from PR https://github.com/matrix-org/synapse/pull/372
2015-11-13 10:39:09 +00:00
Erik Johnston 927004e349 Remove unused room_id parameter 2015-10-12 15:06:14 +01:00
Mark Haines 1cd65a8d1e synapse/storage/state.py: _make_group_id was unused 2015-09-23 10:37:58 +01:00
Erik Johnston 1bd1a43073 Actually check if event_id isn't returned by _get_state_groups 2015-08-21 14:30:34 +01:00
Erik Johnston a82938416d Remove newline because vertical whitespace makes mjark sad 2015-08-18 16:28:13 +01:00
Erik Johnston 0bfdaf1f4f Rejig the code to make it nicer 2015-08-18 16:26:07 +01:00
Erik Johnston 8199475ce0 Ensure we never return a None event from _get_state_for_groups 2015-08-18 11:44:10 +01:00
Erik Johnston 85d0bc3bdc Reduce cache size from obscenely large to quite large 2015-08-18 11:00:38 +01:00
Erik Johnston f9d4da7f45 Fix bug where we were leaking None into state event lists 2015-08-17 09:39:45 +01:00
Erik Johnston 2bb2c02571 Remove some vertical space 2015-08-13 17:11:30 +01:00
Erik Johnston 57877b01d7 Replace list comprehension 2015-08-13 17:00:17 +01:00
Erik Johnston 0fbed2a8fa Comment 2015-08-12 17:22:54 +01:00