Commit Graph

201 Commits

Author SHA1 Message Date
Erik Johnston 1fcdbeb3ab
Start an opentracing span for background processes. (#8567)
This should reduce the number of `There was no active span` errors we
see.

Fixes #8510.
2020-10-19 12:26:26 +01:00
Richard van der Hoff 6d2d42f8fb Rewrite BucketCollector
This was a bit unweildy for what I wanted: in particular, I wanted to assign
each measurement straight into a bucket, rather than storing an intermediate
Counter which didn't do any bucketing at all.

I've replaced it with something that is hopefully a bit easier to use.

(I'm not entirely sure what the difference between a HistogramMetricFamily and
a GaugeHistogramMetricFamily is, but given our counters can go down as well as
up the latter *sounds* more accurate?)
2020-09-30 16:49:15 +01:00
Richard van der Hoff 1c8ca2c543 Fix _exposition.py to stop stripping samples
Our hacked-up `_exposition.py` was stripping out some samples it shouldn't
have been. Put them back in, to more closely match the upstream
`exposition.py`.
2020-09-30 16:45:43 +01:00
Richard van der Hoff ceafb5a1c6
Drop support for ancient prometheus_client (#8426)
Drop compatibility hacks for prometheus-client pre 0.4.0. Debian stretch and
Fedora 31 both have newer versions, so hopefully this will be ok.
2020-09-30 16:42:05 +01:00
Patrick Cloke aec294ee0d
Use slots in attrs classes where possible (#8296)
slots use less memory (and attribute access is faster) while slightly
limiting the flexibility of the class attributes. This focuses on objects
which are instantiated "often" and for short periods of time.
2020-09-14 12:50:06 -04:00
Patrick Cloke c619253db8
Stop sub-classing object (#8249) 2020-09-04 06:54:56 -04:00
Patrick Cloke d89692ea84
Convert runWithConnection to async. (#8121) 2020-08-19 07:09:24 -04:00
Patrick Cloke c36228c403
Convert run_as_background_process inner function to async. (#8032) 2020-08-06 08:20:42 -04:00
Richard van der Hoff 8ca39bd2c3
Improve stacktraces from exceptions in background processes (#7808)
use `Failure()` to fish out the real exception.
2020-07-09 13:01:33 +01:00
Erik Johnston a99658074d
Add some metrics for inbound and outbound federation processing times (#7755) 2020-06-30 16:58:06 +01:00
Christian Svensson 8bbe87f42d
Set Content-Length for Metrics requests (#7730)
HTTP requires the response to contain a Content-Length header unless chunked encoding is being used.
Prometheus metrics endpoint did not set this, causing software such as prometheus-proxy to not be able to scrape synapse for metrics.

Signed-off-by: Christian Svensson <blue@cmd.nu>
2020-06-23 18:06:01 +01:00
Patrick Cloke bd6dc17221
Replace iteritems/itervalues/iterkeys with native versions. (#7692) 2020-06-15 07:03:36 -04:00
Erik Johnston f5353eff21
Make inflight background metrics more efficient. (#7597)
Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
2020-05-29 13:25:32 +01:00
Ivan Shapovalov ac481a738e
synapse.metrics: implement detailed memory usage reporting on PyPy (#7536)
PyPy's gc.get_stats() returns an object containing detailed allocator statistics
which could be beneficial to collect as metrics.

Signed-off-by: Ivan Shapovalov <intelfx@intelfx.name>
2020-05-22 11:08:41 +01:00
Amber Brown 7cb8b4bc67
Allow configuration of Synapse's cache without using synctl or environment variables (#6391) 2020-05-11 18:45:23 +01:00
Richard van der Hoff 8c75667ad7
Add prometheus metrics for the number of active pushers (#7103) 2020-03-19 10:00:24 +00:00
Patrick Cloke 509e381afa
Clarify list/set/dict/tuple comprehensions and enforce via flake8 (#6957)
Ensure good comprehension hygiene using flake8-comprehensions.
2020-02-21 07:15:07 -05:00
Amber Brown 864f144543
Fix up some typechecking (#6150)
* type checking fixes

* changelog
2019-10-02 05:29:01 -07:00
Richard van der Hoff a96318127d Update comments and docstring 2019-09-25 18:17:39 +01:00
Erik Johnston 367158a609 Add wrap_as_background_process decorator.
This does the same thing as `run_as_background_process` but means we
don't need to create superfluous functions.
2019-09-24 15:53:17 +01:00
Amber Brown b617864cd9
Fix for structured logging tests stomping on logs (#6023) 2019-09-13 02:29:55 +10:00
Amber Brown aeb9b2179e
Add a build info metric to Prometheus (#6005) 2019-09-10 00:14:58 +10:00
Amber Brown 7ad1d76356
Support Prometheus_client 0.4.0+ (#5636) 2019-07-18 23:57:15 +10:00
Amber Brown 463b072b12
Move logging utilities out of the side drawer of util/ and into logging/ (#5606) 2019-07-04 00:07:04 +10:00
Amber Brown 071150ce19
Don't log GC 0s at INFO (#5557) 2019-06-28 21:45:33 +10:00
Amber Brown 32e7c9e7f2
Run Black. (#5482) 2019-06-20 19:32:02 +10:00
Erik Johnston 3ed595e327 Prometheus histograms are cumalative 2019-06-14 14:07:32 +01:00
Amber H. Brown a10c8dae85 fix prometheus rendering error 2019-06-14 21:09:33 +10:00
Amber Brown 6312d6cc7c
Expose statistics on extrems to prometheus (#5384) 2019-06-13 22:40:52 +10:00
Richard van der Hoff 82ca6d1f9f
Add metrics for number of outgoing EDUs, by type (#4695) 2019-02-20 14:13:14 +00:00
Erik Johnston 7c570bff74 Fix exception in background metrics collection
We attempted to iterate through a list on a separate thread without
doing the necessary copying.
2018-10-03 11:28:01 +01:00
Erik Johnston 94ae1dea3c Add missing logger 2018-09-20 17:05:34 +01:00
Erik Johnston 9ea408441f Handle exceptions thrown by background tasks
Fixes #3921
2018-09-20 16:15:21 +01:00
Erik Johnston d0f6c1ce21 Remove spurious comment 2018-09-14 15:12:36 +01:00
Erik Johnston 0a81038ea0 Add in flight real time metrics for Measure blocks 2018-09-14 15:08:37 +01:00
Erik Johnston 3f6762f0bb isort 2018-08-21 09:38:38 +01:00
Erik Johnston 1058d14127 Make the in flight background process metrics thread safe 2018-08-20 17:27:24 +01:00
Richard van der Hoff bab94da79c fix metric name 2018-08-07 22:11:45 +01:00
Richard van der Hoff 53bca4690b more metrics for the federation and appservice senders 2018-08-07 19:09:48 +01:00
Richard van der Hoff 03751a6420 Fix some looping_call calls which were broken in #3604
It turns out that looping_call does check the deferred returned by its
callback, and (at least in the case of client_ips), we were relying on this,
and I broke it in #3604.

Update run_as_background_process to return the deferred, and make sure we
return it to clock.looping_call.
2018-07-26 11:48:08 +01:00
Richard van der Hoff 6e3fc657b4 Resource tracking for background processes
This introduces a mechanism for tracking resource usage by background
processes, along with an example of how it will be used.

This will help address #3518, but more importantly will give us better insights
into things which are happening but not being shown up by the request metrics.

We *could* do this with Measure blocks, but:
 - I think having them pulled out as a completely separate metric class will
   make it easier to distinguish top-level processes from those which are
   nested.

 - I want to be able to report on in-flight background processes, and I don't
   think we want to do this for *all* Measure blocks.
2018-07-18 10:50:33 +01:00
Amber Brown 49af402019 run isort 2018-07-09 16:09:20 +10:00
Amber Brown 6350bf925e
Attempt to be more performant on PyPy (#3462) 2018-06-28 14:49:57 +01:00
Richard van der Hoff cbbfaa4be8 Fix description of "python_gc_time" metric 2018-06-21 10:02:42 +01:00
Matthew Hodgson ccfdaf68be spell gauge correctly 2018-06-16 07:10:34 +01:00
Amber Brown f116f32ace
add a last seen metric (#3396) 2018-06-14 20:26:59 +10:00
Richard van der Hoff 694968fa81 Hopefully, fix LaterGuage error handling 2018-06-04 15:59:14 +01:00
Amber Brown febe0ec8fd
Run Prometheus on a different port, optionally. (#3274) 2018-05-31 19:04:50 +10:00
Matthew Hodgson ff1bc0a279 pep8 2018-05-29 02:32:15 +01:00
Matthew Hodgson 0a240ad36e disable CPUMetrics if no /proc/self/stat
fixes build on macOS again
2018-05-29 02:23:30 +01:00
Amber Brown 5c40ce3777 invalid syntax :( 2018-05-28 19:16:09 +10:00
Amber Brown a2eb5db4a0 update metrics to be in seconds 2018-05-28 19:10:27 +10:00
Amber Brown 389dac2c15 pepeightttt 2018-05-23 13:08:59 -05:00
Amber Brown 472a5ec4e2 add back CPU metrics 2018-05-23 13:03:56 -05:00
Amber Brown b6063631c3 more cleanup 2018-05-22 17:36:20 -05:00
Amber Brown 53cc2cde1f cleanup 2018-05-22 17:32:57 -05:00
Amber Brown 85ba83eb51 fixes 2018-05-22 16:28:23 -05:00
Amber Brown a8990fa2ec Merge remote-tracking branch 'origin/develop' into 3218-official-prom 2018-05-22 10:50:26 -05:00
Amber Brown df9f72d9e5 replacing portions 2018-05-21 19:47:37 -05:00
Amber Brown c60e0d5e02 don't need the resource portion 2018-05-21 17:03:20 -05:00
Amber Brown f258deffcb remove old metrics libs 2018-05-21 17:01:15 -05:00
Erik Johnston 6d8ec3462d Note that label values can be anything 2018-05-03 16:25:05 +01:00
Erik Johnston 95b6912045 Fix metrics that have integer value labels 2018-05-03 15:51:04 +01:00
Erik Johnston a41117c63b Make _escape_character take MatchObject 2018-05-02 17:27:27 +01:00
Erik Johnston 32015e1109 Escape label values in prometheus metrics 2018-05-02 16:52:42 +01:00
Erik Johnston d7bf3a68f0 s/list/tuple 2018-04-12 11:19:04 +01:00
Erik Johnston 4dae4a97ed Track last processed event received_ts 2018-04-11 14:27:09 +01:00
Erik Johnston 92e34615c5 Track where event stream processing have gotten up to 2018-04-11 12:13:40 +01:00
Erik Johnston ab825aa328 Add GaugeMetric 2018-04-11 12:13:40 +01:00
Vincent Breitmoser 6d7f0f8dd3 Don't disable GC when running on PyPy
PyPy's incminimark GC can't be triggered manually. From what I observed
there are no obvious issues with just letting it run normally. And
unlike CPython, it actually returns unused RAM to the system.

Signed-off-by: Vincent Breitmoser <look@my.amazin.horse>
2018-04-10 11:35:34 +02:00
Richard van der Hoff 88541f9009 Add a metric which increments when a request is received
It's useful to know when there are peaks in incoming requests - which isn't
quite the same as there being peaks in outgoing responses, due to the time
taken to handle requests.
2018-03-09 16:30:26 +00:00
Richard van der Hoff bc496df192 report metrics on number of cache evictions 2018-02-05 15:34:01 +00:00
Richard van der Hoff 87b7d72760 Add some comments about the reactor tick time metric 2018-01-19 23:51:04 +00:00
Richard van der Hoff ce236f8ac8 better exception logging in callbackmetrics
when we fail to render a metric, give a clue as to which metric it was
2018-01-18 11:30:49 +00:00
Richard van der Hoff 992018d1c0 mechanism to render metrics with alternative names 2018-01-15 17:04:39 +00:00
Richard van der Hoff 80fa610f9c Add some comments to metrics classes 2018-01-15 16:52:52 +00:00
Richard van der Hoff 19d274085f Make Counter render floats
Prometheus handles all metrics as floats, and sometimes we store non-integer
values in them (notably, durations in seconds), so let's render them as floats
too.

(Note that the standard client libraries also treat Counters as floats.)
2018-01-12 23:49:44 +00:00
Paul "LeoNerd" Evans 2938a00825 Rename the python-specific metrics now the docs claim that we have done 2016-11-03 17:03:52 +00:00
Paul "LeoNerd" Evans 5219f7e060 Since we don't export per-filetype fd counts any more, delete all the code related to that too 2016-11-03 16:41:32 +00:00
Paul "LeoNerd" Evans 93ebeb2aa8 Remove now-unused 'resource' import 2016-11-03 16:37:09 +00:00
Paul "LeoNerd" Evans c1b077cd19 Now we have new-style metrics don't bother exporting legacy-named process ones 2016-11-03 16:34:16 +00:00
Paul "LeoNerd" Evans 1cc22da600 Set up the process collector during metrics __init__; that way all split-process workers have it 2016-10-27 18:09:34 +01:00
Paul "LeoNerd" Evans aac13b1f9a Pass the Metrics group into the process collector instead of having it find its own one; this avoids it needing to import from synapse.metrics 2016-10-27 18:08:15 +01:00
Paul "LeoNerd" Evans ccc1a3d54d Allow creation of a 'subspace' within a Metrics object, returning another one 2016-10-27 18:07:34 +01:00
Paul "LeoNerd" Evans b01aaadd48 Split callback metric lambda functions down onto their own lines to keep line lengths under 90 2016-10-19 18:26:13 +01:00
Paul "LeoNerd" Evans 1071c7d963 Adjust code for <100 char line limit 2016-10-19 18:23:25 +01:00
Paul "LeoNerd" Evans 6453d03edd Cut the raw /proc/self/stat line up into named fields at collection time 2016-10-19 18:21:40 +01:00
Paul "LeoNerd" Evans 3ae48a1f99 Move the process metrics collector code into its own file 2016-10-19 18:10:24 +01:00
Paul "LeoNerd" Evans 4cedd53224 A slightly neater way to manage metric collector functions 2016-10-19 17:54:09 +01:00
Paul "LeoNerd" Evans 5663137e03 appease pep8 2016-10-19 16:09:42 +01:00
Paul "LeoNerd" Evans b202531be6 Also guard /proc/self/fds-related code with a suitable psuedoconstant 2016-10-19 15:37:41 +01:00
Paul "LeoNerd" Evans 1b179455fc Guard registration of process-wide metrics by existence of the requisite /proc entries 2016-10-19 15:34:38 +01:00
Paul "LeoNerd" Evans 981f852d54 Add standard process_start_time_seconds metric 2016-10-19 15:05:22 +01:00
Paul "LeoNerd" Evans def63649df Add standard process_max_fds metric 2016-10-19 15:05:21 +01:00
Paul "LeoNerd" Evans 06f1ad1625 Add standard process_open_fds metric 2016-10-19 15:05:21 +01:00
Paul "LeoNerd" Evans 95fc70216d Add standard process_*_memory_bytes metrics 2016-10-19 15:05:21 +01:00
Paul "LeoNerd" Evans 9b0316c75a Use /proc/self/stat to generate the new process_cpu_*_seconds_total metrics 2016-10-19 15:05:21 +01:00
Paul "LeoNerd" Evans 03c2720940 Export CPU usage metrics also under prometheus-standard metric name 2016-10-19 15:05:21 +01:00
Paul "LeoNerd" Evans b21b9dbc37 Callback metric values might not just be integers - allow floats 2016-10-19 15:05:15 +01:00
Erik Johnston 7c1a92274c Make psutil optional 2016-08-08 11:12:21 +01:00