Opentracing Documentation (#5703)
* Opentracing survival guide * Update decorator names in doc * Doc cleanup These are all alterations as a result of comments in #5703, it includes mostly typos and clarifications. The most interesting changes are: - Split developer and user docs into two sections - Add a high level description of OpenTracing * newsfile * Move contributer specific info to docstring. * Sample config. * Trailing whitespace. * Update 5703.misc * Apply suggestions from code review Mostly just rewording parts of the docs for clarity. Co-Authored-By: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
This commit is contained in:
parent
a3e40bd5b4
commit
826e6ec3bd
|
@ -0,0 +1 @@
|
|||
Documentation for opentracing.
|
|
@ -0,0 +1,100 @@
|
|||
===========
|
||||
OpenTracing
|
||||
===========
|
||||
|
||||
Background
|
||||
----------
|
||||
|
||||
OpenTracing is a semi-standard being adopted by a number of distributed tracing
|
||||
platforms. It is a common api for facilitating vendor-agnostic tracing
|
||||
instrumentation. That is, we can use the OpenTracing api and select one of a
|
||||
number of tracer implementations to do the heavy lifting in the background.
|
||||
Our current selected implementation is Jaeger.
|
||||
|
||||
OpenTracing is a tool which gives an insight into the causal relationship of
|
||||
work done in and between servers. The servers each track events and report them
|
||||
to a centralised server - in Synapse's case: Jaeger. The basic unit used to
|
||||
represent events is the span. The span roughly represents a single piece of work
|
||||
that was done and the time at which it occurred. A span can have child spans,
|
||||
meaning that the work of the child had to be completed for the parent span to
|
||||
complete, or it can have follow-on spans which represent work that is undertaken
|
||||
as a result of the parent but is not depended on by the parent to in order to
|
||||
finish.
|
||||
|
||||
Since this is undertaken in a distributed environment a request to another
|
||||
server, such as an RPC or a simple GET, can be considered a span (a unit or
|
||||
work) for the local server. This causal link is what OpenTracing aims to
|
||||
capture and visualise. In order to do this metadata about the local server's
|
||||
span, i.e the 'span context', needs to be included with the request to the
|
||||
remote.
|
||||
|
||||
It is up to the remote server to decide what it does with the spans
|
||||
it creates. This is called the sampling policy and it can be configured
|
||||
through Jaeger's settings.
|
||||
|
||||
For OpenTracing concepts see
|
||||
https://opentracing.io/docs/overview/what-is-tracing/.
|
||||
|
||||
For more information about Jaeger's implementation see
|
||||
https://www.jaegertracing.io/docs/
|
||||
|
||||
=====================
|
||||
Seting up OpenTracing
|
||||
=====================
|
||||
|
||||
To receive OpenTracing spans, start up a Jaeger server. This can be done
|
||||
using docker like so:
|
||||
|
||||
.. code-block:: bash
|
||||
|
||||
docker run -d --name jaeger
|
||||
-p 6831:6831/udp \
|
||||
-p 6832:6832/udp \
|
||||
-p 5778:5778 \
|
||||
-p 16686:16686 \
|
||||
-p 14268:14268 \
|
||||
jaegertracing/all-in-one:1.13
|
||||
|
||||
Latest documentation is probably at
|
||||
https://www.jaegertracing.io/docs/1.13/getting-started/
|
||||
|
||||
|
||||
Enable OpenTracing in Synapse
|
||||
-----------------------------
|
||||
|
||||
OpenTracing is not enabled by default. It must be enabled in the homeserver
|
||||
config by uncommenting the config options under ``opentracing`` as shown in
|
||||
the `sample config <./sample_config.yaml>`_. For example:
|
||||
|
||||
.. code-block:: yaml
|
||||
|
||||
opentracing:
|
||||
tracer_enabled: true
|
||||
homeserver_whitelist:
|
||||
- "mytrustedhomeserver.org"
|
||||
- "*.myotherhomeservers.com"
|
||||
|
||||
Homeserver whitelisting
|
||||
-----------------------
|
||||
|
||||
The homeserver whitelist is configured using regular expressions. A list of regular
|
||||
expressions can be given and their union will be compared when propagating any
|
||||
spans contexts to another homeserver.
|
||||
|
||||
Though it's mostly safe to send and receive span contexts to and from
|
||||
untrusted users since span contexts are usually opaque ids it can lead to
|
||||
two problems, namely:
|
||||
|
||||
- If the span context is marked as sampled by the sending homeserver the receiver will
|
||||
sample it. Therefore two homeservers with wildly different sampling policies
|
||||
could incur higher sampling counts than intended.
|
||||
- Sending servers can attach arbitrary data to spans, known as 'baggage'. For safety this has been disabled in Synapse
|
||||
but that doesn't prevent another server sending you baggage which will be logged
|
||||
to OpenTracing's logs.
|
||||
|
||||
==================
|
||||
Configuring Jaeger
|
||||
==================
|
||||
|
||||
Sampling strategies can be set as in this document:
|
||||
https://www.jaegertracing.io/docs/1.13/sampling/
|
|
@ -1422,18 +1422,8 @@ opentracing:
|
|||
#enabled: true
|
||||
|
||||
# The list of homeservers we wish to send and receive span contexts and span baggage.
|
||||
#
|
||||
# Though it's mostly safe to send and receive span contexts to and from
|
||||
# untrusted users since span contexts are usually opaque ids it can lead to
|
||||
# two problems, namely:
|
||||
# - If the span context is marked as sampled by the sending homeserver the receiver will
|
||||
# sample it. Therefore two homeservers with wildly disparaging sampling policies
|
||||
# could incur higher sampling counts than intended.
|
||||
# - Span baggage can be arbitrary data. For safety this has been disabled in synapse
|
||||
# but that doesn't prevent another server sending you baggage which will be logged
|
||||
# to opentracing logs.
|
||||
#
|
||||
# This a list of regexes which are matched against the server_name of the
|
||||
# See docs/opentracing.rst
|
||||
# This is a list of regexes which are matched against the server_name of the
|
||||
# homeserver.
|
||||
#
|
||||
# By defult, it is empty, so no servers are matched.
|
||||
|
|
|
@ -48,18 +48,8 @@ class TracerConfig(Config):
|
|||
#enabled: true
|
||||
|
||||
# The list of homeservers we wish to send and receive span contexts and span baggage.
|
||||
#
|
||||
# Though it's mostly safe to send and receive span contexts to and from
|
||||
# untrusted users since span contexts are usually opaque ids it can lead to
|
||||
# two problems, namely:
|
||||
# - If the span context is marked as sampled by the sending homeserver the receiver will
|
||||
# sample it. Therefore two homeservers with wildly disparaging sampling policies
|
||||
# could incur higher sampling counts than intended.
|
||||
# - Span baggage can be arbitrary data. For safety this has been disabled in synapse
|
||||
# but that doesn't prevent another server sending you baggage which will be logged
|
||||
# to opentracing logs.
|
||||
#
|
||||
# This a list of regexes which are matched against the server_name of the
|
||||
# See docs/opentracing.rst
|
||||
# This is a list of regexes which are matched against the server_name of the
|
||||
# homeserver.
|
||||
#
|
||||
# By defult, it is empty, so no servers are matched.
|
||||
|
|
|
@ -24,6 +24,131 @@
|
|||
# this move the methods have work very similarly to opentracing's and it should only
|
||||
# be a matter of few regexes to move over to opentracing's access patterns proper.
|
||||
|
||||
"""
|
||||
============================
|
||||
Using OpenTracing in Synapse
|
||||
============================
|
||||
|
||||
Python-specific tracing concepts are at https://opentracing.io/guides/python/.
|
||||
Note that Synapse wraps OpenTracing in a small module (this one) in order to make the
|
||||
OpenTracing dependency optional. That means that the access patterns are
|
||||
different to those demonstrated in the OpenTracing guides. However, it is
|
||||
still useful to know, especially if OpenTracing is included as a full dependency
|
||||
in the future or if you are modifying this module.
|
||||
|
||||
|
||||
OpenTracing is encapsulated so that
|
||||
no span objects from OpenTracing are exposed in Synapse's code. This allows
|
||||
OpenTracing to be easily disabled in Synapse and thereby have OpenTracing as
|
||||
an optional dependency. This does however limit the number of modifiable spans
|
||||
at any point in the code to one. From here out references to `opentracing`
|
||||
in the code snippets refer to the Synapses module.
|
||||
|
||||
Tracing
|
||||
-------
|
||||
|
||||
In Synapse it is not possible to start a non-active span. Spans can be started
|
||||
using the ``start_active_span`` method. This returns a scope (see
|
||||
OpenTracing docs) which is a context manager that needs to be entered and
|
||||
exited. This is usually done by using ``with``.
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
from synapse.logging.opentracing import start_active_span
|
||||
|
||||
with start_active_span("operation name"):
|
||||
# Do something we want to tracer
|
||||
|
||||
Forgetting to enter or exit a scope will result in some mysterious and grievous log
|
||||
context errors.
|
||||
|
||||
At anytime where there is an active span ``opentracing.set_tag`` can be used to
|
||||
set a tag on the current active span.
|
||||
|
||||
Tracing functions
|
||||
-----------------
|
||||
|
||||
Functions can be easily traced using decorators. There is a decorator for
|
||||
'normal' function and for functions which are actually deferreds. The name of
|
||||
the function becomes the operation name for the span.
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
from synapse.logging.opentracing import trace, trace_deferred
|
||||
|
||||
# Start a span using 'normal_function' as the operation name
|
||||
@trace
|
||||
def normal_function(*args, **kwargs):
|
||||
# Does all kinds of cool and expected things
|
||||
return something_usual_and_useful
|
||||
|
||||
# Start a span using 'deferred_function' as the operation name
|
||||
@trace_deferred
|
||||
@defer.inlineCallbacks
|
||||
def deferred_function(*args, **kwargs):
|
||||
# We start
|
||||
yield we_wait
|
||||
# we finish
|
||||
defer.returnValue(something_usual_and_useful)
|
||||
|
||||
Operation names can be explicitly set for functions by using
|
||||
``trace_using_operation_name`` and
|
||||
``trace_deferred_using_operation_name``
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
from synapse.logging.opentracing import (
|
||||
trace_using_operation_name,
|
||||
trace_deferred_using_operation_name
|
||||
)
|
||||
|
||||
@trace_using_operation_name("A *much* better operation name")
|
||||
def normal_function(*args, **kwargs):
|
||||
# Does all kinds of cool and expected things
|
||||
return something_usual_and_useful
|
||||
|
||||
@trace_deferred_using_operation_name("Another exciting operation name!")
|
||||
@defer.inlineCallbacks
|
||||
def deferred_function(*args, **kwargs):
|
||||
# We start
|
||||
yield we_wait
|
||||
# we finish
|
||||
defer.returnValue(something_usual_and_useful)
|
||||
|
||||
Contexts and carriers
|
||||
---------------------
|
||||
|
||||
There are a selection of wrappers for injecting and extracting contexts from
|
||||
carriers provided. Unfortunately OpenTracing's three context injection
|
||||
techniques are not adequate for our inject of OpenTracing span-contexts into
|
||||
Twisted's http headers, EDU contents and our database tables. Also note that
|
||||
the binary encoding format mandated by OpenTracing is not actually implemented
|
||||
by jaeger_client v4.0.0 - it will silently noop.
|
||||
Please refer to the end of ``logging/opentracing.py`` for the available
|
||||
injection and extraction methods.
|
||||
|
||||
Homeserver whitelisting
|
||||
-----------------------
|
||||
|
||||
Most of the whitelist checks are encapsulated in the modules's injection
|
||||
and extraction method but be aware that using custom carriers or crossing
|
||||
unchartered waters will require the enforcement of the whitelist.
|
||||
``logging/opentracing.py`` has a ``whitelisted_homeserver`` method which takes
|
||||
in a destination and compares it to the whitelist.
|
||||
|
||||
=======
|
||||
Gotchas
|
||||
=======
|
||||
|
||||
- Checking whitelists on span propagation
|
||||
- Inserting pii
|
||||
- Forgetting to enter or exit a scope
|
||||
- Span source: make sure that the span you expect to be active across a
|
||||
function call really will be that one. Does the current function have more
|
||||
than one caller? Will all of those calling functions have be in a context
|
||||
with an active span?
|
||||
"""
|
||||
|
||||
import contextlib
|
||||
import logging
|
||||
import re
|
||||
|
|
Loading…
Reference in New Issue