<h1>Federation Sender</h1>
<p>The Federation Sender is responsible for sending Persistent Data Units (PDUs)
and Ephemeral Data Units (EDUs) to other homeservers using
the <codeclass="docutils literal notranslate"><spanclass="pre">/send</span></code> Federation API.</p>
<h2class="rubric"id="how-do-pdus-get-sent">How do PDUs get sent?</h2>
<p>The Federation Sender learns about new PDUs through <codeclass="docutils literal notranslate"><spanclass="pre">FederationSender.notify_new_events</span></code>.
When the sender is notified about a newly-persisted PDU that originates from this homeserver
and is not an out-of-band event, we pass the PDU to the <codeclass="docutils literal notranslate"><spanclass="pre">PerDestinationQueue</span></code> for each
remote homeserver that is in the room at that point in the DAG.</p>
<p>There is one <codeclass="docutils literal notranslate"><spanclass="pre">PerDestinationQueue</span></code> per ‘destination’ homeserver.
The <codeclass="docutils literal notranslate"><spanclass="pre">PerDestinationQueue</span></code> maintains the following information about the destination:</p>
<ulclass="simple">
<li><p>whether the destination is currently in <aclass="reference internal"href="#catch-up-mode"><spanclass="xref myst">catch-up mode (see below)</span></a>;</p></li>
<li><p>a queue of PDUs to be sent to the destination; and</p></li>
<li><p>a queue of EDUs to be sent to the destination (not considered in this section).</p></li>
</ul>
<p>Upon a new PDU being enqueued, <codeclass="docutils literal notranslate"><spanclass="pre">attempt_new_transaction</span></code> is called to start a new
transaction if there is not already one in progress.</p>
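<p>The per-destination state and the enqueue step described above can be sketched as follows. This is a minimal illustration, not Synapse's implementation: the class name, field names and the <code>send_pdu</code> method are assumptions made for the example, and starting a transaction is reduced to setting a flag.</p>

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class DestinationQueueSketch:
    """Hypothetical stand-in for the per-destination state described above."""

    destination: str
    catching_up: bool = False  # whether the destination is in catch-up mode
    pending_pdus: deque = field(default_factory=deque)  # PDUs waiting to be sent
    pending_edus: deque = field(default_factory=deque)  # EDUs waiting to be sent
    transaction_in_progress: bool = False
    transactions_started: int = 0  # bookkeeping for this example only

    def send_pdu(self, pdu: dict) -> None:
        """Enqueue a PDU, then try to start a transaction."""
        self.pending_pdus.append(pdu)
        self.attempt_new_transaction()

    def attempt_new_transaction(self) -> None:
        """Start a transaction unless one is already in progress."""
        if self.transaction_in_progress:
            return
        self.transaction_in_progress = True
        self.transactions_started += 1


queue = DestinationQueueSketch("remote.example.org")
queue.send_pdu({"event_id": "$a", "stream_ordering": 1})
queue.send_pdu({"event_id": "$b", "stream_ordering": 2})
```

<p>In this sketch the second PDU is queued without starting a second transaction, because the first one is still marked as in progress.</p>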
<h3class="rubric"id="transactions-and-the-transaction-transmission-loop">Transactions and the Transaction Transmission Loop</h3>
<p>Each federation HTTP request to the <codeclass="docutils literal notranslate"><spanclass="pre">/send</span></code> endpoint is referred to as a ‘transaction’.
The body of the HTTP request contains a list of PDUs and EDUs to send to the destination.</p>
<p>The <em>Transaction Transmission Loop</em> (<codeclass="docutils literal notranslate"><spanclass="pre">_transaction_transmission_loop</span></code>) is responsible
for emptying the queued PDUs (and EDUs) from a <codeclass="docutils literal notranslate"><spanclass="pre">PerDestinationQueue</span></code> by sending
them to the destination.</p>
<p>There can only be one transaction in flight for a given destination at any time.
(As well as preventing us from overloading the destination, this makes the sender easier to
reason about, because we process events sequentially for each destination.
This is useful for <em>Catch-Up Mode</em>, described later.)</p>
<p>The loop continues so long as there is anything to send. At each iteration of the loop, we:</p>
<ulclass="simple">
<li><p>dequeue up to 50 PDUs (and up to 100 EDUs).</p></li>
<li><p>make the <codeclass="docutils literal notranslate"><spanclass="pre">/send</span></code> request to the destination homeserver with the dequeued PDUs and EDUs.</p></li>
<li><p>if successful, record that we have transmitted PDUs up to
the <codeclass="docutils literal notranslate"><spanclass="pre">stream_ordering</span></code> of the latest PDU in the transaction.</p></li>
<li><p>if unsuccessful, back off from the remote homeserver for some time.
If we have been unsuccessful for too long (when the back-off interval grows to exceed 1 hour),
the in-memory queues are emptied and we enter <aclass="reference internal"href="#catch-up-mode"><spanclass="xref myst"><em>Catch-Up Mode</em>, described below</span></a>.</p></li>
</ul>
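<p>The loop can be sketched roughly as below. This is a simplified illustration under stated assumptions: <code>transmission_loop</code>, <code>fake_send</code> and the constant names are hypothetical, the <code>send</code> callback stands in for the <code>/send</code> request, and a failed send simply stops the loop instead of backing off.</p>

```python
from collections import deque
from typing import Callable, Optional

MAX_PDUS_PER_TRANSACTION = 50
MAX_EDUS_PER_TRANSACTION = 100


def transmission_loop(
    pdus: deque, edus: deque, send: Callable[[list, list], bool]
) -> Optional[int]:
    """Drain the queues, one transaction at a time.

    Returns the stream_ordering of the last PDU sent successfully, or None.
    """
    last_successful = None
    while pdus or edus:
        # Dequeue at most one transaction's worth of PDUs and EDUs.
        pdu_batch = [pdus.popleft() for _ in range(min(MAX_PDUS_PER_TRANSACTION, len(pdus)))]
        edu_batch = [edus.popleft() for _ in range(min(MAX_EDUS_PER_TRANSACTION, len(edus)))]
        if not send(pdu_batch, edu_batch):
            # A real sender would back off here; this sketch just stops.
            break
        if pdu_batch:
            last_successful = pdu_batch[-1]["stream_ordering"]
    return last_successful


sent_batches = []


def fake_send(pdu_batch: list, edu_batch: list) -> bool:
    """Pretend the remote accepted the transaction and record batch sizes."""
    sent_batches.append((len(pdu_batch), len(edu_batch)))
    return True


pdus = deque({"stream_ordering": n} for n in range(1, 121))
edus = deque({"edu_type": "m.typing"} for _ in range(150))
last = transmission_loop(pdus, edus, fake_send)
```

<p>With 120 queued PDUs and 150 queued EDUs, the sketch issues three sequential transactions, and only one is ever in flight at a time.</p>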
<h3class="rubric"id="catch-up-mode">Catch-Up Mode</h3>
<p>When the <codeclass="docutils literal notranslate"><spanclass="pre">PerDestinationQueue</span></code> has the catch-up flag set, the <em>Catch-Up Transmission Loop</em>
(<codeclass="docutils literal notranslate"><spanclass="pre">_catch_up_transmission_loop</span></code>) is used in lieu of the regular <codeclass="docutils literal notranslate"><spanclass="pre">_transaction_transmission_loop</span></code>.
(Only once the catch-up mode has been exited can the regular transaction transmission behaviour
be resumed.)</p>
<p><em>Catch-Up Mode</em>, entered upon Synapse startup or once a homeserver has fallen behind due to
connection problems, is responsible for sending PDUs that have been missed by the destination
homeserver. (PDUs can be missed because the <codeclass="docutils literal notranslate"><spanclass="pre">PerDestinationQueue</span></code> is volatile — i.e. resets
on startup — and it does not hold PDUs forever if <codeclass="docutils literal notranslate"><spanclass="pre">/send</span></code> requests to the destination fail.)</p>
<p>The catch-up mechanism makes use of the <codeclass="docutils literal notranslate"><spanclass="pre">last_successful_stream_ordering</span></code> column in the
<codeclass="docutils literal notranslate"><spanclass="pre">destinations</span></code> table (which gives the <codeclass="docutils literal notranslate"><spanclass="pre">stream_ordering</span></code> of the most recent successfully
sent PDU) and the <codeclass="docutils literal notranslate"><spanclass="pre">stream_ordering</span></code> column in the <codeclass="docutils literal notranslate"><spanclass="pre">destination_rooms</span></code> table (which gives,
for each room, the <codeclass="docutils literal notranslate"><spanclass="pre">stream_ordering</span></code> of the most recent PDU that needs to be sent to this
destination).</p>
<p>Each iteration of the loop pulls out 50 <codeclass="docutils literal notranslate"><spanclass="pre">destination_rooms</span></code> entries with the oldest
<codeclass="docutils literal notranslate"><spanclass="pre">stream_ordering</span></code>s that are greater than the <codeclass="docutils literal notranslate"><spanclass="pre">last_successful_stream_ordering</span></code>.
In other words, from the set of latest PDUs in each room to be sent to the destination,
the 50 oldest such PDUs are pulled out.</p>
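<p>The selection step can be illustrated with an in-memory SQLite database. This is a sketch only: the table below is a simplified assumption that keeps just the columns mentioned in the text plus a destination and room identifier, the data is made up, and <code>next_catch_up_rooms</code> is a hypothetical helper rather than a Synapse function.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE destination_rooms (destination TEXT, room_id TEXT, stream_ordering INTEGER)"
)
# One row per room: the latest PDU that still needs to go to the destination.
conn.executemany(
    "INSERT INTO destination_rooms VALUES (?, ?, ?)",
    [("remote.example.org", f"!room{n}:example.org", n * 10) for n in range(1, 81)],
)


def next_catch_up_rooms(conn, destination, last_successful_stream_ordering, limit=50):
    """Return the oldest outstanding (room_id, stream_ordering) rows for a destination."""
    rows = conn.execute(
        """
        SELECT room_id, stream_ordering FROM destination_rooms
        WHERE destination = ? AND stream_ordering > ?
        ORDER BY stream_ordering ASC
        LIMIT ?
        """,
        (destination, last_successful_stream_ordering, limit),
    )
    return rows.fetchall()


batch = next_catch_up_rooms(
    conn, "remote.example.org", last_successful_stream_ordering=100
)
```

<p>Here 70 rooms have a <code>stream_ordering</code> above 100, so one iteration returns the 50 oldest of them and leaves the remaining 20 for the next iteration.</p>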
<p>These PDUs could, in principle, now be directly sent to the destination. However, as an
optimisation intended to prevent overloading destination homeservers, we instead attempt
to send the latest forward extremities so long as the destination homeserver is still
eligible to receive those.
This reduces load on the destination <strong>in aggregate</strong> because all Synapse homeservers
will behave according to this principle and therefore avoid sending lots of different PDUs
at different points in the DAG to a recovering homeserver.
<em>This optimisation is not currently valid in rooms which are partial-state on this homeserver,
since we are unable to determine whether the destination homeserver is eligible to receive
the latest forward extremities unless this homeserver sent those PDUs — in this case, we
just send the latest PDUs originating from this server and skip this optimisation.</em></p>
<p>Whilst PDUs are sent through this mechanism, the position of <codeclass="docutils literal notranslate"><spanclass="pre">last_successful_stream_ordering</span></code>
is advanced as normal.
Once there are no longer any rooms containing outstanding PDUs to be sent to the destination
<em>that are not already in the <codeclass="docutils literal notranslate"><spanclass="pre">PerDestinationQueue</span></code> because they arrived since Catch-Up Mode
was enabled</em>, Catch-Up Mode is exited and we return to <codeclass="docutils literal notranslate"><spanclass="pre">_transaction_transmission_loop</span></code>.</p>
<h4class="rubric"id="a-note-on-failures-and-back-offs">A note on failures and back-offs</h4>
<p>If a remote server is unreachable over federation, we back off from that server,
with an exponentially-increasing retry interval.
Whilst we don’t automatically retry after the interval, we prevent making new attempts
until such time as the back-off has cleared.
Once the back-off is cleared and a new PDU or EDU arrives for transmission, the transmission
loop resumes and empties the queue by making federation requests.</p>
<p>If the back-off grows too large (&gt; 1 hour), the in-memory queue is emptied (to prevent
unbounded growth) and Catch-Up Mode is entered.</p>
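<p>The interaction between the back-off and Catch-Up Mode can be sketched as below. The initial interval (10 minutes) and multiplier (2) are assumptions chosen for illustration, not Synapse's actual values, and all function and key names are hypothetical; only the one-hour threshold comes from the text above.</p>

```python
ONE_HOUR_MS = 60 * 60 * 1000


def next_retry_interval(
    current_ms: int, initial_ms: int = 10 * 60 * 1000, multiplier: int = 2
) -> int:
    """Return the retry interval to use after another failure (assumed values)."""
    if current_ms == 0:
        return initial_ms
    return current_ms * multiplier


def on_send_failure(state: dict) -> None:
    """Grow the back-off; past one hour, drop the queue and enter catch-up."""
    state["retry_interval_ms"] = next_retry_interval(state["retry_interval_ms"])
    if state["retry_interval_ms"] > ONE_HOUR_MS:
        state["pending_pdus"].clear()  # prevent unbounded growth
        state["catching_up"] = True


state = {
    "retry_interval_ms": 0,
    "pending_pdus": [{"stream_ordering": 1}, {"stream_ordering": 2}],
    "catching_up": False,
}

# Three consecutive failures: 10, 20, then 40 minutes of back-off.
for _ in range(3):
    on_send_failure(state)
after_three = (state["retry_interval_ms"], state["catching_up"], len(state["pending_pdus"]))

# A fourth failure pushes the interval to 80 minutes, past the threshold.
on_send_failure(state)
```

<p>Under these assumed values the queue survives the first three failures and is only emptied on the fourth, when the interval first exceeds one hour.</p>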
<p>It is worth noting that the back-off for a remote server is cleared once an inbound
request from that remote server is received (see <codeclass="docutils literal notranslate"><spanclass="pre">notify_remote_server_up</span></code>).
At this point, the transaction transmission loop is also started up, to proactively
send missed PDUs and EDUs to the destination (i.e. you don’t need to wait for a new PDU
or EDU, destined for that destination, to be created in order to send out missed PDUs and