2023-04-26 23:22:41 -06:00
|
|
|
# FAQ
|
|
|
|
|
|
|
|
## Can I run my own instance?
|
|
|
|
|
|
|
|
Yes! We host a public canonical version of the Matrix Public Archive at
|
|
|
|
[archive.matrix.org](https://archive.matrix.org/) that everyone can use but feel free to
|
|
|
|
also run your own instance (setup instructions in the [readme](../README.md)).
|
|
|
|
|
|
|
|
## How is this different from [`view.matrix.org`](https://view.matrix.org/)?
|
|
|
|
|
|
|
|
https://view.matrix.org/ (https://github.com/matrix-org/matrix-static) already existed
|
|
|
|
before the Matrix Public Archive but there was some desire to make something with more
|
|
|
|
Element-feeling polish and loading faster (avoid the slow 502's errors that are frequent
|
|
|
|
on `view.matrix.org`).
|
|
|
|
|
2023-04-27 20:41:22 -06:00
|
|
|
And with the introduction of the jump to date API via
|
|
|
|
[MSC3030](https://github.com/matrix-org/matrix-spec-proposals/pull/3030), we could show
|
|
|
|
messages from any given date and day-by-day navigation.
|
|
|
|
|
2023-06-09 15:05:20 -06:00
|
|
|
## Why did the archive bot join my room?
|
|
|
|
|
|
|
|
Only public Matrix rooms with `shared` or `world_readable` [history
|
|
|
|
visibility](https://spec.matrix.org/latest/client-server-api/#room-history-visibility) are
|
|
|
|
accessible in the Matrix Public Archive. In some clients like Element, the `shared`
|
|
|
|
option equates to "Members only (since the point in time of selecting this option)" and
|
|
|
|
`world_readable` to "Anyone" under the **room settings** -> **Security & Privacy** ->
|
|
|
|
**Who can read history?**.
|
|
|
|
|
|
|
|
But the archive bot (`@archive:matrix.org`) will join any public room because it doesn't
|
|
|
|
know the history visibility without first joining. Any room without `world_readable` or
|
|
|
|
`shared` history visibility will lead a `403 Forbidden`. And if the public room is in
|
|
|
|
the room directory, it will be listed in the archive but will still lead to a `403
|
|
|
|
Forbidden` in that case.
|
|
|
|
|
|
|
|
The Matrix Public Archive doesn't hold onto any data (it's
|
|
|
|
stateless) and requests the messages from the homeserver every time. The
|
|
|
|
[archive.matrix.org](https://archive.matrix.org/) instance has some caching in place, 5
|
|
|
|
minutes for the current day, and 2 days for past content.
|
2023-05-30 12:57:27 -06:00
|
|
|
|
2023-06-09 15:05:20 -06:00
|
|
|
The Matrix Public Archive only allows rooms with `world_readable` history visibility to
|
|
|
|
be indexed by search engines. See the [opt
|
|
|
|
out](#how-do-i-opt-out-and-keep-my-room-from-being-indexed-by-search-engines) topic
|
|
|
|
below for more details.
|
2023-05-30 12:57:27 -06:00
|
|
|
|
2023-06-09 15:05:20 -06:00
|
|
|
### Why does the archive user join rooms instead of browsing them as a guest?
|
2023-05-30 12:57:27 -06:00
|
|
|
|
|
|
|
Guests require `m.room.guest_access` to access a room. Most public rooms do not allow
|
|
|
|
guests because even the `public_chat` preset when creating a room does not allow guest
|
|
|
|
access. Not being able to view most public rooms is the major blocker on being able to
|
|
|
|
use guest access. The idea is if I can view the messages from a Matrix client as a
|
|
|
|
random user, I should also be able to see the messages in the archive.
|
|
|
|
|
2023-06-09 15:05:20 -06:00
|
|
|
Guest access is also a much different ask than read-only access since guests can also
|
|
|
|
send messages in the room which isn't always desirable. The archive bot is read-only and
|
|
|
|
does not send messages.
|
|
|
|
|
|
|
|
## How do I opt out and keep my room from being indexed by search engines?
|
|
|
|
|
|
|
|
Only public Matrix rooms with `shared` or `world_readable` history visibility are
|
|
|
|
accessible to view in the Matrix Public Archive. But only rooms with history visibility
|
|
|
|
set to `world_readable` are indexable by search engines.
|
|
|
|
|
|
|
|
Also see https://github.com/matrix-org/matrix-public-archive/issues/47 to track better
|
|
|
|
opt out controls.
|
|
|
|
|
|
|
|
As a workaround for [archive.matrix.org](https://archive.matrix.org/) today, you can ban
|
|
|
|
the `@archive:matrix.org` user if you don't want your room content to be shown in the
|
|
|
|
archive at all.
|
2023-04-26 23:22:41 -06:00
|
|
|
|
|
|
|
## Technical details
|
|
|
|
|
2023-05-05 16:42:28 -06:00
|
|
|
The main readme has a [technical overview](../README.md#technical-overview) of the
|
2023-04-26 23:22:41 -06:00
|
|
|
project. Here are a few more details.
|
|
|
|
|
|
|
|
### How do I figure out what version of the Matrix Public Archive is running?
|
|
|
|
|
|
|
|
Just visit the `/health-check` endpoint which will return information like the following:
|
|
|
|
|
|
|
|
```
|
|
|
|
{
|
|
|
|
"ok": true,
|
|
|
|
"commit": "954b22995a44bf11bfcd5850b62e206e46ee2db9",
|
|
|
|
"version": "main",
|
|
|
|
"versionDate": "2023-04-05T09:26:12.524Z",
|
|
|
|
"packageVersion": "0.0.0"
|
|
|
|
}
|
|
|
|
```
|
|
|
|
|
|
|
|
### How does the archive room URL relate to what is displayed on the page?
|
|
|
|
|
|
|
|
We start the end of the date/time specified in the URL looking backward up to the limit.
|
|
|
|
|
|
|
|
### Why does the time selector only appear for some pages?
|
|
|
|
|
|
|
|
The time selector only appears for pages that have a lot of messages on a given
|
|
|
|
day/hour/minute/second (more than the configured `archiveMessageLimit`).
|