View the history of public and world readable Matrix rooms
Eric Eastwood 2ecf7bd786 Prevent Cloudflare from overriding our own 504 timeout page
Explored in https://gitlab.matrix.org/matrix-public-archive/deployment/-/issues/2

> Cloudflare returns an Cloudflare-branded HTTP 502 or 504 error when your origin web server responds with a standard HTTP 502 bad gateway or 504 gateway timeout error:
>
> *-- https://developers.cloudflare.com/support/troubleshooting/cloudflare-errors/troubleshooting-cloudflare-5xx-errors/#502504-from-your-origin-web-server*

The only way to disable this functionality is to have an Enterprise Cloudflare plan and use the `Enable Origin Error Pages` option:

> Enable Origin Error Pages
>
> When Origin Error Page is set to “On”, Cloudflare will proxy the 502 and 504 error pages directly from the origin.
>
> Requires Enterprise or higher

So instead of dealing with that headache, we're just working around this by
responding with a 500 error when we time out. Should be good enough I think.
The user won't notice any difference, but it may affect what search engines think.
Not sure they care about the distinction since the page is slow to respond
anyway, which they punish.
2023-05-11 16:11:35 -05:00
.github/workflows Timeout requests and stop processing further (#204) 2023-05-02 00:39:01 -05:00
build-scripts Fix styles on timeout page (#203) 2023-05-01 15:13:16 -05:00
client Add image metadata for URL previews (#224) 2023-05-10 00:50:12 -05:00
config Prevent Cloudflare from overriding our own 504 timeout page 2023-05-11 16:11:35 -05:00
docs Various updates to put `archive.matrix.org` in the forefront (#220) 2023-05-05 17:42:28 -05:00
server Prevent Cloudflare from overriding our own 504 timeout page 2023-05-11 16:11:35 -05:00
shared Add image metadata for URL previews (#224) 2023-05-10 00:50:12 -05:00
test Mark NSFW room pages with `<meta name="rating" content="adult">` (#216) 2023-05-05 15:36:26 -05:00
.eslintignore Fix lints 2022-02-15 21:33:31 -06:00
.eslintrc.json Migrate from `eslint-plugin-node` to `eslint-plugin-n` (#179) 2023-04-25 00:39:59 -05:00
.gitignore Production ready build (#175) 2023-04-24 23:50:53 -05:00
.prettierignore Add linting to CI (#74) 2022-09-27 22:21:00 -05:00
.prettierrc.json SSR with linkedom 2022-02-03 23:44:50 -06:00
CHANGELOG.md Prepare changelog with initial release (#227) 2023-05-11 15:38:11 -05:00
Dockerfile Timeout requests and stop processing further (#204) 2023-05-02 00:39:01 -05:00
LICENSE.md Add Apache 2.0 license (#55) 2022-08-30 18:35:36 -05:00
README.md Various updates to put `archive.matrix.org` in the forefront (#220) 2023-05-05 17:42:28 -05:00
docker-health-check.js Make sure container is able to start up (#23) 2022-06-15 17:12:44 -05:00
package-lock.json 0.1.0 2023-05-11 15:38:21 -05:00
package.json 0.1.0 2023-05-11 15:38:21 -05:00

README.md

Matrix Public Archive

Join the community and get support at #matrix-public-archive:matrix.org

In the vein of feature parity with Gitter, the goal is to make a public archive site for world_readable Matrix rooms, like Gitter's archives, that search engines can index and that keeps all of the content accessible.

Try it out: archive.matrix.org 🌌

Room directory homepage: A reference for how the Matrix Public Archive homepage looks. Search bar where you can find thousands of rooms using Matrix and homeserver selector. Grid of room cards showing the results.
Archive room view: A reference for how the Matrix Public Archive looks. Showing off a day of messages in #gitter:matrix.org on 2021-08-06. There is a date picker calendar in the right sidebar and a traditional chat app layout on the left.

Demo videos

  • May 2023: Introducing archive.matrix.org, the shiny new public instance of the Matrix Public Archive that everyone can share and link to.
  • Aug 2022 (blog post): A quick intro of what the project looks like, the goals, what it accomplishes, and how it's a new portal into the Matrix ecosystem.
  • Oct 2022: Showing off the room directory landing page used to browse everything available in the archive.

Technical overview

We server-side render (SSR) the Hydrogen Matrix client on a Node.js server (since both use JavaScript) and serve pages on the fly (with some Cloudflare caching on top) when someone requests /archives/r/matrixhq:matrix.org/${year}/${month}/${day}. To fetch the events for a given day/time, we use MSC3030's /timestamp_to_event endpoint to jump to a given day in the timeline and fetch the messages from a Matrix homeserver.
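
To make the request flow above more concrete, here is a minimal sketch of how a day-based archive path and the matching `/timestamp_to_event` lookup could be built. This is not the project's actual code: the helper names (`archiveUrlPathForDate`, `timestampToEventUrl`, `fetchClosestEvent`), the zero-padded month/day, and the choice to search backwards from the end of the day are all assumptions for illustration. It uses the stable `/_matrix/client/v1/` path; a homeserver that only implements the MSC may expose it under the `/_matrix/client/unstable/org.matrix.msc3030/` prefix instead.

```javascript
'use strict';

// Hypothetical helper: build the archive page path for a room alias (without
// the leading `#`) and a UTC date. Zero-padding month/day is an assumption.
function archiveUrlPathForDate(roomAliasWithoutHash, date) {
  const year = date.getUTCFullYear();
  const month = String(date.getUTCMonth() + 1).padStart(2, '0');
  const day = String(date.getUTCDate()).padStart(2, '0');
  return `/archives/r/${roomAliasWithoutHash}/${year}/${month}/${day}`;
}

// Hypothetical helper: build the `/timestamp_to_event` request URL.
// `ts` is a Unix timestamp in milliseconds, `direction` is 'f' or 'b'.
function timestampToEventUrl(matrixServerUrl, roomId, ts, direction) {
  const url = new URL(
    `/_matrix/client/v1/rooms/${encodeURIComponent(roomId)}/timestamp_to_event`,
    matrixServerUrl
  );
  url.searchParams.set('ts', String(ts));
  url.searchParams.set('dir', direction);
  return url.toString();
}

// Hypothetical helper: find the event closest to the end of the given UTC day,
// looking backwards. Requires Node.js 18+ for the global `fetch`.
async function fetchClosestEvent(matrixServerUrl, accessToken, roomId, date) {
  const endOfDayTs = Date.UTC(
    date.getUTCFullYear(),
    date.getUTCMonth(),
    date.getUTCDate(),
    23, 59, 59, 999
  );
  const res = await fetch(timestampToEventUrl(matrixServerUrl, roomId, endOfDayTs, 'b'), {
    headers: { Authorization: `Bearer ${accessToken}` },
  });
  if (!res.ok) {
    throw new Error(`timestamp_to_event failed with HTTP ${res.status}`);
  }
  // The response body contains `event_id` and `origin_server_ts`
  return res.json();
}
```

The returned `event_id` can then serve as the starting point for fetching the surrounding messages from the homeserver for that day.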

Re-using Hydrogen gets us pretty, native-looking (to Element) styles and keeps the maintenance burden of supporting more event types in Hydrogen.

FAQ

See the FAQ page.

Setup

Prerequisites

Get the app running

$ npm install
$ npm run build

# Copy the default config, then edit `config/config.user-overrides.json` so that
# `matrixServerUrl` points to your homeserver and `matrixAccessToken` is defined
$ cp config/config.default.json config/config.user-overrides.json

$ npm run start

Development

# Clone and install the `matrix-public-archive` project
$ git clone git@github.com:matrix-org/matrix-public-archive.git
$ cd matrix-public-archive
$ npm install

# Copy the default config, then edit `config/config.user-overrides.json` so that
# `matrixServerUrl` points to your homeserver and `matrixAccessToken` is defined
$ cp config/config.default.json config/config.user-overrides.json

# This will watch for changes, rebuild bundles and restart the server
$ npm run start-dev

If you want to make changes to the underlying Hydrogen SDK as well, you can locally link it into this project with the following instructions:

# We need to use a draft branch of Hydrogen to get the custom changes needed for
# `matrix-public-archive` to run. Hopefully soon, we can get all of the custom
# changes mainlined so this isn't necessary.
$ git clone git@github.com:vector-im/hydrogen-web.git
$ cd hydrogen-web
$ git checkout madlittlemods/matrix-public-archive-scratch-changes
$ yarn install
$ yarn build:sdk
$ cd target/ && npm link && cd ..
$ cd ..

$ cd matrix-public-archive
$ npm link hydrogen-view-sdk

Running tests

See the testing documentation.

Tracing

See the tracing documentation.