Commit Graph

107 Commits

Author SHA1 Message Date
Eric Eastwood 858c9dde8b
We can better detect static assets to avoid tracing nowadays (#207)
Because all assets are served from `/assets` since https://github.com/matrix-org/matrix-public-archive/pull/175
2023-05-02 00:55:22 -05:00
Eric Eastwood 9078abf4f1
Timeout requests and stop processing further (#204)
Fix https://github.com/matrix-org/matrix-public-archive/issues/148
Fix https://github.com/matrix-org/matrix-public-archive/issues/40

 - Apply timeout middleware to all room directory and room routes
 - Stop messing with the response after we timeout. Fix https://github.com/matrix-org/matrix-public-archive/issues/148
    - This also involves cancelling any `async/await` things like requests in the routes so we throw an abort error instead of continuing on. Fix https://github.com/matrix-org/matrix-public-archive/issues/40
 - Also abort the route if we see that the user closed the request before we could respond to them
 - Bumps minimum supported Node.js version to v18 because we're now using the built-in native `fetch` in Node.js vs `node-fetch`. This gives us the custom `signal.reason` that we aborted with instead of a generic `AbortError`.
    - This also means we had to add some instrumentation for `fetch` which uses `undici` under the hood. Settled on some unofficial instrumentation: [`opentelemetry-instrumentation-fetch-node`](https://www.npmjs.com/package/opentelemetry-instrumentation-fetch-node)
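
A minimal sketch of the timeout-and-abort idea described above, not the actual middleware. `RouteTimeoutAbortError`, `runWithTimeout` and `slowRequest` are hypothetical names used only for illustration; the point is that the custom error passed to `abort(...)` is what surfaces as `signal.reason`.

```javascript
// Hypothetical error type used as the custom abort reason.
class RouteTimeoutAbortError extends Error {
  constructor(message) {
    super(message);
    this.name = 'RouteTimeoutAbortError';
  }
}

// Run `task(signal)` and abort it with a custom reason if it takes longer
// than `timeoutMs`.
async function runWithTimeout(task, timeoutMs) {
  const abortController = new AbortController();
  const timeoutId = setTimeout(() => {
    // Whatever we pass here becomes `signal.reason`
    abortController.abort(new RouteTimeoutAbortError(`Timed out after ${timeoutMs}ms`));
  }, timeoutMs);

  try {
    return await task(abortController.signal);
  } finally {
    clearTimeout(timeoutId);
  }
}

// Stand-in for a slow upstream request that honors the abort signal.
function slowRequest(signal, durationMs) {
  return new Promise((resolve, reject) => {
    const timerId = setTimeout(() => resolve('done'), durationMs);
    signal.addEventListener(
      'abort',
      () => {
        clearTimeout(timerId);
        reject(signal.reason);
      },
      { once: true }
    );
  });
}
```

With the native `fetch`, the same signal would be passed along as `runWithTimeout((signal) => fetch(url, { signal }), 5000)`.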
2023-05-02 00:39:01 -05:00
Eric Eastwood f3318446f8
Expose child errors that only occur in stderr log output (#205)
Who knows why we can't capture these errors via the more conventional `child.on('error', (err) => { })` listener 🤷 


### Before

```
RethrownError: Failed to render Hydrogen to string. In order to reproduce, feed in these arguments into `renderHydrogenToString(...)`:
    renderHydrogenToString arguments: { ... }
    at renderHydrogenToString (server/hydrogen-render/render-hydrogen-to-string.js:58:11)
    --- Original Error ---
    RethrownError: Child process exited with code 1
        at assembleErrorAfterChildExitsWithErrors (server/child-process-runner/run-in-child-process.js:60:29)
        --- Original Error ---
        No child errors
```

### After

```
RethrownError: Failed to render Hydrogen to string. In order to reproduce, feed in these arguments into `renderHydrogenToString(...)`:
    renderHydrogenToString arguments: { ... }
    at renderHydrogenToString (server/hydrogen-render/render-hydrogen-to-string.js:58:11)
    --- Original Error ---
    RethrownError: Child process exited with code 1
        at assembleErrorAfterChildExitsWithErrors (server/child-process-runner/run-in-child-process.js:60:29)
        --- Original Error ---
        No child errors but there might be something in stderr=node:internal/modules/cjs/loader:936
          throw err;
          ^

        Error: Cannot find module '../lib/rethrown-error'
        Require stack:
        - server/child-process-runner/child-fork-script.js
            at Function.Module._resolveFilename (node:internal/modules/cjs/loader:933:15)
            at Function.Module._load (node:internal/modules/cjs/loader:778:27)
            at Module.require (node:internal/modules/cjs/loader:1005:19)
            at require (node:internal/modules/cjs/helpers:102:18)
            at Object.<anonymous> (server/child-process-runner/child-fork-script.js:8:23)
            at Module._compile (node:internal/modules/cjs/loader:1103:14)
            at Object.Module._extensions..js (node:internal/modules/cjs/loader:1155:10)
            at Module.load (node:internal/modules/cjs/loader:981:32)
            at Function.Module._load (node:internal/modules/cjs/loader:822:12)
            at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:77:12) {
          code: 'MODULE_NOT_FOUND',
          requireStack: [
            'server//child-process-runner//child-fork-script.js'
          ]
        }
```
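
A rough sketch of how stderr output can be folded into the exit error, as in the "After" output above. This uses `spawn` with `node -e` to stay self-contained (the real runner forks a script), and `runSnippetInChild` is a hypothetical name.

```javascript
const { spawn } = require('node:child_process');

// Run a Node.js snippet in a child process and, if it exits non-zero,
// include whatever it wrote to stderr in the error message.
function runSnippetInChild(snippet) {
  return new Promise((resolve, reject) => {
    const child = spawn(process.execPath, ['-e', snippet], {
      stdio: ['ignore', 'pipe', 'pipe'],
    });

    const childErrors = [];
    let stdout = '';
    let stderr = '';
    child.stdout.setEncoding('utf8');
    child.stderr.setEncoding('utf8');
    child.stdout.on('data', (chunk) => {
      stdout += chunk;
    });
    child.stderr.on('data', (chunk) => {
      stderr += chunk;
    });
    // Errors like a failed spawn show up here, but a crash inside the child
    // only shows up in its stderr output.
    child.on('error', (err) => childErrors.push(err));

    // `close` fires after the stdio streams have ended
    child.on('close', (exitCode) => {
      if (exitCode === 0) {
        resolve(stdout);
        return;
      }
      const details =
        childErrors.length > 0
          ? childErrors.map((err) => err.stack).join('\n')
          : `No child errors but there might be something in stderr=${stderr}`;
      reject(new Error(`Child process exited with code ${exitCode}\n${details}`));
    });
  });
}
```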
2023-05-01 17:33:48 -05:00
Eric Eastwood 0df1a79754
Fix styles on timeout page (#203)
Fix styles on timeout page since we started using the `manifest.json` for asset paths in https://github.com/matrix-org/matrix-public-archive/pull/175.
2023-05-01 15:13:16 -05:00
Eric Eastwood 53a1d4b43b
Update docs in preparation for Matrix Public Archive being generally available (#194) 2023-04-27 00:22:41 -05:00
Eric Eastwood f71fc2bb9c
Cache derived info from the `manifest.json` (#191)
- Like getting all of the dependencies for a given entry point
 - And the favicons
 
Also fix the problem where `server/hydrogen-render/render-page-html.js` was calling `getFaviconAssetUrls()` right away, before the client build had a chance to generate `dist/manifest.json`, which resulted in `Error: Cannot find module '../../dist/manifest.json'`
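
One way to get both behaviors (cache the derived info, but don't touch the manifest until it's needed) is a lazy getter. This is only a sketch: `createLazyCache`, `loadManifest` and the manifest key are made up for illustration, with `loadManifest` standing in for requiring `dist/manifest.json`.

```javascript
// Defer computing a value until first use, then keep returning the same one.
function createLazyCache(computeValue) {
  let hasValue = false;
  let cachedValue;
  return function getCachedValue() {
    if (!hasValue) {
      cachedValue = computeValue();
      hasValue = true;
    }
    return cachedValue;
  };
}

// Stand-in for reading `dist/manifest.json` (contents are illustrative).
let manifestReadCount = 0;
function loadManifest() {
  manifestReadCount += 1;
  return { 'client/img/favicon.ico': { file: 'assets/favicon.1a2b3c.ico' } };
}

// Nothing is read at module load time, only on the first call.
const getFaviconAssetUrlsCached = createLazyCache(() => {
  const manifest = loadManifest();
  return { ico: '/' + manifest['client/img/favicon.ico'].file };
});
```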
2023-04-26 17:04:49 -05:00
Eric Eastwood c297270f39
Link prior art and reasoning why we still always join before (#190)
See https://github.com/matrix-org/matrix-public-archive/issues/50
2023-04-26 16:39:53 -05:00
Eric Eastwood e20a67d2ba
Preload fonts and images (#187)
Part of https://github.com/matrix-org/matrix-public-archive/issues/132
2023-04-26 16:35:00 -05:00
Eric Eastwood 27863a1945
Iterate on `crossorigin` language in `Link` preload header comments (#186)
Hopefully more accurate now 🤞
2023-04-26 04:05:11 -05:00
Eric Eastwood a3952f1d31
Fix preload link headers (#185)
Follow-up to https://github.com/matrix-org/matrix-public-archive/pull/171 and https://github.com/matrix-org/matrix-public-archive/pull/175 where they broke because we went from scripts to modules.

Part of https://github.com/matrix-org/matrix-public-archive/issues/132

Before this PR, we were seeing this warning in the Chrome devtools console:

```
A preload for 'foo' is found, but is not used because the request credentials mode does not match. Consider taking a look at crossorigin attribute.
```

This is caused by a credentials mode mismatch between the `<script type="module">` tag and the `Link` header. A `<script type="module">` with no `crossorigin` attribute indicates a credentials mode of `omit` and a naive `Link: </foo-url>; rel=preload; as=script;` has a default credentials mode of `same-origin`, hence the mismatch and warning we're seeing.

We could set the credentials mode to match using `Link: </foo-url>; rel=preload; as=script; omit` but there is an even better option! We can use the dedicated `Link: </foo-url>; rel=modulepreload` link type which not only downloads and puts the file in the cache like a normal preload but the browser also knows it's a JavaScript module now and can parse/compile it so it's ready to go.
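
As a small illustration of the header shape, here is a hypothetical `buildPreloadLinkHeader` helper (not the project's actual code) that uses `rel=modulepreload` for module scripts and a plain `rel=preload` for stylesheets:

```javascript
// Build a `Link` header value from lists of asset URLs.
function buildPreloadLinkHeader({ moduleUrls = [], styleUrls = [] }) {
  // JavaScript modules get the dedicated `modulepreload` link type
  const moduleLinks = moduleUrls.map((url) => `<${url}>; rel=modulepreload`);
  // Other assets keep using `preload` with an explicit `as`
  const styleLinks = styleUrls.map((url) => `<${url}>; rel=preload; as=style`);
  return [...moduleLinks, ...styleLinks].join(', ');
}
```

In an Express route this would be set with something like `res.set('Link', buildPreloadLinkHeader({ moduleUrls, styleUrls }))`.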

---

Future consideration: Adding `nopush` to preload link headers. Many servers initiate an HTTP/2 Server Push when they encounter a preload link in HTTP header form otherwise. Do we want/care about that (or maybe we don't)? (mentioned in https://medium.com/reloading/preload-prefetch-and-priorities-in-chrome-776165961bbf#6f54)

---

References for preload `Link` headers:

  - https://medium.com/reloading/preload-prefetch-and-priorities-in-chrome-776165961bbf#6f54
  - https://html.spec.whatwg.org/multipage/links.html#link-type-preload
  - https://www.smashingmagazine.com/2016/02/preload-what-is-it-good-for/#headers
  - https://developer.chrome.com/blog/modulepreload/#ok-so-why-doesnt-link-relpreload-work-for-modules
2023-04-26 03:29:57 -05:00
Eric Eastwood d3e35a5de1
Make sure to restart the server after Vite `manifest.json` changes (#184)
Make sure to restart the server after Vite `manifest.json` changes so it can pick up the latest and serve pages correctly.
2023-04-26 02:09:46 -05:00
Eric Eastwood 2c12fec1e6
Fix scripts not loading from the production ready build PR (#183)
Follow-up to https://github.com/matrix-org/matrix-public-archive/pull/175
2023-04-25 03:54:49 -05:00
Eric Eastwood 630e58fadc
Remove stray logs (#181)
Accidentally introduced in https://github.com/matrix-org/matrix-public-archive/pull/175
2023-04-25 01:21:53 -05:00
Eric Eastwood ac1419cdca
Only `require.resolve(...)` the path once (#180)
Perhaps an early optimization or not even needed but doesn't seem wise to keep pulling this over and over (best case it's cached).
2023-04-25 00:50:43 -05:00
Eric Eastwood 0f26dc94d3
Migrate from `eslint-plugin-node` to `eslint-plugin-n` (#179) 2023-04-25 00:39:59 -05:00
Eric Eastwood 9c0b6fe85e
Production ready build (#175)
- Rename `public` -> `client` so it doesn't get copied automagically as-is (without hashes which we want for cache busting), https://vitejs.dev/guide/assets.html#the-public-directory
     - We still build the version files to `public/` so they're copied as-is and Vite handles it for us (so we can use `emptyOutDir`)
 - Use a multiple entrypoint `.js` Vite build so things can be more intelligently bundled and take less time
     - We aren't using library mode because it doesn't minify or bundle assets
 - Using hash asset tags for cache busting. Hash of the file included in the file name
 - We lookup these hashed assets from `manifest.json` that Vite builds (https://vitejs.dev/guide/backend-integration.html) to serve and preload
 - In terms of optimized bundles, I know the current output isn't great now but will have to opt to fix that up separately in the future. Tracked by https://github.com/matrix-org/matrix-public-archive/issues/176
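
To make the `manifest.json` lookup concrete, here is a sketch that walks a manifest in the shape Vite emits (`file`, `imports`, `css`) and collects the hashed files an entry point needs. The entry names and hashes in `exampleManifest` are invented, and `collectAssetsForEntry` is a hypothetical helper rather than the project's real one.

```javascript
// Illustrative manifest (keys and hashes are made up).
const exampleManifest = {
  'client/js/entry-client-room.js': {
    file: 'assets/entry-client-room.4f3c2a1b.js',
    src: 'client/js/entry-client-room.js',
    isEntry: true,
    imports: ['_shared.9d8e7f6a.js'],
    css: ['assets/entry-client-room.0a1b2c3d.css'],
  },
  '_shared.9d8e7f6a.js': {
    file: 'assets/shared.9d8e7f6a.js',
    css: ['assets/shared.5e6f7a8b.css'],
  },
};

// Collect the hashed script and style files for an entry point by walking
// its `imports` recursively. `seen` guards against visiting a chunk twice.
function collectAssetsForEntry(manifest, entryKey, seen = new Set()) {
  const result = { scripts: [], styles: [] };
  if (seen.has(entryKey)) return result;
  seen.add(entryKey);

  const chunk = manifest[entryKey];
  if (!chunk) throw new Error(`No manifest entry for ${entryKey}`);

  result.scripts.push(chunk.file);
  result.styles.push(...(chunk.css || []));
  for (const importKey of chunk.imports || []) {
    const nested = collectAssetsForEntry(manifest, importKey, seen);
    result.scripts.push(...nested.scripts);
    result.styles.push(...nested.styles);
  }
  return result;
}
```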
2023-04-24 23:50:53 -05:00
Eric Eastwood 50a1d658e8
Only read version tag files once on startup (#174)
We already read it once for the `/health-check` endpoint and cached the response but this way we can use `getVersionTags()` everywhere without worrying about it.

Also, it's no longer `async` so we can use it in things like Express route paths and CDN asset tags more easily.
2023-04-19 15:57:22 -05:00
Eric Eastwood 78ee88e094
Add route identifiers for easy metric reporting (#173)
Pre-requisite for https://github.com/matrix-org/matrix-public-archive/issues/162 and https://github.com/matrix-org/matrix-public-archive/issues/148
2023-04-19 15:09:51 -05:00
Eric Eastwood 27afaea8ca
Serve Hydrogen assets from `/hydrogen-assets/` sub-directory for easier targeting of cache rules (#172)
Fix https://github.com/matrix-org/matrix-public-archive/issues/160
2023-04-19 14:44:12 -05:00
Eric Eastwood 17a39ab8db
Add preload link headers for downstream Cloudflare early hints (#171)
Because it takes us at best several seconds to request information from a homeserver and then server-side render the page, the browser has to wait for the response before it can even try loading the necessary assets. With this change that facilitates early hints, the browser can preload all of the assets necessary before we are done generating the response and will be ready to go by the time we're all done on the server.

Fix https://github.com/matrix-org/matrix-public-archive/issues/32

Part of https://github.com/matrix-org/matrix-public-archive/issues/132

See https://developers.cloudflare.com/cache/about/early-hints/ for information on enabling in Cloudflare
2023-04-19 14:20:01 -05:00
Eric Eastwood 321c6a4f26
Slightly easier to understand renderHydrogenVmRenderScriptToPageHtml API surface (#170) 2023-04-19 13:48:12 -05:00
Eric Eastwood 551b4e72d1
Follow tombstone and predecessor history (#167)
Fix https://github.com/matrix-org/matrix-public-archive/issues/59

Other updates:

 - Update tests to use the `/roomid/room1/date/2022/01/03` format instead of trying to retrofit the weird alias stuff on there. This also makes the utilities that convert a fancy URL to the actual URL much simpler.
 - Update to specify `archiveMessageLimit` in the test case because pages have different number of events depending on if we are against a boundary, hidden events, etc.
2023-04-19 01:26:15 -05:00
Eric Eastwood 6c789eae69
Do our best to get the user to the right place and try joining `via` derived server name (#168)
Split out from https://github.com/matrix-org/matrix-public-archive/pull/167
2023-04-11 15:09:44 -05:00
Eric Eastwood e99a0d6912
Rename to build-scripts so it appears in GitHub file finder (#166)
It seems like the `build/` directory is ignored in the GitHub file
finder as a sane default for people who put compiled assets there.

`build-scripts/` probably makes more sense anyway
2023-04-07 13:17:46 -05:00
Eric Eastwood 57d2cb3dd3
Refactor tests to use single source of truth ASCII diagram (#164)
- Less test bulk
 - Single source of truth: there is no mismatch between the comment and the expectations (we already caught a few mistakes in the conversion thanks to this benefit)
 - Easier to maintain and update
2023-04-07 12:52:41 -05:00
Eric Eastwood 954b22995a
Add a way to select time of day (#139)
- Fix https://github.com/matrix-org/matrix-public-archive/issues/7
 - A URL with time looks like
    - `/r/too-many-messages-on-day:my.synapse.server/date/2022/11/16T23:59`
    - Or when more precision is required (seconds): `/r/too-many-messages-on-day:my.synapse.server/date/2022/11/16T23:59:59`
 - Add new custom time picker/scrubber (pictured below) with momentum scrubbing
    - Native built-in `<input type="time">` for easier picking if you prefer that and accessibility.
    - Uses localized time strings
    - Design inspired by Thiago Sanchez's *Time Zone Translate* concept, https://dribbble.com/shots/14590546-Time-Zone-Translate
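
For illustration, the date portion of those URLs could be parsed along these lines. This is a sketch only: `parseArchiveDatePath` is a hypothetical name, and treating the time as UTC is an assumption not stated above.

```javascript
// Matches `2022/11/16`, `2022/11/16T23:59` and `2022/11/16T23:59:59`.
const DATE_TIME_PATH_REGEX =
  /^(\d{4})\/(\d{2})\/(\d{2})(?:T(\d{1,2}):(\d{2})(?::(\d{2}))?)?$/;

function parseArchiveDatePath(datePath) {
  const match = DATE_TIME_PATH_REGEX.exec(datePath);
  if (!match) return null;

  // Unmatched optional groups are `undefined`, so the defaults kick in
  const [, year, month, day, hour = '0', minute = '0', second = '0'] = match;
  return {
    // Months are zero-based in `Date.UTC` (UTC is an assumption here)
    ts: Date.UTC(+year, +month - 1, +day, +hour, +minute, +second),
    hasTime: match[4] !== undefined,
    hasSeconds: match[6] !== undefined,
  };
}
```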
2023-04-05 04:25:31 -05:00
Philip Durbin 8f9e1631ae
Switch /timestamp_to_event from unstable to stable v1 #142 (#154) 2023-02-16 20:52:28 -06:00
Michael[tm] Smith 2999691eea
Enable CORS support (#147)
This change enables CORS support in the archive — to allow web developers to create web applications with frontend JavaScript code that can fetch pages from the archive (for example, for scraping content from chat logs).

Otherwise, without this change, web developers can’t create web apps with frontend JavaScript that can fetch chat logs from the archive and then consume the content of the logs.

It’s imaginable that web developers may find use cases for consuming the chat logs in the archive from frontend JavaScript code — at the simplest level, web apps that fetch and scrape logs to get data out of them or to pull out particular snippets from the logs.

Developers can anyway already scrape the contents of the archive — by using server-side programming languages or by using `curl` or whatever from the command line. They just can’t do the same from frontend JavaScript code, unless CORS support is enabled.
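
In its simplest form, enabling CORS for read-only pages comes down to one response header. The Express-style middleware below is a sketch with a hypothetical name, and allowing every origin with `*` is an assumption; the actual change may be configured differently.

```javascript
// Express-style middleware: allow any origin to read responses.
function allowCrossOriginReads(req, res, next) {
  res.setHeader('Access-Control-Allow-Origin', '*');
  next();
}
```

It would be registered with `app.use(allowCrossOriginReads)` before the routes.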
2022-11-28 21:47:57 -06:00
Michael[tm] Smith 6b493ff807
Only assign `vmContext.global.crypto` if not already global (#143)
Fixes https://github.com/matrix-org/matrix-public-archive/issues/141

Node.js v19 has `crypto` set on the global already, so this change causes `vmContext.global.crypto` to be assigned only if `vmContext.global.crypto` isn’t already defined.

Otherwise, without this change, the room directory fails to render in Node.js v19+, and instead _"TypeError: Cannot set property crypto of `#<Object>` which has only a getter"_ gets thrown.
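
A minimal sketch of the guard, where `vmGlobal` stands in for `vmContext.global`. Assigning Node's `webcrypto` is an assumption, since the message above doesn't say which value gets assigned.

```javascript
const nodeCrypto = require('node:crypto');

// Only assign `crypto` when the global doesn't already provide one.
// Newer Node.js versions expose it as a getter-only property, so a blind
// assignment can throw in strict mode.
function ensureCrypto(vmGlobal) {
  if (!vmGlobal.crypto) {
    vmGlobal.crypto = nodeCrypto.webcrypto;
  }
  return vmGlobal.crypto;
}
```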
2022-11-18 12:27:50 -06:00
Eric Eastwood 11cbf39460
Add Matrix favicon (#135)
It's a cleaned up version of what [Matrix.org](https://matrix.org/) is using since that one is [so blurry](https://user-images.githubusercontent.com/558581/201302097-411b8033-4281-4cd3-a069-0c97ba3aa01f.png).

Part of https://github.com/matrix-org/matrix-public-archive/issues/94
2022-11-11 14:50:41 -06:00
Eric Eastwood fa4720af04
Increase perceived performance by scrolling to the right spot before Hydrogen loads (#128) 2022-11-09 18:57:33 -06:00
Eric Eastwood dc85e839a1
Add config to disable search engine indexing (#127) 2022-11-08 22:41:58 -06:00
Eric Eastwood b3c553a863
Add comment reference to issue about adding hour chunk time slices (#126) 2022-11-08 22:35:28 -06:00
Eric Eastwood 026a08a77a
Jump forward and backward seamlessly (#121)
Fix https://github.com/matrix-org/matrix-public-archive/issues/120
Follow-up to https://github.com/matrix-org/matrix-public-archive/pull/114

 - Uses event permalinking (`?at=$xxx`) to continue the scroll where you should start reading again.
 - When we jump forwards, we go back a day to make sure there aren't more messages than the page limit between where we jumped from and the day we land on, so we don't lose any messages in a gap.
2022-11-03 05:06:53 -05:00
Eric Eastwood 2dff7ecea5
Refactor `fetchEndpointAsText`/`fetchEndpointAsJson` to return `res` alongside `data` (#122)
Split out of https://github.com/matrix-org/matrix-public-archive/pull/121
where we needed to use `res.url`.
2022-11-03 04:12:00 -05:00
Eric Eastwood 08254cbb49
Add a way to jump forwards and backwards to more activity in the room (seamless navigation) (#114)
Fix https://github.com/matrix-org/matrix-public-archive/issues/46
Follow-up to https://github.com/matrix-org/matrix-public-archive/pull/71

Summary:

 - Changes the "Jump to next activity in room" to actually continue you to the next 100 messages ahead. Previously, it only jumped you to the single next event in the room which meant a lot of backwards overlap each time.
    - Jumping this direction will also start your scroll position at the top of the timeline to continue reading seamlessly `?continue=top`
 - Adds "Jump to previous activity in room" to the top of the timeline to continue reading the previous part of the conversation.

There is a caveat with "seamless" here, which is also commented on in the code:

> XXX: This is flawed in the fact that when we go `/messages?dir=b` it could backfill messages which will fill up the response before we perfectly connect and continue from the position they were jumping from before. When `/messages?dir=f` backfills, we won't have this problem anymore because any messages backfilled in the forwards direction would be picked up the same going backwards.

(need forwards fill MSC)
2022-11-02 04:27:30 -05:00
Eric Eastwood 2b4ecb737a
Add support for client-side room alias hash `#` redirects to the correct URL (#111)
This helps when someone just pastes a room alias on the end of the domain,

 - `/#room-alias:server` -> `/r/room-alias:server`
 - `/r/#room-alias:server/date/2022/10/27` -> `/r/room-alias:server/date/2022/10/27`

Since these redirects happen on the client, we can't write any e2e tests. Those e2e tests do everything but run client-side JavaScript.
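
The redirect rule itself can be written as a pure function, which at least makes the mapping unit-testable even though the browser wiring isn't. `getRoomAliasHashRedirect` is a hypothetical name; it takes the values of `location.pathname` and `location.hash`.

```javascript
// Given `location.pathname` and `location.hash`, return the path to
// redirect to, or `null` when no redirect applies.
function getRoomAliasHashRedirect(pathname, hash) {
  // `/#room-alias:server` parses as pathname `/` plus a hash, and
  // `/r/#room-alias:server/...` as pathname `/r/` plus a hash.
  const isSupportedPath = pathname === '/' || pathname === '/r/';
  if (!isSupportedPath || !hash || hash.length <= 1) return null;
  // Drop the leading `#` and put the rest under `/r/`
  return `/r/${hash.slice(1)}`;
}
```

On the client this would be followed by something like `window.location.replace(redirectPath)` when the result is not `null`.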

Follow-up to https://github.com/matrix-org/matrix-public-archive/pull/107

Part of https://github.com/matrix-org/matrix-public-archive/issues/25
2022-10-28 00:32:24 -05:00
Eric Eastwood 7a88ea0c19
Add support for room aliases (#107)
Also does friendly redirects if you don't exactly use the right URL pattern.
For example, if you paste the full room ID with the `!` like `/roomid/!foo:bar`,
it will properly redirect you to `/roomid/foo:bar`. It also does this sort of
thing for URL encoded room IDs and aliases.

Fix https://github.com/matrix-org/matrix-public-archive/issues/25
2022-10-27 01:09:13 -05:00
Eric Eastwood 1e89179f09
Page-load with the correct homeserver selected (#98)
Page-load with the correct homeserver selected (according to `?homeserver`).

Fix https://github.com/matrix-org/matrix-public-archive/issues/92

Also makes sure that the `?homeserver` is always available somewhere in the list; whether that be in the available homeserver list or the added homeserver list depending on if someone cleared it out or never had it because they visited from someone else's link.
2022-10-21 02:09:26 -05:00
Eric Eastwood 6bb88b1ecd
Load room directory and show error message when we're unable to fetch rooms (#96)
Follow-up to https://github.com/matrix-org/matrix-public-archive/pull/84 to address https://github.com/matrix-org/matrix-public-archive/issues/80

Also explains why we show the details of the error message.

Part of https://github.com/matrix-org/internal-config/issues/1342

Related to https://github.com/matrix-org/matrix-public-archive/issues/97
2022-10-20 22:48:00 -05:00
Eric Eastwood b34c1b817d
Add homeserver selector to room directory landing page (#87)
Opting for the simple solution and using `include_all_networks` instead of needing to fetch the information about the third-party networks.

Fix https://github.com/matrix-org/matrix-public-archive/issues/6 (last piece done with this PR)
2022-10-20 02:06:43 -05:00
Eric Eastwood a0089b0fe4
Add `Content-Security-Policy` (CSP) (#81)
Add `Content-Security-Policy` (CSP) that restricts the page to just what it is expected to do.

This helps limit the damage that can be done by any XSS attack.
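
As a generic illustration of what such a header looks like, here is a sketch that assembles a policy string from a map of directives. The directives in `examplePolicy` are placeholders, not the policy this project actually ships, and `buildContentSecurityPolicy` is a hypothetical helper.

```javascript
// Turn `{ directive: [sources] }` into a `Content-Security-Policy` value.
function buildContentSecurityPolicy(directives) {
  return Object.entries(directives)
    .map(([name, sources]) => `${name} ${sources.join(' ')}`)
    .join('; ');
}

// Placeholder directives for illustration only.
const examplePolicy = buildContentSecurityPolicy({
  'default-src': ["'self'"],
  'img-src': ["'self'", 'https:'],
  'object-src': ["'none'"],
});
```

The resulting string would then be sent with `res.setHeader('Content-Security-Policy', examplePolicy)`.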

Fix https://github.com/matrix-org/internal-config/issues/1341
2022-10-19 12:07:39 -05:00
Eric Eastwood df89750401
Throw more understandable error when we fail to fetch from the homeserver room directory (#84)
Fix https://github.com/matrix-org/matrix-public-archive/issues/80

```
RethrownError: Unable to fetch rooms from room directory (homeserver=http://localhost:8008/)
    searchTerm=, paginationToken=undefined, limit=9
    at matrix-public-archive\server\routes\room-directory-routes.js:55:13
    --- Original Error ---
    Error: HTTP Error Response: 500 Internal Server Error: {"errcode":"M_UNKNOWN","error":"Internal server error"}
        URL=http://localhost:8008/_matrix/client/v3/publicRooms?
        at checkResponseStatus (matrix-public-archive\server\lib\fetch-endpoint.js:21:11)
        at processTicksAndRejections (node:internal/process/task_queues:96:5)
        at async fetchEndpoint (matrix-public-archive\server\lib\fetch-endpoint.js:38:3)
        at async fetchEndpointAsJson (matrix-public-archive\server\lib\fetch-endpoint.js:63:15)
        at async fetchPublicRooms (matrix-public-archive\server\lib\matrix-utils\fetch-public-rooms.js:26:26)
        at async matrix-public-archive\server\tracing\trace-utilities.js:31:24
        at async matrix-public-archive\server\routes\room-directory-routes.js:45:62
```
2022-10-18 16:42:33 -05:00
Eric Eastwood b8062b16a2
Fix wrong path to Hydrogen styles on timeout error page (#83)
Regressed in https://github.com/matrix-org/matrix-public-archive/pull/61 where we tried to serve this under `/css/hydrogen-styles.css`, but that doesn't work because all of the image and font references in the CSS file expect it to be at the domain root, so this just reverts back to serving at the root `/`.
2022-10-18 03:42:37 -05:00
Eric Eastwood f796afe55e
Sanity check that we are not leaking the access token to the client (#82)
This isn't stemming from any previous security issue. Just adding an extra check to help ensure we don't ever regress this in the future.

```
AssertionError [ERR_ASSERTION]: We should not be leaking the `config.matrixAccessToken` to the Hydrogen render function because this will reach the client!
    at renderHydrogenToString (matrix-public-archive\server\hydrogen-render\render-hydrogen-to-string.js:24:3)
    at renderHydrogenVmRenderScriptToPageHtml (matrix-public-archive\server\hydrogen-render\render-hydrogen-vm-render-script-to-page-html.js:22:36)
    at matrix-public-archive\server\routes\room-directory-routes.js:53:28
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
```
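
A check like this could be as simple as looking for the token in the serialized render arguments. The sketch below assumes that approach (the actual check may differ), and `assertAccessTokenNotLeaked` is a hypothetical name.

```javascript
const nodeAssert = require('node:assert');

// Refuse to continue if the secret appears anywhere in the data that will
// be handed to the render function (and from there to the client).
// Note: this assumes `matrixAccessToken` is a non-empty string.
function assertAccessTokenNotLeaked(renderArguments, matrixAccessToken) {
  const serialized = JSON.stringify(renderArguments);
  nodeAssert(
    !serialized.includes(matrixAccessToken),
    'We should not be leaking the `config.matrixAccessToken` to the Hydrogen render function because this will reach the client!'
  );
}
```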
2022-10-18 02:40:40 -05:00
Eric Eastwood 1d77c721d0
Use rainbow Matrix.org gradient (#75)
Another iteration of the design,
https://www.figma.com/file/lpW5CqaEbPsYX2pmfIhzRo/Matrix-Public-Archive

Part of https://github.com/matrix-org/matrix-public-archive/issues/6
2022-10-18 01:30:26 -05:00
Eric Eastwood 2581f88495
Fix XSS when blatting `window.matrixPublicArchiveContext` to the page (#79)
Fix https://github.com/matrix-org/internal-config/issues/1335
2022-10-13 14:36:04 -05:00
Eric Eastwood ff315141fd
Add domain to tracing service to distinguish different Matrix public archive instances (#76) 2022-10-11 16:03:33 -05:00
Eric Eastwood be837515fe
Show surrounding messages for a full screen of content (#71)
1. Add surrounding messages to the given messages so we have a full screen of content to make it feel lively even in quiet rooms
    - As you scroll around the timeline across different days, the date changes in the URL, calendar, etc
 2. Add summary item to the bottom of the timeline that explains if we couldn't find any messages in the specific day requested 
    - Also allows you to jump to the next activity in the room. Adds `/:roomId/jump?ts=xxx&dir=[f|b]` to facilitate this.
    - Part of https://github.com/matrix-org/matrix-public-archive/issues/46
 3. Add developer options modal which is linked from the bottom of the right-panel
    - Adds an option so you can debug the `IntersectionObserver` and how it's selecting the active day from the top-edge of the scroll viewport.
    - In the future, this will also include a nice little visualization of the backend timing traces
2022-09-20 16:02:09 -05:00
Eric Eastwood 92668996d7
Add search to room directory landing page (#70)
Part of https://github.com/matrix-org/matrix-public-archive/issues/6
2022-09-15 20:41:55 -05:00